mirror of
https://github.com/seaweedfs/seaweedfs.git
synced 2026-06-13 23:36:45 +03:00
18cdb3819b
* fix(ec): make ecx-journal fold and shard rebuild crash-safe Two EC rebuild paths could silently lose or corrupt data: RebuildEcxFile folded the .ecj deletion journal into .ecx (in-place WriteAt tombstones) and then unlinked the journal without flushing the .ecx writes first. A crash could persist the unlink ahead of the tombstones, resurrecting deleted needles on the next load. It also read journal records with a bare n!=size break, so a torn tail silently dropped the remaining tombstones before the unlink. Now: read records with io.ReadFull (io.EOF ends cleanly, a torn tail aborts and leaves .ecj in place for retry), fsync .ecx before removing the journal. rebuildEcFiles treated a zero/short ReadAt as a clean end-of-input and discarded the read error, so a truncated or unreadable input shard produced truncated regenerated shards that were then published as restored redundancy; the regenerated shards were also never fsynced on the no-sidecar path. Now: derive the expected shard size from the present inputs up front (rejecting a divergent/zero-size input), drive the loop by that size, fail on any short read or short write, and fsync every regenerated shard before it is mounted/renamed. Rust volume server mirrors the rebuild fix: rebuild_ec_files now checks the read_at byte count (it previously discarded it, the same truncation bug). The Rust ecx fold already synced .ecx before removing the journal. Custom EC ratios are unaffected: the shard size derives from the input shards and the loop uses the .vif-resolved data/parity counts, never a hardcoded 10+4. * storage: close ecx journal files via defer in RebuildEcxFile Per review: a single deferred Close per file replaces the per-error-path manual closes, so new early returns cannot leak descriptors. The journal is still closed explicitly before its unlink since Windows cannot delete an open file; the deferred second Close is a harmless no-op.