mirror of
https://github.com/seaweedfs/seaweedfs.git
synced 2026-06-13 23:36:45 +03:00
f724828bcb
* fix(ec): never delete recoverable shards on startup/reconcile (size-direction + byte-exact .dat)
EC startup validation and the cross-disk reconcile could delete the only
copy of distributed-EC shards whenever a non-empty .dat sat beside them.
This is the same data-loss class as the empty-.dat-stub fix, now for a
real (non-empty) stale or partial .dat.
validateEcVolume: the discriminating signal is the shard size relative to
the .dat's full encode, not the shard count.
- shards smaller than expected: an interrupted local encode left partial
shards and the .dat is the complete source -> reclaim the .dat.
- shards equal to expected: a valid (or still-distributing) EC volume ->
keep; the shards may be the only copy.
- shards larger than expected: the .dat is the stale/partial side (e.g. an
interrupted decode left a half-written .dat next to the real shards) ->
keep.
Previously any size mismatch, a low shard count beside a .dat, or a
transient stat error returned "delete", wiping sole-copy shards. Now every
ambiguity (size mismatch in either direction, inconsistent shard sizes,
transient I/O error, partial shard set) keeps the data; only a credible
full source .dat with no partial set to lose is reclaimed.
handleFoundEcxFile: a shard load failure (corrupt/locked .ecx, EMFILE
during a mass restart, transient I/O) no longer deletes the EC files when a
.dat exists -- it only unloads and keeps the files for retry. All deletion
authority now flows through validateEcVolume.
pruneIncompleteEcWithSiblingDat: count shards NODE-WIDE (a set split across
sibling disks summing to >= dataShards is independently recoverable and is
left alone), and require the sibling .dat to byte-exactly match the size
.vif recorded at encode time before deleting -- the prior "at least this
big, or bigger than a superblock" gate could trust a stale .dat and wipe
sole-copy shards. EC encode records the source size in .vif, so this gate
works for real volumes; older volumes without it fail safe (kept).
Rust volume server mirrors all of the above: size-direction + keep-on-
ambiguity in validate_ec_volume, keep-on-load-failure in
handle_found_ecx_file, and the node-wide + byte-exact gate in the prune.
The Rust validate/prune paths now resolve the data-shard count from the
volume's own .vif instead of hardcoding 10+4, so custom-ratio volumes are
not mis-sized and wrongly deleted on reboot.
Existing tests that encoded the old (unsafe) "delete on low count / size
mismatch" behavior are updated to the safe expectation, and new regression
tests cover the partial-decode-.dat-keeps-shards and transient-error-keeps
cases (Go and Rust); they fail on the pre-fix code.
* fix(ec): record DatFileSize in planted EC .vif for the prune test; trim comments
The multi-disk lifecycle e2e test planted a partial EC leftover with an
empty .vif, so the byte-exact prune gate (which a real encoded volume
satisfies via its recorded source size) kept it instead of cleaning up.
Record DatFileSize + the EC ratio in the planted .vif, matching production.
Also condense the verbose comments added in this change to the repo's
concise style.
Volume Server Integration Tests
This package contains integration tests for SeaweedFS volume server HTTP and gRPC APIs.
Run Tests
Run tests from repo root:
go test ./test/volume_server/... -v
If a weed binary is not found, the harness will build one automatically.
Optional environment variables
WEED_BINARY: explicit path to theweedexecutable (disables auto-build).VOLUME_SERVER_IT_KEEP_LOGS=1: keep temporary test directories and process logs.
Current scope (Phase 0)
- Shared cluster/framework utilities
- Matrix profile definitions
- Initial HTTP admin endpoint checks
- Initial gRPC state/status checks
More API coverage is tracked in /Users/chris/dev/seaweedfs2/test/volume_server/DEV_PLAN.md.