seaweedfs

mirror of https://github.com/seaweedfs/seaweedfs.git synced 2026-06-13 23:36:45 +03:00

Author	SHA1	Message	Date
Chris Lu	8776b9d311	feat(filer): object size distribution metric and dashboard panels (#9902 ) * feat(filer): record object size distribution histogram Add SeaweedFS_filer_object_size_bytes, a histogram sampled when an object is first created in the filer namespace, covering every write protocol (S3, WebDAV, FUSE mount, direct HTTP). Buckets follow the 1KB/100KB/1MB/100MB/1GB ranges operators use to size collections. Directories, overwrites, and metadata-only updates are not sampled, so the bucket counts track the size distribution of distinct objects. * feat(metrics): add filer object size distribution dashboard panels Add a write-rate-by-size-range graph and a size-distribution bar gauge, driven by SeaweedFS_filer_object_size_bytes, to the standalone and Helm Grafana dashboards. Per-range subtractions are clamped at zero so transient negative rate() samples do not render below the axis.	2026-06-09 10:41:11 -07:00
Chris Lu	7b07d8177a	fix(filer.sync): scope filesystem key sanitization to the local sink (#9894 ) * fix(filer.sync): scope filesystem key sanitization to the local sink destKey ran every sink key through escapeKey, whose Windows build strips colons. Colons are illegal in NTFS filenames so the local sink needs that, but s3/filer/azure/gcs/b2 accept them as ordinary key bytes — stripping them silently diverged the destination key (a source a:b replicated as ab). Move the sanitization into the local sink behind a Windows build tag, applied at every entry point so the previously-unescaped in-place-update paths stay consistent. Non-local sinks now keep the raw key; non-Windows builds are unchanged; a leading drive-letter colon is preserved. * test(filer.sync): cover incremental destKey and localsink update/delete sanitization Lock the colon-preserving behavior for the incremental destKey branch, and extend the Windows local-sink test to assert UpdateEntry and DeleteEntry also sanitize the key, not just CreateEntry.	2026-06-09 10:18:49 -07:00
Jaehoon Kim	202517c02a	fix(filer.backup): skip replay events whose source chunk was superseded or deleted (#9886) * fix(filer.backup): skip replay events whose chunk no longer exists on the source "Source" is the filer we replicate FROM (e.g. green in a green->blue backup). Replaying the metadata log from a checkpoint can hit an event whose chunk was since overwritten/deleted and garbage-collected on the source volume. Fetching it returns 0 bytes (a permanent size mismatch), which the sink propagated to the subscription — so the same offset retried forever and replication stalled. Skip the event only when proven stale; otherwise keep refusing so genuine loss of a live file still halts loudly: - onCorruptChunk centralizes the three errChunkSizeMismatch sites. - getEntryMtimeNs compares mtime at nanosecond precision so same-second rewrites (git's config.lock dance) are ordered correctly. - sourceSupersedes re-reads the entry's current state on the source: gone (ErrNotFound) or a strictly-newer mtime than the replayed version -> skip; any other lookup error keeps the entry. Skipping is lossless: events are full-entry snapshots, so a later event re-carries the current chunks and a delete event reconciles a removed file. * test(filer.backup): cover the superseded-chunk skip decision - TestSourceSupersedes: not-found (sentinel / wrapped / gRPC string) and nil entry -> skip; network error -> keep; source newer -> skip; same/older -> keep. - TestGetEntryMtimeNs: nanosecond precision, same-second ordering, nil safety. - TestOnCorruptChunkRefusesWhenSupersessionUnconfirmed: never skip silently when supersession cannot be confirmed. * fix(filer.backup): don't infer supersession for incremental sinks In incremental mode the sink key carries a date prefix (sinkDir/YYYY-MM-DD/relPath) that cannot be reversed to a real source path, so a source lookup would always be ErrNotFound and wrongly classify a live entry as deleted — skipping it. Make targetPathToSourcePath report "unmappable" in incremental mode; hasSourceNewerVersion already declines to skip when the source path cannot be mapped. Found in code review. Non-incremental sinks (filer.backup green->blue) are unaffected. * refactor(filer.backup): name the mtime param sourceMtimeNs; note ns overflow bound - Rename the threaded sourceMtime parameter to sourceMtimeNs across the internal replicate/fetch helpers so the unit is explicit (it only feeds hasSourceNewerVersion, which compares in nanoseconds). - Document that getEntryMtimeNs's int64 ns arithmetic is safe until ~year 2262. No behavior change. * fix(filer.backup): order same-second versions in the CreateEntry skip and update gates The CreateEntry already-replicated short-circuit and chooseUpdateAction still compared second-grained mtime, so a newer version written within the same second could be skipped as already-replicated or overwritten by an older same-second replay. Route both through getEntryMtimeNs, matching the precision the chunk-replication path already uses. * test(filer.backup): cover same-second update-action ordering * docs(filer.backup): trim verbose comments to terse why * fix(filer.backup): check supersession against the rename's new path For a rename the filer sink updates in place (the delete+create branch is skipped for sink name "filer"), so the corrupt-chunk supersession check queried the pre-rename key. Its source-side ErrNotFound was read as "superseded", silently advancing the checkpoint without applying the rename. Map the incoming entry's new path (newParentPath/newEntry.Name) for both update branches. * fix(filer.backup): detect a deleted source even when the replayed mtime is epoch hasSourceNewerVersion returned early when sourceMtimeNs <= 0, skipping the source lookup, so a deleted entry with mtime 0 (a valid epoch timestamp) never got the gone verdict and wedged on permanent retries. Always look up; gate only the newer-mtime comparison on a valid replayed mtime. --------- Co-authored-by: Chris Lu <chris.lu@gmail.com>	2026-06-09 08:53:29 -07:00
7y-9	1cf92f6c2e	fix(s3api): clear stale object lock years (#9890 ) Problem: Re-storing object-lock default retention with Days left a previous Years extended attribute in place, so later loads could see both Days and stale Years. Root cause: StoreObjectLockConfigurationInExtended only wrote period fields that were set on the new configuration and did not delete old Days or Years keys before writing the replacement rule. Fix: Clear stored default-retention Days and Years keys before writing the current default retention period fields. Reproduction: go test ./weed/s3api -run TestStoreObjectLockConfigurationClearsStaleYears -count=1 failed before the fix because the stale years key remained. Validation: go test ./weed/s3api -run TestStoreObjectLockConfigurationClearsStaleYears -count=1; go test ./weed/s3api -count=1; git diff --check; git diff --cached --check Co-authored-by: Codex <noreply@openai.com>	2026-06-09 00:48:38 -07:00
Chris Lu	7aba10fa1a	fix(mongodb): merge URI auth fields with username/password override (#9889 ) * fix(mongodb): merge URI auth fields with username/password override SetAuth replaced the whole Credential parsed from the URI, dropping AuthSource and AuthMechanism. Start from the URI-parsed Auth and only override the username and password so credentials scoped to a specific auth database keep working. * fix(mongodb): set PasswordSet for explicit credentials Required by GSSAPI auth when a password is supplied; ignored for other mechanisms.	2026-06-09 00:18:33 -07:00
Chris Lu	2871e6552a	fix(s3api): drop ancestor directory markers from prefixed ListObjectVersions (#9885 ) processExplicitDirectory appended a directory-key object as a version without checking it against the prefix. A versioned listing descends through ancestor markers to reach a deeper prefix, so every ancestor (Veeam/, Veeam/Backup/, ...) leaked into Versions even though none of them match the prefix - which makes Veeam's immutable repository scan abort on an unexpected key. Guard on the prefix so only keys at or under it surface, matching ListObjectsV2 and AWS.	2026-06-09 00:01:06 -07:00
7y-9	d569dd686f	fix(shell): move files into existing destination directories (#9887 ) * fix(shell): move files into existing destination directories Problem: fs.mv /src/file /dst/dir treats an existing destination directory as a destination file path, so it renames the source to /dst/dir instead of moving it into /dst/dir/file. Root cause: commandFsMv builds the destination LookupDirectoryEntryRequest with Directory and Name swapped, so the destination directory lookup misses. Fix: Populate LookupDirectoryEntryRequest with Directory=destinationDir and Name=destinationName before deciding whether the destination is a directory. Reproduction: env GOCACHE=/private/tmp/seaweedfs-go-cache go test ./weed/shell -run TestFsMvMovesIntoExistingDestinationDirectory -count=1 Validation: gofmt -w weed/shell/command_fs_mv.go weed/shell/command_fs_mv_test.go; git diff --check; git diff --cached --check; env GOCACHE=/private/tmp/seaweedfs-go-cache go test ./weed/shell -run TestFsMvMovesIntoExistingDestinationDirectory -count=1; env GOCACHE=/private/tmp/seaweedfs-go-cache go test ./weed/shell -count=1 * Update weed/shell/command_fs_mv_test.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-06-08 23:42:13 -07:00
Chris Lu	1c9039d3ac	fix(seaweed-volume): stop EC shard deletion from phantom .dat on restart (#9874) * fix(seaweed-volume): stop EC shard deletion from phantom .dat on restart On startup load_existing_volumes() scans .vif/.idx entries (not just .dat). For distributed EC, a volume's .vif can be mirrored onto a disk whose .ecx lives on a sibling disk, so the per-disk ecx check is false and the loader falls through to Volume::new, which always creates the .dat if missing -> a phantom 8-byte superblock stub. The store-level prune_incomplete_ec_with_sibling_dat then treats that stub as the authoritative source and deletes the real EC shards on sibling disks. Go guards the same case (disk_location.go: 'Without this guard NewVolume below would create a phantom empty .dat') but only same-disk. Fix A (root cause): in load_existing_volumes, don't create a .dat during load. Skip the entry when there is no local .dat AND the .vif does not reference remote files -- remote-tiered volumes have no local .dat but must still load via the remote path. Uses the robust check_dat_file_exists helper so a transient stat error doesn't skip a real volume. New volumes go through create_volume(). Covers the cross-disk .vif/.ecx split Go's same-disk hasEcxFile() misses. Fix B (defense in depth, Go + Rust): when the EC .vif records no source size (dat_file_size==0), require the sibling .dat to be strictly larger than a bare superblock, so an empty 8-byte stub can't pass the credibility gate. Previously it fell back to SUPER_BLOCK_SIZE, which an 8-byte stub exactly meets. Adds regression tests reproducing the cross-disk lone-.vif phantom and the 8-byte stub gate; updates an existing prune test to use a real collection so its .ecx lookup matches the loaders. * fix(storage): don't create phantom .dat from lone .vif on Go volume load Mirror Fix A on the Go side. loadExistingVolume scans .vif/.idx entries, and for distributed EC a .vif can be mirrored onto a disk whose .ecx is on a sibling disk. The same-disk hasEcxFile() guard does not fire there, so the loader falls through to NewVolume(createDatIfMissing=true) and writes an 8-byte phantom .dat, which the sibling-.dat prune then uses to delete the real EC shards on sibling disks. Skip the entry when there is no local .dat AND the .vif has no remote file (via MaybeLoadVolumeInfo); remote-tiered volumes have no local .dat but must still load. Adds TestLoneVifDoesNotCreatePhantomDat (fails without the guard) and TestRemoteTier_DiskScanLoadsRemoteOnlyVolume (fails if the guard skips a remote-only volume).	2026-06-08 22:10:16 -07:00
dependabot[bot]	a5f9c55479	build(deps): bump github.com/redis/go-redis/v9 from 9.19.0 to 9.20.0 (#9867 ) Bumps [github.com/redis/go-redis/v9](https://github.com/redis/go-redis) from 9.19.0 to 9.20.0. - [Release notes](https://github.com/redis/go-redis/releases) - [Changelog](https://github.com/redis/go-redis/blob/master/RELEASE-NOTES.md) - [Commits](https://github.com/redis/go-redis/compare/v9.19.0...v9.20.0) --- updated-dependencies: - dependency-name: github.com/redis/go-redis/v9 dependency-version: 9.20.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-06-08 22:08:47 -07:00
dependabot[bot]	837c6f82a6	build(deps): bump io.netty:netty-codec-http2 from 4.2.13.Final to 4.2.15.Final in /test/java/spark (#9882 ) build(deps): bump io.netty:netty-codec-http2 in /test/java/spark Bumps [io.netty:netty-codec-http2](https://github.com/netty/netty) from 4.2.13.Final to 4.2.15.Final. - [Release notes](https://github.com/netty/netty/releases) - [Commits](https://github.com/netty/netty/compare/netty-4.2.13.Final...netty-4.2.15.Final) --- updated-dependencies: - dependency-name: io.netty:netty-codec-http2 dependency-version: 4.2.15.Final dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-06-08 22:08:25 -07:00
dependabot[bot]	fc6ea8c2da	build(deps): bump io.netty:netty-transport-native-epoll from 4.2.13.Final to 4.2.15.Final in /test/java/spark (#9881 ) build(deps): bump io.netty:netty-transport-native-epoll Bumps [io.netty:netty-transport-native-epoll](https://github.com/netty/netty) from 4.2.13.Final to 4.2.15.Final. - [Release notes](https://github.com/netty/netty/releases) - [Commits](https://github.com/netty/netty/compare/netty-4.2.13.Final...netty-4.2.15.Final) --- updated-dependencies: - dependency-name: io.netty:netty-transport-native-epoll dependency-version: 4.2.15.Final dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-06-08 22:08:14 -07:00
dependabot[bot]	2945f7e226	build(deps): bump io.netty:netty-handler from 4.2.13.Final to 4.2.15.Final in /test/java/spark (#9875 ) build(deps): bump io.netty:netty-handler in /test/java/spark Bumps [io.netty:netty-handler](https://github.com/netty/netty) from 4.2.13.Final to 4.2.15.Final. - [Release notes](https://github.com/netty/netty/releases) - [Commits](https://github.com/netty/netty/compare/netty-4.2.13.Final...netty-4.2.15.Final) --- updated-dependencies: - dependency-name: io.netty:netty-handler dependency-version: 4.2.15.Final dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-06-08 22:08:07 -07:00
7y-9	7bbd28634a	fix(util): return full uint64 randomness (#9864) Problem: RandomUint64 generated eight random bytes but returned int32, truncating the value before mount file and directory handles converted it to uint64. This reduced handle entropy to 32 bits and produced sign-extended handle values. Root cause: the helper cast BytesToUint64 to int32 and exposed int32 as its return type. Fix: make RandomUint64 return uint64 and return the full BytesToUint64 result. Reproduction: go test ./weed/util -run TestRandomUint64ReturnsUint64 -count=1 failed before the fix because RandomUint64() had kind int32. Validation: gofmt -w weed/util/bytes.go weed/util/bytes_test.go; git diff --check; go test ./weed/util -run TestRandomUint64ReturnsUint64 -count=1; go test ./weed/util -count=1; go test ./weed/mount -count=1; git diff --cached --check	2026-06-08 22:07:24 -07:00
dependabot[bot]	4a96f624d9	build(deps): bump gocloud.dev/pubsub/rabbitpubsub from 0.45.0 to 0.46.0 (#9870 ) Bumps [gocloud.dev/pubsub/rabbitpubsub](https://github.com/google/go-cloud) from 0.45.0 to 0.46.0. - [Release notes](https://github.com/google/go-cloud/releases) - [Commits](https://github.com/google/go-cloud/compare/v0.45.0...v0.46.0) --- updated-dependencies: - dependency-name: gocloud.dev/pubsub/rabbitpubsub dependency-version: 0.46.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-06-08 22:06:54 -07:00
Chris Lu	3fadbef3eb	feat(admin): export full cluster volume list as JSON (#9876 ) Adds an "Export All (JSON)" button on the Cluster Volumes page that pulls the whole cluster's volume list from the master in one call, a superset of volume.list. Beyond the table columns it carries garbage and fullness ratios, modified time, compact revision, remote tiering keys, per-disk capacity counts, EC shard sizes with file/delete counts, and a cluster-wide duplicate-volume-id scan. Honors the active collection filter. The existing per-page CSV export stays as "Export Page".	2026-06-08 15:01:02 -07:00
Chris Lu	ed470dccb1	mini: grow volumes one at a time Mini auto-sizes a few large volume slots, but the master pre-grows 7 volumes per new collection. Under a filer group each S3 bucket is its own collection, so the first buckets claimed every slot and later writes failed to assign a volume. Cap mini's volume_growth copy counts to 1.	2026-06-08 14:51:40 -07:00
Chris Lu	d67fc48fbd	fix(filer.sync): guard batched events against nil EventNotification (#9877 ) * fix(filer.sync): guard batched events against nil EventNotification The server folds a backlog into one response: the first event in the top-level fields, the rest in resp.Events, and the pipelined sender can drain an idle heartbeat (nil EventNotification) into that tail. Only the envelope got the freshness-signal guard, so a batched heartbeat reached AddSyncJob and nil-derefed in IsEmpty while replaying a backlog buffered during a peer outage. Route every event, envelope and batched, through one handler that sends freshness signals (nil heartbeat, empty marker) to OnIdleHeartbeat. * fix(filer): guard MetaAggregator batched events against nil EventNotification The peer subscription's envelope is nil-guarded but its batched tail was not. The aggregator doesn't enable idle heartbeats today, so the server can't fold a nil EventNotification into the batch yet, but make the two loops consistent so it can't nil-deref if that changes.	2026-06-08 13:56:16 -07:00
Chris Lu	4c050ad76b	Don't mangle filer paths with the OS separator on Windows (#9878 ) fix: don't mangle filer paths with the OS separator on Windows filepath.Dir/Join use the platform separator, so on Windows they rewrite a forward-slash filer path like /buckets/x into \buckets\x. The mangled value then goes into a filer RPC and operates on the wrong key, so the op silently targets nothing. The admin file browser hit this in New Folder (the entry landed under \buckets\my-bucket and never showed up under /buckets/my-bucket), and the same way in delete, view and properties. MQ topic retention and consumer-offset listing, and the SFTP home dir plus create-permission parent lookup, had the same bug. Switch all of these to the path package, which always uses "/".	2026-06-08 13:56:02 -07:00
Chris Lu	8cc10460b4	fix(remote): correct content and permissions when syncing/caching remote objects (#9879) * fix(remote): reject short reads when caching remote objects A short read from the remote (stale listing size, truncated or flaky response) was silently zero-padded: the S3 and Azure clients pre-size the buffer and discard the downloaded byte count, and the chunk is recorded with the requested size. The cached file then matched the expected size but its tail was NULL, and the entry was marked cached so it never re-fetched. Check the byte count against the requested size in both clients, and add a backend-agnostic guard in FetchAndWriteNeedle. The cache now fails loudly and the entry stays remote-only for a later retry. * fix(remote): match S3 default modes when syncing remote metadata Remote object listings carry no POSIX mode, so synced entries were created with a hardcoded 0644. Against a SeaweedFS remote, whose S3 layer writes objects as 0660 and auto-creates directories as 0771 (0660|0111), the mounted copy ended up 0644/0755 and the permissions visibly diverged from the source. Default to the S3 modes instead (files 0660, directories 0771). The filer derives parent-dir modes from the child as fileMode|0111, so fixing the file default also brings the directories into line. Directory mtimes still reflect sync time: S3 listings don't enumerate directories, so the remote's directory timestamps aren't available.	2026-06-08 13:55:53 -07:00
dependabot[bot]	6475d22774	build(deps): bump github.com/apache/cassandra-gocql-driver/v2 from 2.1.0 to 2.1.1 (#9869 ) build(deps): bump github.com/apache/cassandra-gocql-driver/v2 Bumps [github.com/apache/cassandra-gocql-driver/v2](https://github.com/apache/cassandra-gocql-driver) from 2.1.0 to 2.1.1. - [Release notes](https://github.com/apache/cassandra-gocql-driver/releases) - [Changelog](https://github.com/apache/cassandra-gocql-driver/blob/trunk/CHANGELOG.md) - [Commits](https://github.com/apache/cassandra-gocql-driver/compare/v2.1.0...v2.1.1) --- updated-dependencies: - dependency-name: github.com/apache/cassandra-gocql-driver/v2 dependency-version: 2.1.1 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-06-08 12:20:46 -07:00
dependabot[bot]	feb3adfc21	build(deps): bump github.com/aws/aws-sdk-go-v2 from 1.41.10 to 1.41.12 (#9871 ) Bumps [github.com/aws/aws-sdk-go-v2](https://github.com/aws/aws-sdk-go-v2) from 1.41.10 to 1.41.12. - [Release notes](https://github.com/aws/aws-sdk-go-v2/releases) - [Commits](https://github.com/aws/aws-sdk-go-v2/compare/v1.41.10...v1.41.12) --- updated-dependencies: - dependency-name: github.com/aws/aws-sdk-go-v2 dependency-version: 1.41.12 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-06-08 12:20:11 -07:00
dependabot[bot]	70f18fa4d0	build(deps): bump golang.org/x/sync from 0.20.0 to 0.21.0 (#9868 ) Bumps [golang.org/x/sync](https://github.com/golang/sync) from 0.20.0 to 0.21.0. - [Commits](https://github.com/golang/sync/compare/v0.20.0...v0.21.0) --- updated-dependencies: - dependency-name: golang.org/x/sync dependency-version: 0.21.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-06-08 12:19:10 -07:00
Chris Lu	5a4ff2a122	fix(mq): don't cache topic non-existence on transient filer errors TopicExists and getTopicConfFromCache negative-cached a topic for the full 30s TTL whenever a filer lookup failed for any reason, including timeouts. A topic created earlier then looked gone until the TTL expired, and the metadata auto-create path couldn't heal it (CreateTopic rejects an already-persisted conf), so producers saw UNKNOWN_TOPIC_OR_PARTITION. Only negative-cache on a definitive ErrNotFound; let transient errors fall through and retry against the filer.	2026-06-08 12:04:48 -07:00
7y-9	b408705f5b	fix(s3api): accept HTTP-date conditionals (#9863 ) * fix(s3api): accept HTTP-date conditionals Problem: Object conditional headers rejected valid HTTP-date values in RFC850 or ANSIC format for If-Modified-Since and If-Unmodified-Since. Root cause: parseConditionalHeaders used time.Parse(time.RFC1123), accepting only one HTTP-date representation instead of the standard formats accepted by net/http.ParseTime. Fix: Parse conditional date headers with http.ParseTime so RFC1123, RFC850, and ANSIC HTTP-date forms are accepted. Reproduction: go test ./weed/s3api -run TestParseConditionalHeadersAcceptsHTTPDateFormats -count=1 failed before the fix with ErrInvalidRequest for RFC850 and ANSIC date values. Validation: env GOCACHE=/private/tmp/seaweedfs-go-cache go test ./weed/s3api -run TestParseConditionalHeadersAcceptsHTTPDateFormats -count=1; env GOCACHE=/private/tmp/seaweedfs-go-cache go test ./weed/s3api -count=1; git diff --check; git diff --cached --check * fix(s3api): accept HTTP-date copy-source conditionals Mirror the put-path http.ParseTime switch onto the copy-source If-Modified-Since / If-Unmodified-Since headers, which still rejected valid RFC850 and ANSIC dates. * fix(s3api): keep RFC1123 UTC-zone dates working alongside http.ParseTime http.ParseTime rejects the "UTC" zone that Go clients emit via t.UTC().Format(time.RFC1123), which the old RFC1123 parser accepted. Add a parseHTTPDate helper that tries http.ParseTime first and falls back to RFC1123, so the put and copy-source conditional date headers accept the union of HTTP-date formats plus the UTC zone. --------- Co-authored-by: Chris Lu <chris.lu@gmail.com>	2026-06-08 01:12:07 -07:00
Chris Lu	78da9572ae	4.32	2026-06-07 23:37:57 -07:00
Jaehoon Kim	1b5f1c1f3b	feat(filer.backup): -initialSnapshot re-seeds a reinitialized destination (#9828) * feat(filer.backup): add -resetCheckpoint to force a fresh sync filer.backup resumes from a per-sink offset persisted in the source filer's KV. There was no first-class way to discard that checkpoint and re-run from the beginning short of guessing a large -timeAgo, which also skips -initialSnapshot. Add -resetCheckpoint: before reading the offset, write 0 for this sink so getOffset returns 0, isFreshSync stays true, and -initialSnapshot re-runs a full walk. Effective only when -timeAgo is 0. The flag is cleared after the first successful reset: runFilerBackup retries doFilerBackup forever on error, so leaving it set would re-zero the checkpoint on every retry and never make forward progress after a transient failure. Later retries resume from the persisted checkpoint instead. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(filer.backup): keep fresh-sync intent when offset read fails after reset After -resetCheckpoint writes offset 0, a transient getOffset read-back error flipped isFreshSync to false, which skipped the -initialSnapshot walk the reset explicitly requested. Track that the reset happened this iteration and, on a getOffset error, preserve isFreshSync=true in that case (the non-reset path keeps treating a read error as "not fresh" to avoid re-walking on transients). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(filer.backup): skip offset read-back on reset instead of tracking a flag Replace the didReset bool by branching: on -resetCheckpoint, clear the offset and start fresh without reading it back (we just wrote 0, so the state is known); otherwise read the offset as before. This drops the redundant getOffset RPC after a reset and removes the read-back error case entirely, so no separate flag is needed to preserve isFreshSync. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * filer.backup: -initialSnapshot re-seeds on every start; drop -resetCheckpoint -initialSnapshot now walks the live tree whenever -timeAgo is 0, seeds the destination, and overwrites the saved checkpoint, rather than running only on a fresh sync. That re-seeds a reinitialized destination on its own, so the separate -resetCheckpoint flag is gone. The walk runs once per process: the in-memory flag is cleared after the watermark is persisted, so the retry loop resumes from the persisted checkpoint instead of re-walking on every transient error. A process restart re-walks, so remove the flag once the backup is caught up. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: Chris Lu <chris.lu@gmail.com>	2026-06-07 23:35:53 -07:00
Chris Lu	9053d61504	rust release: fix large-disk/normal binary overwrite + publish md5 checksums (#9862 ) * rust release: publish .md5 checksums alongside weed-volume binaries The versioned rust volume release built and uploaded the tarballs/zips but no checksum sidecars (the Go releases get .md5 automatically via go-release-action; this workflow uses softprops/action-gh-release directly). Generate an .md5 next to each asset (md5sum on linux/windows-bash, md5 -r on macOS) and include them in the release/artifact uploads, so downloaders (e.g. seaweed-up, which verifies md5 before installing weed-volume) can check integrity. Covers linux amd64+arm64, darwin amd64+arm64, windows amd64. * rust release: build large-disk and normal into separate target dirs Both cargo builds wrote to target/<triple>/release/weed-volume, so the second (normal, --no-default-features) overwrote the first, and the Package step then copied that same binary into BOTH tarballs — the large-disk asset actually shipped the normal binary. Build each variant into its own --target-dir (target/large-disk and target/normal, both under target/ so the existing cache still covers them) and copy each tarball's binary from its own dir.	2026-06-07 23:20:33 -07:00
Chris Lu	8a4fdf06c0	admin/maintenance: reload in-flight tasks on startup instead of discarding them (#9857 ) * admin/maintenance: reload in-flight tasks on startup instead of discarding LoadTasksFromPersistence deleted all persisted task files on startup and relied on the scanner to re-detect, so saved task state was never consumed — the persistence was effectively write-only. Reload non-terminal tasks (pending/assigned/in_progress) into the queue, resetting in-flight ones to pending since their worker is gone after a restart (maintenance tasks are idempotent). Terminal task files are dropped; the scanner still backfills anything not persisted. * address review: nil-guard reloaded tasks and SyncTask to ActiveTopology - skip nil entries from LoadAllTaskStates (corrupted state) - re-sync restored tasks with MaintenanceIntegration so ActiveTopology (in-memory, empty on startup) knows about them; otherwise GetNextTask's AssignTask rejects them as unknown and they never get assigned	2026-06-07 22:45:38 -07:00
Chris Lu	7c542128c7	vacuum: compact a read-only volume when an explicit volumeId is given (#9861 ) * vacuum: compact a read-only volume when an explicit volumeId is given The on-demand path no longer skips read-only volumes, so an operator can reclaim a benignly read-only (full/oversized) volume without marking it writable first. The background scan and all-volumes sweep still skip read-only, where the flag usually signals an unhealthy disk. * vacuum: copy locationList under lock for on-demand vacuum The volumeId>0 path passed the live vid2location entry into the async vacuum, where heartbeat-driven Register/UnRegister can mutate the slice concurrently. Snapshot it under accessLock, matching the sweep path.	2026-06-07 22:42:51 -07:00
Chris Lu	a549580e65	ec.balance: verify shard landed on destination before deleting the source (#9858 ) * ec.balance: verify shard(s) landed on the destination before deleting source The EC balance task copied/mounted a shard to the destination and then immediately unmounted+deleted it from the source, reporting success as soon as the RPCs returned. A copy/mount can return OK while the shard isn't actually registered/loadable on the destination, so deleting the source then loses the shard (and the scanner re-issues the same move every cycle). Add a verification step (VolumeEcShardsInfo via VerifyShardsAcrossServers, the same check the EC encode task uses before deleting originals): if the destination doesn't report every moved shard, fail the task and keep the source so the move is retried instead of losing data. * address review: use comma-ok when reading destination shard inventory	2026-06-07 21:31:53 -07:00
7y-9	e6ab9e7b09	fix(s3api): reject zero default retention years (#9860 ) Problem: Default object-lock retention accepted an explicitly provided Years value of zero, even though a default retention period must be positive when present. Root cause: validateDefaultRetention rejected zero Days but only rejected negative Years, leaving YearsSet with Years=0 as a successful validation path. Fix: Treat an explicitly provided zero Years value as ErrInvalidRetentionPeriod, matching the existing Days validation. Reproduction: go test ./weed/s3api -run TestValidateDefaultRetention -count=1 failed before the fix because the Zero years case returned nil. Validation: go test ./weed/s3api -run TestValidateDefaultRetention -count=1; go test ./weed/s3api -count=1; git diff --check; git diff --cached --check	2026-06-07 20:53:45 -07:00
Chris Lu	f9d3105e80	ec placement: spread EC shards evenly across machines, not onto the lowest-id one (#9855 ) * ec placement: steer shards to less-loaded machines, not the lowest id EC encode places every volume against one shared topology snapshot (it reserves the shards it assigns so later volumes see reduced capacity), but node selection ranked only by this volume's shard count and broke ties by sorted id. So the lowest-id machine won the first shard of every volume and accumulated far more total shards than the rest -- on a 6-machine cluster the first machines drifted to ~1.5x. Rank eligible nodes by the machine's shards of this volume, then the machine's free capacity, then the node's shards of this volume, then the node's free capacity. Free capacity reflects the load already placed, so ties steer toward the least-loaded machine instead of the lowest id, keeping total EC shards even across machines. * test: ec.balance converges to even per-machine load from a skew Starts machine 10.0.0.1 at 4 shards/volume and the rest at 2, then runs repeated worker-style capped passes; asserts convergence to an even per-machine total (reaches exactly even in ~13 rounds). * reduce comments on the placement fix Trim narration to the non-obvious why. * test: assert convergence and count zero-shard machines Seed the per-machine map with every host so a fully drained machine still registers, and fail explicitly if balance doesn't converge before the round cap.	2026-06-07 20:45:17 -07:00
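The four-level ranking described in this commit can be sketched as a single comparator. This is a minimal illustration, not the project's code: the `candidate` struct, its field names, and `rankCandidates` are hypothetical, and "free capacity" is modeled as a plain slot count where a larger value means a less-loaded machine or node.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical view of one eligible node for a shard.
type candidate struct {
	NodeID           string
	MachineVolShards int // shards of this volume already on the node's machine
	MachineFreeSlots int // free capacity of the whole machine
	NodeVolShards    int // shards of this volume already on the node
	NodeFreeSlots    int // free capacity of the node
}

// rankCandidates orders candidates best-first: fewest shards of the volume on
// the machine, then most free machine capacity, then fewest shards of the
// volume on the node, then most free node capacity.
func rankCandidates(cs []candidate) {
	sort.SliceStable(cs, func(i, j int) bool {
		a, b := cs[i], cs[j]
		if a.MachineVolShards != b.MachineVolShards {
			return a.MachineVolShards < b.MachineVolShards
		}
		if a.MachineFreeSlots != b.MachineFreeSlots {
			return a.MachineFreeSlots > b.MachineFreeSlots
		}
		if a.NodeVolShards != b.NodeVolShards {
			return a.NodeVolShards < b.NodeVolShards
		}
		return a.NodeFreeSlots > b.NodeFreeSlots
	})
}

func main() {
	cs := []candidate{
		{NodeID: "10.0.0.1:8080", MachineVolShards: 0, MachineFreeSlots: 5, NodeVolShards: 0, NodeFreeSlots: 5},
		{NodeID: "10.0.0.2:8080", MachineVolShards: 0, MachineFreeSlots: 9, NodeVolShards: 0, NodeFreeSlots: 9},
		{NodeID: "10.0.0.3:8080", MachineVolShards: 1, MachineFreeSlots: 20, NodeVolShards: 1, NodeFreeSlots: 20},
	}
	rankCandidates(cs)
	for _, c := range cs {
		fmt.Println(c.NodeID)
	}
}
```

With the sample data, the lowest-id machine no longer wins the tie: 10.0.0.2 ranks first because it has more free capacity, and 10.0.0.3 ranks last because its machine already holds a shard of the volume.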
Chris Lu	89cbb1c558	admin: default -dataDir to "." so maintenance task state persists across restarts (#9856 ) admin: default -dataDir to "." so maintenance task state persists Previously -dataDir defaulted to empty, so the admin ran maintenance in memory only: task state was never saved and maintenance tasks (notably EC balance/rebuild) were re-issued every scan cycle without converging, churning EC shards (moves landed shards without their .ecx index, leaving EC volumes unloadable/missing shards). Default -dataDir to "." (the process working directory, which under the standard systemd unit is the admin's data dir) so state persists out of the box.	2026-06-07 20:45:03 -07:00
Chris Lu	f0d2a0d417	Treat co-located volume servers as one fault domain when balancing and allocating (#9854 ) * admin/topology: carry the volume server address on DiskInfo The planning DiskInfo exposed only the node id, which can be an opaque label rather than ip:port. Record the address too so callers can resolve the physical machine a disk sits on. * ec.balance: spread a volume's shards across machines, not just nodes Volume servers sharing a host are one fault domain, but the within-rack spread treated them as independent nodes, so one box could end up holding more shards of a volume than EC can afford to lose. Add a machine (host) tier between rack and node: the within-rack pass spreads each volume across machines, and the global load phase no longer re-concentrates a volume onto a machine it already sits on. Host defaults to the node id, so clusters with one server per host are unchanged. * ec placement: prefer machines holding fewer of a volume's shards EC allocation and repair picked the least-loaded node in a rack with no regard for which physical machine it sits on, so a volume's shards could pile onto several servers of one box. Rank candidate nodes by their machine's shard count first, then the node's own. The machine is derived from the volume server address carried on DiskInfo, falling back to the node id, matching how the balancer resolves it. * volume.balance: don't move a replica onto a machine already holding one isGoodMove only rejected a move onto the same data node, so two replicas could land on two volume servers of one box and a single machine failure would lose both. Reject a target whose host already holds another replica of the volume. Best-effort: balancing simply skips and tries the next target. * volume allocation: spread same-rack replicas across machines PickNodesByWeight filled the same-rack replica picks by weight alone, so replicas could co-locate on one box. 
Prefer candidates on not-yet-used hosts, falling back when too few distinct machines exist. Data-center and rack tiers have no host, so their ordering is unchanged. * ec.balance: harden machine spread against re-concentration and capped machines Two cases where the machine-aware spread could still leave a volume badly placed: - The global load phase could move a shard of a volume onto a machine that already held it, raising that machine's count and undoing the within-rack spread (a 4/4/3/3 layout could become 3/5/3/3, past parity for 10+4). Limit the load-only fallback to same-machine moves, which leave a machine's count unchanged; cross-machine concentration is no longer allowed for load alone. - The within-rack spread chose a destination machine by free slots alone, so if that machine's only nodes were already at the SameRackCount cap it skipped the move instead of trying another machine. Require a machine to have a node that can actually take the shard before selecting it. * reduce comments across the machine-affinity change Trim narration down to the non-obvious why; one terse line where a block was overkill. * ec.balance: gate machine spread on fault-tolerance feasibility Spreading a volume evenly across machines only helps when there are enough that each can stay within EC's parity tolerance (numMachines >= ceil(total/parity)). With fewer -- or wildly unequal -- machines it can't make a machine loss survivable anyway, and forcing it fights capacity: e.g. a cluster of 12 volume servers on one host and 2 on another would have half of every volume crammed onto the 2-server box. So spread across machines only when it's achievable; otherwise fall back to per-node spread and let capacity/global balancing decide. The global load phase applies the same test: it protects a volume's machine spread (no cross-machine move that raises a machine's count past the source's) only where that spread is achievable, so heterogeneous clusters still level by fullness. 
* ec.balance worker: group servers by host when planning The worker built its planner topology without recording each server's host, so automated ec.balance treated ports on one machine as independent nodes and could concentrate a volume's shards on one physical box. Set the host from the volume server address, matching the shell path. * volume.balance worker: don't move a replica onto a machine holding one The worker compared only node ids, and the replica map dropped the server address, so it could move replicas onto different ports of one machine. Carry the host on ReplicaLocation (from the server address) and reject a target whose host already holds another replica of the volume. Best-effort, matching the shell. * ec.balance: judge machine-spread feasibility by the rack's shards The within-rack and global feasibility checks compared the whole volume's shard count against a rack's machine count, so a rack holding only part of a volume after cross-rack spreading -- e.g. 7 of a 10+4 volume across 2 machines -- was wrongly judged infeasible and fell back to node spread, which could pile 6 shards onto one host, past parity. Gate on the rack's own shard count of the volume instead. * ec.balance: spread a volume's shards across machines by combined count EC recovers from any loss within parity regardless of shard type, so what bounds a machine's exposure is its total shards of the volume, not data and parity separately. Spreading the two independently let each type's remainder land on the same machine -- ceil(d/M)+ceil(p/M) can exceed ceil(total/M), e.g. a 5/3 split where 4/4 was achievable, past parity. Balance the combined count in one pass; disk-level data/parity anti-affinity stays in pickBestDiskOnNode. 
* ec.balance: don't let the imbalance threshold skip an over-parity machine The within-rack spread gated on relative skew ((max-min)/avg > threshold), so a worker threshold of 0.5 skipped an exactly-50%-skewed layout like 5/4/3 for a 10+4 volume, leaving 5 shards -- past parity -- on one machine. The even cap (ceil(shards/groups)) is the real bound and the move loop already sheds only what exceeds it, so drop the threshold gate from the within-rack phase (machine and node): a balanced rack stays a no-op while any over-cap machine is always fixed. * ec.balance: keep the imbalance threshold for the node fallback Dropping the threshold from the whole within-rack phase made the node fallback too eager: it runs only when machine fault tolerance is unachievable, so it is cosmetic load distribution that should defer to the global utilization phase. Without the gate it would, for a one-server-per-host 6/4 split at threshold 0.5, schedule a count move that worsens utilization balance. Restore the threshold there; machine spreading keeps bypassing it, since that bound is durability, not cosmetic skew.	2026-06-07 14:14:45 -07:00
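The arithmetic behind three of the points in this commit (the feasibility gate, the combined data+parity count, and the threshold bypass) can be checked with a few lines of Go. This is a sketch under stated assumptions: the helper names `ceilDiv`, `machineSpreadFeasible`, `separateWorstCase`, and `relativeSkew` are hypothetical and only restate the formulas quoted in the message.

```go
package main

import "fmt"

// ceilDiv returns ceil(a/b) for positive integers.
func ceilDiv(a, b int) int {
	return (a + b - 1) / b
}

// machineSpreadFeasible restates the gate numMachines >= ceil(shards/parity):
// only then can every machine stay within EC's parity tolerance.
func machineSpreadFeasible(shards, parity, machines int) bool {
	return machines >= ceilDiv(shards, parity)
}

// separateWorstCase is the per-machine maximum when data and parity shards
// are spread independently and both remainders land on the same machine.
func separateWorstCase(data, parity, machines int) int {
	return ceilDiv(data, machines) + ceilDiv(parity, machines)
}

// relativeSkew is the (max-min)/avg measure the imbalance threshold used.
func relativeSkew(counts []int) float64 {
	lo, hi, sum := counts[0], counts[0], 0
	for _, c := range counts {
		if c < lo {
			lo = c
		}
		if c > hi {
			hi = c
		}
		sum += c
	}
	avg := float64(sum) / float64(len(counts))
	return float64(hi-lo) / avg
}

func main() {
	// A 10+4 volume needs at least ceil(14/4) = 4 machines.
	fmt.Println(machineSpreadFeasible(14, 4, 4)) // true
	fmt.Println(machineSpreadFeasible(14, 4, 3)) // false
	// A rack holding 7 shards across 2 machines is feasible on its own count.
	fmt.Println(machineSpreadFeasible(7, 4, 2)) // true

	// 5 data + 3 parity shards over 2 machines: separate spreading can reach
	// 5 on one machine, while the combined even cap is 4.
	fmt.Println(separateWorstCase(5, 3, 2), ceilDiv(5+3, 2)) // 5 4

	// A 5/4/3 layout has skew exactly 0.5, so "> 0.5" skips it even though
	// 5 exceeds the even cap ceil(12/3) = 4.
	counts := []int{5, 4, 3}
	fmt.Println(relativeSkew(counts) > 0.5, ceilDiv(12, 3)) // false 4
}
```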
7y-9	25f36cd13d	fix(s3api): require space in v2 auth prefix (#9852 ) * fix(s3api): require space in v2 auth prefix Problem: Signature V2 Authorization headers with a malformed algorithm token such as AWSX... are accepted as if they were AWS ... headers. Root cause: validateV2AuthHeader checks HasPrefix("AWS") but then slices past an assumed trailing space, so an extra character after AWS is skipped and the rest is parsed as credentials. Fix: Require the Authorization header to start with the exact AWS plus space prefix before parsing fields. Reproduction: go test ./weed/s3api -run 'TestValidateV2AuthHeader/algorithm_prefix_without_space\|TestDoesSignV2Match/malformed_auth_-_no_space_after_AWS' -count=1 fails before the fix because AWSXAKIA... is accepted. Validation: go test ./weed/s3api -run 'TestValidateV2AuthHeader/algorithm_prefix_without_space\|TestDoesSignV2Match/malformed_auth_-_no_space_after_AWS' -count=1; go test ./weed/s3api -count=1; git diff --check; git diff --cached --check * Update weed/s3api/auth_signature_v2.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> --------- Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-06-07 11:52:09 -07:00
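The prefix bug described here is easy to reproduce in isolation. The sketch below is not the code in weed/s3api/auth_signature_v2.go; `splitV2Auth` is a hypothetical helper that assumes the usual Signature V2 shape `AWS AccessKeyId:Signature` and shows only the exact-prefix check.

```go
package main

import (
	"fmt"
	"strings"
)

// v2Prefix is the algorithm token plus the mandatory trailing space.
const v2Prefix = "AWS "

// splitV2Auth returns the access key and signature from a V2 Authorization
// header, or ok=false when the header does not start with exactly "AWS ".
func splitV2Auth(header string) (string, string, bool) {
	if !strings.HasPrefix(header, v2Prefix) {
		// Checking only HasPrefix(header, "AWS") would let "AWSX..." through.
		return "", "", false
	}
	rest := strings.TrimPrefix(header, v2Prefix)
	accessKey, signature, found := strings.Cut(rest, ":")
	if !found || accessKey == "" || signature == "" {
		return "", "", false
	}
	return accessKey, signature, true
}

func main() {
	_, _, ok := splitV2Auth("AWS AKIAEXAMPLE:c2lnbmF0dXJl")
	fmt.Println(ok) // true
	_, _, ok = splitV2Auth("AWSXAKIAEXAMPLE:c2lnbmF0dXJl")
	fmt.Println(ok) // false
}
```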
7y-9	99bb5db1e3	fix(needle): use discovered file content type (#9851 ) Problem: Multipart uploads where the first part was a form field and a later part contained the file used the first part's Content-Type for the file metadata. Root cause: After finding a later part with a filename, parseUpload copied data and MD5 from part2 but read Content-Type from the original part variable. Fix: Read Content-Type from the discovered file part. Reproduction: go test ./weed/storage/needle -run TestParseUploadUsesDiscoveredFilePartContentType -count=1 failed before the fix because the parsed MIME type was text/plain instead of application/x-seaweed-test. Validation: go test ./weed/storage/needle -run TestParseUploadUsesDiscoveredFilePartContentType -count=1; go test ./weed/storage/needle -count=1; git diff --check; git diff --cached --check	2026-06-07 11:50:34 -07:00
Chris Lu	058569c77b	operation: index VidCache by map instead of slice (#9853 ) VidCache.cache was a []VidInfo indexed directly by volume id, so caching one volume with a large id grew the backing array to that many entries (each 48 bytes), allocating a zeroed slot for every unused id below it. A single id of 32M cost ~1.5GB resident, plus geometric realloc churn as the append loop doubled the array. Use map[uint32]VidInfo so memory scales with the number of volumes actually cached rather than the largest id seen. Parse ids with ParseUint(.,32) so values outside the uint32 volume-id range are rejected instead of silently wrapping into a key.	2026-06-07 11:46:57 -07:00
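A minimal sketch of the map-keyed cache and the range-checked id parsing described in this commit follows. The `VidInfo` fields, the `vidCache` wrapper, and the method names are hypothetical stand-ins; only `map[uint32]VidInfo` and the 32-bit `ParseUint` call come from the message.

```go
package main

import (
	"fmt"
	"strconv"
)

// VidInfo is a hypothetical stand-in for the cached per-volume record.
type VidInfo struct {
	Locations []string
}

// vidCache keys entries by volume id, so memory scales with the number of
// cached volumes rather than with the largest id seen.
type vidCache struct {
	cache map[uint32]VidInfo
}

func newVidCache() *vidCache {
	return &vidCache{cache: make(map[uint32]VidInfo)}
}

// parseVolumeID rejects values outside the uint32 range instead of letting
// them wrap into an unrelated key.
func parseVolumeID(s string) (uint32, error) {
	id, err := strconv.ParseUint(s, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("invalid volume id %q: %w", s, err)
	}
	return uint32(id), nil
}

// Set stores info under the parsed volume id.
func (c *vidCache) Set(id string, info VidInfo) error {
	vid, err := parseVolumeID(id)
	if err != nil {
		return err
	}
	c.cache[vid] = info
	return nil
}

func main() {
	c := newVidCache()
	// One large id costs one map entry, not a 32M-slot backing array.
	if err := c.Set("32000000", VidInfo{Locations: []string{"10.0.0.1:8080"}}); err != nil {
		fmt.Println("unexpected error:", err)
	}
	fmt.Println(len(c.cache)) // 1

	_, err := parseVolumeID("4294967296") // one past the uint32 maximum
	fmt.Println(err != nil)               // true
}
```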
Chris Lu	755af4adf4	s3: actually bind outbound connections when -ip.bind is set (#9849 ) * s3: set outbound bind IP before the first filer dial Standalone weed s3 dialed the filer for GetFilerConfiguration before SetOutboundLocalIP ran, so that gRPC conn was created with the stock dialer and no source address. gRPC caches conns by address and reuses the original dialer on reconnect, so the s3->filer connection kept leaving from the OS-chosen source for the life of the process even after the bind IP was set a moment later. * grpc: install the outbound-bind dialer unconditionally The dialer was installed only when OutboundLocalAddr was already set at GrpcDial time, baking the source-address decision into the cached conn, so a conn dialed before the bind IP was configured never bound. Install the context dialer always and decide per dial: bind through OutboundDialContext once a source is set, otherwise fall back to the stock net.Dialer so default deployments keep gRPC's dial timeout and keepalive behavior. The bind now applies on the next reconnect regardless of ordering, matching the HTTP transport's unconditional DialContext.	2026-06-07 10:20:58 -07:00
Chris Lu	0e9fc6c5ba	worker: drop ec.balance from the default admin script (#9848 ) The dedicated ec_balance task worker handles EC shard balancing now, so the periodic admin script no longer needs to run it.	2026-06-07 00:55:11 -07:00
Chris Lu	b2127c86f4	admin: show S3 servers under Cluster (#9847 ) * s3: register data center with master on startup * admin: show S3 servers under Cluster * admin: add S3 servers to the dashboard	2026-06-07 00:32:20 -07:00
Chris Lu	01637410e2	test(s3): address review feedback on the versioning suite (#9846 ) - Different-users bucket test: use getNewBucketName() so the bucket carries the tracked prefix and run id and gets swept if the test leaks, instead of an untracked name. - Makefile: clarify that '.' matches the opt-in stress tests but they self-skip without ENABLE_STRESS_TESTS, so they don't execute in the default run. - Versioned list test: guard the Object.Size dereference with require.NotNil.	2026-06-06 20:50:09 -07:00
Chris Lu	d321f9efb4	s3: collapse suspended-versioning deletes onto one null marker (#9845 ) A suspended-versioning DELETE was recorded with createDeleteMarker, which mints a fresh real version id each time, so repeated suspended deletes piled up delete markers instead of overwriting a single null marker as S3 specifies. Record the suspended delete as a 'null' marker with a fixed file name (v_null) and point the latest-version pointer at it explicitly; putSuspendedVersioningObject's existing null-version cleanup removes it on the next suspended PUT, so the object undeletes cleanly and at most one null marker exists. Enabled-versioning deletes are unchanged (still distinct historical markers). Update TestSuspendedVersioningDeleteBehavior to the AWS-correct counts: one null marker after a suspended delete, and the null marker plus one real marker after a re-enabled delete.	2026-06-06 20:49:38 -07:00
Chris Lu	fa9bf58c86	test(s3): make the whole versioning suite pass and gate it in CI (#9844 ) * test(s3): correct bucket-recreate expectations and cover the different-owner case A same-owner CreateBucket on an existing bucket returns BucketAlreadyOwnedByYou (idempotent recreate); the suite expected BucketAlreadyExists, which only applies when the name is owned by someone else. Fix the same-owner cases (plain and Object-Lock) and implement the previously-skipped different-owner test, which now exercises the BucketAlreadyExists path via a second identity. * test(s3): assert the deletion invariant for suspended-versioning delete A suspended-versioning DELETE removes the null version and records a delete marker so the object reads as deleted; the test expected no marker, which would let an older version resurface. Assert that a marker is recorded (and read DeleteMarker through aws.ToBool) rather than an exact count, so it holds whether or not the suspended-marker id/dedup is later collapsed to AWS's single null marker. * test(s3): run the whole versioning suite by default TEST_PATTERN was TestVersioning, which left bucket-creation, suspended-delete and directory/version-listing tests ungated. Default to '.' so every test runs; opt-in stress tests self-skip without ENABLE_STRESS_TESTS and keep their own targets.	2026-06-06 18:38:28 -07:00
Chris Lu	795349d796	test(s3): deref Object.Size in versioned list assertion (#9843 ) TestVersionedObjectListBehavior compared int64 against listedObject.Size, which is *int64, so the assertion always failed on a type mismatch once reached. Dereference it (and in the log line).	2026-06-06 18:02:36 -07:00
Chris Lu	309cb32416	s3: list directory key objects in versioned bucket version listings (#9842 ) ListObjectVersions gated explicit directory objects on Mime == FolderMimeType, but an SDK PutObject of "dir/" carries a default Content-Type (e.g. application/octet-stream), so those directory keys were dropped from the version listing while ListObjectsV2 - which keys off IsDirectoryKeyObject (any non-empty mime) - still showed them. Use the same IsDirectoryKeyObject check so the two listings agree. The directory test's storage-class assertion compared an ObjectStorageClass constant against ObjectVersion.StorageClass (ObjectVersionStorageClass); the values matched but the SDK enum types did not, so it only surfaced once the directories started appearing. Use the matching constant.	2026-06-06 18:02:33 -07:00
Chris Lu	6c1fd3aeab	s3: rescan .versions when the cached latest pointer is missing on a list (#9841 ) * s3: rescan .versions when the cached latest pointer is missing on a list ListObjectsV2 resolves each versioned object's current version from the latest-version pointer cached on the .versions directory entry. When that pointer is absent on the filer serving the list, the object was dropped from the listing. Fall back to a read-only rescan of .versions/ to pick the newest version - the version files are present locally even when the cached pointer is not - so the object still lists. This mirrors the read path's recoverLatestVersionWithoutPointer; the scan loop is shared. Read-only by design: a list can touch many objects, so it does not persist a pointer. * s3: copy scanned Extended before stamping the version id	2026-06-06 18:02:30 -07:00
Chris Lu	9ede92a7cc	filer: replicate RECOMPUTE_LATEST pointer updates to peers (#9840 ) applyRecomputeLatest wrote the .versions latest-version pointer and the demoted prior version's stamp through UpdateEntry without a following NotifyUpdateEvent, so neither change entered the metadata log. Across filers the pointer then lived only on whichever filer ran the mutation, and ListObjects served by any other filer dropped those objects from a versioned bucket. Emit the events the way PATCH_EXTENDED already does, keeping a pre-update image for the notification diff.	2026-06-06 18:02:28 -07:00
Chris Lu	6e16994615	s3: make lifecycle TTL fast path per-bucket opt-in (#9825 ) Stamping an Expiration.Days rule as a volume TTL at write time bakes an irreversible TTL into the object: removing or lengthening the rule later can't un-expire it, unlike worker-driven expiration. The metadata-only delete it enables also skips per-chunk DeleteFile, so dead bytes linger in a not-yet-expired TTL volume with no deleted-byte accounting until the whole volume ages out. Gate the resolver on a per-bucket flag, off by default; toggle with the s3.bucket.lifecycle.fastpath shell command. Default writes take the worker path: real deletes that honor current policy and let vacuum reclaim space.	2026-06-06 11:20:15 -07:00
Aleksei Sviridkin	3688be82f5	fix(helm): deduplicate all-in-one extra environment variables (#9837 ) * fix(helm): deduplicate all-in-one extra environment variables The all-in-one Deployment looped global.seaweedfs.extraEnvironmentVars and allInOne.extraEnvironmentVars in two separate ranges, so any key present in both maps was emitted as two env entries with conflicting values. It also computed a merged map for the cluster-default lookup but never used it for the env loop. Use the existing seaweedfs.mergeExtraEnvironmentVars helper (as the filer, master and s3 templates already do) so a key set in both maps renders once with the component value taking precedence, and add a chart-CI render assertion covering it. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * ci(helm): drop checkmark glyphs from chart test output --------- Signed-off-by: Aleksei Sviridkin <f@lex.la> Co-authored-by: Chris Lu <chris.lu@gmail.com>	2026-06-05 15:31:18 -07:00
Aleksei Sviridkin	ae4ad6859d	fix(helm): suspend bucket versioning for YAML bool false (#9836 ) * fix(helm): suspend bucket versioning for YAML bool false createBuckets[].versioning accepts both a YAML bool and a string. The string branch maps "false"/"disable"/"suspended" to Suspended, but the bool branch only handled true (Enabled) and left false as a silent no-op. The same logical value therefore behaved differently depending on its YAML type: `versioning: false` did nothing while `versioning: "false"` suspended the bucket. Mirror the string behaviour in the bool branch so bool false suspends the bucket, and add a chart-CI render assertion covering it. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la> * ci(helm): trim versioning regression-test comment * chart: document bool false for createBuckets versioning --------- Signed-off-by: Aleksei Sviridkin <f@lex.la> Co-authored-by: Chris Lu <chris.lu@gmail.com>	2026-06-05 15:18:10 -07:00

14132 Commits