seaweedfs/seaweed-volume
Chris Lu 2417ba0354 fix(volume): add authentication to destructive gRPC admin endpoints (#8876)
* fix(volume): add authentication to destructive gRPC admin endpoints

Three destructive VolumeServer gRPC endpoints (DeleteCollection,
VolumeDelete, VolumeServerLeave) had no authentication checks, unlike
their HTTP counterparts which are protected by the Guard whitelist.

Add IsWhiteListed(host) to security.Guard and a checkGrpcAdminAuth
helper on VolumeServer that extracts the peer IP from gRPC context and
validates it against the guard whitelist. Gate all three endpoints
behind this check.

* fix(volume): tolerate unparseable gRPC peer address in admin auth check

S3 Filer Group integration tests were failing with
PermissionDenied "bad peer address: address @: missing port in address"
when DeleteCollection ran across the in-process gRPC connection
between filer and volume server — the peer addr surfaces as "@" there
and net.SplitHostPort can't parse it. The check rejected before
IsWhiteListed could exercise its allow-all path for empty-whitelist
deployments.

Hand the raw peer string to IsWhiteListed when SplitHostPort fails.
With no whitelist configured (the test environment's mode) it accepts;
with a whitelist configured the unparseable host won't match anything
and the call still gets denied as it should.

Adds three regression tests for IsWhiteListed pinning the empty-config
allow-all, populated-list reject-unknown, and signing-key-only allow-
all branches that the gRPC admin helper relies on.
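
The whitelist semantics described in the two notes above can be modelled in a few lines. This is a minimal sketch in Rust, not the actual Go or Rust SeaweedFS code: the `Guard` struct, `peer_host`, and `admin_allowed` are hypothetical names, and matching is reduced to exact host strings. It shows the two behaviours the notes rely on: an empty whitelist allows every caller, and an unparseable peer string is passed through raw so that it only succeeds in that allow-all mode.

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Hypothetical stand-in for the guard; exact-match hosts only.
pub struct Guard {
    white_list: HashSet<String>,
}

impl Guard {
    pub fn new(hosts: &[&str]) -> Self {
        Guard {
            white_list: hosts.iter().map(|h| h.to_string()).collect(),
        }
    }

    /// An empty whitelist allows everyone; otherwise the host must be listed.
    pub fn is_whitelisted(&self, host: &str) -> bool {
        self.white_list.is_empty() || self.white_list.contains(host)
    }
}

/// Take the IP from an "ip:port" peer string, or fall back to the raw
/// string when it does not parse (for example the in-process peer "@").
pub fn peer_host(peer: &str) -> String {
    match peer.parse::<SocketAddr>() {
        Ok(addr) => addr.ip().to_string(),
        Err(_) => peer.to_string(),
    }
}

/// Admin calls pass only if the (possibly raw) host is whitelisted.
pub fn admin_allowed(guard: &Guard, peer: &str) -> bool {
    guard.is_whitelisted(&peer_host(peer))
}
```

With no whitelist configured, `admin_allowed(&Guard::new(&[]), "@")` is accepted; with `Guard::new(&["10.0.0.1"])` the same `"@"` peer is denied because it matches no entry.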

* refactor(security): dedup checkWhiteList through IsWhiteListed

The HTTP-side checkWhiteList and the gRPC-side IsWhiteListed had the
same lookup logic in two places; future drift was just a matter of
time. Have checkWhiteList delegate so the membership semantics live
in exactly one function.

Behaviour is unchanged: the new path still returns nil for
isEmptyWhiteList (signing-key-only mode) and still rejects unknown
hosts when a whitelist is configured.

Addresses gemini medium review on PR #8876.

* fix(volume): protect remaining state-altering gRPC admin endpoints

DeleteCollection, VolumeDelete, and VolumeServerLeave were the
truly-destructive endpoints, but AllocateVolume, VolumeMount,
VolumeUnmount, VolumeConfigure, VolumeMarkReadonly, and
VolumeMarkWritable also modify server state and should sit behind
the same whitelist gate. Read-only endpoints (VolumeStatus,
VolumeServerStatus, VolumeNeedleStatus, Ping) stay open.

The check is a no-op when no whitelist is configured (the default),
so existing deployments keep working; operators who lock down their
volume servers via guard.white_list now get consistent coverage.

Addresses gemini security-high review on PR #8876.

* fix(volume): typed peer addr + audit log for gRPC admin auth

Prefer a typed *net.TCPAddr when extracting the peer IP — string
parsing was already a fallback for the in-process case but using the
typed form first is cleaner and skips an unnecessary parse on the
common path. Log failed authorization attempts at V(0) so an operator
running with a whitelist sees the host that was rejected (and the
raw remote address in case the IP lookup itself was the failure
mode), matching what the HTTP Guard already does.

Addresses gemini medium review on PR #8876.

* fix(volume): protect vacuum + scrub + EC-shards-delete admin endpoints

Five more master/admin-driven destructive operations live outside
volume_grpc_admin.go and were missing the same whitelist gate:

- VacuumVolumeCompact, VacuumVolumeCommit, VacuumVolumeCleanup
- ScrubVolume
- VolumeEcShardsDelete

VacuumVolumeCheck stays open (read-only). BatchDelete also stays
open: it's the data-plane multi-object delete called from the S3 API
and filer, not an admin operation; gating it would break ordinary S3
DeleteObjects calls.

Addresses gemini security-high review on PR #8876.
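
For reference, the gated and open endpoints named so far can be collected into one lookup. This is an illustrative sketch only: the constant `ADMIN_GATED_RPCS` and the function `requires_admin_auth` are hypothetical names, and the real servers call the auth helper inside each handler rather than consulting a table. The RPC names themselves are taken from the notes above.

```rust
/// RPC names the notes above place behind the whitelist gate.
pub const ADMIN_GATED_RPCS: [&str; 14] = [
    "DeleteCollection",
    "VolumeDelete",
    "VolumeServerLeave",
    "AllocateVolume",
    "VolumeMount",
    "VolumeUnmount",
    "VolumeConfigure",
    "VolumeMarkReadonly",
    "VolumeMarkWritable",
    "VacuumVolumeCompact",
    "VacuumVolumeCommit",
    "VacuumVolumeCleanup",
    "ScrubVolume",
    "VolumeEcShardsDelete",
];

/// True when the named RPC should run the admin whitelist check.
/// Read-only RPCs and the data-plane BatchDelete are not in the list.
pub fn requires_admin_auth(rpc_name: &str) -> bool {
    ADMIN_GATED_RPCS.iter().any(|gated| *gated == rpc_name)
}
```

Under this sketch, `VolumeStatus`, `VolumeServerStatus`, `VolumeNeedleStatus`, `Ping`, `VacuumVolumeCheck`, and `BatchDelete` all return `false`.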

* fix(volume): simplify no-peer-info branch in gRPC admin auth

The IsWhiteListed("") fallback was defending against a scenario
that doesn't actually arise — real gRPC connections always populate
peer info. Drop the branch and just deny when peer info is missing,
which is the safer default and matches "if we don't know who the
caller is, refuse".
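
A sketch of the resulting decision order, under stated assumptions: the peer is modelled as an `Option<SocketAddr>`, and `AuthError` and `check_admin_peer` are hypothetical names rather than the project's own types. Missing peer info is refused first, regardless of whitelist configuration; only then does the empty-whitelist allow-all rule apply.

```rust
use std::net::{IpAddr, SocketAddr};

/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq)]
pub enum AuthError {
    /// No peer information was attached to the call.
    MissingPeer,
    /// A whitelist is configured and the caller's IP is not on it.
    NotWhitelisted(IpAddr),
}

/// Deny when the caller is unknown, then apply the whitelist.
/// An empty whitelist means "no restriction configured".
pub fn check_admin_peer(
    peer: Option<SocketAddr>,
    white_list: &[IpAddr],
) -> Result<(), AuthError> {
    let addr = peer.ok_or(AuthError::MissingPeer)?;
    if white_list.is_empty() || white_list.contains(&addr.ip()) {
        Ok(())
    } else {
        Err(AuthError::NotWhitelisted(addr.ip()))
    }
}
```

A real handler would map the error to a gRPC PermissionDenied status and log the rejected host, as the audit-log note describes.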

* fix(volume-rust): mirror gRPC admin auth on the rust volume server

The rust volume server has the same set of destructive admin
endpoints as the Go side and the same Guard infrastructure, but
nothing was wired together — every endpoint accepted unauthenticated
calls regardless of guard configuration. Same vulnerability class
the Go fix on this PR closes; this commit closes it on the rust
side too so the two stacks stay aligned.

Adds VolumeGrpcService::check_grpc_admin_auth that pulls the peer
SocketAddr off the tonic Request and runs Guard::check_whitelist on
its IP, then applies the helper to the same set the Go side covers:
DeleteCollection, AllocateVolume, VolumeMount, VolumeUnmount,
VolumeDelete, VolumeMarkReadonly, VolumeMarkWritable,
VolumeConfigure, VacuumVolumeCompact, VacuumVolumeCommit,
VacuumVolumeCleanup, VolumeServerLeave, ScrubVolume,
VolumeEcShardsDelete. Read-only endpoints stay open; BatchDelete
stays open as a data-plane multi-object delete.
2026-05-04 21:14:55 -07:00

SeaweedFS Volume Server (Rust)

A drop-in replacement for the SeaweedFS Go volume server, rewritten in Rust. It uses binary-compatible storage formats (.dat, .idx, .vif) and speaks the same HTTP and gRPC protocols, so it works with an unmodified Go master server.

Building

Requires Rust 1.75+ (2021 edition).

cd seaweed-volume
cargo build --release

The binary is produced at target/release/seaweed-volume.

Running

Start a Go master server first, then point the Rust volume server at it:

# Minimal
seaweed-volume --port 8080 --master localhost:9333 --dir /data/vol1 --max 7

# Multiple data directories
seaweed-volume --port 8080 --master localhost:9333 \
  --dir /mnt/ssd1,/mnt/ssd2 --max 100,100 --disk ssd

# With datacenter/rack topology
seaweed-volume --port 8080 --master localhost:9333 --dir /data/vol1 --max 7 \
  --dataCenter dc1 --rack rack1

# With JWT authentication
seaweed-volume --port 8080 --master localhost:9333 --dir /data/vol1 --max 7 \
  --securityFile /etc/seaweedfs/security.toml

# With TLS (same command; TLS is enabled by the [https.volume] and [grpc.volume] sections of security.toml)
seaweed-volume --port 8080 --master localhost:9333 --dir /data/vol1 --max 7 \
  --securityFile /etc/seaweedfs/security.toml

Common flags

Flag               Default          Description
--port             8080             HTTP listen port
--port.grpc        port+10000       gRPC listen port
--master           localhost:9333   Comma-separated master server addresses
--dir              /tmp             Comma-separated data directories
--max              8                Max volumes per directory (comma-separated)
--ip               auto-detect      Server IP / identifier
--ip.bind          same as --ip     Bind address
--dataCenter                        Datacenter name
--rack                              Rack name
--disk                              Disk type tag: hdd, ssd, or custom
--index            memory           Needle map type: memory, leveldb, leveldbMedium, leveldbLarge
--readMode         proxy            Non-local read mode: local, proxy, redirect
--fileSizeLimitMB  256              Max upload file size
--minFreeSpace     1 (percent)      Min free disk space before marking volumes read-only
--securityFile                      Path to security.toml for JWT keys and TLS certs
--metricsPort      0 (disabled)     Prometheus metrics endpoint port
--whiteList                         Comma-separated IPs with write permission
--preStopSeconds   10               Graceful drain period before shutdown
--compactionMBps   0 (unlimited)    Compaction I/O rate limit
--pprof            false            Enable pprof HTTP handlers

Set RUST_LOG=debug (or trace, info, warn) for log level control. Set SEAWEED_WRITE_QUEUE=1 to enable batched async write processing.
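
Two of the flag defaults above involve a small amount of derivation: --port.grpc falls back to the HTTP port plus 10000, and --dir and --max are parallel comma-separated lists. The sketch below illustrates one way such values could be resolved. It is not the code in src/config.rs: the function names are hypothetical, and both the "single --max value applies to every directory" rule and the error on mismatched counts are assumptions made for the example.

```rust
/// Use the explicit gRPC port if given, else HTTP port + 10000.
/// Returns None if the derived port would not fit in a u16.
pub fn resolve_grpc_port(http_port: u16, grpc_port: Option<u16>) -> Option<u16> {
    grpc_port.or_else(|| http_port.checked_add(10_000))
}

/// Pair each data directory with its volume limit.
pub fn pair_dirs_with_max(dirs: &str, max: &str) -> Result<Vec<(String, u32)>, String> {
    let dirs: Vec<&str> = dirs.split(',').map(str::trim).collect();
    let mut limits: Vec<u32> = Vec::new();
    for raw in max.split(',') {
        let limit = raw
            .trim()
            .parse::<u32>()
            .map_err(|e| format!("bad --max value {:?}: {}", raw, e))?;
        limits.push(limit);
    }
    // Assumption: a single limit is shared by all directories.
    if limits.len() == 1 {
        limits = vec![limits[0]; dirs.len()];
    }
    // Assumption: any other count mismatch is a configuration error.
    if limits.len() != dirs.len() {
        return Err(format!(
            "{} directories but {} --max values",
            dirs.len(),
            limits.len()
        ));
    }
    Ok(dirs.into_iter().map(String::from).zip(limits).collect())
}
```

For the "Multiple data directories" invocation shown earlier, `pair_dirs_with_max("/mnt/ssd1,/mnt/ssd2", "100,100")` yields one `(directory, 100)` pair per directory, and `resolve_grpc_port(8080, None)` gives `Some(18080)`.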

Features

  • Binary compatible -- reads and writes the same .dat/.idx/.vif files as the Go server; seamless migration with no data conversion.
  • HTTP + gRPC -- full implementation of the volume server HTTP API and all gRPC RPCs including streaming operations (copy, tail, incremental copy, vacuum).
  • Master heartbeat -- bidirectional streaming heartbeat with the Go master server; volume and EC shard registration, leader failover, graceful shutdown deregistration.
  • JWT authentication -- signing key configuration via security.toml with token source precedence (query > header > cookie), file_id claims validation, and separate read/write keys.
  • TLS -- HTTPS for the HTTP API and mTLS for gRPC, configured through security.toml.
  • Erasure coding -- Reed-Solomon EC shard management: mount/unmount, read, rebuild, copy, delete, and shard-to-volume reconstruction.
  • S3 remote storage -- FetchAndWriteNeedle reads from any S3-compatible backend (AWS, MinIO, Wasabi, Backblaze, etc.) and writes locally. Supports VolumeTierMoveDatToRemote/FromRemote for tiered storage.
  • Needle map backends -- in-memory HashMap, LevelDB (via rusty-leveldb), or redb (pure Rust disk-backed) needle maps.
  • Image processing -- on-the-fly resize/crop, JPEG EXIF orientation auto-fix, WebP support.
  • Streaming reads -- large files (>1MB) are streamed via spawn_blocking to avoid blocking the async runtime.
  • Auto-compression -- compressible file types (text, JSON, CSS, JS, SVG, etc.) are gzip-compressed on upload.
  • Prometheus metrics -- counters, histograms, and gauges exported at a dedicated metrics port; optional push gateway support.
  • Graceful shutdown -- SIGINT/SIGTERM handling with configurable preStopSeconds drain period.
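
The JWT token source precedence listed above (query > header > cookie) can be sketched as follows. This is an illustration, not the server's implementation: `select_token` is a hypothetical name, the three inputs are assumed to have been extracted from the request already, and treating an empty string as "absent" is an assumption of the sketch.

```rust
/// Treat an empty token as missing (assumption for this sketch).
fn non_empty(token: Option<&str>) -> Option<&str> {
    token.filter(|t| !t.is_empty())
}

/// Pick the first usable token in the order query, header, cookie.
pub fn select_token<'a>(
    query: Option<&'a str>,
    header: Option<&'a str>,
    cookie: Option<&'a str>,
) -> Option<&'a str> {
    non_empty(query)
        .or_else(|| non_empty(header))
        .or_else(|| non_empty(cookie))
}
```

When a request carries a token in more than one place, only the highest-priority one is validated, so `select_token(Some("q"), Some("h"), Some("c"))` returns `Some("q")`.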

Testing

Rust unit tests

cd seaweed-volume
cargo test

Go integration tests

The Go test suite can target either the Go or Rust volume server via the VOLUME_SERVER_IMPL environment variable:

# Run all HTTP + gRPC integration tests against the Rust server
VOLUME_SERVER_IMPL=rust go test -v -count=1 -timeout 1200s \
  ./test/volume_server/grpc/... ./test/volume_server/http/...

# Run a single test
VOLUME_SERVER_IMPL=rust go test -v -count=1 -timeout 60s \
  -run "TestName" ./test/volume_server/http/...

# Run S3 remote storage tests
VOLUME_SERVER_IMPL=rust go test -v -count=1 -timeout 180s \
  -run "TestFetchAndWriteNeedle" ./test/volume_server/grpc/...

Load testing

A load test harness is available at test/volume_server/loadtest/. See that directory for usage instructions and scenarios.

Architecture

The server runs three listeners concurrently:

  • HTTP (Axum 0.7) -- admin and public routers for file upload/download, status, and stats endpoints.
  • gRPC (Tonic 0.12) -- all VolumeServer RPCs from the SeaweedFS protobuf definition.
  • Metrics (optional) -- Prometheus scrape endpoint on a separate port.

Key source modules:

Path                          Description
src/main.rs                   Entry point, server startup, signal handling
src/config.rs                 CLI parsing and configuration resolution
src/server/volume_server.rs   HTTP router setup and middleware
src/server/handlers.rs        HTTP request handlers (read, write, delete, status)
src/server/grpc_server.rs     gRPC service implementation
src/server/heartbeat.rs       Master heartbeat loop
src/storage/volume.rs         Volume read/write/delete logic
src/storage/needle.rs         Needle (file entry) serialization
src/storage/store.rs          Multi-volume store management
src/security.rs               JWT validation and IP whitelist guard
src/remote_storage/           S3 remote storage backend

See DEV_PLAN.md for the full development history and feature checklist.