When managing large clusters, it's common to add more volume servers, have some servers go down, or replace others. These changes can lead to missing volume replicas or an uneven distribution of volumes across the servers.
Optimize volumes
See the Optimization page for how to optimize for concurrent writes and concurrent reads.
Configure volume management scripts
Maintenance scripts are managed by the admin script plugin worker. Start the admin server and a worker:
# Start admin server (connects to master)
weed admin -master=localhost:9333
# Start worker (connects to admin server)
weed worker -admin=localhost:23646
The admin script plugin has a built-in default script:
ec.balance -apply
fs.log.purge -daysAgo=7
volume.deleteEmpty -quietFor=24h -apply
volume.fix.replication -apply
s3.clean.uploads -timeAgo=24h
The script and run interval (default: 17 minutes) are configurable from the admin UI at /plugin.
Several commands that were previously part of the maintenance script now have dedicated plugin workers:
- ec.encode is replaced by the erasure_coding plugin worker. See Erasure Coding for warm storage for details.
- volume.balance is replaced by the volume_balance plugin worker, which detects imbalanced servers and moves volumes automatically.
See the Worker page for more details on weed worker options and capabilities.
Legacy note: Previously, maintenance scripts were configured in master.toml under [master.maintenance]. That mechanism still exists as a fallback but is automatically skipped when an admin server is connected. When migrating, the admin server automatically imports your master.toml maintenance scripts as the default admin script configuration. See Migrate Maintenance Scripts to Admin Script Plugin for details.
Fix missing volumes
When running large clusters, it is common for some volume servers to be down. If a volume is replicated and one replica is missing, the volume is marked as readonly.
One way to fix this is to find a healthy copy and replicate it to other servers until the replication requirement is met. The volume is then marked as writable again.
In weed shell, the command volume.fix.replication automates this process. You can set up a crontab job to run volume.fix.replication periodically to keep the system healthy.
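To make the check concrete, the sketch below finds volumes that have fewer copies than their replication setting requires. It is an illustration only, not SeaweedFS code: the function names and the input dictionaries are hypothetical, and it assumes that a three-digit replication string such as 001 calls for one copy plus the sum of its digits.

```python
def required_copies(replication: str) -> int:
    """Total copies implied by a replication string (assumed: 1 + sum of digits)."""
    return 1 + sum(int(digit) for digit in replication)


def under_replicated(locations: dict[int, list[str]],
                     replication: dict[int, str]) -> dict[int, int]:
    """Return {volume_id: number_of_missing_copies} for volumes short of replicas."""
    missing = {}
    for volume_id, servers in locations.items():
        shortfall = required_copies(replication[volume_id]) - len(servers)
        if shortfall > 0:
            missing[volume_id] = shortfall
    return missing


# Hypothetical cluster state: volume 2 has lost one of its two copies.
locations = {
    1: ["10.0.0.1:8080", "10.0.0.2:8080"],
    2: ["10.0.0.1:8080"],
    3: ["10.0.0.3:8080"],
}
replication = {1: "001", 2: "001", 3: "000"}
print(under_replicated(locations, replication))
```

Each volume reported here would need that many extra copies, made from a healthy replica, before it could become writable again.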
Balance volumes
When running large clusters, it is common to add more volume servers, to have some volume servers go down, or to replace volume servers. These topology changes can leave volume servers with an unbalanced number of volumes.
In weed shell, the command volume.balance will generate a balancing plan, and volume.balance -force will execute the balancing plan and move the actual volumes.
The balancing plan tries to spread the number of writable and readonly volumes evenly across volume servers, following this pseudocode:
for each type of volume server (different max volume count limit) {
  for each collection {
    balanceWritableVolumes()
    balanceReadOnlyVolumes()
  }
}

func balanceWritableVolumes() {
  idealWritableVolumes = totalWritableVolumes / numVolumeServers
  for {
    sort all volume servers by their number of local writable volumes
    pick the volume server A with the lowest number of writable volumes x
    pick the volume server B with the highest number of writable volumes y
    if y > idealWritableVolumes and x+1 <= idealWritableVolumes {
      if B has a writable volume id v that A does not have {
        // B is the fullest server, so the volume moves from B to A
        move writable volume v from B to A
      }
    } else {
      break // the servers are already balanced
    }
  }
}

func balanceReadOnlyVolumes() {
  // similar to balanceWritableVolumes
}
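The pseudocode above can be turned into a small runnable sketch. This is an illustration under stated assumptions rather than the actual SeaweedFS implementation: the function name, the dictionary of server name to writable volume ids, and the choice to stop when no movable volume exists are all hypothetical.

```python
def balance_writable_volumes(servers: dict[str, set[int]]) -> list[tuple[int, str, str]]:
    """Greedy balancing sketch.

    Mutates `servers` in place and returns the planned moves as
    (volume_id, source_server, target_server) tuples.
    """
    moves = []
    if not servers:
        return moves
    ideal = sum(len(volumes) for volumes in servers.values()) / len(servers)
    while True:
        # Order servers from fewest to most writable volumes.
        ordered = sorted(servers, key=lambda name: len(servers[name]))
        emptiest, fullest = ordered[0], ordered[-1]
        x, y = len(servers[emptiest]), len(servers[fullest])
        if not (y > ideal and x + 1 <= ideal):
            break  # no move would improve the balance
        # Only volumes that the emptiest server does not already hold can move.
        candidates = sorted(servers[fullest] - servers[emptiest])
        if not candidates:
            break  # assumption: give up instead of trying another server pair
        volume_id = candidates[0]
        servers[fullest].remove(volume_id)
        servers[emptiest].add(volume_id)
        moves.append((volume_id, fullest, emptiest))
    return moves


# Hypothetical cluster: server "a" holds almost every writable volume.
cluster = {"a": {1, 2, 3, 4, 5}, "b": {6}, "c": set()}
for volume_id, source, target in balance_writable_volumes(cluster):
    print(f"move volume {volume_id} from {source} to {target}")
```

Each move takes a volume from a server above the ideal count to one that stays at or below the ideal after receiving it, so the gap between the two servers shrinks on every iteration and the loop terminates.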
Add volumes
In weed shell, run volume.mount -node <host>:<port> -volumeId <id> to mount a volume file.
To mount all new volume files, send a hang-up signal to the volume server to trigger a reload, for example with pkill -HUP -f "weed volume".
Servicing live volumes
When dealing with hardware storage issues, it can be useful to prevent writes to volume servers without stopping the service altogether, for example on volumes with RAID storage backends. Volume servers support a maintenance mode for this: when enabled, the server becomes read-only. Reads still succeed, but any write attempt returns an error.
Maintenance mode can be managed via the volumeServer.state shell command:
> volumeServer.state
192.168.10.111:9007 -> Maintenance mode: no
192.168.10.111:9008 -> Maintenance mode: no
192.168.10.111:9009 -> Maintenance mode: no
> volumeServer.state --nodes 192.168.10.111:9009 --maintenanceOn
192.168.10.111:9009 -> Maintenance mode: yes
> volumeServer.state
192.168.10.111:9007 -> Maintenance mode: no
192.168.10.111:9008 -> Maintenance mode: no
192.168.10.111:9009 -> Maintenance mode: yes
Maintenance mode is a sticky server state flag. Changes are effective immediately, and will persist even if the server is restarted.
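Because the flag persists across restarts, it can help to check periodically which servers are still in maintenance mode. The sketch below parses text in the format shown in the volumeServer.state listing above. The function name is hypothetical, and the parser assumes the output format stays exactly as shown.

```python
def nodes_in_maintenance(output: str) -> list[str]:
    """Return the nodes whose state line ends with 'Maintenance mode: yes'."""
    nodes = []
    for line in output.splitlines():
        node, separator, state = line.partition(" -> Maintenance mode: ")
        # Lines without the separator (prompts, blank lines) are skipped.
        if separator and state.strip() == "yes":
            nodes.append(node.strip())
    return nodes


sample = """\
192.168.10.111:9007 -> Maintenance mode: no
192.168.10.111:9008 -> Maintenance mode: no
192.168.10.111:9009 -> Maintenance mode: yes
"""
print(nodes_in_maintenance(sample))
```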