Weed Worker
The weed worker command starts a plugin worker that connects to an admin server to detect and execute cluster maintenance jobs.
Overview
Workers are distributed maintenance agents that connect to the admin server via a bidirectional gRPC stream. Each worker registers its capabilities, receives detection and execution requests, and reports progress back to the admin scheduler.
Built-in job types:
| Job Type | Category | Description |
|---|---|---|
| vacuum | default | Reclaim disk space by removing deleted files from volumes |
| volume_balance | default | Redistribute volumes across servers to reduce skew |
| admin_script | default | Execute custom admin shell scripts |
| erasure_coding | heavy | Convert volumes to erasure-coded format for storage efficiency |
| iceberg_maintenance | heavy | Compact data, expire snapshots, and remove orphan files for Iceberg tables |
Usage
weed worker [options]
Options
| Option | Default | Description |
|---|---|---|
| -admin | localhost:23646 | Admin server address |
| -id | (auto-generated) | Worker ID (persisted to -workingDir when auto-generated) |
| -jobType | all | Job types or categories to serve (comma-separated) |
| -workingDir | (empty) | Directory for persistent worker state (worker.id) |
| -heartbeat | 15s | Heartbeat interval to admin server |
| -reconnect | 5s | Reconnect delay after disconnection |
| -maxDetect | 1 | Maximum concurrent detection requests |
| -maxExecute | 4 | Maximum concurrent execution requests |
| -metricsPort | 0 | Prometheus metrics listen port (disabled when 0) |
| -metricsIp | 0.0.0.0 | Prometheus metrics listen IP |
| -address | (empty) | Worker address advertised to admin |
| -debug | false | Enable pprof debug server |
| -debug.port | 6060 | pprof debug HTTP port |
Job Type Categories
The -jobType flag accepts a mix of categories and explicit job type names:
| Token | Resolves to |
|---|---|
| all | Every registered job type |
| default | Lightweight jobs: vacuum, volume_balance, admin_script |
| heavy | Resource-intensive jobs: erasure_coding, iceberg_maintenance |
| (explicit name) | A single job type by canonical name or alias |
Categories and explicit names can be combined freely:
# All registered job types (default behavior)
weed worker -admin=localhost:23646 -jobType=all
# Only lightweight maintenance jobs
weed worker -admin=localhost:23646 -jobType=default
# Only resource-intensive jobs on dedicated hardware
weed worker -admin=localhost:23646 -jobType=heavy
# Default category plus a specific heavy job
weed worker -admin=localhost:23646 -jobType=default,iceberg
# Explicit job types (aliases work too)
weed worker -admin=localhost:23646 -jobType=vacuum,ec
Job Type Aliases
Each job type accepts several aliases on the CLI:
| Canonical Name | Aliases |
|---|---|
| vacuum | vol.vacuum, volume.vacuum |
| volume_balance | balance, volume.balance, volume-balance |
| erasure_coding | ec, erasure-coding, erasure.coding |
| admin_script | admin, script, admin-script, admin.script |
| iceberg_maintenance | iceberg, iceberg-maintenance, iceberg.maintenance |
Examples
Basic Usage
# Start worker connecting to local admin server (all job types)
weed worker -admin=localhost:23646
# Connect to remote admin server
weed worker -admin=admin.example.com:23646
# Persist worker ID across restarts
weed worker -admin=localhost:23646 -workingDir=/var/lib/seaweedfs-worker
Specialised Workers
# Dedicated lightweight worker
weed worker -admin=localhost:23646 -jobType=default
# Dedicated heavy worker on beefy hardware
weed worker -admin=localhost:23646 -jobType=heavy -maxExecute=8
# Only vacuum
weed worker -admin=localhost:23646 -jobType=vacuum
# Named worker with specific job type
weed worker -admin=localhost:23646 -id=ec-worker-1 -jobType=erasure_coding
Monitoring
# Enable Prometheus /metrics, /health, /ready endpoints
weed worker -admin=localhost:23646 -metricsPort=9327
# Debug with pprof
weed worker -admin=localhost:23646 -debug -debug.port=6060
Worker Architecture
Worker Lifecycle
- Connect: Worker dials the admin server's gRPC plugin stream
- Hello: Worker sends WorkerHello with its capabilities and supported job types
- Heartbeat: Worker sends periodic heartbeats reporting load
- Detection: Admin sends RunDetectionRequest; worker proposes jobs
- Execution: Admin sends ExecuteJobRequest; worker runs the job and streams progress
- Shutdown: Worker sends WorkerShutdown on SIGINT/SIGTERM
Connection Details
- Protocol: Bidirectional gRPC stream (plugin.proto)
- Port: Admin HTTP port + 10000 by default (e.g., admin on 23646 → gRPC on 33646)
- Auto-discovery: Worker queries GET /api/plugin/status to resolve the gRPC port when it differs from the default offset
- Security: Supports TLS using [grpc.worker] in security.toml
How Scheduling Works
The admin server runs a single scheduler goroutine that processes job types sequentially, one group at a time. Each job type is a group — its detection and all resulting executions complete (or time out) before the next job type begins.
For each group:
- Detection: The scheduler picks a detector worker and sends a RunDetectionRequest. The worker inspects cluster state and proposes work items (e.g., "vacuum volume 42").
- Filtering: The admin server deduplicates proposals against already-active jobs.
- Dispatch: Proposals are converted to jobs and dispatched to executor workers in parallel (up to global_execution_concurrency).
- Execution: Workers run the jobs, stream progress updates, and report completion.
Each group has a configurable max runtime (job_type_max_runtime_seconds, default 30 minutes). If the timeout fires, remaining jobs are canceled and the scheduler moves to the next job type. This prevents a slow job type from starving others.
Each job type also has independent settings for detection interval, concurrency, timeouts, and retries — all editable in the admin UI at /plugin. For a detailed walkthrough, see Plugin Worker Scheduling.
Configuration
Security Configuration
Workers read TLS configuration from security.toml:
[grpc.worker]
cert = "/etc/ssl/worker.crt"
key = "/etc/ssl/worker.key"
ca = "/etc/ssl/ca.crt"
Worker Identification
- Worker ID: Auto-generated (format w-<hostname>-<random>) and persisted to <workingDir>/worker.id
- Explicit ID: Set via -id to override auto-generation
Adding a New Job Type
With the handler registry, adding a new job type requires minimal changes:
- Create the handler file implementing the JobHandler interface
- Add an init() function that calls RegisterHandler with the job type, category, aliases, and build function
If the handler lives in the root plugin/worker package, no other files need to change.
If the handler lives in a subpackage (like plugin/worker/iceberg), add a blank import to the aggregator file plugin/worker/handlers/handlers.go so its init() runs:
import (
_ "github.com/seaweedfs/seaweedfs/weed/plugin/worker/iceberg"
_ "github.com/seaweedfs/seaweedfs/weed/plugin/worker/yourpkg" // add new subpackages here
)
The handler is then automatically available to all workers using all or the matching category.
Best Practices
Deployment
- Separate by category: Run default workers broadly, heavy workers on dedicated nodes with more CPU/memory
- Multiple workers: Deploy multiple workers for redundancy and throughput
- Stable identity: Use -workingDir so worker IDs survive restarts
- Resource sizing: Tune -maxExecute based on available resources
Troubleshooting
- Cannot connect to admin server: Verify address, check network, ensure admin is running, check gRPC port
- No tasks received: Verify -jobType includes the desired job types, check admin scheduler configuration
- TLS failures: Check security.toml paths and certificate validity
- Debug logging: weed worker -admin=... -v=4
Related Commands
- weed admin: Start admin server that manages workers
- weed master: Start master servers
- weed volume: Start volume servers
See Also
- Plugin Worker Scheduling: how the admin server schedules and dispatches work
- Migrate Maintenance Scripts to Admin Script Plugin: migration guide from master.toml maintenance scripts
- Erasure Coding