Table of Contents

Weed Worker

Job Type Aliases

Examples

Basic Usage
Specialised Workers
Monitoring

Worker Architecture

Worker Lifecycle
Connection Details
How Scheduling Works

Configuration

Security Configuration
Worker Identification

Adding a New Job Type
Best Practices

Deployment
Troubleshooting

Related Commands
See Also

Weed Worker

The weed worker command starts a plugin worker that connects to an admin server to detect and execute cluster maintenance jobs.

Overview

Workers are distributed maintenance agents that connect to the admin server via a bidirectional gRPC stream. Each worker registers its capabilities, receives detection and execution requests, and reports progress back to the admin scheduler.

Built-in job types:

Job Type	Category	Description
`vacuum`	default	Reclaim disk space by removing deleted files from volumes
`volume_balance`	default	Redistribute volumes across servers to reduce skew
`admin_script`	default	Execute custom admin shell scripts
`erasure_coding`	heavy	Convert volumes to erasure-coded format for storage efficiency
`iceberg_maintenance`	heavy	Compact Iceberg tables, expire snapshots, and remove orphans

Usage

weed worker [options]

Options

Option	Default	Description
`-admin`	`localhost:23646`	Admin server address
`-id`	(auto-generated)	Worker ID (persisted to `-workingDir` when auto-generated)
`-jobType`	`all`	Job types or categories to serve (comma-separated)
`-workingDir`	(empty)	Directory for persistent worker state (worker.id)
`-heartbeat`	`15s`	Heartbeat interval to admin server
`-reconnect`	`5s`	Reconnect delay after disconnection
`-maxDetect`	`1`	Maximum concurrent detection requests
`-maxExecute`	`4`	Maximum concurrent execution requests
`-metricsPort`	`0`	Prometheus metrics listen port (disabled when 0)
`-metricsIp`	`0.0.0.0`	Prometheus metrics listen IP
`-address`	(empty)	Worker address advertised to admin
`-debug`	`false`	Enable pprof debug server
`-debug.port`	`6060`	pprof debug HTTP port

Job Type Categories

The -jobType flag accepts a mix of categories and explicit job type names:

Token	Resolves to
`all`	Every registered job type
`default`	Lightweight jobs: vacuum, volume_balance, admin_script
`heavy`	Resource-intensive jobs: erasure_coding, iceberg_maintenance
(explicit name)	A single job type by canonical name or alias

Categories and explicit names can be combined freely:

# All registered job types (default behavior)
weed worker -admin=localhost:23646 -jobType=all

# Only lightweight maintenance jobs
weed worker -admin=localhost:23646 -jobType=default

# Only resource-intensive jobs on dedicated hardware
weed worker -admin=localhost:23646 -jobType=heavy

# Default category plus a specific heavy job
weed worker -admin=localhost:23646 -jobType=default,iceberg

# Explicit job types (aliases work too)
weed worker -admin=localhost:23646 -jobType=vacuum,ec
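
The expansion of a mixed -jobType value can be sketched as follows. This is a minimal illustration, not SeaweedFS's actual resolver: the categories map and the expandJobTypes function are hypothetical names, "all" is modelled as the union of the two built-in categories, and alias handling is left out, so explicit tokens are assumed to be canonical names already.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// categories mirrors the category table above. The map layout is an
// assumption made for this sketch, not SeaweedFS's internal registry.
var categories = map[string][]string{
	"default": {"vacuum", "volume_balance", "admin_script"},
	"heavy":   {"erasure_coding", "iceberg_maintenance"},
}

// expandJobTypes turns a comma-separated -jobType value into a sorted,
// de-duplicated list of job type names.
func expandJobTypes(flagValue string) []string {
	seen := map[string]bool{}
	for _, token := range strings.Split(flagValue, ",") {
		token = strings.TrimSpace(token)
		switch {
		case token == "":
			continue
		case token == "all":
			// Modelled here as the union of every known category.
			for _, names := range categories {
				for _, n := range names {
					seen[n] = true
				}
			}
		case categories[token] != nil:
			for _, n := range categories[token] {
				seen[n] = true
			}
		default:
			// Assumed to be an explicit canonical job type name.
			seen[token] = true
		}
	}
	out := make([]string, 0, len(seen))
	for n := range seen {
		out = append(out, n)
	}
	sort.Strings(out)
	return out
}

func main() {
	fmt.Println(expandJobTypes("default,iceberg_maintenance"))
}
```

Collecting names in a set means that overlapping tokens such as default,vacuum do not produce duplicate job types.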

Job Type Aliases

Each job type accepts several aliases on the CLI:

Canonical Name	Aliases
`vacuum`	`vol.vacuum`, `volume.vacuum`
`volume_balance`	`balance`, `volume.balance`, `volume-balance`
`erasure_coding`	`ec`, `erasure-coding`, `erasure.coding`
`admin_script`	`admin`, `script`, `admin-script`, `admin.script`
`iceberg_maintenance`	`iceberg`, `iceberg-maintenance`, `iceberg.maintenance`
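
One way to picture alias handling is a reverse lookup from every accepted spelling to its canonical name. The sketch below copies the table above into a map; aliasTable, buildAliasLookup, and canonicalJobType are illustrative names rather than SeaweedFS functions, and exact (case-sensitive) matching is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// aliasTable copies the alias table above: canonical name -> aliases.
var aliasTable = map[string][]string{
	"vacuum":              {"vol.vacuum", "volume.vacuum"},
	"volume_balance":      {"balance", "volume.balance", "volume-balance"},
	"erasure_coding":      {"ec", "erasure-coding", "erasure.coding"},
	"admin_script":        {"admin", "script", "admin-script", "admin.script"},
	"iceberg_maintenance": {"iceberg", "iceberg-maintenance", "iceberg.maintenance"},
}

// buildAliasLookup inverts aliasTable so that every alias, and every
// canonical name itself, maps to the canonical name.
func buildAliasLookup() map[string]string {
	lookup := map[string]string{}
	for canonical, aliases := range aliasTable {
		lookup[canonical] = canonical
		for _, a := range aliases {
			lookup[a] = canonical
		}
	}
	return lookup
}

// canonicalJobType resolves one CLI token. The boolean is false when the
// token matches neither a canonical name nor an alias.
func canonicalJobType(token string) (string, bool) {
	name, ok := buildAliasLookup()[strings.TrimSpace(token)]
	return name, ok
}

func main() {
	for _, token := range []string{"ec", "volume-balance", "vacuum", "unknown"} {
		name, ok := canonicalJobType(token)
		fmt.Printf("%-15s -> %q (known: %v)\n", token, name, ok)
	}
}
```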

Examples

Basic Usage

# Start worker connecting to local admin server (all job types)
weed worker -admin=localhost:23646

# Connect to remote admin server
weed worker -admin=admin.example.com:23646

# Persist worker ID across restarts
weed worker -admin=localhost:23646 -workingDir=/var/lib/seaweedfs-worker

Specialised Workers

# Dedicated lightweight worker
weed worker -admin=localhost:23646 -jobType=default

# Dedicated heavy worker on beefy hardware
weed worker -admin=localhost:23646 -jobType=heavy -maxExecute=8

# Only vacuum
weed worker -admin=localhost:23646 -jobType=vacuum

# Named worker with specific job type
weed worker -admin=localhost:23646 -id=ec-worker-1 -jobType=erasure_coding

Monitoring

# Enable Prometheus /metrics, /health, /ready endpoints
weed worker -admin=localhost:23646 -metricsPort=9327

# Debug with pprof
weed worker -admin=localhost:23646 -debug -debug.port=6060

Worker Architecture

Worker Lifecycle

Connect: Worker dials the admin server's gRPC plugin stream
Hello: Worker sends WorkerHello with its capabilities and supported job types
Heartbeat: Worker sends periodic heartbeats reporting load
Detection: Admin sends RunDetectionRequest; worker proposes jobs
Execution: Admin sends ExecuteJobRequest; worker runs the job and streams progress
Shutdown: Worker sends WorkerShutdown on SIGINT/SIGTERM

Connection Details

Protocol: Bidirectional gRPC stream (plugin.proto)
Port: Admin HTTP port + 10000 by default (e.g., admin on 23646 → gRPC on 33646)
Auto-discovery: Worker queries GET /api/plugin/status to resolve the gRPC port when it differs from the default offset
Security: Supports TLS using [grpc.worker] in security.toml
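
The default port offset is simple arithmetic on the admin address. The following sketch shows only that default rule; defaultGrpcAddress is a hypothetical helper, and it does not perform the /api/plugin/status lookup that the worker uses when the admin server runs gRPC on a non-default port.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// defaultGrpcAddress applies the default rule described above:
// gRPC port = admin HTTP port + 10000.
func defaultGrpcAddress(adminAddr string) (string, error) {
	host, portStr, err := net.SplitHostPort(adminAddr)
	if err != nil {
		return "", err
	}
	port, err := strconv.Atoi(portStr)
	if err != nil {
		return "", err
	}
	grpcPort := port + 10000
	if grpcPort > 65535 {
		return "", fmt.Errorf("derived gRPC port %d is out of range", grpcPort)
	}
	return net.JoinHostPort(host, strconv.Itoa(grpcPort)), nil
}

func main() {
	addr, err := defaultGrpcAddress("localhost:23646")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(addr) // localhost:33646
}
```

An admin HTTP port above 55535 cannot use the default offset, which is one case where the gRPC port has to be configured separately and discovered by the worker.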

How Scheduling Works

The admin server runs a single scheduler goroutine that processes job types sequentially, one group at a time. Each job type is a group — its detection and all resulting executions complete (or time out) before the next job type begins.

For each group:

Detection: The scheduler picks a detector worker and sends a RunDetectionRequest. The worker inspects cluster state and proposes work items (e.g., "vacuum volume 42").
Filtering: The admin server deduplicates proposals against already-active jobs.
Dispatch: Proposals are converted to jobs and dispatched to executor workers in parallel (up to global_execution_concurrency).
Execution: Workers run the jobs, stream progress updates, and report completion.
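
The filtering step can be sketched as a set lookup. The identity used for deduplication is not specified on this page, so the example assumes a job is identified by its job type plus a target string; the proposal type and filterProposals function are hypothetical.

```go
package main

import "fmt"

// proposal is a stand-in for a detected work item. Treating
// (JobType, Target) as the dedup key is an assumption of this sketch.
type proposal struct {
	JobType string
	Target  string
}

// filterProposals drops proposals that match an already-active job or
// that repeat an earlier proposal in the same batch.
func filterProposals(proposals []proposal, active map[proposal]bool) []proposal {
	seen := map[proposal]bool{}
	var fresh []proposal
	for _, p := range proposals {
		if active[p] || seen[p] {
			continue
		}
		seen[p] = true
		fresh = append(fresh, p)
	}
	return fresh
}

func main() {
	active := map[proposal]bool{
		{JobType: "vacuum", Target: "volume 42"}: true,
	}
	proposals := []proposal{
		{JobType: "vacuum", Target: "volume 42"}, // already running
		{JobType: "vacuum", Target: "volume 43"},
		{JobType: "vacuum", Target: "volume 43"}, // duplicate in batch
	}
	for _, p := range filterProposals(proposals, active) {
		fmt.Println(p.JobType, p.Target)
	}
}
```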

Each group has a configurable max runtime (job_type_max_runtime_seconds, default 30 minutes). If the timeout fires, remaining jobs are canceled and the scheduler moves to the next job type. This prevents a slow job type from starving others.
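
The interaction between the concurrency limit and the per-group timeout can be illustrated with a context deadline and a counting semaphore. This is a simplified model under stated assumptions, not the admin scheduler's code: runGroup, fastJob, slowJob, and runDemo are invented for the example, jobs run in-process instead of being dispatched to workers, and the 50 ms runtime stands in for job_type_max_runtime_seconds.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// job is a stand-in for one dispatched execution.
type job func(ctx context.Context) error

// runGroup runs one job type's jobs with at most `concurrency` in flight
// and cancels whatever is still pending or running once maxRuntime
// elapses. It returns the number of jobs that completed successfully.
func runGroup(jobs []job, concurrency int, maxRuntime time.Duration) int {
	ctx, cancel := context.WithTimeout(context.Background(), maxRuntime)
	defer cancel()

	slots := make(chan struct{}, concurrency) // counting semaphore
	var wg sync.WaitGroup
	var mu sync.Mutex
	completed := 0

dispatch:
	for _, j := range jobs {
		select {
		case slots <- struct{}{}: // a slot is free, dispatch the job
		case <-ctx.Done(): // group timed out, skip the remaining jobs
			break dispatch
		}
		wg.Add(1)
		go func(j job) {
			defer wg.Done()
			defer func() { <-slots }()
			if j(ctx) == nil {
				mu.Lock()
				completed++
				mu.Unlock()
			}
		}(j)
	}
	wg.Wait()
	return completed
}

// fastJob finishes immediately.
func fastJob(ctx context.Context) error { return nil }

// slowJob needs one second but honours cancellation.
func slowJob(ctx context.Context) error {
	select {
	case <-time.After(time.Second):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// runDemo processes two groups strictly one after the other.
func runDemo() []int {
	groups := [][]job{
		{fastJob, fastJob, slowJob}, // the slow job exceeds the limit
		{fastJob},
	}
	results := make([]int, 0, len(groups))
	for _, jobs := range groups {
		results = append(results, runGroup(jobs, 2, 50*time.Millisecond))
	}
	return results
}

func main() {
	fmt.Println(runDemo())
}
```

Because each group gets its own deadline, the slow job in the first group delays the second group by at most the configured runtime rather than indefinitely.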

Each job type also has independent settings for detection interval, concurrency, timeouts, and retries — all editable in the admin UI at /plugin. For a detailed walkthrough, see Plugin Worker Scheduling.

Configuration

Security Configuration

Workers read TLS configuration from security.toml:

[grpc.worker]
cert = "/etc/ssl/worker.crt"
key = "/etc/ssl/worker.key"
ca = "/etc/ssl/ca.crt"

Worker Identification

Worker ID: Auto-generated (format w-<hostname>-<random>) and persisted to <workingDir>/worker.id
Explicit ID: Set via -id to override auto-generation
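
A load-or-create routine along these lines would give the behavior described above. It is a sketch under assumptions: loadOrCreateWorkerID is a hypothetical function, and the length and encoding of the random suffix (4 random bytes, hex-encoded) are guesses, since this page only specifies the w-<hostname>-<random> shape and the worker.id file name.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// loadOrCreateWorkerID returns the ID stored in <workingDir>/worker.id,
// or generates a new one and persists it when the file is absent or empty.
func loadOrCreateWorkerID(workingDir string) (string, error) {
	path := filepath.Join(workingDir, "worker.id")
	if data, err := os.ReadFile(path); err == nil {
		if id := strings.TrimSpace(string(data)); id != "" {
			return id, nil
		}
	} else if !os.IsNotExist(err) {
		return "", err
	}

	hostname, err := os.Hostname()
	if err != nil {
		hostname = "unknown"
	}
	suffix := make([]byte, 4) // suffix size is an assumption
	if _, err := rand.Read(suffix); err != nil {
		return "", err
	}
	id := fmt.Sprintf("w-%s-%s", hostname, hex.EncodeToString(suffix))

	if err := os.MkdirAll(workingDir, 0o755); err != nil {
		return "", err
	}
	if err := os.WriteFile(path, []byte(id+"\n"), 0o644); err != nil {
		return "", err
	}
	return id, nil
}

func main() {
	dir, err := os.MkdirTemp("", "worker-state")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)

	first, _ := loadOrCreateWorkerID(dir)
	second, _ := loadOrCreateWorkerID(dir) // simulates a restart
	fmt.Println("stable across restarts:", first == second)
}
```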

Adding a New Job Type

With the handler registry, adding a new job type requires minimal changes:

Create the handler file implementing the JobHandler interface
Add an init() function that calls RegisterHandler with the job type, category, aliases, and build function

If the handler lives in the root plugin/worker package, no other files need to change.

If the handler lives in a subpackage (like plugin/worker/iceberg), add a blank import to the aggregator file plugin/worker/handlers/handlers.go so its init() runs:

import (
    _ "github.com/seaweedfs/seaweedfs/weed/plugin/worker/iceberg"
    _ "github.com/seaweedfs/seaweedfs/weed/plugin/worker/yourpkg" // add new subpackages here
)

The handler is then automatically available to every worker started with -jobType=all or with the matching category.
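
The registration pattern can be pictured with a toy registry. Everything in this example is hypothetical and collapsed into one file so it runs on its own: the real JobHandler interface and RegisterHandler signature live in the plugin/worker package and will differ, and volume_scrub is an invented job type used only for illustration.

```go
package main

import "fmt"

// JobHandler is a stand-in interface; the real one has a different shape.
type JobHandler interface {
	Execute(target string) error
}

// HandlerSpec bundles the four things the text says are registered:
// job type, category, aliases, and a build function.
type HandlerSpec struct {
	JobType  string
	Category string
	Aliases  []string
	Build    func() JobHandler
}

// registry is a toy replacement for the handler registry.
var registry = map[string]HandlerSpec{}

// RegisterHandler here is a simplified, assumed signature.
func RegisterHandler(spec HandlerSpec) {
	if _, dup := registry[spec.JobType]; dup {
		panic("duplicate job type: " + spec.JobType)
	}
	registry[spec.JobType] = spec
}

// scrubHandler is the hypothetical new handler.
type scrubHandler struct{}

func (scrubHandler) Execute(target string) error {
	fmt.Println("scrubbing", target)
	return nil
}

// init runs when the package is loaded, which is why a subpackage only
// needs a blank import for its handler to become available.
func init() {
	RegisterHandler(HandlerSpec{
		JobType:  "volume_scrub", // invented for this example
		Category: "heavy",
		Aliases:  []string{"scrub"},
		Build:    func() JobHandler { return scrubHandler{} },
	})
}

func main() {
	spec := registry["volume_scrub"]
	if err := spec.Build().Execute("volume 42"); err != nil {
		fmt.Println("error:", err)
	}
}
```

Registering from init() keeps each handler self-describing: the category and aliases sit next to the implementation instead of in a central switch statement.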

Best Practices

Deployment

Separate by category: Run default workers broadly and heavy workers on dedicated nodes with more CPU and memory
Multiple workers: Deploy multiple workers for redundancy and throughput
Stable identity: Use -workingDir so worker IDs survive restarts
Resource sizing: Tune -maxExecute (default 4) to the CPU and memory available on each node

Troubleshooting

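Cannot connect to admin server: Verify the -admin address, check network connectivity, ensure the admin server is running, and confirm that its gRPC port (admin HTTP port + 10000 by default) is reachable
No tasks received: Verify that -jobType includes the desired job types, and check the scheduler configuration in the admin UI at /plugin
TLS failures: Check the [grpc.worker] paths in security.toml and the validity of the certificates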
Debug logging: weed worker -admin=... -v=4

Related Commands

weed admin: Start admin server that manages workers
weed master: Start master servers
weed volume: Start volume servers
