S3 Lifecycle — Architecture

High-level overview of the lifecycle worker. For implementation detail, see weed/s3api/s3lifecycle/DESIGN.md.

At a glance

The lifecycle worker runs as a scheduled job. Each invocation runs the following pipeline:

                ┌──────────────────────────────────────────┐
                │   dailyrun.Run (one filer subscription)  │
                │                                          │
   meta-log ──→ │   reader  ──→  fan-out  ──→  per-shard   │
                │                              channels    │
                │                                          │
                │   ┌──────────────────────────────────┐   │
                │   │   16 shard goroutines            │   │
                │   │   ┌──────────────────────────┐   │   │
                │   │   │  walker(view, shardID)?  │   │   │
                │   │   │  drainShardEvents        │   │   │
                │   │   │  saveCursorAndPublish    │   │   │
                │   │   └──────────────────────────┘   │   │
                │   └──────────────────────────────────┘   │
                │                                          │
                │   summary heartbeat + exit               │
                └──────────────────────────────────────────┘

One filer SubscribeMetadata stream covers every shard in this worker's set. A fan-out goroutine routes events to per-shard channels by ev.ShardID = sha256(bucket || "/" || key) >> 252. Each shard's goroutine independently runs the walker (when due), drains events, and persists its cursor.
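The routing formula can be sketched as follows. This is a minimal illustration, not the worker's actual code: it assumes the SHA-256 digest is read as a big-endian 256-bit integer, so that shifting right by 252 keeps the top four bits and yields one of 16 shard IDs. The function name `shardID` is hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// shardID returns a value in [0, 16). Reading the 32-byte digest as a
// big-endian 256-bit integer (an assumption), ">> 252" keeps the top four
// bits, which are the high nibble of the first byte.
func shardID(bucket, key string) int {
	sum := sha256.Sum256([]byte(bucket + "/" + key))
	return int(sum[0] >> 4)
}

func main() {
	for _, key := range []string{"a.txt", "logs/2024/01.gz", "img/cat.png"} {
		fmt.Printf("%-16s -> shard %02d\n", key, shardID("photos", key))
	}
}
```

Because the shard is a pure function of bucket and key, every event for a given object lands on the same per-shard channel.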

Once every shard's goroutine returns, the worker tears down the subscription, emits a summary heartbeat, and exits.

Per-shard state

Each shard owns a cursor file on the filer at /etc/s3/lifecycle/daily-cursors/shard-NN.json:

TsNs          — last meta-log event for which all matches dispatched successfully
RuleSetHash   — ReplayContentHash of the rule set when this cursor was written
PromotedHash  — PromotedHash(retentionWindow) at write time
LastWalkedNs  — wall-clock of the last successful walker fire

The two hashes together detect every situation that invalidates the cursor: a replay-rule edit (RuleSetHash changes) or a partition flip (PromotedHash changes). On mismatch, the next pass triggers a recovery walk over RecoveryView(snap) to catch already-due objects across the full rule set, then rewinds the cursor.
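A rough sketch of the cursor and the mismatch check described above. The field names come from the list; the JSON tag names, the use of strings for the hashes, and the helper `needsRecovery` are assumptions made for illustration only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Cursor mirrors the four fields listed above.
// JSON tags and the string hash type are assumed, not taken from the source.
type Cursor struct {
	TsNs         int64  `json:"ts_ns"`
	RuleSetHash  string `json:"rule_set_hash"`
	PromotedHash string `json:"promoted_hash"`
	LastWalkedNs int64  `json:"last_walked_ns"`
}

// needsRecovery (hypothetical) reports whether either stored hash differs
// from the hash computed for the current rule set.
func needsRecovery(c Cursor, ruleSetHash, promotedHash string) bool {
	return c.RuleSetHash != ruleSetHash || c.PromotedHash != promotedHash
}

func main() {
	c := Cursor{TsNs: 1700000000000000000, RuleSetHash: "aaa", PromotedHash: "bbb"}
	raw, err := json.Marshal(c)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))
	fmt.Println(needsRecovery(c, "aaa", "bbb")) // hashes match
	fmt.Println(needsRecovery(c, "aaa", "ccc")) // partition flip
}
```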

Replay vs walker

Lifecycle actions split across two paths:

| Path | Action kinds | Why this path |
| --- | --- | --- |
| Replay (meta-log) | `ExpirationDays`, `NoncurrentDays`, `AbortMPU` | DueTime is monotonic in event TsNs. The `done` early-stop works. |
| Walker (bucket list) | `ExpirationDate`, `ExpiredObjectDeleteMarker`, `NewerNoncurrent` | DueTime depends on current sibling/version state, not event age. |

The engine's RulesForShard(shardID, retentionWindow) returns two snapshot views (replay, walk); each is a clone of the base snapshot with the action map masked to the right partition. router.Route consumes the replay view per event; the walker consumes the walk view per bucket.

A rule whose TTL exceeds meta-log retention is promoted to scan-only and moves from replay to walk. The move is visible as a change in PromotedHash.

Cadence layers

Three independent cadences shape worker behavior:

| Cadence | Set by | Default |
| --- | --- | --- |
| Worker invocation | Admin scheduler `DetectionIntervalMinutes` | 1440 (daily) |
| Walker fire | `walker_interval_minutes` admin config | 0 (every invocation) |
| Cursor save | After each `runShard` | n/a |

The walker throttle decouples walker firing from invocation rate. CI invokes the worker every 2s; production invokes once per day. Both can use the same code with appropriate walker_interval_minutes.

Failure model

Worker crash mid-run. The cursor only advances past events whose matches all succeeded, so on restart the next pass resumes at the same cursor. Identity-CAS makes redundant deletes no-ops.
Transient delete failure. The pass halts at the failing event and the cursor is persisted without advancing past it; the next pass retries from there. Head-of-line blocking is intentional: it surfaces real problems instead of silently retrying forever.
Rule edits. Replay-rule edits trigger a one-time recovery walk over RecoveryView. Walker-only rule edits don't change either hash; the walker reads the new rules on its next steady-state fire.
Object overwritten between event and delete. The LifecycleDelete RPC's identity-CAS returns NOOP_RESOLVED, and the cursor advances normally.

Rate limiting

A cluster-wide cap is divided evenly among workers at job dispatch:

per_worker_rate = cluster_deletes_per_second / count(active_s3_lifecycle_workers)

Each worker shares one rate.Limiter across all shard goroutines. dispatchWithRetry calls limiter.Wait(ctx) before each LifecycleDelete RPC.
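The allocation formula is simple enough to state directly in code. The function name `perWorkerRate` and the guard against a zero worker count are assumptions added for the sketch; the result would presumably be the rate handed to the shared limiter.

```go
package main

import "fmt"

// perWorkerRate splits the cluster-wide delete budget evenly across the
// active lifecycle workers. Treating fewer than one worker as one is an
// assumption to avoid division by zero.
func perWorkerRate(clusterDeletesPerSecond float64, activeWorkers int) float64 {
	if activeWorkers < 1 {
		activeWorkers = 1
	}
	return clusterDeletesPerSecond / float64(activeWorkers)
}

func main() {
	fmt.Println(perWorkerRate(1000, 4)) // 250
}
```

Because all 16 shard goroutines in a worker share the one limiter, the per-worker figure is a ceiling on the worker as a whole, not on each shard.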

Components

| Path | Role |
| --- | --- |
| `engine/` | Rule compilation, partition views (`RulesForShard`, `RecoveryView`) |
| `evaluate.go` | Per-event rule evaluation (`EvaluateAction`) |
| `due_at.go` | Per-(rule, kind, info) due-time computation |
| `router/router.go` | Per-event match emission (calls engine.Action and EvaluateAction) |
| `reader/reader.go` | Meta-log subscribe with `ShardPredicate` |
| `bootstrap/walker.go` | Bucket-walker with `RunForShard` filter |
| `dailyrun/run.go` | Main orchestrator: subscription, fan-out, per-shard runShard |
| `dailyrun/cursor.go` | Cursor type + filer JSON serializer |
| `dailyrun/walker_dispatcher.go` | Walker-to-`LifecycleDelete` adapter |

What it's not

Not a streaming dispatcher. The earlier model kept a long-running goroutine per shard with an in-memory match heap. That code is gone. The worker is now "start, do today's work, stop."
Not event-time accurate. Latency from PUT to delete is bounded by the worker invocation cadence plus the walker interval, which is typically up to 24h, not seconds.
Not a general-purpose scheduler. The two action paths (replay, walker) are specific to lifecycle semantics. Don't add new event sources or actions without thinking through which path they belong on.

Why this shape

Each design choice points back to a specific failure mode of the prior streaming worker:

| Choice | Replaces |
| --- | --- |
| Per-pass run + exit | Long-running goroutines with ticker drift, leak risk, restart pain |
| Cursor file per shard | Per-key freeze state, retry counters, in-memory heap on every restart |
| Identity-CAS at dispatch time | Pre-dispatch consistency checks at schedule time, racing object updates |
| Recovery branch over `RecoveryView` | Implicit "is this rule new" tracking with bookkeeping flags |
| Walker throttle independent of invocation | Walker hammering filer when test driver invokes every 2s |
| Single subscription per pass | 16x filer load with 16 per-shard subscriptions |

The result is a worker the operator can reason about by reading two metrics and a heartbeat line, with a state machine small enough to fit in one design doc.
