SeaweedFS Iceberg Catalog
SeaweedFS provides a built-in Iceberg REST Catalog that can be used with popular analytics engines.
Architecture
The SeaweedFS S3 Tables feature implements the Iceberg REST Catalog API. This allows clients to talk directly to SeaweedFS to manage Iceberg namespaces and tables, while the underlying data files (Parquet, Avro, Metadata JSON) are stored in SeaweedFS S3 buckets.
- Iceberg REST Catalog: Available on a dedicated port (default 8181)
- S3 Data Access: Available on the S3 port (default 8333)
- Authentication: SigV4 (Spark, Trino, RisingWave), OAuth2 (DuckDB, Doris), or unsigned REST + S3 access keys (Dremio)
Catalog and Bucket Relationship
In SeaweedFS, an Iceberg Catalog corresponds 1:1 with a Table Bucket.
- When you configure a client with a URI prefix like http://localhost:8181/v1/my-catalog/, SeaweedFS maps requests to the bucket named my-catalog.
- If no catalog/prefix is provided in the URL (e.g., http://localhost:8181/v1/), it defaults to using a bucket named warehouse.
This architecture allows you to manage multiple independent Iceberg catalogs on the same SeaweedFS cluster simply by creating multiple buckets.
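The mapping rule above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not SeaweedFS code: the helper name `bucket_for_catalog_uri` is hypothetical, and it assumes the input is the base catalog URI configured in a client rather than a full request path.

```python
from urllib.parse import urlparse

DEFAULT_BUCKET = "warehouse"  # bucket used when the URI carries no prefix


def bucket_for_catalog_uri(catalog_uri: str) -> str:
    """Return the table bucket implied by a client's catalog URI.

    Hypothetical helper: it mirrors the documented rule only and is
    not part of SeaweedFS.
    """
    # "/v1/my-catalog/" -> ["v1", "my-catalog"]; "/v1/" -> ["v1"]
    segments = [s for s in urlparse(catalog_uri).path.split("/") if s]
    if not segments or segments[0] != "v1":
        raise ValueError(f"expected a /v1/ catalog URI, got {catalog_uri!r}")
    # The segment after "v1" is the catalog prefix; without one, use the default.
    return segments[1] if len(segments) > 1 else DEFAULT_BUCKET


print(bucket_for_catalog_uri("http://localhost:8181/v1/my-catalog/"))  # my-catalog
print(bucket_for_catalog_uri("http://localhost:8181/v1/"))             # warehouse
```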
Quick Start
1. Start SeaweedFS
weed mini
This starts:
- S3 API on port 8333
- Iceberg REST Catalog on port 8181
2. Create a Table Bucket
weed shell
> s3tables.bucket -create -name my-catalog
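Before wiring up an engine, it can help to confirm that the catalog port answers. The Iceberg REST specification defines a `GET /v1/config` endpoint that clients call first; the sketch below assumes SeaweedFS serves it on the default port 8181 and that anonymous access is enabled. The function names are illustrative, not part of any SeaweedFS tooling.

```python
import json
import urllib.request


def config_request(base_url: str = "http://localhost:8181") -> urllib.request.Request:
    """Build (but do not send) a GET request for the catalog config endpoint."""
    return urllib.request.Request(f"{base_url.rstrip('/')}/v1/config", method="GET")


def fetch_config(base_url: str = "http://localhost:8181") -> dict:
    """Send the request and decode the JSON body; requires a running catalog."""
    with urllib.request.urlopen(config_request(base_url), timeout=5) as resp:
        return json.load(resp)


# Building the request needs no network access.
print(config_request().full_url)  # http://localhost:8181/v1/config
```

Calling `fetch_config()` against a running `weed mini` instance should return the catalog's configuration document if the endpoint is available. A connection error usually means the Iceberg port is disabled or set to a different value.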
3. Connect Your Query Engine
See the integration guide for your engine below.
Client Integrations
| Engine | Auth Method | Guide |
|---|---|---|
| Apache Spark | SigV4 | Spark Iceberg Integration |
| Trino | SigV4 | Trino Iceberg Integration |
| Dremio | S3 access keys (REST source) | Dremio Iceberg Integration |
| DuckDB | OAuth2 | DuckDB Iceberg Integration |
| Apache Doris | OAuth2 | Doris Iceberg Integration |
| RisingWave | SigV4 | RisingWave Iceberg Integration |
| Lakekeeper | STS + SigV4 | Lakekeeper Iceberg Integration |
Metadata Storage
SeaweedFS stores Iceberg metadata using a hybrid approach to maximize performance and compatibility:
Namespaces
Namespace metadata (creation time, properties) is stored as Extended Attributes (xattrs) on the directory corresponding to the namespace in the Filer.
- This ensures lightweight namespace operations.
- The directory structure in the Filer mirrors the namespace hierarchy.
Tables
Table metadata follows the standard Iceberg V2 specification:
- Metadata Location: Stored in the metadata/ subdirectory of the table.
- Data Location: Stored in the data/ subdirectory.
- Format:
  - vN.metadata.json: The table metadata file.
  - snap-*.avro: Snapshot manifest lists.
  - *.avro: Manifest files.
  - *.parquet: Data files.
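As a rough illustration of the naming conventions listed above, the following sketch classifies an object key by its file name. The pattern order matters because `snap-*.avro` also matches `*.avro`. The example paths are made up, and `classify` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Checked in order: the specific "snap-*.avro" must precede the generic "*.avro".
FILE_KINDS = [
    ("v*.metadata.json", "table metadata"),
    ("snap-*.avro", "manifest list"),
    ("*.avro", "manifest"),
    ("*.parquet", "data file"),
]


def classify(path: str) -> str:
    """Map an object key to the Iceberg file kind suggested by its name."""
    name = path.rsplit("/", 1)[-1]  # keep only the final path component
    for pattern, kind in FILE_KINDS:
        if fnmatchcase(name, pattern):
            return kind
    return "unknown"


# Illustrative paths only; real file names are generated by the writing engine.
for key in [
    "metadata/v3.metadata.json",
    "metadata/snap-123.avro",
    "metadata/abc-m0.avro",
    "data/part-0000.parquet",
]:
    print(key, "->", classify(key))
```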
Authentication and Authorization
Authentication Methods
SeaweedFS supports two authentication methods for the Iceberg REST Catalog:
SigV4 (Spark, Trino, RisingWave) — Clients sign each request using AWS Signature Version 4. This is the standard method used by most Iceberg-compatible engines.
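Engines normally handle signing internally, but the core of SigV4 is a chained HMAC-SHA256 key derivation, sketched here for orientation. The region `us-east-1` and the service name `s3tables` are assumptions made for this example, and the secret is a placeholder. Check the integration guide for your engine for the values it expects.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key for one day, region, and service.

    date_stamp is the request date in YYYYMMDD form.
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


# Placeholder credentials; region and service name are assumptions.
key = derive_signing_key("example-secret", "20240101", "us-east-1", "s3tables")
print(len(key))  # 32 bytes, the size of a SHA-256 digest
```

The derived key is then used to sign the canonical request string; that step is omitted here.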
OAuth2 (DuckDB, Doris) — Clients exchange S3 credentials for a bearer token via POST /v1/oauth/tokens using the client_credentials grant type. The S3 access key is used as client_id and the secret key as client_secret.
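The token exchange described above can be sketched as follows. The body is form-encoded, as the Iceberg REST specification describes for this endpoint; whether SeaweedFS accepts other encodings is not covered here. The credentials are placeholders and `token_request` is a hypothetical helper that only builds the request.

```python
import urllib.request
from urllib.parse import urlencode


def token_request(base_url: str, access_key: str, secret_key: str) -> urllib.request.Request:
    """Build a client_credentials token request for the catalog."""
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": access_key,      # S3 access key
        "client_secret": secret_key,  # S3 secret key
    }).encode("ascii")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/oauth/tokens",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder credentials for illustration.
req = token_request("http://localhost:8181", "my-access-key", "my-secret-key")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` against a running catalog should return a JSON document. Under the Iceberg REST specification the bearer token is carried in its `access_token` field.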
Authorization (IAM)
Permissions are managed via S3 Bucket Policies applied to the Table Bucket.
- You can define granular permissions for CreateNamespace, CreateTable, WriteTable, etc.
- Example Policy to allow read-only access:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3tables:ListNamespaces",
        "s3tables:GetTable",
        "s3tables:ListTables"
      ],
      "Resource": "arn:aws:s3tables:region:account:bucket/my-catalog/*"
    }
  ]
}
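If you manage several catalogs, generating the read-only policy per bucket avoids copy-paste mistakes. The sketch below reproduces the example policy above for a given bucket name. The `region` and `account` defaults are the same literal placeholders used in the example, and `read_only_policy` is a hypothetical helper.

```python
import json

# Actions taken from the read-only example policy.
READ_ONLY_ACTIONS = [
    "s3tables:ListNamespaces",
    "s3tables:GetTable",
    "s3tables:ListTables",
]


def read_only_policy(bucket: str, region: str = "region", account: str = "account") -> dict:
    """Build a read-only bucket policy document for one table bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(READ_ONLY_ACTIONS),
                "Resource": f"arn:aws:s3tables:{region}:{account}:bucket/{bucket}/*",
            }
        ],
    }


print(json.dumps(read_only_policy("my-catalog"), indent=2))
```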
Anonymous Access (Development)
If SeaweedFS is running without IAM configuration (e.g., weed mini with no -s3.config), the Iceberg Catalog allows anonymous access by default. This is useful for local development and testing. See each integration page for anonymous configuration details.
Configuration Reference
| Parameter | CLI Flag | Default |
|---|---|---|
| Iceberg REST port | -s3.port.iceberg (mini) / --port.iceberg (standalone) | 8181 |
| S3 port | -s3.port (mini) / --port (standalone) | 8333 |
| Disable Iceberg | Set port to 0 | Enabled |
| IAM config | -s3.config | None (anonymous) |
See Also
- S3 Table Bucket - Creating and managing table buckets
- S3 Tables Security - IAM policies for table access
- S3 Table Bucket Commands - weed shell commands
- Iceberg Table Maintenance - Compaction and cleanup