Trino Iceberg Integration
Trino connects to SeaweedFS Iceberg tables using the iceberg connector with the rest catalog type and SigV4 authentication.
Prerequisites
- Trino 4xx+ with the Iceberg connector
- SeaweedFS started as shown in Setup below
Setup
Start weed mini with the S3 credentials and the name of the table bucket to pre-create supplied through environment variables:
export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE
export AWS_SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
export S3_TABLE_BUCKET=my-table-bucket
weed mini -dir ~/data
This brings up:
- the Iceberg REST Catalog on http://localhost:8181
- the S3 endpoint on http://localhost:8333
- an admin S3 identity that uses the AWS environment variables (the same credentials Trino uses for SigV4 below)
- the pre-created table bucket my-table-bucket
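Before pointing Trino at these endpoints, it can help to confirm that both ports accept connections. The following is a minimal sketch, not part of SeaweedFS or Trino: the helper name port_open is hypothetical, and it only checks TCP reachability on the ports quoted above, not authentication or catalog health.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # Ports taken from the setup described above.
    for name, port in (("Iceberg REST Catalog", 8181), ("S3 endpoint", 8333)):
        state = "reachable" if port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

A port that is reachable only shows that the listener is up; a misconfigured credential would still surface later, when Trino issues its first signed request.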
Configuration
Create a catalog properties file at etc/catalog/iceberg.properties:
connector.name=iceberg
iceberg.catalog.type=rest
iceberg.rest-catalog.uri=http://localhost:8181
iceberg.rest-catalog.warehouse=s3://my-table-bucket
# File format
iceberg.file-format=PARQUET
iceberg.unique-table-location=true
# SigV4 authentication for the REST catalog
iceberg.rest-catalog.security=SIGV4
# S3 FileIO configuration
fs.native-s3.enabled=true
s3.endpoint=http://localhost:8333
s3.path-style-access=true
s3.signer-type=AwsS3V4Signer
s3.aws-access-key=AKIAIOSFODNN7EXAMPLE
s3.aws-secret-key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
s3.region=us-east-1
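To avoid copying the access key and secret by hand, the properties file above can be generated from the same environment variables that weed mini reads. This is a sketch under stated assumptions: the function name render_catalog_properties and its parameters are hypothetical, and the property keys are exactly the ones listed in the configuration above.

```python
import os


def render_catalog_properties(env=None,
                              bucket="my-table-bucket",
                              catalog_uri="http://localhost:8181",
                              s3_endpoint="http://localhost:8333",
                              region="us-east-1"):
    """Return the text of etc/catalog/iceberg.properties.

    env is any mapping holding AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY;
    it defaults to the process environment. A missing key raises KeyError.
    """
    env = os.environ if env is None else env
    lines = [
        "connector.name=iceberg",
        "iceberg.catalog.type=rest",
        f"iceberg.rest-catalog.uri={catalog_uri}",
        f"iceberg.rest-catalog.warehouse=s3://{bucket}",
        "iceberg.file-format=PARQUET",
        "iceberg.unique-table-location=true",
        "iceberg.rest-catalog.security=SIGV4",
        "fs.native-s3.enabled=true",
        f"s3.endpoint={s3_endpoint}",
        "s3.path-style-access=true",
        "s3.signer-type=AwsS3V4Signer",
        f"s3.aws-access-key={env['AWS_ACCESS_KEY_ID']}",
        f"s3.aws-secret-key={env['AWS_SECRET_ACCESS_KEY']}",
        f"s3.region={region}",
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned string to etc/catalog/iceberg.properties keeps the Trino credentials in step with the identity weed mini created from the same variables.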
Recommended: iceberg.unique-table-location=true
Keep Trino's default iceberg.unique-table-location=true (already shown in the configuration above). With it enabled, each new table is created at a unique path s3://<bucket>/<namespace>/<tableName>-<UUID>/. The UUID suffix gives you three operational properties worth keeping:
- Orphan-resistant DROP+CREATE: if a DROP TABLE cleanup partially fails (filer error, crash mid-cleanup), a fresh CREATE TABLE with the same name lands at a new path and is not blocked or polluted by leftover files.
- Crash-safe aborted creates: a CREATE TABLE that wrote partial state but failed to commit leaves debris under one UUID; the retry uses a new UUID and proceeds cleanly.
- Defense-in-depth against client-side empty-location pre-flight checks: Trino runs listFiles(<location>).hasNext() before commit, and so do some other engines. With unique paths the check sees a fresh empty directory regardless of any historical state at the deterministic name.
If you prefer the cleaner s3://<bucket>/<namespace>/<tableName>/ layout (one directory per table, no <tableName>-<UUID>/ sibling), setting iceberg.unique-table-location=false is safe with SeaweedFS, but you give up the resistance properties above. Pick whichever trade-off fits your operational model.
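The two layouts can be illustrated with a small sketch. This is not Trino's implementation: the function table_location is hypothetical, and the exact encoding of the UUID suffix is an assumption. It only shows why a retried or recreated table never lands on an old path when the suffix is enabled.

```python
import uuid


def table_location(bucket, namespace, table, unique=True):
    """Build an illustrative table location in either layout."""
    base = f"s3://{bucket}/{namespace}/{table}"
    if unique:
        # A fresh random suffix per call, so every create gets its own path.
        # Assumption: the suffix format here is for illustration only.
        return f"{base}-{uuid.uuid4().hex}/"
    # Deterministic layout: the same name always maps to the same path.
    return f"{base}/"


first = table_location("my-table-bucket", "my_namespace", "events")
retry = table_location("my-table-bucket", "my_namespace", "events")
print(first != retry)  # unique layout: a retry never reuses the first path
print(table_location("my-table-bucket", "my_namespace", "events", unique=False))
```

With unique=False, a second create of the same table name resolves to the same directory, so leftover files from a failed drop or an aborted create would be visible to the pre-flight check.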
Multi-Level Namespaces
To use nested namespaces (e.g., db.schema), add:
iceberg.rest-catalog.nested-namespace-enabled=true
Example SQL
Schema Operations
-- Create a schema (maps to an Iceberg namespace)
CREATE SCHEMA IF NOT EXISTS iceberg.my_namespace;
-- List schemas
SHOW SCHEMAS FROM iceberg;
Table Operations
-- Create a table
CREATE TABLE iceberg.my_namespace.events (
id INTEGER,
event VARCHAR,
ts TIMESTAMP(6)
) WITH (
format = 'PARQUET'
);
-- Insert data
INSERT INTO iceberg.my_namespace.events VALUES
(1, 'click', TIMESTAMP '2024-01-15 10:30:00'),
(2, 'view', TIMESTAMP '2024-01-15 11:00:00');
-- Query
SELECT * FROM iceberg.my_namespace.events;
-- Inspect data files
SELECT file_path FROM iceberg.my_namespace."events$files" LIMIT 5;
Multi-Level Namespace Example
CREATE SCHEMA IF NOT EXISTS iceberg."analytics.web";
CREATE TABLE iceberg."analytics.web".pageviews (
id INTEGER,
url VARCHAR,
ts TIMESTAMP(6)
) WITH (
format = 'PARQUET'
);
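The quoted schema name above stands for a two-level Iceberg namespace. As a rough illustration, assuming the dot-separated name is split into levels and that the REST catalog path joins those levels with the unit separator byte 0x1F as described in the Iceberg REST specification, the mapping can be sketched as follows. Both helper names are hypothetical.

```python
from urllib.parse import quote


def namespace_levels(schema_name):
    """Split a dot-separated schema name into namespace levels."""
    return schema_name.split(".")


def rest_namespace_path(schema_name):
    """Join the levels with 0x1F and percent-encode them for a URL path."""
    joined = "\x1f".join(namespace_levels(schema_name))
    return quote(joined, safe="")


print(namespace_levels("analytics.web"))
print(rest_namespace_path("analytics.web"))
```

A single-level name such as my_namespace passes through unchanged, which is why the earlier examples work without the nested-namespace property.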
Anonymous Access
When SeaweedFS runs without IAM, remove the SigV4 and credential properties:
connector.name=iceberg
iceberg.catalog.type=rest
iceberg.rest-catalog.uri=http://localhost:8181
iceberg.rest-catalog.warehouse=s3://my-table-bucket
fs.native-s3.enabled=true
s3.endpoint=http://localhost:8333
s3.path-style-access=true
s3.region=us-east-1
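If an authenticated properties file already exists, the anonymous variant can be derived by dropping the SigV4 and credential keys. A minimal sketch, assuming the set of keys to remove is exactly the ones absent from the anonymous example above; the names AUTH_KEYS and strip_auth are hypothetical.

```python
# Keys present in the authenticated config but absent from the anonymous one.
AUTH_KEYS = {
    "iceberg.rest-catalog.security",
    "s3.signer-type",
    "s3.aws-access-key",
    "s3.aws-secret-key",
}


def strip_auth(properties_text):
    """Return properties_text without the SigV4 and credential properties."""
    kept = []
    for line in properties_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key in AUTH_KEYS:
            continue  # drop authentication-related settings
        kept.append(line)
    return "\n".join(kept) + "\n"
```

Comment lines and all other properties pass through untouched, so the result still carries any file-format or table-location settings from the original file.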
See Also
- SeaweedFS Iceberg Catalog - Architecture and concepts
- S3 Table Bucket - Managing table buckets