I will set up a PostgreSQL Iceberg catalog with Spark and Trino
.NET, C#, ETL pipelines
About this Gig
I'll set up a fully Dockerized Apache Iceberg catalog backed by PostgreSQL through Iceberg's JDBC catalog, ready to connect to Apache Spark and Trino. This lightweight setup is well suited to building realistic lakehouse prototypes without running Hive Metastore or Nessie.
You'll get (based on your selected package):
- Docker Compose setup with PostgreSQL and Apache Iceberg
- JDBC catalog integration for Spark and Trino
- Optional ingestion and PySpark support (Premium tier)
- Sample Iceberg table and cross-engine query examples
- Modular structure with complete documentation
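For orientation, the Spark side of a JDBC catalog comes down to a handful of session settings. The sketch below is only illustrative and is not the delivered configuration: the catalog name `lakehouse`, the host `postgres`, the database and credentials `iceberg`, and the `/warehouse` path are all placeholder assumptions, while the property keys follow the Iceberg JDBC catalog documentation.

```python
def spark_jdbc_catalog_conf(catalog, jdbc_url, user, password, warehouse):
    """Build Spark settings that register an Iceberg catalog backed by JDBC."""
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        # Enables Iceberg's SQL extensions (MERGE INTO, CALL procedures, ...).
        "spark.sql.extensions":
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        # Registers the catalog name with Spark.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        # Selects the JDBC catalog implementation instead of Hive or REST.
        f"{prefix}.catalog-impl": "org.apache.iceberg.jdbc.JdbcCatalog",
        f"{prefix}.uri": jdbc_url,
        f"{prefix}.jdbc.user": user,
        f"{prefix}.jdbc.password": password,
        # Location where table data and metadata files are written.
        f"{prefix}.warehouse": warehouse,
    }


# Placeholder values (assumptions for illustration only).
conf = spark_jdbc_catalog_conf(
    catalog="lakehouse",
    jdbc_url="jdbc:postgresql://postgres:5432/iceberg",
    user="iceberg",
    password="iceberg",
    warehouse="/warehouse",
)

# Print the settings in the form accepted by spark-submit / spark-sql.
for key, value in conf.items():
    print(f"--conf {key}={value}")
```

The same dictionary can be applied in PySpark by calling `SparkSession.builder.config(key, value)` for each entry, provided the Iceberg Spark runtime and the PostgreSQL JDBC driver are on the classpath.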
You can use this for:
- Lightweight local or cloud-based Iceberg development
- Sharing a metadata catalog between Spark and Trino
- Prototyping JDBC-compatible lakehouse setups
- Teaching or demoing catalog behavior without Hive
- Simplifying metadata workflows for data engineers
Everything is modular, minimal, and dev-friendly.
Please note:
- Deliverables depend on the selected package
- Custom offers are available - just message me!
- Two follow-up messages for clarification are included after delivery
- You are responsible for running and testing the setup in your own environment
FAQ
Do I need Hive Metastore for this setup?
No, this setup uses PostgreSQL as the catalog backend via JDBC. Hive is not required at all.
Can I query the same Iceberg tables from both Spark and Trino?
Yes, the JDBC catalog allows Spark and Trino to share a single PostgreSQL-backed metadata store, as long as both engines point at the same database and can reach the same warehouse storage.
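As a rough illustration of the Trino side, the sketch below renders a catalog properties file for the Iceberg connector in JDBC mode. The property keys are taken from the Trino Iceberg connector documentation; the catalog name `lakehouse`, the host, the credentials, and the warehouse path are placeholder assumptions, and the helper function itself is hypothetical. One assumption worth stating: the `catalog-name` value generally needs to match the catalog name used on the Spark side, since the JDBC catalog records tables per catalog name.

```python
def trino_iceberg_properties(catalog_name, jdbc_url, user, password, warehouse):
    """Render a Trino catalog properties file for an Iceberg JDBC catalog."""
    props = {
        "connector.name": "iceberg",
        "iceberg.catalog.type": "jdbc",
        # Should match the catalog name used by the other engine (assumption).
        "iceberg.jdbc-catalog.catalog-name": catalog_name,
        "iceberg.jdbc-catalog.driver-class": "org.postgresql.Driver",
        "iceberg.jdbc-catalog.connection-url": jdbc_url,
        "iceberg.jdbc-catalog.connection-user": user,
        "iceberg.jdbc-catalog.connection-password": password,
        "iceberg.jdbc-catalog.default-warehouse-dir": warehouse,
    }
    return "\n".join(f"{key}={value}" for key, value in props.items()) + "\n"


# Placeholder values (assumptions for illustration only).
properties_text = trino_iceberg_properties(
    catalog_name="lakehouse",
    jdbc_url="jdbc:postgresql://postgres:5432/iceberg",
    user="iceberg",
    password="iceberg",
    warehouse="/warehouse",
)
print(properties_text)
```

Saved as a file such as `etc/catalog/lakehouse.properties` in the Trino container, this would expose the tables under the `lakehouse` catalog, so a table created from Spark could then be queried from Trino by its schema and table name.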
Can I use this in the cloud or only locally?
You can use it in both. It’s fully Dockerized, so it works locally and can be deployed to any VM or cloud instance.
