I will build a Dockerized big data pipeline using Spark and Hadoop

Czech Republic

I speak English, Czech

14 orders completed

DOTNET, C sharp, ETL pipelines

4+ years of fintech .NET / C# experience (6+ years total). I build and maintain business-critical systems for investment banking infrastructure. I can get you: ✅ Backend REST APIs in .NET / C# ✅ C# ...
About this Gig

I will set up a fully Dockerized big data pipeline using Apache Spark and Hadoop, ready for real-time data processing or batch ETL workflows and suitable for both local and cloud deployment.


What's included (based on your selected package):


  • Docker Compose setup for Spark + Hadoop
  • Pre-configured sample Spark job
  • Integrated HDFS output
  • Clean, modular codebase with comments
  • Step-by-step instructions for local or cloud use
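
To give a sense of what a sample Spark job with HDFS output can look like, here is a minimal PySpark sketch. It is an illustration under stated assumptions, not the delivered code: the NameNode host `namenode`, port `9000`, the column names `sensor_id` and `value`, and the app name are all hypothetical placeholders that depend on your Compose setup and data.

```python
"""Illustrative batch job: read CSV, aggregate per sensor, write Parquet to HDFS."""
import sys


def hdfs_uri(path, host="namenode", port=9000):
    """Build an hdfs:// URI. Host and port are assumed Compose defaults."""
    return f"hdfs://{host}:{port}/{path.lstrip('/')}"


def run_job(spark, input_path, output_path):
    """Average the `value` column per `sensor_id` and write the result as Parquet."""
    # Imported lazily so the helper above can be used without Spark installed.
    from pyspark.sql import functions as F

    # Column names are assumptions about the input file's header row.
    readings = spark.read.csv(input_path, header=True, inferSchema=True)
    summary = readings.groupBy("sensor_id").agg(
        F.avg("value").alias("avg_value"),
        F.count("*").alias("n_readings"),
    )
    # Overwrite keeps repeated runs of the sample job idempotent.
    summary.write.mode("overwrite").parquet(output_path)


def main(argv):
    from pyspark.sql import SparkSession

    input_path, output_dir = argv[1], argv[2]
    spark = SparkSession.builder.appName("sample-sensor-batch").getOrCreate()
    try:
        run_job(spark, input_path, hdfs_uri(output_dir))
    finally:
        spark.stop()


# Runs only when invoked with exactly two arguments: input path and HDFS output directory.
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

Saved under a name such as `sample_job.py` (hypothetical), a job like this would typically be launched with `spark-submit sample_job.py <input.csv> <hdfs-output-dir>` from a container that can reach the NameNode.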


Use cases:


  • IoT sensor data ingestion and transformation
  • Financial transaction analytics
  • Batch processing of large CSV/JSON datasets
  • Time-series pipeline to HDFS for long-term storage
  • Optional GPT-based enrichment via the OpenAI API for summarization or tagging


Ideal for engineers, startups, or teams needing a fast track to scalable data infrastructure.


Need extras like a REST API, OpenAI integration, monitoring (Grafana/Prometheus), or AWS EC2 deployment? Just say the word!


Please note:


  • Deliverables depend on the selected package
  • Custom offers are available - just message me!
  • Two follow-up messages for clarification after delivery are included
  • You are responsible for testing and running the pipeline in your own environment
  • OpenAI usage requires your own API key

Destination Platform:

  • PostgreSQL
  • MySQL
  • Apache Hive
  • Amazon S3
  • Other

Tools & Platforms:

  • Kafka Connect
  • Apache NiFi
  • Other
