I will develop scalable ETL pipelines using Databricks
About this Gig
Need a robust and scalable ETL pipeline built on Databricks? You're in the right place!
I'm Gajendra, a Certified Data Engineer and Data Analyst with 6+ years of experience building end-to-end data solutions for enterprise clients. Whether you're working with batch or streaming data, I design clean, efficient, production-ready ETL pipelines using Databricks, PySpark, and AWS.
What I Offer:
- End-to-end ETL/ELT pipeline development on Databricks
- Data ingestion from multiple sources (S3, RDS, APIs, etc.)
- Data cleaning, transformation, and enrichment using PySpark
- Integration with Delta Lake, SQL, and cloud storage
- Workflow orchestration with Databricks Jobs or Apache Airflow
- Version-controlled deployment (Git, CI/CD)
- Documentation and notebook-based delivery
Tools & Technologies: Databricks (Jobs, Notebooks, Delta Lake), PySpark, SQL, AWS (S3, Glue, Lambda, RDS), Airflow / Databricks Workflows, Git, CI/CD, DBFS
Why Work With Me?
- Certified in Databricks & AWS
- 6+ years of experience in Data Engineering & Analytics
- Fast and clear communication
- Production-level code with reusable design
Let's automate and scale your data workflows the right way!
FAQ
What do you need from me to get started?
Just a brief description of your data sources, expected outputs, and cloud setup (if any).
Can you work with on-prem data or other cloud providers?
Yes, though AWS is my core expertise. Message me and we can discuss other options.
