I will write and optimize PySpark ETL pipelines for your data workflows

India

I speak Hindi, English

Senior Data Engineer, Spark, Scala, AWS, Airflow, Kafka, Big Dat

I’m Pankaj, a Data Engineer with 3+ years of experience building large-scale data pipelines, ETL workflows, and cloud data platforms. I specialize in Spark (Scala/PySpark), Airflow, Kafka, SQL, and AW...
About this Gig

Are you looking for a reliable PySpark Data Engineer to build or optimize your ETL pipelines?

You're in the right place.

I'm Pankaj, a Data Engineer with 3+ years of experience at Paytm, where I built 200+ production ETL pipelines processing over 5 TB/day using PySpark, Airflow, AWS, and Kafka.

This gig focuses 100% on delivering fast, scalable, and clean PySpark ETL solutions for your business.


What I Can Do for You

  • Write clean and optimized PySpark ETL code
  • Build end-to-end ETL workflows (extract, transform, load)
  • Convert SQL logic into PySpark transformations
  • Fix failing or slow PySpark jobs
  • Optimize Spark jobs to reduce runtime and EMR cost
  • Integrate PySpark with AWS Glue, S3, EMR, and Athena
  • Data cleaning, validation & transformation
  • Debug existing ETL pipelines


Why Choose Me

  • Production-ready, clean code
  • Strong real-world experience
  • Fast communication and delivery
  • 100% focus on reliability and scalability
  • Practical understanding of pipeline failures & optimizations


Technologies I Use

  • PySpark / Spark
  • AWS Glue, S3, EMR
  • SQL
  • Airflow (workflow orchestration)
  • Kafka
  • Python & Scala


Have a custom requirement?

Message me anytime. I reply fast.

Let's build something scalable.