I will build an AWS data lake and ETL pipeline using PySpark

Pakistan

I speak English

Cloud Data Engineer building scalable ETL pipelines

Hi, I'm an independent Data Engineer specializing in building scalable ETL pipelines and robust cloud data architectures. I help businesses transform messy, unstructured logs into clean, query-ready d...

About this Gig

As a Data Engineer, I design robust cloud-native architectures and scalable ETL pipelines. Whether processing high-volume logs or building Medallion Data Lakes, I deliver clean, optimized solutions.

What I Offer:

  • End-to-End ETL Pipelines: Automated data extraction, transformation, and loading using Python and PySpark.
  • Cloud Data Lakes: Architecting serverless Medallion Data Lakes (Bronze, Silver, Gold) on AWS (S3, Glue, Athena).
  • Database Architecture: Designing relational databases (3NF) and optimizing complex SQL queries (CTEs, Window Functions) in PostgreSQL.
  • Performance Optimization: Reducing data processing times and cutting storage costs with compressed columnar formats such as Apache Parquet.
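
To give a feel for the Bronze, Silver, Gold flow mentioned above, here is a minimal sketch in plain Python, so it runs without a Spark cluster. The log format, field names, and the functions `to_silver` and `to_gold` are illustrative assumptions, not a fixed deliverable; an actual project would typically use PySpark DataFrames and Parquet files on S3 instead of in-memory lists.

```python
import json
from collections import defaultdict

# Bronze layer: raw log lines kept exactly as received (hypothetical sample data).
BRONZE = [
    '{"ts": "2024-01-01T10:00:00", "user": "a", "status": "200", "ms": "120"}',
    '{"ts": "2024-01-01T10:00:05", "user": "b", "status": "500", "ms": "900"}',
    'not-json garbage line',
    '{"ts": "2024-01-01T10:01:00", "user": "a", "status": "200", "ms": "80"}',
]


def to_silver(raw_lines):
    """Silver layer: parse, type-cast, and drop malformed records."""
    clean = []
    for line in raw_lines:
        try:
            rec = json.loads(line)
            clean.append({
                "ts": rec["ts"],
                "user": rec["user"],
                "status": int(rec["status"]),
                "ms": int(rec["ms"]),
            })
        except (ValueError, KeyError, TypeError):
            # Malformed line or missing/invalid field: skip it in this sketch.
            continue
    return clean


def to_gold(silver_rows):
    """Gold layer: per-user request count, error count, and average latency."""
    acc = defaultdict(lambda: {"requests": 0, "errors": 0, "total_ms": 0})
    for row in silver_rows:
        stats = acc[row["user"]]
        stats["requests"] += 1
        stats["errors"] += int(row["status"] >= 500)
        stats["total_ms"] += row["ms"]
    return {
        user: {
            "requests": s["requests"],
            "errors": s["errors"],
            "avg_ms": s["total_ms"] / s["requests"],
        }
        for user, s in acc.items()
    }


silver = to_silver(BRONZE)
gold = to_gold(silver)
print(gold)
```

Each layer only reads from the one before it, so a bad record can be traced back to the untouched Bronze data and the Gold tables can be rebuilt at any time.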

Tech Stack: AWS (S3, Glue, Athena) | PySpark | Python | PostgreSQL | Advanced SQL | Git/GitHub

Why choose me? I write production-ready code, ensure scalable designs, and strictly follow data engineering best practices.

Please message me before ordering to discuss your exact project!

Language:

English

Urdu

Technical expertise:

dbt (Data Build Tool)

Apache Airflow

Expertise:

Data Pipelines

ETL Development

Data Integration

Industry:

Data analytics
