I will build AI agents, web scraping bots, and data extraction pipelines in Python


About this gig
Who this is for
- Founders and ops leads who need recurring data (price monitoring, lead enrichment, market research)
- Researchers and analysts pulling structured data from public websites or PDFs
- ML and AI teams gathering training data
- Agencies whose clients ask for "scrape this for us" and need a reliable subcontractor
What I build
- Web scrapers in Python (Scrapy, BeautifulSoup, Playwright) or Node (Playwright, Puppeteer)
- AI-powered parsing with OpenAI or Claude so unstructured pages become typed JSON, not regex spaghetti
- Recurring data pipelines with scheduling, deduplication, change detection, and alerting
- PDF, document, and OCR extraction when the data isn't on a webpage
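To illustrate the deduplication and change detection mentioned above, here is a minimal sketch using content hashing. It assumes each scraped record is a flat dictionary with a unique `url` field. The function names `record_fingerprint` and `diff_run` are hypothetical, chosen for this example rather than taken from any library, and a real pipeline would persist the fingerprints in a database between runs.

```python
import hashlib
import json


def record_fingerprint(record: dict) -> str:
    """Stable SHA-256 hash of a record's content (key order does not matter)."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_run(previous: dict, scraped: list, key: str = "url"):
    """Compare a new scrape against the previous run's fingerprints.

    `previous` maps a record key to the fingerprint seen last time.
    Returns (new_records, changed_records, current_fingerprints).
    """
    new, changed, current = [], [], {}
    for record in scraped:
        record_key = record[key]
        if record_key in current:
            continue  # duplicate within this run; keep the first occurrence
        fingerprint = record_fingerprint(record)
        current[record_key] = fingerprint
        if record_key not in previous:
            new.append(record)
        elif previous[record_key] != fingerprint:
            changed.append(record)
    return new, changed, current
```

The `changed` list is what an alerting step would act on, for example a price that moved since the last run.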
Stack
Python, JavaScript, TypeScript, Scrapy, BeautifulSoup, Playwright, Puppeteer, Selenium, requests, httpx, Pandas, OpenAI API, Anthropic Claude API, function calling and structured outputs, PostgreSQL, MongoDB, Supabase, Airtable, Google Sheets
Get to know Hamza Khan
Experienced Full Stack AI Developer
- From: Pakistan
- Member since: Feb 2020
- Avg. response time: 6 hours
- Last delivery: 1 year
Languages
English, Hindi, Italian, French
FAQ
What's the difference between regular scraping and "AI-powered extraction"?
Regular scraping uses CSS/XPath selectors, which can break when a site changes its layout. AI-powered extraction uses Claude or GPT to read the page like a human would and return structured JSON that matches your schema. It's more resilient, handles messy layouts, and lets you extract semantic fields.
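As a rough sketch of what "structured JSON to your schema" can mean in practice, the snippet below validates a model's JSON reply against a typed record. The call to Claude or GPT is deliberately left out. The `Listing` schema, its field names, and `parse_listing` are hypothetical, used only for illustration, and a real project would take its fields from the client's spec.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Listing:
    # Hypothetical schema for illustration only.
    title: str
    price: float
    currency: str
    in_stock: Optional[bool] = None


def parse_listing(model_output: str) -> Listing:
    """Validate a model's JSON reply and coerce it into a typed Listing."""
    data = json.loads(model_output)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [name for name in ("title", "price", "currency") if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    in_stock = data.get("in_stock")
    if in_stock is not None and not isinstance(in_stock, bool):
        raise ValueError("in_stock must be true, false, or null")
    return Listing(
        title=str(data["title"]).strip(),
        price=float(data["price"]),   # raises ValueError on non-numeric text
        currency=str(data["currency"]).upper(),
        in_stock=in_stock,
    )
```

Rejecting replies that fail validation, instead of storing them, keeps malformed model output out of the dataset.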
Will the scraper still work after the website updates?
AI-powered extraction is resilient to most layout changes. Selector-based scrapers are not: if the site rewrites its HTML, the scraper needs maintenance. The Premium tier includes 14 days of free fixes; after that, I offer a maintenance retainer.
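To show why a selector-based scraper needs maintenance after a redesign, here is a minimal sketch using only Python's standard library. The class names `price` and `amount` and the helper `extract_by_class` are made up for this example. It is a simplification that assumes well-formed HTML with no unclosed void tags (such as a bare `<br>`) inside the matched element.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text of the first element carrying a given CSS class."""

    def __init__(self, css_class: str):
        super().__init__()
        self.css_class = css_class
        self._depth = 0       # greater than 0 while inside the matched element
        self._done = False    # True once the first match has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        if self._depth:
            self._depth += 1  # nested tag inside the match
        elif self.css_class in (dict(attrs).get("class") or "").split():
            self._depth = 1

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
            if self._depth == 0:
                self._done = True

    def handle_data(self, data):
        if self._depth:
            self.chunks.append(data)


def extract_by_class(html: str, css_class: str):
    """Return the matched element's text, or None if the class is absent."""
    parser = ClassTextExtractor(css_class)
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks).strip()
    return text or None


old_page = '<div><span class="price">$10</span></div>'
new_page = '<div><span class="amount">$10</span></div>'  # class renamed in a redesign
print(extract_by_class(old_page, "price"))
print(extract_by_class(new_page, "price"))
```

The second call finds nothing even though the price is still on the page, which is the kind of silent failure a maintenance fix addresses.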
What sites can you scrape?
Public websites whose Terms of Service permit automated access, or where the data is explicitly public (product catalogs, real estate listings, government data, news, public profiles on professional sites with clear scraping policies, etc.). On the scope call I'll review your target
