I will fine-tune LLMs and build RAG pipelines for your AI app


About this gig
Are you building an AI application that needs a custom language model or a knowledge-grounded chatbot? You're in the right place. I'm Yash, an ML Engineer with 6+ years of experience and hands-on production LLM work at Fidelity National Financial, where I fine-tuned LayoutLMv3 (a multimodal transformer for document AI) for document intelligence on real enterprise data.
What I'll build for you:
- Fine-tuning: adapt open-source models (LLaMA 3, Mistral, Falcon, BERT, LayoutLM) to your custom dataset using LoRA, QLoRA, or full fine-tuning
- RAG pipelines: connect your LLM to your knowledge base using vector databases (Pinecone, ChromaDB, FAISS, Weaviate)
- Custom chatbots that answer questions from your documents, PDFs, databases, or APIs
- LLM evaluation & benchmarking: measure accuracy, hallucination rate, and latency
- Prompt engineering & system prompt optimization for consistent, reliable outputs
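To make the RAG item above concrete, here is a minimal sketch of the retrieval step. It is an illustration only, not the delivered pipeline: it assumes a toy word-count embedding in place of a real embedding model and an in-memory list in place of a vector database such as the ones named above. The function names (`embed`, `cosine`, `retrieve`, `build_prompt`) and the sample chunks are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (assumed stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (in-memory stand-in for a vector DB lookup)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the grounded prompt that would be sent to the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Hypothetical knowledge-base chunks, e.g. split out of a client's documents.
chunks = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
    "Shipping is free for orders above 50 dollars.",
]
question = "How long do refunds take?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real project the `embed` function would typically call an embedding model and `retrieve` would query the chosen vector database; the prompt-assembly step stays much the same.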
Why hire me?
- Real enterprise LLM fine-tuning experience in production (not just tutorials)
- IIT Kharagpur Dual Degree (B.Tech + M.Tech)
- Clean, documented, production-ready code delivered in Python
- Azure deployment experience for scalable inference
Get to know Yash Bhardwaj
I build GenAI apps, LLM pipelines and NLP systems that ship to production
- From: India
- Member since: Apr 2026
- Avg. response time: 1 hour
Languages
Hindi, English
FAQ
Do you need my data to be labeled?
For fine-tuning, yes. I can also help you structure and annotate your dataset as an add-on. For RAG, raw documents (PDF, TXT, DOCX) work without labeling.
Which LLMs do you work with?
Open-source models: LLaMA 3, Mistral, Phi-3, BERT, and the LayoutLM family, fine-tuned using LoRA/QLoRA via Hugging Face. I also support OpenAI's fine-tuning API for GPT-based models.
Can you deploy the model too?
Yes. I deploy to the major cloud platforms: AWS SageMaker, Google Cloud Vertex AI, Azure ML, or Hugging Face Spaces. I also build FastAPI inference endpoints packaged in Docker, deployable anywhere. For mobile/edge use cases, TensorFlow Lite and ONNX export are supported. Deployment includes a working API

