I will evaluate AI agents and create benchmark test datasets

jayesh1703
Jayesh R

About this gig

I will evaluate your AI agent, chatbot, or LLM application for accuracy, reliability, and performance. I can create test datasets, benchmark prompts, identify failure cases, and provide actionable recommendations to improve response quality and user experience.

Get to know Jayesh R

Software Developer, Full-Stack and AI/ML Engineering

  • From: India
  • Member since: Sep 2022
  • Avg. response time: 1 hour
  • Languages: English, Hindi

I am a Software Developer specializing in AI/ML engineering and full-stack development. I build production-ready systems including RAG pipelines and LLM-powered chatbots. Currently, I ship AI solutions at Bajaj Life Insurance that handle 1,000+ queries daily with sub-2s latency and 88% intent accuracy.

Related tags