I will test your LLM and AI chatbot for bugs, accuracy, and prompt failures

Pakistan

I speak Urdu, English

Manual Tester and QA Specialist

If you need your website or app tested before launch, I will check every feature carefully, find bugs, and send you a clean, easy-to-read bug report so your developer can fix issues fast. What you wil...
About this Gig

Are you deploying LLMs but worried about hallucinations or prompt injections? Standard QA assumes the same input always produces the same output, and that assumption breaks down with non-deterministic AI. I bridge the gap between AI development and software reliability by testing, breaking, and securing your LLM APIs.


### What I Will Do:

1. LLM API & Endpoint Testing: Validate status codes and payload schemas, and measure latency against agreed benchmarks (OpenAI, Anthropic, custom models).

2. Prompt Validation & Vulnerability Testing: Evaluate prompts using Promptfoo or DeepEval. Test for prompt injections, output drift, and toxicity.

3. Hallucination Audits: Set up programmatic assertions to measure factual accuracy and semantic similarity.

4. CI/CD Integration: Build regression pipelines to auto-validate prompts on every backend change.
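
To give a feel for items 1 and 3, here is a minimal sketch of the kind of programmatic check I mean. It is illustrative only, not a fixed deliverable: the field names, the 2-second latency limit, and the canned response are all hypothetical, and `difflib` is used as a simple lexical stand-in for a real semantic similarity metric (on an actual project I would normally use an embedding-based score).

```python
import json
from difflib import SequenceMatcher

# Hypothetical limits and schema, chosen for illustration only.
MAX_LATENCY_SECONDS = 2.0
REQUIRED_FIELDS = {"id": str, "model": str, "output": str}


def validate_response(status_code, body, latency_seconds):
    """Return a list of readable failures; an empty list means the check passed."""
    failures = []
    if status_code != 200:
        failures.append(f"unexpected status code: {status_code}")
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return failures + ["body is not valid JSON"]
    if not isinstance(payload, dict):
        return failures + ["payload is not a JSON object"]
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in payload:
            failures.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            failures.append(f"wrong type for field: {field}")
    if latency_seconds > MAX_LATENCY_SECONDS:
        failures.append(
            f"latency {latency_seconds:.2f}s exceeds {MAX_LATENCY_SECONDS:.2f}s"
        )
    return failures


def similarity(answer, reference):
    """Lexical similarity in [0, 1]; a stand-in for an embedding-based score."""
    return SequenceMatcher(None, answer.lower(), reference.lower()).ratio()


# Canned response standing in for a real API call (no network needed).
sample_body = json.dumps(
    {"id": "resp-1", "model": "demo-model", "output": "Paris is the capital of France."}
)
failures = validate_response(200, sample_body, latency_seconds=0.8)
score = similarity(json.loads(sample_body)["output"], "The capital of France is Paris.")
print("failures:", failures)
print("similarity:", round(score, 2))
```

In a real engagement the same assertions would run against your live endpoint, with the schema, latency limit, and similarity threshold agreed with you up front.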


### Tech & Tools:

- Python / TypeScript

- Promptfoo / DeepEval / TruLens

- Postman / Newman / pytest / Playwright

- CI/CD (GitHub Actions, GitLab CI)


### Why Choose This Gig?

Traditional QA checks fixed, repeatable results. LLM testing takes an engineering mindset: tracking pass rates across repeated runs, semantic metrics, and adversarial prompt structures.
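
As a rough illustration of the difference, the sketch below scores a prompt by its pass rate over many runs instead of a single pass/fail. Everything in it is an assumption made for the example: `fake_model` is a hypothetical stand-in for a real LLM call, and the 0.7 threshold is arbitrary.

```python
import random


def fake_model(prompt, rng):
    """Hypothetical stand-in for an LLM call: usually right, sometimes not."""
    return rng.choice(["4", "4", "4", "four", "5"])


def pass_rate(prompt, check, runs, seed=0):
    """Call the model `runs` times; return the fraction of outputs `check` accepts."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    passed = sum(1 for _ in range(runs) if check(fake_model(prompt, rng)))
    return passed / runs


def accepts_four(output):
    """Accept either the digit or the word as a correct answer."""
    return output.strip().lower() in {"4", "four"}


THRESHOLD = 0.7  # hypothetical acceptance threshold
rate = pass_rate("What is 2 + 2?", accepts_four, runs=50)
print(f"pass rate: {rate:.2f}", "OK" if rate >= THRESHOLD else "FAIL")
```

A single run can pass or fail by chance, so I report the rate and the threshold together.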


Ensure your AI behaves exactly as intended. Message me with your project details today!


Testing application:

API

Development technology:

C/C++

HTML & CSS

SQL

Device:

PC

Linux

Android mobile phone

Windows phone
