I will do AI data annotation, RLHF prompt testing, and Oromo or Amharic localization


About this gig
Looking for a technical data professional to refine your regional East African datasets for large language models (LLMs)? You have found the right partner.
As a software engineering graduate with native fluency in both Amharic and Oromo, I combine engineering rigor with localized semantic precision. I specialize in delivering high-fidelity datasets, rigorous human-in-the-loop preference evaluations, and seamless language localization.
What I Do Best:
- AI Data Annotation & Labeling: Precise text, audio, and image categorization, including semantic text classification, syntax correction, and dataset cleaning.
- RLHF Prompt Testing & Tuning: Expert evaluation of language model outputs, adversarial red-teaming, multi-turn prompt debugging, and ranking responses for preference alignment.
- Localized Translation & Review: Comprehensive translation and cultural localization for English-to-Amharic and English-to-Oromo pipelines, capturing regional context and idioms.
Maximize your local-language model's performance with clean, carefully reviewed datasets. Contact me today to discuss your project scope!
Get to know Sisay F.
I value your business and aim to please
- From: Ethiopia
- Member since: Oct 2025
Languages
English, Oromo, Amharic
FAQ
Do you use automated translation tools like Google Translate?
Absolutely not. Machine translation frequently fails on the complex morphology of Oromo and Amharic. I perform all data annotation, RLHF evaluation, and localization tasks manually, as a native speaker with a software engineering background, to preserve context and data integrity.
What kinds of file formats can you handle for datasets?
I handle the standard formats used in data engineering pipelines, including JSON, CSV, Excel spreadsheets, and plain-text corpora, as well as exports from custom annotation tools.
Can you do large-scale data annotation or prompt testing?
Yes. Whether you need a small batch of 500 lines for validation or long-term, multi-turn chatbot response testing, I scale my workflow to match your model's training pipeline. Please send me a message with your project requirements!

