I will provide automatic speech recognition and speech-to-text consultancy

djimenez_ml
David J

About this gig

Are you looking to integrate speech-to-text, voice commands, or conversational AI into your project? I am here to help! With expertise in cutting-edge speech recognition technologies such as Whisper, Wav2vec, Kaldi, Vosk, phi4, MMS, seamless-m4t, and DeepSpeech, I provide tailored consultations to guide you through implementation, optimization, and problem-solving.

I specialize in:

  • Designing and implementing speech-to-text solutions
  • Choosing the best APIs (Deepgram, AssemblyAI, Gemini, OpenAI, Google Speech-to-Text, etc.)
  • Training and fine-tuning SOTA speech models
  • Enhancing accuracy for specific languages or dialects
  • Addressing challenges in noisy environments
  • Speaker diarization
  • Voice Activity Detection
  • Sound Event Detection

Let's discuss your needs and bring your ideas to life!

Get to know David J

David J

Speech Recognition

5.0 (7)
  • From: Spain
  • Member since: Nov 2024
  • Avg. response time: 1 day
  • Last delivery: 1 week
  • Languages: English, Spanish
I have over 6 years of experience working with machine learning and deep learning applied to speech recognition in different languages: speech-to-text, diarization, voice activity detection, sound event detection, denoising, audio signal processing, emotion... I have worked with SOTA automatic speech recognition APIs and frameworks: Whisper, Kaldi, Vosk, MMS, DeepSpeech, speechbrain, and wav2vec2. I have fine-tuned models to improve WER and inference speed across multiple languages.

Hugging Face: https://huggingface.co/deepdml
GitHub: https://github.com/djpg

My Portfolio