I will provide automatic speech recognition and speech-to-text consultancy

David J

About this gig

Are you looking to integrate speech-to-text, voice commands, or conversational AI into your project? I am here to help! With expertise in cutting-edge speech recognition technologies such as Whisper, Wav2vec, Kaldi, Vosk, phi4, MMS, seamless-m4t, and DeepSpeech, I provide tailored consultations to guide you through implementation, optimization, and problem-solving.

I specialize in:

Designing and implementing speech-to-text solutions
Choosing the best APIs (Deepgram, AssemblyAI, Gemini, OpenAI, Google Speech-to-Text, etc.)
Training and fine-tuning SOTA speech models
Enhancing accuracy for specific languages or dialects
Addressing challenges in noisy environments
Speaker diarization
Voice Activity Detection
Sound Event Detection

Let's discuss your needs and bring your ideas to life!

Model expertise
- Custom model development
- Fine-tuning models
- Generative AI
- Predictive analytics
Industry
- Audio & video
- Data analytics
Programming language
- Python
- PyTorch
- Other
Language
- English
- Spanish
Technical expertise
- Machine learning (Supervised, Unsupervised, Reinforcement)
- Deep learning (Neural networks, GANs)
- Natural language processing (NLP)
- Algorithm development and optimization
- Feature engineering and data processing
- AI ethics and bias mitigation

Get to know David J

David J

Speech Recognition

5.0 (7)

From: Spain
Member since: Nov 2024
Avg. response time: 1 hour
Last delivery: 3 weeks
Languages
English, Spanish

I have 7+ years of experience applying deep learning to speech recognition: speech-to-text, diarization, voice activity detection, sound event detection, denoising, audio signal processing, emotion, voice agents, and more, in different languages. I have worked with SOTA automatic speech recognition APIs and frameworks: Whisper, Kaldi, Vosk, MMS, DeepSpeech, speechbrain, and wav2vec2. I have fine-tuned models to improve WER and inference speed in multiple languages. Hugging Face: https://huggingface.co/deepdml GitHub: https://github.com/djpg
