I will build custom speech or emotion recognition models
About this Gig
I specialize in building multimodal speech and emotion recognition systems that combine audio and text to improve recognition accuracy.
I have hands-on experience with complex datasets such as IEMOCAP and MELD, and I have developed custom hybrid Bi-LSTM and CNN models that reach up to 85% accuracy on IEMOCAP. I am also exploring Word2Vec embeddings and Transformer-based architectures for better contextual understanding of speech.
You can check out my projects and research papers linked below for more details.
What I Offer:
- Preprocessing of complex audio and text datasets
- Custom model development (LSTM, CNN, Transformers, etc.)
- Hyperparameter tuning and model optimization
- Support for academic thesis, research, or industry projects
- Integration-ready solutions for apps or APIs
Feel free to message me before placing your order to discuss your specific needs.
Expertise:
Classification • Speech & Audio • Predictive analysis
Programming language:
Python • Colab
APIs:
Other
Tools:
Jupyter Notebook • Amazon SageMaker • Colab
Frameworks:
Scikit-learn • Keras • PyTorch • Pandas • TensorFlow

