nikithareddyb/speech-recognition-dl

speech-recognition-dl

Prerequisites:

Below are the prerequisites needed to reproduce the experiments performed while creating the SER FusionNet model. The required data and notebooks for the two experiments, one using RAVDESS and one using the SER Superset, are listed below.

Note:
● Run the code as a Kaggle notebook, because the data source links in the code are set up for that environment. This also means the large audio datasets do not need to be downloaded to a local directory.
● Add the datasets linked below to the working directory of the Kaggle notebook.

  1. Speech Emotion Recognition on the RAVDESS Dataset
     Python notebooks:
     ● ser-ravdess-crossvalidation.ipynb
     ● ser-ravdess.ipynb
     Dataset link:
     ● Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): https://www.kaggle.com/datasets/uwrfkaggler/ravdess-emotional-speech-audio
  2. Speech Emotion Recognition on the SER Superset Dataset (RAVDESS + CREMA-D + SAVEE + TESS)
     Python notebooks:
     ● ser-superset-crossvalidation.ipynb
     ● ser-superset.ipynb
     Dataset links:
     ● Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): https://www.kaggle.com/datasets/uwrfkaggler/ravdess-emotional-speech-audio
     ● Crowd Sourced Emotional Multimodal Actors Dataset (CREMA-D): https://www.kaggle.com/datasets/ejlok1/cremad
     ● Gender Mapping for CREMA-D: https://www.kaggle.com/datasets/jananiravikumar1/gender-for-crema
     ● Surrey Audio-Visual Expressed Emotion (SAVEE): https://www.kaggle.com/datasets/ejlok1/surrey-audiovisual-expressed-emotion-savee
     ● Toronto emotional speech set (TESS): https://www.kaggle.com/datasets/ejlok1/toronto-emotional-speech-set-tess
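
Before running a notebook, it can help to confirm that every dataset is attached. The snippet below is a minimal sketch, not part of the repository. It assumes that Kaggle mounts each attached dataset at `/kaggle/input/<slug>`, where the slug is the last segment of the dataset URL. The helper name `missing_datasets` and the two slug lists are hypothetical names introduced for illustration.

```python
from pathlib import Path

# Dataset slugs copied from the last path segment of the Kaggle URLs above.
RAVDESS_SLUGS = ["ravdess-emotional-speech-audio"]
SUPERSET_SLUGS = RAVDESS_SLUGS + [
    "cremad",
    "gender-for-crema",
    "surrey-audiovisual-expressed-emotion-savee",
    "toronto-emotional-speech-set-tess",
]


def missing_datasets(slugs, input_root="/kaggle/input"):
    """Return the slugs that have no matching directory under input_root.

    Assumption: attached datasets appear as /kaggle/input/<slug>.
    """
    root = Path(input_root)
    return [slug for slug in slugs if not (root / slug).is_dir()]


if __name__ == "__main__":
    missing = missing_datasets(SUPERSET_SLUGS)
    if missing:
        print("Datasets not attached:", ", ".join(missing))
    else:
        print("All datasets for the SER Superset experiment are attached.")
```

For the RAVDESS-only notebooks, passing `RAVDESS_SLUGS` instead of `SUPERSET_SLUGS` checks just the single dataset they need.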

About

Experimenting with a new approach to Speech Emotion Recognition (SER) using Deep Learning