Real-Time Emotion Recognition in Speech: A Machine Learning Perspective

Anandhakumar, Dharmalingam (2023) Real-Time Emotion Recognition in Speech: A Machine Learning Perspective. JOURNAL OF XI'AN UNIVERSITY OF ARCHITECTURE & TECHNOLOGY, XV (4). ISSN 1006-7930

Abstract

This paper proposes a novel approach to voice-to-voice translation using recurrent neural networks (RNNs). Voice-to-voice translation is a challenging task that involves converting speech in one language into speech in another language while retaining the speaker's voice characteristics. RNNs are a class of deep learning models that have shown promise in a variety of natural language tasks, such as speech recognition and machine translation. We present an RNN-based model that takes as input the audio signal in one language and produces the corresponding audio signal in the target language. We also introduce a new loss function that encourages the model to preserve the speaker's voice characteristics. We evaluate the proposed approach on a publicly available dataset and show that our model outperforms state-of-the-art techniques in terms of speaker similarity and translation accuracy. Our approach has potential applications in various domains, including language learning, entertainment, and communication.

Item Type: Article
Subjects: AC Research Cluster
Depositing User: Unnamed user with email techsupport@mosys.org
Date Deposited: 21 Dec 2023 08:34
Last Modified: 21 Dec 2023 08:34
URI: https://ir.vignan.ac.in/id/eprint/623
