Anandhakumar, Dharmalingam (2023) Real-Time Emotion Recognition in Speech: A Machine Learning Perspective. JOURNAL OF XI'AN UNIVERSITY OF ARCHITECTURE & TECHNOLOGY, XV (4). ISSN 1006-7930
19.ANAND REAL TIMESPEECH TRANSLATION USING RNN.pdf
Abstract
This paper proposes a novel approach to voice-to-voice translation using recurrent neural networks (RNNs). Voice-to-voice translation is a challenging task that involves converting speech in one language into speech in another language while retaining the speaker's voice characteristics. RNNs are a class of deep learning models that have shown promise in a variety of natural language tasks, such as speech recognition and machine translation. We present an RNN-based model that takes as input the audio signal in one language and produces the corresponding audio signal in the target language. We also introduce a new loss function that encourages the model to preserve the speaker's voice characteristics. We evaluate the proposed approach on a publicly available dataset and show that our model outperforms state-of-the-art techniques in terms of speaker similarity and translation accuracy. Our approach has potential applications in various domains, including language learning, entertainment, and communication.
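The abstract does not state the form of the speaker-preserving loss. As an illustration only, one common way to build such an objective is to add a speaker-similarity penalty to a content reconstruction term. The sketch below assumes a mean-squared-error content term, a cosine-distance speaker term, and a hypothetical weighting parameter `speaker_weight`; none of these names or choices come from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_squared_error(predicted, target):
    """Average squared difference between predicted and target feature values."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def combined_loss(pred_frames, target_frames,
                  source_speaker_emb, output_speaker_emb,
                  speaker_weight=0.5):
    """Hypothetical objective: content error plus a weighted speaker-mismatch penalty.

    Assumption: speaker identity is summarized by fixed-length embedding vectors,
    and mismatch is measured as 1 - cosine similarity. The paper may use a
    different formulation.
    """
    content_term = mean_squared_error(pred_frames, target_frames)
    speaker_term = 1.0 - cosine_similarity(source_speaker_emb, output_speaker_emb)
    return content_term + speaker_weight * speaker_term
```

Under these assumptions, a larger `speaker_weight` trades translation fidelity for closer agreement between the source and output speaker embeddings.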
Item Type: | Article |
---|---|
Subjects: | AC Research Cluster |
Depositing User: | Unnamed user with email techsupport@mosys.org |
Date Deposited: | 21 Dec 2023 08:34 |
Last Modified: | 21 Dec 2023 08:34 |
URI: | https://ir.vignan.ac.in/id/eprint/623 |