Improved CycleGAN-Based Speech-to-Speech Neuro-Style Transfer for Voice Conversion

Name

Zedian Shao

Major

Data Science

Class

2023

About

I am Zedian Shao, majoring in Data Science. I am interested in applying computer vision technologies to intelligent transportation systems.

Signature Work Project Overview

Voice conversion is an active area of research with applications in speech recognition, natural language processing, and entertainment. In this project, we aimed to improve the performance of the VAE-GAN model for voice conversion by revising its architecture, drawing on CycleGAN-VC2 and other popular models. Our goal was to achieve better results in inter-gender voice conversion while maintaining the naturalness and similarity of the converted speech. Our findings suggest that the improved VAE-GAN model, with a 2-1-2 CNN layer and further architectural changes to the decoder and discriminator, is a promising approach to inter-gender voice conversion. However, further research is needed to improve the naturalness and similarity of the converted speech and to identify strategies that perform well across different accents. By exploring new architectures and training techniques, we hope to keep advancing the field of voice conversion and to contribute to more natural, human-like speech technologies.

Signature Work Presentation Video

SIGNATURE WORK CONFERENCE & EXHIBITION 2023