OSW

SIGNATURE WORK
CONFERENCE & EXHIBITION 2022

A Study on the Deep Learning Based Voice Conversion Algorithms

Name

Haozhe Zhang

Major

Data Science

Class

2022

About

Majored in Data Science. Undergraduate research interests are in speech processing, especially voice conversion and speaker recognition.

Signature Work Project Overview

As more and more systems achieve good performance on traditional voice conversion (VC) tasks, attention is gradually turning to VC under extreme conditions. In this paper, we propose a novel method for zero-shot voice conversion. We aim to obtain intermediate representations that disentangle speaker and content in speech, so that speaker information is removed more thoroughly and purer content information is retained. Accordingly, our proposed framework contains a module that removes speaker information from the acoustic features of the source speaker. Moreover, speaker information control is added to the system to maintain voice cloning performance. The proposed system is evaluated with both subjective and objective metrics. Results show that it significantly reduces the trade-off problem in zero-shot voice conversion while also showing high spoofing power against a speaker verification system. VC technology can also be applied to enhance speech generated by an electrolarynx, which is helpful for the development of medical devices. The enhanced audio samples are available online at https://haydencaffrey.github.io/el/index.html.
