Machine learning is now widely applied across creative domains, notably art and music production. In this project, I focused on combining machine learning with the production of hip-hop music, a style rising in popularity among young listeners. The pipeline begins with an image recognition model that interprets visual inputs and generates descriptive lines, which serve as prompts for a Transformer-based lyrics generation model; trained to understand and craft hip-hop verses, this model produces lyrics aligned with the themes conveyed by the images. The musical dimension comes from a combined use of Drums RNN and GETMusic: Drums RNN generates a foundational drum track from a specified leading pattern, which GETMusic then enriches with complementary instrumental layers to form a cohesive hip-hop beat. Vocal performances bring the generated lyrics and beats to life, yielding a collection of hip-hop samples. The efficacy and appeal of the generated music were assessed through both objective metrics and subjective evaluations, giving a comprehensive view of the system's capabilities and its potential impact at the intersection of machine learning and music production.
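To make the four-stage pipeline concrete, the sketch below wires the stages together in Python. The Hugging Face `pipeline` calls and Magenta's `drums_rnn_generate` CLI are real interfaces, but the specific checkpoints, file paths, and the GETMusic entry point (`getmusic/generate.py`) are illustrative assumptions, not the project's actual code.

```python
# Minimal sketch of the image -> lyrics -> drums -> full-beat pipeline.
# Checkpoints, paths, and the GETMusic invocation are placeholders.
import subprocess
from transformers import pipeline

# Stage 1: image -> descriptive caption, used as the lyrics prompt.
captioner = pipeline("image-to-text",
                     model="Salesforce/blip-image-captioning-base")
caption = captioner("cover_photo.jpg")[0]["generated_text"]

# Stage 2: caption -> hip-hop verse via a Transformer language model.
# A checkpoint fine-tuned on hip-hop lyrics would replace "gpt2" here.
lyricist = pipeline("text-generation", model="gpt2")
verse = lyricist(f"Write a hip-hop verse about: {caption}",
                 max_new_tokens=128)[0]["generated_text"]

# Stage 3: Drums RNN lays down the drum track from a leading (primer)
# pattern, using Magenta's documented CLI (36 = kick drum in MIDI).
subprocess.run([
    "drums_rnn_generate",
    "--config=drum_kit",
    "--bundle_file=drum_kit_rnn.mag",
    "--output_dir=generated/drums",
    "--num_outputs=1",
    "--num_steps=128",
    "--primer_drums=[(36,)]",
], check=True)

# Stage 4 (hypothetical entry point): GETMusic conditions on the drum
# track and fills in the remaining instrument tracks to form the beat.
subprocess.run([
    "python", "getmusic/generate.py",
    "--condition", "generated/drums/drums.mid",
    "--output", "generated/beat.mid",
], check=True)
```

The generated verse and beat would then be handed to a vocalist (or a vocal synthesis step) to produce the final hip-hop samples described above.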