Machine learning is now widely applied across creative domains, notably art and music production. In this project, I focused on combining machine learning with the production of hip-hop music, a style rising in popularity among young listeners. The pipeline begins with an image recognition model that interprets visual inputs and generates descriptive lines, which serve as prompts for a Transformer-based lyrics generation model; trained to understand and craft hip-hop verses, this model produces lyrics aligned with the themes conveyed by the images. The musical dimension comes from a combined use of Drums RNN and GETMusic: Drums RNN generates a foundational drum track from a specified leading pattern, which GETMusic then enriches with complementary instrumental layers to form a cohesive hip-hop beat. Vocal performances bring the generated lyrics and beats to life, yielding a collection of hip-hop samples. The efficacy and appeal of the generated music were assessed through both objective metrics and subjective evaluations, giving a comprehensive view of the system's capabilities and its potential impact at the intersection of machine learning and music production.
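To make the four-stage pipeline concrete, the sketch below wires the stages together in Python. The Hugging Face `pipeline` calls and Magenta's `drums_rnn_generate` CLI are real interfaces, but the specific checkpoints, file paths, and the GETMusic entry point (`getmusic/generate.py`) are illustrative assumptions, not the project's actual code.

```python
# Minimal sketch of the image -> lyrics -> drums -> full-beat pipeline.
# Checkpoints, paths, and the GETMusic invocation are placeholders.
import subprocess
from transformers import pipeline

# Stage 1: image -> descriptive caption, used as the lyrics prompt.
captioner = pipeline("image-to-text",
                     model="Salesforce/blip-image-captioning-base")
caption = captioner("cover_photo.jpg")[0]["generated_text"]

# Stage 2: caption -> hip-hop verse via a Transformer language model.
# A checkpoint fine-tuned on hip-hop lyrics would replace "gpt2" here.
lyricist = pipeline("text-generation", model="gpt2")
verse = lyricist(f"Write a hip-hop verse about: {caption}",
                 max_new_tokens=128)[0]["generated_text"]

# Stage 3: Drums RNN lays down the drum track from a leading (primer)
# pattern, using Magenta's documented CLI (36 = kick drum in MIDI).
subprocess.run([
    "drums_rnn_generate",
    "--config=drum_kit",
    "--bundle_file=drum_kit_rnn.mag",
    "--output_dir=generated/drums",
    "--num_outputs=1",
    "--num_steps=128",
    "--primer_drums=[(36,)]",
], check=True)

# Stage 4 (hypothetical entry point): GETMusic conditions on the drum
# track and fills in the remaining instrument tracks to form the beat.
subprocess.run([
    "python", "getmusic/generate.py",
    "--condition", "generated/drums/drums.mid",
    "--output", "generated/beat.mid",
], check=True)
```

The generated verse and beat would then be handed to a vocalist (or a vocal synthesis step) to produce the final hip-hop samples described above.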