Mustango: Toward Controllable Text-to-Music Generation

Demo | Model | Website and Examples | Paper | Dataset

Meet Mustango, a new addition to the landscape of Multimodal Large Language Models, designed for controllable music generation. Mustango combines the Latent Diffusion Model (LDM) and Flan-T5 encoder of Tango with musical features to generate music from text.

🔥 Live demo available on Replicate and HuggingFace.

Quickstart Guide

Generate music from a text prompt:

import IPython
import soundfile as sf
from mustango import Mustango

model = Mustango("declare-lab/mustango")

prompt = "This is a new age piece. There is a flute playing the main melody with a lot of staccato notes. The rhythmic background consists of a medium tempo electronic drum beat with percussive elements all over the spectrum. There is a playful atmosphere to the piece. This piece can be used in the soundtrack of a children's TV show or an advertisement jingle."

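# Generate a waveform from the prompt
music = model.generate(prompt)
# Save under a short name: using the full prompt as the filename can exceed filesystem length limits
sf.write("mustango_output.wav", music, samplerate=16000)
IPython.display.Audio(data=music, rate=16000)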
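
When saving one file per prompt, note that a prompt as long as the one above is longer than the 255-byte filename limit common on many filesystems, so it cannot be used directly as a filename. The sketch below shows one possible way to derive a short, filesystem-safe name from a prompt. `prompt_to_filename` is a hypothetical helper written for illustration; it is not part of the Mustango package, and the 60-character cap and 8-character hash suffix are arbitrary choices.

```python
import hashlib
import re


def prompt_to_filename(prompt: str, max_len: int = 60, ext: str = ".wav") -> str:
    """Hypothetical helper (not part of Mustango): short, safe filename from a prompt."""
    # Lowercase and collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")
    # Truncate the readable part; fall back to a fixed stem if nothing is left.
    slug = slug[:max_len].rstrip("-") or "prompt"
    # A short hash of the full prompt keeps different long prompts from colliding.
    digest = hashlib.sha1(prompt.encode("utf-8")).hexdigest()[:8]
    return f"{slug}-{digest}{ext}"
```

The result can be passed to `sf.write` in place of a fixed filename, for example `sf.write(prompt_to_filename(prompt), music, samplerate=16000)`.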

Installation

git clone https://github.com/declare-lab/tango
cd tango/mustango
pip install -r requirements.txt
cd diffusers
pip install -e .