Demo | Model | Website and Examples | Paper | Dataset
Meet Mustango, a new addition to the landscape of Multimodal Large Language Models, designed for controlled music generation. Mustango combines the Latent Diffusion Model (LDM) and Flan-T5 encoder of Tango with musical features to control the generated music.
🔥 Live demo available on Replicate and HuggingFace.
Generate music from a text prompt:
import IPython
import soundfile as sf
from mustango import Mustango
model = Mustango("declare-lab/mustango")
prompt = "This is a new age piece. There is a flute playing the main melody with a lot of staccato notes. The rhythmic background consists of a medium tempo electronic drum beat with percussive elements all over the spectrum. There is a playful atmosphere to the piece. This piece can be used in the soundtrack of a children's TV show or an advertisement jingle."
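# generate() returns the waveform, which is written and played at 16 kHz
music = model.generate(prompt)
# Use a short filename: the full prompt is longer than most filesystems allow
sf.write("mustango_output.wav", music, samplerate=16000)
IPython.display.Audio(data=music, rate=16000)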
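When generating clips for several prompts, it is convenient to name each file after its prompt. The full prompt above is well over the 255-byte filename limit of most filesystems, so it needs shortening first. The following is a minimal sketch of one way to do that; `prompt_to_filename` and its `max_len` parameter are hypothetical names for illustration and are not part of the Mustango package.

```python
import re


def prompt_to_filename(prompt, max_len=60, ext=".wav"):
    """Derive a short, filesystem-safe filename from a text prompt.

    Hypothetical helper for illustration; not part of the mustango package.
    """
    # Lowercase, then collapse every run of non-alphanumeric characters into "_"
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    # Truncate to max_len characters and drop any trailing "_" left by the cut
    slug = slug[:max_len].rstrip("_")
    # Fall back to a generic name if the prompt had no usable characters
    return (slug or "output") + ext


print(prompt_to_filename("This is a new age piece."))
```

The result can be passed to `sf.write` in place of a fixed filename. Truncation means two prompts sharing a long common prefix would map to the same name, so a counter or hash suffix may be worth adding for batch use.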
Installation:
git clone https://github.com/declare-lab/tango
cd tango/mustango
pip install -r requirements.txt
cd diffusers
pip install -e .