MelodyFlow
This is your private demo for MelodyFlow, a fast text-guided music generation and editing model based on a single-stage flow matching DiT, presented in ["High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching"](https://huggingface.co/papers/2407.03648).
Inputs: Model, Input Text, ODE Solver, Inference steps, Target Flow step, Regularize, Regularization Strength, Duration, File or Microphone.
More details
The model will generate a short music extract based on the description you provided. The model can generate or edit up to 30 seconds of audio in one pass.
The model was trained with descriptions from a stock music catalog. Descriptions that work best include some detail on the instruments present, along with an intended use case (e.g. adding "perfect for a commercial" can help).
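As an illustration of the advice above, here is a minimal sketch of how such a description could be assembled from a style, a list of instruments, and an optional use case. The helper name `build_description` and its parameters are hypothetical and are not part of MelodyFlow or audiocraft; the exact wording that works best may differ.

```python
def build_description(style, instruments, use_case=None):
    """Compose a text prompt naming the style, instruments, and an optional use case.

    Hypothetical helper for illustration only; not part of MelodyFlow or audiocraft.
    """
    # Keep only non-empty instrument names, stripped of surrounding whitespace.
    names = [name.strip() for name in instruments if name and name.strip()]
    prompt = style.strip()
    if names:
        prompt += " with " + ", ".join(names)
    if use_case:
        # Mirrors the "perfect for a commercial" phrasing suggested above.
        prompt += ", perfect for " + use_case.strip()
    return prompt


example = build_description(
    "Upbeat funk track",
    ["slap bass", "brass section", "drums"],
    use_case="a commercial",
)
print(example)
```

The resulting string can be pasted into the Input Text field of the demo.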
You can optionally provide a reference audio from which the model will elaborate an edited version based on the text description, using MelodyFlow's regularized latent inversion.
WARNING: Longer durations take longer to generate.
Available models are:
- facebook/melodyflow-t24-30secs (1B)
See github.com/facebookresearch/audiocraft for more details.