---
pipeline_tag: text-generation
tags:
- music-generation
- transformer
- MoE
- ALiBi
- FlashAttention
- melody-generation
- rhythmic-modeling
---
|
|
# Model Card for MORTM (Metric-Oriented Rhythmic Transformer for Melodic generation) |
|
|
|
|
|
MORTM is a Transformer-based model designed for melody generation, with a strong emphasis on metric (rhythmic) structure. It represents music as sequences of pitch, duration, and beat position relative to the start of the current measure (on a 96-tick-per-measure grid), which makes the representation robust to where a phrase occurs in a piece and well suited to rhythm-aware generation tasks.
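
As a concrete illustration, the sketch below shows how such a representation can be serialized in Python. The token names (`Pos_*`, `Dur_*`, `Pitch_*`) are hypothetical placeholders, not MORTM's actual vocabulary, which is defined by the released tokenizer:

```python
TICKS_PER_MEASURE = 96  # measure grid resolution described above

# Hypothetical note events within one 4/4 measure:
# (onset position in ticks, duration in ticks, MIDI pitch)
notes = [(0, 24, 60), (24, 24, 62), (48, 48, 64)]  # quarter C4, quarter D4, half E4

# Serialize each note as a [Position, Duration, Pitch] token triple.
tokens = []
for position, duration, pitch in notes:
    assert 0 <= position < TICKS_PER_MEASURE  # positions are measure-relative
    tokens += [f"Pos_{position}", f"Dur_{duration}", f"Pitch_{pitch}"]

print(tokens)
# ['Pos_0', 'Dur_24', 'Pitch_60', 'Pos_24', 'Dur_24', 'Pitch_62',
#  'Pos_48', 'Dur_48', 'Pitch_64']
```

Because positions restart at every measure, the same rhythmic figure maps to the same tokens wherever it appears in a piece, which is the metric robustness referred to above.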
|
|
|
|
|
|
|
|
|
|
## Model Details |
|
|
|
|
|
### Model Description |
|
|
|
|
|
MORTM (Metric-Oriented Rhythmic Transformer for Melodic generation) is a decoder-only Transformer architecture optimized for music generation with rhythmic awareness. It generates melodies measure-by-measure in an autoregressive fashion. The model supports chord-conditional generation and is equipped with the following features: |
|
|
|
|
|
- Mixture-of-Experts (MoE) feed-forward layers, which increase model capacity without a proportional increase in per-token compute.

- ALiBi (Attention with Linear Biases) for relative positional biasing, which also allows extrapolation to sequences longer than those seen during training (see the sketch after this list).

- FlashAttention-2 for fast, memory-efficient attention.

- Relative tick-based tokenization (e.g., [Position, Duration, Pitch]) for metric robustness.
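
For reference, ALiBi adds a head-specific linear penalty on query-key distance to the attention logits. The PyTorch sketch below shows the generic formulation from Press et al. (2022), not MORTM's exact implementation, and assumes the number of heads is a power of two:

```python
import torch

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    """Additive attention-logit bias that penalizes attending to distant keys."""
    # Head-specific geometric slopes: 2^(-8*1/n), 2^(-8*2/n), ..., 2^(-8).
    slopes = torch.tensor([2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)])
    # distance[i, j] = j - i, which is <= 0 for the keys a causal query may attend to.
    positions = torch.arange(seq_len)
    distance = positions[None, :] - positions[:, None]
    # Shape (n_heads, seq_len, seq_len); added to attention scores before softmax
    # (the causal mask is applied separately).
    return slopes[:, None, None] * distance[None, :, :]
```

Because the bias depends only on relative distance rather than absolute position, it pairs naturally with the measure-relative tokenization described above.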
|
|
|
|
|
- **Developed by:** Koue Okazaki & Takaki Nagoshi |
|
|
- **Funded by:** Nihon University, Graduate School of Integrated Basic Sciences
|
|
- **Shared by:** ProjectMORTM
|
|
- **Model type:** Transformer (decoder-only with MoE and ALiBi) |
|
|
- **Language(s) (NLP):** N/A (music domain) |
|
|
- **License:** MIT |
|
|
- **Finetuned from model:** None; trained from scratch (not fine-tuned from a pretrained LM)
|
|
|
|
|
### Model Sources
|
|
|
|
|
- **Repository:** [https://github.com/Ayato964/MORTM](https://github.com/Ayato964/MORTM)
|
|
- **Paper:** In submission
|
|
- **Demo:** Coming soon
|
|
|
|
|
## Uses |
|
|
|
|
|
### Direct Use |
|
|
|
|
|
MORTM can generate melodies from scratch or conditioned on a chord progression. It is well suited to:
|
|
|
|
|
- Melody composition in pop, jazz, and improvisational styles. |
|
|
- Real-time melodic suggestion systems for human-AI co-creation. |
|
|
- Music education and melody completion tools. |
|
|
|
|
|
### Downstream Use
|
|
|
|
|
- Style transfer with different chord inputs. |
|
|
- Harmonization and rhythm-based accompaniment systems. |
|
|
|
|
|
### Out-of-Scope Use |
|
|
|
|
|
- Audio-to-audio tasks (e.g., vocal separation).

- Raw audio synthesis (the model outputs symbolic tokens; producing sound requires a separate synthesizer or vocoder).

- Genre classification or music recommendation.
|
|
|
|
|
## Bias, Risks, and Limitations |
|
|
|
|
|
As the training dataset is primarily composed of Western tonal music, the model may underperform on: |
|
|
|
|
|
- Non-tonal, microtonal, or traditional music styles. |
|
|
- Polyrhythmic or tempo-variable music. |
|
|
- Genres not sufficiently represented in training data (e.g., Indian classical). |
|
|
|
|
|
### Recommendations |
|
|
|
|
|
Generated melodies should be reviewed by a human before use in professional music contexts. Users are encouraged to retrain or fine-tune the model on representative datasets before applying it to culturally specific music.
|
|
|
|
|
## How to Get Started with the Model |
|
|
|
|
|
```python |
|
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
|
|
|
model = AutoModelForCausalLM.from_pretrained("nagoshidayo/mortm") |
|
|
tokenizer = AutoTokenizer.from_pretrained("nagoshidayo/mortm")

# NOTE: the exact prompt format (e.g., chord and metric-position tokens) is
# defined by MORTM's tokenizer; the prompt below is an illustrative sketch only.
inputs = tokenizer("C:maj7", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0]))
```
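
The output is a sequence of symbolic music tokens rather than audio; converting it back to MIDI (and then to sound) requires a separate decoding and synthesis step.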