DLM Vi2En

This is a Vietnamese to English translation model based on the DLM architecture.

Base model: FacebookAI/xlm-roberta-large

Requirements

Please ensure you have the following library versions installed:

pip install "torch>=2.9.1" "transformers>=4.57.3"
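
To confirm that the installed versions meet these minimums, you can run a quick check (a minimal sketch; the version numbers are the ones from the command above):

import torch
import transformers

# Print the installed versions so they can be compared against the
# minimums above (torch >= 2.9.1, transformers >= 4.57.3).
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)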

Inference

Below is the Python code to run the model. It automatically uses the GPU when one is available and loads the model from the local cache after the first run.

import torch
from transformers import AutoTokenizer, AutoModel

# 1. Configuration
repo_id = "myduy/dlm-vi2en-checkpoint-90000"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# 2. Load Model & Tokenizer
# trust_remote_code=True is required for custom architectures
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True).to(device)
model.eval()

# 3. Prepare Input
text = "cậu có muốn đến nghe không?"  # Vietnamese for "Do you want to come and listen?"
inputs = tokenizer(text, return_tensors="pt").to(device)

# 4. Generate
with torch.no_grad():
    output_tokens = model.generate(
        inputs.input_ids, 
        max_iterations=50, 
        temperature=1.0, 
        strategy="reparam-uncond-deterministic-cosine"
    )

# 5. Decode Output
output_text = tokenizer.batch_decode(output_tokens, skip_special_tokens=True)[0]

print(f"Input: {text}")
print(f"Output: {output_text}")