Lapa LLM model card

Model Information

Introducing Lapa LLM v0.1.2 — the most efficient Ukrainian open-source language model

Description

Demo page: https://huggingface.co/spaces/lapa-llm/lapa
Link to Lapa Models: https://huggingface.co/collections/lapa-llm/lapa-v012-release

TBD: Datasets will be published soon.

Today, we proudly present Lapa LLM — a cutting-edge open large language model based on Gemma-3-12B with a focus on Ukrainian language processing. The project is the result of many months of work by a team of Ukrainian researchers in artificial intelligence from the Ukrainian Catholic University, AGH University of Krakow, Igor Sikorsky Kyiv Polytechnic Institute, and Lviv Polytechnic, who united to create the best model for Ukrainian language processing.

The model is named in honor of Valentyn Lapa, who, together with Oleksiy Ivakhnenko, created the Group Method of Data Handling, a predecessor of deep learning.

The project's goal is to create the best model for Ukrainian language processing with open datasets for pretraining and instruction tuning.

Key Achievements

Best tokenizer for the Ukrainian language

Thanks to a SOTA tokenizer-adaptation method developed by Mykola Haltiuk as part of this project, 80,000 of the 250,000 tokens were replaced with Ukrainian ones without loss of model quality, making Lapa LLM the fastest model for working with the Ukrainian language. Compared to the original Gemma 3, the model requires 1.5 times fewer tokens for Ukrainian text, and thus performs roughly three times fewer computations to achieve better results.
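You can check the tokenizer efficiency yourself by counting tokens for the same Ukrainian text under both tokenizers. This is a minimal sketch; the Gemma model ID is assumed from the public Gemma 3 release, and the exact ratio varies by text:

from transformers import AutoTokenizer

# "Kyiv is the capital of Ukraine and one of the largest cities in Europe."
text = "Київ є столицею України та одним із найбільших міст Європи."

# Count how many tokens each tokenizer needs for the same Ukrainian sentence.
for model_id in ["google/gemma-3-12b-it", "lapa-llm/lapa-v0.1.2-instruct"]:
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    print(model_id, "->", len(tokenizer.encode(text)), "tokens")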

Most efficient instruction-tuned model on the market

In some benchmark categories, our instruction-tuned model is only slightly behind the current leader, MamayLM. The team is actively working on new datasets to improve benchmark scores further, and we plan to surpass MamayLM with the v1.0 model.

Benchmark Results

  • Best English-to-Ukrainian and Ukrainian-to-English translation, with 33 BLEU on FLORES, which allows for natural and cost-effective translation of new NLP datasets into Ukrainian
  • One of the best models for image processing in Ukrainian in its size class, as measured on the MMZNO benchmark
  • One of the best models for summarization and question answering, which means excellent performance for RAG
  • Tests on propaganda and disinformation questions show the effectiveness of the filtering approach at the pretraining stage and during instruction fine-tuning

Model measurements and comparisons will be published as part of the Ukrainian LLM Leaderboard project; subscribe to the Telegram channel for further news.

Leader in pretraining results

Lapa LLM demonstrates the best performance on pretraining benchmarks for Ukrainian language processing, which opens opportunities for other researchers to adapt it to their own tasks.

The model was trained on data scored by several quality-assessment models, covering propaganda and disinformation detection, readability, and grammar. In the final stages of training, the model was trained on high-quality materials provided for commercial use by the Open Data division of Harvard Library.
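The published datasets and code document the actual filtering pipeline; purely as an illustration, a threshold filter over such quality scores could look like the following sketch (the scorer model ID, label, and threshold here are hypothetical placeholders, not the project's real classifiers):

from transformers import pipeline

# Hypothetical quality scorer; the project's real propaganda, readability,
# and grammar classifiers are documented alongside the published datasets.
scorer = pipeline("text-classification", model="your-org/quality-scorer")

def keep_document(text: str, threshold: float = 0.8) -> bool:
    # Keep a document only if the scorer labels it high quality with enough confidence.
    result = scorer(text[:2000])[0]  # truncate long documents for the classifier
    return result["label"] == "high_quality" and result["score"] >= threshold

corpus = ["..."]  # raw documents
filtered = [doc for doc in corpus if keep_document(doc)]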

Maximum openness and transparency

Unlike most available models, Lapa LLM is a maximally open project:

  • The model is available for commercial use
  • Approximately 25 datasets for model training have been published
  • Data filtering and processing methods are disclosed, including those for detecting disinformation and propaganda
  • The model's source code is open
  • Documentation of the training process is available

This openness allows for the development of the Ukrainian NLP community and helps businesses obtain a tool for the most efficient Ukrainian language processing in terms of both computation and results.

Application Possibilities

Lapa LLM opens wide possibilities for:

  • Processing sensitive documents without transferring data to external servers
  • Working with Ukrainian texts taking into account cultural and historical context without code-switching to Russian or other languages
  • Building RAG systems and chatbots that write in proper Ukrainian
  • Developing specialized solutions through the ability to fine-tune for specific tasks
  • Machine translation, with the best English-to-Ukrainian and Ukrainian-to-English quality among the models measured, including API providers (see the sketch after this list)
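A minimal translation sketch, using the same pipeline API shown in the Usage section below (the prompt wording here is an assumption, not a documented prompt format):

from transformers import pipeline
import torch

pipe = pipeline(
    "image-text-to-text",
    model="lapa-llm/lapa-v0.1.2-instruct",
    device="cuda",
    torch_dtype=torch.bfloat16
)

messages = [
    {
        "role": "user",
        "content": [{"type": "text", "text": "Translate to Ukrainian: The weather is lovely today."}]
    }
]

output = pipe(text=messages, max_new_tokens=100)
print(output[0]["generated_text"][-1]["content"])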

Next Steps

  • Complete development of the reasoning model
  • Collecting community feedback on the model's performance; we look forward to receiving it on GitHub or Hugging Face!
  • Collecting additional datasets for image processing in Ukrainian
  • Collecting additional datasets for instruction following and programming

Acknowledgment to Sponsors

The creation of Lapa LLM was made possible by the support of our partners and sponsors, primarily the startup Comand.AI, which provided computational resources for training the model. We also thank ELEKS, which supported this project through a grant dedicated to the memory of Oleksiy Skrypnyk, and Hugging Face, which provided the team with a free enterprise subscription for storing models and datasets.

Links:

Try the model: https://huggingface.co/spaces/lapa-llm/lapa
Code: https://github.com/lapa-llm/lapa-llm

Subscribe to the Telegram channel for further news about the project: https://t.me/pehade_blog

Inputs and outputs

  • Input:

    • Text string, such as a question, a prompt, or a document to be summarized
    • Images, normalized to 896 x 896 resolution and encoded to 256 tokens each
    • Total input context of 128K tokens (inherited from the Gemma-3-12B base model)
  • Output:

    • Generated text in response to the input, such as an answer to a question, analysis of image content, or a summary of a document
    • Total output context of 8192 tokens
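Because each image costs a fixed 256 tokens, budgeting the context window is simple arithmetic; a small sketch using the numbers above:

# Rough context budgeting with the figures listed above.
CONTEXT_WINDOW = 128_000    # total input context in tokens
TOKENS_PER_IMAGE = 256      # each 896 x 896 image is encoded to 256 tokens

def remaining_text_budget(num_images: int) -> int:
    # Tokens left for text after accounting for the images in the prompt.
    return CONTEXT_WINDOW - num_images * TOKENS_PER_IMAGE

print(remaining_text_budget(4))  # 126976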

Usage

Below are some code snippets to help you get started quickly with the model. First, install the Transformers library; Gemma 3 is supported starting from transformers 4.50.0.

$ pip install -U transformers

Then, copy the snippet from the section that is relevant for your use case.

Running with the pipeline API

You can initialize the model and processor for inference with pipeline as follows.

from transformers import pipeline
import torch

pipe = pipeline(
    "image-text-to-text",
    model="lapa-llm/lapa-v0.1.2-instruct",
    device="cuda",
    torch_dtype=torch.bfloat16
)

With instruction-tuned models, you first need to format your inputs with the chat template, then pass them to the pipeline.

messages = [
    {
        "role": "system",
        "content": [{"type": "text", "text": "You are a helpful assistant."}]
    },
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    }
]

output = pipe(text=messages, max_new_tokens=200)
print(output[0]["generated_text"][-1]["content"])
# Okay, let's take a look! 
# Based on the image, the animal on the candy is a **turtle**. 
# You can see the shell shape and the head and legs.
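
The same pipeline handles text-only chats as well; simply omit the image entry (a small variation on the snippet above):

messages = [
    {
        "role": "user",
        "content": [{"type": "text", "text": "Розкажи коротко про Івана Франка."}]  # "Tell me briefly about Ivan Franko."
    }
]

output = pipe(text=messages, max_new_tokens=200)
print(output[0]["generated_text"][-1]["content"])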

Running the model on a single / multi GPU

# pip install accelerate

from transformers import AutoProcessor, Gemma3ForConditionalGeneration
from PIL import Image
import requests
import torch

model_id = "lapa-llm/lapa-v0.1.2-instruct"

model = Gemma3ForConditionalGeneration.from_pretrained(
    model_id, device_map="auto"
).eval()

processor = AutoProcessor.from_pretrained(model_id)

messages = [
    {
        "role": "system",
        "content": [{"type": "text", "text": "You are a helpful assistant."}]
    },
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg"},
            {"type": "text", "text": "Опиши зображення"}
        ]
    }
]

inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)

input_len = inputs["input_ids"].shape[-1]

with torch.inference_mode():
    generation = model.generate(**inputs, max_new_tokens=100, do_sample=False)
    generation = generation[0][input_len:]

decoded = processor.decode(generation, skip_special_tokens=True)
print(decoded)

# **Overall Impression:** The image is a close-up shot of a vibrant garden scene, 
# focusing on a cluster of pink cosmos flowers and a busy bumblebee. 
# It has a slightly soft, natural feel, likely captured in daylight.
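
If GPU memory is tight, the same model can be loaded in 4-bit precision via bitsandbytes. This is a sketch, assuming the bitsandbytes package is installed; quantization can slightly change outputs:

# pip install bitsandbytes

from transformers import AutoProcessor, BitsAndBytesConfig, Gemma3ForConditionalGeneration
import torch

model_id = "lapa-llm/lapa-v0.1.2-instruct"

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = Gemma3ForConditionalGeneration.from_pretrained(
    model_id, device_map="auto", quantization_config=quantization_config
).eval()

processor = AutoProcessor.from_pretrained(model_id)
# From here, reuse the apply_chat_template / generate flow from the snippet above.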

Citation

TBD
