---
language:
- en
license: apache-2.0
library_name: llm2ner
base_model: meta-llama/Llama-3.2-1B
tags:
- ner
- span-detection
- llm
- pytorch
pipeline_tag: token-classification
model_name: ToMMeR-Llama-3.2-1B_L7_R64
source: https://github.com/VictorMorand/llm2ner
paper: https://arxiv.org/abs/2510.19410
---
# ToMMeR-Llama-3.2-1B_L7_R64
ToMMeR is a lightweight probing model that extracts emergent mention detection capabilities from the early-layer representations of an LLM backbone, achieving high zero-shot recall across a wide set of 13 NER benchmarks.
## Checkpoint Details
| Property  | Value |
|-----------|-------|
| Base LLM  | `meta-llama/Llama-3.2-1B` |
| Layer     | 7|
| #Params   | 264.2K |
# Usage
## Installation
Our code can be installed with pip directly from GitHub. Please visit the [repository](https://github.com/VictorMorand/llm2ner) for more details.
```bash
pip install git+https://github.com/VictorMorand/llm2ner.git
```
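As a quick sanity check after installation, you can load this checkpoint and inspect its attributes. This is only a minimal sketch: `llm_name` and `layer` are the attributes used in the examples below, and the parameter count relies on standard PyTorch (printed values may differ slightly).
```python
from llm2ner import ToMMeR

# load the probe from the Hugging Face Hub and print its checkpoint details
tommer = ToMMeR.from_pretrained("llm2ner/ToMMeR-Llama-3.2-1B_L7_R64")
print(tommer.llm_name)  # expected: meta-llama/Llama-3.2-1B
print(tommer.layer)     # expected: 7
print(sum(p.numel() for p in tommer.parameters()))  # roughly 264K parameters
```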
## Fancy Outputs
```python
import llm2ner
from llm2ner import ToMMeR
tommer = ToMMeR.from_pretrained("llm2ner/ToMMeR-Llama-3.2-1B_L7_R64")
# load backbone LLM, optionally cutting unused layers to save GPU memory
llm = llm2ner.utils.load_llm(tommer.llm_name, cut_to_layer=tommer.layer)
tommer.to(llm.device)
text = "Large language models are awesome. While trained on language modeling, they exhibit emergent Zero Shot abilities that make them suitable for a wide range of tasks, including Named Entity Recognition (NER). "
# fancy interactive output
outputs = llm2ner.plotting.demo_inference(
    text, tommer, llm,
    decoding_strategy="threshold",  # or "greedy" for flat segmentation
    threshold=0.5,  # default 50%
    show_attn=True,
)
```
(Interactive demo output: the input sentence is rendered with predicted mention spans such as *Large language models*, *language modeling*, *Named Entity Recognition*, and *NER* highlighted.)
## Raw Inference
By default, ToMMeR outputs span probabilities, but built-in decoding strategies are also provided to extract entities.
- Inputs:
  - tokens (batch, seq): tokens to process.
  - model: backbone LLM to extract representations from.
- Outputs: a (batch, seq, seq) score matrix (masked outside valid spans).
```python
import llm2ner
from llm2ner import ToMMeR

tommer = ToMMeR.from_pretrained("llm2ner/ToMMeR-Llama-3.2-1B_L7_R64")
# load backbone LLM, optionally cutting unused layers to save GPU memory
llm = llm2ner.utils.load_llm(tommer.llm_name, cut_to_layer=tommer.layer)
tommer.to(llm.device)

text = ["Large language models are awesome"]
print(f"Input text: {text[0]}")

# tokenize to shape (1, seq_len)
tokens = llm.tokenizer(text, return_tensors="pt")["input_ids"].to(llm.device)

# raw span scores
output = tommer.forward(tokens, llm)  # (batch_size, seq_len, seq_len)
print(f"Raw Output shape: {output.shape}")

# use the chosen decoding strategy to infer entities
entities = tommer.infer_entities(tokens=tokens, model=llm, threshold=0.5, decoding_strategy="greedy")
str_entities = [llm.tokenizer.decode(tokens[0, b:e + 1]) for b, e in entities[0]]
print(f"Predicted entities: {str_entities}")

>>> Input text: Large language models are awesome
>>> Raw Output shape: torch.Size([1, 6, 6])
>>> Predicted entities: ['Large language models']
```
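If you need custom post-processing, you can also threshold the raw score matrix yourself instead of calling `infer_entities`. The snippet below is only a sketch: it assumes `output[0, i, j]` is the probability that tokens `i..j` (inclusive) form a mention and that masked (invalid) spans score below the threshold; see the repository for the exact convention.
```python
import torch

# illustrative manual decoding of the (seq_len, seq_len) span-score matrix
threshold = 0.5
scores = output[0]                              # scores for the single sentence in the batch
starts, ends = torch.where(scores > threshold)  # indices of spans above the threshold
spans = [(int(b), int(e)) for b, e in zip(starts, ends) if b <= e]
print([llm.tokenizer.decode(tokens[0, b:e + 1]) for b, e in spans])
```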
Please visit the [repository](https://github.com/VictorMorand/llm2ner) for more details and a demo notebook.
## Evaluation Results
| dataset             |   precision |   recall |     f1 |   n_samples |
|---------------------|-------------|----------|--------|-------------|
| MultiNERD           |      0.1744 |   0.9918 | 0.2966 |      154144 |
| CoNLL 2003          |      0.2588 |   0.9489 | 0.4067 |       16493 |
| CrossNER_politics   |      0.2712 |   0.9786 | 0.4246 |        1389 |
| CrossNER_AI         |      0.2838 |   0.9791 | 0.4401 |         879 |
| CrossNER_literature |      0.3196 |   0.9582 | 0.4793 |         916 |
| CrossNER_science    |      0.3124 |   0.9687 | 0.4724 |        1193 |
| CrossNER_music      |      0.3591 |   0.9768 | 0.5252 |         945 |
| ncbi                |      0.1054 |   0.9394 | 0.1896 |        3952 |
| FabNER              |      0.2696 |   0.8015 | 0.4034 |       13681 |
| WikiNeural          |      0.1672 |   0.9882 | 0.286  |       92672 |
| GENIA_NER           |      0.201  |   0.9722 | 0.3332 |       16563 |
| ACE 2005            |      0.2545 |   0.4826 | 0.3332 |        8230 |
| Ontonotes           |      0.2089 |   0.7736 | 0.3289 |       42193 |
| Aggregated          |      0.1886 |   0.9418 | 0.3142 |      353250 |
| Mean                |      0.2451 |   0.9046 | 0.3784 |      353250 |
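The metrics above are computed over predicted mention spans. A minimal sketch of how such span-level precision/recall/F1 can be computed is shown below; it is illustrative rather than the official evaluation script, and `pred_spans`/`gold_spans` are hypothetical per-sentence sets of (start, end) spans.
```python
# illustrative span-level precision/recall/F1 (not the official evaluation code)
def span_prf1(pred_spans, gold_spans):
    """pred_spans, gold_spans: lists of per-sentence sets of (start, end) spans."""
    tp = sum(len(p & g) for p, g in zip(pred_spans, gold_spans))
    n_pred = sum(len(p) for p in pred_spans)
    n_gold = sum(len(g) for g in gold_spans)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# hypothetical example: one sentence, one predicted span, two gold spans
p, r, f = span_prf1(pred_spans=[{(0, 2)}], gold_spans=[{(0, 2), (4, 4)}])
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")  # precision=1.00 recall=0.50 f1=0.67
```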
## Citation
If you use this model or the approach, please cite the associated paper:
```
@misc{morand2025tommerefficiententity,
      title={ToMMeR -- Efficient Entity Mention Detection from Large Language Models}, 
      author={Victor Morand and Nadi Tomeh and Josiane Mothe and Benjamin Piwowarski},
      year={2025},
      eprint={2510.19410},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.19410}, 
}
```
## License
Apache-2.0 (see repository for full text).