📗 SPECTER2–MAG (Multiclass Classification on MAG Level-0 Fields of Study)

This model is a fine-tuned version of allenai/specter2_base for multiclass bibliometric classification using MAG Fields of Study – Level 0 (SciDocs). It achieves the following results on the evaluation set:

  • Loss: 1.0598
  • Accuracy: 0.8310
  • Precision Micro: 0.8310
  • Precision Macro: 0.8290
  • Recall Micro: 0.8310
  • Recall Macro: 0.8276
  • F1 Micro: 0.8310
  • F1 Macro: 0.8263

Model description

This model is a fine-tuned version of SPECTER2 (allenai/specter2_base) adapted for multiclass classification across the 19 top-level Fields of Study (FoS) from the Microsoft Academic Graph (MAG).

The model accepts the title, abstract, or title + abstract of a scientific publication and assigns it to exactly one of the MAG Level-0 domains (e.g., Biology, Chemistry, Computer Science, Engineering, Psychology).

Key characteristics:

  • Base model: allenai/specter2_base
  • Task: multiclass document classification
  • Labels: 19 MAG Field of Study Level-0 categories
  • Activation: softmax
  • Loss: CrossEntropyLoss
  • Output: single best-matching FoS category

MAG Level-0 represents broad disciplinary domains designed for high-level categorization of scientific documents.
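
A minimal inference sketch using the Transformers text-classification pipeline is shown below. The repository id SIRIS-Lab/specter2-mag-multiclass is this model's Hub id; the title, abstract, and printed output are purely illustrative.

```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hub.
classifier = pipeline(
    "text-classification",
    model="SIRIS-Lab/specter2-mag-multiclass",
)

# Title + abstract is the recommended input format.
title = "Deep learning for protein structure prediction"
abstract = "We present a neural architecture that predicts tertiary structure from sequence ..."

result = classifier(f"{title} {abstract}", truncation=True, max_length=512)
print(result)
# e.g. [{'label': 'Biology', 'score': 0.97}]  -- illustrative output only
```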

Intended uses & limitations

Intended uses

This multiclass MAG model is suitable for:

  • Assigning publications to top-level scientific disciplines
  • Enriching metadata in:
    • repositories
    • research output systems
    • funding and project datasets
    • bibliometric dashboards
  • Supporting scientometric analyses such as:
    • broad-discipline portfolio mapping
    • domain-level clustering
    • modeling research diversification
  • Classifying documents when only title/abstract is available

The model supports inputs such as:

  • title only
  • abstract only
  • title + abstract (recommended)
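
A small helper for building the input string from whichever fields are available might look like the sketch below. Joining title and abstract with the tokenizer's separator token follows the SPECTER convention; the exact joining scheme used for this fine-tune is an assumption (training itself used abstracts only, see below).

```python
from typing import Optional
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("SIRIS-Lab/specter2-mag-multiclass")

def build_input(title: Optional[str] = None, abstract: Optional[str] = None) -> str:
    """Combine the available fields into a single classifier input."""
    parts = [p for p in (title, abstract) if p]
    # Joining with the separator token mirrors SPECTER's title/abstract format;
    # a plain space is a reasonable alternative.
    return f" {tokenizer.sep_token} ".join(parts)

print(build_input(title="A survey of federated learning"))            # title only
print(build_input(abstract="We review decentralised training ..."))   # abstract only
print(build_input("A survey of federated learning",
                  "We review decentralised training ..."))            # title + abstract
```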

Limitations

  • MAG Level-0 categories are very coarse (e.g., Biology, Medicine, Engineering) and do not represent subfields.
  • Documents spanning multiple fields must be forced into a single label, an inherent limitation of multiclass classification.
  • The training labels come from MAG's automatic field assignment pipeline, not manual expert annotation.
  • Not suitable for:
    • fine-grained subdisciplines
    • downstream tasks requiring multilabel outputs
    • WoS Categories or ASJC Areas (use separate models)
    • clinical or regulatory decision-making

Predictions should be treated as high-level disciplinary metadata, not detailed field classification.

Training and evaluation data

Source dataset: SciDocs

Training data comes from the SciDocs dataset, introduced together with the original SPECTER paper.

SciDocs provides citation graphs, titles, abstracts, and MAG Fields of Study for scientific documents derived from MAG.
For this model, we use MAG Level-0 FoS, the 19 top-level scientific domains.

Dataset characteristics:

| Property | Value |
|---|---|
| Documents | ~40k scientific papers |
| Labels | 19 FoS Level-0 categories |
| Input fields | Abstract |
| Task type | Multiclass |
| Source | SciDocs (SPECTER paper) |
| License | CC-BY |

Training procedure

Preprocessing

  • Input text: the paper abstract (see the tokenization sketch below)
  • Tokenization using the SPECTER2 tokenizer
  • Maximum sequence length: 512 tokens
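
A tokenization sketch matching the settings above (SPECTER2 tokenizer, 512-token limit); padding to the maximum length is an assumption rather than something stated in the card.

```python
from transformers import AutoTokenizer

# The fine-tuned checkpoint uses the same vocabulary as allenai/specter2_base.
tokenizer = AutoTokenizer.from_pretrained("allenai/specter2_base")

abstract = "We propose a transformer-based approach to citation recommendation ..."
encoding = tokenizer(
    abstract,
    truncation=True,       # abstracts longer than 512 tokens are cut off
    max_length=512,
    padding="max_length",  # assumption: pad every example to the full length
    return_tensors="pt",
)
print(encoding["input_ids"].shape)  # torch.Size([1, 512])
```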

Model

  • Base model: allenai/specter2_base
  • Classification head: linear layer โ†’ softmax
  • Loss: CrossEntropyLoss
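
A sketch of the architecture described above: the SPECTER2 encoder with a 19-way linear head, where supplying labels makes the forward pass compute CrossEntropyLoss. The example text and label index are placeholders.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/specter2_base")
# Linear classification head over 19 MAG Level-0 classes; loading the base
# checkpoint initialises the head randomly (the published fine-tuned
# checkpoint already contains trained head weights).
model = AutoModelForSequenceClassification.from_pretrained(
    "allenai/specter2_base", num_labels=19
)

batch = tokenizer(["An abstract about enzyme kinetics ..."],
                  truncation=True, max_length=512, return_tensors="pt")
labels = torch.tensor([0])  # placeholder index of the gold MAG class

# With `labels` supplied, CrossEntropyLoss is applied internally;
# softmax over the logits is only needed to report class probabilities.
out = model(**batch, labels=labels)
print(out.loss, out.logits.softmax(dim=-1).shape)  # scalar loss, (1, 19)
```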

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
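
A TrainingArguments configuration reproducing these hyperparameters might look like the following sketch; the output directory name, the per-epoch evaluation strategy, and the use of the standard Trainer are assumptions not stated in the card.

```python
from transformers import TrainingArguments, Trainer

args = TrainingArguments(
    output_dir="specter2-mag-multiclass",  # assumed name
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=10,
    lr_scheduler_type="linear",
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    eval_strategy="epoch",  # assumed: the results table reports per-epoch metrics
)

# trainer = Trainer(model=model, args=args, processing_class=tokenizer,
#                   train_dataset=train_ds, eval_dataset=val_ds)  # placeholder datasets
# trainer.train()
```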

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision Micro | Precision Macro | Recall Micro | Recall Macro | F1 Micro | F1 Macro |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.2603 | 1.0 | 1094 | 0.6733 | 0.8243 | 0.8243 | 0.8315 | 0.8243 | 0.8198 | 0.8243 | 0.8222 |
| 0.1779 | 2.0 | 2188 | 0.6955 | 0.8240 | 0.8240 | 0.8198 | 0.8240 | 0.8203 | 0.8240 | 0.8176 |
| 0.1628 | 3.0 | 3282 | 0.8130 | 0.8315 | 0.8315 | 0.8296 | 0.8315 | 0.8265 | 0.8315 | 0.8269 |
| 0.1136 | 4.0 | 4376 | 0.9842 | 0.8227 | 0.8227 | 0.8254 | 0.8227 | 0.8192 | 0.8227 | 0.8205 |
| 0.0666 | 5.0 | 5470 | 1.0598 | 0.8310 | 0.8310 | 0.8290 | 0.8310 | 0.8276 | 0.8310 | 0.8263 |

Evaluation results

| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| Art | 0.654867 | 0.845714 | 0.738155 | 175 |
| Biology | 0.982222 | 0.973568 | 0.977876 | 227 |
| Business | 0.914894 | 0.877551 | 0.895833 | 196 |
| Chemistry | 0.97449 | 0.969543 | 0.97201 | 197 |
| Computer science | 0.960452 | 0.894737 | 0.926431 | 190 |
| Economics | 0.816425 | 0.782407 | 0.799054 | 216 |
| Engineering | 0.906103 | 0.927885 | 0.916865 | 208 |
| Environmental science | 0.975369 | 0.916667 | 0.945107 | 216 |
| Geography | 0.758454 | 0.912791 | 0.828496 | 172 |
| Geology | 0.96729 | 0.976415 | 0.971831 | 212 |
| History | 0.62987 | 0.518717 | 0.568915 | 187 |
| Materials science | 0.932432 | 0.958333 | 0.945205 | 216 |
| Mathematics | 0.938776 | 0.94359 | 0.941176 | 195 |
| Medicine | 0.982558 | 0.923497 | 0.952113 | 183 |
| Philosophy | 0.752874 | 0.748571 | 0.750716 | 175 |
| Physics | 0.964824 | 0.974619 | 0.969697 | 197 |
| Political science | 0.642512 | 0.661692 | 0.651961 | 201 |
| Psychology | 0.806283 | 0.758621 | 0.781726 | 203 |
| Sociology | 0.438889 | 0.427027 | 0.432877 | 185 |
| accuracy | | | 0.845641 | |
| macro avg | 0.842083 | 0.841681 | 0.840318 | 3751 |
| weighted avg | 0.847843 | 0.845641 | 0.845311 | 3751 |
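
A per-class report like the table above can be produced with scikit-learn's classification_report; the arrays below are toy placeholders, not the actual evaluation data.

```python
from sklearn.metrics import classification_report

# Toy placeholders: in practice y_true / y_pred are the gold and predicted
# class ids on the held-out split, and target_names the MAG Level-0 names.
y_true = [0, 1, 2, 1, 2]
y_pred = [0, 1, 1, 1, 2]
print(classification_report(y_true, y_pred,
                            target_names=["Art", "Biology", "Business"],
                            digits=6))
```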

Framework versions

  • Transformers 4.57.1
  • Pytorch 2.8.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.22.1