# SPECTER2-MAG (Multiclass Classification on MAG Level-0 Fields of Study)
This model is a fine-tuned version of allenai/specter2_base for multiclass bibliometric classification using MAG Fields of Study Level 0 (SciDocs). It achieves the following results on the evaluation set:
- Loss: 1.0598
- Accuracy: 0.8310
- Precision Micro: 0.8310
- Precision Macro: 0.8290
- Recall Micro: 0.8310
- Recall Macro: 0.8276
- F1 Micro: 0.8310
- F1 Macro: 0.8263
## Model description
This model is a fine-tuned version of SPECTER2 (allenai/specter2_base) adapted for multiclass classification across the 19 top-level Fields of Study (FoS) from the Microsoft Academic Graph (MAG).
The model accepts the title, abstract, or title + abstract of a scientific publication and assigns it to exactly one of the MAG Level-0 domains (e.g., Biology, Chemistry, Computer Science, Engineering, Psychology).
Key characteristics:
- Base model: allenai/specter2_base
- Task: multiclass document classification
- Labels: 19 MAG Field of Study Level-0 categories
- Activation: softmax
- Loss: CrossEntropyLoss
- Output: single best-matching FoS category
MAG Level-0 represents broad disciplinary domains designed for high-level categorization of scientific documents.
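For reference, here is a minimal inference sketch using the standard Transformers sequence-classification API; the repository id comes from this card, and the 512-token truncation, softmax activation, and single-label decoding mirror the description above.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "SIRIS-Lab/specter2-mag-multiclass"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Title + abstract in one string (the recommended input, per this card).
text = (
    "Deep residual learning for image recognition. "
    "We present a residual learning framework to ease the training of deep networks..."
)

inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

probs = logits.softmax(dim=-1)[0]  # softmax over the 19 MAG Level-0 labels
pred = int(probs.argmax())         # single best-matching category
print(model.config.id2label[pred], round(float(probs[pred]), 4))
```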
## Intended uses & limitations
### Intended uses
This multiclass MAG model is suitable for:
- Assigning publications to top-level scientific disciplines
- Enriching metadata in:
- repositories
- research output systems
- funding and project datasets
- bibliometric dashboards
- Supporting scientometric analyses such as:
- broad-discipline portfolio mapping
- domain-level clustering
- modeling research diversification
- Classifying documents when only title/abstract is available
The model supports inputs such as:
- title only
- abstract only
- title + abstract (recommended; see the sketch below)
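When combining title and abstract, SPECTER-style models conventionally join the two with the tokenizer's separator token. The exact template used to train this model is not documented on this card, so the helper below is an assumption, not the official recipe:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("SIRIS-Lab/specter2-mag-multiclass")

def build_input(title, abstract=None):
    # Assumed SPECTER-style template: "<title> [SEP] <abstract>".
    if abstract:
        return title + tokenizer.sep_token + abstract
    return title

text = build_input(
    "Deep residual learning for image recognition",
    "We present a residual learning framework...",
)
```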
### Limitations
- MAG Level-0 categories are very coarse (e.g., Biology, Medicine, Engineering), and do not represent subfields.
- Documents spanning multiple fields must be forced into a single label, an inherent limitation of multiclass classification.
- The training labels come from MAG's automatic field assignment pipeline, not manual expert annotation.
- Not suitable for:
- fine-grained subdisciplines
- downstream tasks requiring multilabel outputs
- WoS Categories or ASJC Areas (use separate models)
- clinical or regulatory decision-making
Predictions should be treated as high-level disciplinary metadata, not detailed field classification.
## Training and evaluation data
### Source dataset: SciDocs
Training data comes from the SciDocs dataset, introduced together with the original SPECTER paper.
SciDocs provides citation graphs, titles, abstracts, and MAG Fields of Study for scientific documents derived from MAG.
For this model, we use MAG Level-0 FoS, the 19 top-level scientific domains.
Dataset characteristics:
| Property | Value |
|---|---|
| Documents | ~40k scientific papers |
| Labels | 19 FoS Level-0 categories |
| Input fields | Abstract |
| Task type | Multiclass |
| Source | SciDocs (SPECTER paper) |
| License | CC-BY |
## Training procedure
### Preprocessing
- Input text constructed from the abstract (see the tokenization sketch below)
- Tokenization using the SPECTER2 tokenizer
- Maximum sequence length: 512 tokens
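A minimal sketch of this preprocessing step, assuming the training examples sit in a Hugging Face `datasets` object with a hypothetical `abstract` column:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/specter2_base")

def tokenize(batch):
    # Cap inputs at the 512-token maximum listed above.
    return tokenizer(batch["abstract"], truncation=True, max_length=512)

# Applied per batch during training data preparation, e.g.:
# dataset = dataset.map(tokenize, batched=True)
```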
### Model
- Base model: allenai/specter2_base
- Classification head: linear layer → softmax (sketched below)
- Loss: CrossEntropyLoss
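In Transformers terms, this setup amounts to loading the base encoder with a fresh linear classification head; with integer labels and the single-label problem type, CrossEntropyLoss is applied automatically (a sketch, assuming default head initialization):

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "allenai/specter2_base",
    num_labels=19,                               # MAG Level-0 categories
    problem_type="single_label_classification",  # CrossEntropyLoss over the logits
)
```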
### Training hyperparameters
The following hyperparameters were used during training (mirrored in the sketch after the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
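Expressed as Transformers `TrainingArguments`, the configuration above corresponds roughly to the sketch below (the output directory is a placeholder):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="specter2-mag-multiclass",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch_fused",   # fused AdamW; betas and epsilon are the defaults listed above
    lr_scheduler_type="linear",
    num_train_epochs=10,
)
```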
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision Micro | Precision Macro | Recall Micro | Recall Macro | F1 Micro | F1 Macro |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.2603 | 1.0 | 1094 | 0.6733 | 0.8243 | 0.8243 | 0.8315 | 0.8243 | 0.8198 | 0.8243 | 0.8222 |
| 0.1779 | 2.0 | 2188 | 0.6955 | 0.8240 | 0.8240 | 0.8198 | 0.8240 | 0.8203 | 0.8240 | 0.8176 |
| 0.1628 | 3.0 | 3282 | 0.8130 | 0.8315 | 0.8315 | 0.8296 | 0.8315 | 0.8265 | 0.8315 | 0.8269 |
| 0.1136 | 4.0 | 4376 | 0.9842 | 0.8227 | 0.8227 | 0.8254 | 0.8227 | 0.8192 | 0.8227 | 0.8205 |
| 0.0666 | 5.0 | 5470 | 1.0598 | 0.8310 | 0.8310 | 0.8290 | 0.8310 | 0.8276 | 0.8310 | 0.8263 |
## Evaluation results
| Class | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| Art | 0.6549 | 0.8457 | 0.7382 | 175 |
| Biology | 0.9822 | 0.9736 | 0.9779 | 227 |
| Business | 0.9149 | 0.8776 | 0.8958 | 196 |
| Chemistry | 0.9745 | 0.9695 | 0.9720 | 197 |
| Computer science | 0.9605 | 0.8947 | 0.9264 | 190 |
| Economics | 0.8164 | 0.7824 | 0.7991 | 216 |
| Engineering | 0.9061 | 0.9279 | 0.9169 | 208 |
| Environmental science | 0.9754 | 0.9167 | 0.9451 | 216 |
| Geography | 0.7585 | 0.9128 | 0.8285 | 172 |
| Geology | 0.9673 | 0.9764 | 0.9718 | 212 |
| History | 0.6299 | 0.5187 | 0.5689 | 187 |
| Materials science | 0.9324 | 0.9583 | 0.9452 | 216 |
| Mathematics | 0.9388 | 0.9436 | 0.9412 | 195 |
| Medicine | 0.9826 | 0.9235 | 0.9521 | 183 |
| Philosophy | 0.7529 | 0.7486 | 0.7507 | 175 |
| Physics | 0.9648 | 0.9746 | 0.9697 | 197 |
| Political science | 0.6425 | 0.6617 | 0.6520 | 201 |
| Psychology | 0.8063 | 0.7586 | 0.7817 | 203 |
| Sociology | 0.4389 | 0.4270 | 0.4329 | 185 |
| Accuracy | | | 0.8456 | 3751 |
| Macro avg | 0.8421 | 0.8417 | 0.8403 | 3751 |
| Weighted avg | 0.8478 | 0.8456 | 0.8453 | 3751 |
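The layout above matches scikit-learn's `classification_report`; given gold and predicted label ids plus the 19 category names in id order, it can be reproduced with a sketch like the following (toy inputs for illustration):

```python
from sklearn.metrics import classification_report

# Toy ids; replace with the real evaluation-split labels and predictions.
y_true = [0, 1, 2, 1, 0]
y_pred = [0, 1, 1, 1, 0]
names = ["Art", "Biology", "Business"]  # first 3 of the 19 categories, for illustration

print(classification_report(y_true, y_pred, target_names=names, digits=4))
```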
## Framework versions
- Transformers 4.57.1
- PyTorch 2.8.0+cu126
- Datasets 3.6.0
- Tokenizers 0.22.1