BiomedBERT-Amyloid
This model is a text classifier designed to screen scientific articles from the PubMed database. It identifies papers that report on the experimental effects of antibodies on amyloid formation.
Model Details
Model Description
- Developed by: Sonor Kubkowski
- Funded by: National Science Center, Poland (SONATA 19 grant, Project No. DEC-2023/51/D/NZ7/02847)
- Project title: “Taming aggregation with AmyloGraphem 2.0: database and predictive model of amyloid self-organization of modulators”
- Finetuned from model: microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract
- Task: Binary text classification
- License: Apache 2.0
Uses
Direct Use
This model is intended for direct use by researchers in biochemistry, neurobiology, and related fields. It can rapidly filter thousands of PubMed search results into a shortlist of relevant articles for manual review, which reduces the time spent on literature screening.
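The screening workflow described above could be sketched as follows. This is a minimal illustration, not the authors' pipeline: the repository id `your-namespace/BiomedBERT-Amyloid`, the positive label name `LABEL_1`, and the choice to feed the title and abstract as one string are all assumptions, since the card does not state them. The `shortlist` helper is hypothetical and accepts any callable that returns predictions in the shape produced by a `transformers` text-classification pipeline.

```python
from typing import Callable, Dict, List

# Assumed values: the card does not give the Hub repository id or the label names.
MODEL_ID = "your-namespace/BiomedBERT-Amyloid"  # hypothetical placeholder
POSITIVE_LABEL = "LABEL_1"                      # assumed name of the "useful" class


def load_classifier(model_id: str = MODEL_ID) -> Callable[[List[str]], List[Dict]]:
    """Wrap a transformers text-classification pipeline as a plain callable."""
    # Imported lazily so the rest of the module works without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-classification", model=model_id)
    # Truncate long inputs to the 512-token limit of BERT-style encoders.
    return lambda texts: pipe(texts, truncation=True, max_length=512)


def shortlist(
    articles: List[Dict],
    classify: Callable[[List[str]], List[Dict]],
    positive_label: str = POSITIVE_LABEL,
    threshold: float = 0.5,
) -> List[Dict]:
    """Keep articles whose top label is positive with score >= threshold."""
    # Assumption: title and abstract are concatenated into a single input string.
    texts = [f"{a['title']} {a['abstract']}" for a in articles]
    predictions = classify(texts)
    kept = []
    for article, pred in zip(articles, predictions):
        if pred["label"] == positive_label and pred["score"] >= threshold:
            kept.append({**article, "score": pred["score"]})
    # Highest-scoring articles first, so reviewers start with the likeliest hits.
    return sorted(kept, key=lambda a: a["score"], reverse=True)
```

With the real model, usage would look like `shortlist(articles, load_classifier())`, where each article is a dict with `title` and `abstract` keys. Check the model's `config.json` for the actual label names before relying on `POSITIVE_LABEL`.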
Out-of-Scope Use
The model is highly specialized for the amyloid-antibody domain. It should not be used for general-purpose scientific article classification or for topics outside its training scope. It is not designed to provide medical or diagnostic advice.
Recommendations
Users should treat the model's output as a preliminary screening result, not a final decision. A human expert should review both the articles flagged as positive and a selection of those flagged as negative, especially when high recall is critical.
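One way to act on this recommendation is to spot-check a random sample of the articles the model rejected. The sketch below is only an illustration: the function name `sample_for_audit`, the default sample size of 50, and the fixed seed are assumptions, not part of the model or its documentation.

```python
import random
from typing import Iterable, List


def sample_for_audit(negative_pmids: Iterable[str], sample_size: int = 50, seed: int = 0) -> List[str]:
    """Draw a reproducible random sample of model-rejected PMIDs for manual review."""
    rng = random.Random(seed)
    # Sort first so the sample does not depend on the input order.
    pool = sorted(negative_pmids)
    # Never ask for more items than the pool contains.
    return rng.sample(pool, min(sample_size, len(pool)))
```

If the expert finds relevant articles in the sample, that suggests the decision threshold is too strict for the task at hand.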
Training Details
Training Data
The training dataset was created by searching the PubMed database. A total of 1939 articles were manually assessed.
Search Queries:
- "amyloid"[Title/Abstract] AND "antibod*"[Title/Abstract]
- "amyloid"[Title/Abstract] AND "nanobod*"[Title/Abstract]
Inclusion Criteria for Positive Label: The study must report on the effect of antibodies on amyloid formation and include experimental data from methods such as AFM, PET, ThT, or TEM.
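To rerun the two PubMed search queries listed above programmatically, a request to the NCBI E-utilities `esearch` endpoint could be built as sketched here. Combining both queries with `OR` into a single request and the `retmax` value are assumptions made for illustration; the card does not say how the searches were executed. The code only builds the URL and makes no network call.

```python
from typing import List
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The two queries from the model card, copied verbatim.
QUERIES = [
    '"amyloid"[Title/Abstract] AND "antibod*"[Title/Abstract]',
    '"amyloid"[Title/Abstract] AND "nanobod*"[Title/Abstract]',
]


def build_esearch_url(queries: List[str], retmax: int = 500) -> str:
    """Build an esearch URL matching articles that satisfy any of the queries."""
    # Assumption: the queries are merged with OR; they could also be run separately.
    term = " OR ".join(f"({q})" for q in queries)
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_esearch_url(QUERIES)
```

The response to such a request contains a list of PMIDs, which can then be passed to `efetch` to retrieve titles and abstracts for classification.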
Class Distribution:
- Positive (useful): 167 articles (9%)
- Negative (not useful): 1772 articles (91%)
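The counts above describe a strongly imbalanced dataset. A common way to compensate during fine-tuning is to weight each class inversely to its frequency. The sketch below computes such weights from the reported counts; the card does not say whether class weighting or any other rebalancing was actually used, so treat this as an illustration of the arithmetic only.

```python
from typing import Dict

# Label counts as reported in the model card.
COUNTS = {"positive": 167, "negative": 1772}


def class_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each class's share of the dataset as a percentage."""
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


def balanced_class_weights(counts: Dict[str, int]) -> Dict[str, float]:
    """Inverse-frequency weights: total / (number_of_classes * class_count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


shares = class_shares(COUNTS)
weights = balanced_class_weights(COUNTS)
```

With these counts the positive class receives a weight roughly ten times larger than the negative class, which mirrors the roughly 1:10 class ratio.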
Acknowledgments
We gratefully acknowledge the support for this research from:
- Institution: Bioinformatics and Multiomics Analysis Laboratory, Clinical Research Centre, Medical University of Bialystok.
- Funding Source: National Science Center, Poland, via the SONATA 19 grant.
- Project No: DEC-2023/51/D/NZ7/02847.
- Project Title: “Taming aggregation with AmyloGraphem 2.0: database and predictive model of amyloid self-organization of modulators”.