Update README.md

metrics:
- mae
pipeline_tag: graph-ml
---

# AtomFormer base model

This model is a transformer-based model that leverages Gaussian pair-wise positional embeddings to train on atomistic graph data. It is part of a suite of datasets, models, and utilities in the AtomGen project that supports other methods for pre-training and fine-tuning models on atomistic graphs.

## Model description

AtomFormer is a transformer model with modifications for training on atomistic graphs. It builds primarily on the work of Uni-Mol+, adding pair-wise positional embeddings to the attention mask to leverage 3D positional information. This model was pre-trained on a diverse set of aggregated atomistic datasets where the targets are per-atom force prediction and per-system energy prediction.

The model also includes metadata about the atomic species being modeled, such as atomic radius, electronegativity, and valency. This metadata is normalized and projected, then added to the atom embeddings in the model.

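To illustrate the metadata pathway described above, here is a minimal sketch (not the actual AtomFormer code): a table of pre-normalized per-species features is projected and added to the learned atom embeddings. The feature count, hidden size, and all names are assumptions for illustration only.

```python
import torch
import torch.nn as nn

NUM_SPECIES = 119   # assumption: one row per atomic number, plus padding
NUM_FEATURES = 4    # assumption: e.g. atomic radius, electronegativity, valency, mass
HIDDEN_DIM = 768    # assumption: model hidden size

# Per-species metadata table, normalized offline (placeholder values here).
metadata_table = torch.randn(NUM_SPECIES, NUM_FEATURES)

atom_embedding = nn.Embedding(NUM_SPECIES, HIDDEN_DIM)
metadata_proj = nn.Linear(NUM_FEATURES, HIDDEN_DIM)

input_ids = torch.tensor([[8, 1, 1]])  # a toy water molecule (O, H, H)

# Look up learned atom embeddings and add the projected metadata features.
hidden = atom_embedding(input_ids) + metadata_proj(metadata_table[input_ids])
print(hidden.shape)  # torch.Size([1, 3, 768])
```
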
## Intended uses & limitations

You can use the raw model for force and energy prediction, but it is mostly intended to be fine-tuned on a downstream task. The model's performance as a force and energy predictor has not been validated; force and energy prediction was used primarily as a pre-training task.

### How to use

You can use this model directly by loading it via the Structure2EnergyandForces task:

```python
>>> from transformers import AutoModel
```
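
A minimal loading sketch to complement the import above, assuming the checkpoint is published on the Hugging Face Hub under an id such as `vector-institute/atomformer-base` (hypothetical here) and that the custom architecture is exposed via `trust_remote_code`:

```python
from transformers import AutoModel

# Hub id and trust_remote_code usage are assumptions; substitute the actual
# checkpoint id shown on this model card.
model = AutoModel.from_pretrained(
    "vector-institute/atomformer-base",
    trust_remote_code=True,
)
```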

Here is how to use this model to get the features of a given atomistic graph in PyTorch:

```python
from transformers import AutoModel
```
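
A minimal feature-extraction sketch, using the same hypothetical checkpoint id as above and assuming the forward signature follows the `input_ids`/`coords` convention described under Preprocessing below (argument and output names are assumptions):

```python
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "vector-institute/atomformer-base", trust_remote_code=True  # hypothetical id
)
model.eval()

# A toy water molecule: token ids for the atomic species and 3D coordinates (angstrom).
input_ids = torch.tensor([[8, 1, 1]])  # assumption: ids correspond to atomic numbers
coords = torch.tensor([[[0.00, 0.00, 0.00],
                        [0.96, 0.00, 0.00],
                        [-0.24, 0.93, 0.00]]])
attention_mask = torch.ones_like(input_ids)  # assumption: standard padding mask

with torch.no_grad():
    outputs = model(input_ids=input_ids, coords=coords, attention_mask=attention_mask)

# Per-atom hidden states, e.g. for downstream featurization.
print(outputs.last_hidden_state.shape)
```
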
## Training data

AtomFormer is trained on an aggregated S2EF (structure-to-energy-and-forces) dataset drawn from multiple sources, such as OC20, OC22, ODAC23, MPtrj, and SPICE, with structures and energies/forces for pre-training. The pre-training data includes both total energies and formation energies, but training uses formation energy (which is not available for OC22, as indicated by the `has_formation_energy` column).

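For example, selecting only the systems that carry a formation-energy label might look like the following sketch (the dataset id is a placeholder; `has_formation_energy` is the column named above):

```python
from datasets import load_dataset

# Placeholder dataset id; substitute the actual aggregated S2EF dataset.
ds = load_dataset("vector-institute/s2ef-aggregated", split="train")

# Keep only systems with a formation-energy label (e.g. OC22 entries lack one).
ds_with_fe = ds.filter(lambda ex: ex["has_formation_energy"])
```
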
## Training procedure
### Preprocessing

The model expects input in the form of tokenized atomic symbols represented as `input_ids` and 3D coordinates represented as `coords`. For the pre-training task it also expects labels for `forces` and `formation_energy`.

The `DataCollatorForAtomModeling` utility in the AtomGen library can dynamically pad the data to batch it together. It also offers the option to flatten the data and provide a `batch` column for GNN-style training.

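A sketch of what a single preprocessed example might look like, following the field names above; tensor shapes and the token-id convention are assumptions, and the commented-out collator usage shows the intent only (its import path and constructor arguments are not confirmed here):

```python
import torch

example = {
    "input_ids": torch.tensor([8, 1, 1]),            # tokenized atomic species (O, H, H)
    "coords": torch.tensor([[0.00, 0.00, 0.00],
                            [0.96, 0.00, 0.00],
                            [-0.24, 0.93, 0.00]]),   # 3D positions, one row per atom
    "forces": torch.zeros(3, 3),                     # per-atom force labels
    "formation_energy": torch.tensor(0.0),           # per-system energy label
}

# Batching: AtomGen's DataCollatorForAtomModeling dynamically pads a list of such
# examples; the import path and arguments below are assumptions.
# from atomgen.data import DataCollatorForAtomModeling
# collator = DataCollatorForAtomModeling(tokenizer=tokenizer)
# batch = collator([example, example])
```
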
### Pretraining

The model was trained on a single node with 4x A40 (48 GB) GPUs for 10 epochs (~2 weeks). See the AtomGen project's training code for full hyperparameter details.

## Evaluation results

We use the Atom3D dataset to evaluate the model's performance on downstream tasks.

When fine-tuned on downstream tasks, this model achieves the following results:

| Task | SMP | PIP | RES | MSP | LBA | LEP | PSR | RSR |
|:----:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
|      | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
|