Update README.md

metrics:
- mae
pipeline_tag: graph-ml
---

# AtomFormer base model

This model is a transformer-based model that leverages Gaussian pair-wise positional embeddings to train on atomistic graph data. It is part of a suite of datasets, models, and utilities in the AtomGen project that supports other methods for pre-training and fine-tuning models on atomistic graphs.

## Model description

AtomFormer is a transformer model with modifications for training on atomistic graphs. It builds primarily on the work of Uni-Mol+, adding pair-wise positional embeddings to the attention mask to leverage 3D positional information. This model was pre-trained on a diverse set of aggregated atomistic datasets where the targets are per-atom force prediction and per-system energy prediction.

The model also includes metadata about the atomic species being modeled, such as atomic radius, electronegativity, and valency. This metadata is normalized and projected, then added to the atom embeddings in the model.

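To illustrate the metadata pathway described above, here is a minimal sketch (not the actual AtomFormer code): a table of pre-normalized per-species features is projected and added to the learned atom embeddings. The feature count, hidden size, and all names are assumptions for illustration only.

```python
import torch
import torch.nn as nn

NUM_SPECIES = 119   # assumption: one row per atomic number, plus padding
NUM_FEATURES = 4    # assumption: e.g. atomic radius, electronegativity, valency, mass
HIDDEN_DIM = 768    # assumption: model hidden size

# Per-species metadata table, normalized offline (placeholder values here).
metadata_table = torch.randn(NUM_SPECIES, NUM_FEATURES)

atom_embedding = nn.Embedding(NUM_SPECIES, HIDDEN_DIM)
metadata_proj = nn.Linear(NUM_FEATURES, HIDDEN_DIM)

input_ids = torch.tensor([[8, 1, 1]])  # a toy water molecule (O, H, H)

# Look up learned atom embeddings and add the projected metadata features.
hidden = atom_embedding(input_ids) + metadata_proj(metadata_table[input_ids])
print(hidden.shape)  # torch.Size([1, 3, 768])
```
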
## Intended uses & limitations

You can use the raw model for force and energy prediction, but it is mostly intended to be fine-tuned on a downstream task. The model's performance as a force and energy predictor has not been validated; force and energy prediction was used primarily as a pre-training task.

### How to use

You can use this model directly by loading it via the Structure2EnergyandForces task:

```python
>>> from transformers import AutoModel
```
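
A minimal loading sketch to complement the import above, assuming the checkpoint is published on the Hugging Face Hub under an id such as `vector-institute/atomformer-base` (hypothetical here) and that the custom architecture is exposed via `trust_remote_code`:

```python
from transformers import AutoModel

# Hub id and trust_remote_code usage are assumptions; substitute the actual
# checkpoint id shown on this model card.
model = AutoModel.from_pretrained(
    "vector-institute/atomformer-base",
    trust_remote_code=True,
)
```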

Here is how to use this model to get the features of a given atomistic graph in PyTorch:

```python
from transformers import AutoModel
```
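
A minimal feature-extraction sketch, using the same hypothetical checkpoint id as above and assuming the forward signature follows the `input_ids`/`coords` convention described under Preprocessing below (argument and output names are assumptions):

```python
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "vector-institute/atomformer-base", trust_remote_code=True  # hypothetical id
)
model.eval()

# A toy water molecule: token ids for the atomic species and 3D coordinates (angstrom).
input_ids = torch.tensor([[8, 1, 1]])  # assumption: ids correspond to atomic numbers
coords = torch.tensor([[[0.00, 0.00, 0.00],
                        [0.96, 0.00, 0.00],
                        [-0.24, 0.93, 0.00]]])
attention_mask = torch.ones_like(input_ids)  # assumption: standard padding mask

with torch.no_grad():
    outputs = model(input_ids=input_ids, coords=coords, attention_mask=attention_mask)

# Per-atom hidden states, e.g. for downstream featurization.
print(outputs.last_hidden_state.shape)
```
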
## Training data

AtomFormer is trained on an aggregated S2EF (structure-to-energy-and-forces) dataset drawn from multiple sources, such as OC20, OC22, ODAC23, MPtrj, and SPICE, with structures and energies/forces for pre-training. The pre-training data includes both total energies and formation energies, but training uses formation energy (which is not available for OC22, as indicated by the `has_formation_energy` column).

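For example, selecting only the systems that carry a formation-energy label might look like the following sketch (the dataset id is a placeholder; `has_formation_energy` is the column named above):

```python
from datasets import load_dataset

# Placeholder dataset id; substitute the actual aggregated S2EF dataset.
ds = load_dataset("vector-institute/s2ef-aggregated", split="train")

# Keep only systems with a formation-energy label (e.g. OC22 entries lack one).
ds_with_fe = ds.filter(lambda ex: ex["has_formation_energy"])
```
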
## Training procedure
### Preprocessing

The model expects input in the form of tokenized atomic symbols represented as `input_ids` and 3D coordinates represented as `coords`. For the pre-training task it also expects labels for `forces` and `formation_energy`.

The `DataCollatorForAtomModeling` utility in the AtomGen library can dynamically pad the data to batch it together. It also offers the option to flatten the data and provide a `batch` column for GNN-style training.

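A sketch of what a single preprocessed example might look like, following the field names above; tensor shapes and the token-id convention are assumptions, and the commented-out collator usage shows the intent only (its import path and constructor arguments are not confirmed here):

```python
import torch

example = {
    "input_ids": torch.tensor([8, 1, 1]),            # tokenized atomic species (O, H, H)
    "coords": torch.tensor([[0.00, 0.00, 0.00],
                            [0.96, 0.00, 0.00],
                            [-0.24, 0.93, 0.00]]),   # 3D positions, one row per atom
    "forces": torch.zeros(3, 3),                     # per-atom force labels
    "formation_energy": torch.tensor(0.0),           # per-system energy label
}

# Batching: AtomGen's DataCollatorForAtomModeling dynamically pads a list of such
# examples; the import path and arguments below are assumptions.
# from atomgen.data import DataCollatorForAtomModeling
# collator = DataCollatorForAtomModeling(tokenizer=tokenizer)
# batch = collator([example, example])
```
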
### Pretraining

The model was trained on a single node with 4x A40 (48 GB) GPUs for 10 epochs (~2 weeks). See the AtomGen project's training code for full hyperparameter details.

## Evaluation results

We use the Atom3D dataset to evaluate the model's performance on downstream tasks.

When fine-tuned on downstream tasks, this model achieves the following results:

| Task | SMP | PIP | RES | MSP | LBA | LEP | PSR | RSR |
|:----:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
|      | TBD | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
|