deepbrain committed on
Commit 78406b8 · verified · 1 Parent(s): 8464b95

Update card

Files changed (1)
  1. README.md +13 -17
README.md CHANGED
@@ -1,13 +1,12 @@
  ---
  library_name: transformers
- tags: []
+ license: mit
+ datasets:
+ - gsm8k
  ---

  # Model Card for Model ID

- <!-- Provide a quick summary of what the model is/does. -->
-
-

  ## Model Details

@@ -15,23 +14,22 @@ tags: []

  <!-- Provide a longer summary of what this model is. -->

- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
+ This is the result of 3 iterations of self improvement of the model on a subset of GSM8K problems where the base Phi-2 was less confident.
+ We utilized self consistency evaluation along with execution traces to self-select high quality self-generated samples for training without looking at the ground truth answers.
+ This improved the base model Phi-2 accuracy by about 6% on GSM8K dataset - both the test set and the harder to solve subset of the training data.

- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
+ - **Developed by:** Stanford University team: Artyom Shaposhnikov, Roberto Garcia, Shubhra Mishra
  - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]
+ - **Language(s) (NLP):** Python
+ - **License:** MIT
+ - **Finetuned from model [optional]:** microsoft/phi-2

  ### Model Sources [optional]

  <!-- Provide the basic links for the model. -->

- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]
+ - **Repository:** https://github.com/deepbrain/CS224N
+ - **Paper [optional]:** "Self-Improvement for Math Problem-Solving in Small Language Models"

  ## Uses

@@ -196,6 +194,4 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]

  ## Model Card Contact

- [More Information Needed]
-
-
+ [More Information Needed]
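
The updated card describes selecting self-generated training samples by self-consistency, without consulting ground-truth answers. The commit does not show how this is implemented, so the following is only a minimal sketch of the general idea under stated assumptions: each candidate is a Python program defining a `solution()` function, agreement is measured on the final answers alone (the execution traces mentioned in the card are not modeled), and the names `run_candidate`, `select_by_self_consistency`, and `min_agreement` are hypothetical rather than taken from the repository.

```python
from collections import Counter
from typing import List, Optional, Tuple


def run_candidate(source: str) -> Optional[str]:
    """Run one generated program and return its answer as a string, or None on any failure.

    Assumption: each candidate defines a zero-argument function named `solution`.
    """
    namespace: dict = {}
    try:
        # Generated code is untrusted; a real pipeline should sandbox this call.
        exec(source, namespace)
        return str(namespace["solution"]())
    except Exception:
        return None


def select_by_self_consistency(
    candidates: List[str], min_agreement: float = 0.6
) -> Tuple[Optional[str], List[str]]:
    """Keep the candidates whose answer matches the majority answer.

    No ground-truth label is used: a problem contributes samples only if at
    least `min_agreement` of all candidates agree on the same answer.
    """
    answers = [run_candidate(src) for src in candidates]
    valid = [a for a in answers if a is not None]
    if not valid:
        return None, []
    majority, count = Counter(valid).most_common(1)[0]
    if count / len(candidates) < min_agreement:
        return None, []  # too little agreement to trust any sample
    kept = [src for src, a in zip(candidates, answers) if a == majority]
    return majority, kept


if __name__ == "__main__":
    samples = [
        "def solution():\n    return 2 + 3",
        "def solution():\n    return 5",
        "def solution():\n    return 6",
        "def solution():\n    return 1 / 0",  # crashes, so it is ignored
    ]
    answer, kept = select_by_self_consistency(samples, min_agreement=0.5)
    print(answer, len(kept))
```

Dividing by the total number of candidates, rather than by the number that ran successfully, means crashing programs count against agreement. That is one possible design choice, not necessarily the one the authors made.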