Update inference config (v0.1.0 label)
INFERENCE_CONFIG.md CHANGED +11 -8
@@ -1,10 +1,10 @@
-# Inference Endpoint Configuration (v0.1.

 ## Model Details
-- **Model**: MORBID-Actuarial v0.1.
 - **Type**: Causal Language Model (Conversational)
 - **Base**: TinyLlama-1.1B
-- **Handler**: Custom handler.py

 ## Recommended Configuration
 - **Instance Type**: GPU Small (1x NVIDIA T4)
@@ -27,12 +27,15 @@ def query(payload):
     response = requests.post(API_URL, headers=headers, json=payload)
     return response.json()

-# Test queries
 output = query({
-    "inputs": "
     "parameters": {
-        "max_new_tokens":
-        "temperature": 0.
     }
 })
 ```
@@ -41,4 +44,4 @@ output = query({
 - Conversational AI with personality
 - Actuarial expertise (97.8% accuracy on exams)
 - Multi-turn context retention and reduced artifacts
-- Tighter generation
@@ -1,10 +1,10 @@
+# Inference Endpoint Configuration (v0.1.0)

 ## Model Details
+- **Model**: MORBID-Actuarial v0.1.0 Conversational
 - **Type**: Causal Language Model (Conversational)
 - **Base**: TinyLlama-1.1B
+- **Handler**: Custom handler.py included

 ## Recommended Configuration
 - **Instance Type**: GPU Small (1x NVIDIA T4)
@@ -27,12 +27,15 @@ def query(payload):
     response = requests.post(API_URL, headers=headers, json=payload)
     return response.json()

+# Test queries (tighter generation under the hood)
 output = query({
+    "inputs": "Human: Help me price a level annuity immediate paying 1000/year at 5%\n"
+              "Assistant:",
     "parameters": {
+        "max_new_tokens": 220,
+        "temperature": 0.35,
+        "top_p": 0.9,
+        "repetition_penalty": 1.15
     }
 })
 ```
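A minimal sketch of how the request body in the updated example could be assembled, assuming the single-turn `Human:`/`Assistant:` prompt format shown in the hunk above. `build_prompt` and `build_payload` are hypothetical helper names, not part of the repository; the default values are the ones on the new side of the diff.

```python
import json


def build_prompt(user_message: str) -> str:
    """Format a single-turn prompt (hypothetical helper, assumed format)."""
    # The model is cued to answer by ending the prompt with "Assistant:".
    return f"Human: {user_message.strip()}\nAssistant:"


def build_payload(
    user_message: str,
    max_new_tokens: int = 220,
    temperature: float = 0.35,
    top_p: float = 0.9,
    repetition_penalty: float = 1.15,
) -> dict:
    """Build the JSON body passed to query(); defaults mirror the diff."""
    return {
        "inputs": build_prompt(user_message),
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": temperature,
            "top_p": top_p,
            "repetition_penalty": repetition_penalty,
        },
    }


if __name__ == "__main__":
    example = build_payload(
        "Help me price a level annuity immediate paying 1000/year at 5%"
    )
    # Inspect the body before sending it to the endpoint.
    print(json.dumps(example, indent=2))
```

Building the prompt in one place keeps the newline before `Assistant:` explicit, which avoids splitting a string literal across source lines.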
@@ -41,4 +44,4 @@ output = query({
 - Conversational AI with personality
 - Actuarial expertise (97.8% accuracy on exams)
 - Multi-turn context retention and reduced artifacts
+- Tighter generation and stop handling
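The commit mentions stop handling without showing it. One common approach, sketched here as an assumption rather than as what `handler.py` actually does, is to cut the generated text at the first marker that would start a new turn. `truncate_at_stop` and its default stop sequences are hypothetical.

```python
def truncate_at_stop(text: str, stop_sequences=("\nHuman:", "\nAssistant:")) -> str:
    """Return text up to the earliest stop sequence (assumed markers)."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            # Keep the earliest position at which any stop marker appears.
            cut = min(cut, idx)
    return text[:cut].rstrip()


if __name__ == "__main__":
    raw = "Discount each payment at 5% and sum them.\nHuman: thanks"
    # Only the assistant's own turn is kept.
    print(truncate_at_stop(raw))
```

Applied to the model output on the client side, this keeps only the first assistant turn even when generation runs on into an invented follow-up.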