h3ir committed · Commit df51c26 · verified · 1 Parent(s): 8c10a27

Update inference config (v0.1.0 label)

Files changed (1)
  1. INFERENCE_CONFIG.md +11 -8
INFERENCE_CONFIG.md CHANGED
@@ -1,10 +1,10 @@
-# Inference Endpoint Configuration (v0.1.1)
+# Inference Endpoint Configuration (v0.1.0)
 
 ## Model Details
-- **Model**: MORBID-Actuarial v0.1.1 Conversational
+- **Model**: MORBID-Actuarial v0.1.0 Conversational
 - **Type**: Causal Language Model (Conversational)
 - **Base**: TinyLlama-1.1B
-- **Handler**: Custom handler.py (v0.1.1) included
+- **Handler**: Custom handler.py included
 
 ## Recommended Configuration
 - **Instance Type**: GPU Small (1x NVIDIA T4)
@@ -27,12 +27,15 @@ def query(payload):
     response = requests.post(API_URL, headers=headers, json=payload)
     return response.json()
 
-# Test queries
+# Test queries (tighter generation under the hood)
 output = query({
-    "inputs": "Hi, how are you?",
+    "inputs": "Human: Help me price a level annuity immediate paying 1000/year at 5%
+Assistant:",
     "parameters": {
-        "max_new_tokens": 100,
-        "temperature": 0.8
+        "max_new_tokens": 220,
+        "temperature": 0.35,
+        "top_p": 0.9,
+        "repetition_penalty": 1.15
     }
 })
 ```
@@ -41,4 +44,4 @@ output = query({
 - Conversational AI with personality
 - Actuarial expertise (97.8% accuracy on exams)
 - Multi-turn context retention and reduced artifacts
-- Tighter generation with stop handling and bad-words filtering
+- Tighter generation and stop handling
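
For readers wiring this up outside the diff view, below is a minimal, self-contained sketch of how the snippet above is typically pointed at a deployed Hugging Face Inference Endpoint. The endpoint URL, the HF_TOKEN environment variable, the raise_for_status() call, and the final print are placeholders added for illustration and are not part of INFERENCE_CONFIG.md; the prompt newline is written with an explicit \n so the example stays valid Python.

```python
import os

import requests

# Placeholder endpoint URL; replace with the URL shown on the endpoint's page.
API_URL = "https://your-endpoint.endpoints.huggingface.cloud"
headers = {
    "Authorization": f"Bearer {os.environ['HF_TOKEN']}",  # placeholder token variable
    "Content-Type": "application/json",
}

def query(payload):
    # POST the JSON payload to the endpoint and return the decoded JSON reply.
    response = requests.post(API_URL, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()

# Test query using the v0.1.0 generation settings from the diff above.
output = query({
    "inputs": "Human: Help me price a level annuity immediate paying 1000/year at 5%\nAssistant:",
    "parameters": {
        "max_new_tokens": 220,
        "temperature": 0.35,
        "top_p": 0.9,
        "repetition_penalty": 1.15,
    },
})
print(output)
```

The lower temperature and the repetition penalty favor short, numeric answers. As a sanity check on the reply, a level annuity-immediate of 1000/year at 5% has present value 1000 * (1 - 1.05**(-n)) / 0.05 for an n-year term; the prompt leaves n unspecified, so a reasonable response should either ask for the term or state an assumption.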