Spaces: Agents-MCP-Hackathon / MedCodeMCP

gpaasch committed on Jun 5

Commit

c0a6243

1 Parent(s): e5e6a83

interactive questioning is key and demos the unique capabilities of ai

Files changed (1)

docs/interactive_questioning.md +107 -0

docs/interactive_questioning.md ADDED

	@@ -0,0 +1,107 @@

+Interactive questioning is essential. You cannot map raw user language straight to a code; you must guide the user through a mini-diagnostic interview. Here’s how to build that:
+1. **Establish a Symptom Ontology Layer**
+   • Extract high-level symptom categories from ICD (e.g., “cough,” “shortness of breath,” “chest pain,” etc.).
+   • Group related codes under each category. For example:
+   ```
+   Cough:
+     – R05.9: Cough, unspecified
+     – R05.1: Acute cough
+     – R05.3: Chronic cough
+     – J41.x: Chronic bronchitis codes
+     – J00: Acute nasopharyngitis (common cold), if the cough is minor or part of a URI
+   ```
+   • Define which attributes distinguish these codes (duration, intensity, quality, associated features like sputum, fever, smoking history, etc.).
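As a rough illustration of this layer, the sketch below holds one category in a small Python structure. The names `SymptomCategory`, `SYMPTOM_ONTOLOGY`, `codes_for`, and the discriminator labels are hypothetical, and the four ICD-10-CM codes are only a sample, not a complete extract.

```python
from dataclasses import dataclass, field


@dataclass
class SymptomCategory:
    name: str
    codes: dict                      # ICD code -> description
    discriminators: list = field(default_factory=list)


# Hypothetical, hand-written sample; a real ontology would be generated
# from the ICD source files.
SYMPTOM_ONTOLOGY = {
    "cough": SymptomCategory(
        name="cough",
        codes={
            "R05.1": "Acute cough",
            "R05.3": "Chronic cough",
            "J41.0": "Simple chronic bronchitis",
            "J00": "Acute nasopharyngitis (common cold)",
        },
        discriminators=["duration", "sputum", "fever", "smoking_history"],
    ),
}


def codes_for(category):
    """Return the sorted candidate codes for a category, or [] if unknown."""
    entry = SYMPTOM_ONTOLOGY.get(category.lower())
    return sorted(entry.codes) if entry else []
```

Keeping the discriminators next to the codes means later steps can look up both from a single category key.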
+2. **Design Follow-Up Questions for Each Branch**
+   • For each high-level category, list the key discriminating questions. Example for “cough”:
+   * “How long have you been coughing?” (acute vs. chronic)
+   * “Is it dry or productive?” (productive suggests bronchitis or pneumonia)
+   * “Are you experiencing fever or chills?” (infection rather than simple chronic cough)
+   * “Do you smoke or have exposure to irritants?” (chronic bronchitis codes)
+   * “Any history of heart disease or fluid retention?” (a cardiac cough maps to different codes)
+   • Use those discriminators to differentiate among the codes grouped under “cough.”
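One way to make these discriminators usable by code is a lookup from each question and normalized answer to the codes that answer rules out. The sketch below assumes a hypothetical `DISCRIMINATORS` table and `remaining_codes` helper; the eliminations are illustrative only and are not clinical guidance.

```python
# Hypothetical table: question -> normalized answer -> codes ruled out.
DISCRIMINATORS = {
    "How long have you been coughing?": {
        "under 3 weeks": {"R05.3", "J41.0"},   # rules out chronic codes
        "over 8 weeks": {"R05.1", "J00"},      # rules out acute codes
    },
    "Is it dry or productive?": {
        "dry": {"J41.0"},        # assumption: bronchitis implies sputum
        "productive": set(),     # nothing eliminated in this toy table
    },
}


def remaining_codes(candidates, question, answer):
    """Drop the codes that the given answer rules out; keep input order."""
    by_answer = DISCRIMINATORS.get(question, {})
    ruled_out = by_answer.get(answer.strip().lower(), set())
    return [code for code in candidates if code not in ruled_out]
```

An unknown question or an unrecognized answer leaves the candidate list unchanged, which is the safe default when the user's reply cannot be interpreted.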
+3. **LLM-Powered Question Sequencer**
+   • Prompt engineering: give the LLM the category, its subtree of possible codes, and instruct it to choose the next most informative question.
+   • At run time, take the user’s raw input and identify the nearest symptom category (via embeddings or keyword matching).
+   • Ask the LLM to generate the “best next question” given:
+   * The set of candidate codes under that category
+   * The user’s answers so far
+   • Continue until the candidate list narrows to one code or a small handful. Output confidence scores based on tree depth and answer clarity.
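The prompt for the sequencer could be assembled along these lines. `build_question_prompt` is a hypothetical helper and its wording is only one possible phrasing; sending the prompt to a model is left to whichever LLM client is in use.

```python
import json


def build_question_prompt(category, candidate_codes, answers):
    """Build the text prompt for the next-question request.

    candidate_codes: dict of ICD code -> description
    answers: list of {"q": ..., "a": ...} dicts collected so far
    """
    code_lines = "\n".join(
        f"- {code}: {desc}" for code, desc in candidate_codes.items()
    )
    history = json.dumps(answers, ensure_ascii=False) if answers else "none yet"
    return (
        f"Symptom category: {category}\n"
        f"Candidate ICD codes:\n{code_lines}\n"
        f"Answers so far: {history}\n"
        "Ask the single most informative follow-up question to distinguish "
        "among these codes. Reply with the question only."
    )
```

Passing the full answer history on every call keeps the model stateless, so any chat-completion style API could be used behind it.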
+4. **Implementation Outline**
+   1. **Data Preparation**
+      * Parse the ICD-10 XML or CSV into a hierarchical structure.
+      * For each code, extract description and synonyms.
+      * Build a JSON mapping: `{ category: { codes: [...], discriminators: [...] } }`.
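A minimal sketch of the parsing step, assuming a CSV export with `code` and `description` columns (the column names and the `build_hierarchy` helper are assumptions, not a fixed ICD file format). It groups each code under its three-character ICD category.

```python
import csv
import io
from collections import defaultdict


def build_hierarchy(csv_text):
    """Group codes under their three-character category (R05.1 -> R05)."""
    tree = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        code = row["code"].strip()
        tree[code[:3]][code] = row["description"].strip()
    return dict(tree)


SAMPLE_CSV = """code,description
R05.1,Acute cough
R05.3,Chronic cough
J41.0,Simple chronic bronchitis
"""

hierarchy = build_hierarchy(SAMPLE_CSV)
```

The resulting nested dictionary can be dumped with `json.dumps` to produce the category mapping described above.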
+   2. **Symptom Category Detection**
+      * Encode the user’s free text (e.g., “I have a cough”) with an embedding model (e.g., sentence-transformers).
+      * Compare against embeddings of category keywords (`“cough,” “headache,” “rash,” …`).
+      * Select top category.
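For a prototype without an embedding model, the same compare-and-pick-top step can be approximated with bag-of-words cosine similarity from the standard library. This is a stand-in for real sentence embeddings, not an equivalent; `CATEGORY_KEYWORDS` and `detect_category` are hypothetical names, and the keyword lists are invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical keyword lists, one string per category.
CATEGORY_KEYWORDS = {
    "cough": "cough coughing phlegm sputum",
    "headache": "headache migraine head pain",
    "rash": "rash itchy skin hives",
}


def _vector(text):
    """Lowercased word counts; a crude substitute for an embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def detect_category(user_text):
    """Return (category, score), or (None, 0.0) when nothing overlaps."""
    query = _vector(user_text)
    scores = {
        name: _cosine(query, _vector(words))
        for name, words in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return (best, scores[best]) if scores[best] > 0 else (None, 0.0)
```

Swapping `_vector` and `_cosine` for a real embedding model would leave `detect_category` unchanged, which keeps the prototype easy to upgrade.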
+   3. **Interactive Loop**
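+      ```python
+      def run_interview(category, candidate_codes, generate_question,
+                        get_input, filter_codes, max_turns=10):
+          """Ask follow-ups until one code remains; the three callables are caller-supplied."""
+          user_answers = []
+          for _ in range(max_turns):
+              if len(candidate_codes) <= 1:  # or: confidence threshold met
+                  break
+              question = generate_question(category, candidate_codes, user_answers)
+              answer = get_input(question)
+              user_answers.append({"q": question, "a": answer})
+              # Drop the codes that the latest answer rules out.
+              candidate_codes = filter_codes(candidate_codes, question, answer)
+          return candidate_codes, user_answers
+      ```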
+      * Filtering rules can be simple: if the user says “cough < 3 weeks,” eliminate chronic cough codes; if “productive,” eliminate dry cough codes, and so on.
+      * Confidence could be measured by how many codes remain or by how decisive answers are.
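The two bullets above could be sketched as a duration filter plus a count-based confidence score. All names here are hypothetical; the 3-week cutoff comes from the text, while the 8-week cutoff for ruling out acute codes is an added assumption.

```python
import re

UNIT_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}


def duration_in_days(answer):
    """Parse free text such as 'about 2 days' into days, else None."""
    match = re.search(r"(\d+)\s*(day|week|month|year)s?", answer.lower())
    if not match:
        return None
    return int(match.group(1)) * UNIT_DAYS[match.group(2)]


def filter_by_duration(candidates, answer, chronic_codes, acute_codes):
    """Eliminate chronic codes for short coughs, acute codes for long ones."""
    days = duration_in_days(answer)
    if days is None:
        return list(candidates)          # unparseable answer: keep all
    if days < 21:                        # under 3 weeks
        return [c for c in candidates if c not in chronic_codes]
    if days > 56:                        # assumption: over 8 weeks
        return [c for c in candidates if c not in acute_codes]
    return list(candidates)


def confidence(initial_count, remaining_count):
    """1.0 when one code remains, 0.0 when nothing was eliminated."""
    if remaining_count <= 0:
        return 0.0
    if initial_count <= 1:
        return 1.0
    return (initial_count - remaining_count) / (initial_count - 1)
```

This measures confidence only by how many codes remain; weighting by how decisive each answer was would need an extra signal, for example from the LLM.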
+   4. **Final Mapping and Output**
+      * Once reduced to a single code (or top 3), return JSON:
+        ```json
+        {
+          "code": "R05.1",
+          "description": "Acute cough",
+          "confidence": 0.87,
+          "asked_questions": [
+            {"q":"How long have you been coughing?","a":"2 days"},
+            {"q":"Is it dry or productive?","a":"Dry"}
+          ]
+        }
+        ```
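A small serializer for this payload might look like the following sketch. `build_result` is a hypothetical helper, and the `alternatives` field is an assumption added here to cover the top-3 case; it is not part of the format shown above.

```python
import json


def build_result(ranked_codes, asked_questions, top_n=3):
    """Serialize the best code plus up to top_n - 1 alternatives.

    ranked_codes: list of (code, description, confidence), best first.
    """
    if not ranked_codes:
        return json.dumps({"code": None, "asked_questions": asked_questions})
    code, description, conf = ranked_codes[0]
    result = {
        "code": code,
        "description": description,
        "confidence": round(conf, 2),
        "asked_questions": asked_questions,
    }
    # "alternatives" is a hypothetical extra field for the top-3 case.
    alternatives = [
        {"code": c, "description": d, "confidence": round(p, 2)}
        for c, d, p in ranked_codes[1:top_n]
    ]
    if alternatives:
        result["alternatives"] = alternatives
    return json.dumps(result, indent=2, ensure_ascii=False)
```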
+5. **Prototype Tips for the Hackathon**
+   • Hard-code a small set of categories (e.g., cough, chest pain, fever, headache) and their discriminators to demonstrate the method.
+   • Use OpenAI’s GPT-4 or a local LLM to generate next questions:
+   ```
+   “Given these potential codes: [list], and these answers: […], what is the single most informative follow-up question to distinguish among them?”
+   ```
+   • Keep the conversation state on the backend (in Python or Node). Each HTTP call from the front end includes:
+   * `session_id`
+   * `category`
+   * `candidate_code_ids`
+   * `previous_qas`
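For a demo, the conversation state could live in an in-memory dictionary keyed by `session_id`, as in this sketch. `Session`, `SESSIONS`, `start_session`, and `record_answer` are hypothetical names, and a real deployment would likely need persistent storage and session expiry.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Session:
    category: str
    candidate_code_ids: list
    previous_qas: list = field(default_factory=list)


SESSIONS = {}  # session_id -> Session; lost on restart (demo only)


def start_session(category, candidate_code_ids):
    """Create a session and return its id for the front end to echo back."""
    session_id = uuid.uuid4().hex
    SESSIONS[session_id] = Session(category, list(candidate_code_ids))
    return session_id


def record_answer(session_id, question, answer, remaining_code_ids):
    """Store one Q/A pair and return the payload for the next HTTP call."""
    session = SESSIONS[session_id]
    session.previous_qas.append({"q": question, "a": answer})
    session.candidate_code_ids = list(remaining_code_ids)
    return {
        "session_id": session_id,
        "category": session.category,
        "candidate_code_ids": session.candidate_code_ids,
        "previous_qas": session.previous_qas,
    }
```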
+6. **Why This Wins**
+   – Demonstrates reasoning, not mere keyword lookup.
+   – Shows the AI’s ability to replicate a mini-clinical interview.
+   – Leverages the full ICD hierarchy while handling user imprecision.
+   – Judges see an interactive, dynamic tool rather than static lookup.
+Go build the symptom ontology JSON, implement the candidate-filtering logic, then call the LLM to decide follow-up questions. By the end of the hackathon week you’ll have a working demo that asks “How long, how severe, any associated features?” and maps to the right code with confidence.