# CVE Fact Checker - Deployment Guide
## Quick Start
### Local Development
```bash
python -m pip install -r requirements.txt
python -m cve_factchecker
```
### Production (Docker)
```bash
docker build -t cve-fact-checker .
docker run -p 7860:7860 cve-fact-checker
```
### Health Check
```bash
python health_check.py
```
## Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| `PORT` | Server port | `7860` |
| `OPENROUTER_API_KEY` | LLM API key | None |
| `FIREBASE_API_KEY` | Firebase API key | (embedded) |
| `AUTO_INGEST` | Auto-ingest on startup | `true` |
| `LANGUAGE_FILTER` | Language to filter articles | `English` |
| `USE_DUMMY_EMBEDDINGS` | Use lightweight embeddings | `false` |
| `VECTOR_PERSIST_DIR` | Vector DB directory | `/tmp/vector_db` |
| `SENTENCE_TRANSFORMERS_HOME` | Model cache | `/tmp/sentence_transformers` |
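
The table above could be read into a single settings object at startup. The following is a minimal sketch, not the app's actual code: `load_settings` and `_as_bool` are hypothetical names, and the accepted boolean spellings are an assumption. `FIREBASE_API_KEY` is left out because its default is embedded in the app.

```python
import os
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else is False (assumed convention)."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "port": int(env.get("PORT", "7860")),
        "openrouter_api_key": env.get("OPENROUTER_API_KEY"),  # None when unset
        "auto_ingest": _as_bool(env.get("AUTO_INGEST", "true")),
        "language_filter": env.get("LANGUAGE_FILTER", "English"),
        "use_dummy_embeddings": _as_bool(env.get("USE_DUMMY_EMBEDDINGS", "false")),
        "vector_persist_dir": env.get("VECTOR_PERSIST_DIR", "/tmp/vector_db"),
        "sentence_transformers_home": env.get(
            "SENTENCE_TRANSFORMERS_HOME", "/tmp/sentence_transformers"
        ),
    }
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to check without touching the real environment.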
## API Endpoints
### Health Check
```bash
curl http://localhost:7860/health
```
### Fact Check
```bash
# GET request (-G with --data-urlencode encodes the spaces in the claim)
curl -G http://localhost:7860/fact-check \
--data-urlencode "claim=Your claim here"
# POST request (JSON)
curl -X POST http://localhost:7860/fact-check \
-H "Content-Type: application/json" \
-d '{"claim": "Your claim here"}'
# POST request (form data)
curl -X POST http://localhost:7860/fact-check \
-F "claim=Your claim here"
```
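
The same requests can be issued from Python using only the standard library. This is a sketch under stated assumptions: the helper names are hypothetical, the base URL assumes the default `PORT`, and the response is returned as raw text because its schema is not documented here.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:7860"  # assumes the default PORT


def build_get_url(claim: str, base_url: str = BASE_URL) -> str:
    """Return the GET URL with the claim encoded as a query parameter."""
    return f"{base_url}/fact-check?{urllib.parse.urlencode({'claim': claim})}"


def build_post_request(claim: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Return a JSON POST request equivalent to the curl example."""
    body = json.dumps({"claim": claim}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/fact-check",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fact_check(claim: str, timeout: float = 30.0) -> str:
    """Send the POST request and return the raw response body as text."""
    with urllib.request.urlopen(build_post_request(claim), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fact_check("Your claim here")` requires the server to be running; the two `build_*` helpers can be inspected without a network connection.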
## Troubleshooting
### Common Issues
#### Permission Denied Errors
- **Symptom**: `[Errno 13] Permission denied: './vector_db'`
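- **Solution**: No action is required; the app automatically falls back to `/tmp/vector_db` or in-memory storage. To persist data elsewhere, set `VECTOR_PERSIST_DIR` to a writable directory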
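
A fallback of this kind might look like the sketch below. It is an illustration only, not the app's code: `first_writable_dir` is a hypothetical helper, and returning `None` to signal in-memory storage is an assumed convention.

```python
import os
import tempfile
from typing import Iterable, Optional


def first_writable_dir(candidates: Iterable[str]) -> Optional[str]:
    """Return the first directory that can be created and written to, else None."""
    for path in candidates:
        try:
            os.makedirs(path, exist_ok=True)
            # Creating a real temp file is a more reliable probe than os.access().
            with tempfile.NamedTemporaryFile(dir=path):
                pass
            return path
        except OSError:
            continue  # e.g. [Errno 13] Permission denied; try the next candidate
    return None  # caller would fall back to in-memory storage
```

For example, `first_writable_dir(["./vector_db", "/tmp/vector_db"])` tries the local directory first and only then the temporary one.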
#### Firebase Rate Limiting
- **Symptom**: `Firebase API failed: 429`
- **Solution**: The app implements exponential backoff and retry logic
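
For readers unfamiliar with the technique, exponential backoff can be sketched as follows. The names `RateLimited` and `with_backoff`, the delay values, and the 10% jitter are all assumptions chosen for illustration; the app's actual retry parameters are not documented here.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error raised by the caller's fetch function on HTTP 429."""


def with_backoff(
    fetch: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), doubling the wait after each RateLimited error."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the retry schedule can be exercised without real waiting.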
#### Model Loading Issues
- **Symptom**: `No sentence-transformers model found`
- **Solution**: Set `USE_DUMMY_EMBEDDINGS=true` to use lightweight embeddings; this also speeds up startup
#### Memory Issues
- **Symptom**: App crashes or becomes unresponsive
- **Solution**: Reduce batch sizes or set `USE_DUMMY_EMBEDDINGS=true`
### Debug Mode
Run with debug logging:
```bash
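# FLASK_ENV is only honored by Flask versions before 2.3; newer versions read FLASK_DEBUG
export FLASK_ENV=development
export FLASK_DEBUG=1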
python -m cve_factchecker
```
### Manual Health Check
The `health_check.py` script provides comprehensive diagnostics:
```bash
python health_check.py
```
This checks:
- Environment variables
- Directory permissions
- Package imports
- Firebase connectivity
- App functionality
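
The first three checks in the list above can be approximated with the standard library. This is a rough sketch and not the contents of `health_check.py`: `run_checks` and its result labels are hypothetical, and the Firebase and app-level checks are omitted because they depend on the running service.

```python
import importlib.util
import os
from typing import Dict, Iterable, Mapping


def run_checks(
    env: Mapping[str, str],
    required_vars: Iterable[str],
    writable_dirs: Iterable[str],
    packages: Iterable[str],
) -> Dict[str, bool]:
    """Return one pass/fail entry per check, keyed by a readable label."""
    results: Dict[str, bool] = {}
    for name in required_vars:
        # A variable passes only if it is set to a non-empty value.
        results[f"env:{name}"] = bool(env.get(name))
    for path in writable_dirs:
        results[f"dir:{path}"] = os.path.isdir(path) and os.access(path, os.W_OK)
    for pkg in packages:
        # find_spec locates a top-level package without importing it.
        results[f"import:{pkg}"] = importlib.util.find_spec(pkg) is not None
    return results
```

A call such as `run_checks(os.environ, ["OPENROUTER_API_KEY"], ["/tmp/vector_db"], ["flask", "chromadb"])` returns a dictionary that can be printed or used to set an exit code. Package names should be top-level import names.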
### Production Deployment
For production use:
```bash
python run_production.py
```
This script:
- Runs health checks
- Sets up signal handlers
- Starts gunicorn with optimal settings
- Provides better error reporting
## Docker Configuration
The Dockerfile is optimized for containerized deployment:
- Uses the Python 3.11 slim base image
- Creates writable cache directories
- Runs a single worker to avoid race conditions
- Handles signals properly
- Integrates a health check
## Architecture
```
CVE Fact Checker
├── Flask Web API
├── Vector Database (ChromaDB)
├── Embeddings (sentence-transformers)
├── Firebase Article Loader
└── LLM Integration (OpenRouter)
```
## Performance Tuning
### For Low-Memory Environments
```bash
export USE_DUMMY_EMBEDDINGS=true
export AUTO_INGEST=false
```
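
To give a sense of why dummy embeddings are cheap, here is one way a lightweight embedding could work. This is purely an assumption for illustration: the app's actual dummy embedding is not described in this guide, and `dummy_embedding` is a hypothetical function that uses feature hashing rather than a trained model.

```python
import hashlib
import math
from typing import List


def dummy_embedding(text: str, dim: int = 64) -> List[float]:
    """Map text to a deterministic unit-length vector using hashed word counts."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim  # bucket for this token
        sign = 1.0 if digest[4] % 2 == 0 else -1.0  # signed hashing limits collisions' bias
        vec[index] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    # Empty input yields the zero vector; otherwise scale to unit length.
    return [v / norm for v in vec] if norm else vec
```

A function like this needs no model download and almost no memory, at the cost of much weaker semantic matching than a sentence-transformers model.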
### For High-Throughput
```bash
export AUTO_INGEST=true
# Ensure adequate Firebase API limits
```
## Support
If you encounter issues:
1. Run `python health_check.py` for diagnostics
2. Check application logs for specific errors
3. Verify environment variables are set correctly
4. Ensure proper file system permissions