---
title: Geo Spatial Multi Vector Search
emoji: 🌍
colorFrom: yellow
colorTo: red
sdk: streamlit
sdk_version: 1.52.0
app_file: app.py
pinned: false
license: mit
---
<h1> <center>Geo-Spatial Chat with Qdrant & ColPali</center> </h1>
[![Open in Spaces](https://huggingface.co/datasets/huggingface/badges/resolve/main/open-in-hf-spaces-sm-dark.svg)](https://huggingface.co/spaces/mahimairaja/geo-spatial-multi-vector-search)
[![Qdrant](https://img.shields.io/badge/Qdrant-Vector_Database-red)](https://qdrant.tech/)
[![ColPali](https://img.shields.io/badge/ColPali-Multi--Vector_Retrieval-blue)](https://github.com/illuin-tech/colpali)
Query geospatial burn scar data using natural language, powered by **ColPali (Vidore/colSmol-500M)** multi-vector embeddings and **Qdrant**.
## ✨ Features
- **Natural Language Search**: Ask questions like _"Find burn scars larger than 500 hectares in California"_.
- **Multi-Vector Retrieval**: Uses ColPali-style multi-vector embeddings from `colSmol-500M` for fine-grained, patch-level image retrieval.
- **Spatial Filtering**:
  - **Geocoding Dropdown**: Select US States or Canadian Provinces to automatically focus the search.
  - **Radius Search**: Filter results within a specified radius (km) of a location.
- **Temporal Filtering**: Filter burn scars by acquisition date range.
- **Interactive Map**: Visualize results on a Folium map with popups displaying score, area, and RGB imagery.
- **Rich Results**: View top matches with confidence scores, metadata, and **Color (RGB)** imagery.
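
To make the radius search concrete, here is a minimal sketch of a client-side radius filter based on the haversine great-circle distance. It is an illustration only: the function names and the `lat`/`lon` payload keys are hypothetical, and the app itself may instead rely on Qdrant's server-side geo-radius filter condition.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(points, center, radius_km):
    """Keep only points whose distance to `center` is at most `radius_km`.

    `points` is a list of dicts with hypothetical "lat" and "lon" keys;
    `center` is a (lat, lon) tuple in degrees.
    """
    lat0, lon0 = center
    return [
        p for p in points
        if haversine_km(lat0, lon0, p["lat"], p["lon"]) <= radius_km
    ]
```

For example, `within_radius(results, (38.58, -121.49), 50)` would keep only results within roughly 50 km of that coordinate.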
## 🚀 Getting Started
### Prerequisites
- Python 3.10+
- Qdrant Instance (Local or Cloud)
### Installation
1. **Clone the repository:**
```bash
git clone https://github.com/mahimairaja/geo-spatial-chat-qdrant.git
cd geo-spatial-chat-qdrant
```
2. **Install dependencies:**
Using `uv` (recommended):
```bash
uv pip install -r requirements.txt
```
Or standard pip:
```bash
pip install -r requirements.txt
```
3. **Environment Setup:**
Create a `.env` file in the root directory:
```env
QDRANT_URL=your_qdrant_url
QDRANT_API_KEY=your_qdrant_api_key
HF_TOKEN=your_huggingface_token
```
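
As a rough sketch of how these variables might be consumed, the snippet below reads them from the process environment and fails early if the Qdrant URL is missing. The `load_settings` helper is hypothetical, and treating `QDRANT_API_KEY` and `HF_TOKEN` as optional is an assumption (a local Qdrant instance often needs no API key); check `app.py` for the actual behavior. A package such as `python-dotenv` is commonly used to load the `.env` file into the environment first.

```python
import os

# Assumption: only the URL is strictly required; the other two may be optional.
REQUIRED = ("QDRANT_URL",)
OPTIONAL = ("QDRANT_API_KEY", "HF_TOKEN")


def load_settings(env=None):
    """Return the settings as a dict, raising if a required variable is unset or empty."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env.get(key) for key in REQUIRED + OPTIONAL}
```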
### Data Ingestion
[![Dataset on HF](https://huggingface.co/datasets/huggingface/badges/resolve/main/dataset-on-hf-sm-dark.svg)](https://huggingface.co/datasets/mahimairaja/ibm-hls-burn-original) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1yiXCy2WVvvJREhiL75r63Oul_MFjNNO9?usp=sharing)
To ingest the dataset (HLS Burn Scars) into Qdrant:
```bash
python -m utils.ingest_to_qdrant
```
_Note: This process generates ColPali embeddings and may take some time depending on your hardware (GPU recommended)._
### Running the App
```bash
streamlit run app.py
```
## 🛠️ Technology Stack
- **Frontend**: [Streamlit](https://streamlit.io/)
- **Vector Database**: [Qdrant](https://qdrant.tech/)
- **Embedding Model**: [ColPali (vidore/colSmol-500M)](https://huggingface.co/vidore/colSmol-500M) - Optimized for document/image retrieval, built on the Idefics3 architecture.
- **Map Visualization**: [Folium](https://python-visualization.github.io/folium/) & `streamlit-folium`
- **Geocoding**: `geopy` (Nominatim API)
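
For readers new to multi-vector embeddings, the sketch below shows the late-interaction (MaxSim) scoring that ColPali-style models typically use: each query token vector is matched to its best image patch vector, and the best matches are summed. This is a pure-Python illustration with hypothetical function names, not the app's code; in practice the scoring is expected to happen inside Qdrant, which supports a max-similarity comparator for multi-vector fields.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for each query vector take the best-matching
    document vector, then sum those maxima over all query vectors."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rank(query_vecs, docs):
    """Rank documents (a dict of id -> list of patch vectors) by MaxSim, best first."""
    scored = [(doc_id, maxsim_score(query_vecs, vecs)) for doc_id, vecs in docs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Because every query token is scored against every patch, a match on a small image region can still contribute to the final score, which is what makes the retrieval fine-grained.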
The dataset preparation steps are available in this notebook:
[![Data Preparation](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/12KGJQ2UzQdaLIXbV258kh6tp9Ao7duVu?usp=sharing)
## 📄 License
MIT License