---
title: Geo Spatial Multi Vector Search
emoji: 🌍
colorFrom: yellow
colorTo: red
sdk: streamlit
sdk_version: 1.52.0
app_file: app.py
pinned: false
license: mit
---

# Geo-Spatial Chat with Qdrant & ColPali

[![Open in Spaces](https://huggingface.co/datasets/huggingface/badges/resolve/main/open-in-hf-spaces-sm-dark.svg)](https://huggingface.co/spaces/mahimairaja/geo-spatial-multi-vector-search)
[![Qdrant](https://img.shields.io/badge/Qdrant-Vector_Database-red)](https://qdrant.tech/)
[![ColPali](https://img.shields.io/badge/ColPali-Multi--Vector_Retrieval-blue)](https://github.com/illuin-tech/colpali)

Query geospatial burn scar data using natural language, powered by **ColPali (vidore/colSmol-500M)** multi-vector embeddings and **Qdrant**.

## ✨ Features

- **Natural Language Search**: Ask questions like _"Find burn scars larger than 500 hectares in California"_.
- **Multi-Vector Retrieval**: Uses ColPali-style embeddings from `colSmol-500M` for fine-grained, patch-level image retrieval.
- **Spatial Filtering**:
  - **Geocoding Dropdown**: Select a US state or Canadian province to focus the search automatically.
  - **Radius Search**: Filter results within a specified radius (km) of a location.
- **Temporal Filtering**: Filter burn scars by acquisition date range.
- **Interactive Map**: Visualize results on a Folium map with popups displaying score, area, and RGB imagery.
- **Rich Results**: View top matches with confidence scores, metadata, and **Color (RGB)** imagery.

## 🚀 Getting Started

### Prerequisites

- Python 3.10+
- Qdrant Instance (Local or Cloud)

### Installation

1. **Clone the repository:**

   ```bash
   git clone https://github.com/mahimairaja/geo-spatial-chat-qdrant.git
   cd geo-spatial-chat-qdrant
   ```

2. **Install dependencies:**

   Using `uv` (recommended):

   ```bash
   uv pip install -r requirements.txt
   ```

   Or standard pip:

   ```bash
   pip install -r requirements.txt
   ```

3. **Environment Setup:**

   Create a `.env` file in the root directory:

   ```env
   QDRANT_URL=your_qdrant_url
   QDRANT_API_KEY=your_qdrant_api_key
   HF_TOKEN=your_huggingface_token
   ```

### Data Ingestion

[![Dataset on HF](https://huggingface.co/datasets/huggingface/badges/resolve/main/dataset-on-hf-sm-dark.svg)](https://huggingface.co/datasets/mahimairaja/ibm-hls-burn-original)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1yiXCy2WVvvJREhiL75r63Oul_MFjNNO9?usp=sharing)

To ingest the dataset (HLS Burn Scars) into Qdrant:

```bash
python -m utils.ingest_to_qdrant
```

_Note: This process generates ColPali embeddings and may take some time depending on your hardware (GPU recommended)._

### Running the App

```bash
streamlit run app.py
```

## 🛠️ Technology Stack

- **Frontend**: [Streamlit](https://streamlit.io/)
- **Vector Database**: [Qdrant](https://qdrant.tech/)
- **Embedding Model**: [ColPali (vidore/colSmol-500M)](https://huggingface.co/vidore/colSmol-500M). Optimized for document/image retrieval and built on the Idefics3 architecture.
- **Map Visualization**: [Folium](https://python-visualization.github.io/folium/) & `streamlit-folium`
- **Geocoding**: `geopy` (Nominatim API)

If you are interested in dataset preparation, you can find the notebook here:

[![Data Preparation](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/12KGJQ2UzQdaLIXbV258kh6tp9Ao7duVu?usp=sharing)

## 📄 License

MIT License
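
## 🔍 Multi-Vector Scoring (Illustrative Sketch)

The Multi-Vector Retrieval feature relies on late interaction: a query and an image are each represented by many vectors (query tokens and image patches), and the score sums, for each query vector, its best match among the image's patch vectors. Qdrant offers this as the MaxSim comparator for multivector fields. The pure-Python sketch below only illustrates the scoring rule; it is not this app's code, and the function names and toy 2-D vectors are hypothetical (real embeddings have many more dimensions).

```python
from math import sqrt


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def maxsim_score(query_vectors, patch_vectors):
    """Late-interaction (MaxSim) score between one query and one image.

    For every query vector, take the similarity of its best-matching
    patch vector, then sum those maxima over all query vectors.
    """
    patches = [normalize(p) for p in patch_vectors]
    total = 0.0
    for q in map(normalize, query_vectors):
        total += max(sum(a * b for a, b in zip(q, p)) for p in patches)
    return total


# Toy embeddings (hypothetical values chosen for illustration only).
query = [[1.0, 0.0], [0.0, 1.0]]
image_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # has a patch for each query vector
image_b = [[1.0, 0.0], [1.0, 0.1]]              # only matches the first query vector

# Rank candidate images by descending MaxSim score.
ranked = sorted(
    {"image_a": image_a, "image_b": image_b}.items(),
    key=lambda item: maxsim_score(query, item[1]),
    reverse=True,
)
print([name for name, _ in ranked])
```

Because each query vector picks its own best patch, an image scores highly only when every part of the query finds a matching region, which is what makes patch-level retrieval more fine-grained than comparing a single pooled vector per image.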