update model card, add example

Files changed (2)

README.md +65 -4
ocr-example-input-1.png +3 -0

README.md CHANGED

@@ -20,9 +20,10 @@ tags:
 ## **Model Overview**
 ![viz.png](viz.png)
-*Preview of the model output on the example image.*
 ### **Description**
@@ -142,6 +143,18 @@ Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated sys
 ### Usage
 The model requires torch, and the custom code available in this repository.
 1. Clone the repository
@@ -159,13 +172,61 @@ git clone https://huggingface.co/nvidia/nemoretriever-ocr-v1
 git clone git@hf.co:nvidia/nemoretriever-ocr-v1
 ```
-2. Install the dependencies
-- TODO
 3. Run the model using the following code:
-- TODO
 <!---
 ### Software Integration

 ## **Model Overview**
+<!--
 ![viz.png](viz.png)
+*Preview of the model output on the example image.* -->
 ### **Description**
 ### Usage
+#### Prerequisites
+- **OS**: Linux amd64 with NVIDIA GPU
+- **CUDA**: CUDA Toolkit 12.8 and a compatible NVIDIA driver (required for PyTorch CUDA). Verify the driver with `nvidia-smi`.
+- **Python**: 3.12 (both subpackages require `python = ~3.12`)
+- **Build tools (when building the C++ extension)**:
+  - GCC/G++ with C++17 support
+  - CUDA toolkit headers (for building CUDA kernels)
+  - OpenMP (used by the C++ extension)
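The prerequisites listed above can be checked before attempting a build. The following is a minimal sketch using only the Python standard library. The helper name `check_prerequisites` and the choice of executables to probe (`nvidia-smi`, `nvcc`, `g++`) are illustrative assumptions, not part of this repository. It only checks that the tools are on `PATH`, not their versions.

```python
import shutil
import sys


def check_prerequisites() -> dict[str, bool]:
    """Report which prerequisites appear to be present.

    Hypothetical helper for illustration; not part of the repository.
    """
    return {
        # The README states that both subpackages require Python ~3.12.
        "python_3_12": sys.version_info[:2] == (3, 12),
        # Presence on PATH only; versions are not inspected.
        "nvidia_smi": shutil.which("nvidia-smi") is not None,
        "nvcc": shutil.which("nvcc") is not None,
        "gxx": shutil.which("g++") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

A `missing` entry for `nvcc` or `g++` matters only when building the C++ extension.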
+#### Installation
 The model requires torch, and the custom code available in this repository.
 1. Clone the repository
 git clone git@hf.co:nvidia/nemoretriever-ocr-v1
 ```
+2. Installation
+##### With pip
+- Create and activate a Python 3.12 environment (optional)
+- Run the following command to install the package:
+```bash
+cd nemo-retriever-ocr
+pip install hatchling
+pip install -v .
+```
+##### With docker
+Run the example end-to-end without installing anything on the host (besides Docker, Docker Compose, and the NVIDIA Container Toolkit):
+- Ensure Docker can see your GPU:
+```bash
+docker run --rm --gpus all nvcr.io/nvidia/pytorch:25.09-py3 nvidia-smi
+```
+- From the repo root, bring up the service to run the example against the provided image `ocr-example-input-1.png`:
+```bash
+docker compose run --rm nemo-retriever-ocr \
+  bash -lc "python example.py ocr-example-input-1.png --merge-level paragraph"
+```
+This will:
+- Build an image from the provided `Dockerfile` (based on `nvcr.io/nvidia/pytorch`)
+- Mount the repo at `/workspace`
+- Run `example.py` with the model from `checkpoints`
+Output is saved next to your input image as `<name>-annotated.<ext>` on the host.
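The `<name>-annotated.<ext>` naming can be sketched with `pathlib`, for locating the output from a script. The helper `annotated_path` is hypothetical, and the sketch assumes the annotated file keeps the input's extension and directory, as the sentence above suggests. It does not reproduce what `example.py` does internally.

```python
from pathlib import Path


def annotated_path(input_path: str) -> Path:
    """Return the expected location of the annotated output.

    Hypothetical helper; assumes the `<name>-annotated.<ext>` pattern
    and that the output sits next to the input image.
    """
    p = Path(input_path)
    return p.with_name(f"{p.stem}-annotated{p.suffix}")


if __name__ == "__main__":
    print(annotated_path("ocr-example-input-1.png"))
```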
 3. Run the model using the following code:
+```python
+from nemo_retriever_ocr.inference.pipeline import NemoRetrieverOCR
+ocr = NemoRetrieverOCR()
+predictions = ocr("ocr-example-input-1.png")
+for pred in predictions:
+    print(
+        f"  - Text: '{pred['text']}', "
+        f"Confidence: {pred['confidence']:.2f}, "
+        f"Bbox: [left={pred['left']:.4f}, upper={pred['upper']:.4f}, right={pred['right']:.4f}, lower={pred['lower']:.4f}]"
+    )
+```
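Each prediction in the example above is read as a mapping with the keys `text`, `confidence`, `left`, `upper`, `right`, and `lower`. A minimal sketch of post-processing such results follows. It uses stand-in data so that it runs without the model; the helper `filter_predictions`, the sample values, and the `0.5` threshold are illustrative assumptions, not part of the package.

```python
def filter_predictions(predictions, min_confidence=0.5):
    """Keep predictions whose confidence is at least min_confidence.

    Hypothetical helper; relies only on the keys printed in the
    README example.
    """
    return [p for p in predictions if p["confidence"] >= min_confidence]


if __name__ == "__main__":
    # Made-up stand-in data shaped like the predictions in the example.
    sample = [
        {"text": "Invoice", "confidence": 0.97,
         "left": 0.10, "upper": 0.05, "right": 0.32, "lower": 0.09},
        {"text": "l1I|", "confidence": 0.21,
         "left": 0.40, "upper": 0.50, "right": 0.45, "lower": 0.53},
    ]
    for pred in filter_predictions(sample):
        print(pred["text"], f"{pred['confidence']:.2f}")
```

With the real pipeline, the list returned by `ocr(...)` would take the place of `sample`.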
 <!---
 ### Software Integration

ocr-example-input-1.png ADDED

Git LFS Details

SHA256: f7faa3e8052eab6c00fb54d269a7b6a5b4bd2774d4f5109120da2525c5c16a3b
Pointer size: 131 Bytes
Size of remote file: 225 kB