Theo Viel committed

Commit f1e5511 · 1 Parent(s): f0a19f6

update model card, add example

Files changed (2)
  1. README.md +65 -4
  2. ocr-example-input-1.png +3 -0
README.md CHANGED

@@ -20,9 +20,10 @@ tags:

 ## **Model Overview**

+ <!--
 ![viz.png](viz.png)

- *Preview of the model output on the example image.*
+ *Preview of the model output on the example image.* -->

 ### **Description**
 
@@ -142,6 +143,18 @@ Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated sys

 ### Usage

+ #### Prerequisites
+
+ - **OS**: Linux amd64 with NVIDIA GPU
+ - **CUDA**: CUDA Toolkit 12.8 and compatible NVIDIA driver installed (for PyTorch CUDA). Verify with `nvidia-smi`.
+ - **Python**: 3.12 (both subpackages require `python = ~3.12`)
+ - **Build tools (when building the C++ extension)**:
+   - GCC/G++ with C++17 support
+   - CUDA toolkit headers (for building CUDA kernels)
+   - OpenMP (used by the C++ extension)
+
+
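
The prerequisites above can be confirmed from the standard library alone; the sketch below only reports what it finds (interpreter version, tool locations, driver output) and changes nothing on the system:

```python
# Report the host prerequisites listed above, using only the standard library.
# Expected: a Python 3.12.x interpreter, nvidia-smi/nvcc/g++ on PATH, and a
# working NVIDIA driver (nvidia-smi prints its usual table).
import shutil
import subprocess
import sys

print("Python:", sys.version.split()[0])

for tool in ("nvidia-smi", "nvcc", "g++"):
    path = shutil.which(tool)
    print(f"{tool}: {path or 'NOT FOUND'}")

# GPU / driver visibility check (non-fatal if the driver is missing)
subprocess.run(["nvidia-smi"], check=False)
```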
+ #### Installation
 The model requires torch and the custom code available in this repository.

 1. Clone the repository

@@ -159,13 +172,61 @@ git clone https://huggingface.co/nvidia/nemoretriever-ocr-v1
 git clone [email protected]:nvidia/nemoretriever-ocr-v1
 ```

- 2. Install the dependencies
- - TODO
+ 2. Installation
+
+ ##### With pip
+
+ - Create and activate a Python 3.12 environment (optional)
+
+ - Run the following commands to install the package:
+
+ ```bash
+ cd nemo-retriever-ocr
+ pip install hatchling
+ pip install -v .
+ ```
+
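
If the build completes, a minimal sanity check (a sketch; it assumes the install exposes the same `nemo_retriever_ocr.inference.pipeline` module used in the usage example below) is:

```python
# Post-install sanity check: run inside the environment used for `pip install`.
# It only verifies that the package imports cleanly and that PyTorch sees the
# GPU; it does not run inference.
import torch
from nemo_retriever_ocr.inference.pipeline import NemoRetrieverOCR

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("NemoRetrieverOCR imported from:", NemoRetrieverOCR.__module__)
```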
+ ##### With docker
+
+ Run the example end-to-end without installing anything on the host (besides Docker, docker compose, and NVIDIA Container Toolkit):
+
+ - Ensure Docker can see your GPU:
+
+ ```bash
+ docker run --rm --gpus all nvcr.io/nvidia/pytorch:25.09-py3 nvidia-smi
+ ```
+
+ - From the repo root, bring up the service to run the example against the provided image `ocr-example-input-1.png`:
+
+ ```bash
+ docker compose run --rm nemo-retriever-ocr \
+   bash -lc "python example.py ocr-example-input-1.png --merge-level paragraph"
+ ```
+
+ This will:
+ - Build an image from the provided `Dockerfile` (based on `nvcr.io/nvidia/pytorch`)
+ - Mount the repo at `/workspace`
+ - Run `example.py` with the model from `checkpoints`
+
+ Output is saved next to your input image as `<name>-annotated.<ext>` on the host.
 
 
 3. Run the model using the following code:

- - TODO
+ ```python
+ from nemo_retriever_ocr.inference.pipeline import NemoRetrieverOCR
+
+ ocr = NemoRetrieverOCR()
+
+ predictions = ocr("ocr-example-input-1.png")
+
+ for pred in predictions:
+     print(
+         f" - Text: '{pred['text']}', "
+         f"Confidence: {pred['confidence']:.2f}, "
+         f"Bbox: [left={pred['left']:.4f}, upper={pred['upper']:.4f}, right={pred['right']:.4f}, lower={pred['lower']:.4f}]"
+     )
+ ```
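
Continuing from the snippet above, the returned fields (`text`, `confidence`, `left`, `upper`, `right`, `lower`) are enough to assemble a rough plain-text transcript. The sketch below is illustrative only: the 0.5 confidence cutoff and the top-to-bottom, left-to-right sort are assumptions about reading order, not documented behavior of the pipeline.

```python
# Rough reading-order transcript from the predictions above.
# The confidence cutoff and the (upper, left) sort key are illustrative choices.
kept = [p for p in predictions if p["confidence"] >= 0.5]
kept.sort(key=lambda p: (p["upper"], p["left"]))

transcript = "\n".join(p["text"] for p in kept)
print(transcript)
```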
 
 <!---
 ### Software Integration

ocr-example-input-1.png ADDED

Git LFS Details

  • SHA256: f7faa3e8052eab6c00fb54d269a7b6a5b4bd2774d4f5109120da2525c5c16a3b
  • Pointer size: 131 Bytes
  • Size of remote file: 225 kB
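
After cloning with Git LFS, the checksum above can be used to confirm that the example image was fetched in full rather than left as an LFS pointer; a minimal check:

```python
# Verify that ocr-example-input-1.png was fetched via Git LFS (a ~225 kB file,
# not a ~131-byte pointer) by comparing its SHA256 with the value listed above.
import hashlib
from pathlib import Path

expected = "f7faa3e8052eab6c00fb54d269a7b6a5b4bd2774d4f5109120da2525c5c16a3b"
data = Path("ocr-example-input-1.png").read_bytes()

print("size:", len(data), "bytes")
print("sha256 matches:", hashlib.sha256(data).hexdigest() == expected)
```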