# Contributing
## Installation
Install [pixi](https://pixi.sh/latest/) for pulling conda/pip packages:
```bash
curl -fsSL https://pixi.sh/install.sh | sh
```
Create the pixi environment and enter an activated shell:
```bash
pixi shell
```
Create a virtualenv and install nemo-retriever-ocr into it via `uv`:
```bash
uv venv \
&& uv pip install -e ./nemo-retriever-ocr -v
```
Verify that the OCR inference libraries can now be imported:
```bash
uv run python -c "import nemo_retriever_ocr; import nemo_retriever_ocr_cpp"
```
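If the one-liner fails, it can help to know which of the two modules is missing. The following is a minimal sketch using only the standard library; the helper name `find_missing_modules` is hypothetical and not part of the package. It only checks that each module can be located, so a module that is found but fails while loading would still need the import command above to surface the error.

```python
import importlib.util

# Module names taken from the import check above.
REQUIRED_MODULES = ("nemo_retriever_ocr", "nemo_retriever_ocr_cpp")


def find_missing_modules(names=REQUIRED_MODULES):
    """Return the names in `names` that cannot be located on the import path."""
    missing = []
    for name in names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_modules()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All OCR modules found.")
```

Saved under a name of your choice (for example `check_imports.py`, a hypothetical file), it can be run with `uv run python check_imports.py`.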
## Usage
`nemo_retriever_ocr.inference.pipeline.NemoRetrieverOCR` is the main entry point for performing OCR inference; it can be used to iterate over predictions for a given input image:
```python
from nemo_retriever_ocr.inference.pipeline import NemoRetrieverOCR
ocr = NemoRetrieverOCR()
predictions = ocr("ocr-example-input-1.png")
for pred in predictions:
    print(
        f"  - Text: '{pred['text']}', "
        f"Confidence: {pred['confidence']:.2f}, "
        f"Bbox: [left={pred['left']:.4f}, upper={pred['upper']:.4f}, right={pred['right']:.4f}, lower={pred['lower']:.4f}]"
    )
```
Predictions can also be superimposed on the input image for visualization:
```python
image_path = "ocr-example-input-1.png"  # same example image as above
ocr(image_path, visualize=True)
```
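Since each prediction carries a `confidence` value alongside its text and bounding box, low-confidence detections can be dropped before further processing. The sketch below assumes predictions are dictionaries with the keys used in the printing loop above; the helper `filter_by_confidence`, the 0.5 threshold, and the sample values are illustrative assumptions, not part of the library.

```python
def filter_by_confidence(predictions, min_confidence=0.5):
    """Keep predictions whose 'confidence' is at least `min_confidence`."""
    return [p for p in predictions if p["confidence"] >= min_confidence]


# Hand-written stand-in data shaped like the predictions printed above;
# these values are illustrative, not real model output.
sample = [
    {"text": "Invoice", "confidence": 0.98,
     "left": 0.10, "upper": 0.05, "right": 0.30, "lower": 0.09},
    {"text": "l1|", "confidence": 0.21,
     "left": 0.55, "upper": 0.40, "right": 0.58, "lower": 0.43},
]

kept = filter_by_confidence(sample, min_confidence=0.5)
print([p["text"] for p in kept])  # prints ['Invoice']
```

With the real pipeline, the same helper could be applied to the result of `ocr("ocr-example-input-1.png")`, provided the returned predictions have the shape assumed here.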
The level of detection merging is controlled by the `merge_level` argument (default: `"paragraph"`):
```python
ocr(image_path, merge_level="word")      # leave detected words unmerged
ocr(image_path, merge_level="sentence")  # merge detected words into sentences
```
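To see how the merge level changes the output, one option is to run the same image at each level and compare how many predictions come back. This is a rough sketch: `count_by_merge_level` is a hypothetical helper, and `fake_ocr` is a stand-in so the example runs without the model. It assumes the real call returns an iterable of predictions, as in the printing loop above.

```python
def count_by_merge_level(ocr, image_path, levels=("word", "sentence", "paragraph")):
    """Run `ocr` once per merge level and count the returned predictions."""
    counts = {}
    for level in levels:
        # Assumes the call returns an iterable of predictions.
        counts[level] = len(list(ocr(image_path, merge_level=level)))
    return counts


def fake_ocr(image_path, merge_level="paragraph"):
    """Stand-in for a NemoRetrieverOCR instance; it ignores the image entirely."""
    text = "hello world. second sentence here."
    if merge_level == "word":
        pieces = text.split()
    elif merge_level == "sentence":
        pieces = [s.strip() for s in text.split(".") if s.strip()]
    else:
        pieces = [text]
    return [{"text": piece} for piece in pieces]


print(count_by_merge_level(fake_ocr, "ocr-example-input-1.png"))
# prints {'word': 5, 'sentence': 2, 'paragraph': 1}
```

Passing a `NemoRetrieverOCR()` instance in place of `fake_ocr` should give the real counts for an image, under the assumption stated above.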
An example script `example.py` is provided for convenience:
```bash
uv run python example.py ocr-example-input-1.png
```
The script's `--merge-level` option adjusts detection merging in the same way:
```bash
uv run python example.py ocr-example-input-1.png --merge-level word
uv run python example.py ocr-example-input-1.png --merge-level sentence
```
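For contributors writing their own wrapper script, the command-line options shown above map naturally onto the Python API. The following is a hypothetical sketch of such a wrapper built with `argparse`; it is not the contents of `example.py`, and the function names `build_parser` and `main` are assumptions made for illustration.

```python
import argparse

# Merge levels named in this document.
MERGE_LEVELS = ("word", "sentence", "paragraph")


def build_parser():
    """Build a parser taking an image path and an optional --merge-level."""
    parser = argparse.ArgumentParser(
        description="Run OCR on one image (hypothetical wrapper)."
    )
    parser.add_argument("image_path", help="path to the input image")
    parser.add_argument(
        "--merge-level",
        choices=MERGE_LEVELS,
        default="paragraph",
        help="how detected words are merged",
    )
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Imported lazily so the parser can be exercised without the OCR package.
    from nemo_retriever_ocr.inference.pipeline import NemoRetrieverOCR

    ocr = NemoRetrieverOCR()
    for pred in ocr(args.image_path, merge_level=args.merge_level):
        print(f"{pred['text']} ({pred['confidence']:.2f})")
```

To use it as a script, call `main()` under an `if __name__ == "__main__":` guard at the bottom of the file.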