Production-ready SuryaOCR deployment on a RunPod H100 GPU, running OCR at roughly 0.5 seconds per image once the worker is warm.
Best for production: models are pre-downloaded into the image, so cold starts skip the model download.
The Docker image is automatically built by GitHub Actions and pushed to GitHub Container Registry.
Just use this image in RunPod:
ghcr.io/gunitbindal/surya-runpod-h100:latest
Setup:
- Create RunPod Template:
  - Go to: RunPod Console → Serverless → Templates → New Template
  - Container Image: ghcr.io/gunitbindal/surya-runpod-h100:latest
  - Docker Command: (leave empty)
  - GPU: H100 80GB or H100 PCIe
- Deploy Endpoint:
  - Template: Select your template
  - Active Workers: 1
  - Max Workers: 1
  - Endpoint Type: Queue
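
Once the endpoint is deployed, jobs can be submitted to it over RunPod's serverless HTTP API. The sketch below builds such a request using only the Python standard library. The endpoint ID, the API key, and the `image` input field (base64-encoded bytes) are assumptions for illustration; the real input schema is whatever `handler_final.py` expects, so check it (or `test_client.py`) before relying on this.

```python
import base64
import json
import urllib.request

RUNPOD_API_BASE = "https://api.runpod.ai/v2"


def build_run_request(endpoint_id: str, api_key: str, image_bytes: bytes) -> urllib.request.Request:
    """Build a POST to the endpoint's /run route (asynchronous, queue-based)."""
    payload = {
        # ASSUMPTION: the handler reads a base64 string from input["image"].
        "input": {"image": base64.b64encode(image_bytes).decode("ascii")}
    }
    return urllib.request.Request(
        url=f"{RUNPOD_API_BASE}/{endpoint_id}/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_job(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `submit_job(build_run_request(endpoint_id, api_key, image_bytes))` with real credentials should return a JSON object that includes a job ID, which can then be used to look up the job's status and output.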
Benefits:
- ✅ Fast cold starts (2-3 seconds)
- ✅ No model download wait time
- ✅ No pip installs on startup
- ✅ Production-ready
- ✅ Automatically updated on every push to main
Good for testing: Downloads on every worker start.
- Create Template:
  - Container Image: runpod/pytorch:2.8.0-py3.11-cuda12.8.1-cudnn-devel-ubuntu22.04
  - Docker Command: bash -c "pip install --no-cache-dir surya-ocr runpod pillow && curl -sSL https://raw.githubusercontent.com/GunitBindal/surya-runpod-h100/main/handler_final.py -o handler.py && python -u handler.py"
- Deploy Endpoint (same as the Deploy Endpoint step in Option 1)
Drawbacks:
- ⏱️ First request: 60-90 seconds (downloads models)
- 📦 Installs packages every worker start
- 💰 Wastes compute time on setup
- First request: 60-90 seconds (downloads Surya models ~500MB)
- Subsequent requests: ~0.5 seconds per image
- Active Workers = 1: Keeps worker warm, prevents queue issues
- Logs visible: Handler prints status to worker logs
- handler_final.py - Optimized handler with logging
- docker_command.txt - RunPod Docker command
- test_client.py - Python test client
Requests stuck IN_QUEUE?
- Set Active Workers = 1 in endpoint settings
- Wait up to 90 seconds for the first model download
- Check worker logs for errors
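
To tell a slow first request apart from a job that is truly stuck, it can help to poll the job's status with a deadline instead of waiting blindly. The sketch below assumes RunPod's `/status/{job_id}` route and its usual status strings; the function names, the 120-second limit, and the 2-second interval are illustrative choices, not values taken from this repository.

```python
import json
import time
import urllib.request
from typing import Callable

TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"}


def fetch_status(endpoint_id: str, api_key: str, job_id: str) -> dict:
    """GET the current status of one job from the RunPod serverless API."""
    request = urllib.request.Request(
        f"https://api.runpod.ai/v2/{endpoint_id}/status/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_job(
    get_status: Callable[[], dict],
    max_wait: float = 120.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal state or max_wait seconds pass."""
    waited = 0.0
    while True:
        status = get_status()
        state = status.get("status")
        if state in TERMINAL_STATES:
            return status
        if waited >= max_wait:
            # Still IN_QUEUE or IN_PROGRESS past the deadline: check worker logs.
            raise TimeoutError(f"job still {state} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

A typical call would be `wait_for_job(lambda: fetch_status(endpoint_id, api_key, job_id))`. A job that stays `IN_QUEUE` well past the 60-90 second model download window suggests no worker is picking it up.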
Worker crashing?
- Verify H100 GPU is selected
- Ensure Docker command is exactly as shown above
- Check logs for memory issues
✓ Handler updated with better logging
✓ Simplified deployment (no Docker build)
✓ Single-command setup via GitHub