Translate gemma 4B recipes by tanzeel-amd · Pull Request #404 · microsoft/olive-recipes

tanzeel-amd · 2026-05-08T10:32:17Z

Adds a complete Olive recipe to export and run google/translategemma-4b-it as a three-submodel VLM pipeline via ONNX Runtime GenAI. TranslateGemma is a Gemma3-based multimodal translation model supporting 55 languages with both text-to-text and image-to-text translation.

What's included

Export pipeline (builtin/optimize.py): Runs three Olive workflows to produce the text decoder (INT4 or FP32 via ModelBuilder), embedding (token embed + image feature scattering), and vision encoder (SigLIP 27-layer + AvgPool2d projector). Patches genai_config.json and generates processor_config.json for the OGA C++ image pipeline.
Two export configs: cpu_and_mobile (INT4 RTN, ~6.7 GB) and cpu_and_mobile_fp32 (FP32, ~19.2 GB)
Inference script (inference.py): Text and image translation with streaming output via ORT GenAI multimodal processor
WMT24++ benchmark (benchmark_wmt24pp.py): Evaluates translation quality across up to 55 language pairs using COMET scoring, with domain-stratified sampling and resume support
Custom export wrappers (builtin/user_script.py): Gemma3VisionModel and Gemma3EmbeddingModel wrappers for clean ONNX export of vision and embedding submodels
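The three-workflow export described above can be sketched as follows. This is a minimal illustration, not the contents of builtin/optimize.py: the config file names come from the file list in this PR, while `config_paths`, `export_all`, the `runner` parameter, and the run order are hypothetical. The default runner assumes Olive's Python entry point `olive.workflows.run`, which accepts a config path.

```python
from pathlib import Path
from typing import Callable

# Sub-model config names as listed in this PR's file summary.
SUBMODELS = ("text", "embedding", "vision")


def config_paths(variant: str, root: Path = Path("builtin")) -> list[Path]:
    """Return the three Olive config paths for one export variant."""
    return [root / variant / f"{name}.json" for name in SUBMODELS]


def _olive_run(config: str) -> object:
    # Imported lazily so the path logic works without Olive installed.
    from olive.workflows import run

    return run(config)


def export_all(variant: str, runner: Callable[[str], object] = _olive_run) -> list[str]:
    """Run one Olive workflow per sub-model; return the configs that ran."""
    done = []
    for path in config_paths(variant):
        runner(str(path))  # one workflow per sub-model (order is an assumption)
        done.append(path.as_posix())
    return done
```

Calling `export_all("cpu_and_mobile")` would target the INT4 configs and `export_all("cpu_and_mobile_fp32")` the FP32 ones. The config patching that the PR describes for genai_config.json and processor_config.json is not covered by this sketch.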

Copilot

Pull request overview

Adds a new Olive recipe for exporting and running the gated google/translategemma-4b-it multimodal translation model (text and image translation) via ONNX Runtime GenAI, including an optional WMT24++ + COMET benchmarking script.

Changes:

Added end-to-end export pipeline (builtin/optimize.py) plus Olive JSON configs to export text/vision/embedding ONNX sub-models.
Added runtime scripts for inference (inference.py) and WMT24++ benchmarking (benchmark_wmt24pp.py).
Added recipe documentation and included the Gemma Terms of Use.
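The PR description says the benchmark script uses domain-stratified sampling and supports resuming. A minimal sketch of those two ideas is below. It does not reproduce benchmark_wmt24pp.py: the `domain` and `id` field names, the JSONL results format, and all function names are assumptions made for illustration.

```python
import json
import random
from collections import defaultdict
from pathlib import Path


def stratified_sample(rows, per_domain, seed=0):
    """Pick up to `per_domain` rows from each domain, reproducibly."""
    by_domain = defaultdict(list)
    for row in rows:
        by_domain[row["domain"]].append(row)
    rng = random.Random(seed)  # fixed seed so reruns select the same rows
    picked = []
    for domain in sorted(by_domain):
        group = by_domain[domain]
        picked.extend(rng.sample(group, min(per_domain, len(group))))
    return picked


def load_done(results_path):
    """Read IDs already translated from a JSONL results file, if it exists."""
    path = Path(results_path)
    if not path.exists():
        return set()
    with path.open(encoding="utf-8") as fh:
        return {json.loads(line)["id"] for line in fh if line.strip()}


def pending(rows, results_path):
    """Drop rows whose ID is already in the results file (resume support)."""
    done = load_done(results_path)
    return [row for row in rows if row["id"] not in done]
```

Sampling per domain keeps small domains represented in a reduced evaluation set, and a fixed seed means a resumed run selects the same rows as the interrupted one.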

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
google-translategemma-4b-it/README.md	Documents export/inference/benchmark usage and model architecture.
google-translategemma-4b-it/LICENSE	Adds Gemma Terms of Use text to the recipe directory.
google-translategemma-4b-it/inference.py	CLI for text/image translation using ORT GenAI multimodal pipeline.
google-translategemma-4b-it/benchmark_wmt24pp.py	CLI to run WMT24++ translations and score with COMET, with resume support.
google-translategemma-4b-it/builtin/optimize.py	Orchestrates Olive exports and patches GenAI/processor/tokenizer configs.
google-translategemma-4b-it/builtin/user_script.py	Provides PyTorch wrapper modules + IO/dummy-input helpers for Olive export.
google-translategemma-4b-it/builtin/cpu_and_mobile/text.json	Olive config for INT4 RTN text decoder export.
google-translategemma-4b-it/builtin/cpu_and_mobile/embedding.json	Olive config for embedding/scatter ONNX export.
google-translategemma-4b-it/builtin/cpu_and_mobile/vision.json	Olive config for vision encoder ONNX export.
google-translategemma-4b-it/builtin/cpu_and_mobile_fp32/text.json	Olive config for FP32 text decoder export.
google-translategemma-4b-it/builtin/cpu_and_mobile_fp32/embedding.json	Olive config for FP32 embedding/scatter ONNX export.
google-translategemma-4b-it/builtin/cpu_and_mobile_fp32/vision.json	Olive config for FP32 vision encoder ONNX export.
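The embedding sub-model is described as combining token embedding with image feature scattering. The scatter step can be illustrated with a dependency-free sketch over plain lists. The real Gemma3EmbeddingModel wrapper presumably works on tensors; the function name, argument names, and the placeholder token ID used below are hypothetical.

```python
def scatter_image_features(input_ids, token_embeds, image_features, image_token_id):
    """Replace embeddings at image-placeholder positions with vision features.

    input_ids:      list[int], one token ID per position
    token_embeds:   list[list[float]], one embedding row per position
    image_features: list[list[float]], one row per image placeholder, in order
    """
    positions = [i for i, tok in enumerate(input_ids) if tok == image_token_id]
    if len(positions) != len(image_features):
        raise ValueError(
            f"{len(positions)} image placeholders but {len(image_features)} feature rows"
        )
    merged = [row[:] for row in token_embeds]  # copy so the input is untouched
    for pos, feat in zip(positions, image_features):
        merged[pos] = list(feat)
    return merged


# Usage with a made-up placeholder ID of 99 and 2-dimensional embeddings.
ids = [1, 99, 99, 2]
embeds = [[0.0, 0.0], [0.1, 0.1], [0.2, 0.2], [0.3, 0.3]]
features = [[9.0, 9.0], [8.0, 8.0]]
merged = scatter_image_features(ids, embeds, features, image_token_id=99)
```

Text positions keep their token embeddings while the placeholder positions take the vision encoder's output rows in order, so the decoder receives a single sequence of embeddings.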

tanzeel-amd · 2026-05-08T10:43:40Z

@microsoft-github-policy-service agree company="AMD"

VishalX · 2026-05-11T10:07:09Z

@xieofxie / @devang-ml pls review

VishalX · 2026-05-21T14:31:36Z

@xieofxie can you please review this?

- Introduced benchmark_wmt24pp.py for evaluating translation quality on the WMT24++ dataset using COMET.
- Added inference.py for text and image translation capabilities.
- Created README.md with setup instructions, usage examples, and model architecture details.
- Implemented optimization pipeline in optimize.py for exporting sub-models (text decoder, vision encoder, embedding) with INT4 quantization.
- Developed user script in user_script.py for model integration.
- Configured export settings in JSON files for different model variants (INT4, AWQ, FP32).
- Included test images for demonstration purposes.

Copied from https://ai.google.dev/gemma/terms

- README.md: Remove nonexistent AWQ config references, fix file tree to match actual repo layout (google-translategemma-4b-it/, no data/ dir)
- benchmark_wmt24pp.py: Default --hf-model-dir to HF model ID, remove unused stream param and dead new_pairs variable
- inference.py: Remove unused numpy import
- builtin/user_script.py: Remove unused math, os, glob imports

Co-authored-by: Cursor <cursoragent@cursor.com>

Moved README.md, inference.py, benchmark_wmt24pp.py into builtin/ to follow the common recipe file organization pattern. Added info.yml with recipe metadata. Updated default model paths in inference.py and benchmark_wmt24pp.py.

Co-authored-by: Cursor <cursoragent@cursor.com>

Copilot AI review requested due to automatic review settings May 8, 2026 10:32

Copilot started reviewing on behalf of tanzeel-amd May 8, 2026 10:32

Copilot AI reviewed May 8, 2026

tanzeel-amd force-pushed the translate-gemma-4b branch from 4d43b33 to 6fae78b May 19, 2026 10:08

xieofxie reviewed May 22, 2026

Comment thread google-translategemma-4b-it/builtin/README.md

VishalX and others added 9 commits May 22, 2026 05:07

Add Gemma3 License file

b31a9c5

Copied from https://ai.google.dev/gemma/terms

Update Licence

fc5978e

Update Licence

c9af021

Update user_script.py

09bb96b

Update Licence

645c32f

Add MIT license for new files

d44a6cd

tanzeel-amd force-pushed the translate-gemma-4b branch from 22328c6 to ada0ad0 May 22, 2026 12:07

tanzeel-amd requested a review from xieofxie May 25, 2026 05:16

xieofxie approved these changes May 25, 2026

tanzeel-amd wants to merge 9 commits into microsoft:main from tanzeel-amd:translate-gemma-4b

4 participants