Add Fara-7b recipes by apsonawane · Pull Request #384 · microsoft/olive-recipes

apsonawane · 2026-04-24T23:17:53Z

This pull request introduces a complete ONNX Runtime GenAI example for the Fara-7B vision-language model, including documentation, configuration files for model export and optimization (for both CPU and CUDA), a Python inference script, and supporting metadata. The changes enable users to export, optimize, quantize, and run inference with Fara-7B using ONNX Runtime GenAI, supporting both text and image inputs.

Key changes:

1. Documentation and Metadata

Added a comprehensive README.md describing the Fara-7B ONNX Runtime GenAI pipeline, setup instructions, usage examples, and directory structure.
Introduced info.yml with metadata such as supported execution providers, devices, and keywords for discoverability.

2. Model Export and Optimization Pipelines

Added Olive configuration JSONs for CPU/mobile (cpu_and_mobile/embedding.json, cpu_and_mobile/vision.json, cpu_and_mobile/text.json) and CUDA (cuda/embedding.json, cuda/vision.json, cuda/text.json) pipelines, specifying model export, graph surgeries, optimizations, and quantization/precision steps for each sub-model (vision encoder, embedding, text decoder).
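To make the grouping of the six configs concrete, the sketch below builds one `olive run --config ...` command per sub-model for a chosen target. The helper names (`config_paths`, `build_commands`) and the embedding, vision, text ordering are assumptions made for illustration; the recipe's own orchestration may invoke Olive differently, and nothing here is executed against Olive.

```python
from pathlib import Path

# Sub-models named in the PR description; this processing order is an assumption.
SUB_MODELS = ("embedding", "vision", "text")
# Target directories named in the PR description.
TARGETS = ("cpu_and_mobile", "cuda")


def config_paths(target: str, root: Path = Path(".")) -> list[Path]:
    """Return the Olive config paths for one target directory."""
    if target not in TARGETS:
        raise ValueError(f"unknown target {target!r}; expected one of {TARGETS}")
    return [root / target / f"{name}.json" for name in SUB_MODELS]


def build_commands(target: str, root: Path = Path(".")) -> list[list[str]]:
    """Build one `olive run` invocation per config; nothing is executed here."""
    return [["olive", "run", "--config", str(path)] for path in config_paths(target, root)]


if __name__ == "__main__":
    # Print the commands for the CUDA pipeline instead of running them.
    for command in build_commands("cuda"):
        print(" ".join(command))
```

Each returned list could be passed to `subprocess.run(command, check=True)` once the Olive CLI is installed; keeping construction separate from execution makes the mapping easy to inspect first.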

3. Inference Script

Added inference.py, a Python script for running text or multimodal (image+text) inference with ONNX Runtime GenAI, supporting both batch and interactive modes.
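One plausible way to dispatch between the two modes is sketched below. It is illustrative only: the `--prompt` and `--image` flags and the `generate` callback are hypothetical stand-ins, not the actual interface of inference.py, and the ONNX Runtime GenAI call itself is deliberately left out.

```python
import argparse
from typing import Callable, Iterable, Optional

# Hypothetical signature: (prompt, optional image path) -> generated text.
GenerateFn = Callable[[str, Optional[str]], str]


def parse_args(argv: Optional[list[str]] = None) -> argparse.Namespace:
    """Parse the hypothetical command-line flags used by this sketch."""
    parser = argparse.ArgumentParser(description="Text or image+text inference (sketch)")
    parser.add_argument("--prompt", action="append", default=[],
                        help="prompt to run; repeat the flag for several (batch mode)")
    parser.add_argument("--image", default=None, help="optional image path")
    return parser.parse_args(argv)


def run(args: argparse.Namespace, generate: GenerateFn,
        lines: Optional[Iterable[str]] = None) -> list[str]:
    """Run batch mode if prompts were given, otherwise read prompts one by one."""
    if args.prompt:
        return [generate(prompt, args.image) for prompt in args.prompt]

    outputs = []
    # Interactive mode: read from `lines` if supplied, else from stdin until an empty line.
    source = lines if lines is not None else iter(lambda: input("> "), "")
    for line in source:
        line = line.strip()
        if not line or line in {"exit", "quit"}:
            break
        outputs.append(generate(line, args.image))
    return outputs
```

Passing the model call in as `generate` keeps the mode handling testable without loading a model; a real script would wrap its generation loop in that callback.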

4. Project Structure and Ignore Rules

Updated .gitignore to exclude generated models, cache, Python bytecode, and log files.

These changes together provide an end-to-end workflow for exporting, optimizing, quantizing, and running inference on the Fara-7B model with ONNX Runtime GenAI, making it easy for users to deploy and test the model on both CPU and GPU platforms.

Copilot

Pull request overview

This PR adds a new microsoft-Fara-7B/builtin recipe bundle to export/optimize the Fara-7B vision-language model to ONNX (via Olive) and run it with ONNX Runtime GenAI, including CPU and CUDA pipelines.

Changes:

Added Olive pipeline JSONs for embedding / vision / text sub-model export + optimization for cpu_and_mobile/ and cuda/.
Added Python orchestration and runtime scripts (optimize.py, inference.py, user_script.py) plus model code under codes/.
Added supporting metadata/docs (README.md, info.yml) and local ignore rules (.gitignore).

Reviewed changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 4 comments.

File	Description
microsoft-Fara-7B/builtin/user_script.py	Olive callbacks for loading the custom VL model + IO/dummy input definitions for export.
microsoft-Fara-7B/builtin/requirements.txt	Python dependencies to run Olive export/optimization and scripts.
microsoft-Fara-7B/builtin/optimize.py	Runs the three Olive configs and patches/writes GenAI runtime config files.
microsoft-Fara-7B/builtin/info.yml	Minimal metadata for the builtin recipe directory.
microsoft-Fara-7B/builtin/inference.py	Example ONNX Runtime GenAI inference script for text-only and image+text prompts.
microsoft-Fara-7B/builtin/cuda/embedding.json	CUDA Olive pipeline for exporting/optimizing embedding sub-model.
microsoft-Fara-7B/builtin/cuda/text.json	CUDA Olive pipeline for producing INT4 text decoder via ModelBuilder.
microsoft-Fara-7B/builtin/cuda/vision.json	CUDA Olive pipeline for exporting/optimizing vision encoder sub-model.
microsoft-Fara-7B/builtin/cpu_and_mobile/embedding.json	CPU/mobile Olive pipeline for exporting/quantizing embedding sub-model.
microsoft-Fara-7B/builtin/cpu_and_mobile/text.json	CPU/mobile Olive pipeline for producing INT4 text decoder via ModelBuilder.
microsoft-Fara-7B/builtin/cpu_and_mobile/vision.json	CPU/mobile Olive pipeline for exporting/quantizing vision encoder sub-model.
microsoft-Fara-7B/builtin/codes/modeling_qwen2_5_vl.py	Custom Qwen2.5-VL-derived PyTorch modeling code to enable ONNX export.
microsoft-Fara-7B/builtin/codes/__init__.py	Package marker for `codes/`.
microsoft-Fara-7B/builtin/README.md	End-to-end instructions and usage examples for export + inference.
microsoft-Fara-7B/builtin/.gitignore	Ignores generated artifacts and caches for the builtin workflow.

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

devang-ml · 2026-05-12T22:47:08Z

Please add LICENSE file.

Add Fara-7b recipes

46a8753

Copilot AI review requested due to automatic review settings April 24, 2026 23:17

Copilot started reviewing on behalf of apsonawane April 24, 2026 23:18

github-code-quality Bot found potential problems Apr 24, 2026

Comment thread microsoft-Fara-7B/builtin/codes/modeling_qwen2_5_vl.py Dismissed

Copilot AI reviewed Apr 24, 2026

Comment thread microsoft-Fara-7B/builtin/codes/modeling_qwen2_5_vl.py

Comment thread microsoft-Fara-7B/builtin/requirements.txt Outdated

Comment thread microsoft-Fara-7B/builtin/README.md Outdated

Comment thread microsoft-Fara-7B/builtin/codes/modeling_qwen2_5_vl.py

apsonawane and others added 2 commits April 24, 2026 16:27

Address copilot comments

a99914b

Merge branch 'main' into asonawane/fara-recipes

4a3d5f8

apsonawane enabled auto-merge (squash) April 24, 2026 23:52

Add eval script

59184c1

github-code-quality Bot found potential problems Apr 28, 2026

Comment thread microsoft-Fara-7B/builtin/eval.py Fixed

Comment thread microsoft-Fara-7B/builtin/eval.py Fixed

apsonawane and others added 4 commits April 28, 2026 17:38

Potential fix for pull request finding 'Empty except'

0f6dea8

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

Potential fix for pull request finding 'Unused local variable'

4c2b82d

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

Merge branch 'main' into asonawane/fara-recipes

7cd95f9

Merge branch 'main' into asonawane/fara-recipes

34dc067

devang-ml reviewed May 12, 2026

Comment thread microsoft-Fara-7B/builtin/cpu_and_mobile/embedding.json Outdated

Comment thread microsoft-Fara-7B/builtin/cpu_and_mobile/text.json

Comment thread microsoft-Fara-7B/builtin/eval.py

apsonawane added 2 commits May 25, 2026 01:09

Merge branch 'main' into asonawane/fara-recipes

a9947a0

Address comments on PR

bcc3f28

apsonawane requested a review from devang-ml May 25, 2026 01:41

apsonawane wants to merge 10 commits into main from asonawane/fara-recipes

3 participants
