[WIP] feat: [ENG-1568] add to_hub_format() in SuiteResult#2560

Draft
henchaves wants to merge 7 commits into main from feature/eng-1568-snapshot-import-upload-suiteresult-to-the-hub-as-a-local

Conversation

@henchaves henchaves commented Jun 24, 2026

Member

Description

This PR adds a trace_index parameter to TestCaseResult and a helper method, SuiteResult.to_hub_format(). The goal is to make it possible to upload a local evaluation (SuiteResult) to the Giskard Hub.

Related Issue

ENG-1568

Type of Change

  • 📚 Examples / docs / tutorials / dependencies update
  • 🔧 Bug fix (non-breaking change which fixes an issue)
  • 🥂 Improvement (non-breaking change which improves an existing feature)
  • 🚀 New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to change)
  • 🔐 Security fix

henchaves added 3 commits May 21, 2026 09:29
…eresult-to-the-hub-as-a-local

Resolved core/result.py: keep both SuiteResult.to_hub_format() (feature) and
group_by()/print_report() (from main); TestCaseResult.trace_index preserved.
(--no-verify: pre-commit pyright flags pre-existing reportInvalidTypeForm errors
in scenarios/runner.py that also exist on main.)

linear Bot commented Jun 24, 2026

ENG-1568

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces a trace_index field to TestCaseResult to attribute check results to specific interactions, and adds a to_hub_format method to export suite results for the Giskard Hub. Additionally, skipped check results now include extra details. Feedback on the changes highlights a bug in runner.py where trace_index is incorrectly calculated if a step adds no interactions but previous steps did, and provides a suggestion to fix this by comparing interaction counts before and after the step.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines 136 to +137
trace = await trace.with_interactions(*step.interacts)
trace_index = len(trace.interactions) - 1 if trace.interactions else None

Contributor

high

If a step does not add any interactions (i.e., step.interacts is empty), but previous steps have already added interactions to the trace, trace.interactions will not be empty. In this case, trace_index will incorrectly be set to the index of the last interaction from the previous step instead of None.

To fix this, we should compare the number of interactions before and after applying the step's interactions.

Suggested change
- trace = await trace.with_interactions(*step.interacts)
- trace_index = len(trace.interactions) - 1 if trace.interactions else None
+ prev_len = len(trace.interactions)
+ trace = await trace.with_interactions(*step.interacts)
+ trace_index = len(trace.interactions) - 1 if len(trace.interactions) > prev_len else None
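
To make the difference concrete, here is a minimal, self-contained sketch that runs both index computations over the same sequence of steps. FakeTrace is a hypothetical stand-in written for this illustration, not the project's real trace class; it only assumes that with_interactions() is awaitable and returns a new trace with the given interactions appended, as the reviewed lines suggest.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeTrace:
    """Hypothetical stand-in for the real trace object (assumption)."""
    interactions: tuple = ()

    async def with_interactions(self, *new):
        # Return a new trace with the given interactions appended.
        return FakeTrace(self.interactions + tuple(new))


async def index_original(trace, interacts):
    # Logic as currently written in the PR.
    trace = await trace.with_interactions(*interacts)
    return trace, (len(trace.interactions) - 1 if trace.interactions else None)


async def index_suggested(trace, interacts):
    # Logic from the suggested change: only index steps that grew the trace.
    prev_len = len(trace.interactions)
    trace = await trace.with_interactions(*interacts)
    grew = len(trace.interactions) > prev_len
    return trace, (len(trace.interactions) - 1 if grew else None)


async def run(index_fn, steps):
    # Apply each step in order and collect the trace_index computed for it.
    trace, indices = FakeTrace(), []
    for interacts in steps:
        trace, idx = await index_fn(trace, interacts)
        indices.append(idx)
    return indices


# The second step adds no interactions.
steps = [["hello"], [], ["bye"]]
original = asyncio.run(run(index_original, steps))
suggested = asyncio.run(run(index_suggested, steps))
print(original)   # [0, 0, 1]
print(suggested)  # [0, None, 1]
```

With the original logic, the empty second step inherits index 0 from the first step; with the suggested change it gets None.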

1 participant