feat(checks): add consensus retries for LLM-based checks #2483

Open
harsh21234i wants to merge 2 commits into Giskard-AI:main from harsh21234i:feat/llm-check-consensus

@harsh21234i

Contributor

Closes #2372

Summary

  • add num_runs and consensus to BaseLLMCheck
  • support repeated LLM judge execution with majority, unanimous, and any consensus strategies
  • preserve individual run results in details["runs"] when multiple runs are used
  • include aggregated consensus metadata in result details
  • keep the default single-run behavior unchanged
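The three consensus strategies listed above can be illustrated with a small standalone sketch. This is not the code from the PR: the function name `aggregate_runs`, the shape of the returned dictionary, and the tie-breaking rule for `majority` (a tie counts as a failure) are assumptions made for illustration. Only `num_runs`, `consensus`, the strategy names, and `details["runs"]` come from the description.

```python
from typing import Literal

Consensus = Literal["majority", "unanimous", "any"]


def aggregate_runs(run_passed: list[bool], consensus: Consensus = "majority") -> dict:
    """Combine per-run pass/fail outcomes into one verdict (hypothetical helper)."""
    if not run_passed:
        raise ValueError("at least one run is required")

    num_runs = len(run_passed)
    num_passed = sum(run_passed)

    if consensus == "majority":
        # Assumption: a strict majority is required, so a tie fails.
        passed = num_passed * 2 > num_runs
    elif consensus == "unanimous":
        passed = num_passed == num_runs
    elif consensus == "any":
        passed = num_passed >= 1
    else:
        raise ValueError(f"unknown consensus strategy: {consensus!r}")

    return {
        "passed": passed,
        "details": {
            # Individual outcomes are preserved alongside the aggregate.
            "runs": list(run_passed),
            "consensus": consensus,
            "num_runs": num_runs,
            "num_passed": num_passed,
        },
    }
```

With a single run, all three strategies reduce to that run's own outcome, which is consistent with keeping the default single-run behavior unchanged.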

Testing

  • uv run -m pytest -q libs/giskard-checks/tests/builtin/test_base.py libs/giskard-checks/tests/builtin/test_judge.py libs/giskard-checks/tests/builtin/test_groundedness.py libs/giskard-checks/tests/builtin/test_conformity.py
  • uv run -m pytest -q libs/giskard-checks/tests/builtin
  • uv run ruff check libs/giskard-checks/src/giskard/checks/judges/base.py libs/giskard-checks/tests/builtin/test_base.py

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces multi-run support for LLM-based checks, allowing users to specify a number of runs and a consensus strategy (majority, unanimous, or any) to determine the final result. The implementation includes logic for aggregating results and selecting representative outputs for the final report. The review feedback suggests a performance optimization to execute the multiple LLM runs in parallel using asyncio.gather instead of the current sequential execution.
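The parallel-execution suggestion could look roughly like the sketch below. It is a generic illustration of `asyncio.gather`, not the PR's code: `fake_judge` is a hypothetical stand-in for a single LLM judge call, and `run_judge_concurrently` is an assumed helper name.

```python
import asyncio


async def fake_judge(run_index: int) -> bool:
    """Hypothetical stand-in for one LLM judge call; a real call would await a model."""
    await asyncio.sleep(0.01)  # simulate network latency
    return run_index % 2 == 0  # deterministic placeholder verdict


async def run_judge_concurrently(num_runs: int) -> list[bool]:
    """Start all runs at once; gather returns results in submission order."""
    return await asyncio.gather(*(fake_judge(i) for i in range(num_runs)))


if __name__ == "__main__":
    print(asyncio.run(run_judge_concurrently(3)))
```

Because the runs overlap, total wall-clock time is close to that of the slowest single call rather than the sum of all calls. The trade-off is that every request is issued up front, so provider rate limits may need to be considered.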

Comment thread libs/giskard-checks/src/giskard/checks/judges/base.py
Comment thread libs/giskard-checks/src/giskard/checks/judges/base.py Outdated
@harsh21234i

Contributor Author

Hey @kevinmessiaen, can you look into this?

Development

Successfully merging this pull request may close these issues.

Add retry/majority-voting for LLM-based checks