feat(checks): add Bias LLM judge check by Kushagra651 · Pull Request #2440 · Giskard-AI/giskard-oss

Kushagra651 · 2026-05-09T13:14:57Z

What does this PR do?

Adds a Bias built-in LLM check that detects stereotyping, discrimination,
and unfair representation across configurable demographic dimensions.

Why?

Closes #2366 — bias detection is a core Giskard mission and no built-in check existed.

How?

Follows the exact pattern of the existing Toxicity check:

- Subclasses BaseLLMCheck, registered as "bias"
- Jinja2 prompt at prompts/judges/bias.j2
- Supports protected_attributes: list[str] | None for filtering
- Supports context_key for evaluating relative bias against input
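
As a rough illustration of the configuration surface listed above, here is a minimal sketch that uses a plain dataclass rather than the real BaseLLMCheck. The names BiasConfig and effective_attributes are hypothetical. The default key and the attribute list are copied from the diff in this PR, while the fallback to the defaults when protected_attributes is None is an assumption about the intended behaviour.

```python
from dataclasses import dataclass
from typing import Optional

# Copied from the diff in this PR.
DEFAULT_PROTECTED_ATTRIBUTES: list[str] = [
    "gender",
    "race",
    "age",
    "religion",
    "nationality",
    "sexual_orientation",
    "socioeconomic_status",
    "disability",
]


@dataclass
class BiasConfig:
    """Hypothetical stand-in for the check's fields (not the Giskard class)."""

    key: str = "trace.last.outputs"
    protected_attributes: Optional[list[str]] = None
    context_key: Optional[str] = None

    def effective_attributes(self) -> list[str]:
        # Assumption: None means "evaluate every default dimension".
        if self.protected_attributes is None:
            return list(DEFAULT_PROTECTED_ATTRIBUTES)
        return list(self.protected_attributes)
```

Returning a copy in both branches keeps callers from mutating the shared default list by accident.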

Testing

- 14 unit tests in tests/builtin/test_bias.py
- All 4 acceptance criteria from the issue covered

Fixes #2366

Kushagra651 · 2026-05-09T13:18:39Z

Hi @kevinmessiaen — I've opened this draft PR for issue #2366. Could you please add the safe for build label so CI can run? Thank you!

gemini-code-assist

Code Review

This pull request introduces a new Bias check to the Giskard checks library, designed to detect stereotyping, discrimination, and unfair representation in AI agent responses. The implementation includes the Bias class, a Jinja2 prompt template, and comprehensive unit tests. Feedback was provided regarding the get_inputs method in bias.py, specifically pointing out that the current logic does not correctly handle NoMatch or None values during data resolution, which could result in passing incorrect string representations to the LLM judge.

gemini-code-assist · 2026-05-09T13:20:03Z

+        context: str | None = None
+        if self.context_key is not None:
+            resolved = provided_or_resolve(
+                trace, key=self.context_key, value=provide_not_none(None)
+            )
+            if not isinstance(resolved, NoMatch):
+                context = str(resolved)
+
+        return {
+            "trace": trace,
+            "output": str(
+                provided_or_resolve(
+                    trace,
+                    key=self.key,
+                    value=provide_not_none(self.output),
+                )
+            ),
+            "protected_attributes": attributes,
+            "context": context,
+        }


The current implementation of get_inputs has a few issues with data resolution:

NoMatch handling for output: If provided_or_resolve fails to find the output (e.g., the JSONPath in key is invalid for the trace), it returns a NoMatch object. Calling str() on it (lines 151-157) will pass the string representation of the NoMatch object to the LLM judge, which is incorrect. The check should fail if the required output cannot be resolved.

None handling for context: If the resolved context is None (e.g., a null value in the trace), str(resolved) results in the string "None". In the Jinja template, {% if context %} will evaluate to true for the string "None", causing the LLM to see <CONTEXT>None</CONTEXT>, which is likely unintended.

I suggest refactoring the resolution logic to handle these cases explicitly.

        # Resolve context if context_key is provided
        context: str | None = None
        if self.context_key is not None:
            resolved = provided_or_resolve(
                trace, key=self.context_key, value=provide_not_none(None)
            )
            if not isinstance(resolved, NoMatch) and resolved is not None:
                context = str(resolved)

        # Resolve output
        resolved_output = provided_or_resolve(
            trace,
            key=self.key,
            value=provide_not_none(self.output),
        )
        if isinstance(resolved_output, NoMatch) or resolved_output is None:
            raise ValueError(f"Could not resolve output for bias check using key '{self.key}'")

        return {
            "trace": trace,
            "output": str(resolved_output),
            "protected_attributes": attributes,
            "context": context,
        }
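
To make the two pitfalls in this review comment concrete, here is a small self-contained sketch. The NoMatch class below is a hypothetical stand-in for the sentinel mentioned in the review, and resolve_context and resolve_output are illustrative helper names, not functions from giskard-checks. The sketch only shows the guard logic, not the real trace resolution.

```python
from typing import Any, Optional


class NoMatch:
    """Hypothetical stand-in for the sentinel returned on a failed lookup."""


def resolve_context(resolved: Any) -> Optional[str]:
    # Both a failed lookup and an explicit null map to "no context".
    if isinstance(resolved, NoMatch) or resolved is None:
        return None
    return str(resolved)


def resolve_output(resolved: Any, key: str) -> str:
    # The output is required, so an unresolved value is an error.
    if isinstance(resolved, NoMatch) or resolved is None:
        raise ValueError(f"Could not resolve output for bias check using key '{key}'")
    return str(resolved)


# The pitfall: str(None) is the non-empty string "None", which is truthy,
# so a template guard such as {% if context %} would not skip it.
assert str(None) == "None" and bool(str(None)) is True

# With the guards in place, missing values never reach the prompt as text.
assert resolve_context(None) is None
assert resolve_context(NoMatch()) is None
```

Raising for the output but returning None for the context reflects the asymmetry in the review: the output is mandatory, while the context is optional.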

davidberenstein1957 · 2026-05-13T10:06:32Z

+DEFAULT_PROTECTED_ATTRIBUTES: list[str] = [
+    "gender",
+    "race",
+    "age",
+    "religion",
+    "nationality",
+    "sexual_orientation",
+    "socioeconomic_status",
+    "disability",
+]


What did you base this on? Should we add more categories or descriptions to make it more explicit?

These categories are based on commonly recognised protected attributes in AI fairness literature — specifically aligned with the EU AI Act's list of prohibited discrimination grounds and DeepEval's BiasMetric categories. Happy to add more explicit descriptions per attribute if that would help (e.g. what counts as gender bias vs race bias). Would a Literal type with docstring per value work, or do you prefer keeping it as plain strings?

Hi, I would say it would be nice to add the specific references and files this list is derived from.

Thanks! Here are the specific references:

- EU AI Act, Article 5 & Annex III: lists protected characteristics including sex, race, ethnicity, religion, disability, age, and sexual orientation as prohibited discrimination grounds
- DeepEval BiasMetric (https://docs.confident-ai.com/docs/metrics-bias): uses gender, religion, race, politics as core categories
- ISO/IEC 24368:2022: AI fairness standard referencing demographic attributes

I can add these as inline comments above DEFAULT_PROTECTED_ATTRIBUTES in the code if that works.

davidberenstein1957 · 2026-05-13T10:07:00Z

+        default="trace.last.outputs",
+        description="JSONPath expression to extract the output to evaluate from the trace.",
+    )
+    protected_attributes: list[str] | None = Field(


Do you feel there is a way to add more nuance to this?

Good point — one way to add nuance would be to support a severity_threshold (e.g. ignore minor imprecision, only flag clear stereotyping) or allow per-attribute custom descriptions so users can tailor what "gender bias" means in their context. Would either direction align with what you had in mind?

How do you see this severity_threshold working reliably in an LLM setting?
The per-attribute descriptions could work too, but how would you integrate them?

Good pushback — severity_threshold is tricky in an LLM setting because the model's confidence isn't reliably calibrated, so a numeric threshold would be arbitrary.
Per-attribute descriptions are more practical. I'd integrate them as an optional attribute_descriptions: dict[str, str] | None field — if provided, the value overrides the generic description for that attribute in the Jinja template. For example:
attribute_descriptions={"gender": "Look for assumptions about professional roles based on gender"}
Would you like me to implement this instead?
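
One way the proposed attribute_descriptions override could work is sketched below. The helper name describe_attributes and the GENERIC_DESCRIPTION wording are hypothetical, since the actual template text in bias.j2 is not shown in this thread. The sketch only illustrates "use the override if present, otherwise the generic description" and produces lines that a template could then iterate over.

```python
from typing import Optional

# Hypothetical generic wording; the real template text is not shown in this thread.
GENERIC_DESCRIPTION = "Look for stereotyping or unfair treatment related to {attribute}."


def describe_attributes(
    attributes: list[str],
    attribute_descriptions: Optional[dict[str, str]] = None,
) -> list[str]:
    """Return one prompt line per attribute, preferring user overrides."""
    overrides = attribute_descriptions or {}
    lines = []
    for attribute in attributes:
        description = overrides.get(
            attribute, GENERIC_DESCRIPTION.format(attribute=attribute)
        )
        lines.append(f"- {attribute}: {description}")
    return lines


lines = describe_attributes(
    ["gender", "age"],
    {"gender": "Look for assumptions about professional roles based on gender"},
)
print("\n".join(lines))
# - gender: Look for assumptions about professional roles based on gender
# - age: Look for stereotyping or unfair treatment related to age.
```

Keys in attribute_descriptions that are not in the attribute list are silently ignored here. A real implementation might prefer to reject them so that typos surface early.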

davidberenstein1957 · 2026-05-13T10:07:44Z

What did you base this prompt on, and do you have any references? It would be great to understand how it was composed and how it is meant to capture bias.

The prompt structure was inspired by DeepEval's BiasMetric evaluation criteria and the Giskard red-teaming bias/fairness documentation. The five bias types (stereotyping, unfair generalisation, exclusionary language, differential treatment, contextual endorsement) are drawn from academic fairness literature. Happy to add a comment block at the top of the template citing these references if that would be useful.

Can you specifically mention the URLs and reasoning?

The prompt was composed based on:

- DeepEval BiasMetric: https://docs.confident-ai.com/docs/metrics-bias
- Giskard bias/fairness red-teaming docs: https://docs.giskard.ai/en/stable/knowledge/key_vulnerabilities/ethics/index.html
- Blodgett et al. (2020), "Language (Technology) is Power": https://aclanthology.org/2020.acl-main.485 (academic taxonomy of bias types in NLP)

I can add these as a comment block at the top of bias.j2 for traceability.

Kushagra651 · 2026-05-24T18:37:24Z

Hi @kevinmessiaen @davidberenstein1957 — just following up. Could you add the safe for build label so CI can run? Happy to address any further feedback once the checks are green. Thanks!

davidberenstein1957

Hi, I added some nuance and follow-ups.

github-actions Bot added the Scope: Checks label May 9, 2026

Kushagra651 marked this pull request as ready for review May 9, 2026 13:19

gemini-code-assist Bot reviewed May 9, 2026

davidberenstein1957 reviewed May 13, 2026

Kushagra651 added 2 commits June 3, 2026 17:42

feat(checks): add Bias LLM judge check

59f4a7d

Supports protected_attributes and context_key per issue spec. Fixes Giskard-AI#2366 git add libs/giskard-checks/src/giskard/checks/__init__.py#

fix(checks): handle NoMatch and None in Bias.get_inputs

8841753

- Raise ValueError if output cannot be resolved
- Guard against str(None) being passed as context

Addresses review feedback from gemini-code-assist

Kushagra651 force-pushed the feat/bias-check branch from f33d22e to 8841753 on June 3, 2026 12:20

davidberenstein1957 reviewed Jun 3, 2026
