Add faithfulness check by DarioDiPalma-DDP · Pull Request #2484 · Giskard-AI/giskard-oss

DarioDiPalma-DDP · 2026-05-21T09:36:01Z

Description

Adds a built-in Faithfulness LLM-based check for evaluating whether a generated answer faithfully represents the provided source material.

The check is intended for RAG and source-grounded LLM workflows, where the answer should preserve the meaning, scope, nuance, and factual content of the source without distortion, misrepresentation, or unsupported claims.

Changes included:

Added the Faithfulness check.
Added the faithfulness.j2 judge prompt.
Registered and exported the new check.
Added tests for faithful, distorted, partially faithful, trace-based extraction, direct-value priority, and list-based source material.

Related Issue

Closes #2368

Type of Change

📚 Examples / docs / tutorials / dependencies update
🔧 Bug fix (non-breaking change which fixes an issue)
🥂 Improvement (non-breaking change which improves an existing feature)
🚀 New feature (non-breaking change which adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to change)
🔐 Security fix

Checklist

I've read the CODE_OF_CONDUCT.md document.
I've read the CONTRIBUTING.md guide.
I've written tests for all new methods and classes that I created.
I've written the docstring in NumPy format for all the methods and classes that I created or modified.
I've updated the uv.lock running uv lock (only applicable when pyproject.toml has been modified)

gemini-code-assist

Code Review

This pull request introduces a new Faithfulness check to the Giskard library, designed to evaluate whether an AI agent's response accurately represents provided source material. The implementation includes the Faithfulness class, a dedicated Jinja2 prompt template, and comprehensive unit tests. Feedback was provided regarding the get_inputs method to improve extensibility by calling the base class, enhance LLM readability by joining list-based sources with newlines, and ensure type consistency with the base class signature.

gemini-code-assist · 2026-05-21T09:37:28Z

+    async def get_inputs(self, trace: Trace[InputType, OutputType]) -> dict[str, str]:
+        """Build template variables from resolved inputs.
+
+        Parameters
+        ----------
+        trace : Trace
+            Trace for resolving inputs.
+
+        Returns
+        -------
+        dict[str, str]
+            Template variables with ``answer`` and ``source`` keys.
+        """
+        answer = provided_or_resolve(
+            trace,
+            key=self.answer_key,
+            value=provide_not_none(self.answer),
+        )
+
+        source: Any
+        if self.source is not None:
+            source = self.source
+        elif self.source_key is not None:
+            source = provided_or_resolve(trace, key=self.source_key)
+        else:
+            source = ""
+
+        return {
+            "answer": str(answer),
+            "source": str(source),
+        }


The get_inputs implementation can be improved in several ways:

Extensibility: It should call await super().get_inputs(trace) to include base inputs (like the trace object itself). This ensures that users can reference the trace in custom prompt templates if they choose to override the default prompt.

List Handling: Since source can be a list[str], using str(source) results in a Python list representation (e.g., ['doc1', 'doc2']) being injected into the prompt. Joining the list with double newlines provides a much more natural and effective format for LLM evaluation.

Type Consistency: The return type should be dict[str, Any] to match the base class signature and accommodate the trace object in the returned dictionary.

    async def get_inputs(self, trace: TraceType) -> dict[str, Any]:
        """Build template variables from resolved inputs.

        Parameters
        ----------
        trace : Trace
            Trace for resolving inputs.

        Returns
        -------
        dict[str, Any]
            Template variables with ``answer`` and ``source`` keys.
        """
        answer = provided_or_resolve(
            trace,
            key=self.answer_key,
            value=provide_not_none(self.answer),
        )

        source: Any
        if self.source is not None:
            source = self.source
        elif self.source_key is not None:
            source = provided_or_resolve(trace, key=self.source_key)
        else:
            source = ""

        # Join list-based sources for better LLM readability
        if isinstance(source, list):
            source = "\n\n".join(map(str, source))

        inputs = await super().get_inputs(trace)
        inputs.update({
            "answer": str(answer),
            "source": str(source),
        })
        return inputs
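To make the List Handling and Extensibility points concrete, here is a minimal standalone sketch. It does not use the Giskard API: `BaseCheck`, `FaithfulnessSketch`, and `format_source` are hypothetical stand-ins, and the assumption that the base class returns the trace under a `trace` key is taken only from the review comment above.

```python
import asyncio
from typing import Any


class BaseCheck:
    """Hypothetical stand-in for the real base class (not Giskard's API)."""

    async def get_inputs(self, trace: Any) -> dict[str, Any]:
        # Assumption: the base implementation exposes the trace to templates.
        return {"trace": trace}


def format_source(source: Any) -> str:
    """Join list-based sources with blank lines; stringify anything else."""
    if isinstance(source, list):
        return "\n\n".join(map(str, source))
    return str(source)


class FaithfulnessSketch(BaseCheck):
    """Hypothetical check that merges its own inputs into the base inputs."""

    def __init__(self, answer: str, source: Any) -> None:
        self.answer = answer
        self.source = source

    async def get_inputs(self, trace: Any) -> dict[str, Any]:
        # Start from the base inputs so custom templates can still use them.
        inputs = await super().get_inputs(trace)
        inputs.update(
            {
                "answer": str(self.answer),
                "source": format_source(self.source),
            }
        )
        return inputs


docs = ["doc1", "doc2"]

# What str() injects into the prompt: a Python list literal.
naive = str(docs)

# What the joined form injects: one document per paragraph.
check = FaithfulnessSketch(answer="an answer", source=docs)
inputs = asyncio.run(check.get_inputs(trace={"id": 1}))
```

Here `naive` is the list literal `['doc1', 'doc2']`, while `inputs["source"]` holds the two documents separated by a blank line, and `inputs["trace"]` is still available from the base call.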

Add faithfulness check

7025e78

github-actions Bot added the Scope: Checks label May 21, 2026

gemini-code-assist Bot reviewed May 21, 2026

DarioDiPalma-DDP wants to merge 1 commit into Giskard-AI:main from DarioDiPalma-DDP:feat/faithfulness-check
