feat(examples): add chatbot scan example by kevinmessiaen · Pull Request #2472 · Giskard-AI/giskard-oss

kevinmessiaen · 2026-05-20T05:41:13Z

Summary

Adds examples/chatbot_scan/ — an end-to-end example showing how to scan a stateful LLM chatbot with a generated suite
Chatbot is backed by any LiteLLM-compatible model via acompletion; conversation history is threaded through a typed LLMTrace subclass
.env.example provided; python-dotenv loads credentials so no key is ever inlined in the run command
Suite is generated with generate_suite, run against the chatbot target, and the full SuiteResult is serialised to result.json
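
To illustrate how conversation history can be threaded through a typed trace, here is a minimal sketch using only stdlib dataclasses. The names `Message`, `Interaction`, `ChatTrace`, `SYSTEM`, and `build_request` are hypothetical stand-ins, not the classes from this PR or from Giskard; the sketch only mirrors the `[_SYSTEM_MESSAGE] + trace.messages + [inputs]` pattern visible in the diff.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    role: str
    content: str


@dataclass
class Interaction:
    inputs: Message   # the user turn
    outputs: Message  # the assistant reply to that turn


@dataclass
class ChatTrace:
    """Hypothetical stand-in for a typed trace; not the library's class."""

    interactions: list[Interaction] = field(default_factory=list)

    @property
    def messages(self) -> list[Message]:
        # Flatten the turns into alternating user/assistant messages, oldest first.
        flat: list[Message] = []
        for turn in self.interactions:
            flat.extend([turn.inputs, turn.outputs])
        return flat


# Assumed system prompt, for illustration only.
SYSTEM = Message("system", "You are a helpful assistant.")


def build_request(trace: ChatTrace, new_input: Message) -> list[Message]:
    # System prompt first, then prior history, then the new user message.
    return [SYSTEM] + trace.messages + [new_input]


trace = ChatTrace()
trace.interactions.append(
    Interaction(Message("user", "Hi"), Message("assistant", "Hello!"))
)
request = build_request(trace, Message("user", "What did I just say?"))
print([m.role for m in request])
```

Keeping the history on the trace rather than in a module-level variable means each scanned conversation carries its own state, which is presumably why the example takes this route.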

Test plan

Copy .env.example → .env, set OPENAI_API_KEY
uv run python examples/chatbot_scan/chatbot_scan.py completes without error
Pass rate is printed to stdout
result.json is written to examples/chatbot_scan/

🤖 Generated with Claude Code

gemini-code-assist

Code Review

This pull request introduces a new example for scanning stateful chatbots using the Giskard LLM response API, including a sample script, environment configuration, and the addition of the python-dotenv dependency. The review feedback focuses on improving the robustness and readability of the example code, specifically by ensuring cleaner string interpolation for message objects and adding safety checks to handle empty LLM responses.

gemini-code-assist · 2026-05-20T05:42:42Z

+        return "\n".join(
+            f"[user]: {i.inputs}\n[assistant]: {i.outputs}" for i in self.interactions
+        )


In _repr_prompt_, interpolating the UserMessage and AssistantMessage objects directly into the string will likely result in a verbose representation (e.g., content='...' role='user') because they are Pydantic models. It is better to access the .content attribute for a cleaner prompt representation.

Suggested change

-        return "\n".join(
-            f"[user]: {i.inputs}\n[assistant]: {i.outputs}" for i in self.interactions
-        )
+        return "\n".join(
+            f"[user]: {i.inputs.content}\n[assistant]: {i.outputs.content}" for i in self.interactions
+        )
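
The difference the comment above describes can be reproduced with a small sketch. It uses a stdlib dataclass named `UserMessage` as a stand-in for the real message model (an assumption: the actual class is a Pydantic model, whose exact string form differs), but the effect is the same: interpolating the object yields its field dump, while `.content` yields only the text.

```python
from dataclasses import dataclass


@dataclass
class UserMessage:
    # Stand-in for the real message model; only the fields used here.
    role: str
    content: str


msg = UserMessage(role="user", content="Hello")

# Interpolating the object falls back to its repr-style string.
verbose = f"[user]: {msg}"
# Interpolating the attribute gives just the message text.
clean = f"[user]: {msg.content}"

print(verbose)
print(clean)
```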

gemini-code-assist · 2026-05-20T05:42:42Z

+
+    result = await acompletion(MODEL, [_SYSTEM_MESSAGE] + trace.messages + [inputs])
+
+    return result.choices[0].message


Accessing result.choices[0] without checking if choices is non-empty can lead to an IndexError if the LLM returns an empty response. Adding a check improves the robustness of the example, especially when dealing with potential refusals or API-specific edge cases.

Suggested change

-    return result.choices[0].message
+    if not result.choices:
+        raise RuntimeError("The LLM returned an empty response (no choices).")
+    return result.choices[0].message
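
A minimal sketch of the guard suggested above, runnable without any API key. The response objects are built with `types.SimpleNamespace` purely as stand-ins for a completion result, and `first_message` is a hypothetical helper name, not something from the PR.

```python
from types import SimpleNamespace


def first_message(result):
    # Fail with a descriptive error instead of an IndexError on empty choices.
    if not result.choices:
        raise RuntimeError("The LLM returned an empty response (no choices).")
    return result.choices[0].message


# Stand-in responses: one with a single choice, one with none.
ok = SimpleNamespace(choices=[SimpleNamespace(message="hi")])
empty = SimpleNamespace(choices=[])

print(first_message(ok))
try:
    first_message(empty)
except RuntimeError as exc:
    print(f"caught: {exc}")
```

Raising a `RuntimeError` with a clear message makes a failed scan easier to diagnose than a bare `IndexError` from deep inside the target function.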

gemini-code-assist

Code Review

This pull request introduces a new example script, chatbot_scan.py, which demonstrates how to scan a stateful chatbot using the Giskard LLM response API. The changes include a custom LLMTrace implementation, a sample environment configuration file, and the addition of the python-dotenv dependency. Feedback for this PR suggests improving the readability of trace representations by accessing message content directly and implementing a safety check for LLM response choices to prevent potential index errors.

gemini-code-assist · 2026-05-20T05:43:55Z

+        if not self.interactions:
+            return "**No interactions yet**"
+        return "\n".join(
+            f"[user]: {i.inputs}\n[assistant]: {i.outputs}" for i in self.interactions


Interpolating i.inputs and i.outputs into the f-string will likely produce a verbose Pydantic field dump (e.g., role='user' content='...') instead of just the text content. Accessing the .content attribute directly gives a cleaner, more readable trace in the generated report.

Suggested change

-            f"[user]: {i.inputs}\n[assistant]: {i.outputs}" for i in self.interactions
+            f"[user]: {i.inputs.content}\n[assistant]: {i.outputs.content}" for i in self.interactions

gemini-code-assist · 2026-05-20T05:43:55Z

+
+    result = await acompletion(MODEL, [_SYSTEM_MESSAGE] + trace.messages + [inputs])
+
+    return result.choices[0].message


Accessing result.choices[0] directly can lead to an IndexError if the LLM provider returns an empty list of choices (e.g., due to content filtering or other API-side issues). It is safer to verify that choices is not empty before accessing it.

    if not result.choices:
        raise RuntimeError("LLM returned no choices. Check for content filtering or API errors.")
    return result.choices[0].message

pierlj

LGTM

The base branch was changed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

kevinmessiaen temporarily deployed to ci May 20, 2026 05:41 — with GitHub Actions Inactive

kevinmessiaen had a problem deploying to ci May 20, 2026 05:41 — with GitHub Actions Failure

gemini-code-assist Bot reviewed May 20, 2026

pierlj previously approved these changes May 20, 2026

Base automatically changed from feature/eng-1551-adversarial-generation-stereotypes-and-discrimination-probe to main May 29, 2026 07:51

kevinmessiaen and others added 3 commits June 2, 2026 15:25

feat(examples): add chatbot scan example using giskard.llm response API

03ea854

feat(examples): update chatbot scan to use acompletion and LLMTrace

f4e7299

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

chore(examples): remove duplicate pass rate print

0587445

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

kevinmessiaen force-pushed the docs/add-examples branch from 3da4761 to 0587445 Compare June 3, 2026 02:16