Skip to content

feat(examples): add chatbot scan example#2472

Open
kevinmessiaen wants to merge 3 commits into
mainfrom
docs/add-examples
Open

feat(examples): add chatbot scan example#2472
kevinmessiaen wants to merge 3 commits into
mainfrom
docs/add-examples

Conversation

@kevinmessiaen

Copy link
Copy Markdown
Member

Summary

  • Adds examples/chatbot_scan/ — an end-to-end example showing how to scan a stateful LLM chatbot with a generated suite
  • Chatbot is backed by any LiteLLM-compatible model via acompletion; conversation history is threaded through a typed LLMTrace subclass
  • .env.example provided; python-dotenv loads credentials so no key is ever inlined in the run command
  • Suite is generated with generate_suite, run against the chatbot target, and the full SuiteResult is serialised to result.json

Test plan

  • Copy .env.example.env, set OPENAI_API_KEY
  • uv run python examples/chatbot_scan/chatbot_scan.py completes without error
  • Pass rate is printed to stdout
  • result.json is written to examples/chatbot_scan/

🤖 Generated with Claude Code

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new example for scanning stateful chatbots using the Giskard LLM response API, including a sample script, environment configuration, and the addition of the python-dotenv dependency. The review feedback focuses on improving the robustness and readability of the example code, specifically by ensuring cleaner string interpolation for message objects and adding safety checks to handle empty LLM responses.

Comment on lines +48 to +50
return "\n".join(
f"[user]: {i.inputs}\n[assistant]: {i.outputs}" for i in self.interactions
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

In _repr_prompt_, interpolating the UserMessage and AssistantMessage objects directly into the string will likely result in a verbose representation (e.g., content='...' role='user') because they are Pydantic models. It is better to access the .content attribute for a cleaner prompt representation.

Suggested change
return "\n".join(
f"[user]: {i.inputs}\n[assistant]: {i.outputs}" for i in self.interactions
)
return "\n".join(
f"[user]: {i.inputs.content}\n[assistant]: {i.outputs.content}" for i in self.interactions
)


result = await acompletion(MODEL, [_SYSTEM_MESSAGE] + trace.messages + [inputs])

return result.choices[0].message

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Accessing result.choices[0] without checking if choices is non-empty can lead to an IndexError if the LLM returns an empty response. Adding a check improves the robustness of the example, especially when dealing with potential refusals or API-specific edge cases.

Suggested change
return result.choices[0].message
if not result.choices:
raise RuntimeError("The LLM returned an empty response (no choices).")
return result.choices[0].message

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new example script, chatbot_scan.py, which demonstrates how to scan a stateful chatbot using the Giskard LLM response API. The changes include a custom LLMTrace implementation, a sample environment configuration file, and the addition of the python-dotenv dependency. Feedback for this PR suggests improving the readability of trace representations by accessing message content directly and implementing a safety check for LLM response choices to prevent potential index errors.

if not self.interactions:
return "**No interactions yet**"
return "\n".join(
f"[user]: {i.inputs}\n[assistant]: {i.outputs}" for i in self.interactions

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The string interpolation of i.inputs and i.outputs will likely produce a verbose Pydantic model representation (e.g., UserMessage(role='user', content='...')) instead of just the text content. Accessing the .content attribute directly ensures a cleaner and more readable trace in the generated report.

Suggested change
f"[user]: {i.inputs}\n[assistant]: {i.outputs}" for i in self.interactions
f"[user]: {i.inputs.content}\n[assistant]: {i.outputs.content}" for i in self.interactions


result = await acompletion(MODEL, [_SYSTEM_MESSAGE] + trace.messages + [inputs])

return result.choices[0].message

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Accessing result.choices[0] directly can lead to an IndexError if the LLM provider returns an empty list of choices (e.g., due to content filtering or other API-side issues). It is safer to verify that choices is not empty before accessing it.

    if not result.choices:\n        raise RuntimeError("LLM returned no choices. Check for content filtering or API errors.")\n    return result.choices[0].message

pierlj
pierlj previously approved these changes May 20, 2026

@pierlj pierlj left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Base automatically changed from feature/eng-1551-adversarial-generation-stereotypes-and-discrimination-probe to main May 29, 2026 07:51
@kevinmessiaen kevinmessiaen dismissed pierlj’s stale review May 29, 2026 07:51

The base branch was changed.

@kevinmessiaen kevinmessiaen requested a review from pierlj June 3, 2026 02:17
@kevinmessiaen kevinmessiaen marked this pull request as ready for review June 3, 2026 02:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants