Harden read path: base-URL host allowlist (SSRF) and error-body whitelisting#60
Harden read path: base-URL host allowlist (SSRF) and error-body whitelisting#60jgalea wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
🚩 Docs client getBaseUrl() reads env var directly, bypassing config-time host allowlist validation
The host allowlist validation in validateConfig() (src/utils/config.ts:206-211) validates config.helpscout.docsBaseUrl, which is set at module initialization from process.env.HELPSCOUT_DOCS_BASE_URL. However, HelpScoutDocsClient.getBaseUrl() (src/utils/helpscout-docs-client.ts:122) reads process.env.HELPSCOUT_DOCS_BASE_URL directly at request time: const baseUrl = process.env.HELPSCOUT_DOCS_BASE_URL || config.helpscout.docsBaseUrl;. If the env var were changed after validateConfig() runs, the runtime URL could differ from the validated one, bypassing the allowlist. This is unlikely in typical deployments (env vars are set before startup), and this pattern pre-exists the PR, but it's worth noting for the security-conscious reviewer.
(Refers to line 122)
Was this helpful? React with 👍 or 👎 to provide feedback.
| @@ -448,7 +464,9 @@ export class HelpScoutClient { | |||
| message: `Help Scout API validation error: ${responseData.message || 'Invalid request data'}`, | |||
There was a problem hiding this comment.
🟡 Top-level error message interpolates upstream responseData.message without the 200-char truncation applied by safeApiResponse
The PR adds safeApiResponse() to whitelist and truncate fields from upstream Help Scout error bodies (capping message to 200 chars in details.apiResponse). However, both the 422 handler (line 464) and the general 4xx handler (line 479) also interpolate responseData.message directly into the top-level message field without any truncation. This means a 500-char upstream message is correctly truncated in details.apiResponse.message but appears in full in the top-level message. The test at line 471 only asserts transformed.details.apiResponse.message has length 200 — it never checks transformed.message, which would be 529 chars. This defeats the stated PII-defense goal if the upstream message field ever echoes user input.
Prompt for agents
In src/utils/helpscout-client.ts, the transformError method has two places where responseData.message is interpolated into the top-level error message without the 200-char truncation that safeApiResponse applies:
1. Line 464 (422 handler): message: Help Scout API validation error: ${responseData.message || 'Invalid request data'}
2. Line 479 (4xx handler): message: Help Scout API client error: ${responseData.message || 'Invalid request'}
Similarly, in src/utils/helpscout-docs-client.ts, the transformError method uses apiMessage (derived from responseData.error or responseData.message without truncation) in the top-level message field at lines 195 and 216.
To fix, truncate the upstream message before interpolating into the top-level message field. For example, extract a helper like:
const safeMsg = typeof responseData.message === 'string' ? responseData.message.slice(0, 200) : undefined;
and use safeMsg in the template literal. Apply the same treatment to apiMessage in the docs client.
Was this helpful? React with 👍 or 👎 to provide feedback.
This PR adds two backward-compatible security hardenings to the read path. Neither changes tool behaviour, adds write capability, or alters any default that an existing deployment depends on; the server stays read-only.
Host allowlist on the base URLs (SSRF defense).
HELPSCOUT_BASE_URLandHELPSCOUT_DOCS_BASE_URLwere only checked for HTTPS. Because the OAuth2 bearer token (and the Docs basic-auth key) is sent on every authenticated request, a repointed base URL would exfiltrate those credentials to an attacker host. Validation now constrains the API base URL toapi.helpscout.net/api.helpscout.comand the Docs base URL todocsapi.helpscout.net. It is default-closed to the real Help Scout hosts and overridable for self-hosted or proxy setups viaHELPSCOUT_ALLOWED_API_HOSTSandHELPSCOUT_ALLOWED_DOCS_HOSTS. The default config already points at these hosts, so existing setups are unaffected. (This also adds the HTTPS check the Docs base URL was missing at config-validation time.)Error-body field whitelisting. On an upstream error, Help Scout sometimes echoes the submitted input verbatim, which would flow customer PII into the LLM context and anywhere that context is logged. The API client previously propagated
responseData.errors || responseData(422) and the fullresponseData(other 4xx); the Docs client propagated the full body on 401/403/404/4xx. Both now surface only a whitelisted set: acodeplus a length-capped (<=200 char)message(the Docs side also keeps a cappederror). The human-readable error message andcodeare unchanged.Each change has focused Jest tests covering both the blocked path and the allowed/override path. type-check, lint, and the full test suite pass.