test(idn-hostname): reject zero width space #928

Open
vtushar06 wants to merge 2 commits into json-schema-org:main from vtushar06:idn-hostname-zero-width-space

vtushar06 (Contributor) commented Jun 10, 2026

Following the methodology I used for ipv4 and uuid, I read RFC 5892 section 2.6 and found the current idn-hostname.json has no test for a zero width space in a label.

U+200B (ZERO WIDTH SPACE) has the DISALLOWED property under IDNA2008. UTS46 processors treat it as ignorable and silently strip it, so the label a + U+200B + b becomes ab and is accepted. The result is an invisible character that changes the resolved name.
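The difference between the two behaviours can be sketched in a few lines of Python. This is only an illustrative model, not the logic of any real IDNA library: the two sets below are hypothetical, deliberately tiny stand-ins for the full IDNA2008 and UTS46 data tables, and the function names are made up for this example.

```python
from typing import Tuple

# Hypothetical one-element stand-ins for the real Unicode data tables.
IDNA2008_DISALLOWED = {"\u200b"}  # code points IDNA2008 marks DISALLOWED
UTS46_IGNORED = {"\u200b"}        # code points a UTS46 mapping step drops


def strict_accepts(label: str) -> bool:
    """Validate the label as written: any DISALLOWED code point rejects it."""
    return not any(ch in IDNA2008_DISALLOWED for ch in label)


def strip_then_accepts(label: str) -> Tuple[bool, str]:
    """Drop ignored code points first, then validate what is left."""
    mapped = "".join(ch for ch in label if ch not in UTS46_IGNORED)
    accepted = mapped != "" and strict_accepts(mapped)
    return accepted, mapped


label = "a\u200bb"
print(strict_accepts(label))      # False
print(strip_then_accepts(label))  # (True, 'ab')
```

The strip-then-validate path never sees the zero width space, which is why it accepts a label that strict validation rejects.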

Changes

  • Added 1 test case across draft7, draft2019-09, draft2020-12, and v1.
  • a + U+200B + b (a zero width space between two letters) is invalid.
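For readers unfamiliar with the suite, the rough shape of such a test case can be sketched as follows. The `description`/`data`/`valid` keys follow the test suite's usual layout, but the description string here is a hypothetical placeholder, not the wording used in this PR.

```python
import json

# Sketch of a single entry in a "tests" array.
# The description text is a placeholder assumed for illustration.
test_case = {
    "description": "zero width space (U+200B) inside a label is invalid",
    "data": "a\u200bb",  # three code points: 'a', U+200B, 'b'
    "valid": False,
}

# json.dumps escapes non-ASCII by default, so the invisible character
# stays visible in the file as the escape sequence \u200b.
rendered = json.dumps(test_case, indent=4)
print(rendered)
```

Writing the character as an escape sequence rather than as a raw invisible code point keeps the test data reviewable in a diff.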

Ecosystem Impact

  1. python-jsonschema 4.x: PASSES (rejects it). idna.encode raises InvalidCodepoint on U+200B.
  2. libidn2, Go x/net/idna, Node (WHATWG), PHP idn_to_ascii, Rust idna, java.net.IDN, ICU4J, Guava: FAIL (accept it). They strip U+200B as ignorable under UTS46 and then validate ab. The silent strip is also reported as GNU libidn2 GitLab issue #136.
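A minimal way to check the Python result locally might look like the sketch below. It assumes the third-party `idna` package is installed and returns `None` when it is not; the helper name `idna_accepts` is made up for this example, and the reproduction commands actually used for this PR live in the evidence repo.

```python
from typing import Optional

try:
    import idna  # third-party package, assumed to be installed
except ImportError:
    idna = None


def idna_accepts(hostname: str, uts46: bool = False) -> Optional[bool]:
    """Report whether idna.encode accepts the hostname.

    Returns None when the idna package is unavailable.
    With uts46=True the package applies UTS46 mapping before validating.
    """
    if idna is None:
        return None
    try:
        idna.encode(hostname, uts46=uts46)
    except idna.IDNAError:  # InvalidCodepoint is a subclass of IDNAError
        return False
    return True


if __name__ == "__main__":
    print("strict:", idna_accepts("a\u200bb"))
    print("uts46: ", idna_accepts("a\u200bb", uts46=True))
```

Comparing the two calls on the same input gives a quick view of strict validation versus UTS46 preprocessing within a single library.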

RFC References

Reproduction commands and the idn-hostname cross-implementation matrix are in my evidence repo: https://github.com/vtushar06/JSON-Schema-format-test-Evidence/blob/main/idn-hostname.md

Related: #965

Copilot AI review requested due to automatic review settings, June 10, 2026 14:46
vtushar06 requested a review from a team as a code owner, June 10, 2026 14:46

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds additional conformance coverage for the idn-hostname format by asserting that zero-width space (U+200B) is rejected across multiple JSON Schema draft test suites.

Changes:

  • Added a new invalid test case for a\u200bb (U+200B inside an IDN label) in the v1 suite.
  • Mirrored the same new invalid test case into draft7, draft2019-09, and draft2020-12 optional format suites.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/v1/format/idn-hostname.json | Adds U+200B “DISALLOWED” invalid-case coverage for idn-hostname. |
| tests/draft7/optional/format/idn-hostname.json | Mirrors the same U+200B invalid-case coverage for draft7 optional tests. |
| tests/draft2020-12/optional/format/idn-hostname.json | Mirrors the same U+200B invalid-case coverage for draft2020-12 optional tests. |
| tests/draft2019-09/optional/format/idn-hostname.json | Mirrors the same U+200B invalid-case coverage for draft2019-09 optional tests. |

Comment thread tests/v1/format/idn-hostname.json Outdated

jviotti (Member) left a comment

I double-checked, and this also looks valid to me.

3 participants