test(idn-hostname): reject fullwidth digits#927

Open
vtushar06 wants to merge 2 commits into
json-schema-org:mainfrom
vtushar06:idn-hostname-fullwidth-digits

Conversation

@vtushar06

@vtushar06 vtushar06 commented Jun 10, 2026

Contributor

Following the methodology I used for ipv4 (#907) and the earlier idn-hostname tests, I read RFC 5892 section 2.6 and found the current idn-hostname.json has no test for fullwidth digits in a label.

Fullwidth digits U+FF11-U+FF13 have the DISALLOWED property under IDNA2008, so a label made of them is not a valid U-label. UTS46 processors map them to ASCII 123 and accept; strict IDNA2008 rejects. This parallels the merged ipv4 fullwidth/astral digit test (#907).
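The mapping behaviour described above can be illustrated with the Python standard library alone. This is a rough sketch, not what any of the validators actually run: it uses NFKC normalization as a stand-in for the UTS46 mapping step, on the assumption that both fold fullwidth digits to their ASCII counterparts.

```python
import unicodedata

FULLWIDTH_123 = "\uff11\uff12\uff13"  # U+FF11 U+FF12 U+FF13

# Each fullwidth digit carries a compatibility decomposition to an ASCII digit.
for ch in FULLWIDTH_123:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), "->", unicodedata.decomposition(ch))

# NFKC folds the label to plain ASCII, which is why a mapping processor
# ends up validating "123" instead of the original code points.
mapped = unicodedata.normalize("NFKC", FULLWIDTH_123)
print(mapped == "123")
```

A strict IDNA2008 check looks at the original code points and never reaches the mapped form, which is where the two families of implementations diverge.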

Changes

  • Added 1 test case across draft7, draft2019-09, draft2020-12, and v1.
  • １２３ (U+FF11 U+FF12 U+FF13) is invalid.
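For readers who have not opened the diff, the added case presumably has roughly the following shape. This sketch builds it in Python and prints the JSON; the `description`, `data`, and `valid` keys follow the test suite's usual layout, but the description wording here is hypothetical and not copied from the PR.

```python
import json

# Hypothetical description text; only the data/valid pairing comes from the PR summary.
test_case = {
    "description": "fullwidth digits are not valid in a U-label",
    "data": "\uff11\uff12\uff13",  # U+FF11 U+FF12 U+FF13
    "valid": False,
}

# json.dumps escapes non-ASCII by default, so the code points stay visible in the output.
print(json.dumps(test_case, indent=4))
```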

Ecosystem Impact

  1. python-jsonschema 4.x: PASSES (rejects it). is_idn_hostname calls idna.encode, which raises InvalidCodepoint on U+FF11.
  2. libidn2, Go x/net/idna, Node (WHATWG), PHP idn_to_ascii, Rust idna, java.net.IDN, ICU4J, Guava: FAIL (accept it). UTS46 maps the fullwidth digits to ASCII and validates the mapped form. ICU4J, the reference UTS46 implementation, accepts it even with CHECK_BIDI and CHECK_CONTEXTJ enabled.
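The split between the two groups can be reproduced approximately in Python. A hedged sketch: the strict path uses the third-party `idna` package named in item 1 (skipped if it is not installed), while the lenient path uses Python's built-in `idna` codec, which implements IDNA2003 nameprep rather than UTS46 and is only an analogue for the mapping implementations in item 2. The helper names `lenient_accepts` and `strict_accepts` are made up for this example.

```python
try:
    import idna  # third-party IDNA2008 package; optional for this sketch
except ImportError:
    idna = None

LABEL = "\uff11\uff12\uff13"  # U+FF11 U+FF12 U+FF13


def lenient_accepts(label):
    """Mapping-style check using the stdlib IDNA2003 codec (maps, then validates)."""
    try:
        label.encode("idna")
        return True
    except UnicodeError:
        return False


def strict_accepts(label):
    """Strict IDNA2008 check; returns None when the idna package is unavailable."""
    if idna is None:
        return None
    try:
        idna.encode(label)
        return True
    except idna.IDNAError:  # InvalidCodepoint is a subclass of IDNAError
        return False


print("lenient:", lenient_accepts(LABEL))
print("strict: ", strict_accepts(LABEL))
```

The reproduction commands in the linked evidence repo remain the authoritative record of what each listed implementation does.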

RFC References

Reproduction commands and the idn-hostname cross-implementation matrix are in my evidence repo: https://github.com/vtushar06/JSON-Schema-format-test-Evidence/blob/main/idn-hostname.md

Related: #965

@vtushar06 vtushar06 requested a review from a team as a code owner June 10, 2026 14:46
Copilot AI review requested due to automatic review settings June 10, 2026 14:46

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds test coverage to ensure idn-hostname rejects RFC 5892–disallowed fullwidth digits across multiple JSON Schema draft test suites.

Changes:

  • Added a negative test case asserting fullwidth digits (U+FF11..U+FF13) are invalid for idn-hostname.
  • Applied the same test addition to v1, draft7, draft2019-09, and draft2020-12 optional format suites.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/v1/format/idn-hostname.json | Adds invalid-case test for fullwidth digits in idn-hostname. |
| tests/draft7/optional/format/idn-hostname.json | Mirrors the new invalid-case test for draft7 optional formats. |
| tests/draft2019-09/optional/format/idn-hostname.json | Mirrors the new invalid-case test for draft2019-09 optional formats. |
| tests/draft2020-12/optional/format/idn-hostname.json | Mirrors the new invalid-case test for draft2020-12 optional formats. |

Comment thread tests/v1/format/idn-hostname.json

@jviotti jviotti left a comment

Member

Awesome. Very interesting that some implementations fail on this!

3 participants