Fix correct pointer stride in Query predicate iteration #69

Merged
ObserverOfTime merged 2 commits into tree-sitter:master from mataku:feature/fix-native-query-parse on May 13, 2026

@mataku mataku commented May 12, 2026

Contributor

Summary

Possibly since v0.22.4, Query.<init> in nativeMain advances the TSQueryPredicateStep cursor by nargs raw bytes instead of (nargs + 1) * sizeOf<TSQueryPredicateStep>() bytes. This misses both the struct stride (8 bytes on arm64/x86_64) and the Done terminator. Any pattern with 2+ predicates then reads a misaligned step, which yields a bogus value_id and throws IndexOutOfBoundsException against captureNames / stringValues. The exception terminates the process via Kotlin/Native's unhandled-exception handler.

JVM and Android consumers are unaffected. They go through the JNI Java_..._Query path, which iterates predicates in C and never touches this Kotlin code.

Changes

The fix multiplies the advance by sizeOf<TSQueryPredicateStep>() and includes the Done terminator, matching the j += nargs + 1U index advance one line above.
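
To spell out the arithmetic behind this change, here is a minimal C sketch of the two advances. It is an illustration, not the Kotlin code from the PR: Step is a hypothetical stand-in for TSQueryPredicateStep (assumed to be two 32-bit fields, 8 bytes), and the three function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for TSQueryPredicateStep: two 32-bit fields. */
typedef struct {
    uint32_t type;
    uint32_t value_id;
} Step;

/* Assumption stated in the PR: the stride is 8 bytes on arm64/x86_64. */
static_assert(sizeof(Step) == 8, "stride assumed to be 8 bytes");

/* Byte offset applied by the buggy code: nargs taken as raw bytes. */
size_t buggy_advance_bytes(uint32_t nargs) {
    return (size_t)nargs;
}

/* Byte offset applied by the fix: nargs steps plus the Done terminator,
   each scaled by the struct size. */
size_t fixed_advance_bytes(uint32_t nargs) {
    return ((size_t)nargs + 1U) * sizeof(Step);
}

/* The same advance expressed in whole steps, as j += nargs + 1U does. */
size_t fixed_advance_steps(uint32_t nargs) {
    return fixed_advance_bytes(nargs) / sizeof(Step);
}
```

For nargs = 3, buggy_advance_bytes gives 3 bytes, while fixed_advance_bytes gives 32 bytes, which is exactly the 4 steps that the index j moves by.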

How to reproduce

fwcd/tree-sitter-kotlin/queries/highlights.scm lines 235–246 (commit f66d290) is the only block in that grammar's highlights query where a single pattern carries multiple predicates. It triggers the crash for any downstream Kotlin Multiplatform project that consumes tree-sitter-kotlin on an iOS / native target:

;   - Regex.fromLiteral("[abc]?")
(call_expression
    (navigation_expression
        ((simple_identifier) @_class
        (#eq? @_class "Regex"))                   ; predicate 1
        (navigation_suffix
            ((simple_identifier) @_function
            (#eq? @_function "fromLiteral"))))    ; predicate 2 — same pattern
    (call_suffix
        (value_arguments
            (value_argument
                (string_literal) @string.regex))))

(A single top-level S-expression rooted at call_expression is one pattern, so both #eq? predicates belong to it.)

ts_query_predicates_for_pattern returns eight steps for this pattern (each #eq? is String("eq?") + 2 args + Done):

[ String("eq?"), Capture(@_class),    String("Regex"),       Done,
  String("eq?"), Capture(@_function), String("fromLiteral"), Done ]
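
The layout above can be walked with a loop that hops nargs + 1 steps at a time. The following C sketch shows that traversal under stated assumptions: Step and the STEP_* constants are hypothetical stand-ins for TSQueryPredicateStep and its step types (Done = 0, Capture = 1, String = 2), and both function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the tree-sitter step types. */
enum { STEP_DONE = 0, STEP_CAPTURE = 1, STEP_STRING = 2 };

/* Hypothetical stand-in for TSQueryPredicateStep. */
typedef struct {
    uint32_t type;
    uint32_t value_id;
} Step;

static_assert(sizeof(Step) == 8, "stride assumed to be 8 bytes");

/* Number of steps before the next Done terminator, starting at `start`.
   This counts the predicate name as well as its arguments. */
size_t predicate_nargs(const Step *steps, size_t count, size_t start) {
    size_t n = 0;
    while (start + n < count && steps[start + n].type != STEP_DONE) {
        n++;
    }
    return n;
}

/* Count predicates by skipping nargs steps plus one Done terminator
   on every iteration. */
size_t count_predicates(const Step *steps, size_t count) {
    size_t j = 0;
    size_t predicates = 0;
    while (j < count) {
        size_t nargs = predicate_nargs(steps, count, j);
        predicates++;
        j += nargs + 1;
    }
    return predicates;
}
```

Applied to an eight-step array shaped like the listing above, predicate_nargs returns 3 at index 0 and again at index 4, and count_predicates returns 2.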

The buggy advance after processing the first predicate moves tokens only 3 bytes past the original cursor instead of the required (3 + 1) * sizeof(TSQueryPredicateStep) = 32 bytes. The second-iteration read of tokens[0] therefore straddles two adjacent steps and yields a junk value_id. With tree-sitter-kotlin's captureNames.size == 84, the corrupted value_id happens to land at 256, producing:

kotlin.IndexOutOfBoundsException: index: 256, size: 84
    at kfun:kotlin.collections.AbstractList.Companion#checkElementIndex(...)
    at kfun:kotlin.collections.ArrayList#get(kotlin.Int){}1:0
    at kfun:io.github.treesitter.ktreesitter.Query#<init>(io.github.treesitter.ktreesitter.Language;kotlin.String){}
    at kfun:io.github.treesitter.ktreesitter.Language#query(kotlin.String){}io.github.treesitter.ktreesitter.Query
    ...

The crash terminates the process with SIGABRT (Kotlin/Native terminateWithUnhandledException).

The values 256 and 84 are grammar-specific. Other grammars produce different numbers but the same IndexOutOfBoundsException, thrown against captureNames[someBogusId].
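
As a rough illustration of how a 3-byte advance can produce an index like 256, the C sketch below reads one 8-byte step at an arbitrary byte offset. Everything in it is an assumption for demonstration: Step is a hypothetical stand-in for TSQueryPredicateStep, read_step_at_byte is an invented helper, the step type values are taken to be Done = 0, Capture = 1, String = 2, and the machine is assumed to be little-endian. The PR does not confirm that this is the exact mechanism behind the reported value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for TSQueryPredicateStep. */
typedef struct {
    uint32_t type;      /* assumed: 0 = Done, 1 = Capture, 2 = String */
    uint32_t value_id;
} Step;

static_assert(sizeof(Step) == 8, "stride assumed to be 8 bytes");

/* Read one Step starting at an arbitrary byte offset, the way a cursor
   advanced by raw bytes would see it. memcpy is used so the misaligned
   read itself is well defined in C. */
Step read_step_at_byte(const Step *steps, size_t byte_offset) {
    Step out;
    memcpy(&out, (const unsigned char *)steps + byte_offset, sizeof out);
    return out;
}
```

With a first step of {2, 0} followed by a Capture step {1, 5} (the value ids are made up), reading at byte offset 3 picks up the top byte of the first value_id and the low three bytes of the next step's type. On a little-endian machine that gives value_id = 1 << 8 = 256. Reading at byte offset 32 returns the first step of the second predicate intact.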

Unit Test

I added a test in da6a728 and confirmed that ./gradlew ktreesitter:macosArm64Test -q failed. After the fix in ccd2870, the test passes locally.

@mataku mataku marked this pull request as ready for review May 12, 2026 10:49
@ObserverOfTime ObserverOfTime merged commit 6d71c3b into tree-sitter:master May 13, 2026
2 checks passed