KAFKA-20658: Cast SMT honors ByteBuffer remaining bytes #22534
Open
BK202503 wants to merge 1 commit into
`Cast.castToString` Base64-encoded `ByteBuffer` BYTES values via
`Utils.readBytes(ByteBuffer)`, which reads from index `0` to the
buffer's `limit` and ignores `position`. A sliced or partially consumed
buffer therefore emitted bytes that were never part of the logical
BYTES value.
Switch the call site to `Utils.toArray(ByteBuffer)`, which is what the
sibling Connect conversion paths already use; it respects the buffer's
remaining-bytes contract and supports direct buffers as a bonus. The
fix is a one-line change at the only ByteBuffer call site in `Cast`;
no other transforms in this file touch a `ByteBuffer`.
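The difference between the two read strategies can be sketched with plain `java.nio`. The helpers below (`copyFromZeroToLimit`, `copyRemaining`) are hypothetical stand-ins that mimic the behavior described above; they are not the Kafka `Utils` implementations.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class RemainingBytesSketch {

    // Hypothetical stand-in for the old behavior described above:
    // copies indices [0, limit) and ignores the buffer's position.
    static byte[] copyFromZeroToLimit(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate(); // own position/limit, shared content
        view.position(0);
        byte[] out = new byte[view.remaining()];
        view.get(out);
        return out;
    }

    // Hypothetical stand-in for the fixed behavior:
    // copies only the remaining bytes, i.e. indices [position, limit).
    static byte[] copyRemaining(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate(); // leaves the caller's position untouched
        byte[] out = new byte[view.remaining()];
        view.get(out);
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.wrap(new byte[] {1, 2, 3, 4});
        buffer.position(1);
        buffer.limit(3);

        System.out.println(Arrays.toString(copyFromZeroToLimit(buffer))); // [1, 2, 3]
        System.out.println(Arrays.toString(copyRemaining(buffer)));       // [2, 3]
    }
}
```

Working on a `duplicate()` keeps the caller's `position` and `limit` unchanged, which matters when the same buffer is read again later in the pipeline.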
Added one `CastTest` case that fails against the previous
implementation and passes with this fix:
- `castSlicedByteBufferFieldToStringUsesRemainingBytes` wraps
`{1,2,3,4}`, moves the buffer window to position=1/limit=3, casts the
struct field to string, and asserts the Base64 output is `AgM=`
(`{2,3}`) instead of `AQID` (`{1,2,3}`).
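The two Base64 strings in the assertion can be reproduced with the JDK alone. This is a minimal sketch, not the actual `CastTest` code; `encodeRemaining` and `encodeFromZero` are hypothetical helper names introduced for illustration.

```java
import java.nio.ByteBuffer;
import java.util.Base64;

public class SlicedBufferBase64Sketch {

    // Hypothetical helper: Base64-encodes only the bytes in [position, limit).
    static String encodeRemaining(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate(); // do not disturb the caller's position
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return Base64.getEncoder().encodeToString(bytes);
    }

    // Hypothetical helper: Base64-encodes [0, limit), mimicking the old behavior.
    static String encodeFromZero(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate();
        view.position(0);
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.wrap(new byte[] {1, 2, 3, 4});
        buffer.position(1);
        buffer.limit(3);

        System.out.println(encodeRemaining(buffer)); // AgM=
        System.out.println(encodeFromZero(buffer));  // AQID
    }
}
```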
Signed-off-by: BK202503 <199436087+BK202503@users.noreply.github.com>
This was referenced Jun 10, 2026
JIRA: KAFKA-20658
What
`Cast.castToString` Base64-encoded `ByteBuffer` BYTES values via `Utils.readBytes(ByteBuffer)`. `Utils.readBytes(ByteBuffer)` reads from index `0` to the buffer's `limit` and ignores `position`, so a sliced or partially consumed buffer produced a string for bytes outside the logical BYTES value.
Switched the one call site to `Utils.toArray(ByteBuffer)`, which the sibling Connect conversion paths already use. It respects the buffer's `position..limit` contract and also supports direct buffers. `Utils` was already imported in this file.
Tests
Added one `CastTest` case that fails against the previous implementation and passes with this change:
- `castSlicedByteBufferFieldToStringUsesRemainingBytes` wraps `{1,2,3,4}`, moves the buffer window to `position=1`/`limit=3`, casts the struct field to `string`, and asserts the Base64 output is `AgM=` (`{2,3}`) instead of `AQID` (`{1,2,3}`).
The existing `castFieldsWithSchema` case still passes (covers the `byte[]` path and a heap-backed-from-position-0 `ByteBuffer`).
Validation
Both tests pass locally on JDK 17.
Scope
This PR targets only the `Cast` SMT. KAFKA-20657 (`JsonConverter`) is already up as #22533; KAFKA-20656 (`Struct.getBytes`/`equals`) and KAFKA-20666 (Connect offset backing stores) cover the same `ByteBuffer.array()`-class bug in their own components and are intentionally left for separate PRs.
Committer Checklist