feat!: make voice receivers return AudioPacket classes instead of just Audio#11432
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub. 2 Skipped Deployments
|
📝 WalkthroughWalkthroughThe changes extend audio packet handling in the voice receiver to include RTP metadata (sequence number, timestamp, and SSRC). A new public Changes
Sequence DiagramsequenceDiagram
actor UDP as UDP Source
participant VR as VoiceReceiver
participant CAP as createAudioPacket
participant ARS as AudioReceiveStream
UDP->>VR: onUdpMessage (encrypted packet)
Note over VR: Extract RTP sequence,<br/>timestamp from bytes
VR->>VR: Decrypt packet payload
VR->>CAP: createAudioPacket(buffer,<br/>sequence, timestamp, ssrc)
CAP->>CAP: Attach metadata as<br/>non-enumerable properties
CAP-->>VR: AudioPacket (Buffer + metadata)
VR->>ARS: stream.push(AudioPacket)
Note over ARS: Consumer receives<br/>Buffer with metadata
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/voice/src/receive/AudioReceiveStream.ts`:
- Around line 39-41: Update the documentation for the AudioPacket buffer to
state it contains an Opus-encoded payload (with RTP header metadata) rather than
a "decoded Opus packet"; locate the AudioPacket type/comment in
AudioReceiveStream (or the AudioPacket JSDoc) and change the wording to
explicitly say "a Buffer containing an Opus-encoded payload with RTP header
metadata" so API consumers aren't misled.
In `@packages/voice/src/receive/VoiceReceiver.ts`:
- Around line 177-180: The RTP header reads in VoiceReceiver (variables
sequence, timestamp, ssrc reading from msg) can throw for 9–11 byte buffers;
change the early length guard to require at least 12 bytes (e.g., if (msg.length
< 12) return;) or move these reads inside the existing try that wraps
parsePacket so any RangeError is caught; update the check or relocate the reads
in the VoiceReceiver.ts function that computes sequence/timestamp/ssrc to ensure
no unhandled RangeError occurs.
ℹ️ Review info
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
packages/voice/__tests__/VoiceReceiver.test.tspackages/voice/src/receive/AudioReceiveStream.tspackages/voice/src/receive/VoiceReceiver.ts
📜 Review details
🧰 Additional context used
🧬 Code graph analysis (2)
packages/voice/src/receive/VoiceReceiver.ts (1)
packages/voice/src/receive/AudioReceiveStream.ts (1)
AudioPacket(42-59)
packages/voice/__tests__/VoiceReceiver.test.ts (1)
packages/voice/__mocks__/rtp.ts (3)
RTP_PACKET_DESKTOP(7-16)RTP_PACKET_CHROME(18-26)RTP_PACKET_ANDROID(28-37)
🔇 Additional comments (2)
packages/voice/src/receive/VoiceReceiver.ts (1)
23-30:createAudioPacketwrapper is clean and backward-compatible.Non-enumerable readonly metadata on
Bufferis a good approach for preserving legacy buffer behavior.packages/voice/__tests__/VoiceReceiver.test.ts (1)
71-110: Great coverage additions for metadata passthrough and compatibility.These tests validate both RTP header field extraction and Buffer backward compatibility across multiple packet variants.
|
@vladfrangu Looks like the other PR was created after the last comment on this PR, and (given I've been doing cleanup in this PR, and other changes as requested), the conflict is a bit annoying to me... I am willing to look into this but only if there's a path to actually getting this merged in quickly, it feels like it's just sitting in limbo |
|
@vladfrangu / @Qjuh what is happening with this PR? |
|
Hey, sorry for bumping, but I'm also interested in the state of the PR since I'm facing similar issues with desync and as far as I can see, this should solve it |
|
@vladfrangu any update? |
|
Sorry for the dead silence, can you get this PR rebased please? From my end these changes look good, I will take another quick look but should be all good |
…nd SSRC. Refactored so that instead of pure Buffer, we now send AudioPacket (interface extending Buffer) which has readonly fields sequence, timestamp, and ssrc.
…e, timestamp, ssrc)
- Improve docstring use (also moved method to be private static to be more in-line with rest of code, and improved clarity of naming) - Fix pre-existing issue (min packet length was 8 bytes, but was expecting reading a uint32 at offset 8, so actual min length is 12) - Fix AudioPacket description
…r.ts Co-authored-by: Qjuh <76154676+Qjuh@users.noreply.github.com>
…d fixing style to use hyphens. Also move addPacketHeaders to a bare function below the class per review comment
Minor naming changes as part of this because we are working with two Buffers (raw RTP packet, decrypted RTP/DAVE payload). Change tests in VoiceReceiver.test.ts that were directly testing `decrypt` to instead test `parsePacket`
…er than an extension of Buffer. This breaks backwards-compatibility for existing AudioReceiveStream users, but is cleaner and allows for future extensibility.
… Documented constructor as not a public interface to discourage end-users from using it and breaking if we extend AudioPacket in the future
fef397d to
b0357d1
Compare
|
rebased with no conflicts, pr tests showing awaiting maintainer approval |
Currently, RTP packet headers are stripped and clients are only delivered Opus frames as
Buffers. The lack of timestamp makes jitter/drift hard to combat for clients, who must compute wall-clock delivery time.This change introduces an
AudioPacketinterface which extendsBufferwith the following read-only fields:sequence(16-bit uint monotonically increasing counter to allow for identifying out-of-order RTP packets)timestamp(32-bit uint that counts encoder-side timestamps; RFC 7587, Opus in RTP requires this be expressed at 48kHz no matter the audio sample rate)ssrc(to allow consumers to detect a change and reset their Opus decoder state; updates on this value can already be received via SSRCMap events, however I think it also belongs on AudioPacket because there's a risk of delayed SSRCMap event delivery, and the client needs to know as soon as they receive a packet that changes it so they can reset their decoder state to correctly parse the packet)This change deliberately does not attempt to provide a generalised parser for all the fields in the RTP Header to keep this PR small and focused just on the essential fields.