Add Roaring Bitmap support (issue #1270)#1741
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds a new host-level Roaring Bitmap custom object extension to Garnet, including a compressed bitmap data structure, a RoaringBitmapObject wrapper, and four new R.* RESP commands, along with docs and tests.
Changes:
- Introduces a pure C# Roaring Bitmap implementation with array/bitmap containers and versioned serialization.
- Adds a Garnet custom object + RESP command implementations for
R.SETBIT,R.GETBIT,R.BITCOUNT, andR.BITPOS, and registers them in the default server host. - Adds end-to-end RESP tests and data-structure unit tests, plus documentation for the new commands.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| website/docs/commands/roaring-bitmap.md | Adds user-facing documentation for the new Roaring Bitmap object and R.* commands. |
| test/Garnet.test/RoaringBitmapDataTests.cs | Adds unit tests for the standalone RoaringBitmap data structure (promotion/demotion, bitpos, serialization). |
| test/Garnet.test/RespRoaringBitmapTests.cs | Adds RESP-level integration tests for the new commands via StackExchange.Redis. |
| main/GarnetServer/Program.cs | Registers the Roaring Bitmap custom type and R.* commands in the default host. |
| main/GarnetServer/Extensions/RoaringBitmap/RoaringBitmapObject.cs | Implements the Garnet custom-object wrapper (clone/serialize/size tracking). |
| main/GarnetServer/Extensions/RoaringBitmap/RoaringBitmapCommands.cs | Implements argument parsing and the four RESP commands. |
| main/GarnetServer/Extensions/RoaringBitmap/RoaringBitmap.cs | Implements the roaring bitmap core, bit operations, enumeration, and serialization format. |
| main/GarnetServer/Extensions/RoaringBitmap/Containers/IContainer.cs | Defines the internal container abstraction and serialization kind enum. |
| main/GarnetServer/Extensions/RoaringBitmap/Containers/BitmapContainer.cs | Implements dense bitmap container behavior, popcount, and serialization. |
| main/GarnetServer/Extensions/RoaringBitmap/Containers/ArrayContainer.cs | Implements sparse sorted-array container behavior, promotion logic, and serialization. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Thanks for the thorough review! Pushed
All 27 data-structure tests + 14 RESP integration tests still pass: |
|
@microsoft-github-policy-service agree company="Microsoft" |
2 similar comments
|
@microsoft-github-policy-service agree company="Microsoft" |
|
@microsoft-github-policy-service agree company="Microsoft" |
|
Thanks for your contribution! This extension is interesting, but instead of putting it in main, we should place it in https://github.com/microsoft/garnet/tree/main/modules (where e.g., GarnetJSON is kept) so that it is not bundled by default in the server. |
|
Also, main is closed to new features, so we would request that you retarget your PR to the dev (v2) branch. |
02743e5 to
66af335
Compare
|
Thanks for the review! Both points addressed in the latest force-push:
Local validation: all 43 RoaringBitmap tests (29 data + 14 RESP) pass on net8.0, |
66af335 to
dd3e2da
Compare
| { | ||
| private static ReadOnlySpan<byte> ErrOffset => "ERR bit offset is not an unsigned 32-bit integer"u8; | ||
|
|
||
| public override bool NeedInitialUpdate(scoped ReadOnlySpan<byte> key, ref ObjectInput input, ref RespMemoryWriter writer) |
There was a problem hiding this comment.
This is more a question for @badrishc as I haven't played around with custom objects too much - is validation expected to get in NeedInitialUpdate like this?
It's unfortunate as it forces this read command to act like a write which will hurt throughput.
There was a problem hiding this comment.
Left as-is for now and added an explicit architectural note in code (above each affected command) flagging the question for @badrishc. The Reader path is also implemented as an existing-key fast-path, but NeedInitialUpdate is currently the only hook that can emit Redis-compatible missing-key responses (0 / -1) without nil semantics. Happy to switch if @badrishc confirms a cleaner Reader+miss override is planned.
There was a problem hiding this comment.
@badrishc — flagging this for your input.
We have R.GETBIT / R.BITCOUNT / R.BITPOS, all conceptually read-only. To return the Redis-compatible missing-key response (0 for GETBIT/BITCOUNT, -1 for BITPOS) without materializing an empty object, we currently register them as CommandType.ReadModifyWrite and emit the response from NeedInitialUpdate. The Reader path is only invoked when the key exists.
Is there an idiomatic way to express "read-only with typed missing-key response" in CustomObjectFunctions today, or is the NeedInitialUpdate hack the recommended pattern? If a cleaner Reader+miss override is on the roadmap, we'd happily refactor these three commands. As of 47d62b29b the Reader override is already in place as a fast path for the existing-key case.
| /// </summary> | ||
| public sealed class RBitCount : CustomObjectFunctions | ||
| { | ||
| public override bool NeedInitialUpdate(scoped ReadOnlySpan<byte> key, ref ObjectInput input, ref RespMemoryWriter writer) |
There was a problem hiding this comment.
Similar Q for @badrishc (and again in RBitPos) - using NeedInitialUpdate for "missing" is messy; can this be phrased as a Reader op instead?
There was a problem hiding this comment.
Same note as above — Reader is now implemented on RGetBit, RBitCount, RBitPos as the existing-key fast-path; NeedInitialUpdate handles the typed missing-key response. Architectural comment added in-file pointing at @badrishc.
There was a problem hiding this comment.
@badrishc — same question for R.BITPOS specifically (missing key must return -1 per Redis semantics, not nil). Tagging you so this isn't lost; full context in the sibling thread above. Happy to refactor if CustomObjectFunctions gains a cleaner Reader-with-miss hook.
|
Thank you for your reviews Will address all these in next PR |
Addresses PR review feedback from @badrishc: - Move the extension from main/GarnetServer/Extensions/RoaringBitmap to modules/RoaringBitmap so it isn't bundled by default (mirrors GarnetJSON). - Retarget the PR to dev (companion change). Implementation changes for the move: - New modules/RoaringBitmap/GarnetRoaringBitmap.csproj (mirrors GarnetJSON.csproj, signs assembly, exposes InternalsVisibleTo Garnet.test). - New RoaringBitmapModule : ModuleBase entry point that registers the factory and the four R.SETBIT/R.GETBIT/R.BITCOUNT/R.BITPOS commands. - Renamed namespace Garnet.Extensions.RoaringBitmap -> GarnetRoaringBitmap to avoid the namespace/class collision with class RoaringBitmap. - Updated CustomObjectFunctions overrides to dev-branch scoped ReadOnlySpan<byte> signatures for NeedInitialUpdate / Updater. - Updated RoaringBitmapObject to dev-branch CustomObjectBase ctor and HeapMemorySize accounting. - Wired the module into Garnet.slnx and Garnet.test.csproj. - Tests still register via server.Register.NewCommand in [SetUp] (in-process), matching the existing custom-object test pattern. - Updated StringKeyAndCustomObjectKey_AreSeparate to expect WRONGTYPE on the unified store on dev.
dd3e2da to
47d62b2
Compare
Updated — all review comments addressedForce-pushed Highlights:
Build: 0 warnings, 0 errors. Two threads contain open questions for @badrishc about whether the read commands can be expressed as pure |
|
@badrishc — two open architectural questions on this PR I'd love your read on. Both relate to how a read-only custom object command should emit a Redis-compatible missing-key response (
TL;DR: the three read commands are currently registered as |
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
On a second review of this, we have concerns about the write commands being registered as RMW. That implies reads will proceed sequentially rather than in parallel. Unfortunately there's no existing way to customize the RESP response of a custom read command today, you always get a null. I'm going to take a look at fixing that in general (and moving existing extensions to use it if necessary). Once that's merged you can update this PR to use the new infra and catch up to |
|
Thank you. Sounds good @kevin-montrose will look for updates on this from you following week. |
|
Infra changes for this are pending review here: #1820 |
Add Roaring Bitmap support (issue #1270)
Resolves #1270.
Summary
Adds a Roaring Bitmaps extension to Garnet that introduces a new compressed
bitmap object type plus four
R.*RESP commands. Implemented entirely as ahost extension in
main/GarnetServer/Extensions/RoaringBitmap/— zerochanges to
libs/server/— so this is a clean, reviewable foundation thatcan be deepened in follow-up PRs.
Why Roaring?
A naive
uint32bitmap is 512 MiB. Roaring partitions the universe into65 536 chunks of 65 536 bits and represents each chunk as either:
ushort[], used while a chunk holds ≤ 4 096set bits (~
2·countbytes).ulong[1024](8 KiB exactly), used once a chunkexceeds the threshold.
Empty chunks consume zero memory. Chunks promote (array → bitmap) and demote
(bitmap → array) automatically as cardinality changes.
Commands
R.SETBIT key offset valueoffset∈[0, 2³²-1]to0/1. Returns previous bit.R.GETBIT key offsetoffset.0for missing keys.R.BITCOUNT key0for missing keys.R.BITPOS key bit [from]bit(0/1) at or afterfrom.-1if none.Docs:
website/docs/commands/roaring-bitmap.md.Design notes
RoaringBitmap.cs,Containers/*) is a pure C# librarywith no Garnet dependencies → independently unit-testable.
RoaringBitmapObject(CustomObjectBase) wraps the structure and trackssize deltas via
bitmap.ByteSizefor the per-object size accounting.CommandType.ReadModifyWrite. The reads(
R.GETBIT/R.BITCOUNT/R.BITPOS) do not mutate state, but the RMWpath is required so that
NeedInitialUpdateis invoked on missing keys —the framework's
Readpath simply returns nil otherwise. Missing-keyresponses (
0/-1) are written fromNeedInitialUpdatewhich thenreturns
falseto decline key creation.NeedInitialUpdateerror paths usewriter.WriteError(...)+return falserather than
AbortWithErrorMessage(which returnstrueand would cause theframework to proceed into
InitialUpdater/Updater, double-writing theresponse and corrupting the protocol stream).
Tests
RoaringBitmapDataTests(pure data structure)RespRoaringBitmapTests(RESP integration via SE.Redis)Coverage highlights:
4096↔4097) and demotion across both directions.HashSet<uint>oracle.0,65535,65536,2³¹,2³²-1.R.SETBIT/R.GETBITparity with oracle,R.BITCOUNT,R.BITPOS(set / unset /fromoffset), large offsets, persistence acrossrestart, concurrent setbits from multiple clients, error paths
(bad offset, bad bit, bad value, wrong arity), and the two-store key
separation property.
Known limitations (intentional v1 scope)
contiguous ranges; the array/bitmap pair captures the bulk of real-world
savings.
R.BITOP AND/OR/XOR/NOTis not exposed. The data structure supportsthese natively; only command surface is needed.
rather than removing the key. This is a property of the custom-object
framework's tombstone path (
output.HasRemoveKeyis honoured only on thebuilt-in path) and is best fixed in
libs/server/Storage/Functions/ObjectStorein a separate PR.
Files