fix(drift): derive fitColumns from stored metadata when omitted #180
SudipSinha wants to merge 1 commit into
The Python service rejected drift metric requests with HTTP 400 when fitColumns was not provided. The Java service treats fitColumns as optional, fitting on all input columns when omitted.

Derive fitColumns from the model's stored input schema metadata across all 6 drift endpoint functions (3 compute + 3 schedule).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sudip Sinha <Sudip.Sinha@RedHat.com>
Apply the same fix as PR #180 to the streaming KS test endpoints: when fitColumns is not provided, derive it from the model's stored input schema metadata instead of rejecting with HTTP 400.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sudip Sinha <Sudip.Sinha@RedHat.com>
Apply the same fix as PR #180 to the MMD endpoint: when fitColumns is not provided, derive it from the model's stored input schema metadata instead of rejecting with HTTP 400. The shared _validate_drift_request function is now async to support the metadata lookup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sudip Sinha <Sudip.Sinha@RedHat.com>
Summary
All drift metric endpoints reject requests with HTTP 400 when `fitColumns` is not provided. The Java service treats `fitColumns` as optional: when omitted, it fits on all available input columns. This mismatch causes integration tests to fail when scheduling drift metrics without explicit column lists.

Root cause
Each drift endpoint's compute and schedule functions had a hard validation that rejected any request without `fitColumns`.
In the Java service (`DriftMetricRequest.java`), `fitColumns` is a `Set<String>` with no validation, so it is fully optional. When omitted, the Java implementation fits on all available input columns from stored data.

The fix
When `fitColumns` is omitted, derive it from the model's stored input schema metadata.

When `fitColumns` is explicitly provided, the existing whitespace validation is preserved (compare_means) or the values are used as-is (kstest, jensenshannon).

This is applied consistently across all 6 endpoint functions (3 compute + 3 schedule).
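The fallback described in this section can be sketched roughly as follows. This is an illustration only, not the code from the PR: `InvalidRequest`, `get_input_names`, `resolve_fit_columns`, and the in-memory `fake_store` are hypothetical names, and the exact behavior of the whitespace check (rejecting names that are blank after stripping) is an assumption.

```python
import asyncio
from typing import Optional


class InvalidRequest(Exception):
    """Hypothetical stand-in for whatever the service raises to return HTTP 400."""


async def get_input_names(model_id: str) -> list[str]:
    # Hypothetical metadata lookup. A real service would read the model's
    # stored input schema; here a small in-memory dict plays that role.
    fake_store = {"demo-model": ["age", "income", "score"]}
    return fake_store.get(model_id, [])


async def resolve_fit_columns(
    model_id: str, fit_columns: Optional[list[str]]
) -> list[str]:
    """Return explicit columns if given, otherwise derive them from metadata."""
    if fit_columns is None:
        # Omitted: fall back to every stored input column instead of rejecting.
        derived = await get_input_names(model_id)
        if not derived:
            # Assumption: with no stored schema there is nothing to fit on.
            raise InvalidRequest(f"no input columns stored for {model_id!r}")
        return derived

    # Explicitly provided: assumed whitespace validation, then use as given.
    cleaned = [name.strip() for name in fit_columns]
    if any(not name for name in cleaned):
        raise InvalidRequest("fitColumns contains a blank column name")
    return cleaned


if __name__ == "__main__":
    print(asyncio.run(resolve_fit_columns("demo-model", None)))
    print(asyncio.run(resolve_fit_columns("demo-model", ["age"])))
```

Because the metadata lookup is awaited, any helper that performs it has to be a coroutine, and its callers have to await it.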
Also fixed a latent bug in `compare_means.py` where the feature iteration loop used `valid_features` (only defined in the `else` branch) instead of `request.fit_columns`.

Files changed
- `src/endpoints/metrics/drift/compare_means.py`: `valid_features` → `request.fit_columns`
- `src/endpoints/metrics/drift/kolmogorov_smirnov.py`
- `src/endpoints/metrics/drift/jensen_shannon.py`
- `tests/endpoints/metrics/drift/factory.py`: `get_metadata` AsyncMock to both test factories
- `tests/endpoints/metrics/drift/test_compare_means.py`
- `tests/endpoints/metrics/drift/test_kolmogorov_smirnov.py`
- `tests/endpoints/metrics/drift/test_jensen_shannon.py`

Test plan
- `ruff check` clean
- `ruff format` clean
- `pyrefly check` clean
- `POST /metrics/drift/meanshift/request` with `{"modelId": "...", "referenceTag": "TRAINING"}` returns 200

🤖 Generated with Claude Code
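As a rough sketch of how the omitted-`fitColumns` path could be exercised with a mocked `get_metadata`, the snippet below uses `unittest.mock.AsyncMock` from the standard library. The helper `derive_fit_columns`, the `storage` object, and the `input_names` metadata key are hypothetical and do not reflect the repository's actual test factories.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


async def derive_fit_columns(storage, model_id, fit_columns=None):
    # Hypothetical helper: use explicit columns when given, otherwise
    # fall back to the input names recorded in the stored metadata.
    if fit_columns is not None:
        return list(fit_columns)
    metadata = await storage.get_metadata(model_id)
    return list(metadata["input_names"])  # assumed metadata shape


def make_storage_mock(input_names):
    # get_metadata is awaited by the helper, so it must be an AsyncMock.
    storage = MagicMock()
    storage.get_metadata = AsyncMock(return_value={"input_names": input_names})
    return storage


# Omitted fitColumns: the metadata lookup is awaited exactly once.
storage = make_storage_mock(["f1", "f2"])
derived = asyncio.run(derive_fit_columns(storage, "model-a"))
storage.get_metadata.assert_awaited_once_with("model-a")

# Explicit fitColumns: the metadata lookup is never awaited.
storage_explicit = make_storage_mock(["f1", "f2"])
explicit = asyncio.run(derive_fit_columns(storage_explicit, "model-a", ["f2"]))
storage_explicit.get_metadata.assert_not_awaited()
```

Checking both branches guards against a fallback that silently overrides columns the caller supplied.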