Skip to content

feat(scheduler): register drift metric calculators with MetricsDirectory#185

Open
SudipSinha wants to merge 4 commits into
mainfrom
fix/register-drift-metric-calculators
Open

feat(scheduler): register drift metric calculators with MetricsDirectory#185
SudipSinha wants to merge 4 commits into
mainfrom
fix/register-drift-metric-calculators

Conversation

@SudipSinha

@SudipSinha SudipSinha commented Jun 16, 2026

Copy link
Copy Markdown
Member

Summary

All 8 Prometheus metric tests timeout because drift metric calculators are never registered with MetricsDirectory. The scheduler finds scheduled requests but logs "No calculator found for metric CompareMeans" every 5 seconds indefinitely.

Root cause

The MetricsDirectory requires each metric to register a calculator function at module load time. Only fairness metrics (SPD, DIR) and Batch Mean registered calculators. None of the drift metrics did.

The fix

Add async calculator functions and module-level registration for three drift metrics:

  • CompareMeanscalculate_compare_means_metric using CompareMeans.ttest_ind
  • KSTestcalculate_kstest_metric using KolmogorovSmirnov.kstest
  • JensenShannoncalculate_jensenshannon_metric using JensenShannon.jensenshannon

Each calculator:

  1. Receives batch (current data from scheduler) and request (with reference_tag, fit_columns)
  2. Fetches reference data via data_source.get_dataframe_by_tag(model_id, reference_tag)
  3. Computes per-feature metric values
  4. Returns MetricValueCarrier(named_values) for Prometheus publishing

Since drift calculators need async get_dataframe_by_tag, _calculate_metric in the scheduler is now async. It detects whether a calculator returns a coroutine and awaits it if so — sync calculators (SPD, DIR) still work unchanged.

Files changed

File Change
src/endpoints/metrics/drift/compare_means.py Add calculator + registration
src/endpoints/metrics/drift/kolmogorov_smirnov.py Add calculator + registration
src/endpoints/metrics/drift/jensen_shannon.py Add calculator + registration
src/service/prometheus/prometheus_scheduler.py Make _calculate_metric async, add asyncio import
tests/endpoints/metrics/drift/test_drift_calculators.py 4 tests: per-feature values for all 3 metrics + column derivation from batch

Test plan

  • All 446 tests pass (4 new calculator tests)
  • Calculator tests verify MetricValueCarrier returns per-feature named values
  • test_derives_columns_from_batch verifies empty fit_columns falls back to batch columns
  • ruff check clean (no noqa needed)
  • ruff format clean
  • pyrefly check clean
  • Deploy and verify GET /q/metrics shows drift metric gauge values after scheduled computation

Note

The streaming KS test (KSTestStreaming) and MMD calculators need the same treatment on their respective feature branches (PRs #99 and #149).

🤖 Generated with Claude Code

@SudipSinha SudipSinha self-assigned this Jun 16, 2026
@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown

Warning

Review limit reached

@SudipSinha, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 33 minutes and 26 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f79f1929-8a19-4fcc-a50b-367c2bb3a934

📥 Commits

Reviewing files that changed from the base of the PR and between 1e3e6d2 and 585a908.

📒 Files selected for processing (7)
  • src/endpoints/metrics/drift/compare_means.py
  • src/endpoints/metrics/drift/jensen_shannon.py
  • src/endpoints/metrics/drift/kolmogorov_smirnov.py
  • src/service/prometheus/prometheus_scheduler.py
  • tests/endpoints/metrics/drift/test_compare_means.py
  • tests/endpoints/metrics/drift/test_drift_calculators.py
  • tests/endpoints/metrics/drift/test_jensen_shannon.py
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/register-drift-metric-calculators

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

The Prometheus scheduler found scheduled drift metric requests but had
no calculator to compute them, logging "No calculator found for metric
CompareMeans" every 5 seconds indefinitely. Add calculator functions
for CompareMeans, KSTest, and JensenShannon that fetch reference data
via get_dataframe_by_tag and return MetricValueCarrier with per-feature
values. Make _calculate_metric async to support async calculators.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sudip Sinha <Sudip.Sinha@RedHat.com>
@SudipSinha SudipSinha force-pushed the fix/register-drift-metric-calculators branch from 88e6575 to f962c59 Compare June 16, 2026 11:44
@SudipSinha SudipSinha added bug Something isn't working python Pull requests that update python code tests high-priority Critical issues that should be addressed soon ok-to-test Proceed with CI testing labels Jun 16, 2026
@github-actions

github-actions Bot commented Jun 16, 2026

Copy link
Copy Markdown

PR image build and manifest generation completed successfully!

📦 PR image: quay.io/trustyai/trustyai-service-python-ci:585a9089fc8538ab6a3c6484634812b29aeb8c6c

🗂️ CI manifests

devFlags:
  manifests:
    - contextDir: config
      sourcePath: ''
      uri: https://api.github.com/repos/trustyai-explainability/trustyai-service-operator-ci/tarball/service-python-585a9089fc8538ab6a3c6484634812b29aeb8c6c

SudipSinha and others added 2 commits June 16, 2026 15:32
…ted names

Three Prometheus sub-problems:

A) Deprecated schedule_meanshift overwrites metric_name to CompareMeans,
   so the Prometheus gauge is trustyai_comparemeans instead of
   trustyai_meanshift. Fix: set metric_name to DEPRECATED_METRIC_NAME
   before delegating, and only set METRIC_NAME if not already set.

B) METRIC_NAME constants used mixed case (KSTest, CompareMeans) but
   Java uses uppercase (KSTEST, COMPAREMEANS). The integration test
   checks metric_name.upper(), causing case mismatch. Fix: uppercase
   all METRIC_NAME constants.

C) KSTestStreaming calculator is missing — noted for PR #99 branch.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sudip Sinha <Sudip.Sinha@RedHat.com>
Deprecated list and delete endpoints delegated to canonical versions
that query scheduler by METRIC_NAME, but requests were stored under
DEPRECATED_METRIC_NAME. Parameterize list/delete functions to accept
metric_name, pass DEPRECATED_METRIC_NAME from deprecated wrappers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sudip Sinha <Sudip.Sinha@RedHat.com>
@SudipSinha SudipSinha force-pushed the fix/register-drift-metric-calculators branch from 3084ada to ef7b9a7 Compare June 16, 2026 15:19
…ted names

Deprecated endpoints store requests under DEPRECATED_METRIC_NAME but
the calculator was only registered under METRIC_NAME. The scheduler
couldn't find a calculator for "MEANSHIFT" requests. Register under
both names so deprecated scheduled metrics compute correctly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Sudip Sinha <Sudip.Sinha@RedHat.com>
@SudipSinha SudipSinha marked this pull request as ready for review June 18, 2026 14:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working high-priority Critical issues that should be addressed soon ok-to-test Proceed with CI testing python Pull requests that update python code tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant