feat(epp): add OpenTelemetry spans for the scheduler scoring path #1834
Open
mvanhorn wants to merge 1 commit into
Wrap the scorer chain in a parent llm_d.epp.scoring span and each scorer invocation in an llm_d.epp.scorer.<type> child span, so a request trace shows which scorers ran, how long they took, and aggregate score signals. Span attributes are request- and chain-level only (scorer type/name/weight, candidate count, score max/avg); no per-endpoint keys are emitted, keeping span cardinality bounded. Spans are no-ops when tracing is uninitialized. Signed-off-by: Matt Van Horn <mvanhorn@gmail.com>
What type of PR is this?
/kind feature
Summary
Adds fine-grained OpenTelemetry spans to the EPP scheduler scoring path so a single request trace shows which scorers ran, how long each took, and the aggregate score signals each produced.
SchedulerProfile.runScorerPlugins now opens a parent llm_d.epp.scoring span over the scorer chain, and each scorer invocation runs inside an llm_d.epp.scorer.<type> child span. The child spans nest under the parent, and the existing plugin-internal span (e.g. score_prefix_cache) nests under its scorer span. This mirrors the span-emission pattern already used by the precise-prefix-cache scorer.
Attributes are request- and chain-level only:
- Parent span: llm_d.epp.scorer.count, llm_d.epp.scoring.candidate_endpoints, plus gen_ai.request.model / gen_ai.request.id when present.
- Scorer child span: llm_d.epp.scorer.type, llm_d.epp.scorer.name, llm_d.epp.scorer.weight, llm_d.epp.scorer.candidate_endpoints, and aggregate llm_d.epp.scorer.score.max/.avg/endpoints_scored derived from the returned score map.
No per-pod / per-endpoint attribute keys are emitted, keeping span cardinality bounded.
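For illustration, the aggregate score signals could be derived in a single pass over a scorer's returned score map, roughly as sketched below. This is not the PR's code: the `scoreStats` type, the `aggregateScores` helper, and the string-keyed map are hypothetical stand-ins, since the scheduler's actual endpoint and score types are not shown here.

```go
package main

// scoreStats holds the aggregate signals a scorer span could report.
// Hypothetical type, introduced only for this sketch.
type scoreStats struct {
	Max             float64
	Avg             float64
	EndpointsScored int
}

// aggregateScores summarizes a score map in one pass. The string key is an
// assumed stand-in for the scheduler's endpoint type. Only the aggregates
// leave this function, so no per-endpoint value ends up in a span attribute.
func aggregateScores(scores map[string]float64) scoreStats {
	var st scoreStats
	if len(scores) == 0 {
		return st // nothing scored: all aggregates stay zero
	}
	sum := 0.0
	first := true
	for _, s := range scores {
		if first || s > st.Max {
			st.Max = s
			first = false
		}
		sum += s
	}
	st.EndpointsScored = len(scores)
	st.Avg = sum / float64(len(scores))
	return st
}
```

Reporting only the maximum, the average, and the count keeps the attribute set the same size regardless of how many endpoints are candidates, which is what bounds span cardinality.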
Why this matters
Issue #1692 ("Add span for scheduler score") is the scoped sub-task of umbrella #1483, which asks for spans across the scheduling/scoring pipeline so operators can see scorer-level behavior inside a request trace instead of only the top-level gateway.request / gateway.request_orchestration spans. Today the scorer chain is invisible in traces: there is no way to tell which scorer dominated a routing decision or which one was slow.
This package is allocation-sensitive (the surrounding code documents per-request allocation work in runScorerPlugins and runPickerPlugin), so the scoring path stays allocation-free when tracing is disabled: the parent span's IsRecording() is checked once and all attribute and child-span construction is skipped on the default no-op path, with the tracer and span-kind option resolved once rather than per scorer. BenchmarkSchedule confirms the disabled path holds at the prior baseline.
Testing
llm_d.epp.scoring span recorded with one llm_d.epp.scorer.<type> child per scorer, children nested under the parent, with correct weight, candidate-count, and aggregate max/avg attributes.
Which issue(s) this PR fixes:
Fixes #1692
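As a reading aid for the disabled-path behavior described under "Why this matters", here is a dependency-free sketch of the check-once gating pattern. It is not the PR's implementation: the `span` interface is a hypothetical stand-in for the small subset of OpenTelemetry's `trace.Span` involved (the real `SetAttributes` takes `attribute.KeyValue` values), and `scorer`, `runScorers`, `startChild`, and the weighted-sum combination are assumptions made for illustration.

```go
package main

// span is a hypothetical stand-in for the subset of trace.Span used here.
type span interface {
	IsRecording() bool
	SetAttribute(key string, value float64)
	End()
}

// noopSpan models the span obtained when tracing is uninitialized.
type noopSpan struct{}

func (noopSpan) IsRecording() bool { return false }
func (noopSpan) SetAttribute(string, float64) {}
func (noopSpan) End() {}

// recordingSpan models an active span and remembers what was set on it.
type recordingSpan struct {
	attrs map[string]float64
	ended bool
}

func (s *recordingSpan) IsRecording() bool { return true }
func (s *recordingSpan) SetAttribute(k string, v float64) { s.attrs[k] = v }
func (s *recordingSpan) End() { s.ended = true }

// scorer is a hypothetical scorer: a weight plus a scoring function.
type scorer struct {
	weight float64
	score  func(candidates []string) map[string]float64
}

// runScorers checks IsRecording once on the parent span. When it is false,
// no attribute is built and startChild is never called, so the only work
// left is the scoring itself.
func runScorers(parent span, scorers []scorer, candidates []string, startChild func() span) map[string]float64 {
	defer parent.End()
	recording := parent.IsRecording() // single check for the whole chain
	if recording {
		parent.SetAttribute("llm_d.epp.scorer.count", float64(len(scorers)))
		parent.SetAttribute("llm_d.epp.scoring.candidate_endpoints", float64(len(candidates)))
	}
	total := make(map[string]float64, len(candidates))
	for _, sc := range scorers {
		var child span
		if recording {
			child = startChild() // child spans exist only when tracing is on
		}
		for ep, s := range sc.score(candidates) {
			total[ep] += s * sc.weight // assumed weighted-sum combination
		}
		if child != nil {
			child.SetAttribute("llm_d.epp.scorer.weight", sc.weight)
			child.End()
		}
	}
	return total
}
```

Calling `runScorers` with a `noopSpan{}` parent returns the same weighted totals as with a recording parent, but never invokes `startChild`, which is the property a benchmark of the disabled path would exercise.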
Release note: