feat(topdown): add OoO-window bottleneck attribution #6173
Draft
lewislzh wants to merge 10 commits into
Refine the accounting of sub-pipeline execution inefficiency at the ROB head by splitting sub-pipeline execution stalls into two kinds: those interrupted by a cancellation (so the instruction is issued more than once), and those caused by excessive execution latency after the final issue.
Track ROB-head issue state for topdown analysis and classify stalls by not-issued, issue-delay, and issue-cancel reasons.
- Add ROB-head not-issued accounting to topdown counters
- Propagate issue queue cancel source/debug info to ROB
- Split issue cancel stalls into og0, og1, load, store, and other types
- Update topdown config rename maps and extraction targets
Track source-ready state in issue queues and use it to estimate the ROB entry's ideal issue timing for topdown stall classification.
- Add src-ready information to TopdownIQInfo
- Propagate per-entry src-ready state from issue entries to ROB
- Track topdownSrcReady and topdownLastShouldIssueTime in ROB entries
- Use ideal issue timing to distinguish issue-delay stalls from execution latency
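As a rough behavioral sketch of the last bullet (not the Chisel code in this PR), the classification for one ROB-head stall cycle might look like the following. The function name, the category strings, and the rule that the ideal issue time equals the cycle at which all sources became ready are assumptions made for illustration.

```python
from typing import Optional


def classify_head_cycle(cycle: int,
                        src_ready_time: Optional[int],
                        issue_time: Optional[int]) -> str:
    """Classify one stall cycle of the ROB-head instruction.

    src_ready_time: cycle at which all sources became ready, i.e. the
        assumed ideal issue time (None if sources are not ready yet).
    issue_time: cycle of the final issue (None if not issued yet).
    """
    # Sources not ready: the instruction could not have issued anyway.
    if src_ready_time is None or cycle < src_ready_time:
        return "not_issued"
    # Sources ready but the instruction has not issued: issue delay.
    if issue_time is None or cycle < issue_time:
        return "issue_delay"
    # Already issued for the last time: remaining time is execution latency.
    return "exec_latency"
```

Under these assumptions, an instruction whose sources are ready at cycle 5 and which issues at cycle 8 is charged issue-delay for cycles 5 to 7 and execution latency from cycle 8 onward.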
- Move collect logic to DPI-C to speed up compilation
- Move ROB collect logic to DPI-C to speed up compilation
Track cancel time per IQ cancel source in ROB entries and use the dominant
cancel source to classify ROB-head issue-cancel stalls.
Accumulate repeated cancels from the same source instead of overwriting their
time, initialize the new per-source ROB state on enqueue, and expose
IQCancelSource metadata for vector sizing. Also fix the IssueCancelStallSt
counter name to match the other TopDown counters.
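The per-source accumulation described in this commit can be modelled in a few lines. This is a software sketch only: the class and method names are hypothetical, the enum members simply mirror the og0/og1/load/store/other split mentioned earlier, and the tie-break (lowest index wins) is an assumption.

```python
from enum import Enum
from typing import Optional


class IQCancelSource(Enum):
    # Hypothetical encoding; only the five categories come from the PR text.
    OG0 = 0
    OG1 = 1
    LOAD = 2
    STORE = 3
    OTHER = 4


class RobEntryCancelState:
    """Per-ROB-entry cancel bookkeeping (illustrative model)."""

    def __init__(self) -> None:
        # Initialised on enqueue; the vector is sized from the enum metadata.
        self.cancel_time = [0] * len(IQCancelSource)

    def record_cancel(self, source: IQCancelSource, cycles: int) -> None:
        # Accumulate repeated cancels from the same source; do not overwrite.
        self.cancel_time[source.value] += cycles

    def dominant_source(self) -> Optional[IQCancelSource]:
        # No cancel recorded: there is nothing to attribute.
        if not any(self.cancel_time):
            return None
        # Largest accumulated time wins; ties go to the lowest index.
        idx = max(range(len(self.cancel_time)),
                  key=self.cancel_time.__getitem__)
        return IQCancelSource(idx)
```

With accumulation, two load cancels of 3 and 2 cycles outweigh a single og0 cancel of 4 cycles. If the time were overwritten instead, the last load cancel (2 cycles) would lose to og0.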
This PR extends the detailed TopDown analysis with a ROB-head-centered bottleneck attribution
flow. It uses the oldest in-flight instruction as the key observation point,
tracks its issue/readiness/cancel/resource state, and attributes exposed
performance loss to the dominant bottleneck during that instruction's lifetime.
The oldest unfinished instruction in the ROB is often the point where internal
pipeline bottlenecks become visible: it can block commit and delay ROB, physical
register, issue queue, and other resource release. Observing the ROB head gives a
direct signal for identifying where performance loss is exposed.
Since multiple stall events may overlap in the same cycle, this PR avoids simply
summing every event duration. Instead, it divides the ROB-head instruction
lifetime into time slices and attributes each slice to the dominant bottleneck,
so the final accounting better matches the actual exposed performance-loss
cycles.
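The difference between summing event durations and slice-based attribution can be illustrated with a small model. It assumes one-cycle slices and a fixed priority order for picking the dominant bottleneck; the category names, the order, and the function names are illustrative assumptions, not the PR's actual policy.

```python
from collections import Counter

# Hypothetical priority order: when several stall reasons are active in the
# same slice, the earliest entry in this list is treated as dominant.
PRIORITY = ["memory", "issue_cancel", "issue_delay", "not_issued", "exec_latency"]


def attribute_cycles(per_cycle_events):
    """Charge each cycle of a ROB-head lifetime to exactly one category."""
    totals = Counter()
    for active in per_cycle_events:
        dominant = next((c for c in PRIORITY if c in active), "other")
        totals[dominant] += 1
    return totals


def naive_sum(per_cycle_events):
    """Sum every active event per cycle; overlapping events are over-counted."""
    totals = Counter()
    for active in per_cycle_events:
        totals.update(active)
    return totals


# Four cycles at the ROB head, with overlapping events in cycles 1 and 2.
lifetime = [
    {"not_issued"},
    {"not_issued", "memory"},
    {"memory", "issue_cancel"},
    {"memory"},
]
attributed = attribute_cycles(lifetime)
summed = naive_sum(lifetime)
```

Here `attributed` totals exactly 4 cycles (3 memory, 1 not-issued), matching the lifetime, while `summed` totals 6 event-cycles for the same 4-cycle window.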
not-issued, memory, and other resource-related categories.
over-counting when events overlap.
signals.