Bug
Both list-models.ts and compare-models.ts claim to sort by throughput, but use top_provider.max_completion_tokens as the sort key:
// list-models.ts line 29-31
} else if (sort === "throughput" || sort === "speed") {
models.sort((a: any, b: any) =>
(b.top_provider?.max_completion_tokens ?? 0) - (a.top_provider?.max_completion_tokens ?? 0)
max_completion_tokens is the maximum output length a model supports per request — a static capability limit. It has nothing to do with generation speed (tokens/sec).
Real throughput (tokens/sec, p50/p75/p90/p99) is only available from get-endpoints.ts, which calls GET /models/{id}/endpoints and gets live performance data.
Impact
An agent asking "which models have the highest throughput?" or "find the fastest model" will sort by max_completion_tokens and return models with the largest context windows — not the fastest ones. The results are silently wrong with no error or warning.
Fix
Options in order of preference:
- Remove these sort modes from
list-models.ts / compare-models.ts and direct agents to get-endpoints.ts for throughput data
- Rename them to something accurate (
--sort output-limit) and update the SKILL.md decision tree
- Remove the sort flags entirely from list/compare since throughput can only be compared per-provider, not per-model
Reviewed by Perry
Bug
Both
list-models.tsandcompare-models.tsclaim to sort by throughput, but usetop_provider.max_completion_tokensas the sort key:max_completion_tokensis the maximum output length a model supports per request — a static capability limit. It has nothing to do with generation speed (tokens/sec).Real throughput (tokens/sec, p50/p75/p90/p99) is only available from
get-endpoints.ts, which callsGET /models/{id}/endpointsand gets live performance data.Impact
An agent asking "which models have the highest throughput?" or "find the fastest model" will sort by
max_completion_tokensand return models with the largest context windows — not the fastest ones. The results are silently wrong with no error or warning.Fix
Options in order of preference:
list-models.ts/compare-models.tsand direct agents toget-endpoints.tsfor throughput data--sort output-limit) and update the SKILL.md decision treeReviewed by Perry