bug: --sort throughput/speed in list-models.ts and compare-models.ts sorts by max_completion_tokens, not real throughput #13

Description

@perry-the-pr-reviewer

Bug

Both list-models.ts and compare-models.ts claim to sort by throughput, but use top_provider.max_completion_tokens as the sort key:

// list-models.ts line 29-31
} else if (sort === "throughput" || sort === "speed") {
  models.sort((a: any, b: any) =>
    (b.top_provider?.max_completion_tokens ?? 0) - (a.top_provider?.max_completion_tokens ?? 0)
  );

max_completion_tokens is the maximum output length a model supports per request — a static capability limit. It has nothing to do with generation speed (tokens/sec).

Real throughput (tokens/sec, p50/p75/p90/p99) is only available from get-endpoints.ts, which calls GET /models/{id}/endpoints and gets live performance data.

Impact

An agent asking "which models have the highest throughput?" or "find the fastest model" gets results sorted by max_completion_tokens, so it is handed the models with the largest output-length limits, not the fastest ones. The results are silently wrong, with no error or warning.
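
To make the failure mode concrete, here is a minimal sketch with made-up numbers. The model ids, the `tokensPerSec` field, and every value are hypothetical placeholders, not real API output; only `top_provider.max_completion_tokens` and the comparator come from the code quoted above.

```typescript
// Hypothetical row shape: only top_provider.max_completion_tokens is taken
// from the issue. tokensPerSec stands in for whatever get-endpoints.ts reports.
interface ModelRow {
  id: string;
  top_provider?: { max_completion_tokens?: number | null };
  tokensPerSec: number; // assumed measured throughput, illustrative only
}

// Made-up data, purely for illustration.
const rows: ModelRow[] = [
  { id: "model-a", top_provider: { max_completion_tokens: 32000 }, tokensPerSec: 40 },
  { id: "model-b", top_provider: { max_completion_tokens: 4096 }, tokensPerSec: 180 },
  { id: "model-c", top_provider: {}, tokensPerSec: 95 },
];

// What --sort throughput currently does: order by output-length limit.
function sortByOutputLimit(models: ModelRow[]): ModelRow[] {
  return [...models].sort(
    (a, b) =>
      (b.top_provider?.max_completion_tokens ?? 0) -
      (a.top_provider?.max_completion_tokens ?? 0),
  );
}

// What the flag name promises: order by measured tokens/sec.
function sortByThroughput(models: ModelRow[]): ModelRow[] {
  return [...models].sort((a, b) => b.tokensPerSec - a.tokensPerSec);
}

const byLimit = sortByOutputLimit(rows).map((m) => m.id);
const bySpeed = sortByThroughput(rows).map((m) => m.id);
console.log(byLimit.join(", ")); // model-a, model-b, model-c
console.log(bySpeed.join(", ")); // model-b, model-c, model-a
```

In this toy data the model ranked first by the current sort is the slowest of the three, which is exactly the kind of answer an agent would accept without noticing.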

Fix

Options in order of preference:

  1. Remove these sort modes from list-models.ts / compare-models.ts and direct agents to get-endpoints.ts for throughput data
  2. Rename them to something accurate (--sort output-limit) and update the SKILL.md decision tree
  3. Remove the sort flags entirely from list/compare since throughput can only be compared per-provider, not per-model
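
A rough sketch of how options 1 and 2 could look together. The helper names `rejectMisleadingSort` and `compareOutputLimit` and the `ModelEntry` type are hypothetical, and the actual flag handling in list-models.ts / compare-models.ts may be organized differently; the `output-limit` name is the one suggested in option 2.

```typescript
// Hypothetical helpers; not the scripts' actual structure.
interface ModelEntry {
  id: string;
  top_provider?: { max_completion_tokens?: number | null };
}

// Option 1: refuse the misleading modes instead of silently mis-sorting.
// Returns an error message for throughput/speed, or null for any other mode,
// which is left to the existing handling.
function rejectMisleadingSort(sort: string): string | null {
  if (sort === "throughput" || sort === "speed") {
    return (
      `--sort ${sort} is not available here because this script has no tokens/sec data; ` +
      "use get-endpoints.ts for throughput, or --sort output-limit for max_completion_tokens."
    );
  }
  return null;
}

// Option 2: the same comparator as today, under a name that says what it does.
function compareOutputLimit(a: ModelEntry, b: ModelEntry): number {
  return (
    (b.top_provider?.max_completion_tokens ?? 0) -
    (a.top_provider?.max_completion_tokens ?? 0)
  );
}

// Usage: check the flag first, then sort only under the accurate name.
const requested = "speed";
const problem = rejectMisleadingSort(requested);
if (problem !== null) {
  console.error(problem);
}

const sample: ModelEntry[] = [
  { id: "small", top_provider: { max_completion_tokens: 4096 } },
  { id: "large", top_provider: { max_completion_tokens: 32000 } },
];
const ordered = [...sample].sort(compareOutputLimit).map((m) => m.id);
```

Returning a message rather than throwing keeps the decision about exit codes and output format with the calling script.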

Reviewed by Perry
