How Darkbloom exposes models to consumers and how to interpret the catalog.
Darkbloom does not hardcode a consumer-facing model list. Models are registered in the coordinator's DB-backed registry via POST /v1/admin/models/register, published to R2, and discovered by providers on heartbeat. The authoritative consumer catalog is always:
GET /v1/models
The provider-facing catalog is at:
GET /v1/models/catalog
Registry handlers: coordinator/api/model_registry_handlers.go. Model alias resolution: coordinator/api/model_alias_handlers.go.
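A minimal consumer-side sketch of reading the catalog. Only the `GET /v1/models` path and the `data[].id` shape come from this document; the base URL you pass in, the bearer-token header, and the helper names are assumptions for illustration.

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional


def build_models_request(base_url: str, api_key: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for the consumer catalog (/v1/models)."""
    headers = {"Accept": "application/json"}
    if api_key:
        # Assumption: bearer-token auth; the source does not document the scheme.
        headers["Authorization"] = f"Bearer {api_key}"
    url = base_url.rstrip("/") + "/v1/models"
    return urllib.request.Request(url, headers=headers, method="GET")


def model_ids(catalog: Dict[str, Any]) -> List[str]:
    """Return the public aliases listed in a /v1/models response body."""
    return [entry["id"] for entry in catalog.get("data", [])]


def fetch_model_ids(base_url: str, api_key: Optional[str] = None) -> List[str]:
    """Fetch the catalog over HTTP and return its public aliases."""
    request = build_models_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return model_ids(json.load(response))
```

Because the catalog is dynamic, a client would typically call something like `fetch_model_ids` at startup or on a timer rather than shipping a fixed model list.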
A public alias such as gemma-4-26b can resolve to different concrete builds over time (for example, mlx-community/gemma-4-26b-a4b-it-fp8 today and a quantized 4-bit build tomorrow). Consumers call only the alias. The coordinator resolves the alias to a concrete build for routing and billing, then echoes the public alias back in the response so consumers never see the underlying build ID (coordinator/api/consumer.go:1350-1357).
/v1/models hides concrete build IDs and shows only public aliases (coordinator/api/model_alias_handlers_test.go:956-962).
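The alias behavior can be illustrated with a toy resolver. This is not the coordinator's implementation (that lives in coordinator/api/model_alias_handlers.go); the in-memory `ALIASES` table, `RoutedRequest`, and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical alias table; in Darkbloom this mapping lives in the coordinator's registry.
ALIASES: Dict[str, str] = {
    "gemma-4-26b": "mlx-community/gemma-4-26b-a4b-it-fp8",
}


@dataclass
class RoutedRequest:
    public_model: str    # what the consumer sent and will see echoed back
    concrete_build: str  # used internally for routing and billing only


def resolve(alias: str, aliases: Dict[str, str] = ALIASES) -> RoutedRequest:
    """Resolve a public alias to a concrete build, keeping the alias for the response."""
    try:
        return RoutedRequest(public_model=alias, concrete_build=aliases[alias])
    except KeyError:
        raise ValueError(f"unknown model alias: {alias}") from None


def response_model_field(routed: RoutedRequest) -> str:
    """The value echoed in the response: always the public alias, never the build ID."""
    return routed.public_model
```

Repointing `ALIASES["gemma-4-26b"]` at a different build changes routing and billing in this sketch without changing anything the consumer sends or receives.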
GET /v1/models returns an OpenRouter-compatible model list with a Darkbloom metadata block:
{
"object": "list",
"data": [
{
"id": "gemma-4-26b",
"object": "model",
"created": 1699999999,
"owned_by": "darkbloom",
"name": "Gemma 4 26B",
"quantization": "int8",
"context_length": 8192,
"max_output_length": 4096,
"pricing": {
"prompt": "0.00000003",
"completion": "0.000000165",
"image": "0",
"request": "0",
"input_cache_read": "0"
},
"supported_sampling_parameters": [
"temperature", "top_p", "top_k",
"frequency_penalty", "presence_penalty", "repetition_penalty",
"stop", "seed", "max_tokens"
],
"supported_features": ["tools", "json_mode", "structured_outputs", "logprobs"],
"metadata": {
"model_type": "text",
"provider_count": 12,
"attested_providers": 10,
"trust_level": "attested",
"routable_providers": 8,
"warm_providers": 5,
"can_accept": true
}
}
]
}
Types: coordinator/api/types/types.go:106-171.
| Field | Meaning |
|---|---|
| model_type | text, embedding, etc. |
| provider_count | Providers advertising this model |
| attested_providers | Providers that passed attestation |
| trust_level | Aggregate trust level (e.g., attested, self_signed) |
| routable_providers | Providers currently eligible to receive requests |
| warm_providers | Providers with the model already loaded |
| can_accept | Whether the fleet can accept a request right now |

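
As a sketch of how a client might use the metadata block, the function below keeps only models whose `can_accept` is true and orders them by `warm_providers`. Treating a higher warm count as a proxy for faster first response is an assumption, and `ready_models` is a hypothetical helper, not part of any Darkbloom SDK.

```python
from typing import Any, Dict, List


def ready_models(catalog: Dict[str, Any], min_warm: int = 1) -> List[str]:
    """Return IDs of models that can accept a request now, warmest first."""
    candidates = []
    for entry in catalog.get("data", []):
        meta = entry.get("metadata", {})
        warm = meta.get("warm_providers", 0)
        if meta.get("can_accept") and warm >= min_warm:
            candidates.append((warm, entry["id"]))
    # Sort by warm provider count, descending; ties are broken by ID for determinism.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [model_id for _, model_id in candidates]
```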
Capabilities are stored in the registry and translated into the OpenRouter feature vocabulary:
| Registry capability | OpenRouter feature |
|---|---|
| tools, tool_use, function_calling | tools |
| json, json_mode, json_schema | json_mode / structured_outputs |
| logprobs | logprobs |
| reasoning, thinking | reasoning |
| vision, image, multimodal | Adds image to input modalities |

Feature mapping: coordinator/api/openrouter_models.go:104-147. Modality derivation: coordinator/api/openrouter_models.go:68-102.
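The translation can be sketched as a lookup table. Two things here are assumptions rather than facts from the source: that each JSON-related capability enables both `json_mode` and `structured_outputs` (the table does not say how they split), and that `text` is always present as an input modality. The authoritative logic is in coordinator/api/openrouter_models.go.

```python
from typing import Dict, Iterable, List, Tuple

# Assumption: every JSON-related capability yields both JSON features.
_JSON_FEATURES: Tuple[str, ...] = ("json_mode", "structured_outputs")

FEATURE_MAP: Dict[str, Tuple[str, ...]] = {
    "tools": ("tools",),
    "tool_use": ("tools",),
    "function_calling": ("tools",),
    "json": _JSON_FEATURES,
    "json_mode": _JSON_FEATURES,
    "json_schema": _JSON_FEATURES,
    "logprobs": ("logprobs",),
    "reasoning": ("reasoning",),
    "thinking": ("reasoning",),
}

IMAGE_CAPABILITIES = {"vision", "image", "multimodal"}


def translate(capabilities: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Map registry capabilities to (supported_features, input_modalities)."""
    features: List[str] = []
    modalities: List[str] = ["text"]  # assumption: text input is always available
    for capability in capabilities:
        for feature in FEATURE_MAP.get(capability, ()):
            if feature not in features:  # keep first-seen order, no duplicates
                features.append(feature)
        if capability in IMAGE_CAPABILITIES and "image" not in modalities:
            modalities.append("image")
    return features, modalities
```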
Because the catalog is dynamic, treat these as examples based on registry capabilities rather than guarantees:
| Use Case | What to Look For |
|---|---|
| General chat assistant | Text model with high warm_providers |
| Code generation | Model advertising reasoning or tools |
| Structured data extraction / JSON mode | json_mode or structured_outputs in supported_features |
| Multimodal (image + text) | image in input_modalities |
| Cost-sensitive high volume | Lower pricing.prompt and pricing.completion |
| Long context | High context_length |
Model prices come from the platform price table. GET /v1/pricing returns the current values and the fallback rates that apply when a model has no platform price. See billing.md.
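A rough cost estimate can be derived from the `pricing` block of a catalog entry. The sketch below assumes the OpenRouter convention that `prompt` and `completion` are per-token prices and `request` is a flat per-request fee; this document does not state the units, so confirm them against billing.md before relying on the numbers. `estimate_cost` is a hypothetical helper.

```python
from decimal import Decimal
from typing import Dict


def estimate_cost(pricing: Dict[str, str], prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the cost of one request from a catalog entry's pricing block."""
    # Assumption: prompt/completion are per-token prices, request is a flat fee.
    prompt_cost = Decimal(pricing["prompt"]) * prompt_tokens
    completion_cost = Decimal(pricing["completion"]) * completion_tokens
    request_fee = Decimal(pricing.get("request", "0"))
    return prompt_cost + completion_cost + request_fee
```

With the example entry's prices (`"0.00000003"` prompt, `"0.000000165"` completion), 1,000 prompt tokens and 200 completion tokens work out to 0.00003 + 0.000033 = 0.000063 under that assumption.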
Memory and chip requirements are a provider-side concern. When loading a model, the provider CLI reserves the model's weight footprint plus a small one-request headroom (ModelLoadAdmission.defaultLoadHeadroomGb = 2.0 GB, provider-swift/Sources/ProviderCore/Inference/ModelLoadAdmission.swift). A model with ~28 GB of weights therefore needs roughly 28 + 2 + provider.memory_reserve_gb GB of usable RAM. Model weights are cached under ~/.cache/huggingface/hub (provider-swift/Sources/ProviderCoreFoundation/ModelScanner.swift). Consumers do not need to manage this; the coordinator's capacity check rejects requests that no provider can fit.
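The arithmetic above can be written out as a short check. This is an illustration in Python, not the Swift admission logic itself; the 2.0 GB default mirrors `ModelLoadAdmission.defaultLoadHeadroomGb` from the text, while the function names and the reserve value used in the usage note are hypothetical.

```python
DEFAULT_LOAD_HEADROOM_GB = 2.0  # mirrors ModelLoadAdmission.defaultLoadHeadroomGb


def required_ram_gb(
    weights_gb: float,
    memory_reserve_gb: float,
    headroom_gb: float = DEFAULT_LOAD_HEADROOM_GB,
) -> float:
    """Approximate usable RAM needed: weights + load headroom + provider reserve."""
    return weights_gb + headroom_gb + memory_reserve_gb


def fits(usable_ram_gb: float, weights_gb: float, memory_reserve_gb: float) -> bool:
    """Whether a machine with the given usable RAM could load the model."""
    return usable_ram_gb >= required_ram_gb(weights_gb, memory_reserve_gb)
```

With a hypothetical `provider.memory_reserve_gb` of 4, a 28 GB model needs about 28 + 2 + 4 = 34 GB, so `fits(36, 28, 4)` is true and `fits(32, 28, 4)` is false.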
A model can be staged for deprecation via registry metadata. Deprecated models are removed from GET /v1/models but continue to serve existing requests until the deprecation date. Providers auto-evict after a grace period. The deprecation date is read from metadata.deprecation_date (coordinator/api/openrouter_models.go:284-291).
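A sketch of the date check implied above. The source names the field (`metadata.deprecation_date`) but not its format, so the ISO 8601 parsing here is an assumption, as are the helper names and the choice to treat timezone-less values as UTC.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def deprecation_date(metadata: Dict[str, Any]) -> Optional[datetime]:
    """Parse metadata.deprecation_date, or return None if the model is not staged."""
    raw = metadata.get("deprecation_date")
    if not raw:
        return None
    # Assumption: the value is an ISO 8601 date or datetime string.
    parsed = datetime.fromisoformat(raw)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive values are UTC
    return parsed


def still_serving(metadata: Dict[str, Any], now: datetime) -> bool:
    """True while the model has no deprecation date or the date has not yet passed."""
    cutoff = deprecation_date(metadata)
    return cutoff is None or now < cutoff
```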