
feat(sync): add Hugging Face provider sync #2678

Open
hanouticelina wants to merge 1 commit into anomalyco:dev from hanouticelina:sync/huggingface-inference-providers
hanouticelina commented Jun 18, 2026

Disclaimer: This PR was implemented by Claude Code and tested and reviewed locally by me.

Summary

  • Add Hugging Face to the shared SyncProvider registry and the hourly sync matrix.
  • Sync model pricing, context, modalities, and advertised capabilities from the HF Inference Providers /v1/models endpoint, resolving each model through base_model metadata in /models.
  • Hugging Face Inference Providers is an aggregator (many providers per model, with requests routed to the fastest), so pricing and context are taken from the highest-throughput provider, and tool/structured-output support is taken from any provider.
  • Create-only for now: existing curated TOMLs are left untouched.
  • Add huggingface:sync.

base_model behavior

Each router model is matched to a canonical model already present in /models (via the shared resolveCanonicalBaseModel/factorBaseModel). When a match is found, the generated .toml contains only the fields unique to the router (cost, context, capabilities) and inherits the rest; models with no canonical match or no priced provider are skipped and reported in a notice. HF_TOKEN is optional - the model list is public.
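As a rough illustration of the "minimal entry" rule above, here is a sketch that renders only the router-specific fields and returns null for models that would be skipped. The names RouterFields and renderMinimalToml are hypothetical and are not taken from the PR. The real code goes through resolveCanonicalBaseModel/factorBaseModel, whose signatures are not shown here, so this sketch simply takes the already-resolved base model as an argument.

```typescript
// Hypothetical shape for the router-specific fields of one model.
interface RouterFields {
  cost?: { input: number; output: number }; // undefined when no provider is priced
  context: number;
}

// Returns TOML text for a new entry, or null when the model should be skipped
// (no canonical match, or no priced provider).
function renderMinimalToml(
  baseModel: string | null,
  fields: RouterFields,
): string | null {
  if (baseModel === null || fields.cost === undefined) {
    return null;
  }
  // Only fields unique to the router are written; everything else is
  // assumed to be inherited from base_model.
  return [
    `base_model = "${baseModel}"`,
    "",
    "[cost]",
    `input = ${fields.cost.input}`,
    `output = ${fields.cost.output}`,
    "",
    "[limit]",
    `context = ${fields.context}`,
    "",
  ].join("\n");
}
```

A caller would collect the null results and report them in the notice rather than writing a file for them.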

Verification

  • unauthenticated GET https://router.huggingface.co/v1/models returned 126 models
  • dry run: 19 created, 0 updated, 3 removed (2 embeddings + MiMo-V2-Flash), 23 unchanged
  • bun validate passed and the core typecheck (tsc --noEmit -p packages/core) is clean

$ bun models:sync huggingface --dry-run

Syncing Hugging Face...
Would create zai-org/GLM-5.2.toml
Would create MiniMaxAI/MiniMax-M3.toml
Would create moonshotai/Kimi-K2.7-Code.toml
Would create Qwen/Qwen3.6-35B-A3B.toml
...
Would remove XiaomiMiMo/MiMo-V2-Flash.toml
Would remove Qwen/Qwen3-Embedding-4B.toml
Would remove Qwen/Qwen3-Embedding-8B.toml
Dry run: 19 created, 0 updated, 3 removed, 23 unchanged

Sync summary
Hugging Face: 19 created, 0 updated, 3 deleted

A created entry stays minimal - the rest is inherited from its base_model:

base_model = "zhipuai/glm-5.2"
open_weights = true

[cost]
input = 1.4
output = 4.4

[limit]
context = 262_144

Mirror the existing daily model-catalog sync for the Hugging Face
Inference Providers router (https://router.huggingface.co/v1/models),
modeled on the baseten provider.

The router is an aggregator: each model is served by several inference
providers with their own pricing, context window, and capabilities, and
requests are routed to the fastest one. The provider collapses them into
the route a request would actually take -- pricing and context from the
highest-throughput provider, with tool/structured-output support taken
from any provider since a caller can pin a slower one.
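The collapse rule described above could look roughly like the following. This is a sketch under stated assumptions, not the PR's code: ProviderEntry, CollapsedModel, collapseProviders, and every field name are hypothetical, and it assumes the "fastest" provider is chosen among priced providers only.

```typescript
// Hypothetical per-provider record for one router model.
interface ProviderEntry {
  name: string;
  throughput?: number; // assumed to be comparable across providers
  pricing?: { input: number; output: number };
  contextLength?: number;
  supportsTools?: boolean;
  supportsStructuredOutput?: boolean;
}

interface CollapsedModel {
  pricing: { input: number; output: number };
  contextLength?: number;
  supportsTools: boolean;
  supportsStructuredOutput: boolean;
}

function collapseProviders(providers: ProviderEntry[]): CollapsedModel | null {
  // Assumption: a model with no priced provider is skipped entirely.
  const priced = providers.filter((p) => p.pricing !== undefined);
  if (priced.length === 0) {
    return null;
  }
  // Pricing and context come from the highest-throughput priced provider.
  const fastest = priced.reduce((best, p) =>
    (p.throughput ?? 0) > (best.throughput ?? 0) ? p : best,
  );
  return {
    pricing: fastest.pricing!,
    contextLength: fastest.contextLength,
    // Capabilities are a union over all providers, since a caller can pin
    // a slower provider that supports them.
    supportsTools: providers.some((p) => p.supportsTools === true),
    supportsStructuredOutput: providers.some(
      (p) => p.supportsStructuredOutput === true,
    ),
  };
}
```

Taking capabilities from any provider while taking cost from the fastest one means the advertised price may not match a pinned provider; that trade-off follows from the description above.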

New models are created via canonical base_model resolution (the same
resolveCanonicalBaseModel/factorBaseModel path baseten uses); unmappable
or unpriced models are skipped and reported in a notice. For now the sync
only creates new models -- existing curated TOMLs are left untouched via
sameModel -- and never deletes (deleteMissing: false).
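To make the create-only behavior concrete, here is a small planning sketch. It is illustrative only: RemoteModel, SyncPlan, and planCreateOnly are hypothetical names, and the way the real sync uses sameModel and deleteMissing is not shown in this description.

```typescript
// Hypothetical input: a router model id plus its resolved base model
// (null when no canonical match or no priced provider was found).
interface RemoteModel {
  id: string;
  baseModel: string | null;
}

interface SyncPlan {
  create: string[];
  unchanged: string[];
  skipped: string[];
}

function planCreateOnly(remote: RemoteModel[], existingIds: string[]): SyncPlan {
  const plan: SyncPlan = { create: [], unchanged: [], skipped: [] };
  for (const model of remote) {
    if (existingIds.indexOf(model.id) !== -1) {
      // Existing curated entries are never rewritten.
      plan.unchanged.push(model.id);
    } else if (model.baseModel === null) {
      // Unmappable or unpriced: reported in a notice, no file written.
      plan.skipped.push(model.id);
    } else {
      plan.create.push(model.id);
    }
  }
  // Existing ids that are absent from the remote list are deliberately
  // ignored: this sketch has no delete step at all.
  return plan;
}
```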

HF_TOKEN is optional; the router model list is public.
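A minimal sketch of treating the token as optional when building the request headers follows. The function name routerRequestHeaders is hypothetical. The Bearer scheme is how Hugging Face tokens are normally sent, but whether the PR sends the token this way is an assumption.

```typescript
// Builds headers for the public model-list request; the Authorization
// header is added only when a token is actually provided.
function routerRequestHeaders(token: string | undefined): Record<string, string> {
  const headers: Record<string, string> = { Accept: "application/json" };
  if (token !== undefined && token !== "") {
    headers["Authorization"] = `Bearer ${token}`;
  }
  return headers;
}
```

The result can be passed as the headers option of a fetch call to https://router.huggingface.co/v1/models, with the token read from the HF_TOKEN environment variable.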

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01PzQSYd3VwBK5NAsC9dYmSw