cohere-ai/vllm-skills

vllm-skills

AI agent skills for keeping a long-lived vLLM fork in sync with upstream — automated rebase, conflict resolution, and test-driven verification.

Five composable skills that an AI coding agent reads and executes interactively to:

  • detect when a new upstream vLLM release is available,
  • rebase the fork's custom commits onto it,
  • resolve conflicts using upstream diff context,
  • and iterate on user-defined checks (tests, benchmarks, evals) until the fork is back to a healthy state.

For the design rationale and a worked case study (Cohere's transcription model on v0.19.1), see docs/auto-fork-maintenance.md. To reproduce that example end-to-end, follow docs/reproduce-cohere-transcribe-v0.19.1.md.

Compatibility

Each skill is a SKILL.md markdown file with YAML frontmatter (name, description), following the Agent Skills format used by Cursor. The skills only assume access to a shell, the file system, and git, so they should also work with other coding agents that can read and execute the same Markdown-based instructions.
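As a rough illustration of that layout, the sketch below reads the two frontmatter fields from a SKILL.md-style string. It assumes flat `key: value` pairs between `---` delimiters and is not a full YAML parser; `read_frontmatter` and the example description text are hypothetical and do not come from this repository.

```python
def read_frontmatter(text: str) -> dict[str, str]:
    """Return top-level `key: value` pairs from a leading `---` block.

    Simplified on purpose: nested or multi-line YAML values are ignored.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter at the top of the file
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the frontmatter
            return fields
        key, sep, value = line.partition(":")
        # Only accept unindented `key: value` lines.
        if sep and key.strip() and not line[0].isspace():
            fields[key.strip()] = value.strip().strip("'\"")
    return {}  # closing delimiter never found: treat as malformed


# Hypothetical skill file content, for illustration only.
example = """---
name: detect-upstream-base
description: Find the upstream tag the fork is based on
---
Instructions for the agent go here.
"""

meta = read_frontmatter(example)
print(meta["name"])
```

An agent that does not support the format natively could use something like this to list the available skills before reading the Markdown body as instructions.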

The rules/skill-edit-checklist.mdc file is a Cursor Rule and is optional.

Skills

Skill                 Role in the loop       What it does
install-vllm          Environment setup      Creates a uv virtualenv, installs vLLM in editable mode with the correct precompiled CUDA wheel
local-test-runner     Measurement            Runs Buildkite CI-equivalent tests locally on NVIDIA GPUs; parses .buildkite/test_areas/*.yaml, manages Hugging Face tokens, captures logs
detect-upstream-base  Disturbance detection  Finds the upstream tag (v1) the fork is currently based on via git merge-base + git describe
rebase-assistant      Controller             Rebases custom commits from v1 onto v2, resolves conflicts using upstream diffs, verifies with local-test-runner
auto-rebase           Orchestrator           Checks for new upstream releases via gh, invokes detect-upstream-base and rebase-assistant end-to-end
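The detection step can be pictured with a short sketch built on the two git commands named in the table. The exact flags the skill uses are not stated here, so the ref name `upstream/main`, the `--tags --abbrev=0` options, and the helper names `detect_upstream_base`, `parse_tag`, and `needs_rebase` are assumptions for illustration. It also assumes release tags are reachable from the merge base and look like `v0.19.1`.

```python
import subprocess
from typing import Callable

Runner = Callable[[list[str]], str]


def git(args: list[str]) -> str:
    """Run a git command in the current directory and return stripped stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def detect_upstream_base(upstream_ref: str = "upstream/main", run: Runner = git) -> str:
    """Return the nearest upstream tag at or before the fork point (v1)."""
    # Most recent commit shared by the fork's HEAD and the upstream branch.
    fork_point = run(["merge-base", "HEAD", upstream_ref])
    # Nearest tag reachable from that commit, without the -N-g<sha> suffix.
    return run(["describe", "--tags", "--abbrev=0", fork_point])


def parse_tag(tag: str) -> tuple[int, ...]:
    """Turn 'v0.19.1' into (0, 19, 1); assumes plain numeric release tags."""
    return tuple(int(part) for part in tag.removeprefix("v").split("."))


def needs_rebase(v1: str, v2: str) -> bool:
    """True when the latest release v2 is newer than the current base v1."""
    return parse_tag(v2) > parse_tag(v1)
```

Passing the runner as a parameter keeps the logic testable without a real checkout. Comparing parsed tuples rather than raw strings avoids ordering a two-digit minor version before a one-digit one.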

See skills/README.md for the dependency graph, shared notation (v1/v2/b1/b2), and the change-impact table contributors should follow when editing skills.

Quick start

In an agent session inside your vLLM fork checkout:

/auto-rebase sync the current branch with the latest upstream release and
make sure tests/entrypoints/openai/correctness/test_transcription_api_correctness.py passes

The agent will:

  1. detect the current upstream base tag (v1) and the latest release (v2),
  2. confirm with you before rebasing,
  3. verify the checks pass on the pre-rebase branch as a baseline,
  4. rebase the custom commits onto v2 and resolve conflicts,
  5. iterate (fix, re-run checks, repeat) until everything passes,
  6. summarize what changed and offer to push.
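Steps 3 to 5 above amount to a small feedback loop, sketched here with plain callables standing in for the agent's actions. The function `sync_fork`, its parameters, and the `max_rounds` cap are hypothetical. The skills are Markdown instructions rather than Python, and the source does not state a retry limit.

```python
from typing import Callable


def sync_fork(
    run_checks: Callable[[], bool],
    rebase: Callable[[], None],
    attempt_fix: Callable[[], None],
    max_rounds: int = 5,
) -> bool:
    """Return True once the checks pass on the rebased branch."""
    # Baseline: if the checks already fail before the rebase, a later
    # failure cannot be attributed to the rebase.
    if not run_checks():
        raise RuntimeError("checks fail on the pre-rebase branch; fix the baseline first")

    rebase()  # replay the custom commits onto v2, resolving conflicts

    for _ in range(max_rounds):
        if run_checks():
            return True
        attempt_fix()  # one repair attempt, then measure again
    # Final measurement after the last repair attempt.
    return run_checks()
```

Checking the baseline first is what makes the loop meaningful: it separates regressions introduced by the new upstream release from failures that were already present in the fork.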

Prerequisites

Each skill checks its own prereqs at runtime, but at a minimum you'll need:

Tool                 Used by                                          Install
uv                   install-vllm                                     curl -LsSf https://astral.sh/uv/install.sh | sh
gh (authenticated)   auto-rebase                                      gh auth login
Hugging Face token   local-test-runner (for tests that pull weights)  hf auth login
upstream git remote  detect-upstream-base, rebase-assistant           git remote add upstream git@github.com:vllm-project/vllm.git
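To check these prerequisites yourself before starting a session, something like the following sketch may help. It only tests that the executables are on PATH and that a remote named `upstream` appears in the output of `git remote`. It does not verify that gh or the Hugging Face CLI are actually logged in, and the helper names are hypothetical.

```python
import shutil
from typing import Callable, Optional, Sequence


def missing_tools(
    tools: Sequence[str] = ("uv", "gh", "hf", "git"),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def has_remote(name: str, remotes_output: str) -> bool:
    """Check the output of `git remote` (one remote name per line) for `name`."""
    return name in remotes_output.split()


if __name__ == "__main__":
    import subprocess

    print("missing tools:", missing_tools())
    try:
        remotes = subprocess.run(
            ["git", "remote"], check=True, capture_output=True, text=True
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        remotes = ""  # git not installed, or not inside a git repository
    print("upstream remote configured:", has_remote("upstream", remotes))
```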

Beyond vLLM

The skills are vLLM-specific, but the underlying pattern — detect the disturbance, measure the gap, iterate until error → 0 — generalizes to any long-lived fork with a measurable definition of "working" (a test, a benchmark, an eval). The same loop has been applied to other long-lived forks at Cohere, including a Hugging Face transformers fork. For the full framing, see docs/auto-fork-maintenance.md.

License

Apache 2.0 — see LICENSE.
