AI agent skills for keeping a long-lived vLLM fork in sync with upstream — automated rebase, conflict resolution, and test-driven verification.
Five composable skills that an AI coding agent reads and executes interactively to:
- detect when a new upstream vLLM release is available,
- rebase the fork's custom commits onto it,
- resolve conflicts using upstream diff context,
- and iterate on user-defined checks (tests, benchmarks, evals) until the fork is back to a healthy state.
For the design rationale and a worked case study (Cohere's transcription model on v0.19.1), see docs/auto-fork-maintenance.md. To reproduce that example end-to-end, follow docs/reproduce-cohere-transcribe-v0.19.1.md.
Each skill is a SKILL.md markdown file with YAML frontmatter (name, description), following the Agent Skills format used by Cursor. The skills only assume access to a shell, the file system, and git, so they should also work with other coding agents that can read and execute the same Markdown-based instructions.
The rules/skill-edit-checklist.mdc file is a Cursor Rule and is optional.
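As a rough illustration of the `SKILL.md` format, the sketch below reads a single-line frontmatter field with `awk`. The helper name `skill_field` and the example path in the comment are hypothetical, and the parser only handles plain `key: value` lines between the first pair of `---` markers, not full YAML.

```shell
# skill_field FILE KEY
# Print the value of a single-line "KEY: value" entry from the YAML
# frontmatter of FILE. Prints nothing if the key is absent.
# Hypothetical usage: skill_field skills/install-vllm/SKILL.md name
skill_field() {
  awk -v key="$2" '
    # Count the "---" delimiter lines; frontmatter sits between the first two.
    /^---[ \t]*$/ { fences++; next }
    # Inside the frontmatter, match lines that start with "KEY:".
    fences == 1 && index($0, key ":") == 1 {
      value = substr($0, length(key) + 2)
      sub(/^[ \t]+/, "", value)
      print value
      exit
    }
    # Stop once the frontmatter has been closed.
    fences >= 2 { exit }
  ' "$1"
}
```

Quoted or multi-line YAML values would need a real YAML parser; this is only meant to show where `name` and `description` live in the file.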
| Skill | Role in the loop | What it does |
|---|---|---|
| `install-vllm` | Environment setup | Creates a `uv` virtualenv, installs vLLM in editable mode with the correct precompiled CUDA wheel |
| `local-test-runner` | Measurement | Runs Buildkite CI-equivalent tests locally on NVIDIA GPUs; parses `.buildkite/test_areas/*.yaml`, manages Hugging Face tokens, captures logs |
| `detect-upstream-base` | Disturbance detection | Finds the upstream tag (`v1`) the fork is currently based on via `git merge-base` + `git describe` |
| `rebase-assistant` | Controller | Rebases custom commits from `v1` onto `v2`, resolves conflicts using upstream diffs, verifies with `local-test-runner` |
| `auto-rebase` | Orchestrator | Checks for new upstream releases via `gh`, invokes `detect-upstream-base` and `rebase-assistant` end-to-end |
See skills/README.md for the dependency graph, shared notation (v1/v2/b1/b2), and the change-impact table contributors should follow when editing skills.
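The detection step can be approximated by hand. The sketch below is not the skills' actual implementation: it assumes a remote named `upstream` whose default branch is `main`, release tags fetched locally, and a `sort` that supports `-V`. The function names `detect_upstream_base` and `is_newer_release` are hypothetical.

```shell
# detect_upstream_base [BRANCH] [UPSTREAM_REF]
# Print the nearest tag reachable from the commit where BRANCH forked
# from UPSTREAM_REF. Defaults are assumptions, not skill settings.
detect_upstream_base() {
  fork_branch="${1:-HEAD}"
  upstream_ref="${2:-upstream/main}"
  # Last commit shared by the fork branch and upstream.
  fork_point="$(git merge-base "$fork_branch" "$upstream_ref")" || return 1
  # Closest tag at or before that commit (lightweight tags included).
  git describe --tags --abbrev=0 "$fork_point"
}

# is_newer_release CURRENT LATEST
# Succeed only when LATEST sorts strictly after CURRENT as a version.
is_newer_release() {
  current_tag="$1"
  latest_tag="$2"
  [ "$current_tag" != "$latest_tag" ] &&
    [ "$(printf '%s\n' "$current_tag" "$latest_tag" | sort -V | tail -n 1)" = "$latest_tag" ]
}
```

To get the latest release tag for the comparison, a command such as `gh release view --repo vllm-project/vllm --json tagName --jq .tagName` should work with an authenticated `gh`. If release tags live on release branches rather than on `main`, the merge-base approach may report an older tag than expected.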
In an agent session inside your vLLM fork checkout:
/auto-rebase sync the current branch with the latest upstream release and
make sure tests/entrypoints/openai/correctness/test_transcription_api_correctness.py passes
The agent will:
- detect the current upstream base tag (`v1`) and the latest release (`v2`),
- confirm with you before rebasing,
- verify the checks pass on the pre-rebase branch as a baseline,
- rebase the custom commits onto `v2` and resolve conflicts,
- iterate (fix, re-run checks, repeat) until everything passes,
- summarize what changed and offer to push.
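The rebase step at the core of this flow corresponds roughly to `git rebase --onto`. The sketch below assumes `v1` and `v2` are tags available locally. The function name and the backup-branch naming are hypothetical, and conflict resolution, which the agent handles interactively, is out of scope.

```shell
# rebase_custom_commits OLD_BASE NEW_BASE BRANCH
# Replay the commits in OLD_BASE..BRANCH on top of NEW_BASE.
rebase_custom_commits() {
  old_base="$1"
  new_base="$2"
  branch="$3"
  # Keep a backup ref so the pre-rebase state is easy to restore.
  git branch -f "backup/${branch}-on-${old_base}" "$branch" || return 1
  # Only the fork's own commits (those after OLD_BASE) are replayed.
  # On conflict, git stops with a non-zero exit and waits for resolution.
  git rebase --onto "$new_base" "$old_base" "$branch"
}

# Hypothetical usage:
#   rebase_custom_commits v0.19.0 v0.19.1 my-fork-branch
```

If the rebase goes wrong, `git rebase --abort` returns to the pre-rebase state, and the backup branch still points at the original commits.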
Each skill checks its own prereqs at runtime, but at a minimum you'll need:
| Tool | Used by | Install |
|---|---|---|
| `uv` | `install-vllm` | `curl -LsSf https://astral.sh/uv/install.sh \| sh` |
| `gh` (authenticated) | `auto-rebase` | `gh auth login` |
| Hugging Face token | `local-test-runner` (for tests that pull weights) | `hf auth login` |
| `upstream` git remote | `detect-upstream-base`, `rebase-assistant` | `git remote add upstream git@github.com:vllm-project/vllm.git` |
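A quick pre-flight check along these lines can catch missing tools before starting a session. This is a sketch, not what the skills themselves run; `require_tools` and `has_upstream_remote` are hypothetical helper names.

```shell
# require_tools TOOL...
# Print "missing: TOOL" for each tool not found on PATH.
# Returns non-zero if any tool is missing.
require_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      rc=1
    fi
  done
  return "$rc"
}

# has_upstream_remote
# Succeed if the current repository has a remote named "upstream".
has_upstream_remote() {
  git remote get-url upstream >/dev/null 2>&1
}

# Hypothetical usage, run from the fork checkout:
#   require_tools git uv gh && has_upstream_remote && gh auth status
```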
The skills are vLLM-specific, but the underlying pattern — detect the disturbance, measure the gap, iterate until error → 0 — generalizes to any long-lived fork with a measurable definition of "working" (a test, a benchmark, an eval). The same loop has been applied to other long-lived forks at Cohere, including a Hugging Face transformers fork. For the full framing, see docs/auto-fork-maintenance.md.
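Stripped of the vLLM specifics, that loop might look like the sketch below. It is illustrative only: `iterate_until_green` is a hypothetical name, and in practice the "fix" step is the agent editing code rather than a fixed command.

```shell
# iterate_until_green MAX_ATTEMPTS CHECK_CMD FIX_CMD
# Run CHECK_CMD; while it fails, run FIX_CMD and check again,
# giving up after MAX_ATTEMPTS checks.
iterate_until_green() {
  max_attempts="$1"
  check_cmd="$2"
  fix_cmd="$3"
  attempt=1
  while :; do
    # Measure: a zero exit status means the fork is healthy.
    if sh -c "$check_cmd"; then
      echo "checks passed on attempt $attempt"
      return 0
    fi
    # Stop once the attempt budget is used up.
    if [ "$attempt" -ge "$max_attempts" ]; then
      break
    fi
    # Act: apply a fix, then loop back and measure again.
    sh -c "$fix_cmd" || return 1
    attempt=$((attempt + 1))
  done
  echo "checks still failing after $max_attempts attempts" >&2
  return 1
}
```

Bounding the number of attempts keeps the loop from running indefinitely when a check cannot be made to pass automatically.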
Apache 2.0 — see LICENSE.