Nuru is a FastAPI backend for a Vtuber-style assistant. It exposes text generation, safety filtering, TTS, STT, vision, search, and Discord voice endpoints.
uv sync
uv run uvicorn main:app --host 0.0.0.0 --port 8000
Open http://localhost:8000/docs for the generated FastAPI docs.
The legacy notebooks are Colab references. Use the project script for repeatable local or remote training:
uv run python scripts/train_text_model.py --variant normal
The normal model saves checkpoints and final artifacts under:
model/checkpoints
model/logs
model/finals
Train the evil model variant with:
uv run python scripts/train_text_model.py --variant evil
That writes to:
evil_model/checkpoints
evil_model/logs
evil_model/finals
For a quick smoke run before a full job:
uv run python scripts/train_text_model.py `
--variant normal `
--max-train-samples 32 `
--max-eval-samples 8 `
--max-steps 1
The default dataset is SocialGrep/one-million-reddit-jokes, using title as
the prompt column and selftext as the answer column. Override these when using
a custom dataset:
uv run python scripts/train_text_model.py `
--dataset your-org/your-dataset `
--title-column prompt `
--body-column response
For local data, use JSON, JSONL, or CSV files:
uv run python scripts/train_text_model.py `
--train-file data/train.jsonl `
--validation-file data/validation.jsonl `
--title-column prompt `
--body-column response
At runtime, the text endpoints prefer local trained artifacts:
/v1/model loads model/finals when it contains a saved model.
/v1/evil_model loads evil_model/finals when it contains a saved model.
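The lookup order this README describes for the text endpoints (a local finals directory first, then a fallback model ID, with NURU_MODEL_DIR and NURU_MODEL_NAME as overrides) could be sketched roughly as follows. This is an illustration, not the backend's actual code: the helper names `has_saved_model` and `resolve_model_source` are hypothetical, and treating "contains a saved model" as "contains a config.json file" is an assumption based on the usual Hugging Face save layout.

```python
import os
from pathlib import Path


def has_saved_model(directory: Path) -> bool:
    # Assumption: a saved model directory is recognised by its config.json file.
    return (directory / "config.json").is_file()


def resolve_model_source(dir_var, name_var, default_dir, default_name="gpt2", env=None):
    """Return a local model directory if one is usable, else a fallback model ID."""
    env = os.environ if env is None else env
    # The directory override (for example NURU_MODEL_DIR) wins over the default path.
    directory = Path(env.get(dir_var, default_dir))
    if has_saved_model(directory):
        return str(directory)
    # No local artifacts: use the fallback ID (for example NURU_MODEL_NAME, else gpt2).
    return env.get(name_var, default_name)


# Hypothetical usage for the two text endpoints.
print(resolve_model_source("NURU_MODEL_DIR", "NURU_MODEL_NAME", "model/finals"))
print(resolve_model_source("NURU_EVIL_MODEL_DIR", "NURU_EVIL_MODEL_NAME", "evil_model/finals"))
```

Passing `env` explicitly makes the lookup easy to exercise in tests without touching the real process environment.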
If no local final model exists, the backend falls back to gpt2 for a faster
default startup. You can override runtime model locations:
$env:NURU_MODEL_DIR = "D:\models\nuru"
$env:NURU_EVIL_MODEL_DIR = "D:\models\nuru-evil"
Or override fallback model IDs:
$env:NURU_MODEL_NAME = "gpt2"
$env:NURU_EVIL_MODEL_NAME = "gpt2"
POST /v1/model/stream: stream text generation as Server-Sent Events.
POST /v1/vtuber/respond: accept text, image_base64, or audio_base64 and return a reply, TTS audio, emotion state, memory matches, and expression.
POST /v1/memory and POST /v1/memory/search: store and retrieve long-term memories from the persistent vector store.
GET /v1/emotion and POST /v1/emotion: inspect or adjust mood, energy, and interest.
POST /v1/chat/{platform}/message: ingest Twitch, YouTube, or other chat messages through a shared abstraction.
POST /v1/idle/tick: generate idle talk when inactivity or low chat activity is detected.
POST /v1/model/stream/abort/{abort_token}: cancel an active SSE generation.
GET /v1/working-memory: inspect the rolling short-term context window.
POST /v1/reflection/tick and GET /v1/reflection: run or inspect private memory reflection summaries.
GET /v1/tools/schema and POST /v1/tools/validate: validate model-requested actions such as expression changes, stickers, sounds, and commands.
POST /v1/tools/execute: execute supported tool calls, including calendar, reminders, and calculator actions.
GET /health and GET /metrics: inspect service health and Prometheus-style request counters.
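Because POST /v1/model/stream responds with Server-Sent Events, a client has to split the response into events itself. The sketch below follows the standard SSE framing (`data:` lines, events separated by a blank line, lines starting with `:` treated as comments). The helper name `iter_sse_data` is hypothetical, and what each data payload contains (plain text, JSON, an abort token) is not specified here, so check the generated docs at /docs before relying on it.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each Server-Sent Event found in a stream of lines."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            # Comment lines are commonly used as keep-alives; skip them.
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one optional leading space after the colon.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    if buffer:
        # Lenient choice: also emit a trailing event that lacks its closing blank line.
        yield "\n".join(buffer)


# Made-up sample stream, not real server output.
sample = ["data: Hel\n", "\n", ": ping\n", "data: lo\n", "\n"]
print(list(iter_sse_data(sample)))
```

Any HTTP client that can iterate over response lines while streaming can feed this generator, which keeps the parsing logic independent of the client library.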
The backend defaults are tuned for lower latency. Raise these values when you want higher quality:
$env:NURU_MAX_NEW_TOKENS = "160"
$env:NURU_FILTER_MAX_NEW_TOKENS = "64"
$env:NURU_STT_MODEL_SIZE = "medium.en"
$env:NURU_STT_BEAM_SIZE = "5"
$env:NURU_TTS_MAX_CHARS = "500"
$env:NURU_VISION_MAX_IMAGE_SIZE = "1024"
$env:NURU_VISION_MAX_NEW_TOKENS = "160"
Fastest useful local settings:
$env:NURU_MODEL_NAME = "gpt2"
$env:NURU_EVIL_MODEL_NAME = "gpt2"
$env:NURU_STT_MODEL_SIZE = "base.en"
$env:NURU_STT_COMPUTE_TYPE = "int8"
$env:NURU_STT_BEAM_SIZE = "1"
$env:NURU_VISION_MAX_IMAGE_SIZE = "512"
Personality and runtime state can be adjusted without code changes:
$env:NURU_PERSONA = "More teasing and curious, but still concise"
$env:NURU_MOOD_BASE = "playful"
$env:NURU_MEMORY_DB_PATH = "memories/vector_memory.json"
$env:NURU_IDLE_BACKGROUND_ENABLED = "false"
$env:NURU_WORKING_MEMORY_SIZE = "20"
$env:NURU_REFLECTION_INTERVAL_MESSAGES = "20"
$env:NURU_RATE_LIMIT_REQUESTS = "30"
$env:NURU_OBSERVABILITY_LOG_PATH = "memories/observability.jsonl"
$env:NURU_REMINDERS_PATH = "memories/reminders.json"
$env:NURU_CALCULATOR_MAX_CHARS = "160"
Build and run the API container:
docker build -t nuru-api .
docker run --rm -p 8000:8000 --env-file .env.example nuru-api
Mount memories/, model/finals/, and evil_model/finals/ when you want
persistent memory or local trained model artifacts available inside the
container.
Run the test suite with:
uv run pytest