Nuru

Nuru is a FastAPI backend for a VTuber-style assistant. It exposes text generation, safety filtering, TTS, STT, vision, search, and Discord voice endpoints.

Install

uv sync

Run The API

uv run uvicorn main:app --host 0.0.0.0 --port 8000

Open http://localhost:8000/docs for the generated FastAPI docs.
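
When scripting against the API it can help to wait for the server before sending requests. The following is a minimal sketch that polls GET /health, assuming the endpoint answers with HTTP 200 once the service is ready. The function name wait_until_healthy and its parameters are illustrative and not part of the project.

```python
import time
import urllib.error
import urllib.request


def wait_until_healthy(base_url="http://localhost:8000", attempts=10, delay=1.0,
                       opener=urllib.request.urlopen):
    """Poll GET /health until it answers with HTTP 200 or attempts run out.

    Assumption: a 200 response means the service is ready.
    `opener` is injectable so the function can be exercised without a server.
    """
    url = base_url.rstrip("/") + "/health"
    for attempt in range(attempts):
        try:
            with opener(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, ConnectionError, TimeoutError):
            pass  # server is not accepting connections yet, or returned an error
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling wait_until_healthy() right after starting uvicorn returns True once the endpoint responds, or False after the attempts are exhausted.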

Train Text Models

The legacy notebooks (train.ipynb and evil_train.ipynb) are kept as Colab references. Use the project script for repeatable local or remote training:

uv run python scripts/train_text_model.py --variant normal

The normal variant writes checkpoints, logs, and final artifacts to:

model/checkpoints
model/logs
model/finals

Train the evil model variant with:

uv run python scripts/train_text_model.py --variant evil

That writes to:

evil_model/checkpoints
evil_model/logs
evil_model/finals

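For a quick smoke run before a full job (multi-line commands here use PowerShell backtick continuations; in bash, end each line with a backslash instead):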

uv run python scripts/train_text_model.py `
  --variant normal `
  --max-train-samples 32 `
  --max-eval-samples 8 `
  --max-steps 1

The default dataset is SocialGrep/one-million-reddit-jokes, using title as the prompt column and selftext as the answer column. Override these when using a custom dataset:

uv run python scripts/train_text_model.py `
  --dataset your-org/your-dataset `
  --title-column prompt `
  --body-column response

For local data, use JSON, JSONL, or CSV files:

uv run python scripts/train_text_model.py `
  --train-file data/train.jsonl `
  --validation-file data/validation.jsonl `
  --title-column prompt `
  --body-column response
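
The README does not show what a local data file looks like. As a sketch, assuming the script expects one JSON object per line whose keys match --title-column and --body-column, a JSONL file could be produced like this. The helper names write_jsonl and read_jsonl are hypothetical.

```python
import json
from pathlib import Path


def write_jsonl(path, pairs, title_column="prompt", body_column="response"):
    """Write (prompt, response) pairs as one JSON object per line.

    Assumption: the training script reads the keys named by
    --title-column and --body-column from each record.
    """
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as handle:
        for prompt, response in pairs:
            record = {title_column: prompt, body_column: response}
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


def read_jsonl(path):
    """Read the file back, a cheap check that every line is valid JSON."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

For example, write_jsonl("data/train.jsonl", [("Why did the chicken cross the road?", "To get to the other side.")]) creates a one-record file that matches the column names used in the command above.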

Runtime Model Loading

At runtime, the text endpoints prefer locally trained artifacts:

/v1/model loads model/finals when it contains a saved model.
/v1/evil_model loads evil_model/finals when it contains a saved model.

If no local final model exists, the backend falls back to gpt2, which keeps default startup fast. You can override the runtime model locations with environment variables (PowerShell syntax shown):

$env:NURU_MODEL_DIR = "D:\models\nuru"
$env:NURU_EVIL_MODEL_DIR = "D:\models\nuru-evil"

Or override fallback model IDs:

$env:NURU_MODEL_NAME = "gpt2"
$env:NURU_EVIL_MODEL_NAME = "gpt2"
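
The lookup order described above can be sketched as follows. This is not the project's actual loader: the function name resolve_model_source is hypothetical, and detecting a "saved model" by the presence of config.json (the file Hugging Face save_pretrained writes) is an assumption.

```python
import os
from pathlib import Path


def resolve_model_source(dir_var, name_var, default_dir, default_name="gpt2", env=None):
    """Return a local model directory if it looks saved, else a fallback model ID."""
    env = os.environ if env is None else env
    model_dir = Path(env.get(dir_var, default_dir))
    # Assumption: a saved model is recognised by its config.json;
    # the real backend may use a different check.
    if (model_dir / "config.json").is_file():
        return str(model_dir)
    return env.get(name_var, default_name)
```

Under these assumptions, resolve_model_source("NURU_MODEL_DIR", "NURU_MODEL_NAME", "model/finals") returns the local directory when it holds a saved model and the fallback ID otherwise.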

VTuber Runtime Endpoints

POST /v1/model/stream: stream text generation as Server-Sent Events.
POST /v1/vtuber/respond: accept text, image_base64, or audio_base64 and return a reply, TTS audio, emotion state, memory matches, and expression.
POST /v1/memory and POST /v1/memory/search: store and retrieve long-term memories from the persistent vector store.
GET /v1/emotion and POST /v1/emotion: inspect or adjust mood, energy, and interest.
POST /v1/chat/{platform}/message: ingest Twitch, YouTube, or other chat messages through a shared abstraction.
POST /v1/idle/tick: generate idle talk when inactivity or low chat activity is detected.
POST /v1/model/stream/abort/{abort_token}: cancel an active SSE generation.
GET /v1/working-memory: inspect the rolling short-term context window.
POST /v1/reflection/tick and GET /v1/reflection: run or inspect private memory reflection summaries.
GET /v1/tools/schema and POST /v1/tools/validate: validate model-requested actions such as expression changes, stickers, sounds, and commands.
POST /v1/tools/execute: execute supported tool calls, including calendar, reminders, and calculator actions.
GET /health and GET /metrics: inspect service health and Prometheus-style request counters.
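
A client of POST /v1/model/stream has to split the response into Server-Sent Events. The sketch below parses the standard SSE line format and yields each event's data payload. What Nuru places inside each data field (plain text, JSON, or something else) is not documented here, so the parser returns the raw string. The name iter_sse_data is illustrative.

```python
def iter_sse_data(lines):
    """Yield the data payload of each Server-Sent Event from an iterable of text lines."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the SSE format strips one leading space
        if field == "data":
            data_parts.append(value)
    # An event without a terminating blank line is incomplete and is dropped.
```

With any HTTP client that exposes the response as decoded text lines, iterate over iter_sse_data(response_lines) and append each payload to the reply as it arrives.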

Speed Controls

The backend defaults favor low latency over output quality. Raise these values when you want higher quality:

$env:NURU_MAX_NEW_TOKENS = "160"
$env:NURU_FILTER_MAX_NEW_TOKENS = "64"
$env:NURU_STT_MODEL_SIZE = "medium.en"
$env:NURU_STT_BEAM_SIZE = "5"
$env:NURU_TTS_MAX_CHARS = "500"
$env:NURU_VISION_MAX_IMAGE_SIZE = "1024"
$env:NURU_VISION_MAX_NEW_TOKENS = "160"

Fastest useful local settings:

$env:NURU_MODEL_NAME = "gpt2"
$env:NURU_EVIL_MODEL_NAME = "gpt2"
$env:NURU_STT_MODEL_SIZE = "base.en"
$env:NURU_STT_COMPUTE_TYPE = "int8"
$env:NURU_STT_BEAM_SIZE = "1"
$env:NURU_VISION_MAX_IMAGE_SIZE = "512"
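
The settings above use PowerShell syntax. As a shell-independent alternative, a small launcher can apply the same variables to the server process only. This is a sketch: FAST_PROFILE, build_env, and launch are hypothetical names, and the command is the one from the Run The API section.

```python
import os
import subprocess

# Values copied from the "fastest useful local settings" list.
FAST_PROFILE = {
    "NURU_MODEL_NAME": "gpt2",
    "NURU_EVIL_MODEL_NAME": "gpt2",
    "NURU_STT_MODEL_SIZE": "base.en",
    "NURU_STT_COMPUTE_TYPE": "int8",
    "NURU_STT_BEAM_SIZE": "1",
    "NURU_VISION_MAX_IMAGE_SIZE": "512",
}

UVICORN_COMMAND = ["uv", "run", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]


def build_env(overrides, base=None):
    """Return a copy of the base environment with the profile applied on top."""
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env


def launch(overrides=FAST_PROFILE):
    """Start the API with the profile applied; blocks until the server exits."""
    return subprocess.run(UVICORN_COMMAND, env=build_env(overrides), check=False)
```

Calling launch() starts the server with the fast profile without changing the variables of the calling shell.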

Personality and runtime state can be adjusted without code changes:

$env:NURU_PERSONA = "More teasing and curious, but still concise"
$env:NURU_MOOD_BASE = "playful"
$env:NURU_MEMORY_DB_PATH = "memories/vector_memory.json"
$env:NURU_IDLE_BACKGROUND_ENABLED = "false"
$env:NURU_WORKING_MEMORY_SIZE = "20"
$env:NURU_REFLECTION_INTERVAL_MESSAGES = "20"
$env:NURU_RATE_LIMIT_REQUESTS = "30"
$env:NURU_OBSERVABILITY_LOG_PATH = "memories/observability.jsonl"
$env:NURU_REMINDERS_PATH = "memories/reminders.json"
$env:NURU_CALCULATOR_MAX_CHARS = "160"

Docker

Build and run the API container:

docker build -t nuru-api .
docker run --rm -p 8000:8000 --env-file .env.example nuru-api

Mount memories/, model/finals/, and evil_model/finals/ into the container when you want memory to persist across runs or want locally trained model artifacts to be available inside it.

Tests

uv run pytest
