Skip to content

XiaoYuan-AI/Nuru

Repository files navigation

Nuru

Nuru is a FastAPI backend for a Vtuber-style assistant. It exposes text generation, safety filtering, TTS, STT, vision, search, and Discord voice endpoints.

Install

uv sync

Run The API

uv run uvicorn main:app --host 0.0.0.0 --port 8000

Open http://localhost:8000/docs for the generated FastAPI docs.

Train Text Models

The legacy notebooks are Colab references. Use the project script for repeatable local or remote training:

uv run python scripts/train_text_model.py --variant normal

The normal model saves checkpoints and final artifacts under:

  • model/checkpoints
  • model/logs
  • model/finals

Train the evil model variant with:

uv run python scripts/train_text_model.py --variant evil

That writes to:

  • evil_model/checkpoints
  • evil_model/logs
  • evil_model/finals

For a quick smoke run before a full job:

uv run python scripts/train_text_model.py `
  --variant normal `
  --max-train-samples 32 `
  --max-eval-samples 8 `
  --max-steps 1

The default dataset is SocialGrep/one-million-reddit-jokes, using title as the prompt column and selftext as the answer column. Override these when using a custom dataset:

uv run python scripts/train_text_model.py `
  --dataset your-org/your-dataset `
  --title-column prompt `
  --body-column response

For local data, use JSON, JSONL, or CSV files:

uv run python scripts/train_text_model.py `
  --train-file data/train.jsonl `
  --validation-file data/validation.jsonl `
  --title-column prompt `
  --body-column response

Runtime Model Loading

At runtime, the text endpoints prefer local trained artifacts:

  • /v1/model loads model/finals when it contains a saved model.
  • /v1/evil_model loads evil_model/finals when it contains a saved model.

If no local final model exists, the backend falls back to gpt2 for a faster default startup. You can override runtime model locations:

$env:NURU_MODEL_DIR = "D:\models\nuru"
$env:NURU_EVIL_MODEL_DIR = "D:\models\nuru-evil"

Or override fallback model IDs:

$env:NURU_MODEL_NAME = "gpt2"
$env:NURU_EVIL_MODEL_NAME = "gpt2"

VTuber Runtime Endpoints

  • POST /v1/model/stream: stream text generation as Server-Sent Events.
  • POST /v1/vtuber/respond: accept text, image_base64, or audio_base64 and return a reply, TTS audio, emotion state, memory matches, and expression.
  • POST /v1/memory and POST /v1/memory/search: store and retrieve long-term memories from the persistent vector store.
  • GET /v1/emotion and POST /v1/emotion: inspect or adjust mood, energy, and interest.
  • POST /v1/chat/{platform}/message: ingest Twitch, YouTube, or other chat messages through a shared abstraction.
  • POST /v1/idle/tick: generate idle talk when inactivity or low chat activity is detected.
  • POST /v1/model/stream/abort/{abort_token}: cancel an active SSE generation.
  • GET /v1/working-memory: inspect the rolling short-term context window.
  • POST /v1/reflection/tick and GET /v1/reflection: run or inspect private memory reflection summaries.
  • GET /v1/tools/schema and POST /v1/tools/validate: validate model-requested actions such as expression changes, stickers, sounds, and commands.
  • POST /v1/tools/execute: execute supported tool calls, including calendar, reminders, and calculator actions.
  • GET /health and GET /metrics: inspect service health and Prometheus-style request counters.

Speed Controls

The backend defaults are tuned for lower latency. Raise these values when you want higher quality:

$env:NURU_MAX_NEW_TOKENS = "160"
$env:NURU_FILTER_MAX_NEW_TOKENS = "64"
$env:NURU_STT_MODEL_SIZE = "medium.en"
$env:NURU_STT_BEAM_SIZE = "5"
$env:NURU_TTS_MAX_CHARS = "500"
$env:NURU_VISION_MAX_IMAGE_SIZE = "1024"
$env:NURU_VISION_MAX_NEW_TOKENS = "160"

Fastest useful local settings:

$env:NURU_MODEL_NAME = "gpt2"
$env:NURU_EVIL_MODEL_NAME = "gpt2"
$env:NURU_STT_MODEL_SIZE = "base.en"
$env:NURU_STT_COMPUTE_TYPE = "int8"
$env:NURU_STT_BEAM_SIZE = "1"
$env:NURU_VISION_MAX_IMAGE_SIZE = "512"

Personality and runtime state can be adjusted without code changes:

$env:NURU_PERSONA = "More teasing and curious, but still concise"
$env:NURU_MOOD_BASE = "playful"
$env:NURU_MEMORY_DB_PATH = "memories/vector_memory.json"
$env:NURU_IDLE_BACKGROUND_ENABLED = "false"
$env:NURU_WORKING_MEMORY_SIZE = "20"
$env:NURU_REFLECTION_INTERVAL_MESSAGES = "20"
$env:NURU_RATE_LIMIT_REQUESTS = "30"
$env:NURU_OBSERVABILITY_LOG_PATH = "memories/observability.jsonl"
$env:NURU_REMINDERS_PATH = "memories/reminders.json"
$env:NURU_CALCULATOR_MAX_CHARS = "160"

Docker

Build and run the API container:

docker build -t nuru-api .
docker run --rm -p 8000:8000 --env-file .env.example nuru-api

Mount memories/, model/finals/, and evil_model/finals/ when you want persistent memory or local trained model artifacts available inside the container.

Tests

uv run pytest

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors