Simple NodeJs server with only Gemini-Live backend for now. #1869

5ch4um1 · 2026-03-20T17:41:08Z

5ch4um1
Mar 20, 2026

First of all, thank you so much for this amazing project.
I already shared this in the Discord. It is still very experimental: it supports only gemini-live as the LLM backend and has a simplified approval mechanism. The admin needs to approve all devices manually in the dashboard, and approved MCP devices then need to be exposed to the Xiaozhi devices in an extra step by ticking the respective checkboxes in the device settings. There are also no language settings, because gemini-live just speaks ~40 languages out of the box. 🤯
It might also be a token-saving way to "teach" a coding agent how to implement your own solutions. So far I have only tried this on a real-world small Linux VPS behind Nginx, because that's how I want to use it, but I think it should run just as well on a local machine. I will give that a try in the next few days.
https://github.com/5ch4um1/xiaozhi-server-nodejs (You'd need your own Gemini API key to run it).
And here is a short video showcasing the MCP and language capabilities:
https://www.youtube.com/shorts/OddE0rnGxaY
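
For readers who want a picture of the two-step approval described above, here is a minimal sketch. It is not taken from the repository: `DeviceRegistry`, its method names, and the `"xiaozhi"` / `"mcp"` kinds are all hypothetical and only illustrate "approve first, then expose per device".

```javascript
// Hypothetical in-memory registry; the real project may store this differently.
class DeviceRegistry {
  constructor() {
    // deviceId -> { kind, approved, exposedMcp }
    this.devices = new Map();
  }

  // Every new device starts unapproved until an admin acts in the dashboard.
  register(deviceId, kind) {
    if (!this.devices.has(deviceId)) {
      this.devices.set(deviceId, { kind, approved: false, exposedMcp: new Set() });
    }
    return this.devices.get(deviceId);
  }

  // Step 1: the admin approves a device manually.
  approve(deviceId) {
    const device = this.devices.get(deviceId);
    if (!device) throw new Error(`unknown device: ${deviceId}`);
    device.approved = true;
  }

  // Step 2: the admin ticks a checkbox to expose an approved MCP device
  // to one specific Xiaozhi device.
  expose(xiaozhiId, mcpId) {
    const xiaozhi = this.devices.get(xiaozhiId);
    const mcp = this.devices.get(mcpId);
    if (!xiaozhi || !mcp) throw new Error("unknown device");
    if (mcp.kind !== "mcp") throw new Error(`${mcpId} is not an MCP device`);
    if (!xiaozhi.approved || !mcp.approved) {
      throw new Error("both devices must be approved first");
    }
    xiaozhi.exposedMcp.add(mcpId);
  }

  // MCP devices that a given Xiaozhi device is allowed to use right now.
  visibleMcpDevices(xiaozhiId) {
    const xiaozhi = this.devices.get(xiaozhiId);
    if (!xiaozhi || !xiaozhi.approved) return [];
    return [...xiaozhi.exposedMcp].filter((id) => this.devices.get(id).approved);
  }
}
```

In this sketch, approval alone grants nothing: an approved Xiaozhi device sees an MCP device only after the explicit `expose` step.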

5ch4um1 · 2026-03-21T12:41:57Z

5ch4um1
Mar 21, 2026
Author

I added a .env variable for the HOST and tested it locally on my PC, and it works just fine, as expected. You need to specify your local IP (or listen on all interfaces) and the port, and use ws instead of wss. I also added a section to the readme that explains in a bit more detail how to set it up locally, plus a .env.example.local.

If anybody would like to see other LLM backends integrated, feel free to contact me.
Also if you have any comments or questions, please let me know.
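
To make the local setup above more concrete, here is a rough sketch of how such settings might be read. Only the `HOST` variable is mentioned in the post. `PORT`, `USE_TLS`, the default port 8000, and both function names are assumptions for illustration and may not match the repository's .env.example.local.

```javascript
// Hypothetical helper: every name except HOST is an assumption.
function buildServerConfig(env) {
  // "0.0.0.0" means "listen on all interfaces".
  const host = env.HOST || "0.0.0.0";
  const port = Number(env.PORT || 8000);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid port: ${env.PORT}`);
  }
  // Without TLS (no Nginx in front), devices connect with ws instead of wss.
  const scheme = env.USE_TLS === "true" ? "wss" : "ws";
  return { host, port, scheme };
}

// URL a device on the LAN would use. When listening on all interfaces,
// pass the machine's LAN IP explicitly as reachableHost.
function deviceUrl(config, reachableHost = config.host) {
  return `${config.scheme}://${reachableHost}:${config.port}/`;
}

const config = buildServerConfig({ HOST: "192.168.1.50", PORT: "8000" });
console.log(deviceUrl(config)); // ws://192.168.1.50:8000/
```

In a real server you would pass `process.env` instead of the literal object used here.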

0 replies

5ch4um1 · 2026-03-21T21:39:29Z

5ch4um1
Mar 21, 2026
Author

Added voice support for qwen3-omni-flash-realtime (and maybe others?) via the Alibaba DashScope API. I'm still trying to figure out how tool calls work with Alibaba.

0 replies

78 · 2026-03-21T23:17:40Z

78
Mar 21, 2026
Maintainer

That's the simplest MCP-capable xiaozhi-esp32 server I have ever seen. Thumbs UP!

0 replies

HonestQiao · 2026-03-21T23:27:39Z

HonestQiao
Mar 21, 2026

wonderful!

0 replies

5ch4um1 · 2026-03-22T11:46:27Z

5ch4um1
Mar 22, 2026
Author

Thank you so much for the thumbs up, that's very encouraging. Next I will try to get tool calls working for an Alibaba model, and after that I'll probably have a look at "fun audio chat", which seemed really interesting. I might also try to run this "locally" on a cloud VPS and see how it can be integrated. I don't have a gaming rig at home, Google gave me way too much starting credit, and I haven't even looked at the free credits from Alibaba yet. Those need to go somewhere too, I guess.
The simplicity is partly due to the node modules doing much of the work, I guess. But yeah, I wanted something simple that I can easily run myself. The existing servers all seemed very feature-rich and mature, and I wanted something really simple to start with, and to better understand how it works. Using NodeJs was also more of a "just works" decision. This probably won't scale all that well if you get a NodeJs worker for every connected device. It's really more geared towards DIY folks and tinkerers, maybe also with the hope of making Xiaozhi more popular in the West: this should be easy enough for Western tech influencers to install, even if the general population probably still won't be too keen on running their own servers.
But in terms of publicity among developers, I thought this might be useful. I really don't understand why Xiaozhi has not become much more popular here in Europe, for example. It often feels like we are lagging at least a year behind the latest trends. I can see a thousand possibilities for building really amazing products with Xiaozhi. It's like having both Spike Jonze's "Her" and the Babel fish from "The Hitchhiker's Guide to the Galaxy" right on your wrist. But tell it to talk like a pirate and "her" becomes "harrrr", which is just amazing. Or in the tourism industry, you could give every hotel guest a personalized guide that knows its way around the location. I have many more examples like these.
If you have any suggestions for things you'd like to see implemented, please let me know, I'd be very open to any hints!
I'll probably try to get a local model working with tool calls, and then maybe clean up the UI a bit. Not sure what comes after that.

0 replies

5ch4um1 · 2026-03-22T22:58:15Z

5ch4um1
Mar 22, 2026
Author

Tool calls are working now with qwen3-omni-flash, but I made a few changes to the core logic. It stays simple and makes it easier to add new models: one "base.js" provider handles the communication with Xiaozhi, and there is a separate file for each model, like qwen_realtime.js, gemini.js, etc. I will test this a bit more thoroughly tomorrow before uploading, but so far it seems to work nicely.
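
The base-plus-per-model layout described above could look roughly like the sketch below. This is an illustration under assumptions, not the repository's code: the class names, the `generateReply` hook, the `{ type: "reply" }` message shape, and the stubbed replies are all hypothetical, and no real Gemini or DashScope calls are made.

```javascript
// base.js (sketch): shared logic for talking to the Xiaozhi device.
class BaseProvider {
  constructor(sendToDevice) {
    // Callback that would write a message to the device connection.
    this.sendToDevice = sendToDevice;
  }

  // Model-specific subclasses override these two hooks.
  get name() {
    throw new Error("name not implemented");
  }

  async generateReply(_text) {
    throw new Error("generateReply not implemented");
  }

  // Common path: take device input, ask the model, forward the reply.
  async handleDeviceMessage(text) {
    const reply = await this.generateReply(text);
    this.sendToDevice({ type: "reply", provider: this.name, text: reply });
  }
}

// gemini.js (sketch): only the model-specific part lives here.
class GeminiProvider extends BaseProvider {
  get name() {
    return "gemini";
  }

  async generateReply(text) {
    return `[gemini stub] ${text}`; // a real provider would call the model API
  }
}

// qwen_realtime.js (sketch)
class QwenRealtimeProvider extends BaseProvider {
  get name() {
    return "qwen_realtime";
  }

  async generateReply(text) {
    return `[qwen stub] ${text}`;
  }
}

// Adding a model means adding one class and one entry here.
const providers = { gemini: GeminiProvider, qwen_realtime: QwenRealtimeProvider };

function createProvider(key, sendToDevice) {
  const Provider = providers[key];
  if (!Provider) throw new Error(`unknown provider: ${key}`);
  return new Provider(sendToDevice);
}
```

The point of the split is that everything device-facing stays in one place, so a new backend only has to implement the model-facing hooks.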

1 reply

5ch4um1 Mar 23, 2026
Author

Despite all the careful testing, I accidentally pushed something broken to GitHub this afternoon and spent the last few hours in panic mode trying to fix it. The qwen_omni backend should now work again (with tool calling).
The server dashboard itself also has its own MCP capabilities now, so you can tell gemini "switch the llm backend to qwen omni" and the next session will use the other model. The reverse works too, of course: tell qwen_omni to switch back to gemini. Only switching to qwen realtime is a one-way street. I haven't got tool calls working there yet, so you can set it once via MCP, and after that you can only change it again via the dashboard.
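
Here is a small sketch of why the switch only takes effect on the next session, and why qwen realtime ends up as a one-way street. The backend keys, the `BackendSettings` class, and the tool handler name are hypothetical. Only the tool-call support per backend is taken from the post above.

```javascript
// Hypothetical capability table; tool-call support follows the post above.
const BACKENDS = {
  gemini: { toolCalls: true },
  qwen_omni: { toolCalls: true },
  qwen_realtime: { toolCalls: false },
};

class BackendSettings {
  constructor(initial = "gemini") {
    this.selected = initial;
  }

  // Handler for a hypothetical "switch_llm_backend" MCP tool (or the dashboard).
  switchBackend(name) {
    if (!Object.hasOwn(BACKENDS, name)) {
      return { ok: false, error: `unknown backend: ${name}` };
    }
    this.selected = name;
    return { ok: true, message: `${name} will be used from the next session` };
  }

  // Called once when a device session starts; a running session keeps
  // the backend it started with.
  startSession() {
    const backend = this.selected;
    return { backend, canCallTools: BACKENDS[backend].toolCalls };
  }
}

const settings = new BackendSettings();
const current = settings.startSession(); // started on gemini
settings.switchBackend("qwen_realtime"); // requested via a tool call
const next = settings.startSession();
console.log(current.backend, next.backend, next.canCallTools);
// gemini qwen_realtime false
```

Once a session runs on a backend with `canCallTools` set to false, the model can no longer invoke the switch tool itself, so in this sketch the only way back is the dashboard.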

Replies: 6 comments 1 reply
