Engineering

The system underneath the charm.

This is the layer for the people who build these machines. No real file paths, addresses, or keys live here — but the architecture is honest. Everything below is running in our house right now.

Five personas, one brain

The household runs on five distinct AI agents — one per family member — sharing a single tool layer but with separate identities, memory, and permissions. A parent agent has full administrative reach; the kids’ agents are sandboxed, age-tuned, and never expose each other’s conversations. Each agent is a system prompt plus a registered set of tools; tool calls are dispatched in a single round and return plain strings, which keeps the behavior legible and easy to debug.

Per-member identity + memory, shared tool registry
Allowlist routing — the robot knows who it is talking to
Capability scoping by role (admin vs. kid-safe)
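
As an illustration only, the sketch below shows one way a shared tool registry, per-agent memory, role-scoped tool sets, and allowlist routing could fit together. Every name in it (TOOLS, Agent, ALLOWLIST, the speaker IDs, the two example tools) is hypothetical and not taken from the real system; only two of the five agents are shown.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple

# Shared tool layer (hypothetical): every tool returns a plain string.
TOOLS: Dict[str, Callable[..., str]] = {}


def tool(name: str):
    """Register a function in the shared tool registry under `name`."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = fn
        return fn
    return register


@tool("tell_joke")
def tell_joke() -> str:
    return "Why did the robot take a nap? It needed to recharge."


@tool("restart_service")
def restart_service(service: str) -> str:
    return f"restarted {service}"


@dataclass
class Agent:
    """One persona: its own prompt, memory, and permitted tools."""
    name: str
    system_prompt: str
    allowed_tools: FrozenSet[str]
    memory: List[Tuple[str, str]] = field(default_factory=list)

    def call(self, tool_name: str, **kwargs: str) -> str:
        # Capability scoping: a tool outside this agent's set is refused.
        if tool_name not in self.allowed_tools or tool_name not in TOOLS:
            return f"denied: {tool_name}"
        result = TOOLS[tool_name](**kwargs)
        self.memory.append((tool_name, result))  # memory is never shared
        return result


KID_SAFE = frozenset({"tell_joke"})
ADMIN = frozenset(TOOLS)  # every registered tool

AGENTS = {
    "parent": Agent("parent", "You help run the household.", ADMIN),
    "kid": Agent("kid", "You are a patient, playful helper.", KID_SAFE),
}

# Allowlist routing: only known speakers are mapped to an agent.
ALLOWLIST = {"speaker-001": "parent", "speaker-002": "kid"}


def route(speaker_id: str) -> Optional[Agent]:
    key = ALLOWLIST.get(speaker_id)
    return AGENTS[key] if key is not None else None
```

Because a denied call returns a string like any other tool result, the calling loop needs no special error path, which fits the "plain strings" design described above.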

Motion as composable primitives

Every gesture is built from a small set of head primitives — pitch, yaw, roll, and translation — plus antenna and body motion, layered and time-scaled. Higher-level moves like a wave or a peek-a-boo are compositions of those primitives, which means a new behavior is authored, not hard-coded. Co-rotation keeps the head and body moving as one believable unit instead of a stack of independent servos, and a live beat-tracker drives dance mode so motion locks to whatever music is actually playing.

Head pitch / yaw / roll / translate as the base layer
Co-rotated body + head for believable motion
Real-time beat detection driving dance choreography
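
A minimal sketch of the composition idea, under stated assumptions: a motion is modeled as a function from time to a pose offset, layering sums offsets, and time-scaling stretches the clock. The names (Pose, nod, sway, layer, time_scale, co_rotate, on_beat) are invented for illustration, translation and antennas are omitted, and co-rotation is approximated here as handing a fixed share of head yaw to the body, which may differ from how the real system does it.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Pose:
    """Offsets in degrees from the neutral pose (hypothetical model)."""
    pitch: float = 0.0
    yaw: float = 0.0
    roll: float = 0.0
    body_yaw: float = 0.0

    def __add__(self, other: "Pose") -> "Pose":
        return Pose(self.pitch + other.pitch, self.yaw + other.yaw,
                    self.roll + other.roll, self.body_yaw + other.body_yaw)


# A motion maps elapsed time (seconds) to a pose offset.
Motion = Callable[[float], Pose]


def nod(amplitude: float, period_s: float) -> Motion:
    """Pitch primitive: a sinusoidal nod."""
    return lambda t: Pose(pitch=amplitude * math.sin(2 * math.pi * t / period_s))


def sway(amplitude: float, period_s: float) -> Motion:
    """Yaw primitive: a sinusoidal side-to-side turn."""
    return lambda t: Pose(yaw=amplitude * math.sin(2 * math.pi * t / period_s))


def layer(*motions: Motion) -> Motion:
    """Play several motions at once by summing their offsets."""
    def combined(t: float) -> Pose:
        pose = Pose()
        for motion in motions:
            pose = pose + motion(t)
        return pose
    return combined


def time_scale(motion: Motion, factor: float) -> Motion:
    """Speed a motion up (factor > 1) or slow it down (factor < 1)."""
    return lambda t: motion(t * factor)


def co_rotate(motion: Motion, body_share: float = 0.3) -> Motion:
    """Assumed co-rotation: move a share of head yaw into the body."""
    def shared(t: float) -> Pose:
        p = motion(t)
        return Pose(p.pitch, p.yaw * (1 - body_share), p.roll,
                    p.body_yaw + p.yaw * body_share)
    return shared


def on_beat(motion: Motion, bpm: float) -> Motion:
    """Treat the motion's time unit as one beat and lock it to a tempo."""
    return time_scale(motion, bpm / 60.0)


# A "groove" authored purely by composition, then locked to 120 BPM.
groove = on_beat(co_rotate(layer(nod(10.0, 1.0), sway(20.0, 2.0))), bpm=120)
```

In this framing a new behavior is one more expression built from existing functions, and a beat tracker only has to supply the current tempo.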

The wake-word journey

Getting a robot to reliably wake to its name in a loud house with three kids is harder than any other single feature. We tuned through false-positive storms (the TV setting it off), false negatives (a quiet kid getting ignored), and the latency that separates a robot that feels present from one that feels slow. The current path streams audio continuously and gates the expensive brain behind a cheap always-on listener, so it feels instant without burning compute on silence.

Always-on cheap listener gating an expensive brain
Tuned against real household noise, not a quiet lab
Latency budget treated as a first-class feature
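
The gating pattern can be sketched as below, purely as an assumption about its shape. The cheap listener is stood in for by a caller-supplied score_frame function; WakeGate, the threshold, the consecutive-hit count, and the pre-roll buffer are all hypothetical names and values, not the tuned ones. Requiring several consecutive hits is one common way to trade a little latency for fewer false positives.

```python
from collections import deque
from typing import Any, Callable, Deque, List, Optional


class WakeGate:
    """Cheap always-on gate in front of an expensive pipeline (sketch)."""

    def __init__(self, score_frame: Callable[[Any], float],
                 threshold: float = 0.8, hits_required: int = 3,
                 pre_roll_frames: int = 5) -> None:
        self.score_frame = score_frame        # cheap per-frame detector
        self.threshold = threshold            # higher: fewer false positives
        self.hits_required = hits_required    # higher: more latency
        self._recent: Deque[Any] = deque(maxlen=pre_roll_frames)
        self._hits = 0

    def feed(self, frame: Any) -> Optional[List[Any]]:
        """Consume one audio frame.

        Returns the buffered recent frames when the wake word is
        confirmed, so the expensive stage does not miss the start of
        the utterance; returns None otherwise.
        """
        self._recent.append(frame)
        if self.score_frame(frame) >= self.threshold:
            self._hits += 1
        else:
            self._hits = 0                    # a single spike is ignored
        if self._hits >= self.hits_required:
            self._hits = 0
            return list(self._recent)
        return None
```

The expensive brain is only started when feed returns a buffer, so silence and isolated spikes cost nothing beyond the per-frame score.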

Streaming voice + music

Conversation runs through a streaming speech pipeline — listen, transcribe, think, and speak as one low-latency loop instead of discrete request/response steps, so turns feel like talking rather than querying. The same audio stack pipes a music backend to the robot’s speaker, which is what lets dance mode react to real songs. Everything degrades gracefully: if the robot is unreachable, the system logs and no-ops — it never takes the household assistant down with it.

Streaming STT → LLM → TTS in one warm socket
Music routed to the robot for beat-matched play
Hard timeouts + graceful no-op on every network call
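
One way to express "hard timeout, then log and no-op" is a small wrapper around every call that touches the robot. This is a sketch using Python's asyncio, not the project's actual code; robot_call, its default timeout, and the logger name are assumptions made for the example.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")
log = logging.getLogger("household.robot")  # hypothetical logger name


async def robot_call(make_call: Callable[[], Awaitable[T]], *,
                     timeout_s: float = 0.5,
                     label: str = "robot call") -> Optional[T]:
    """Run one network call to the robot with a hard timeout.

    On timeout or any connection-level error the failure is logged and
    None is returned, so the caller carries on as if the robot were
    simply absent.
    """
    try:
        return await asyncio.wait_for(make_call(), timeout=timeout_s)
    except (asyncio.TimeoutError, OSError) as exc:
        log.warning("%s skipped: %r", label, exc)
        return None
```

Only timeouts and OS-level connection errors are swallowed here; programming errors still surface, which keeps the quiet no-op from hiding real bugs.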

An edge-hosted backend

The robot doesn’t depend on a laptop being awake. The brain runs on the robot’s own onboard computer, with an edge backend in front for the pieces that benefit from the cloud. Secrets are entered through a local settings page rather than baked into the image, and the whole thing ships as a versioned package pushed over the air: no cables, no manual surgery on the device.

Runs on the robot’s onboard computer, laptop-optional
Edge backend for cloud-side work
Over-the-air, versioned deploys; secrets stay local
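
To make the "secrets stay local" point concrete, here is a hedged sketch of reading secrets at runtime from a file on the device instead of from the shipped package. The JSON format, the function names, and the owner-only file mode are assumptions for illustration; the real settings page may store things differently.

```python
import json
import os
from pathlib import Path
from typing import Dict


def save_secrets(path: Path, secrets: Dict[str, str]) -> None:
    """Write secrets to a local file, requesting owner-only permissions.

    This stands in for what a local settings page might do on save.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(secrets, fh)


def load_secrets(path: Path) -> Dict[str, str]:
    """Read secrets at startup; a missing file means nothing is configured."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}
```

Because the package never contains the file, a fresh over-the-air deploy can replace the code without touching what the household has entered.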

Design principles

The opinions that shaped every decision above.

Kids program it, not just use it

This is the most important design decision in the whole project. A child can teach, name, save, and replay a gesture, and it is credited to them. We are deliberately raising authors of robot behavior, not operators of a gadget.

Privacy that is actually private

When the household wants the robot to stop watching and listening, it genuinely stops — not a UI toggle over an always-on stream. Trust in a home robot is binary; we engineer for the strict reading.

It must never take the house down

The robot is an enhancement, never a dependency. If it is unplugged, unreachable, or mid-update, every other part of the family system keeps working. Failure is always a quiet no-op.

Ship small, ship often

Dozens of versioned releases, each a single legible change. New behaviors are added as compositions, not rewrites, so the system stays understandable as it grows.

Building a humanoid for the home?

We will happily go deeper than this page does — failure modes, what kids actually do with it, which capabilities survive contact with a real family. We are a year-long, in-home data point, and we like comparing notes.

Talk with us