Replacing Home Assistant with a local-first AI core
Home Assistant is the de facto standard for self-hosted home automation: 500+ integrations, a strong community, and the YAML engine everyone's either fluent in or quietly resentful of. We ran it for years. The thing that finally broke it for us wasn't the YAML — it was the gap between the automation engine and the AI part. Once an LLM is in the loop, two products glued together stops being the right shape.
KULVEX is a self-hosted AI platform. Chat, voice, agents, cameras, presence, and home control — same server, same local model, no Home Assistant in the path. The home stack talks Zigbee, Z-Wave, Tuya, and weather UDP directly. The LLM controls everything by intent. This post is the architecture, the latency numbers, and the honest list of things HA still does better today.
Home Assistant's shape, and where it strains
HA's strength is hardware support. Zigbee2MQTT, Z-Wave-JS, Matter, ZHA, and a long tail of integrations that mean almost any device you own works on day one. That's real and we don't want to dismiss it.
The shape we kept hitting friction with:
- Two brains. HA holds device state. An LLM elsewhere (Assist, OpenAI conversation agent, custom extras) holds conversation context. They communicate by calling each other's APIs. Every interesting AI feature lives at the seam — and seams are where things get slow, or wrong, or both.
- YAML for non-trivial logic. The blueprint system has come a long way, but anything past "at sunset, turn the lights on" lands in YAML, where control flow is awkward and debugging is a re-deploy. People who get good at HA YAML are paying a tax to use a non-language.
- Intent loss in two-step pipelines. "Turn off the kitchen lights and start dinner music" in HA Assist is parsed by an intent recogniser, translated to two service calls, and routed to two integrations. Each step is a place for the user's intent to flatten into something less interesting. With an LLM that owns the state and the protocols directly, the original intent stays in flight all the way to the device.
- Camera/AI is a bolt-on. Frigate and the like are great pieces of software, glued to HA via MQTT and entity bridges. When the AI side wants to react to a camera ("tell me when an unfamiliar person is at the gate") the bolt-on shape is felt.
None of these are dealbreakers if you love HA — and many people do, justifiably. But they're the reasons we stopped trying to wrap HA and built the home stack into KULVEX directly.
KULVEX Home: one core, native protocols
The shape is unified. A single server hosts the local LLM, the protocol stack, the chat/voice/channel front-ends, the agents that orchestrate them, and the long-term memory. Every integration speaks directly to its native protocol — no HA in the middle.
YOUR SERVER (one box)
┌────────────────────────────────────────────────────────┐
│                                                        │
│    Chat / Voice / Signal / WhatsApp / Telegram         │
│                        │                               │
│                        ▼                               │
│           ┌─────────────────────────┐                  │
│           │  Local LLM (Mark VII)   │                  │
│           │  Tool calling enabled   │                  │
│           └────────────┬────────────┘                  │
│                        │                               │
│             ┌──────────┴──────────┐                    │
│             ▼                     ▼                    │
│     ┌────────────────┐   ┌────────────────┐            │
│     │  home_manager  │   │   presence /   │            │
│     │ (intent layer) │   │  camera / YOLO │            │
│     └───────┬────────┘   └────────┬───────┘            │
│             │                     │                    │
│     ┌───────┴──┬─────────┐        ├───────────┐        │
│     ▼          ▼         ▼        ▼           ▼        │
│ ┌────────┐ ┌────────┐ ┌──────┐ ┌──────┐ ┌────────────┐ │
│ │ Zigbee │ │ Z-Wave │ │ Tuya │ │ RTSP │ │  Tempest   │ │
│ │ 2MQTT  │ │   JS   │ │local │ │ cams │ │  weather   │ │
│ └───┬────┘ └───┬────┘ └──┬───┘ └──┬───┘ │ UDP listen │ │
│     │          │         │        │     └─────┬──────┘ │
│     ▼          ▼         ▼        ▼           ▼        │
│   ZIGBEE     Z-WAVE    WiFi      LAN         LAN       │
│   RADIO      RADIO     Tuya      RTSP        UDP       │
│                       devices                          │
└────────────────────────────────────────────────────────┘
Concretely, the integrations that live in KULVEX's home stack:
- Zigbee via Zigbee2MQTT (any Sonoff ZBDongle, SLZB-06, etc.)
- Z-Wave via Z-Wave-JS (Aeotec stick + the long-supported devices)
- Tuya via the local LAN protocol, no cloud account required after pairing (we don't use the Tuya cloud SDK)
- Sonoff via the local API (eWeLink LAN mode)
- RTSP cameras direct to the YOLO person detector (Hikvision, Reolink, generic ONVIF)
- WeatherFlow Tempest via the UDP broadcast listener — every hub on the LAN is auto-discovered and streams in real time to the Climate panel
- Matter / Thread on the roadmap, OTBR integration drafted
Natural language → device action, in one hop
The pipeline that runs when you say "apagá las luces de la cocina" ("turn off the kitchen lights") over voice or type it in the dashboard:
- The chat agent (or voice transcription) lands the message at the local LLM with a tool definition for home_turn_off exposed.
- The model decides this is a home-control intent and emits a structured tool call: { tool: "home_turn_off", room: "kitchen", entity_match: ["light"] }.
- home_manager resolves "kitchen" against the live device registry (kitchen_main_light, kitchen_under_cabinet, kitchen_fan) and applies the action across the matching entities via the right protocol — Zigbee2MQTT for the bulb, Tuya-local for the LED strip.
- The result returns to the LLM, which forms a one-line confirmation in the same language the user used.
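The resolution step is easy to picture in code. A minimal sketch, with a hypothetical registry shape and helper names (not KULVEX's actual internals):

```python
# Hypothetical sketch of the home_manager resolution step. The registry
# shape and function names are illustrative, not KULVEX's real code.
DEVICES = {
    "kitchen_main_light":    {"room": "kitchen", "kind": "light", "protocol": "zigbee2mqtt"},
    "kitchen_under_cabinet": {"room": "kitchen", "kind": "light", "protocol": "tuya_local"},
    "kitchen_fan":           {"room": "kitchen", "kind": "fan",   "protocol": "zigbee2mqtt"},
}

def resolve(room: str, entity_match: list[str]) -> list[tuple[str, str]]:
    """Return (entity_id, protocol) pairs matching the room and entity kinds."""
    return [
        (entity_id, dev["protocol"])
        for entity_id, dev in DEVICES.items()
        if dev["room"] == room and dev["kind"] in entity_match
    ]

def home_turn_off(room: str, entity_match: list[str]) -> list[str]:
    acted = []
    for entity_id, protocol in resolve(room, entity_match):
        # Each protocol adapter owns its own transport (MQTT publish,
        # Tuya LAN frame, ...); here we just record the dispatch.
        acted.append(f"{protocol}:{entity_id}=off")
    return acted
```

With this registry, `home_turn_off("kitchen", ["light"])` acts on both kitchen lights through two different protocols and leaves the fan alone.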
There's also a fast-path for trivial commands. If the agent has the home tools enabled and the message matches an obvious on/off pattern, KULVEX skips the full LLM call entirely — a regex + entity matcher fires the tool directly.
The fast-path matters: routine commands shouldn't pay for the model. The LLM steps in only when the command is ambiguous or compound. We built the same shape into the agent runtime — agents try the fast-path before reaching for the LLM.
Presence, cameras, and the "reactive" mode
One place where the unified-core shape pays off cleanly: presence-driven AI behaviour. The presence stack runs YOLO person-detection on RTSP frames from the cameras, plus phone-MAC tracking on the LAN, and emits an event stream (arrived, left, unfamiliar_at_gate).
Because the LLM is in the same process, those events become first-class triggers for agents. You configure an agent like:
Agent: "Doorbell"
Triggers:
  - presence.unfamiliar_at_gate (debounce: 30s)
Tools:
  - home_security_history
  - channel_send_message
Instructions:
  When an unknown person appears at the gate, send a Signal message to
  Bruno with the camera frame and a one-line summary of who was at the
  gate in the last hour.
No YAML. No automation editor. No service-call gluing. The trigger is a Python event, the agent prompt is text, the tools are typed Python functions. When the event fires the agent runs end-to-end in one process.
The Tempest weather integration, as an example
One of our recent integrations is a good case study for the "native protocol, not HA" choice. WeatherFlow Tempest is a personal weather station that broadcasts JSON-over-UDP on the local network every few seconds. Air temp, dew point, wind, lightning, etc.
HA has a Tempest integration that polls their cloud API. We didn't want the cloud round-trip — Tempest is on our LAN, the data is right there. So KULVEX listens directly:
# core/home/protocols/tempest.py (sketch)
import asyncio
import json
import socket

async def listen():
    # Tempest hubs broadcast JSON to the whole LAN on UDP port 50222
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("0.0.0.0", 50222))
    sock.setblocking(False)
    loop = asyncio.get_running_loop()  # loop.sock_recvfrom needs Python 3.11+
    while True:
        data, _addr = await loop.sock_recvfrom(sock, 4096)
        msg = json.loads(data)
        if msg.get("type") == "obs_st":         # full station observation
            await store_observation(msg)        # persist locally
            await broadcast_climate_panel(msg)  # push to the Climate panel
Hubs are auto-discovered (any UDP packet from a new serial registers a station). The Climate panel updates over Socket.IO in real time. There's no cloud account, no polling cadence to tune, no integration to install. If you put a Tempest on the LAN, KULVEX picks it up.
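The discovery rule ("any UDP packet from a new serial registers a station") reduces to a few lines. The registry below is a hypothetical sketch; the `serial_number` and `type` fields follow the Tempest UDP message format:

```python
# Sketch of serial-based auto-discovery. The registry is illustrative;
# "serial_number" / "type" are fields from the Tempest UDP messages.
stations: dict[str, dict] = {}

def register_packet(msg: dict) -> bool:
    """Record the sender; True if this serial is a newly discovered station."""
    serial = msg.get("serial_number")
    if not serial:
        return False
    is_new = serial not in stations
    if is_new:
        stations[serial] = {"serial": serial, "first_type": msg.get("type")}
    return is_new
```

Every inbound packet passes through this check before parsing, so a station that starts broadcasting shows up in the Climate panel without any pairing step.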
The same shape applies to every protocol: own the transport, store locally, push to the UI.
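That three-step rule can be made concrete as a tiny base class. A hypothetical sketch of the adapter shape, not KULVEX's actual interface:

```python
from abc import ABC, abstractmethod

# Hypothetical distillation of the "own the transport, store locally,
# push to the UI" rule; class and callback names are illustrative.
class HomeProtocol(ABC):
    def __init__(self, store, ui_push):
        self.store = store      # local persistence callback
        self.ui_push = ui_push  # Socket.IO-style UI broadcast callback

    @abstractmethod
    def send(self, entity_id: str, command: dict) -> None:
        """Write a command to the native transport (MQTT, Tuya LAN, UDP...)."""

    def on_state(self, entity_id: str, state: dict) -> None:
        # Every adapter funnels inbound device state the same way:
        self.store(entity_id, state)    # ...store locally...
        self.ui_push(entity_id, state)  # ...and push to the UI.
```

Each protocol only has to implement `send` and call `on_state`; the storage and UI plumbing are shared.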
Hardware footprint
One server, one stack, from a real install (ours, the lab box).
For lighter rigs, KULVEX picks a smaller model automatically (the model selector probes VRAM at install and chooses the best fit). Cloud-only mode runs the home stack with the LLM disabled — useful for very small machines that just want the home brain.
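The selector logic is simple to sketch. The VRAM thresholds and model names below are made up for illustration; they are not KULVEX's actual catalogue:

```python
# Hypothetical sketch of the install-time model selector. Tiers and
# model names are illustrative, not KULVEX's real model catalogue.
MODEL_TIERS = [
    (24, "large-local"),   # >= 24 GB VRAM: full-size local model
    (12, "medium-local"),  # >= 12 GB: quantised mid-size model
    (6,  "small-local"),   # >= 6 GB: small model, reduced context
]

def pick_model(vram_gb: float) -> str:
    for min_vram, model in MODEL_TIERS:
        if vram_gb >= min_vram:
            return model
    return "cloud-only"  # home stack runs, local LLM disabled
```

The same probe-once-at-install shape means a later GPU upgrade just needs a re-run of the selector, not a reinstall.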
What Home Assistant still does better today
Honest list. KULVEX is two months into the home pivot; HA is a decade ahead on integration breadth. Things HA handles better:
- Long-tail integrations. Obscure brands, one-off cloud APIs, the very-specific German thermostat, older Z-Wave devices, every Insteon model. HA covers it; we cover the popular protocols and add by demand.
- Mature blueprint library. The community has years of shareable automations. KULVEX doesn't have an equivalent yet.
- Energy dashboard depth. HA's energy tracking is a very polished feature. We have basic consumption views; full energy reporting is on the roadmap.
- Mobile companion app. HA Companion has years of polish. The KULVEX iOS app exists and works, but it's newer (we archived the React Native and Flutter versions to focus on native iOS — see the changelog).
If you live in HA today and your setup is humming, this isn't a "rip and replace" pitch. KULVEX is for the case where the seam between AI and home automation has started to bother you, or where you're standing up a new home and want one stack that does both.
Try it on your hardware
One-line install. The home stack auto-discovers what it can on the LAN; manual pairing is in the dashboard.
# Linux / macOS
curl -fsSL https://kulvex.ai/install.sh | bash

# Open the dashboard at https://localhost:3000
# Settings → Devices → pair Zigbee, Z-Wave, Tuya
# Home → Climate auto-detects Tempest hubs on UDP
# Cameras → add RTSP URLs
For pricing, hardware tiers, and the feature comparison with HA, see kulvex.ai/pricing.
Related reading
- Private-by-default — what stays local, what doesn't, the egress map of a stock install.
- Self-improving AI agents — how the agents that drive the home stack get sharper over time.
What we want to hear
We're explicitly tracking gaps versus HA. If your install depends on an integration we don't cover, or a feature class HA does well that we're missing, we'd like to know — those are the slots we triage into the roadmap. Especially: Matter rollouts, energy tracking, and whatever your weird device is.