ink-ai

The backend behind real-time, conversational AI characters for an open-world Unreal Engine game set in New York.

Contract work, 2023–2024 · Demoed at the Tribeca Film Festival, June 2024

A game studio wanted NPCs you could actually talk to — characters that hold a conversation, in voice, and react to what's happening around them in the game world. I built the Python/FastAPI backend that made that work, and the part I'm proudest of is the context engineering: the characters don't just have backstories, they have a live sense of their surroundings.

The headline: context that changes on every turn

Each character carries a mutable scenario object: not static lore, but current world-state, such as where the character is and what the weather is doing.

The Unreal Engine client pushes updates to that state through dedicated endpoints — one to merge new context, one to overwrite it — and the scenario gets composed into the character's system prompt at request time. So the same character speaks differently depending on what's happening around it. If it starts raining in-game, the NPC's next line knows. That's the core loop: structured, mutable world-state injected into the prompt so dialogue tracks the world.
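A minimal sketch of that loop, using only the standard library so it runs on its own. The `Character` class, the function names, and the scenario keys (`location`, `weather`) are illustrative assumptions, not the project's actual schema; in the real backend the two update functions would sit behind the merge and overwrite endpoints.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Character:
    name: str
    persona: str  # static, hand-authored backstory
    scenario: dict[str, Any] = field(default_factory=dict)  # live world-state


def merge_scenario(character: Character, update: dict[str, Any]) -> None:
    """Merge semantics: new keys are added, existing keys are updated."""
    character.scenario.update(update)


def overwrite_scenario(character: Character, new_state: dict[str, Any]) -> None:
    """Overwrite semantics: the previous world-state is discarded entirely."""
    character.scenario = dict(new_state)


def build_system_prompt(character: Character) -> str:
    """Compose persona plus the current scenario, called once per request."""
    lines = [f"You are {character.name}. {character.persona}"]
    if character.scenario:
        lines.append("Current situation:")
        lines.extend(
            f"- {key}: {value}"
            for key, value in sorted(character.scenario.items())
        )
    return "\n".join(lines)


# Hypothetical character and world-state, for illustration only.
npc = Character(name="Vendor", persona="A street vendor in Brooklyn.")
overwrite_scenario(npc, {"location": "street corner", "weather": "clear"})
merge_scenario(npc, {"weather": "rain"})  # it starts raining in-game
prompt = build_system_prompt(npc)  # the next line is generated with rain in context
```

Because the prompt is rebuilt on each request rather than cached, a state change pushed between two player turns shows up in the very next reply.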

On top of that, the prompt assembly layers in hand-authored few-shot example dialogs that lock each character's voice and dialect before any real input arrives, plus per-player conversation memory keyed on player-character pairs in MongoDB, with session expiry.

Putting a real person in the game

One character, Prince J, was a real New York City resident. We recorded a 45-minute interview with him that did double duty: it captured a clean sample of his voice, which we cloned with ElevenLabs, and it gave us the raw material to engineer his character's persona and dialect (a Rasta/Spanglish patois). One session, two deliverables — a voice and a personality. He came to the Tribeca demo in person and met his AI self.

The model decision

The LLM layer was model-agnostic, routed through OpenRouter so models could be swapped without code changes. We moved through three open-weights models from Nous Research (Hermes, then Hermes 2, then Hermes-2-Pro-Llama-3-8B), and the reason was concrete. The game needed characters who could authentically hold realistic, oppositional worldviews. One was modeled on Mother Sister from Do the Right Thing, and a defining trait was that she doesn't trust the police. Heavily aligned commercial models refused or sanitized that; they couldn't stay in character. The less-restricted Hermes models could, and the colorful, diverse NYC cast consistently came out better on them.
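One way to get "swap without code changes" is to treat the model id as configuration. The sketch below builds a request for OpenRouter's OpenAI-compatible chat completions endpoint without sending it. The `CHARACTER_MODEL` environment variable and the function name are hypothetical, and the default model slug is an assumption about how OpenRouter lists Hermes-2-Pro-Llama-3-8B.

```python
import json
import os
import urllib.request
from typing import Optional

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
# Assumed OpenRouter slug for Hermes-2-Pro-Llama-3-8B; check the model listing.
DEFAULT_MODEL = "nousresearch/hermes-2-pro-llama-3-8b"


def build_chat_request(messages: list[dict], api_key: str,
                       model: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a chat completion request.

    The model id comes from the argument, then the CHARACTER_MODEL
    environment variable (a hypothetical name), then the default.
    """
    model = model or os.environ.get("CHARACTER_MODEL", DEFAULT_MODEL)
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        OPENROUTER_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

With this shape, moving from one Hermes release to the next is a one-line config change; the message list and the rest of the pipeline stay untouched.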

How we kept the characters good

Dialogue quality is hard to eyeball one line at a time, so I built a separate SvelteKit evaluation harness: an app for testing and iterating on character responses before they reached the game. I also scaffolded emotion/sentiment classification on player input (a DistilBERT model), which fed back into how characters reacted.
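As a rough illustration of the classification hook, the sketch below folds a detected emotion into the scenario before the prompt is built. This is one plausible wiring, not a description of the project's code: the `player_emotion` key and the label set are assumptions, and `keyword_stub` is a deliberately crude placeholder so the example runs without model weights. A real classifier wrapping the DistilBERT model would be passed in as `classify`.

```python
from typing import Callable

# Any function mapping player text to an emotion label can be plugged in.
EmotionClassifier = Callable[[str], str]


def keyword_stub(text: str) -> str:
    """Placeholder classifier; stands in for a DistilBERT-based one."""
    lowered = text.lower()
    if any(word in lowered for word in ("hate", "stupid", "shut up")):
        return "anger"
    if any(word in lowered for word in ("thanks", "love", "great")):
        return "joy"
    return "neutral"


def annotate_scenario(scenario: dict, player_input: str,
                      classify: EmotionClassifier = keyword_stub) -> dict:
    """Return a copy of the scenario with the player's detected emotion added."""
    return {**scenario, "player_emotion": classify(player_input)}
```

Returning a copy rather than mutating the stored scenario keeps the per-turn emotion from leaking into the world-state that the game client owns.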

Stack

Python · FastAPI (fully async) · MongoDB (Motor) · Pydantic v2 · OpenRouter · Hermes-2-Pro-Llama-3-8B · ElevenLabs TTS · SvelteKit (eval harness) · Unreal Engine (REST)

Outcome

The prototype was demoed as a live, interactive installation at the Tribeca Film Festival in June 2024 — players holding real-time, in-voice conversations with characters that knew where they were and what the weather was doing. It's the project I reach for when someone asks about context engineering, because it's the most literal version of it I've built: the world changes, and the characters notice.