Nghia Dang · Software Engineer

/ selected work

Things I've built

rok-data-pipeline · flagship

2.4M game profiles a day, taken out of a live client's memory and served to the web in tens of milliseconds — a reverse-engineered fetch layer, a multi-machine harvesting fleet, and a ClickHouse serving model built so the whole thing costs almost nothing to run.

stack Python · Win32 thread hijack · Lua C API · ClickHouse · Next.js · Fly.io · Cloudflare · Tailscale

scale 4,000 kingdoms · 600 deep · 2.4M profiles/day · ~876M rows/year

scope solo build, two repos, end to end

status live; no longer actively developed

Problem

The numbers that actually decide a Rise of Kingdoms KvK (a governor's lifetime total kill points, the T1–T5 kill breakdown, lifetime deaths and healing) exist in no public API. Lilith's official endpoint returns only timeframe stats, and it 403s for any kingdom you don't have a character in. Those totals live in exactly one place: the memory of a running game client. Getting one governor out is a reverse-engineering problem. Getting every governor in every kingdom, every day, is a systems problem.

What it does

Two halves. The fetcher runs a Lua chunk inside the live PC client by briefly hijacking an engine thread, calls the game's own by-ID profile fetch, and hooks the reply handler so the profile card never opens, which is what makes hundreds of back-to-back fetches crash-free instead of fatal. A coordinator then hands kingdoms to a fleet of sandboxed clients across several machines over Tailscale, each self-sizing its scan. The serving half ingests the results into ClickHouse and precomputes every aggregate in refreshable materialized views, so a page view is an indexed lookup rather than a query over history.

Scale

A full sweep is 4,000 kingdoms at 600 governors each: 2.4M profiles a day, appended to a history that is never truncated, which compounds to roughly 876M rows a year at 49 columns apiece. That number is what shapes both halves. A per-IP rate limit paces the fetch at ~1.2s per governor, so one sweep is ~800 hours of single-client time and only exists as a fleet, which makes not fetching at all the real optimisation. On the serving side, it is precisely the volume where a row-store starts costing real money, so the whole data model is built for a columnar engine: month partitions, dictionary-encoded columns, wide rows kept narrow, and every aggregate precomputed on the ingest cadence. The result serves in tens of milliseconds off one 2GB machine that suspends when idle.
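The scale figures above can be re-derived with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the constants come from the text, while the `min_clients` helper and its 24-hour window are assumptions added for illustration, not part of the project.

```python
import math

# Figures quoted in the text.
KINGDOMS = 4_000
GOVERNORS_PER_KINGDOM = 600
SECONDS_PER_FETCH = 1.2   # approximate pacing under the per-IP rate limit
DAYS_PER_YEAR = 365

profiles_per_day = KINGDOMS * GOVERNORS_PER_KINGDOM                  # 2.4M
rows_per_year = profiles_per_day * DAYS_PER_YEAR                     # 876M
single_client_hours = profiles_per_day * SECONDS_PER_FETCH / 3600    # ~800 h


def min_clients(sweep_hours: float, window_hours: float = 24.0) -> int:
    """Smallest number of independently rate-limited clients that could
    finish a full sweep inside the window (hypothetical helper)."""
    return math.ceil(sweep_hours / window_hours)


if __name__ == "__main__":
    print(f"{profiles_per_day:,} profiles/day")
    print(f"{rows_per_year:,} rows/year")
    print(f"{single_client_hours:.0f} single-client hours per sweep")
    print(f"{min_clients(single_client_hours)} clients for a 24h sweep")
```

Under those assumptions a brute-force daily sweep would need dozens of clients that each have their own rate-limit budget, which is why avoiding fetches matters more than speeding them up.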

What I learned

Most of the project was being wrong in public. PROGRESS.md runs 27 sections across a dozen sessions, including a section that declares arbitrary-kingdom fetch solved and the next one that retracts it with proof. The wins came from reading, not guessing: passive disassembly found the crash root cause that a debugger couldn't (the anti-tamper layer fights debuggers but ignores ReadProcessMemory), and the fix that finally shipped was noticing why the client crashed on a second fetch rather than out-engineering it.

read the write-up →

stock-vetter

An LLM-driven research engine for fundamental stock analysis.

stack TypeScript · pnpm monorepo · Next.js · Turso · Vercel

scope solo design & build

status live

Problem

Doing real fundamental analysis on a company means reading hundreds of pages of SEC filings and earnings transcripts. This is slow work that's easy to do inconsistently. I wanted a tool that could perform a rigorous, repeatable first pass: pull the primary sources, reason about them the way a value investor actually does, and surface only what deserves a human's attention.

What it does

For any ticker it fetches the company's SEC filings and analyst-call transcripts, runs a structured multi-pass LLM analysis scoring the business across six dimensions, cross-checks the numbers with a reverse-DCF to expose the growth the market is implying, and produces a weighted verdict with its reasoning shown rather than hidden.
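To make the reverse-DCF step concrete, here is a minimal sketch of the general technique: instead of forecasting growth and computing a value, it searches for the growth rate at which a simple discounted-cash-flow model reproduces the current market value. The two-stage model, the default discount rate, horizon and terminal growth, and the function names are all assumptions for illustration; the actual model in stock-vetter may differ.

```python
def dcf_value(fcf: float, growth: float, discount: float = 0.10,
              years: int = 10, terminal_growth: float = 0.025) -> float:
    """Present value of free cash flow growing at `growth` for `years`,
    followed by a Gordon-growth terminal value."""
    if discount <= terminal_growth:
        raise ValueError("discount rate must exceed terminal growth")
    value = 0.0
    cash = fcf
    for year in range(1, years + 1):
        cash *= 1 + growth
        value += cash / (1 + discount) ** year
    terminal = cash * (1 + terminal_growth) / (discount - terminal_growth)
    return value + terminal / (1 + discount) ** years


def implied_growth(market_value: float, fcf: float,
                   lo: float = -0.5, hi: float = 1.0,
                   tol: float = 1e-9, **model) -> float:
    """Bisect for the growth rate whose DCF value equals `market_value`.

    Relies on dcf_value increasing with growth, which holds when fcf > 0.
    """
    if fcf <= 0:
        raise ValueError("reverse-DCF sketch needs positive free cash flow")
    if not dcf_value(fcf, lo, **model) <= market_value <= dcf_value(fcf, hi, **model):
        raise ValueError("market value is outside the searchable growth range")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if dcf_value(fcf, mid, **model) < market_value:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # Round trip: price a business at 8% growth, then recover that 8%.
    price = dcf_value(fcf=100.0, growth=0.08)
    print(f"implied growth: {implied_growth(price, fcf=100.0):.4f}")
```

The implied rate can then be compared against what the filings make plausible, which is the cross-check the paragraph above describes.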

What I learned

The hardest failures lived in the data, not the model. Careful review of outputs surfaced a fiscal-year bug that corrupted results for companies that don't report on a December calendar, and a parser quietly feeding the wrong 10-Q sections into the model. Fixing those, plus prompt caching and adaptive sampling that cut cost per ticker by roughly 40%, drove home that in LLM pipelines correctness lives in the source-data plumbing.

explore the interactive demo →

signal-tracker

Thesis-based change detection for an investment portfolio.

stack shared core · GitHub Actions cron · Turso · Resend · Vercel

scope solo build

status live, daily

Problem

Most investing tools tell you what happened. Very few tell you when something has changed relative to why you invested in the first place. I wanted to write down my actual theses for each holding and be alerted only when new information genuinely tests them.

What it does

I encode a thesis per position: the kind of claim that, if it broke, would change my mind. A daily job watches for new filings, transcripts, and signals, uses an LLM to judge whether anything materially moves a thesis, and sends a concise email digest. A remote trigger lets me kick off a run from my phone.

Outcome

It converts a vague intention to "keep up with my positions" into a specific, low-noise feed of thesis-relevant change. It also shares its core infrastructure with stock-vetter, so the two compound.

read the write-up →

self-hosted coding agent

A single-file terminal coding agent on a local model — built to make the harness self-describing enough that the model can recover from its own mistakes.

stack Python · Ollama · qwen3-coder:30b · DeepSeek (fallback) · ripgrep · OpenAI client · Tailscale

scope solo build, single file (agent.py)

status v1.2

Problem

I wanted to understand how agentic coding tools really work under the hood, and to run one entirely on my own hardware against open models: local-first, with no per-token cost in the common case and only an optional hosted fallback for when the local box is unreachable.

What it does

A single-file agent (agent.py) drives a local Qwen model through a six-tool loop: read a file, create a file, edit with fenced search/replace blocks, search the tree with ripgrep, run bash (the model itself flags risky commands for approval), and signal task_complete. It streams the model's output and feeds tool results back, ending only when the agent calls task_complete with file-path evidence the harness verifies. The harness is self-describing: it states the working directory and tool invariants and returns actionable errors, so the model can recover from its own mistakes.
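As an illustration of what "actionable errors" can look like for the search/replace edit tool, here is a minimal sketch. It is not the code in agent.py: the `ToolResult` type, the `apply_edit` name, and the exact messages are hypothetical. The idea is that every failure tells the model what to do next instead of only reporting that something went wrong.

```python
from dataclasses import dataclass


@dataclass
class ToolResult:
    ok: bool
    message: str   # fed back to the model verbatim
    content: str   # file text after the call (unchanged on failure)


def apply_edit(source: str, search: str, replace: str) -> ToolResult:
    """Apply one search/replace block, requiring a unique exact match."""
    if not search:
        return ToolResult(False, "search block is empty; include the exact "
                                 "lines you want to replace", source)
    matches = source.count(search)
    if matches == 0:
        return ToolResult(False, "search block not found; re-read the file and "
                                 "copy the text exactly, including whitespace",
                          source)
    if matches > 1:
        return ToolResult(False, f"search block matches {matches} places; add "
                                 "surrounding lines so it is unique", source)
    return ToolResult(True, "edit applied", source.replace(search, replace, 1))


if __name__ == "__main__":
    text = "x = 1\ny = 2\n"
    print(apply_edit(text, "y = 2", "y = 3").content)
    print(apply_edit(text, "z = 9", "z = 0").message)
```

Rejecting ambiguous matches rather than editing the first one keeps the tool's behaviour predictable, so a failed call leaves the file untouched and the model can simply retry with more context.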

Outcome

A genuinely usable, local-first coding agent in a single file, and a much sharper feel for tool-use design, harness legibility, evidence-gated termination, and where agentic systems break.

read the write-up →

StarCraft II theorycrafter

A factually-grounded strategy tool that refuses to hallucinate.

stack Python · SQLite · LLM reasoning layer · replay parsing · uv

scope solo build

status in progress

Problem

LLMs reason well but are unreliable on precise facts. For a game where unit costs and timings have to be exact, a model that confidently invents numbers is worse than useless.

What it does

It answers build-order and strategy questions for a specific patch by grounding every factual claim in a database and using the model only for the reasoning on top. It's a two-layer design where an SQLite fact store (populated from a replay parser) holds the truth and the LLM is never asked to recall it.
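The two-layer split can be sketched with Python's built-in sqlite3 module. Everything specific here is an assumption for illustration: the `unit_facts` schema, the `example-patch` label, the hand-inserted sample row (the real store is populated from a replay parser), and the function names. The point is only the shape: facts are looked up, never recalled, and a missing fact produces a refusal rather than a guess.

```python
import sqlite3
from typing import Optional


def open_fact_store() -> sqlite3.Connection:
    """In-memory fact store with one illustrative row (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE unit_facts (
               patch TEXT NOT NULL,
               unit TEXT NOT NULL,
               minerals INTEGER NOT NULL,
               gas INTEGER NOT NULL,
               build_seconds INTEGER NOT NULL,
               PRIMARY KEY (patch, unit))"""
    )
    # Sample values for illustration only; real rows come from parsed data.
    conn.execute("INSERT INTO unit_facts VALUES (?, ?, ?, ?, ?)",
                 ("example-patch", "Marine", 50, 0, 18))
    conn.commit()
    return conn


def lookup_unit(conn: sqlite3.Connection, patch: str, unit: str) -> Optional[dict]:
    """Return the stored facts for a unit on a patch, or None if unknown."""
    row = conn.execute(
        "SELECT minerals, gas, build_seconds FROM unit_facts "
        "WHERE patch = ? AND unit = ?", (patch, unit)).fetchone()
    if row is None:
        return None
    return {"minerals": row[0], "gas": row[1], "build_seconds": row[2]}


def grounded_prompt(conn: sqlite3.Connection, patch: str, unit: str,
                    question: str) -> Optional[str]:
    """Build a prompt that carries the facts, or None when there is no
    fact to ground on (the caller should refuse instead of guessing)."""
    facts = lookup_unit(conn, patch, unit)
    if facts is None:
        return None
    return (f"Facts for {unit} on {patch}: {facts['minerals']} minerals, "
            f"{facts['gas']} gas, {facts['build_seconds']}s build time.\n"
            f"Using only these facts, answer: {question}")


if __name__ == "__main__":
    store = open_fact_store()
    print(grounded_prompt(store, "example-patch", "Marine", "How many by 2:00?"))
    print(grounded_prompt(store, "example-patch", "Zealot", "How many by 2:00?"))
```

Keeping the lookup outside the model means a wrong number can only come from the database, where it can be found and fixed.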

Outcome

A concrete pattern for building grounded LLM tools. It's the same idea that powers my investing pipeline, applied here to a game I know at the top level.

read the write-up →

/ experience

Where I work

2022 to present · Reston, VA

Software Engineer III at Walmart Global Tech · Store Entities Enrichment Platform

I build Java / Spring Boot data pipelines on Kafka that enrich inventory data at scale, on the order of 40K+ events per second and 10M+ messages per day, running on Kubernetes with GCP and BigQuery. The work is squarely about reliability, throughput, and correctness in a system where being wrong is expensive and being slow is visible.

Fall 2026 · George Mason University

M.S., Artificial Intelligence · in progress

Formalizing the AI/ML foundations I've been building toward through self-study and the projects above, with the aim of moving from distributed-systems engineering into AI engineering as a primary focus.

/ stack

Tools I reach for

Languages

Java · TypeScript · Python

Systems

Kafka · Spring Boot · Kubernetes · GCP / BigQuery

AI / LLM

RAG · LLM orchestration · prompt & cost optimization · local serving (Ollama)

Data & web

ClickHouse · Turso / SQLite · Next.js · Vercel · Fly.io · Cloudflare · GitHub Actions

/ beyond code

Reading systems under pressure

StarCraft II Grandmaster (top ~0.1%)

At the top level it's the same exercise as the work I care about: read a complex state, weigh incomplete information, and commit to the highest-value move before the clock runs out. It's not a coincidence that I like building tools that do the analog of that, taking a noisy pile of real-world information and turning it into a clear, defensible decision. The instinct that makes a good game is the one I try to encode in software.

/ contact

Get in touch

Happy to talk about distributed systems, applied LLM work, or anything I've built here. The fastest way to reach me is email, and I'm on LinkedIn and GitHub.