Things I've built
stock-vetter
An LLM-driven research engine for fundamental stock analysis.
Problem
Doing real fundamental analysis on a company means reading hundreds of pages of SEC filings and earnings transcripts. This is slow work that's easy to do inconsistently. I wanted a tool that could perform a rigorous, repeatable first pass: pull the primary sources, reason about them the way a value investor actually does, and surface only what deserves a human's attention.
What it does
For any ticker it fetches the company's SEC filings and analyst-call transcripts, runs a structured multi-pass LLM analysis scoring the business across six dimensions, cross-checks the numbers with a reverse-DCF to expose the growth the market is implying, and produces a weighted verdict with its reasoning shown rather than hidden.
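A minimal sketch of the reverse-DCF idea, not the project's actual code: instead of forecasting growth and deriving a value, it takes the market value as given and solves for the growth rate that would justify it. The function names, the 10-year horizon, the 10% discount rate, and the 3% terminal growth are all illustrative assumptions.

```python
def dcf_value(fcf, growth, discount, terminal_growth, years=10):
    """Present value of `years` of growing free cash flow plus a Gordon terminal value."""
    if discount <= terminal_growth:
        raise ValueError("discount rate must exceed terminal growth")
    value = 0.0
    cash = fcf
    for year in range(1, years + 1):
        cash *= 1 + growth                      # cash flow in this year
        value += cash / (1 + discount) ** year  # discounted back to today
    terminal = cash * (1 + terminal_growth) / (discount - terminal_growth)
    return value + terminal / (1 + discount) ** years


def implied_growth(market_value, fcf, discount=0.10, terminal_growth=0.03,
                   years=10, lo=-0.5, hi=1.0, tol=1e-6):
    """Bisect for the growth rate at which the DCF value equals the market value.

    Assumes fcf > 0, so the DCF value rises monotonically with growth.
    """
    if fcf <= 0:
        raise ValueError("this sketch only handles positive free cash flow")
    low_value = dcf_value(fcf, lo, discount, terminal_growth, years)
    high_value = dcf_value(fcf, hi, discount, terminal_growth, years)
    if not low_value <= market_value <= high_value:
        raise ValueError("market value is outside the searched growth range")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if dcf_value(fcf, mid, discount, terminal_growth, years) < market_value:
            lo = mid   # model value too low, so the market implies more growth
        else:
            hi = mid
    return (lo + hi) / 2
```

Comparing the returned rate against what the filings suggest is plausible gives a quick sanity check on the qualitative scores: a high score paired with an implausibly high implied growth rate deserves a second look.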
What I learned
The hardest failures lived in the data, not the model. Careful review of outputs surfaced a fiscal-year bug that corrupted results for companies that don't report on a December calendar, and a parser quietly feeding the wrong 10-Q sections into the model. Fixing those, plus prompt caching and adaptive sampling that cut cost per ticker by roughly 40%, drove home that in LLM pipelines correctness lives in the source-data plumbing.
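To illustrate the class of fiscal-year bug described above, here is a hedged sketch of mapping a period end date to a fiscal year and quarter when the fiscal year does not end in December. It is not the fix that went into stock-vetter. The function name is hypothetical, and it assumes the common convention of labelling a fiscal year by the calendar year in which it ends.

```python
from datetime import date


def fiscal_period(period_end: date, fy_end_month: int) -> tuple[int, int]:
    """Return (fiscal_year, fiscal_quarter) for a reporting period end date.

    Assumption: the fiscal year is named after the calendar year in which it
    ends. 52/53-week calendars whose period end spills a few days into the
    next month are not handled here.
    """
    if not 1 <= fy_end_month <= 12:
        raise ValueError("fy_end_month must be between 1 and 12")
    # Months elapsed since the fiscal year started (0 = first fiscal month).
    offset = (period_end.month - fy_end_month - 1) % 12
    quarter = offset // 3 + 1
    # Periods ending after the fiscal year-end month belong to the next fiscal year.
    if period_end.month <= fy_end_month:
        fiscal_year = period_end.year
    else:
        fiscal_year = period_end.year + 1
    return fiscal_year, quarter
```

Treating every filer as a December reporter is equivalent to hard-coding `fy_end_month=12`, which silently shifts quarters for any company with a different year end.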
explore the interactive demo →
signal-tracker
Thesis-based change detection for an investment portfolio.
Problem
Most investing tools tell you what happened. Very few tell you when something has changed relative to why you invested in the first place. I wanted to write down my actual theses for each holding and be alerted only when new information genuinely tests them.
What it does
I encode a thesis per position: the kind of claim that, if it broke, would change my mind. A daily job watches for new filings, transcripts, and signals, uses an LLM to judge whether anything materially moves a thesis, and sends a concise email digest. A remote trigger lets me kick off a run from my phone.
Outcome
It converts a vague intention to "keep up with my positions" into a specific, low-noise feed of thesis-relevant change. It also shares its core infrastructure with stock-vetter, so the two compound.
read the write-up →
self-hosted coding agent
A single-file terminal coding agent on a local model — built to make the harness self-describing enough that the model can recover from its own mistakes.
Problem
I wanted to understand how agentic coding tools really work under the hood, and to run one entirely on my own hardware against open models: local-first, with no per-token cost in the common case and only an optional hosted fallback for when the local box is unreachable.
What it does
A single-file agent (agent.py) drives a local Qwen model through a six-tool loop: read a file, create a file, edit with fenced search/replace blocks, search the tree with ripgrep, run bash (the model itself flags risky commands for approval), and signal task_complete. It streams the model's output and feeds tool results back, ending only when the agent calls task_complete with file-path evidence that the harness verifies. The harness is self-describing: it states the working directory and tool invariants and returns actionable errors, so the model can recover from its own mistakes.
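A small sketch of what "actionable errors" can look like for the search/replace edit tool. This is an illustration rather than an excerpt from agent.py: `EditError` and `apply_edit` are hypothetical names, and the exact-match, single-occurrence rule is an assumed invariant.

```python
class EditError(Exception):
    """Raised with a message written for the model, saying what to do next."""


def apply_edit(source: str, search: str, replace: str) -> str:
    """Replace exactly one occurrence of `search` in `source`.

    Every failure explains how to fix the request, so the error text can be
    fed straight back to the model as the tool result.
    """
    if not search:
        raise EditError(
            "SEARCH block is empty. Include the exact lines you want to replace."
        )
    count = source.count(search)
    if count == 0:
        raise EditError(
            "SEARCH text was not found. Re-read the file and copy the lines "
            "exactly, including indentation and trailing whitespace."
        )
    if count > 1:
        raise EditError(
            f"SEARCH text matches {count} locations. Add surrounding lines "
            "until the match is unique."
        )
    return source.replace(search, replace, 1)
```

The design choice is that a failed edit never leaves the file half-changed and never fails silently: the model either gets the new file contents or a concrete instruction for its next attempt.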
Outcome
A genuinely usable, local-first coding agent in a single file, and a much sharper feel for tool-use design, harness legibility, evidence-gated termination, and where agentic systems break.
read the write-up →
StarCraft II theorycrafter
A factually-grounded strategy tool that refuses to hallucinate.
Problem
LLMs reason well but are unreliable on precise facts. For a game where unit costs and timings have to be exact, a model that confidently invents numbers is worse than useless.
What it does
It answers build-order and strategy questions for a specific patch by grounding every factual claim in a database and using the model only for the reasoning on top. It's a two-layer design where an SQLite fact store (populated from a replay parser) holds the truth and the LLM is never asked to recall it.
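The two-layer pattern can be sketched roughly as follows. The schema, the function names, and the row inserted here are placeholders for illustration, not real game data and not the project's actual tables. The point is that facts are looked up and handed to the model, and a missing fact leads to a refusal rather than a guess.

```python
import sqlite3


def open_fact_store() -> sqlite3.Connection:
    """Create an in-memory fact store with one placeholder row (not real game data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE unit_facts ("
        "patch TEXT, unit TEXT, minerals INTEGER, gas INTEGER, "
        "PRIMARY KEY (patch, unit))"
    )
    conn.execute(
        "INSERT INTO unit_facts VALUES (?, ?, ?, ?)",
        ("demo-patch", "demo_unit", 100, 25),
    )
    conn.commit()
    return conn


def lookup_cost(conn, patch, unit):
    """Return the stored cost for a unit on a patch, or None if it is unknown."""
    row = conn.execute(
        "SELECT minerals, gas FROM unit_facts WHERE patch = ? AND unit = ?",
        (patch, unit),
    ).fetchone()
    return None if row is None else {"minerals": row[0], "gas": row[1]}


def grounded_prompt(conn, patch, unit, question):
    """Build a prompt with the facts inlined; return None when a fact is missing."""
    facts = lookup_cost(conn, patch, unit)
    if facts is None:
        return None  # the caller should refuse rather than let the model guess
    return (
        f"Facts for patch {patch} (from the database, do not alter):\n"
        f"- {unit}: {facts['minerals']} minerals, {facts['gas']} gas\n\n"
        f"Question: {question}\n"
        "Reason only from the facts above."
    )
```

Keeping the lookup outside the model means a patch change only requires refreshing the database, and any number in an answer can be traced back to a row.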
Outcome
A concrete pattern for building grounded LLM tools. It's the same idea that powers my investing pipeline, applied here to a game I know at the top level.
read the write-up →
Where I work
Software Engineer III at Walmart Global Tech · Store Entities Enrichment Platform
I build Java / Spring Boot data pipelines on Kafka that enrich inventory data at scale, on the order of 40K+ events per second and 10M+ messages per day, running on Kubernetes with GCP and BigQuery. The work is squarely about reliability, throughput, and correctness in a system where being wrong is expensive and being slow is visible.
M.S., Artificial Intelligence · in progress
Formalizing the AI/ML foundations I've been building toward through self-study and the projects above, with the aim of moving from distributed-systems engineering into AI engineering as a primary focus.
Tools I reach for
Languages
Java · TypeScript · Python
Systems
Kafka · Spring Boot · Kubernetes · GCP / BigQuery
AI / LLM
RAG · LLM orchestration · prompt & cost optimization · local serving (Ollama)
Data & web
Turso / SQLite · Next.js · Vercel · GitHub Actions
Reading systems under pressure
StarCraft II Grandmaster (top ~0.1%)
At the top level it's the same exercise as the work I care about: read a complex state, weigh incomplete information, and commit to the highest-value move before the clock runs out. It's not a coincidence that I like building tools that do the analog of that, taking a noisy pile of real-world information and turning it into a clear, defensible decision. The instinct that makes a good game is the one I try to encode in software.