Thinking out loud on hard problems
- Accountability Isn't Blame: What a 2:30 AM Rollback Taught Me About Production Incidents
A real production incident, three mistakes I own without flinching, and the four systemic gaps that made them inevitable. Plus the release engineering playbook to prevent a repeat.
- Three Layers Between Your Code and Your Database
Drivers, ORMs, and wire protocol adapters do three different jobs. Conflating them is the line between a developer who uses databases and an engineer who designs systems around them.
- Voicebox.sh — The Local-First Voice Stack That Just Made MCP Voice Agents Practical
An engineer's tour of voicebox.sh — what it actually is, how the local API and MCP server work, and what developers are already building on top of it.
- The AI Stack Nobody Drew You a Map For: Agents, RAG, MCP, Skills, and the LLM Beneath Them All
Everyone says they're building AI. Almost nobody agrees on what that means. Here's the five-layer stack that actually explains what enterprise AI is made of — and how to stop picking the wrong piece for the wrong problem.
- AWS Just Made Its Platform Agent-Native: What the AWS MCP Server GA Actually Means
The AWS MCP Server is generally available — and it's not just a developer convenience. It's AWS making a bet that agents, not humans, will be the primary interface to cloud infrastructure.
- AI Concepts Glossary: A Principal Engineer's Reference
A living reference for AI terminology organized into five conceptual domains — model access & licensing, training & customization, architecture & deployment, evaluation & use, and representations, search & agents.
- The Infrastructure That Enforces Itself: Compliance-Grade Multi-Tenant SaaS on Amazon EKS
A deep dive into building compliance-grade multi-tenant SaaS on Amazon EKS — using Flux GitOps, Terraform Enterprise workspace versioning, Vault 2.0 workload identity, and Argo Workflows for fully automated, auditable tenant onboarding.
- Deploying Production-Grade LLM Inference on AWS EKS — A Hands-On Deep Dive
An architectural walkthrough of the GenAI on EKS workshop — vLLM, Ray Serve, Karpenter, DCGM + AMP observability, and AWS Strands Agents — with the design decisions behind each layer.
- PageIndex and Vectorless RAG — A Structural Alternative for Professional Documents
Reasoning-based retrieval as an alternative to vector similarity search for structured professional documents: how PageIndex reaches 98.7% accuracy on FinanceBench, with domain use cases in healthcare, wealth management, banking, and travel, and an implementation pathway.
- Mamba and SSMs — What the Generation Backbone Change Means for RAG
A systems-level analysis of replacing the Transformer backbone with Mamba/SSM architectures in RAG systems, covering linear context scaling, constant-size state memory in place of a growing KV cache, selective state tracking (Mamba-3), and the hybrid Transformer+Mamba pattern for enterprise deployment.
20 posts