// Writing

Thinking out loud on hard problems

May 31, 2026
Accountability Isn't Blame: What a 2:30 AM Rollback Taught Me About Production Incidents
A real production incident, three mistakes I own without flinching, the four systemic gaps that made them inevitable, and the release engineering playbook to prevent a repeat.
May 24, 2026
Three Layers Between Your Code and Your Database
Drivers, ORMs, and wire protocol adapters do three different jobs. Knowing the difference is the line between a developer who uses databases and an engineer who designs systems around them.
May 21, 2026
Voicebox.sh — The Local-First Voice Stack That Just Made MCP Voice Agents Practical
An engineer's tour of voicebox.sh — what it actually is, how the local API and MCP server work, and what developers are already building on top of it.
May 11, 2026
The AI Stack Nobody Drew You a Map For: Agents, RAG, MCP, Skills, and the LLM Beneath Them All
Everyone says they're building AI. Almost nobody agrees on what that means. Here's the five-layer stack that actually explains what enterprise AI is made of — and how to stop picking the wrong piece for the wrong problem.
May 10, 2026
AWS Just Made Its Platform Agent-Native: What the AWS MCP Server GA Actually Means
The AWS MCP Server is generally available — and it's not just a developer convenience. It's AWS making a bet that agents, not humans, will be the primary interface to cloud infrastructure.
May 2, 2026
AI Concepts Glossary: A Principal Engineer's Reference
A living reference for AI terminology organized into five conceptual domains — model access & licensing, training & customization, architecture & deployment, evaluation & use, and representations, search & agents.
April 28, 2026
The Infrastructure That Enforces Itself: Compliance-Grade Multi-Tenant SaaS on Amazon EKS
A deep dive into building compliance-grade multi-tenant SaaS on Amazon EKS — using Flux GitOps, Terraform Enterprise workspace versioning, Vault 2.0 workload identity, and Argo Workflows for fully automated, auditable tenant onboarding.
April 21, 2026
Deploying Production-Grade LLM Inference on AWS EKS — A Hands-On Deep Dive
An architectural walkthrough of the GenAI on EKS workshop — vLLM, Ray Serve, Karpenter, DCGM + AMP observability, and AWS Strands Agents — with the design decisions behind each layer.
April 2, 2026
PageIndex and Vectorless RAG — A Structural Alternative for Professional Documents
Reasoning-based retrieval as an alternative to vector similarity search for structured professional documents: how PageIndex achieves 98.7% on FinanceBench, with domain use cases across healthcare, wealth management, banking, and travel, plus an implementation pathway.
April 1, 2026
Mamba and SSMs — What the Generation Backbone Change Means for RAG
A systems-level analysis of replacing the Transformer backbone with Mamba/SSM architectures in RAG systems — covering linear context scaling, constant KV cache memory, selective state tracking (Mamba-3), and the hybrid Transformer+Mamba pattern for enterprise deployment.

20 posts