Deep dives into cloud architecture, DevOps practices, AI engineering, and the journey to senior technical leadership.
Most LLM benchmarks evaluate text. We needed to evaluate entire websites. Here's the four-layer evaluation framework we built to score AI-generated multi-file artifacts using a violation-deduction model.
A practical guide to deploying a local Python MCP server to Amazon Bedrock AgentCore Runtime — from localhost prototype to production-grade cloud service with session isolation, authentication, and observability.
A six-layer memory architecture for persistent AI agent knowledge, from CLAUDE.md foundations to auto memory, plans, and permissions, used to manage 14 specialized agents across multiple AWS projects.