Blog

Deep dives into cloud architecture, DevOps practices, AI engineering, and the journey to senior technical leadership.

AWS & Cloud Architecture Apr 29, 2026

AI Governance in Practice: FastAPI on EKS with Model Cards, Audit Logging, and Helm

How I built an AI governance platform on AWS EKS: a FastAPI inference endpoint with per-request audit logging, a model card endpoint, fairness metadata, and a Helm-packaged deployment with HPA.

aws kubernetes python

AWS & Cloud Architecture Apr 29, 2026

Building a Production MLOps Pipeline on AWS SageMaker for Telecom Churn

How I built an end-to-end MLOps pipeline with SageMaker Pipelines, automated retraining via EventBridge, and drift monitoring using KS tests and CloudWatch — for a telecom churn use case.

mlops aws sagemaker

AWS & Cloud Architecture Apr 29, 2026

Predicting Telecom Customer Churn with scikit-learn, Keras, and Amazon SageMaker

Learn how to build a telecom customer churn predictor using Random Forest and Keras neural networks, then deploy it to a real-time SageMaker endpoint. Full code included.

machinelearning aws sagemaker

AWS & Cloud Architecture Apr 29, 2026

Building a Real-Time IoT Telemetry Pipeline with Kinesis, Lambda, and DynamoDB

How I built a real-time IoT data pipeline on AWS — device simulator → Kinesis stream → Lambda consumer → DynamoDB — with anomaly detection that fires SNS alerts and CloudWatch metrics.

aws iot kinesis

GenAI & AI Engineering Apr 13, 2026

How I Run Over 20 AI Agents Locally and Deploy One to Production at a Time

The industry ships agents fast and debugs them in production. Here's the opposite approach — local-first agentic development, liftability by design, and selective promotion to AWS Bedrock AgentCore.

mlops llmops agentops

GenAI & AI Engineering Apr 2, 2026

The Missing Test Suite: Why AI Projects Fail Before Production

Most AI projects never reach production. The missing piece is prompt testing — with the same rigour as TDD. Here's the strategy for shipping AI systems that actually work.

ai testing software-engineering

GenAI & AI Engineering Mar 31, 2026

Building an LLM Judge That Doesn't Lie to You

Our first LLM judge gave a 9/10 to a page with invisible text. Here's how we fixed it with structural guardrails, multimodal inputs, and a fixed-weight violation catalogue.

ai evaluation llm-as-judge

GenAI & AI Engineering Mar 30, 2026

Beyond Text: How We Built an Evaluation Framework for Multi-File AI Outputs

Most LLM benchmarks evaluate text. We needed to evaluate entire websites. Here's the 4-layer evaluation framework we built to score AI-generated multi-file artifacts using a violation-deduction model.

ai evaluation llm

GenAI & AI Engineering Mar 30, 2026

5 Models, 467 Actions, 1 Winner — What We Learned Comparing LLMs on Real Code Generation

We tested Claude Sonnet, Kimi K2.5, Claude Haiku, DeepSeek V3.2, and DeepSeek R1 on the same 16-action website generation pipeline. The results weren't what we expected.

ai llm claude

AWS & Cloud Architecture Mar 3, 2026

From IDE to Cloud: Lifting Your Local Agent into an MCP Server on Amazon Bedrock AgentCore

A practical guide to deploying a local Python MCP server to Amazon Bedrock AgentCore Runtime — from localhost prototype to production-grade cloud service with session isolation, authentication, and observability.

ai aws python

GenAI & AI Engineering Mar 3, 2026

How I Create Memory for My Agents on Claude Code

A six-layer memory architecture for persistent AI agent knowledge — from CLAUDE.md foundations to auto memory, plans, and permissions — managing 14 specialized agents across multiple AWS projects.

ai claude-code agent-memory