Architecting Resilient
Backend Systems & Intelligent AI Agents.
I'm Aaron Wu. Drawing heavily on system design literature, I turn complex ML research into production code and build high-concurrency infrastructure that protects AI services from traffic spikes.

About Me
As an international student, I treat rapid adaptation and navigating complex constraints as my baseline. I am a Backend Software Engineer deeply focused on AI Infrastructure and Distributed Systems.
From deploying company-wide GraphRAG agents at AMD to translating federated learning math into code, my engineering philosophy is shaped by a constant intake of system design literature. I prioritize underlying technical competency over hypothetical impact, which means building systems that degrade gracefully and remain resilient under extreme load.
Technical Skills
Backend & Systems Languages
Data & Message Infrastructure
AI & Search Engine Stack
Cloud, DevOps & Tooling
Experience
Real-world experience architecting resilient backend systems and enterprise AI infrastructure.
Systems Infrastructure Engineer Intern
Architected a scalable GraphRAG-based AI agent to automate codebase understanding across 6 languages. Engineered a prompt-driven graph retrieval pipeline and optimized memory management with a plan-and-act algorithm.
Research Assistant (Federated Learning)
Conducted empirical experiments on federated hyperparameter learning algorithms, specifically addressing challenges with heterogeneous datasets and data sparsity.
Technical Lead
Directed the architectural shift toward a real-time voice assistant app. Engineered a serverless RAG pipeline utilizing AWS Lambda and Pinecone vector database for automated daily data ingestion.
R&D Intern
Designed and deployed an end-to-end ML data annotation platform on AWS (EC2, S3, RDS) in 12 days. Automated Python ETL workflows to securely process ECG data, increasing doctor efficiency by 70%.
Featured Projects
A showcase of my systems engineering work: tackling high-concurrency bottlenecks, distributed rate limiting, and peak-load shaving.
Distributed AI API Gateway
Resilient AI Reverse Proxy & Rate Limiter
Engineered a high-performance distributed AI Gateway in Go to route and protect LLM endpoints. It features a 3-layer rate-limiting defense (Global, IP, Identity) that uses Redis Lua scripts for atomic operations, and it supports both Fail-Open and Fail-Closed failure modes.
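The layered check and the two failure modes can be sketched roughly as follows. This is an illustrative Python sketch, not the gateway's actual Go code: `InMemoryCounterStore` stands in for Redis, a plain dict increment replaces the atomic Lua script, and every name here (`LayeredLimiter`, `StoreUnavailable`, the `rl:` key format, the fixed 60-second window) is an assumption made for the example.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


class StoreUnavailable(Exception):
    """Raised when the counter store (Redis in a real setup) cannot be reached."""


@dataclass
class InMemoryCounterStore:
    """Hypothetical stand-in for Redis: one counter per key."""
    counts: dict = field(default_factory=dict)
    healthy: bool = True

    def incr(self, key: str) -> int:
        # A real deployment would run increment-and-expire atomically
        # inside Redis; a dict is enough to show the control flow.
        if not self.healthy:
            raise StoreUnavailable(key)
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]


@dataclass
class LayeredLimiter:
    store: InMemoryCounterStore
    limits: dict              # layer name -> max requests per window
    window_seconds: int = 60
    fail_open: bool = True    # True: allow on store failure; False: reject

    def allow(self, ip: str, identity: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        window = int(now // self.window_seconds)
        # Layers are checked in order: global, then IP, then identity.
        layers = [("global", "all"), ("ip", ip), ("identity", identity)]
        for layer, subject in layers:
            key = f"rl:{layer}:{subject}:{window}"
            try:
                count = self.store.incr(key)
            except StoreUnavailable:
                # Fail-Open keeps traffic flowing; Fail-Closed protects the backend.
                return self.fail_open
            if count > self.limits[layer]:
                # Simplification: counters of earlier layers stay incremented.
                return False
        return True
```

With limits of `{"global": 100, "ip": 2, "identity": 5}`, a third request from the same IP inside one window is rejected at the IP layer before the identity counter is touched. Marking the store unhealthy makes `allow` return whatever `fail_open` is set to.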
FlashForm
High-Performance Seckill Backend
Architected a high-concurrency Seckill (flash sale) backend handling 1,000+ QPS with a 0.00% error rate. It uses Redis DECR for atomic quota enforcement to guarantee zero overselling, and RabbitMQ for asynchronous peak shaving to protect the downstream PostgreSQL database from thundering herds.
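The decrement-then-check idea behind DECR-based quota enforcement can be illustrated with a small Python sketch. It is an assumption-laden stand-in rather than FlashForm's code: `AtomicCounter` emulates a Redis key with a lock, `try_reserve` and `run_sale` are hypothetical names, and the compensating increment on failure is one common variant, not necessarily the one the project uses. The RabbitMQ stage is left out.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class AtomicCounter:
    """Hypothetical stand-in for a Redis key; the lock emulates atomic DECR/INCR."""

    def __init__(self, value: int) -> None:
        self._value = value
        self._lock = threading.Lock()

    def decr(self) -> int:
        with self._lock:
            self._value -= 1
            return self._value

    def incr(self) -> int:
        with self._lock:
            self._value += 1
            return self._value


def try_reserve(stock: AtomicCounter) -> bool:
    """Decrement first, then inspect the result, so no read-then-write race exists."""
    remaining = stock.decr()
    if remaining < 0:
        stock.incr()  # undo the decrement so the counter settles back at zero
        return False
    return True


def run_sale(stock_size: int, buyers: int) -> int:
    """Fire `buyers` concurrent purchase attempts and return how many succeeded."""
    stock = AtomicCounter(stock_size)
    with ThreadPoolExecutor(max_workers=16) as pool:
        results = list(pool.map(lambda _: try_reserve(stock), range(buyers)))
    return sum(results)
```

Because each caller learns the post-decrement value in the same atomic step, at most `stock_size` callers can ever see a non-negative result, regardless of how the threads interleave. In this sketch only the successful reservations would then be handed to a queue for asynchronous order processing.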
Real-Time Data Streaming Pipeline
Containerized Big Data Orchestration
Built an end-to-end real-time data streaming pipeline orchestrating containerized services via Docker Compose. Leverages Apache Kafka for robust message brokering, Apache Airflow for DAG orchestration, Apache Spark for distributed data processing, and Cassandra for NoSQL storage.
RiskFreeRX
AI-Powered Medication Risk Analysis
Developed an award-winning medication risk validation system (HackRU Winner). Extracted FDA data using Google Cloud Vision and LLM-powered NLP, delivering a secure API via AWS Lambda and Supabase.
Featured Articles
Writing about backend architecture, system design, career journeys, and observations on the tech industry.
Beyond the Code: From 770 Resumes to AMD & Apple
A journey of letting go of university prestige, reverse-engineering the job market, and using Interview-Prompted Learning to land top-tier infrastructure roles.
Architecting a Resilient AI API Gateway: Deep Dive into Distributed Rate Limiting
A deep dive into building a production-grade AI API Gateway in Go, exploring dual-layer caching, Redis Lua atomic operations, and Fail-Open vs. Fail-Closed distributed strategies.
Designing for the Surge: Architectural Trade-offs in Building a High-Concurrency Ticketing System
An inside look at the engineering decisions behind FlashForm, exploring asynchronous load leveling, Redis mutex locks, and event-driven architecture.
What I'm Reading
A curated list of books that have fundamentally shaped my approach to systems architecture, peak productivity, and the psychology of human behavior.
Designing Data-Intensive Applications — Martin Kleppmann
The "holy grail" of distributed systems. It taught me how to reason about reliability, scalability, and maintainability in complex data-driven architectures.
System Design Interview: An Insider’s Guide — Alex Xu
A masterclass in breaking down massive, ambiguous problems into scalable, modular components. It shaped my framework for communicating complex technical decisions.
Getting Things Done (GTD) — David Allen
The operating system for my life. It taught me how to offload cognitive load into a trusted system, allowing my brain to focus on creative problem-solving rather than remembering tasks.
Make Time — Jake Knapp & John Zeratsky
The practical guide to daily focus. It taught me how to intentionally design my day around a single "Highlight" and protect my energy from the thundering herd of distractions.
The Psychology of Money — Morgan Housel
A profound exploration of behavior and ego. It taught me that technical problems are often human problems, and that long-term success comes from managing one's own psychology.
Owning Your Own Shadow — Robert A. Johnson
A deep dive into Jungian psychology. It taught me the importance of radical self-honesty and integrating all aspects of the self to become a more grounded and effective leader.
Let's Connect
I'm currently looking for new opportunities. Whether you have a question or just want to say hi, my inbox is always open!