Software Engineer @ Amazon · LLM Infrastructure & Multimodal Platforms

Tung-Sheng (Sean) Lee

Building reliable cloud platforms for AI-driven products

I build and scale mission-critical backend systems for AI-driven products, with deep experience in distributed systems, LLM infrastructure, and multimodal platforms serving global traffic.

100K+ TPS at scale
2.5+ years of production engineering
3x throughput improvements

About Me

Hands-on engineer focused on resilient architecture and measurable business impact.

I'm a Software Engineer with expertise in building large-scale distributed systems and cloud infrastructure. Currently at Amazon, I build and scale mission-critical backend services powering Echo Show's LLM-driven multimodal experiences, sustaining 100K+ TPS with high availability across global traffic.

My work spans multi-agent LLM orchestration, RAG and hybrid retrieval, scalable data pipelines, and model-serving systems that enable real-time, context-aware AI experiences in production.

Programming

Python · Java · Kotlin · Go · JavaScript · C# · C/C++

Data Engineering

Kafka · Spark · Flink · Iceberg · DynamoDB · MySQL · Tableau

Cloud & DevOps

AWS · CDK · Docker · Kubernetes · CloudWatch

AI/ML

TensorFlow · PyTorch · BERT · LLM · RAG

Work Experience

A track record of shipping stable systems, improving efficiency, and scaling platforms.

Software Development Engineer

Amazon · Aug 2024 - Present · Sunnyvale, CA
  • Built and scaled mission-critical backend services powering Echo Show's LLM-driven multimodal experiences, sustaining 100K+ TPS with high availability across global traffic
  • Led the architecture and delivery of a multi-agent LLM orchestration platform with RAG and hybrid retrieval, reducing onboarding time and operational load across teams
  • Architected an LLM-driven conversational system for AI-based visual experiences, built on scalable data pipelines, model-serving infrastructure, and continuous fine-tuning

Graduate Research Assistant

Carnegie Mellon University · Jan 2024 - May 2024 · Mountain View, CA
  • Designed data sharding proxy using consistent hashing, improving query performance by 140%
  • Implemented Raft algorithm and Two-Phase Commit (2PC) protocol in C++ for data consistency

Software Engineer (Full-Stack)

Advantech · Oct 2023 - Jan 2024 · Remote
  • Built containerized storage service with GraphQL and MongoDB, cutting cost by 80% while serving 500 QPS
  • Developed LLM-powered RAG chatbot, indexing 10K+ documents for 5K+ global users

Software Engineer Intern

Alifecom · Jun 2023 - Aug 2023 · Remote
  • Designed cloud-native validation framework for DSP hardware testing on AWS
  • Built multi-threaded TCP testing service, improving throughput by 3x while reducing cost by 50%

Software Engineer (AI/ML)

Advantech · Sep 2021 - Aug 2022 · Taipei, Taiwan
  • Designed Market Intelligence Platform with BERT-based NER models and ETL pipelines
  • Developed BI platform in C# (.NET Core MVC), raising dashboard delivery from 1-2 per month to dozens
  • Improved prediction accuracy by 20% using XGBoost forecasting with market insights

Key Projects

Selected work across platform engineering, AI systems, and cloud architecture.

Alexa+ Experiences

Built and scaled backend services for Echo Show's LLM-driven multimodal experiences, sustaining 100K+ TPS and enabling real-time, context-aware AI interactions at global scale.

AWS · LLM · High Availability

Cloud-Based DSP Validation Platform

Designed an automated, cloud-native hardware validation framework, improving test throughput by 3x while reducing infrastructure cost by 50%.

AWS · Docker · Python

Market Intelligence Platform

Engineered distributed data pipelines and ML systems providing real-time supply insights for 1M+ materials using BERT NER and XGBoost.

AI/ML · ETL · Tableau

Blog Posts

I write about distributed systems, AI engineering, and lessons from production.

Blog posts coming soon...

Get In Touch

Open to backend, distributed systems, and AI platform opportunities.