Software Engineer @ Amazon · LLM Infrastructure & Multimodal Platforms

Tung-Sheng (Sean) Lee

Building reliable cloud platforms for AI-driven products

I build and scale mission-critical backend systems for AI-driven products, with deep experience in distributed systems, LLM infrastructure, and multimodal platforms serving global traffic.

100K+ TPS at scale
2.5+ yrs Production engineering
3x Throughput improvements

About Me

Hands-on engineer focused on resilient architecture and measurable business impact.

I'm a Software Engineer with expertise in building large-scale distributed systems and cloud infrastructure. Currently at Amazon, I build and scale mission-critical backend services powering Echo Show's LLM-driven multimodal experiences, sustaining 100K+ TPS with high availability across global traffic.

My work spans multi-agent LLM orchestration, RAG and hybrid retrieval, scalable data pipelines, and model-serving systems that enable real-time, context-aware AI experiences in production.

Programming

Python Java Kotlin Go JavaScript C# C/C++

Data Engineering

Kafka Spark Flink Iceberg DynamoDB MySQL Tableau

Cloud & DevOps

AWS CDK Docker Kubernetes CloudWatch

AI/ML

TensorFlow PyTorch BERT LLM RAG

Work Experience

A track record of shipping stable systems, improving efficiency, and scaling platforms.

Software Development Engineer

Amazon · Aug 2024 - Present · Sunnyvale, CA
  • Built and scaled mission-critical backend services powering Echo Show's LLM-driven multimodal experiences, sustaining 100K+ TPS with high availability across global traffic
  • Led the architecture and delivery of a multi-agent LLM orchestration platform with RAG and hybrid retrieval, reducing onboarding time and operational load across teams
  • Architected an LLM-driven conversational system for AI-based visual experiences, built on scalable data pipelines, model-serving infrastructure, and continuous fine-tuning

Graduate Research Assistant

Carnegie Mellon University · Jan 2024 - May 2024 · Mountain View, CA
  • Designed data sharding proxy using consistent hashing, improving query performance by 140%
  • Implemented Raft algorithm and Two-Phase Commit (2PC) protocol in C++ for data consistency

Software Engineer (Full-Stack)

Advantech · Oct 2023 - Jan 2024 · Remote
  • Built containerized storage service with GraphQL and MongoDB, cutting costs by 80% while serving 500 QPS
  • Developed LLM-powered RAG chatbot, indexing 10K+ documents for 5K+ global users

Software Engineer Intern

Alifecom · Jun 2023 - Aug 2023 · Remote
  • Designed cloud-native validation framework for DSP hardware testing on AWS
  • Built multi-threaded TCP testing service, improving throughput by 3x while reducing cost by 50%

Software Engineer (AI/ML)

Advantech · Sep 2021 - Aug 2022 · Taipei, Taiwan
  • Designed Market Intelligence Platform with BERT-based NER models and ETL pipelines
  • Developed C# .NET Core MVC BI platform, increasing dashboard delivery from 1-2 per month to dozens per month
  • Improved prediction accuracy by 20% using XGBoost forecasting with market insights

Key Projects

Selected work across platform engineering, AI systems, and cloud architecture.

Alexa+ Experiences

Built and scaled backend services for Echo Show's LLM-driven multimodal experiences, sustaining 100K+ TPS and enabling real-time, context-aware AI interactions at global scale.

AWS · LLM · High Availability

Cloud-Based DSP Validation Platform

Designed an automated, cloud-native hardware validation framework, improving test throughput by 3x while reducing infrastructure cost by 50%.

AWS · Docker · Python

Market Intelligence Platform

Engineered distributed data pipelines and ML systems providing real-time supply insights for 1M+ materials using BERT NER and XGBoost.

AI/ML · ETL · Tableau

Blog Posts

I write about distributed systems, AI engineering, and lessons from production.

Blog posts coming soon...

Get In Touch

Open to backend, distributed systems, and AI platform opportunities.

I'm currently open to new opportunities. Feel free to reach out!