LinkedIn
Staff Machine Learning Engineer in Agent and Agent Platform | Jun 2024 – Present
Architect of production-grade LLM agent platforms and agent memory systems — spanning agentic memory, prompt engineering, RAG, and agent orchestration — that power enterprise AI agents serving millions of users.
- Agentic Long-Term Semantic Memory: Designed and shipped an agent memory backbone from zero to full production, serving millions of users and generating significant revenue. Lead engineer owning the core algorithm and full-stack delivery — offline data ingestion, nearline indexing pipelines, and online retrieval & agentic-flow integration — achieving SOTA memory quality at low latency and scale. Published at SIGKDD 2026. (paper)
- LLM Prompt Compression (ProCut): Inventor of a prompt attribution & compression algorithm adopted across multiple agent teams — saving 78% of LLM token costs in production with no loss in prompt performance. Published at EMNLP 2025 Industry Track (Oral, top 2.5%). (paper)
Sr. Software Engineer | Mar 2022 – Aug 2024
- Graph-RAG Agent for Customer Service: Inventor of a knowledge-graph-augmented retrieval agent deployed in production, cutting median issue-resolution time by 28.6%. Published at SIGIR 2024. (paper)
Software Engineer | Mar 2020 – Feb 2022
- ML Model Health Monitoring (AlerTiger): Core algorithm innovator of a deep-learning anomaly-detection system safeguarding the health of LinkedIn's AI models. Published at SIGKDD 2023. (paper, code)
- AI Model Understanding: Core developer of a central model-interpretation library for instance- and global-level feature-importance analysis.
Smule
Senior Machine Learning Engineer | Jul 2018 – Mar 2020
Built large-scale music recommendation systems serving millions of users across Smule's Feed, Explore, and Songbook surfaces.
- Batched Recommendation: Built adaptive music recommendation systems that recommend trillions of songs to millions of users. Maintained and optimized the recommendation algorithms behind the Explore and Songbook pages.
- Real-time Recommendation: As lead developer, built a new real-time music recommendation system that improved web user engagement by 10%. Implemented modules for real-time feature extraction, ranking, database communication, and training. Tools included Hadoop, Scala, Spark, Flink, Cassandra, Redis, and Kafka.
- Feature Extraction: Developed three new features for the personalized recommendation models behind the Feed, Explore, and Songbook pages, improving users’ average music listening time by 30%, 12%, and 5%, respectively. Co-authored Simultaneous Relevance and Diversity: A New Recommendation Inference Approach. (paper)
- Intention Extractor (NLP Research): Led research on text classification and keyword extraction; trained models on comment data using TensorFlow and NLTK.
- Other: Learned new techniques for industrial recommendation systems through papers, conferences, and Coursera courses, and applied them to Smule's machine learning models.
Isuzu
Machine Learning Engineer — Internship | Feb 2018 – Jul 2018
Developed ML-based anomaly detection and diagnostic tooling for vehicle sensor data.
- Anomaly Detection: Invented a semi-supervised anomaly detection algorithm for identifying faulty vehicle sensors, substantially reducing the company’s data-labeling costs while maintaining high accuracy. Co-authored On-Board Predictive Maintenance with Machine Learning (SAE 2019). (paper)
- Data Visualization: Developed an ML-based visualization tool to support engineers’ diagnostic work, leading to a 20% time saving.