LinkedIn

Staff Machine Learning Engineer in Agent and Agent Platform  |  Jun 2024 – Present

Architect of production-grade LLM agent platforms and agent memory systems — spanning agentic memory, prompt engineering, RAG, and agent orchestration — that power enterprise AI agents serving millions of users.

  • Agentic Long-Term Semantic Memory: Designed and shipped an agent memory backbone from zero to full production, serving millions of users and generating significant revenue. Lead engineer owning the core algorithm and full-stack delivery — offline data ingestion, nearline indexing pipelines, and online retrieval & agentic-flow integration — achieving SOTA memory quality at low latency and scale. Published at SIGKDD 2026. (paper)
  • LLM Prompt Compression (ProCut): Inventor of a prompt attribution & compression algorithm adopted across multiple agent teams — saving 78% of LLM token costs in production with no loss in prompt performance. Published at EMNLP 2025 Industry Track (Oral, top 2.5%). (paper)

Sr. Software Engineer  |  Mar 2022 – Aug 2024

  • Graph-RAG Agent for Customer Service: Inventor of a knowledge-graph-augmented retrieval agent deployed in production, cutting median issue-resolution time by 28.6%. Published at SIGIR 2024. (paper)

Software Engineer  |  Mar 2020 – Feb 2022

  • ML Model Health Monitoring (AlerTiger): Core algorithm innovator of a deep-learning anomaly-detection system safeguarding the health of LinkedIn's AI models. Published at SIGKDD 2023. (paper, code)
  • AI Model Understanding: Core developer of a central model-interpretation library for instance- and global-level feature-importance analysis.

Smule

Senior Machine Learning Engineer  |  Jul 2018 – Mar 2020

Built large-scale music recommendation systems serving millions of users across Smule's Feed, Explore, and Songbook surfaces.

  • Batched Recommendation: Completed adaptive music recommendation systems that recommend trillions of songs to millions of users. Administered and optimized the recommendation algorithms behind the Explore and Songbook pages.
  • Real-time Recommendation: As lead developer, built a brand-new real-time music recommendation system that improved web user engagement by 10%. Implemented real-time feature extraction, ranking, database communication, and training modules. Tools included Hadoop, Scala, Spark, Flink, Cassandra, Redis, and Kafka, among others.
  • Feature Extraction: Improved users’ average music listening time by 30%, 12%, and 5%, respectively, by developing three new features for the personalized recommendation models behind the Feed, Explore, and Songbook pages. Co-authored Simultaneous Relevance and Diversity: A New Recommendation Inference Approach. (paper)
  • Intention Extractor (NLP Research): Led research on text classification and keyword extraction, training models on comment data with TensorFlow and NLTK.
  • Other: Learned new industrial recommendation techniques through papers, conferences, and Coursera, and applied them to Smule's machine learning models.

Isuzu

Machine Learning Engineer — Internship  |  Feb 2018 – Jul 2018

Developed ML-based anomaly detection and diagnostic tooling for vehicle sensor data.

  • Anomaly Detection: Substantially reduced the company’s data-labeling costs by inventing a semi-supervised anomaly detection algorithm that detects problematic vehicle sensors while maintaining high accuracy. Co-authored On-Board Predictive Maintenance with Machine Learning (SAE 2019). (paper)
  • Data Visualization: Streamlined engineers’ diagnostic work by developing an ML-based visualization tool, saving 20% of their time.