CV

I lead a 100+ member product-facing AI team at Alibaba AMAP, building foundation systems for spatial intelligence and generative intelligence. My research traces an arc from neural architecture search through Vision Transformer design and multimodal foundation models to LLM reasoning, world models, agent systems, and large-scale AI products that ship to hundreds of millions of users. I have published 110+ papers at top venues, with 14,000+ citations and 10,000+ GitHub stars across open-source projects.

110+ Publications
14,000+ Citations
100+ Team Members
10,000+ GitHub Stars

Awards & Recognition


Professional Experience

Alibaba Group — AMAP · Mar 2024 – Present
Senior Director & Head of AMAP-ML

Leading a product-facing AI team of 100+ members, over 50% of whom were recruited from top AI labs worldwide. The team works across spatial intelligence, generative intelligence, reasoning agents, world models, foundation architectures, and multimodal understanding.

  • Published 45+ papers at top venues (ICLR, CVPR, ACL, ICCV, NeurIPS, AAAI, EMNLP) and open-sourced 30+ AMAP-ML projects with 10,000+ cumulative GitHub stars
  • Key first-author works: GPG (ICLR’26, adopted by ByteDance’s VERL framework), USP (ICCV’25); key team works: SkillClaw (1,300+ Stars), DreamX-World, Tree-GRPO, FASA, CoEvolve
  • The team's multimodal technology powers AMAP’s Saojie Bang (扫街榜) pipeline, and its large-scale industrial agent system drives AI Companion (AI 伴行); both serve hundreds of millions of users

Meituan — Visual Intelligence Department · May 2020 – Mar 2024
Senior Technical Manager

Built the Visual Intelligence team from scratch. Directed research in Vision Transformers, multimodal large models, and industrial AI systems.

  • Created Twins (NeurIPS’21), CPVT (ICLR’23), VisionLLaMA (ECCV’24) — widely adopted Vision Transformer architectures; VisionLLaMA pioneered auto-scaling 2D RoPE, later adopted by Qwen-VL and others
  • Built MobileVLM, the first real-time mobile VLM; reproduced LLaMA 7B from scratch
  • Open-sourced YOLOv6 (5,700+ Stars), an object detection framework widely deployed across the industry
  • Shipped 3D perception for autonomous delivery vehicles and drones; saved millions in annotation and serving costs annually

Xiaomi — Artificial Intelligence Department · Mar 2017 – May 2020
Senior Technical Manager

Founded Xiaomi’s AutoML team and produced a series of influential neural architecture search works.

  • Published FairNAS (ICCV’21), FairDARTS (ECCV’20), DARTS- (ICLR’21), and FALSR, establishing new standards for fair and robust architecture search
  • Won 2nd place in Xiaomi’s first “Million Dollar Prize” (Automated Neural Network Design)
  • FALSR super-resolution algorithm personally endorsed by CEO Lei Jun

Beijing KingStar System Control Co., Ltd. · Jun 2013 – Mar 2017
Deputy Director
  • Core contributor to “Complex Power Grid Autonomy — Collaborative Automatic Voltage Control” project
  • Contributed 20 invention patents; awarded National Science and Technology Progress First Prize (2018)

IBM Research China (CRL) · Jul 2012 – May 2013
Research Scientist
  • Worked on large-scale data analytics and machine learning solutions

Selected Publications

LLM Reasoning

  • GPG: Simple & Strong RL for Reasoning — ICLR’26 · 1st Author
  • Tree-GRPO: Tree Search for Agent RL — ICLR’26
  • CoEvolve: Agent-Data Co-Evolution — ACL’26
  • MathForge: Difficulty-Aware GRPO — ICLR’26
  • AutoDrive-R2: Reasoning VLA for Driving — ICLR’26

Generative AI & World Models

  • USP: Unified Pretraining for Gen & Understanding — ICCV’25 · 1st Author
  • DCW: SNR-t Bias of Diffusion Models — CVPR’26
  • S2-Guidance: Training-Free Diffusion Enhancement — ICLR’26
  • EPG: End-to-End Pixel Generation without VAE — ICLR’26
  • DreamX-World: Interactive World Model

AI Agents & Intelligent Mobility

Foundation Architectures

  • VisionLLaMA: Unified LLaMA for Vision — ECCV’24 · 1st Author
  • Twins: Spatial Attention in ViTs — NeurIPS’21 · 1st Author · Most Influential
  • CPVT: Conditional Positional Encodings — ICLR’23 · 1st Author · Most Influential
  • FASA: Frequency-Aware Sparse Attention — ICLR’26
  • QARepVGG: Quantization-Aware RepVGG — AAAI’24 · 1st Author

Vision-Language & Detection

  • MobileVLM: First Real-Time Mobile VLM · 1st Author
  • YOLOv6: Industrial Object Detection — 5,700+ Stars
  • SpatialGenEval: Spatial Intelligence Benchmark — ICLR’26
  • PromptDet: Open-Vocabulary Detection — ECCV’22

AutoML & Neural Architecture Search

  • FairNAS: Rethinking NAS Fairness — ICCV’21 · 1st Author · Most Influential
  • FairDARTS: Fair Differentiable NAS — ECCV’20 · 1st Author
  • DARTS-: Robustly Out of Collapse — ICLR’21 · 1st Author

Full publication list (110+ papers)


Professional Service

  • Area Chair: ICLR, NeurIPS
  • Senior Program Committee (SPC): AAAI, IJCAI

Education


Patents