CV

I lead a 100+ member product-facing AI team at Alibaba AMAP, building foundation systems for spatial intelligence and generative intelligence. My research traces an arc from neural architecture search through Vision Transformer design and multimodal foundation models to LLM reasoning, world models, agent systems, and large-scale AMAP products serving 300M+ users every day. I have authored 120+ research papers and preprints, including publications at top venues, with 15,000+ citations (6,000+ from first-authored works), alongside a range of open-source projects.

15,000+ Citations

6,000+ First-Author Citations

120+ Publications

100+ Team Members

Awards & Recognition

Top 100 AI Scholars, AMiner 2023 — selected from hundreds of thousands of AI researchers worldwide
3 first-authored papers on PaperDigest's Most Influential Paper List: FairNAS, Twins, CPVT
2nd Place, Xiaomi "Million Dollar Prize" — Automated Neural Network Design

Professional Experience

Alibaba Group — AMAP · Mar 2024 – Present

Senior Director & Head of AMAP-ML

Leading a 100+ member product-facing AI team across spatial intelligence, generative intelligence, reasoning agents, world models, foundation architectures, and multimodal understanding.

Published 50+ papers at top venues (ICLR, CVPR, ICML, ACL, KDD, ICCV, ECCV, NeurIPS, AAAI, EMNLP, SIGGRAPH) and open-sourced 30+ AMAP-ML projects
Key first-author works: GPG (ICLR 2026, adopted by ByteDance’s VERL framework), USP (ICCV 2025); key team works: SkillClaw, DreamX-World, Tree-GRPO, FASA, CoEvolve
Multimodal technology supports AMAP’s Saojie Bang (扫街榜) pipeline, and large-scale industrial agent work contributes to AI Companion (AI 伴行), alongside AMAP products serving 300M+ users every day

Meituan — Visual Intelligence Department · May 2020 – Mar 2024

Senior Technical Manager

Built the Visual Intelligence team from scratch. Directed research in Vision Transformers, multimodal large models, and industrial AI systems.

Created Twins (NeurIPS 2021), CPVT (ICLR 2023), VisionLLaMA (ECCV 2024) — influential Vision Transformer architectures; VisionLLaMA introduced auto-scaling 2D RoPE for LLaMA-style vision backbones
Built MobileVLM, a compact VLM designed for real-time on-device deployment; reproduced LLaMA 7B from scratch
Open-sourced YOLOv6, a widely used industrial object detection framework; developed QARepVGG to address quantization challenges in RepVGG-style deployment
Shipped 3D perception for autonomous delivery vehicles and drones, reducing annotation and serving costs

Xiaomi — Artificial Intelligence Department · Mar 2017 – May 2020

Senior Technical Manager

Founded Xiaomi’s AutoML team and produced a series of influential neural architecture search works.

Published FairNAS (ICCV 2021), FairDARTS (ECCV 2020), DARTS- (ICLR 2021), and FALSR, establishing new standards for fair and robust architecture search
Won 2nd place in Xiaomi’s first “Million Dollar Prize” (Automated Neural Network Design)
FALSR super-resolution algorithm personally endorsed by CEO Lei Jun

Beijing KingStar System Control Co., Ltd. · Jun 2013 – Mar 2017

Deputy Director

Core contributor to “Complex Power Grid Autonomy — Collaborative Automatic Voltage Control” project
Contributed 20 invention patents; awarded National Science and Technology Progress First Prize (2018)

IBM Research China (CRL) · Jul 2012 – May 2013

Research Scientist

Worked on large-scale data analytics and machine learning solutions at IBM Research China (CRL)

Selected Publications

LLM Reasoning

GPG: Simple & Strong RL for Reasoning — ICLR 2026 · 1st Author
Tree-GRPO: Tree Search for Agent RL — ICLR 2026
CoEvolve: Agent-Data Co-Evolution — ACL 2026
MathForge: Difficulty-Aware GRPO — ICLR 2026
AutoDrive-R2: Reasoning VLA for Driving — ICLR 2026

Generative AI & World Models

USP: Unified Pretraining for Gen & Understanding — ICCV 2025 · 1st Author
DCW: SNR-t Bias of Diffusion Models — CVPR 2026
S2-Guidance: Training-Free Diffusion Enhancement — ICLR 2026
EPG: End-to-End Pixel Generation without VAE — ICLR 2026
DreamX-World: Interactive World Model

AI Agents & Intelligent Mobility

SkillClaw: Collective Skill Evolution
Code2World: GUI World Model via Renderable Code
MobilityBench: Route-Planning Agent Benchmark — KDD 2026 Oral

Foundation Architectures

VisionLLaMA: Unified LLaMA for Vision — ECCV 2024 · 1st Author
Twins: Spatial Attention in ViTs — NeurIPS 2021 · 1st Author · Most Influential
CPVT: Conditional Positional Encodings — ICLR 2023 · 1st Author · Most Influential
FASA: Frequency-Aware Sparse Attention — ICLR 2026
QARepVGG: Quantization-Aware RepVGG — AAAI 2024 · 1st Author

Vision-Language & Detection

MobileVLM: Real-Time Mobile Vision-Language Model · 1st Author
YOLOv6: Industrial Object Detection
SpatialGenEval: Spatial Intelligence Benchmark — ICLR 2026
PromptDet: Open-Vocabulary Detection — ECCV 2022

AutoML & Neural Architecture Search

FairNAS: Rethinking NAS Fairness — ICCV 2021 · 1st Author · Most Influential
FairDARTS: Fair Differentiable NAS — ECCV 2020 · 1st Author
DARTS-: Robustly Out of Collapse — ICLR 2021 · 1st Author

Professional Service

Area Chair: ICLR, NeurIPS
Senior Program Committee (SPC): AAAI, IJCAI

Education

M.S. in Electrical Engineering, Tsinghua University, 2012
B.S. in Electrical Engineering, Southeast University, 2010

Patents

40+ domestic invention patents
7 international invention patents

Xiangxiang Chu（初祥祥）