Yuxiang Zhao
Computer Vision Researcher @ Alibaba Group
Reviewer for ICCV, ECCV, CVPR, AAAI, NeurIPS, ICML, ICLR
Research Area: Human-Centric Perception, Interaction and Generation
Email: zeusyux@gmail.com
Biography
I'm Yuxiang Zhao (赵煜翔), currently a Computer Vision Researcher at Alibaba AMAP CV Lab, where I work on human-centric interactive virtual reality technologies. I graduated from Sun Yat-sen University in 2024. Previously, I worked at Baidu on controllable generation technologies for anime-style content.
My research focuses on building scalable models for virtual humans that faithfully represent appearance, motion, and behavior in a unified manner, while achieving high realism, human-like intelligence, and adaptive interaction. My ultimate vision is to build a virtual world where people can reunite with the loved ones they deeply miss.
News
- 03/2026 Invited to serve as a reviewer for NeurIPS 2026.
- 02/2026 Invited to serve as a reviewer for ICML 2026.
- 01/2026 Gave an invited talk on human mesh recovery at AI Time, hosted by QingQi.
- 11/2025 One paper on expressive human pose and shape estimation was accepted to AAAI 2026. See you in Singapore!
- 08/2025 Invited to serve as a reviewer for AAAI 2026.
- 07/2025 One paper on embodied artificial intelligence was accepted to ACM MM 2025. See you in Dublin!
Selected Research
Bridging the Indoor-Outdoor Gap: Vision-Centric Instruction-Guided Embodied Navigation
First Author. Technical Report.
A novel prior-free, instruction-driven method for outdoor-to-indoor embodied navigation, built on a vision-centric framework.
|
|
ABot-N0: Technical Report on the VLA Foundation Model for Versatile Embodied Navigation
Alibaba Group.
A VLA foundation model that unifies multiple embodied navigation tasks.
|
|
A Simple Baseline for Expressive Human Animation
First Author. The 19th European Conference on Computer Vision.
A simple yet efficient and accurate method for character animation and talking head generation.
|
|
CoEvoer: Collaborative Evolution Transformer for Upper-Body Expressive Human Pose and Shape Estimation
First Author. The 40th Annual AAAI Conference on Artificial Intelligence.
One-stage mesh recovery with explicit modeling of human body part interactions.
|
|
Fine-Grained and Controllable Conditional Video Generation
Baidu Inc.
Mitigating blurring in video generation under first-last frame constraints.
|
|
Cross Time Domain Intention Interaction for Conditional Trajectory Prediction
First Author. The 33rd ACM International Conference on Multimedia.
Trajectory prediction for human-robot interaction that models intention interaction across time domains.
|
|
Personalized Image Prompt Generation and Correction
Meituan Inc.
Generates overall image descriptions and conceptual keywords.