About me

I am a Senior Staff Engineer at Huawei Singapore Research Center. I received my Ph.D. in Electrical and Computer Engineering from the National University of Singapore (NUS), advised by Prof. Jiashi Feng and Prof. Xinchao Wang, and my B.Eng. (Honours) in Engineering Science from NUS.

My research interests:

Motion capture & generation: 3D human pose estimation, dance generation, and unified motion capture & retargeting for arbitrary skeletons (the MoCapAnything series).
Video generation & world models: joint audio–video generation, motion- and physics-aware video generation, and world models for interactive simulation (a recent direction I am exploring).
Multimodal LLMs & VLMs: multimodal understanding and reasoning; unifying understanding and generation; grounding vision–language models in 3D, motion, and embodied tasks; and efficient, deployable foundation models.

News

[2026] MoCapAnything V2 accepted to SIGGRAPH Asia 2026 (ACM TOG)
[2026] Started research on video generation and world models
[2026] MoCapAnything V2 released on arXiv
[2026] MoCapAnything accepted to CVPR 2026
[2025] MoCapAnything and SWiT-4D released on arXiv
[2025] Worked on the virtual-pet project
[2024] Worked on LLM/VLM inference acceleration (quantization, sparsification, attention)
[2024] MotionMix accepted to AAAI 2024
[2023] On-device LLM for smart-home terminals released at the China Mobile Global Partners Conference 2023
[2023] TM2D and Priority-centric Human Motion Generation accepted to ICCV 2023
[2023] PoseAug journal extension published in IEEE TPAMI
[2022] PoseTriplet accepted to CVPR 2022 (Oral)
[2021] PoseAug accepted to CVPR 2021 (Oral, Best Paper Candidate)

Publications

You can also find my articles on my Google Scholar profile.

MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons

Kehong Gong, Zhengyu Wen, Dao Thien Phong, Mingxi Xu, Weixia He, Qi Wang, Ning Zhang, Zhengyu Li, Guanli Hou, Dongze Lian, Xiaoyu He, Mingyuan Zhang, Hanwang Zhang

ACM TOG (SIGGRAPH Asia), 2026

arXiv Project Demo Code

MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos

Kehong Gong, Zhengyu Wen, Weixia He, Mingxi Xu, Qi Wang, Ning Zhang, Zhengyu Li, Dongze Lian, Wei Zhao, Xiaoyu He, Mingyuan Zhang

CVPR, 2026

arXiv Project Code

SWiT-4D: Sliding-Window Transformer for Lossless and Parameter-Free Temporal 4D Generation

Kehong Gong, Zhengyu Wen, Mingxi Xu, Weixia He, Qi Wang, Ning Zhang, Zhengyu Li, Chenbin Li, Dongze Lian, Wei Zhao, Xiaoyu He, Mingyuan Zhang

arXiv, 2025

arXiv Project Code

MotionMix: Weakly-Supervised Diffusion for Controllable Motion Generation

Nhat M. Hoang, Kehong Gong (corresponding author), Chuan Guo, Michael Bi Mi

AAAI, 2024

arXiv Project Code

Priority-centric Human Motion Generation in Discrete Latent Space

Hanyang Kong, Kehong Gong, Dongze Lian, Michael Bi Mi, Xinchao Wang

ICCV, 2023

arXiv

TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration

Kehong Gong, Dongze Lian, Heng Chang, Chuan Guo, Zihang Jiang, Xinxin Zuo, Michael Bi Mi, Xinchao Wang

ICCV, 2023

arXiv Project Code

Learning to Augment Poses for 3D Human Pose Estimation in Images and Videos

Jianfeng Zhang*, Kehong Gong*, Xinchao Wang, Jiashi Feng (*equal contribution)

IEEE TPAMI, 2023

Paper

PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision

Kehong Gong, Bingbing Li, Jianfeng Zhang, Tao Wang, Jing Huang, Michael Bi Mi, Jiashi Feng, Xinchao Wang

CVPR (Oral), 2022

arXiv Code

PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation

Kehong Gong, Jianfeng Zhang, Jiashi Feng

CVPR (Oral, Best Paper Candidate), 2021

arXiv Code

CV

📄 Download CV (PDF)

Education

Ph.D. in Electrical and Computer Engineering, National University of Singapore, 2019–2022
- Dissertation: Deep Learning in Human Pose Generation and Its Application
- Supervisors: Asst. Prof. Jiashi Feng and Assoc. Prof. Xinchao Wang
B.Eng. (Honours), Engineering Science Programme, National University of Singapore, 2013–2017

Professional experience

Dec 2022 – Present: Senior Staff Engineer, Huawei International Pte. Ltd.
- 2026–present: Exploratory research on video generation models (world models, joint audio–video generation).
- 2025: Unified 3D motion capture for arbitrary skeletons from monocular videos; deployed in the virtual-pet feature of Huawei smartphones (MoCapAnything series).
- 2024: LLM/VLM inference & training acceleration: low-bit quantization, sparse/efficient attention, and model compression.
- 2023: Led low-level vision model development and on-device deployment; our on-device LLM for smart-home terminals was released at the China Mobile Global Partners Conference 2023.

Honors & awards

Best Paper Candidate, CVPR 2021 (PoseAug)
Oral Presentation, CVPR 2021 (PoseAug) & CVPR 2022 (PoseTriplet)

Academic service

Reviewer: CVPR, ECCV, NeurIPS, AAAI, ACM MM, Pattern Recognition, Robotics and Autonomous Systems

Gong Kehong
