Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.
Pages
Posts
Publications
PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation
Published in CVPR (Oral, Best Paper Candidate), 2021
A differentiable pose augmentation framework for 3D human pose estimation. Oral presentation, Best Paper Candidate.
Recommended citation: Kehong Gong, Jianfeng Zhang, Jiashi Feng. (2021). "PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation." CVPR (Oral, Best Paper Candidate).
Download Paper
PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision
Published in CVPR (Oral), 2022
Self-supervised co-evolution of 3D human pose estimation, imitation, and hallucination. Oral presentation.
Recommended citation: Kehong Gong, Bingbing Li, Jianfeng Zhang, Tao Wang, Jing Huang, Michael Bi Mi, Jiashi Feng, Xinchao Wang. (2022). "PoseTriplet: Co-evolving 3D Human Pose Estimation, Imitation, and Hallucination under Self-supervision." CVPR (Oral).
Download Paper
Learning to Augment Poses for 3D Human Pose Estimation in Images and Videos
Published in IEEE TPAMI, 2023
Journal extension on learning to augment poses for 3D human pose estimation in images and videos (equal contribution).
Recommended citation: Jianfeng Zhang*, Kehong Gong*, Xinchao Wang, Jiashi Feng (*equal contribution). (2023). "Learning to Augment Poses for 3D Human Pose Estimation in Images and Videos." IEEE TPAMI.
Download Paper
TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration
Published in ICCV, 2023
Bimodality-driven 3D dance generation that integrates music and text.
Recommended citation: Kehong Gong, Dongze Lian, Heng Chang, Chuan Guo, Zihang Jiang, Xinxin Zuo, Michael Bi Mi, Xinchao Wang. (2023). "TM2D: Bimodality Driven 3D Dance Generation via Music-Text Integration." ICCV.
Download Paper
Priority-centric Human Motion Generation in Discrete Latent Space
Published in ICCV, 2023
Priority-centric human motion generation in a discrete latent space.
Recommended citation: Hanyang Kong, Kehong Gong, Dongze Lian, Michael Bi Mi, Xinchao Wang. (2023). "Priority-centric Human Motion Generation in Discrete Latent Space." ICCV.
Download Paper
MotionMix: Weakly-Supervised Diffusion for Controllable Motion Generation
Published in AAAI, 2024
Weakly-supervised diffusion for controllable motion generation (corresponding author).
Recommended citation: Nhat M. Hoang, Kehong Gong (corresponding author), Chuan Guo, Michael Bi Mi. (2024). "MotionMix: Weakly-Supervised Diffusion for Controllable Motion Generation." AAAI.
Download Paper
SWiT-4D: Sliding-Window Transformer for Lossless and Parameter-Free Temporal 4D Generation
Published on arXiv, 2025
Sliding-window transformer for lossless and parameter-free temporal 4D generation. arXiv preprint.
Recommended citation: Kehong Gong, Zhengyu Wen, Mingxi Xu, Weixia He, Qi Wang, Ning Zhang, Zhengyu Li, Chenbin Li, Dongze Lian, Wei Zhao, Xiaoyu He, Mingyuan Zhang. (2025). "SWiT-4D: Sliding-Window Transformer for Lossless and Parameter-Free Temporal 4D Generation." arXiv preprint.
Download Paper
MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons
Published on arXiv, 2026
End-to-end motion capture for arbitrary skeletons. arXiv preprint.
Recommended citation: Kehong Gong, Zhengyu Wen, Dao Thien Phong, Mingxi Xu, Weixia He, Qi Wang, Ning Zhang, Zhengyu Li, Guanli Hou, Dongze Lian, Xiaoyu He, Mingyuan Zhang, Hanwang Zhang. (2026). "MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons." arXiv preprint.
Download Paper
MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos
Published in CVPR, 2026
Unified 3D motion capture for arbitrary skeletons from monocular videos. Deployed in the virtual-pet feature of Huawei smartphones.
Recommended citation: Kehong Gong, Zhengyu Wen, Weixia He, Mingxi Xu, Qi Wang, Ning Zhang, Zhengyu Li, Dongze Lian, Wei Zhao, Xiaoyu He, Mingyuan Zhang. (2026). "MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos." CVPR.
Download Paper
