Yongtao Ge

I am a Researcher at Shanda AI Research Institute, working on world models and game agents. Previously, I was a Research Scientist at SpreeAI, working with Dr. Minh Vo. I obtained my Ph.D. in Computer Science from the University of Adelaide, advised by Prof. Chunhua Shen, and my Master’s degree from Southeast University.

My current research focuses on world models, game agents, and dynamic 4D reconstruction. Previously, I worked on 3D human reconstruction and 2D perception.

🔥 We are hiring! We are looking for self-motivated interns, researchers, and engineers with experience in 3D synthetic data generation with rendering engines (UE, Blender), world models, or related areas, based in Shanghai or Tokyo. Please reach out to yongtaoo.ge@gmail.com if you are interested.

News

Oct 2025 HumanWild is accepted by TPAMI. An updated demo is available on Hugging Face.
Jun 2025 POMATO, which uses a pointmap representation for dynamic 3D reconstruction, is accepted by ICCV 2025.
Mar 2025 GVM, a generative video matting framework, has been accepted by SIGGRAPH 2025. 🎉
Jan 2025 GenPercept is accepted by ICLR 2025. 🎉
Jun 2024 Released GeoBench, a benchmark for analyzing SOTA monocular geometry estimation models. 🎉
Mar 2024 Released HumanWild; feel free to try our Hugging Face demo. 🎉
Jul 2023 Zolly is accepted by ICCV 2023, selected as oral (top 1.8%). ✨
Mar 2023 Released Zolly, focusing on perspective-distorted 3D human pose and shape estimation. ✨
Nov 2022 Point-Teaching is accepted by AAAI 2023. ✨
Jul 2022 Poseur is accepted by ECCV 2022. ✨

Selected Publications

  1. POMATO: Marrying Pointmap Matching with Temporal Motions for Dynamic 3D Reconstruction
    Songyan Zhang*, Yongtao Ge*, Jinyuan Tian*, Guangkai Xu, Chen Lv, Hao Chen, and Chunhua Shen
    In Proc. of the IEEE International Conf. on Computer Vision (ICCV), 2025
  2. 3D Human Reconstruction in the Wild with Synthetic Data Using Generative Models
    Yongtao Ge, Wenjia Wang, Yongfan Chen, Hao Chen, and Chunhua Shen
    In IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), 2025
  3. What Matters When Repurposing Diffusion Models for General Dense Perception Tasks?
    Guangkai Xu, Yongtao Ge, Mingyu Liu, Chengxiang Fan, Kangyang Xie, Zhiyue Zhao, Hao Chen, and Chunhua Shen
    In Proc. of the International Conf. on Learning Representations (ICLR), 2025
  4. GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models
    Yongtao Ge, Guangkai Xu, Zhiyue Zhao, Zheng Huang, Libo Sun, Yanlong Sun, Hao Chen, and Chunhua Shen
    arXiv preprint arXiv:2406.12671, 2024
  5. Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction
    Wenjia Wang, Yongtao Ge, Haiyi Mei, Zhongang Cai, Qingping Sun, Yanjun Wang, Chunhua Shen, Lei Yang, and Taku Komura
    In Proc. of the IEEE International Conf. on Computer Vision (ICCV Oral), 2023
  6. Poseur: Direct Human Pose Regression with Transformers
    Weian Mao*, Yongtao Ge*, Chunhua Shen, Zhi Tian, Xinlong Wang, Zhibin Wang, and Anton van den Hengel
    In Proc. of the European Conf. on Computer Vision (ECCV), 2022
  7. Point-Teaching: Weakly Semi-Supervised Object Detection with Point Annotations
    Yongtao Ge*, Qiang Zhou*, Xinlong Wang, Chunhua Shen, Zhibin Wang, and Hao Li
    In AAAI Conference on Artificial Intelligence (AAAI), 2023