Yongtao Ge

I am a Researcher at Shanda AI Research Institute, working on world models and game agents. Previously, I was a Research Scientist at SpreeAI, working with Dr. Minh Vo. I obtained my Ph.D. in Computer Science from the University of Adelaide, and my Master’s degree from Southeast University.

My current research focuses on world models, game agents, and dynamic 4D reconstruction. Previously, I worked on 3D human reconstruction and 2D perception.

🔥 We are hiring! Looking for self-motivated interns, researchers, and engineers with experience in 3D synthetic data generation with rendering engines (UE, Blender), world models, or related areas, based in Shanghai or Tokyo. Please reach out to yongtaoo.ge@gmail.com if you are interested.

News

Oct 2025 HumanWild is accepted by TPAMI. An updated demo is available on Hugging Face.

Jun 2025 POMATO is accepted by ICCV 2025, with pointmap representation for dynamic 3D reconstruction.

Mar 2025 GVM, a generative video matting framework, has been accepted by SIGGRAPH 2025. 🎉

Jan 2025 GenPercept is accepted by ICLR 2025. 🎉

Jun 2024 Release GeoBench, a monocular geometry benchmark for analyzing SOTA monocular geometry estimation models. 🎉

Mar 2024 Release HumanWild. Feel free to try our Hugging Face demo. 🎉

Jul 2023 Zolly is accepted by ICCV 2023, selected as oral (top 1.8%).

Mar 2023 Release Zolly, focusing on perspective-distorted 3D human pose and shape estimation.

Nov 2022 Point-Teaching is accepted by AAAI 2023.

Jul 2022 Poseur is accepted by ECCV 2022.

Selected Publications

POMATO: Marrying Pointmap Matching with Temporal Motions for Dynamic 3D Reconstruction

Songyan Zhang^*, Yongtao Ge^*, Jinyuan Tian^*, Guangkai Xu, Chen Lv, Hao Chen, and Chunhua Shen

In Proc. of the IEEE International Conf. on Computer Vision, 2025

@inproceedings{zhang2025pomato,
  title = {POMATO: Marrying Pointmap Matching with Temporal Motions for Dynamic 3D Reconstruction},
  author = {Zhang, Songyan and Ge, Yongtao and Tian, Jinyuan and Xu, Guangkai and Lv, Chen and Chen, Hao and Shen, Chunhua},
  booktitle = {Proc. of the IEEE International Conf. on Computer Vision},
  year = {2025}
}

3D Human Reconstruction in the Wild with Synthetic Data Using Generative Models

Yongtao Ge, Wenjia Wang, Yongfan Chen, Hao Chen, and Chunhua Shen

In IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI), 2025

@article{ge2024humanwild,
  title = {3D Human Reconstruction in the Wild with Synthetic Data Using Generative Models},
  journal = {IEEE Trans. on Pattern Analysis and Machine Intelligence (TPAMI)},
  author = {Ge, Yongtao and Wang, Wenjia and Chen, Yongfan and Chen, Hao and Shen, Chunhua},
  year = {2025}
}

What Matters When Repurposing Diffusion Models for General Dense Perception Tasks?

Guangkai Xu, Yongtao Ge, Mingyu Liu, Chengxiang Fan, Kangyang Xie, Zhiyue Zhao, Hao Chen, and Chunhua Shen

In Proc. of the International Conf. on Learning Representations, 2025

@inproceedings{xu2024genpercept,
  title = {What Matters When Repurposing Diffusion Models for General Dense Perception Tasks?},
  author = {Xu, Guangkai and Ge, Yongtao and Liu, Mingyu and Fan, Chengxiang and Xie, Kangyang and Zhao, Zhiyue and Chen, Hao and Shen, Chunhua},
  booktitle = {Proc. of the International Conf. on Learning Representations},
  year = {2025}
}

GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models

Yongtao Ge, Guangkai Xu, Zhiyue Zhao, Zheng Huang, Libo Sun, Yanlong Sun, Hao Chen, and Chunhua Shen

arXiv preprint arXiv:2406.12671, 2024

@article{ge2024geobench,
  title = {GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models},
  author = {Ge, Yongtao and Xu, Guangkai and Zhao, Zhiyue and Huang, Zheng and Sun, Libo and Sun, Yanlong and Chen, Hao and Shen, Chunhua},
  journal = {arXiv preprint arXiv:2406.12671},
  year = {2024}
}

Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction

Wenjia Wang, Yongtao Ge, Haiyi Mei, Zhongang Cai, Qingping Sun, Yanjun Wang, Chunhua Shen, Lei Yang, and Taku Komura

In Proc. of the IEEE International Conf. on Computer Vision (ICCV Oral), 2023

@inproceedings{wang2023zolly,
  title = {Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction},
  booktitle = {Proc. of the IEEE International Conf. on Computer Vision (ICCV Oral)},
  stars = {<a href="https://github.com/WenjiaWang0312/Zolly">
            <img alt="GitHub Repo stars" style="vertical-align: top" 
            src="https://img.shields.io/github/stars/WenjiaWang0312/Zolly?style=social">
          </a>},
  author = {Wang, Wenjia and Ge, Yongtao and Mei, Haiyi and Cai, Zhongang and Sun, Qingping and Wang, Yanjun and Shen, Chunhua and Yang, Lei and Komura, Taku},
  year = {2023}
}

Poseur: Direct Human Pose Regression with Transformers

Weian Mao^*, Yongtao Ge^*, Chunhua Shen, Zhi Tian, Xinlong Wang, Zhibin Wang, and Anton van den Hengel

In Proc. of the European Conf. on Computer Vision (ECCV), 2022

@inproceedings{mao2022poseur,
  title = {Poseur: Direct Human Pose Regression with Transformers},
  booktitle = {Proc. of the European Conf. on Computer Vision (ECCV)},
  stars = {
        <a href="https://github.com/aim-uofa/Poseur">
          <img alt="GitHub Repo stars" 
            style="vertical-align: top"
            src="https://img.shields.io/github/stars/aim-uofa/Poseur?style=social">
        </a>},
  author = {Mao, Weian and Ge, Yongtao and Shen, Chunhua and Tian, Zhi and Wang, Xinlong and Wang, Zhibin and van den Hengel, Anton},
  year = {2022}
}

Point-Teaching: Weakly Semi-Supervised Object Detection with Point Annotations

Yongtao Ge^*, Qiang Zhou^*, Xinlong Wang, Chunhua Shen, Zhibin Wang, and Hao Li

In AAAI Conference on Artificial Intelligence (AAAI), 2023

@inproceedings{ge2023point,
  title = {Point-Teaching: Weakly Semi-Supervised Object Detection with Point Annotations},
  author = {Ge, Yongtao and Zhou, Qiang and Wang, Xinlong and Shen, Chunhua and Wang, Zhibin and Li, Hao},
  booktitle = {AAAI Conference on Artificial Intelligence (AAAI)},
  year = {2023}
}