About

I am a Research Scientist at Huawei Noah’s Ark Laboratory. I obtained my master’s degree from Nanyang Technological University. My current research focuses on Vision-Language Models (VLMs), post-training data flywheels, pretrained visual encoders, 3D scene understanding, and embodied intelligence. Outside of research, I have a strong passion for portrait photography and have been practicing it intensively lately. In my spare time I also follow the application of large language models (LLMs) in quantitative finance, and I have recently been replicating and designing quantitative investment agents in this area.

We are now recruiting project and research interns. If you are interested, please send your CV directly to yzhou037@e.ntu.edu.sg.

Recent Publications

UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting

Authors: Haoyuan Li, Yanpeng Zhou, Tao Tang, Jifei Song, Yihan Zeng, Michael Kampffmeyer, Hang Xu, Xiaodan Liang

Conference: International Conference on Learning Representations (ICLR), 2025.

UNIT: Unifying Image and Text Recognition in One Vision Encoder

Authors: Yi Zhu, Yanpeng Zhou, Chunwei Wang, Yang Cao, Jianhua Han, Lu Hou, Hang Xu

Conference: Neural Information Processing Systems (NeurIPS), 2024.
