I am a final-year Ph.D. candidate in the School of Software Engineering at South China University of Technology, advised by Prof. Mingkui Tan and Prof. Chuang Gan. I work on developing agents that can understand and interact with the multi-modal world. Toward this goal, my research mainly focuses on:
- Embodied AI: Visual Navigation; Robot Manipulation
- Multi-Modal Video Understanding: Self-Supervised Video Representation Learning; Temporal Action Localization; Visually-Aligned Sound Generation
I am currently seeking opportunities at companies specializing in embodied AI or multi-modal video understanding. If you have a suitable position available, please feel free to contact me.
- 2023.09: Two papers are accepted by NeurIPS 2023, and one is selected as a Spotlight!
- 2023.09: Happy to join UMass Amherst as a visiting scholar working closely with Prof. Chuang Gan!
- 2023.07: One paper is accepted by ICCV 2023!
- 2023.06: Happy to join MIT-IBM Watson Lab for an internship!
- 2023.02: One paper is accepted by CVPR 2023!
- 2023.02: The code for MGMap and ActiveCamera is now available.
- 2022.11: Two NeurIPS 2022 papers are selected as Spotlight!
- 2022.10: Two papers are accepted by NeurIPS 2022!
- 2021.01: One paper is accepted by AAAI 2021!
A2Nav: Action-Aware Zero-Shot Robot Navigation by Exploiting Vision-and-Language Ability of Foundation Models
- 2023: The Principal’s Scholarship of SCUT
- 2020: The Principal’s Scholarship of SCUT
- 2018: The First Prize Scholarship of SCUT
- 2017: The Second Prize of the NXP Cup National University Students Intelligent Car Race