Short Bio

Zhanyu Wang (王占宇) is currently an algorithm engineer at TikTok, based in Sydney, where he focuses on multimodal pretraining and multimodal large language model research.

He received his Ph.D. in 2024 from the Medical Computer Vision Lab at the University of Sydney, under the supervision of Prof. Luping Zhou and co-supervision of Prof. Wanli Ouyang. During his Ph.D., he was also fortunate to receive mentorship from Prof. Lei Wang and Prof. Lingqiao Liu.

Prior to that, he obtained a Master’s degree from the Department of Automation at Tsinghua University in July 2019 and a Bachelor’s degree from the Department of Automation at Tianjin University of Science and Technology in July 2015.

News

[03/2025]: One paper accepted by TPAMI (Impact Factor: 20.8).

[10/2024]: GPT4Video received a Best Paper Nomination at ACM MM 2024 (top 0.6%).

[07/2024]: GPT4Video was accepted as an oral presentation at ACM MM 2024 (3.97% acceptance rate).

[05/2024]: Two papers received early acceptance at MICCAI 2024.

[11/2023]: We release GPT4Video, a unified MLLM for video understanding and generation. See our website for details.

[11/2023]: First quantitative evaluation of GPT-4V on various medical imaging tasks, detailed in our study here.

[09/2023]: One paper accepted by Meta-Radiology.

[03/2023]: One paper accepted by CVPR.

[02/2023]: One paper accepted by IPMI (Oral).

[12/2022]: One paper accepted by TCSVT (IF=5.859).

[08/2022]: One paper accepted by MICCAI.

[04/2022]: One paper accepted by TMI (IF=11.037).

[06/2021]: One paper accepted by CVPR.

Education

2020.03 - 2024, Ph.D., Faculty of Engineering and Information Technology, University of Sydney.

2016.09 - 2019.07, Master, Department of Automation, Tsinghua University.

2011.09 - 2015.09, Bachelor, Department of Automation, Tianjin University of Science and Technology.

Experience

Tencent: Research Intern, Tencent AI Lab, 2023.07 - present.
Mentored by Dr. Longyue Wang.

Tencent: Algorithm Researcher, QQ Multimedia AI Lab, 2021.12 - 2022.03.
Mentored by Dr. Dian Li.

Tencent: Algorithm Researcher, Kandian Content AI Lab, 2021.03 - 2021.12.
Mentored by Dr. Fengyun Rao.

Tsinghua: Visiting Student, Intelligent Computation Lab, 2020.07 - 2021.03.
Supervised by Prof. Xiu Li.

SandStar: Algorithm Researcher, 2019.07 - 2020.01.
Mentored by Dr. Zexi Yang.

SenseTime: Research Intern, Intelligent Automotive Group, 2018.09 - 2019.03.
Mentored by Dr. Zhe Wang.

Publications

METransformer: Radiology Report Generation by Transformer with Multiple Learnable Expert Tokens[C].
Zhanyu Wang, Lingqiao Liu, Lei Wang, Luping Zhou.
In Computer Vision and Pattern Recognition (CVPR), 2023.

Q2ATransformer: Improving Medical VQA via an Answer Querying Decoder[C].
Yunyi Liu, Zhanyu Wang, Dong Xu, Luping Zhou.
In Information Processing in Medical Imaging (IPMI), 2023.

Stay in Grid: Improving Video Captioning via Fully Grid-level Representation[J].
Mingkang Tang, Zhanyu Wang (co-first author), Zhaoyang Zeng, Xiu Li, Luping Zhou.
In IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2022.

Automated Radiographic Report Generation Purely On Transformer: A Multi-criteria Supervised Approach[J].
Zhanyu Wang, Hongwei Han, Lei Wang, Xiu Li, Luping Zhou.
In IEEE Transactions on Medical Imaging (TMI), 2022.

A Medical Semantic-Assisted Transformer for Radiographic Report Generation[C].
Zhanyu Wang, Mingkang Tang, Lei Wang, Xiu Li, Luping Zhou.
In Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2022.

A Self-boosting Framework for Automated Radiographic Report Generation[C].
Zhanyu Wang, Luping Zhou, Lei Wang, Xiu Li.
In Computer Vision and Pattern Recognition (CVPR), 2021.

CLIP4Caption: CLIP for Video Caption[C].
Mingkang Tang, Zhanyu Wang, Zhenhua Liu, Fengyun Rao, Dian Li, Xiu Li.
In ACM International Conference on Multimedia (MM), 2021.

Blind quality assessment of night-time image[J].
Ruize Hu, Yutao Liu, Zhanyu Wang, Xiu Li.
In Displays, 2021.

Intelligent glaucoma diagnosis via active learning and adversarial data augmentation[C].
Wang Zhanyu, Wang Zhe, Qu Guoxiang, Fei Li, Ye Yuan, Dennis S.C. Lam, Xiulan Zhang, Yue Zhang, Yu Qiao.
In International Symposium on Biomedical Imaging (ISBI), 2019.

Competitions

No.1 in the VALUE Challenge on video captioning, hosted at ICCV 2021. [paper] [report]

No.2 in Multimedia Grand Challenge hosted at ACM MM 2021. [paper]

Services

Journal Reviewer: TMI, Pattern Recognition, Neurocomputing

Conference Reviewer: AAAI 2023, MICCAI 2022