Short Bio

I am currently a final-year Ph.D. student at the Medical Computer Vision Lab, University of Sydney, Australia, supervised by Prof. Luping Zhou and co-supervised by Prof. Wanli Ouyang. I was also very fortunate to be mentored by Prof. Lei Wang and Prof. Lingqiao Liu. Before that, I received my master's degree from the Department of Automation, Tsinghua University, China, in Jul. 2019 and my B.E. degree from the Department of Automation, Tianjin University of Science and Technology, China, in Jul. 2015.

My research interests are in vision-language content understanding and its applications, including image/video captioning, image/video-text retrieval, and medical report generation.

News

[11/2023]: We release GPT4Video, a unified MLLM for video understanding and generation. See our website for details.

[11/2023]: We conducted the first quantitative evaluation of GPT-4V on various medical imaging tasks, detailed in our study here.

[09/2023]: One paper accepted by Meta-Radiology.

[03/2023]: One paper accepted by CVPR.

[02/2023]: One paper accepted by IPMI (Oral).

[12/2022]: One paper accepted by TCSVT (IF=5.859).

[08/2022]: One paper accepted by MICCAI.

[04/2022]: One paper accepted by TMI (IF=11.037).

[06/2021]: One paper accepted by CVPR.

Education

  2020.03 - present, Ph.D. student, Faculty of Engineering and Information Technology, University of Sydney.

  2016.09 - 2019.07, Master, Department of Automation, Tsinghua University.

  2011.09 - 2015.09, Bachelor, Department of Automation, Tianjin University of Science and Technology.

Experience

Tencent: Research Intern, Tencent AI Lab, 2023.07 - present.
Mentored by Dr. Longyue Wang.

Tencent: Algorithm Researcher, QQ Multimedia AI Lab, 2021.12 - 2022.03.
Mentored by Dr. Dian Li.

Tencent: Algorithm Researcher, Kandian Content AI Lab, 2021.03 - 2021.12.
Mentored by Dr. Fengyun Rao.

Tsinghua: Visiting Student, Intelligent Computation Lab, 2020.07 - 2021.03.
Supervised by Prof. Xiu Li.

SandStar: Algorithm Researcher, 2019.07 - 2020.01.
Mentored by Dr. Zexi Yang.

SenseTime: Research Intern, Intelligent Automotive Group, 2018.09 - 2019.03.
Mentored by Dr. Zhe Wang.

Publications

METransformer: Radiology Report Generation by Transformer with Multiple Learnable Expert Tokens[C].
Zhanyu Wang, Lingqiao Liu, Lei Wang, Luping Zhou.
In Computer Vision and Pattern Recognition (CVPR), 2023.

Q2ATransformer: Improving Medical VQA via an Answer Querying Decoder[C].
Yunyi Liu, Zhanyu Wang, Dong Xu, Luping Zhou.
In Information Processing in Medical Imaging (IPMI), 2023.

Stay in Grid: Improving Video Captioning via Fully Grid-level Representation[J].
Mingkang Tang, Zhanyu Wang (co-first author), Zhaoyang Zeng, Xiu Li, Luping Zhou.
In IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2022.

Automated Radiographic Report Generation Purely On Transformer: A Multi-criteria Supervised Approach[J].
Zhanyu Wang, Hongwei Han, Lei Wang, Xiu Li, Luping Zhou.
In IEEE Transactions on Medical Imaging (TMI), 2022.

A Medical Semantic-Assisted Transformer for Radiographic Report Generation[C].
Zhanyu Wang, Mingkang Tang, Lei Wang, Xiu Li, Luping Zhou.
In Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2022.

A Self-boosting Framework for Automated Radiographic Report Generation[C].
Zhanyu Wang, Luping Zhou, Lei Wang, Xiu Li.
In Computer Vision and Pattern Recognition (CVPR), 2021.

CLIP4Caption: CLIP for Video Caption[C].
Mingkang Tang, Zhanyu Wang, Zhenhua Liu, Fengyun Rao, Dian Li, Xiu Li.
In ACM International Conference on Multimedia (MM), 2021.

Blind quality assessment of night-time image[J].
Ruize Hu, Yutao Liu, Zhanyu Wang, Xiu Li.
In Displays, 2021.

Intelligent glaucoma diagnosis via active learning and adversarial data augmentation[C].
Zhanyu Wang, Zhe Wang, Guoxiang Qu, Fei Li, Ye Yuan, Dennis S.C. Lam, Xiulan Zhang, Yue Zhang, Yu Qiao.
In International Symposium on Biomedical Imaging (ISBI), 2019.

Competitions

1st place in the VALUE Challenge on video captioning, hosted at ICCV 2021. [paper] [report]

2nd place in the Multimedia Grand Challenge, hosted at ACM MM 2021. [paper]

Services

Journal Reviewer: TMI, Pattern Recognition, Neurocomputing

Conference Reviewer: AAAI 2023, MICCAI 2022