Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
Pages
Posts
WWW 2024 in Singapore
Published:
Post!
The Web Conference 2024 was held at Resorts World Convention Centre located at 8 Sentosa Gateway, Singapore 098269. Resorts World Sentosa (RWS), Asia’s premium lifestyle destination resort, is located on Singapore’s resort island of Sentosa.
portfolio
publications
SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant
Published in Proceedings of the 18th European Conference on Computer Vision (ECCV), 2024
This work introduces a new training method that enhances general-purpose vision-language understanding and image-oriented question answering through visual self-questioning.
Recommended citation: Sun, G., Qin, C., Wang, J., Chen, Z., Xu, R., & Tao, Z. (2024). SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant. ECCV
Download Paper
STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical
Published in Empirical Methods in Natural Language Processing (EMNLP), 2024
This work introduces a novel self-training approach to enhance the data efficiency of training LVLMs for medical tasks.
Recommended citation: Sun, G., Qin, C., Fu, H., Wang, L., & Tao, Z. (2024). STLLaVA-Med: Self-Training Large Language and Vision Assistant for Medical. EMNLP
Download Paper
Aligning Out-of-Distribution Web Images and Caption Semantics via Evidential Learning
Published in Proceedings of the ACM on Web Conference (WWW), 2024
This work efficiently improves the robustness and performance of pre-trained vision-language networks when handling in-distribution (ID) and out-of-distribution (OOD) cases in image-text retrieval tasks via evidence knowledge.
Recommended citation: Guohao Sun, Yue Bai, Xueying Yang, Yi Fang, Yun Fu, and Zhiqiang Tao. 2024. Aligning Out-of-Distribution Web Images and Caption Semantics via Evidential Learning. WWW.
Download Paper
Prototypical Transformer as Unified Motion Learners
Published in Proceedings of the 41st International Conference on Machine Learning (ICML), 2024
This work refines feature representations via prototype-feature association.
Recommended citation: Han, C., Lu, Y., Sun, G., Liang, J., Cao, Z., Wang, Q., Guan, Q., Dianat, S.A., Rao, R.M., Geng, T., Tao, Z., & Liu, D. (2024). Prototypical Transformer as Unified Motion Learners. ICML
Download Paper
Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval
Published in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
We introduced T-MASS, where text is modeled as a stochastic embedding, facilitating joint learning of the text mass and video points.
Recommended citation: Wang, J., Sun, G., Wang, P., Liu, D., Dianat, S.A., Rabbani, M., Rao, R.M., & Tao, Z. (2024). Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval. CVPR
Download Paper
talks
Talk 1 on Relevant Topic in Your Field
Published:
This is a description of your talk, which is a markdown file that can be all markdown-ified like any other post. Yay markdown!
Conference Proceeding talk 3 on Relevant Topic in Your Field
Published:
This is a description of your conference proceedings talk; note the different field in type. You can put anything in this field.
teaching
Teaching experience 1
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Teaching experience 2
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.