Introduction

Vision and Multimedia (VIM) Research Group

The Vision and Multimedia (VIM) group at the University of Science and Technology of China (USTC) is one of the research groups in the National Engineering Laboratory for Brain-inspired Intelligence Technology and Application. Building on the primary disciplines of Pattern Recognition and Intelligent Systems, the VIM group conducts research and development (R&D) in artificial intelligence. Our interests include multi-modality foundation models, deep learning, and artificial intelligence systems. Our goal is to propose innovative theories and techniques, implement practical systems and equipment, and serve important national needs through funded projects.

The principal investigator is currently Dr. Zilei Wang, Associate Professor, USTC.

[Computer Vision]

Task: Research modern frameworks, models, and algorithms for fundamental issues and emerging problems, and develop useful application systems for the real world.

Image/video processing and enhancement;
Object classification/segmentation/detection/localization/tracking;
Action recognition and detection;
Multi-modality large models.

[Deep Learning]

Task: Build novel training frameworks and algorithms for various kinds of training data.

Domain and category generalization;
Test-time adaptation;
Continual learning;
Few-shot and zero-shot learning.

[Artificial Intelligence System]

Task: Develop end-to-end artificial intelligence systems.

Semantic maps, 3D scene generation, visual odometry;
Visual perception system;
Autonomous intelligence system;
Biomedical image analysis.

For these research topics, the VIM group has formed multiple teams to track the research frontiers or build real or demo systems. Many of them are supported by various projects and funds, including the National Natural Science Foundation of China (NSFC), CAS, and Anhui Province.

If you are interested in any of these topics, join us!