For the up-to-date publication list, please visit the Google Scholar page.
UNO: Towards Unified Baseline for Video Scene Graph Generation via Object-Centric Representation Learning
BiMa: Towards Biases Mitigation for Text-Video Retrieval via Scene Element Guidance
WAVER: Writing-style Agnostic Text-Video Retrieval via Distilling Vision-Language Models through Open-Vocabulary Knowledge
Tracked-Vehicle Retrieval by Natural Language Descriptions With Multi-Contextual Adaptive Knowledge
Multi-camera People Tracking With Mixture of Realistic and Synthetic Knowledge
Tracked-vehicle Retrieval by Natural Language Descriptions with Domain Adaptive Knowledge
Multi-Camera Multi-Vehicle Tracking with Domain Generalization and Contextual Constraints