What is: Composed Video Retrieval?
Source | CoVR: Learning Composed Video Retrieval from Web Video Captions |
Year | 2023 |
Data Source | CC BY-SA - https://paperswithcode.com |
Composed video retrieval (CoVR) is a recently introduced task in which the goal is to find a video that matches both a query image and a query text. The query image represents a visual concept that the user is interested in, and the query text specifies how that concept should be modified or refined. For example, given an image of a fountain and the text "during show at night", the task is to retrieve a video that shows the fountain during a show at night.
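To make the task concrete, here is a minimal sketch of how a composed query could be matched against a video collection. It is not the method from the CoVR paper: it assumes that the image, the text, and each video have already been mapped to vectors in a shared embedding space, and it fuses image and text by a plain normalized sum, which is a placeholder for a learned multimodal encoder. The function names (`compose_query`, `rank_videos`), the video ids, and the toy 3-dimensional embeddings are all hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def compose_query(image_emb, text_emb):
    """Fuse image and text embeddings by summing their unit vectors.

    Assumption: a simple sum stands in for a learned fusion model.
    """
    fused = [a + b for a, b in zip(normalize(image_emb), normalize(text_emb))]
    return normalize(fused)


def rank_videos(query_emb, video_embs):
    """Return video ids sorted by cosine similarity to the query, best first."""
    scores = {
        vid: sum(q * v for q, v in zip(query_emb, normalize(emb)))
        for vid, emb in video_embs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings: axis 0 = "fountain", axis 1 = "night", axis 2 = "daytime".
image_emb = [1.0, 0.0, 0.0]   # query image: a fountain
text_emb = [0.0, 1.0, 0.0]    # query text: "during show at night"
video_embs = {
    "fountain_night_show": [1.0, 1.0, 0.0],
    "fountain_daytime": [1.0, 0.0, 1.0],
    "city_night": [0.0, 1.0, 0.2],
}

query = compose_query(image_emb, text_emb)
ranking = rank_videos(query, video_embs)
print(ranking[0])  # fountain_night_show
```

In this toy setup the video that shares both the visual concept (the fountain) and the requested modification (night) scores highest, while videos matching only one of the two rank lower. That is the behavior the task definition asks for.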