1st CALL FOR PARTICIPATION & DEVELOPMENT DATA RELEASE
Predicting Video Memorability Task
2021 MediaEval Benchmarking Initiative for Multimedia Evaluation
https://multimediaeval.github.io/editions/2021/tasks/memorability/
*******************************************************
Register to participate by filling in the MediaEval 2021 Registration form: https://docs.google.com/forms/d/e/1FAIpQLSchIcIaSlM1fNeWGCSoSBMR6HS48HKMhWEY151vvCmb5KhO-w/viewform
*******************************************************
Annotations: https://annotator.uk/mediaeval/index.php
*******************************************************
The Predicting Video Memorability Task focuses on the problem of predicting how memorable a video will be. It requires participants to automatically predict memorability scores for videos, which reflect the probability of a video being remembered.
Participants will be provided with an extensive dataset of videos with memorability annotations and pre-extracted state-of-the-art visual features. The ground truth has been collected through recognition tests and therefore reflects objective measures of memory performance. In contrast to previous work on image memorability prediction, where memorability was measured only a few minutes after memorisation, the dataset comes with both short-term and long-term memorability annotations. Because memories continue to evolve in long-term memory, in particular during the first day following memorisation, we expect long-term memorability annotations to be more representative of long-term memory performance, which is the preferred measure in numerous applications.
*******************************************************
Video-based prediction task
*******************************************************
Participants will be required to train computational models capable of inferring video memorability from visual content. Optionally, descriptive titles attached to the videos may be used. Models will be evaluated through standard evaluation metrics used in ranking tasks.
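As an illustration of a ranking-style evaluation, the sketch below computes Spearman's rank correlation between predicted and ground-truth scores. The choice of Spearman's rho is an assumption made for this example (the text above only says "standard evaluation metrics used in ranking tasks"), and the function names and score values are hypothetical, not task data.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(predicted: Sequence[float], ground_truth: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation between the two rank vectors."""
    if len(predicted) != len(ground_truth) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rp, rg = average_ranks(predicted), average_ranks(ground_truth)
    mean_p, mean_g = sum(rp) / len(rp), sum(rg) / len(rg)
    cov = sum((a - mean_p) * (b - mean_g) for a, b in zip(rp, rg))
    var_p = sum((a - mean_p) ** 2 for a in rp)
    var_g = sum((b - mean_g) ** 2 for b in rg)
    if var_p == 0 or var_g == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / math.sqrt(var_p * var_g)


# Hypothetical scores for five videos (illustrative values only).
truth = [0.95, 0.80, 0.72, 0.88, 0.60]
preds = [0.90, 0.70, 0.75, 0.85, 0.65]
rho = spearman(preds, truth)
print(rho)
```

For these toy values only two videos swap places in the ranking, so rho evaluates to 0.9. Because the metric depends on ranks alone, a model is rewarded for ordering videos correctly even if its absolute scores are off.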
*******************************************************
Generalization task (optional)
*******************************************************
The aim of the Generalization subtask is to check system performance on other types of video data. Participants will use their systems, trained on one of the two sources of data we propose, to predict the memorability of videos from the testing set of the other source of data. We believe this will provide interesting insights into the performance of the developed systems: while the two sources of data measure memorability in a similar way, their videos may differ in content, general subject matter, or length. As this is an optional subtask, participation in it is not required.
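The cross-source protocol described above can be sketched as follows. This is a minimal illustration under stated assumptions: each video is reduced to a single hypothetical feature value, the model is a one-feature least-squares line, and all numbers are toy values rather than task data. The dictionary keys simply label the two data sources.

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature is constant; slope is undefined")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def predict(model: Tuple[float, float], xs: Sequence[float]) -> List[float]:
    """Apply the fitted line; scores are probabilities, so clip to [0, 1]."""
    slope, intercept = model
    return [min(1.0, max(0.0, slope * x + intercept)) for x in xs]


# Toy (feature, score) training pairs and test features per source (hypothetical).
sources = {
    "trecvid": {"train": ([0.0, 1.0, 2.0, 3.0], [0.5, 0.6, 0.7, 0.8]),
                "test_x": [0.5, 2.5]},
    "memento10k": {"train": ([0.0, 2.0, 4.0], [0.9, 0.8, 0.7]),
                   "test_x": [1.0, 3.0]},
}

# Train on one source, predict on the test set of the other source.
runs: Dict[Tuple[str, str], List[float]] = {}
for train_name, train_data in sources.items():
    model = fit_line(*train_data["train"])
    for test_name, test_data in sources.items():
        if test_name != train_name:  # cross-source pairs only
            runs[(train_name, test_name)] = predict(model, test_data["test_x"])

for (train_name, test_name), scores in runs.items():
    print(f"trained on {train_name}, tested on {test_name}: {scores}")
```

Each entry in `runs` corresponds to one generalization run. In practice the predictions would then be scored against the held-out ground truth of the test source.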
*******************************************************
Pilot demonstration task (pilot)
*******************************************************
The aim of the Memorability-EEG pilot task is to promote interest in the use of neural signals, either alone or in combination with other data sources, for predicting video memorability by demonstrating what EEG data can provide. The dataset will be a set of features pre-extracted from the EEG recordings for a subset of videos from task 1 (the video-based prediction task). This demonstration pilot will show interested researchers how they could use neural signals in a future Memorability task without needing the requisite domain knowledge, potentially increasing interdisciplinary interest in the subject of memorability and opening the door to novel approaches that combine EEG and computer vision to predict video memorability.
Pre-selected participants in this pilot demonstration will use the dataset to explore all manner of machine learning and processing strategies to predict video memorability. This will lead to a presentation of their findings, which will ultimately contribute towards the collaborative definition of a fully-fledged task at MediaEval 2022, where participating teams will submit runs and be benchmarked.
***********************
Target communities
***********************
Researchers will find this task interesting if they work in areas related to human perception and scene understanding, including (but not limited to) image and video interestingness, memorability, attractiveness, aesthetics prediction, event detection, multimedia affect and perceptual analysis, multimedia content analysis, and machine learning.
***********************
Data
***********************
The first dataset is composed of a subset of 6,000 short videos retrieved from the TRECVid 2019 Video to Text dataset [1]. Each video consists of a coherent unit in terms of meaning and is associated with two memorability scores that refer to its probability of being remembered after two different durations of memory retention. As in previous editions of the task [2], memorability has been measured using recognition tests, i.e., through an objective measure, a few minutes after the memorisation of the videos (short term), and then 24 to 72 hours later (long term). The videos are shared under Creative Commons licenses that allow their redistribution. They come with a set of pre-extracted features, such as histograms in the HSV and RGB spaces, HOG, LBP, and deep features extracted from AlexNet, VGG and C3D. In comparison to the videos used for this task in 2018 and 2019, the TRECVid videos contain much more action and are thus more interesting for subjects to view.
Additionally, we will open the Memento10k dataset to participants. This dataset contains 10,000 three-second videos depicting in-the-wild scenes, with their associated short-term memorability scores, memorability decay values, action labels, and 5 accompanying captions. 7,000 videos will be released as a training set, and 1,500 will be given for validation. The remaining 1,500 videos will be used as the test set for scoring submissions. The scores are computed with 90 annotations per video on average, and the videos were muted before being shown to participants. We will also distribute a set of features for each video analogous to the TRECVid set.
***********************
Annotations
***********************
We need more annotations for the dataset and kindly ask for your help in collecting them. Please visit the link (https://annotator.uk/mediaeval/index.php) and take part in the fun game to contribute to the dataset and get familiar with the data. Thanks in advance for your contribution.
******************************
Workshop
******************************
Participants in the task are invited to present their results at the annual MediaEval Workshop, which will be held in Bergen, Norway, on 6-8 December 2021, with the opportunity for online participation. Working notes proceedings will appear in the CEUR Workshop Proceedings (ceur-ws.org).
******************************
Important dates (tentative)
******************************
(open) Participant registration: July
Data release: 15 September
Runs due: 11 November
Working notes papers due: 22 November
MediaEval Workshop: 6-8 December, in Bergen, Norway, with the opportunity for online participation
***********************
Task coordination
***********************
Alba García Seco de Herrera, <alba.garcia(at)essex.ac.uk>, University of Essex, UK
Rukiye Savran Kiziltepe, <rs16419(at)essex.ac.uk>, University of Essex, UK
Mihai Gabriel Constantin, <cmihaigabriel(at)gmail.com>, University Politehnica of Bucharest, Romania
Bogdan Ionescu, University Politehnica of Bucharest, Romania
Alan Smeaton, Dublin City University, Ireland
Claire-Hélène Demarty, InterDigital, R&I, France
Sebastian Halder, University of Essex, UK
Ana Matrán-Fernández, University of Essex, UK
Camilo Fosco, Massachusetts Institute of Technology Cambridge, Massachusetts, USA
Lorin Sweeney, Dublin City University, Ireland
Graham Healy, Dublin City University, Ireland
On behalf of the Organizers,
Alba García Seco de Herrera
Dr Alba García Seco de Herrera
Department of Computer Science and Electronic Engineering (CSEE)
University of Essex
https://www.essex.ac.uk/people/garci58409/alba-garcia-seco-de-herrera