International Challenge on Activity Recognition (ActivityNet) at CVPR 2022: Release Announcement


Humam Alwassel

May 3, 2022, 10:49:51 AM
to ActivityNet

Dear ActivityNet Participants,  


The time has come! We are ready to officially release the 7th installment of the annual International Challenge on Activity Recognition, held in conjunction with CVPR 2022 on June 19, 2022. Following the success of the previous ActivityNet Challenges (2016, 2017, 2018, 2019, 2020, 2021) and based on your feedback, we have worked hard to make this round richer and more inclusive. We are proud to announce that this year's challenge will be a packed full-day workshop with parallel tracks, hosting 8 diverse challenges that aim to push the limits of semantic visual understanding of videos and bridge visual content with human captions.

Two of the eight challenges are based on the ActivityNet dataset. These tasks focus on tracing evidence of activities in time, in the form of class labels and captions. In this installment of the challenge, we will also host six guest challenges, which enrich the understanding of visual information in videos. These tasks focus on complementary aspects of the activity recognition problem at a large scale and involve challenging, recently compiled datasets. Please see more information about each task appended to this message.

We encourage you to visit the challenge website and go through its details (e.g., task/dataset specifications, important dates, evaluation metrics, toolkits/baselines, and submission guidelines). We have designated one or more contact people for each task. As in previous years, the ActivityNet Google Group is the place to get challenge- and dataset-specific questions answered.


We look forward to your submissions and are committed to making your participation a pleasant experience. If you have any questions or comments about the challenge, please contact us. Please share this email with anyone who may be interested in the International Challenge on Activity Recognition (ActivityNet). If you no longer wish to receive communications about the challenge, you may reply “unsubscribe” to this email.


Regards,

ActivityNet Challenge Team


~~~~~~~~~~~~~ Task Descriptions ~~~~~~~~~~~~~

ActivityNet Temporal Action Localization: Despite the recent advances in large-scale video analysis, temporal action localization remains one of the most challenging unsolved problems in computer vision. This search problem hinders various real-world applications ranging from consumer video summarization to surveillance, crowd monitoring, and elderly care. This task is intended to encourage computer vision researchers to design high-performance action localization systems.
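Predictions in temporal action localization are commonly scored by their temporal intersection-over-union (tIoU) with ground-truth segments. A minimal sketch of that overlap measure follows; the function name and segment values are illustrative, not taken from the challenge toolkit:

```python
# Hedged sketch: temporal IoU (tIoU), the standard overlap measure between a
# predicted [start, end] segment and a ground-truth segment. Values below are
# made-up examples, not challenge data.

def temporal_iou(pred, gt):
    """Intersection-over-union of two [start, end] time segments (seconds)."""
    start = max(pred[0], gt[0])            # latest start
    end = min(pred[1], gt[1])              # earliest end
    intersection = max(0.0, end - start)   # zero if the segments are disjoint
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - intersection
    return intersection / union if union > 0 else 0.0

# Example: a prediction [12.0, 30.0] vs. ground truth [10.0, 25.0]
# intersection = 25.0 - 12.0 = 13.0; union = 18.0 + 15.0 - 13.0 = 20.0
print(temporal_iou([12.0, 30.0], [10.0, 25.0]))  # -> 0.65
```

A detection is typically counted as correct when its tIoU with a ground-truth instance exceeds a threshold, and performance is averaged over several thresholds.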

ActivityNet Event Dense-Captioning: Most natural videos contain numerous events. For example, in a video of a 'man playing a piano', the video might also contain another 'man dancing' or 'a crowd clapping'. This challenge studies the task of dense-captioning events, which involves both detecting and describing events in a video. This challenge uses the ActivityNet Captions dataset, a large-scale benchmark for dense-captioning events.

TinyActions: In this challenge, the focus is on recognizing tiny actions in videos. The existing research in action recognition is mostly focused on high-quality videos where the action is distinctly visible. Therefore, the available action recognition models are not designed for low-resolution videos and their performance is still far from satisfactory when the action is not distinctly visible. This challenge invites solutions for recognizing tiny actions in real-world videos.

AVA-Kinetics: The AVA-Kinetics task is an umbrella for a crossover of the previous AVA and Kinetics tasks, in which Kinetics has now been annotated with AVA labels (but AVA has not been annotated with Kinetics labels). There have always been interactions between the two datasets; for example, many AVA methods are pre-trained on Kinetics. The new annotations should allow for improved performance on both tasks and also increase the diversity of the AVA evaluation set (which now also includes Kinetics clips).

AVA-ActiveSpeaker: The goal of this task is to evaluate whether algorithms can determine when a visible face is speaking. For this task, participants will use the new AVA-ActiveSpeaker dataset. The purpose of this dataset is to extend the AVA Actions dataset to the task of active speaker detection, and to push the state-of-the-art in multimodal perception: participants are encouraged to use both the audio and video data.

ActEV Self-Reported Leaderboard (SRL): This task seeks to encourage the development of robust automatic activity detection algorithms for extended videos. Challenge participants will develop algorithms to detect and temporally localize instances of Known Activities, using an ActEV Command Line Interface (CLI) submission on the Unknown Facility EO video dataset.

SoccerNet: The SoccerNet challenges are back! We are proud to announce new soccer video challenges at CVPR 2022, including (i) Action Spotting and Replay Grounding, (ii) Camera Calibration and Field Localization, (iii) Player Re-Identification, and (iv) Ball and Player Tracking. Feel free to check out our presentation video for more details. The challenges end on May 30, 2022. Each challenge has a $1,000 prize sponsored by EVS Broadcast Equipment, SportRadar, and Baidu Research. The latest news, including leaderboards, will be shared throughout the competition on our Discord server.

HOMAGE: This challenge leverages the Home Action Genome dataset, which contains multi-view videos of indoor daily activities. We use scene graphs to describe the relationship between a person and the object used during the execution of an action. In this track, the algorithms need to predict per-frame scene graphs, including how they change as the video progresses.

