From the paper: "We define out-of-context use of images as presenting the image as evidence of untrue or unrelated event(s). If the two captions refer to the same object in the image but are semantically different, i.e. correspond to different events, then it indicates out-of-context use of the image. However, if the captions correspond to the same event irrespective of the object(s) the captions describe, it is defined as not-out-of-context."
To clarify: for this task, we consider how each of the two captions relates to the provided image, not how the captions relate to each other in isolation.
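The quoted definition can be read as a small decision rule. The sketch below is only an illustration of that reading, not the paper's implementation: the function name `is_out_of_context` and the two boolean inputs are hypothetical, and how you decide whether the captions share an object or an event is left to your own algorithm. The quoted definition does not say what to do when the captions describe different objects and different events; the sketch assumes that case is not-out-of-context.

```python
def is_out_of_context(same_object: bool, same_event: bool) -> bool:
    """Return True when a caption pair signals out-of-context use of the image.

    same_object: both captions refer to the same object in the image.
    same_event:  both captions correspond to the same event.
    (Both inputs are hypothetical; computing them is up to your algorithm.)
    """
    if same_event:
        # Same event, irrespective of the object(s) described: not-out-of-context.
        return False
    # Different events: out-of-context only if the captions refer to the same object.
    # Assumption: different object + different event is treated as not-out-of-context,
    # because the quoted definition does not cover that case.
    return same_object
```

For example, `is_out_of_context(same_object=True, same_event=False)` flags the pair as out-of-context, while any pair with `same_event=True` does not.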
The ground-truth (GT) labels are already provided for the test dataset. Please use them for evaluation. You can design the algorithm any way you like, but please make sure to use the same setup for all samples.
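A minimal sketch of such an evaluation is shown below, assuming accuracy as the metric (the text above does not name one). The function `accuracy`, the `predict` callable, and the `"label"` key are hypothetical names chosen for illustration; adapt them to however the test annotations are actually stored. The same `predict` function is applied to every sample, so the setup stays identical across the dataset.

```python
from typing import Any, Callable, Iterable, Mapping


def accuracy(
    samples: Iterable[Mapping[str, Any]],
    predict: Callable[[Mapping[str, Any]], int],
    label_key: str = "label",  # hypothetical field name for the GT label
) -> float:
    """Fraction of samples whose prediction matches the ground-truth label."""
    samples = list(samples)
    if not samples:
        raise ValueError("no samples to evaluate")
    # Apply the same predictor, unchanged, to every sample.
    correct = sum(1 for s in samples if int(predict(s)) == int(s[label_key]))
    return correct / len(samples)
```

Usage would look like `accuracy(test_samples, my_model_predict)`, where both arguments are your own objects.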