LVIS Taxonomy and Labeling Questions

John Lambert

Dec 12, 2019, 6:16:56 PM

to lvis-d...@googlegroups.com, Zhuang Liu

Hello LVIS Dataset Creators,

Thank you very much for preparing the LVIS dataset. I believe this dataset is and will be a very significant contribution to the community.

I have a few questions about the labeling process and dataset "vocabulary" or "taxonomy". I've also grappled with similar questions in my work, and would be very happy to hear your thoughts.

1. I read the "hierarchy" paragraph in section 2.3 of the paper. Is there a hierarchy to the LVIS dataset categories? Or are the categories selected in such a way to be "flat"? Are there any cases with parent-child relationships that you are aware of? None were immediately obvious to me after looking through the 1230 train/val categories.
2. Is there a "person" category? When looking through the v0.5 val and train categories (which are identical, as I understood), I could not see any person category, other than "baby".
3. In the LVIS arxiv paper in section 2.2, we read that "In other words, P_c is exhaustively annotated for category c." However, in section 2.3, we read that, "We also collect an image-level boolean label, e_i^c, indicating if image i ∈ P_c is exhaustively annotated for category c." These statements seem to potentially be contradictory (or show 2 different definitions for the word exhaustive). Which definition in the actual dataset is provided for 'not_exhaustive_category_ids'?
4. I looked through image 282298 of the val set, which shows a crowd of people. However, only 4 classes are annotated here: Christmas_tree, cigarette, and clock. Will the humans be annotated in v1.0, or will only a subset of instances be labeled in such a case, as noted in the paper? I also noticed that "person" is not included in the 'not_exhaustive_category_ids' for this image.
5. Must each category ID in 'not_exhaustive_category_ids' also be present in the positive annotation category IDs for that image?
6. I definitely agree with your observation that multiple ground truth classes for an instance mask can exist. Consider the LVIS val set, with "pet" and "shepherd-dog" both potentially representing the same instance. As I've understood it, rather than annotating such pixels with multi-modal ground truth (>1 correct category), such cases would be annotated only once (say as pet), but the other possible classes (shepherd-dog) could *not* appear in the 'neg_category_ids' list. Have I understood correctly?
7. Can 2 masks (for 2 separate ann_ids) overlap?
8. Can an "ann_id" include more than 1 category?
9. Suppose I have an image depicting a deer toy. This object is a child's toy, in the shape/color of a deer. Suppose we are evaluating the deer class at test time, and your model must predict the correct class. Our model predicts toy, instead of deer, and is thus penalized. Does this represent how LVIS evaluation is carried out for such a scenario? In such an evaluation scenario, it appears that a federated dataset may penalize a good model which made a correct prediction (but prefers toy predictions to deer predictions in such a case).

Thank you in advance for your time and thoughts.

Best wishes,

John Lambert

Agrim Gupta

Dec 12, 2019, 7:28:44 PM

to LVIS Dataset

Hi John,

1. There is no official hierarchy for the vocabulary. However, since we provide the synsets you can use the WordNet hierarchy. You can use nltk for some quick prototyping.

2. There is a synset person.n.01.

3. For each image there are two types of categories: P_c categories, for which we aimed for exhaustive annotation, and P_n categories, which are not present. However, it is possible that, even after our best efforts, some instances of a category in P_c were not annotated. We use not_exhaustive_category_ids to identify those categories.

4. Refer to https://groups.google.com/forum/#!topic/lvis-dataset/YxD3VxW15Js

5. Yes

6. Yes. However, it is also possible that it is labelled as both dog and shepherd-dog. Essentially, we have double annotations.

7. Yes. Our annotation tooling provides snap functionality to reduce this, but we don't provide any guarantees.

8. No

9. If the ground truth (gt) does not have the toy label, you won't be penalized, but you won't be rewarded either.
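
For readers who want to see answers 5 and 9 in concrete terms, here is a minimal sketch in plain Python. It is not the official LVIS evaluation code. It assumes toy image and annotation records that use only the field names mentioned in this thread (`neg_category_ids`, `not_exhaustive_category_ids`) plus `id`, `image_id`, and `category_id`; the category numbers, the `(category_id, score)` detection tuples, and the helper names are hypothetical. The idea, as answer 9 suggests, is that a detection whose category is neither a positive nor a verified negative for the image is simply left out of scoring.

```python
# Toy records using the field names discussed in this thread; all values are made up.
image = {
    "id": 1,
    "neg_category_ids": [7],             # categories verified absent from the image
    "not_exhaustive_category_ids": [3],  # positives that may have unlabeled instances
}
annotations = [
    {"id": 10, "image_id": 1, "category_id": 3},
    {"id": 11, "image_id": 1, "category_id": 5},
]
# Hypothetical detections as (category_id, score); 42 stands in for "toy".
detections = [(3, 0.9), (7, 0.8), (42, 0.7)]


def positive_ids(image, annotations):
    """Category IDs that have at least one annotation in this image."""
    return {a["category_id"] for a in annotations if a["image_id"] == image["id"]}


def evaluated_detections(image, annotations, detections):
    """Keep only detections whose category is a positive or a verified negative."""
    scored = positive_ids(image, annotations) | set(image["neg_category_ids"])
    return [d for d in detections if d[0] in scored]


pos = positive_ids(image, annotations)
# Answer 5: every non-exhaustive category should also be a positive category.
non_exhaustive_ok = set(image["not_exhaustive_category_ids"]) <= pos
# Answer 9: category 42 is in neither set, so it is neither penalized nor rewarded.
kept = evaluated_detections(image, annotations, detections)
print(non_exhaustive_ok, kept)
```

In this toy case the detection for category 7 is kept and would count as a false positive, while the detection for category 42 is dropped before scoring.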

Hope this answers your queries!

Regards,

The LVIS Team
