Hello Vladimir,
By VOC we mean the vocabulary of the properties used in the HTML pages of CC, which we rely on to extract structured data. You are right that some of the entries are invalid. Given the size of the corpus, it is to be expected that some entities are annotated incorrectly. We deliberately keep track of these errors in our extraction, since they are a topic that can be investigated further [1]. Another possible cause of an invalid vocabulary entry is that something went wrong during parsing.
I hope this answers your question.
Best Regards,
Anna
[1]: Meusel, R., & Paulheim, H. (2015, May). Heuristics for fixing common errors in deployed schema.org microdata. In European Semantic Web Conference (pp. 152-168). Springer International Publishing.