Hi Christophe,
> more ~/Downloads/NELL.08m.695.ontology.csv | grep evemtatlocation
Yes, "evemtatlocation" was an old typo that was not entirely expunged from
the KB when we corrected it a while back. It just so happens that we
added a few more categories and relations to NELL recently, and the
ontology update process has removed those references to "evemtatlocation".
The story for leaderofcontry is similar -- we had changed the name to
personleadscountry a while back but there were still some vestigial
references to the old name until recently.
> By the way, why is there a duplication for all assertions involving classes
> ? (each triple involving a class is duplicated for a version where the
> class name is its localname, and another where the class is prefixed by
> "concept:") ?
This is NELL's way of dealing with the fact that a single noun phrase can
refer to more than one concept and vice versa. For instance, "apple"
could refer to the fruit or the computer company. Similarly, both "Apple"
and "Apple Inc." can refer to the computer company.
Most of NELL's learning methods are not directly aware of the different
senses that a given noun phrase can have. They simply recognize patterns of
text around the words, look at the structure of the words, or rely on
similar surface cues. NELL then uses a clustering process that looks at similarities
between noun phrases and also takes into consideration ontological
constraints (e.g. nothing can be both a fruit and a company) to generate a
many-to-many mapping between noun phrases and the concepts to which they
may refer. Those assertions involving the "concept:" prefix are
assertions among concepts, and the rest are assertions among noun phrases.
All of NELL's promoted beliefs are in terms of concepts, which is what
we're really interested in. Maybe you can think of the beliefs about noun
phrases as being direct sensory observations of the world, and the beliefs
about concepts as the understanding, or at least the conclusions, drawn
from those observations.
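As a rough illustration of the distinction above, here is a small Python
sketch that separates assertions among concepts from assertions among noun
phrases by checking for the "concept:" prefix, and that inverts a
many-to-many mapping between noun phrases and concepts. The triple layout,
the sample values, the concept identifiers, and the function names are all
made up for this example; they are not NELL's actual file format or code.

```python
from collections import defaultdict

CONCEPT_PREFIX = "concept:"


def is_concept_assertion(triple):
    """Return True if any element of the triple carries the concept prefix."""
    return any(part.startswith(CONCEPT_PREFIX) for part in triple)


def split_assertions(triples):
    """Split triples into (concept-level, noun-phrase-level) lists."""
    concept_level, noun_phrase_level = [], []
    for triple in triples:
        if is_concept_assertion(triple):
            concept_level.append(triple)
        else:
            noun_phrase_level.append(triple)
    return concept_level, noun_phrase_level


def invert_mapping(np_to_concepts):
    """Turn a noun phrase -> concepts mapping into concept -> noun phrases."""
    concept_to_nps = defaultdict(set)
    for noun_phrase, concepts in np_to_concepts.items():
        for concept in concepts:
            concept_to_nps[concept].add(noun_phrase)
    return dict(concept_to_nps)


# Hypothetical sample data, for illustration only.
triples = [
    ("apple", "generalizations", "fruit"),
    ("concept:fruit:apple", "generalizations", "concept:fruit"),
    ("apple inc.", "generalizations", "company"),
    ("concept:company:apple", "generalizations", "concept:company"),
]

# One noun phrase may map to several concepts, and one concept may be
# named by several noun phrases.
np_to_concepts = {
    "apple": {"concept:fruit:apple", "concept:company:apple"},
    "apple inc.": {"concept:company:apple"},
}

concept_level, noun_phrase_level = split_assertions(triples)
concept_to_nps = invert_mapping(np_to_concepts)
```

Here "apple" points at two concepts, while the company concept is reachable
from both "apple" and "apple inc.", which is the many-to-many situation
described above.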
> Also, I was wondering about the meaning of few relations declared in the
> ontology, which are :
>
> - populate
> - instancetype
> - domainwithinrange
> - rangewithindomain
> - visible
populate is a flag that controls whether or not that relation should be
learned. Some relations exist mainly to group other relations under a
common parent. The personHasResidenceInLocation relation is not learned,
for instance, because we are more interested in learning its child
relations to do with residence in a particular city, a particular state,
and a particular country.
instanceType is a constraint for noun phrases, and was originally meant to
restrict some categories to only proper nouns or only common nouns. We
eventually decided against enforcing this strictly, and instanceType is
mainly used these days for very specific restrictions, like recognizing
noun phrases that are URLs, dates, or numbers.
domainWithinRange and rangeWithinDomain are our first shot at constraining
certain relations that wound up being too general, like
animalIsTypeOfAnimal. The domain and range here are both animal, but that
is actually too broad because if we know one argument to be an insect,
then we know the other argument can't be a non-insect. By requiring the
domain to be equal to or more specific than the range, or vice versa,
the domain and range constraints become tight enough to prevent a vast
number of common mistakes.
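To make the idea concrete, here is a minimal Python sketch of one way to
read these two constraints: the category known for one argument must be
equal to, or a descendant of, the category known for the other. The tiny
category hierarchy and the function names are assumptions invented for
this example and do not reflect NELL's real ontology or implementation.

```python
# Hypothetical child -> parent category hierarchy (not NELL's real ontology).
PARENT = {
    "insect": "animal",
    "mammal": "animal",
    "animal": None,
}


def is_same_or_descendant(category, ancestor):
    """Return True if category equals ancestor or lies below it in PARENT."""
    while category is not None:
        if category == ancestor:
            return True
        category = PARENT.get(category)
    return False


def satisfies_domain_within_range(domain_cat, range_cat):
    """Domain category must be equal to or more specific than the range category."""
    return is_same_or_descendant(domain_cat, range_cat)


def satisfies_range_within_domain(domain_cat, range_cat):
    """Range category must be equal to or more specific than the domain category."""
    return is_same_or_descendant(range_cat, domain_cat)


# An insect can be a type of animal, but not a type of mammal.
insect_animal_ok = satisfies_domain_within_range("insect", "animal")
insect_mammal_ok = satisfies_domain_within_range("insect", "mammal")
```

Under this reading, a candidate pairing an insect with a mammal is rejected
even though both arguments satisfy the plain "animal" domain and range.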
visible is a flag used on a few very general predicates that we, as
researchers, are curious to observe, but that are difficult to interpret
or that don't make sense for NELL to tweet about.
If there's anything else I can try to explain, please don't hesitate to
ask!
bki...@cs.cmu.edu