Hi Arnab,
Nominally, the "every belief in the KB" link from our resources page at
http://rtw.ml.cmu.edu/rtw/resources contains source information in the
"Candidate Source" column, although the formatting is irregular and this
can be a difficult column to parse.
Most of NELL's evidence comes from two of its learning subcomponents, SEAL
and CPL. SEAL will provide a list of URLs for the facts that it proposes.
CPL will provide a list of textual extraction patterns (e.g. it might
provide a pattern like "mayor of _" for a noun phrase believed to be a
city).
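To make the CPL example concrete, here is a rough sketch of how a pattern like "mayor of _" could be matched against text to pull out candidate noun phrases. This is only an illustration, not NELL's actual code: the function names are made up, and approximating a noun phrase as a run of capitalized words is an assumption of the sketch.

```python
import re

def pattern_to_regex(pattern):
    """Compile a pattern such as 'mayor of _', where '_' marks the noun-phrase slot."""
    before, after = pattern.split("_", 1)
    # Assumption: a noun phrase is approximated as one or more capitalized words.
    slot = r"((?:[A-Z][\w-]*)(?:\s+[A-Z][\w-]*)*)"
    return re.compile(re.escape(before) + slot + re.escape(after))

def extract_candidates(pattern, sentences):
    """Return every noun phrase that fills the pattern's slot in the sentences."""
    regex = pattern_to_regex(pattern)
    found = []
    for sentence in sentences:
        found.extend(match.group(1) for match in regex.finditer(sentence))
    return found

# Made-up sentences, for illustration only.
sentences = [
    "She was elected mayor of Pittsburgh in 2005.",
    "The mayor of New York spoke yesterday.",
    "He is the mayor of the town.",
]
candidates = extract_candidates("mayor of _", sentences)
print(candidates)
```

The third sentence yields nothing because "the town" does not look like a capitalized noun phrase under this simplification.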
The next two most common kinds of source information are the top-N
features from linear models that match a given fact. In the case
of category instances, this comes from a subcomponent called CMC that uses
a feature space constructed from orthographical features of noun phrases,
like prefixes, suffixes, word length, and patterns of capitalization. In
the case of relation instances, this comes from a subcomponent called PRA
that is somewhat like a first-order logic rule learner in that each of the
features in its feature space is a chain of relations connecting the two
category instances that are the arguments to the relation instance in
question.
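As a rough illustration of the two feature spaces described above, the sketch below computes orthographic features for a noun phrase (in the spirit of CMC) and enumerates relation chains connecting two instances (in the spirit of PRA). It is not the code NELL runs: the feature names, the relation names, the `^-1` notation for traversing an edge backwards, and the length limits are all assumptions made for the example.

```python
from collections import defaultdict

def orthographic_features(noun_phrase, max_affix=3):
    """Prefixes, suffixes, lengths, and a capitalization shape for a noun phrase."""
    feats = set()
    words = noun_phrase.split()
    for word in words:
        lower = word.lower()
        for n in range(1, min(max_affix, len(lower)) + 1):
            feats.add("prefix=" + lower[:n])
            feats.add("suffix=" + lower[-n:])
    feats.add("num_words=%d" % len(words))
    feats.add("length=%d" % len(noun_phrase))
    # Shape: 'A' for uppercase, 'a' for lowercase, '9' for digits, others kept.
    shape = "".join(
        "A" if c.isupper() else "a" if c.islower() else "9" if c.isdigit() else c
        for c in noun_phrase
    )
    feats.add("shape=" + shape)
    return feats

def path_features(triples, source, target, max_len=2):
    """Relation chains of length <= max_len that lead from source to target."""
    edges = defaultdict(list)
    for subj, rel, obj in triples:
        edges[subj].append((rel, obj))
        edges[obj].append((rel + "^-1", subj))  # allow backwards traversal
    paths = set()
    frontier = [(source, ())]
    for _ in range(max_len):
        next_frontier = []
        for node, chain in frontier:
            for rel, neighbor in edges[node]:
                new_chain = chain + (rel,)
                if neighbor == target:
                    paths.add(new_chain)
                next_frontier.append((neighbor, new_chain))
        frontier = next_frontier
    return paths

feats = orthographic_features("New York")

# Hypothetical triples; the relation names are invented for this example.
triples = [
    ("Steelers", "teamHomeStadium", "Heinz Field"),
    ("Heinz Field", "stadiumLocatedInCity", "Pittsburgh"),
]
paths = path_features(triples, "Steelers", "Pittsburgh")
print(sorted(feats))
print(paths)
```

In a setup like this, each chain returned by `path_features` would be one feature for a candidate relation instance between "Steelers" and "Pittsburgh", and a linear model would weight those features.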
Then there are some less-common sources, like seeds, human feedback, a
FOIL rule learner that we used to run, a component that attempts to match
category instances against geolocation databases, and a component that
attempts to match category instances against Wikipedia pages.
Are you interested in any of these sources in particular? Depending on
what you're looking for, I might be able to generate a file that would be
easier to process than the "every belief in the KB" file.
bki...@cs.cmu.edu
On Mon, 29 Apr 2013, Arnab Dutta wrote:
> Hi all,
> I am currently working with the NELL data set. Is it possible to get the
> source page from which a fact was extracted? That page has the snippet
> that makes NELL believe the fact to be true to some extent. It would be
> nice if the source page information could be readily retrieved somehow.
> Any ideas or pointers?
>
> Thanks
>