At a glance, I see three recentish bursts of activity for categories.
One is at iteration 883, which directly coincides with a bug fix to the
CPL sublearner (a technical thing to do with the queue of novel beliefs
getting clogged with what should have been going to the queue of belief
Second, there is a brief spike at 1018. This I hadn't noticed before.
It appears what set this off was a massive one-off culling from the KB of
beliefs that were no longer supported by available evidence but which had
no natural way of being reassessed and deleted automatically.  I'm
surprised the spike is as big as it was, but it's consistent with the kind
of shift in NELL's "worldview" we see when making sudden
sufficiently-large changes to NELL's training data; in this case it would
have been the effect that the content of the KB itself has on generating
training data for the various sublearners.
But more likely, you're talking about the third one at 1036. At this
point, we roughly doubled the size of the NP-context matrices used by
CPL to find statistically interesting patterns in text. Previously, we
had been using the 500M English web pages from the ClueWeb09 corpus. As
of 1036, we began using the union of that and the ClueWeb12 corpus.
My overall take on things is that NELL's rate of learning in the long run
tends to taper off. There are only so many countries to learn. If the
set of text NELL reads from is relatively static, then there are only so
many presidents, celebrities, or athletes to learn. A year or two ago,
the size of NELL's KB was clearly starting to stagnate. My opinion is
that giving NELL new predicates to learn and new text to read was key to
resuming what turned out to be sort-of linear growth over time overall
when you plot it out.
The heat map for relations turns out to be almost uselessly
misrepresentative. It is true that if you were to browse the KB you'd see
that most promoted relations are indicated to have been promoted within
the last handful of iterations. But that's a byproduct of a bunch of
inconvenient technicalities.  One significant cause of this, which can
be seen on the category heat map as well, is that NELL is quite capable of
deciding to disbelieve something it had previously decided to believe.
And it's fully capable of changing its mind again.  So maybe X is a
person today, but not tomorrow, but then it is a person again some time
next week. This sort of behavior accounts for all the zeros early on in
the category heat map; if you were to look at the heat map from iteration
100, you'd see much higher numbers there. It's just that our definition
of "when" NELL decided to believe something is "the most recent time" NELL
decided to believe something. There isn't enough tracking information in
the KB itself to notice that this is something that NELL had believed in
the past. Relations suffer from additional complications that we should
probably at some point look into how to handle more effectively.
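
To make the "most recent time" effect concrete, here is a minimal sketch in
Python. It is not NELL's code: the event log, the belief strings, and the
promotion_histograms helper are all made up for illustration, and the KB
itself does not keep this kind of history. It just shows how binning
currently held beliefs by their latest promotion empties out the early
iterations compared with binning by their first promotion.

```python
from collections import Counter

# Hypothetical event log (iteration, belief, action); not real NELL data.
events = [
    (100, "person(X)", "promote"),
    (100, "country(Y)", "promote"),
    (400, "person(X)", "demote"),    # NELL changes its mind about X
    (900, "person(X)", "promote"),   # ...and changes it back again
    (950, "person(Z)", "promote"),
]

def promotion_histograms(events, bin_size=500):
    """Count currently believed beliefs per iteration bin, two ways."""
    first_seen = {}   # iteration of the first promotion of each belief
    last_seen = {}    # iteration of the most recent promotion
    believed = set()  # beliefs held after replaying the whole log
    for iteration, belief, action in sorted(events):
        if action == "promote":
            first_seen.setdefault(belief, iteration)
            last_seen[belief] = iteration
            believed.add(belief)
        elif action == "demote":
            believed.discard(belief)
    # Bin key is the lower edge of the bin, e.g. 0 or 500.
    by_first = Counter(first_seen[b] // bin_size * bin_size for b in believed)
    by_last = Counter(last_seen[b] // bin_size * bin_size for b in believed)
    return by_first, by_last

by_first, by_last = promotion_histograms(events)
print("binned by first promotion: ", dict(by_first))
print("binned by latest promotion:", dict(by_last))
```

In this toy log, person(X) moves out of the earliest bin as soon as it is
re-promoted at 900, even though NELL first believed it at 100. Getting the
"first promotion" view would require keeping that history around, which is
exactly the tracking information the KB lacks.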