The difference between observation and truth.

Dr. Shashi Ramakrishna

Jul 16, 2019, 3:05:46 AM
to PSL Users
I'm trying to understand the difference between the concepts 'observation' and 'truth'. Consider the 'simple-acquaintances' example.

The 'knows-truth' file has something like this:

Elena Steve 1
Elena Jay 1
Elena Ben 1
Elena Alex 0
Elena Arti 0
Elena Dhanya 0

We have observation files for all the predicates
[KNOWS]

Ben Elena
Ben Dhanya
Arti Alex
Sabina Dhanya

[LIKES]
Jay Trivia 1
Steve Trivia 0.8
Sabina Trivia 0.6
Steve Sports 1
Ben Sports 0.8

[LIVED]
Jay New Jersey
Jay Pennsylvania
Jay Maryland
Jay California


  1. Why do some observations have weights?
  2. Is truth used only for training (learning the weights)?
  3. Why are truth values only binary in nature?
  4. What is the difference between 'observation' and 'truth'?
-
Shashi

Eriq Augustine

Jul 22, 2019, 1:59:06 PM
to Dr. Shashi Ramakrishna, PSL Users
Hey Shashi,

1. Why do some observations have weights?

In PSL, we use the term "weight" when talking about a rule.
We call the values associated with ground atoms in the data files "truth values".

If a file has no truth values, then the truth values for all the atoms in that file are assumed to be 1.0.
Because of the closed-world assumption, atoms that do not appear at all get a value of 0.0.
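
To make the two defaults concrete, here is a minimal Python sketch. It is not PSL's own loader: the helper names `load_atoms` and `truth_value` are hypothetical, and it assumes tab-separated data lines with an optional trailing truth-value column.

```python
def load_atoms(lines, arity):
    """Parse tab-separated atom lines into {argument tuple: truth value}.

    Hypothetical helper for illustration only, not part of PSL.
    """
    atoms = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        args = tuple(fields[:arity])
        # An extra trailing column is the truth value; without one, assume 1.0.
        value = float(fields[arity]) if len(fields) > arity else 1.0
        atoms[args] = value
    return atoms


def truth_value(atoms, args):
    """Closed-world lookup: atoms that never appear get 0.0."""
    return atoms.get(tuple(args), 0.0)


# Rows taken from the LIKES and LIVED listings in the question.
likes = load_atoms(["Jay\tTrivia\t1", "Steve\tTrivia\t0.8"], arity=2)
lived = load_atoms(["Jay\tNew Jersey", "Jay\tMaryland"], arity=2)
```

With this sketch, `truth_value(lived, ("Jay", "New Jersey"))` gives 1.0 because the file has no value column, and `truth_value(likes, ("Ben", "Trivia"))` gives 0.0 because that atom is absent.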

2. Is truth used only for training (learning the weights)?

No, it is also used for evaluation.
But these are the only two places it is normally used (it is not used in normal inference).

3. Why are truth values only binary in nature?

They are not.
In this particular example (simple acquaintances), we are trying to determine if people know each other, which is an inherently binary attribute.
However, in other examples and problems, we may be trying to infer a continuous target.
For example, in the preference prediction example we are trying to determine the rating that a user will give a joke.
This is naturally a continuous prediction.

4. What is the difference between 'observation' and 'truth'?

Both of these modifiers imply read-only data, data that will not get a truth value predicted for it.
However, as in more classical IID approaches, truth data is not used during prediction.
Truth data is our held-out data.
It is vital that truth data (known labels) never leaks into inference.
So there is no inherent difference in the data itself, only in how we use it.
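
One simple way to guard against leakage is to check that no atom appears in both the observations and the truth for the same predicate. The Python sketch below does that for the KNOWS data quoted in the question. The function name `leaked_atoms` is hypothetical, and this is a sanity check you could run yourself, not something PSL is claimed to do.

```python
def leaked_atoms(observations, truth):
    """Return atoms that appear in both the observed and the truth data."""
    return set(observations) & set(truth)


# KNOWS observations from the question (no value column, so each is 1.0).
knows_obs = {
    ("Ben", "Elena"),
    ("Ben", "Dhanya"),
    ("Arti", "Alex"),
    ("Sabina", "Dhanya"),
}

# KNOWS truth from the question: held-out labels, never given to inference.
knows_truth = {
    ("Elena", "Steve"): 1.0,
    ("Elena", "Jay"): 1.0,
    ("Elena", "Ben"): 1.0,
    ("Elena", "Alex"): 0.0,
    ("Elena", "Arti"): 0.0,
    ("Elena", "Dhanya"): 0.0,
}

leaks = leaked_atoms(knows_obs, knows_truth)
```

Here `leaks` is empty: `("Ben", "Elena")` is observed while `("Elena", "Ben")` is truth, and these are different ground atoms because the argument order differs.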

Note that for weight learning, we will hold out even more data, a validation set.
So if we were doing 10-fold cross-validation, instead of holding out one fold (like in IID methods), we would hold out two.
One fold would be used to learn weights and the final fold would be used for evaluation/scoring.
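
The fold bookkeeping can be sketched in Python as follows. The function `split_roles` is hypothetical, and the choice of which neighbouring fold is used for weight learning is an assumption made only for illustration; the point is just that each split holds out two folds and observes the remaining eight.

```python
def split_roles(k):
    """Assign fold roles for each of k cross-validation splits.

    Assumption for illustration: fold i is held out for evaluation and the
    next fold (wrapping around) is held out for weight learning.
    """
    splits = []
    for i in range(k):
        eval_fold = i
        learn_fold = (i + 1) % k
        observed = [f for f in range(k) if f not in (eval_fold, learn_fold)]
        splits.append({"observed": observed, "learn": learn_fold, "eval": eval_fold})
    return splits


splits = split_roles(10)
```

For `k = 10`, every entry in `splits` has eight observed folds, one fold for learning weights, and one fold for evaluation.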

-eriq
