General Feedback Lab06


AKBC2022

Jun 8, 2022, 3:11:20 PM

to AKBC2022

Hi everyone,

Here are some remarks and comments on last week's submissions.

- Everyone beat the baseline! Well done!
- We have a new best score for the lab! Well done!
- The hidden set had 1599 sentences, with 3.2 ground-truth (GT) extractions per sentence on average. The minimum was 1 extraction per sentence, so every sentence had at least one SPO extraction.
- The best solution extracted an average of 2.3 triples per sentence.

- In general, precision scores were much higher than recall scores.

- Highest recall: 0.82

- Highest precision: 0.89

Notes:
1. If you got a low score -> check your output. Problems we saw:
- the submission contained the same value for verb and subject
- the submission used `!verb` to denote negation, which the benchmark evaluation cannot handle
- the submission had whitespace for subjects/objects
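
The three problems above can be caught before submitting with a small check. This is only a sketch: the function name `find_problems` is made up for illustration, and it assumes your extractions are in the skeleton's `{verb: {'subject': ..., 'object': ...}}` shape.

```python
def find_problems(verbs):
    """Return human-readable descriptions of suspicious entries.

    Assumes the shape {verb: {'subject': str, 'object': str}}.
    """
    problems = []
    for verb, args in verbs.items():
        subject = args.get("subject", "")
        obj = args.get("object", "")
        if subject == verb:
            problems.append(f"{verb!r}: subject is identical to the verb")
        if verb.startswith("!"):
            problems.append(f"{verb!r}: '!' negation marker in the verb")
        for role, value in (("subject", subject), ("object", obj)):
            # Catches both empty strings and strings made only of whitespace.
            if not value.strip():
                problems.append(f"{verb!r}: {role} is empty or whitespace only")
    return problems


if __name__ == "__main__":
    suspicious = {
        "ran": {"subject": "ran", "object": " "},
        "!ate": {"subject": "He", "object": "cake"},
    }
    for line in find_problems(suspicious):
        print(line)
```

An empty list means none of these three issues was found; it says nothing about whether the extractions are correct.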

2. The current skeleton code, without any changes to the formatting, cannot return multiple SPO triples for the same verb, because the triples are stored in a dictionary keyed on the predicate verb.
You could improve your recall by
- appending an ordering suffix to repeated keys
For the sentence: The French and the Portuguese captured the islands.

verbs = {
    'captured': {'subject': 'The French', 'object': 'the islands'},
    'captured__1': {'subject': 'The Portuguese', 'object': 'the islands'},
}

When formatting, take care to strip the ordering suffix from the key (`key.split('__')[0]`) to recover the verb.

- modifying the returned dict to store a list of subject/object pairs per verb:
verbs = {
    'captured': [
        {'subject': 'The French', 'object': 'the islands'},
        {'subject': 'The Portuguese', 'object': 'the islands'},
    ]
}
Here again, the formatting code would have to be slightly modified.
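
Both storage options can be sketched side by side. The helper names below are hypothetical, and the `(subject, verb, object)` tuple output is only an assumption; adapt the last step to whatever the skeleton's formatting code actually expects.

```python
def add_with_suffix(verbs, verb, subject, obj):
    """Option 1: one entry per key; repeats become verb__1, verb__2, ..."""
    key, n = verb, 0
    while key in verbs:
        n += 1
        key = f"{verb}__{n}"
    verbs[key] = {"subject": subject, "object": obj}


def triples_from_suffixed(verbs):
    # Strip the ordering suffix so that the key is the plain verb again.
    return [(a["subject"], key.split("__")[0], a["object"])
            for key, a in verbs.items()]


def add_to_list(verbs, verb, subject, obj):
    """Option 2: each verb maps to a list of subject/object dicts."""
    verbs.setdefault(verb, []).append({"subject": subject, "object": obj})


def triples_from_lists(verbs):
    return [(a["subject"], verb, a["object"])
            for verb, extractions in verbs.items()
            for a in extractions]


if __name__ == "__main__":
    suffixed, listed = {}, {}
    for subject in ("The French", "The Portuguese"):
        add_with_suffix(suffixed, "captured", subject, "the islands")
        add_to_list(listed, "captured", subject, "the islands")
    print(triples_from_suffixed(suffixed))
    print(triples_from_lists(listed))
```

Both options yield the same triples. The suffix option would misbehave for a verb token that itself contains `__`, which the list option avoids.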

Sample approaches used:

1. Find the verb occurrences using POS tags; the tokens up to the verb are the subject and the tokens following the verb are the object.
Edge cases:
- the sentence begins with a verb
- compound sentences: the object/subject may become very long
- may not work well for object-verb-subject order (e.g. passive voice), where the semantics may be lost:
"James Cameron directed Avatar ." -> 'directed': {'subject': 'James Cameron', 'object': 'Avatar'}
"Avatar was directed by James Cameron ." -> 'directed': {'subject': 'Avatar', 'object': 'James Cameron'}
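
To make the passive-voice problem concrete, here is a minimal sketch of this approach. It is not any particular submission: it takes pre-tagged `(token, POS)` pairs as input (the tag names `VERB`, `AUX`, `ADP`, `PUNCT` are assumed to follow the universal POS tag set), handles only the first verb, and the cleanup of auxiliaries, punctuation and a leading preposition is an extra assumption.

```python
SKIP = {"AUX", "PUNCT"}  # tags dropped from the subject/object spans


def naive_spo(tagged):
    """Split a tagged sentence around its first verb.

    tagged: list of (token, pos) pairs. Returns {verb: {'subject', 'object'}}.
    """
    verb_positions = [i for i, (_, pos) in enumerate(tagged) if pos == "VERB"]
    if not verb_positions:
        return {}
    i = verb_positions[0]
    before = [tok for tok, pos in tagged[:i] if pos not in SKIP]
    after = [(tok, pos) for tok, pos in tagged[i + 1:] if pos not in SKIP]
    if after and after[0][1] == "ADP":  # drop a leading preposition such as "by"
        after = after[1:]
    return {tagged[i][0]: {"subject": " ".join(before),
                           "object": " ".join(tok for tok, _ in after)}}


if __name__ == "__main__":
    active = [("James", "PROPN"), ("Cameron", "PROPN"), ("directed", "VERB"),
              ("Avatar", "PROPN"), (".", "PUNCT")]
    passive = [("Avatar", "PROPN"), ("was", "AUX"), ("directed", "VERB"),
               ("by", "ADP"), ("James", "PROPN"), ("Cameron", "PROPN"),
               (".", "PUNCT")]
    print(naive_spo(active))
    print(naive_spo(passive))
```

The passive sentence comes out with 'Avatar' as subject and 'James Cameron' as object, i.e. the roles are swapped relative to the active sentence, which is exactly the loss of semantics noted above.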

2. Baseline: submitting the provided baseline unchanged is not enough to pass. If you fall into this category, you have not been awarded a Pass.

3. Using POS tags and a dependency parser
- For each verb, determine whether it is active (s,p,o) or passive (o,p,s)
- Account for conjunctions
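
A rough sketch of this idea follows. To stay self-contained it does not call a parser: the `Token` record is hypothetical and the parse is written out by hand. The dependency labels (`nsubj`, `nsubjpass`, `dobj`, `agent`, `pobj`, `conj`, `cc`) are an assumption about what your parser emits, so map them to your parser's label set.

```python
from collections import namedtuple

# Hypothetical token record; head is the index of the head token
# (the root points at itself).
Token = namedtuple("Token", "text pos dep head")


def children(tokens, i, deps=None):
    return [j for j, t in enumerate(tokens)
            if t.head == i and j != i and (deps is None or t.dep in deps)]


def phrase(tokens, i):
    """Text of the subtree under token i, leaving out coordinated parts."""
    keep, stack = [i], [i]
    while stack:
        cur = stack.pop()
        for j in children(tokens, cur):
            if tokens[j].dep not in ("conj", "cc"):
                keep.append(j)
                stack.append(j)
    return " ".join(tokens[j].text for j in sorted(keep))


def with_conjuncts(tokens, i):
    """Token i plus every token chained to it through 'conj'."""
    out = [i]
    for j in children(tokens, i, {"conj"}):
        out.extend(with_conjuncts(tokens, j))
    return out


def extract(tokens):
    triples = []
    for v, tok in enumerate(tokens):
        if tok.pos != "VERB":
            continue
        if children(tokens, v, {"nsubjpass"}):
            # Passive: the grammatical subject is the semantic object,
            # and the "by ..." agent is the semantic subject.
            objects = children(tokens, v, {"nsubjpass"})
            subjects = [p for a in children(tokens, v, {"agent"})
                        for p in children(tokens, a, {"pobj"})]
        else:
            subjects = children(tokens, v, {"nsubj"})
            objects = children(tokens, v, {"dobj"})
        subjects = [k for s in subjects for k in with_conjuncts(tokens, s)]
        objects = [k for o in objects for k in with_conjuncts(tokens, o)]
        for s in subjects:
            for o in objects:
                triples.append((phrase(tokens, s), tok.text, phrase(tokens, o)))
    return triples


if __name__ == "__main__":
    passive = [Token("Avatar", "PROPN", "nsubjpass", 2),
               Token("was", "AUX", "auxpass", 2),
               Token("directed", "VERB", "ROOT", 2),
               Token("by", "ADP", "agent", 2),
               Token("James", "PROPN", "compound", 5),
               Token("Cameron", "PROPN", "pobj", 3),
               Token(".", "PUNCT", "punct", 2)]
    print(extract(passive))
```

With this hand-written parse, the passive sentence yields James Cameron as the subject, and coordinated subjects or objects each produce their own triple. Real parser output is noisier, so treat the rules as a starting point.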

4. Improving upon the baseline:
- The baseline only returns single tokens for subject, predicate and object -> include phrases where present
- Handle complex statements, where the direct head of the subject/object may not be a VERB

Hope this helps.

Shrestha
