question about the ground truth pair like `www.ebay.com//53562,www.ebay.com//53718,0`


Jun

Mar 12, 2020, 3:29:03 PM
to SIGMOD 2020 Contest Google Group
Hi:
I have a question about the ground truth samples. For example, the pair `www.ebay.com//53562,www.ebay.com//53718,0` is marked as negative in the ground truth.
The details are:
```
{'<page title>': 'Near Mint Nikon D90 Black 35mm 130mm Set from Japan 018208254125 | eBay', 'brand': 'Nikon', 'bundled items': 'Extra Battery Charger, Lens, Strap (Neck or Wrist)', 'country/region of manufacture': 'Japan', 'megapixels': '10.2 MP', 'model': 'D80', 'mpn': '254122176', 'screen size': '2.5"', 'type': 'Digital SLR', 'upc': '018208254125'}
```

```
{'<page title>': 'Nikon D80 10 2 MP Digital SLR Camera Black Body Only 018208254125 | eBay', 'brand': 'Nikon', 'condition': 'Used: An item that has been used previously. The item may have some signs of cosmetic wear, but is fully\noperational and functions as intended. This item may be a floor model or store return that has been used. See the seller’s listing for full details and description of any imperfections.\nSee all condition definitions- opens in a new window or tab\n... Read moreabout the condition', 'megapixels': '10.2 MP', 'model': 'D80', 'mpn': '254122176', 'screen size': '2.5"', 'type': 'Digital SLR', 'upc': '018208254125'}

```

At first glance, the title of `53562` says it is a D90, while `53718` says it is a D80. However, the `model` field of `53562` also says `D80`, and on closer inspection both records have the same `upc` barcode value, which an online database resolves to a `D80`.
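The check described above can be sketched in a few lines of Python. This is only an illustration: the helper names (`title_models`, `is_inconsistent`, `same_upc`) and the regular expression for Nikon D-series model tokens are assumptions, not part of the contest tooling, and the two dictionaries are trimmed to the fields that matter here.

```python
import re

# Assumed pattern: Nikon D-series model tokens such as "D80" or "D90".
MODEL_RE = re.compile(r"\bD\d{2,4}\b", re.IGNORECASE)


def title_models(spec):
    """Return the set of model tokens found in the '<page title>' field."""
    return {m.upper() for m in MODEL_RE.findall(spec.get("<page title>", ""))}


def is_inconsistent(spec):
    """True when the 'model' attribute is absent from the models named in the title."""
    model = spec.get("model", "").upper()
    in_title = title_models(spec)
    return bool(model) and bool(in_title) and model not in in_title


def same_upc(a, b):
    """True when both specs carry a 'upc' value and the values are equal."""
    upc_a, upc_b = a.get("upc"), b.get("upc")
    return upc_a is not None and upc_a == upc_b


# Trimmed copies of the two records quoted above.
spec_53562 = {
    "<page title>": "Near Mint Nikon D90 Black 35mm 130mm Set from Japan 018208254125 | eBay",
    "model": "D80",
    "upc": "018208254125",
}
spec_53718 = {
    "<page title>": "Nikon D80 10 2 MP Digital SLR Camera Black Body Only 018208254125 | eBay",
    "model": "D80",
    "upc": "018208254125",
}

print(is_inconsistent(spec_53562))       # title and 'model' disagree
print(is_inconsistent(spec_53718))       # title and 'model' agree
print(same_upc(spec_53562, spec_53718))  # both share one UPC
```

Only `53562` is flagged as internally inconsistent, even though the two records share a UPC, which is exactly the conflict the question is about.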

So in this situation, which information should we trust? Or can you explain why the ground truth marks this pair as negative?

Thanks a lot.

alaska benchmark

Mar 14, 2020, 2:31:03 PM
to SIGMOD 2020 Contest Google Group
Hi Jun,

The JSON you mention does indeed contain inconsistent values: the title and the other specs disagree.

We trusted the information given in the `<page title>` attribute when producing the labelled dataset. However, we acknowledge that one may trust the other specs for equally valid reasons.

For the sake of a fair comparison, the evaluation process accepts both solutions for this JSON (either D90, consistent with the labelled dataset, or D80).
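One way to picture "accepting both solutions" is a scorer that leaves ambiguous pairs out of the count, so that predicting either label for them costs nothing. The sketch below is a guess at such a scheme, not the contest's evaluation code: the function name `f1_with_ambiguous`, the use of F1 as the metric, the pair-level treatment of ambiguity, and the `site//a` / `site//b` identifiers are all assumptions made for illustration.

```python
def f1_with_ambiguous(predicted, positives, ambiguous):
    """F1 over matching pairs, ignoring pairs flagged as ambiguous.

    Each argument is an iterable of (id_a, id_b) tuples. Pair order is
    normalised so (a, b) and (b, a) count as the same pair.
    """
    def norm(pair):
        return tuple(sorted(pair))

    skip = {norm(p) for p in ambiguous}
    pred = {norm(p) for p in predicted} - skip
    gold = {norm(p) for p in positives} - skip

    true_pos = len(pred & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labelled positive pair, plus the pair discussed in this thread.
positives = [("site//a", "site//b")]
ambiguous = [("www.ebay.com//53562", "www.ebay.com//53718")]

# One submission treats the ambiguous pair as a match, the other does not.
as_match = positives + ambiguous
as_non_match = list(positives)

print(f1_with_ambiguous(as_match, positives, ambiguous))
print(f1_with_ambiguous(as_non_match, positives, ambiguous))
```

Under these assumptions both submissions receive the same score, because the disputed pair is excluded from both the predicted and the gold sets before precision and recall are computed.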

Best regards,
Andrea - Programming Contest Co-Chair