Reproducing the Baseline experiments on the Gold Standards

Neiman Tal

Mar 10, 2020, 12:57:14 PM

to Web Data Commons

Hello,

I'm interested in using the WDC dataset for academic research.

I would like to replicate the Baseline Experiments on the four categories (Computers, Cameras, Watches, Shoes).

Are there any resources available for that purpose on top of the data files?

Things that come to mind: scripts to process the files, scripts to train the models, the trained fastText model, etc.

Thanks,

Tal

Ralph Peeters

Mar 11, 2020, 9:20:03 AM

to Web Data Commons

Hello Tal,

I am currently working on releasing the relevant code and the fastText model. You will be able to find the link to the GitHub repo on the WDC LSPC v2 website by next Monday at the latest!

Cheers,

Ralph

Neiman Tal

Mar 11, 2020, 10:47:55 AM

to Web Data Commons

Hi Ralph,

Thank you. Looking forward to using it.

Tal

Neiman Tal

Apr 21, 2020, 12:17:30 PM

to Web Data Commons

Hi Ralph,

Thanks for publishing the repo.

Is it possible to publish the datasets with deepmatcher scores?

It would be very useful for me and potentially for others who are interested in comparing other models against it.

Regards,

Tal

Ralph Peeters

Apr 24, 2020, 5:34:23 AM

to Web Data Commons

Hi Tal,

What exactly do you mean by datasets with deepmatcher scores? Do you mean the gold standard datasets with an additional column showing the prediction score of the deepmatcher model?

Cheers,

Ralph

Neiman Tal

May 4, 2020, 9:18:00 AM

to Web Data Commons

Yes, exactly.