Processing WDC data sets


Rosa Navarrete

Jun 5, 2016, 10:20:00 PM
to Web Data Commons
Dear Robert,

Thank you for the WDC Data Sets (November 2015). I need to process each file extracted from the compressed (gzip) archives. Is it possible that the fourth term of the last n-quad in one file is the same as the fourth term of the first n-quad in the next file? In other words, can the n-quads of a single web page continue across different files?

Thanks in advance for your response.
Rosa

Robert Meusel

Jun 6, 2016, 4:17:32 AM
to Web Data Commons
Dear Rosa,

Yes, this might happen.

Cheers,
Robert
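
Since a page can continue in the next file, one way to process the data is to read the files as a single stream and group n-quads by their fourth term. The following is only a minimal sketch, not an official WDC tool. It assumes that the quads of one page are contiguous, that the files are passed in their original order, and that each line holds one complete n-quad. The function names (graph_term, iter_quads, iter_pages) are made up for illustration.

```python
import gzip
from itertools import groupby
from typing import Iterable, Iterator, List, Tuple


def graph_term(quad: str) -> str:
    """Return the fourth term (graph label) of one N-Quads statement."""
    body = quad.strip()
    if body.endswith("."):
        body = body[:-1].rstrip()
    # The graph label is an IRI or blank node and contains no spaces,
    # so it is the last whitespace-separated token of the statement.
    return body.rsplit(None, 1)[-1]


def iter_quads(paths: Iterable[str]) -> Iterator[str]:
    """Yield non-empty lines from the gzip files, in the order given."""
    for path in paths:
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            for line in handle:
                if line.strip():
                    yield line.rstrip("\n")


def iter_pages(paths: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Group consecutive quads by graph term, ignoring file boundaries.

    Assumption: all quads of one page are adjacent in the stream.
    """
    for url, quads in groupby(iter_quads(paths), key=graph_term):
        yield url, list(quads)
```

Calling iter_pages with the ordered list of file paths yields one (page URL, quads) pair per page, even when the page's quads end one file and begin the next.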