Hi,
We downloaded the corpus using the same method as last year, but the sizes we obtained differ from those reported by Richard.
Here are the details per topic:
topic  #gpg_docs  size
26     11101      7.4G
27     5377       3.0G
28     12432      3.7G
29     7567       2.4G
30     2111       421M
31     6206       5.6G
32     22807      9.4G
33     15398      8.0G
34     2941       769M
35     13265      4.8G
36     3097       1.6G
37     12350      3.7G
38     2437       1.3G
39     3607       1.1G
40     3866       837M
41     4586       1.9G
42     4593       1.5G
43     5461       2.5G
44     3415       786M
45     2159       632M
46     7590       6.1G
Is there anything wrong with our collection?
Thanks and regards,
Bilel