While benchmark-raindrop.py does load the Enron corpus fine, a big
limitation is that all attachments have been removed. There is a copy
of the corpus available which has attachments, but that version uses
.pst files, which benchmark-raindrop can't currently read.
So depending on what you want to benchmark, Enron might or might not
be the best option.
An option we can consider is to grab an mbox file of Jean Reilly's
account - this should just be a matter of creating an account in
Thunderbird, then copying the 'INBOX' file from the profile. We could
then put the file somewhere semi-public so we all benchmark against the
same data and can compare results meaningfully.
FYI, I can import my Thunderbird account with a command line like:
% benchmark-raindrop.py \
--my-address=mham...@skippinet.com.au \
--my-address=skippy....@gmail.com \
--mailbox=c:\Users\skip\AppData\Roaming\Thunderbird\Profiles\{salt_dir_name}\ImapMail\{imap_acct_name}\INBOX
If this proves useful, it would be fairly easy to have it walk a dir
structure looking for all mbox files (i.e., to import all folders from
all accounts).
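
As a rough idea of what that walk could look like, here is a minimal
sketch. It is not code from benchmark-raindrop.py; the helper names
(looks_like_mbox, find_mbox_files) are made up for illustration. It
assumes an mbox file can be recognised by its leading "From " separator
line, and it skips the .msf index files Thunderbird keeps next to each
folder. Empty folders will be missed by this heuristic.

```python
import os


def looks_like_mbox(path):
    """Heuristic (an assumption): an mbox file starts with a 'From ' line."""
    try:
        with open(path, "rb") as f:
            return f.read(5) == b"From "
    except OSError:
        # Unreadable files are simply treated as "not an mbox".
        return False


def find_mbox_files(root):
    """Yield paths below root that look like mbox files."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # Thunderbird's .msf files are indexes, not mail storage.
            if name.endswith(".msf"):
                continue
            path = os.path.join(dirpath, name)
            if looks_like_mbox(path):
                yield path
```

Each path this yields could then be handed to the importer the same way
--mailbox is used above; whether the script would take several
mailboxes in one run is something we'd still need to add.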
Cheers,
Mark