Discrepancy in number of emails


Sarah Weeks

Apr 22, 2026, 1:14:46 PM
to ePADD User Forum
Hello all, 

Hoping someone can help explain this discrepancy.

When I began the import, ePADD was giving these numbers:
41 folder(s), 26409 message(s), 10847950 KB

This doesn't match up with the number of emails after import.

- 26,409 is the number ePADD said it was importing
- 26,362 is the number of emails the ePADD Report gives (47 fewer than the import number)
- 25,759 is the total number of emails when looking at the emails themselves in ePADD (650 fewer than the import number)

I've looked at the errors in the report, and they are the following:
- 54 messages without To, Cc or Bcc addresses
- 1 message without a From address
- 514 messages with no date
- 241 duplicate attachments
- 749 other errors

I'm not sure whether these errors account for the mismatch between the pre-import number of emails and the actual number in ePADD.

Can anyone help?

Thank you,
Sarah


Sally DeBauche

Apr 22, 2026, 2:06:00 PM
to ePADD User Forum
Hi Sarah,

Thanks for reporting this - that is an odd discrepancy. It might take some analysis of the file(s) that you imported to understand what did not make it into your imported collection and why. Please feel free to reach out to me directly to discuss how you might do this, or, if you are able to share the files with our developer, he could take a look.

Best Wishes,

Sally DeBauche

Digital Archivist

Stanford University Libraries

Stanford, CA 94305-6010

650.313.8044 | deba...@stanford.edu

Pronouns:  she/her


Sarah Weeks

Apr 23, 2026, 1:09:59 PM
to ePADD User Forum
Hi Sally,

Replying here, as I found a big clue, which may help others.

This author had put their email into 41 folders before we at the archive took possession of it. I had taken a screenshot of the corpus' folders as they were being scanned by ePADD, before import, which stated the number of emails in each of the 41 folders. I was then able to go to the Folders tile in ePADD and see which folders had a discrepancy in the number of emails. I then searched the ePADD Report for the titles of those folders. 

Here's an example of what I found. CODING was the name of one of the author's email folders:  
  • Skipping message as it seems to be very long: 788319 chars, while the max size message that will be annotated for display is 100000 chars. Message = C:\Users\sweeks\Desktop\RIS_archive\Saved.sbd\CODING Msg#8316171ddd013a29623727b3060e6b9b2ced98342e92f0c92367bad9bf44a5c0 (Subject:) guessed date Jan 1, 1960

Using Ctrl+F in the report, I found 577 instances of skipped emails:
  • "skipping message as it seems to have very long words" = 504 results
  • "skipping message as it seems to be very long" = 73 results
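
For anyone who would rather not count these by hand, the search can be scripted. This is only a rough sketch: it assumes the report has been saved as plain text with one entry per line, and the function name `count_skips` is made up for illustration. The two phrases are the ones quoted above, matched case-insensitively.

```python
# Phrases as quoted in this thread, lowercased for case-insensitive matching.
SKIP_PHRASES = (
    "skipping message as it seems to have very long words",
    "skipping message as it seems to be very long",
)


def count_skips(report_text: str) -> dict[str, int]:
    """Count the report lines that contain each skip phrase."""
    counts = {phrase: 0 for phrase in SKIP_PHRASES}
    for line in report_text.splitlines():
        lowered = line.lower()
        for phrase in SKIP_PHRASES:
            if phrase in lowered:
                counts[phrase] += 1
                break  # count each line once
    return counts
```

Calling `count_skips` on the saved report text and summing the values should give the same total as the manual search, provided the assumptions about the saved report hold.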

The folder CODING is missing 6 emails, and I found three instances of skipped emails with CODING in the path in the report.
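
The same folder-by-folder comparison could be sketched in code. Everything here is an assumption for illustration: `before` and `after` are hand-typed dictionaries of folder name to email count (from the pre-import screenshot and the Folders tile), the report is saved as plain text, and the folder name is assumed to appear in the path directly before " Msg#", as in the example line above. The function names are hypothetical, not part of ePADD.

```python
import re


def skipped_in_folder(report_text: str, folder: str) -> int:
    """Count skip entries whose path ends in the given folder name."""
    # Require a path separator before the name so CODING does not match DECODING.
    pattern = re.compile(r"[\\/]" + re.escape(folder) + r" Msg#")
    return sum(
        1
        for line in report_text.splitlines()
        if "skipping message" in line.lower() and pattern.search(line)
    )


def reconcile(before: dict[str, int], after: dict[str, int], report_text: str):
    """Map each folder with a shortfall to (emails missing, skips found in report)."""
    rows = {}
    for folder, expected in before.items():
        missing = expected - after.get(folder, 0)
        if missing:
            rows[folder] = (missing, skipped_in_folder(report_text, folder))
    return rows
```

Any folder where the first number is larger than the second still has missing emails that the skip messages do not explain.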

I am missing 650 emails in total. The 577 skipped messages account for most of them, which leaves 73 still missing for reasons unknown. I will follow up with you for further investigation, Sally. Thank you for offering :)

Sarah
Web and Email Archives Coordinator
Washington University