Hi all,
I emailed the list about a specific error message I was seeing in the accession module reports a while ago, and that turned out to be an actual bug that was fixed in the most recent release. So just to make sure I understand what's happening with my imports, I would appreciate any assistance interpreting these other error messages I'm seeing while importing pst files. I did spend some time trying to find explanations for these in the official ePADD documentation and GitHub issues, but didn't find much. I'm sure some (all?) of these are common for ePADD power users, but I'm an infrequent user at best.
"When building the addressbook, name _______________mapped to the following contacts..."
This seems like a non-issue, but just to be sure... I assume it just means that one name/entity is associated with more than one email address and this is something that can be cleaned up during processing (and isn't a sign that data was imported incorrectly or that I need to do anything differently pre-import).
“WARNING: Unable to fetch attachment… The filename, directory name, or volume label syntax is incorrect.”
I think this means that there's a disallowed character somewhere in the filepath or filename. On Windows (which I'm using), the characters reserved in a file name are <, >, :, ", /, \, |, ?, and *. The information provided in the reports and logs only indicates the filepath/name for the mbox files or folders, whose names do not contain any of those characters, so there must be an illegal character in the attachment file name itself?
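As a pre-import check, a minimal Python sketch like the following could list attachment names containing characters Windows rejects. It assumes the mail is available as an mbox file and uses only Python's standard library; the function names are my own and have nothing to do with ePADD's code, and I don't know whether ePADD applies the same rules.

```python
import mailbox
import re
from email.header import decode_header, make_header

# Characters Windows does not allow in a file name, plus ASCII control characters.
WINDOWS_ILLEGAL = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

def decoded_name(raw):
    """Decode RFC 2047 encoded-words (e.g. '=?utf-8?B?...?=') if present."""
    try:
        return str(make_header(decode_header(raw)))
    except (LookupError, UnicodeDecodeError, ValueError):
        return raw  # fall back to the raw header value

def has_illegal_chars(filename):
    """True if the name has a reserved character or ends with a space or period."""
    return bool(WINDOWS_ILLEGAL.search(filename)) or filename.endswith((" ", "."))

def find_bad_attachment_names(mbox_path):
    """Yield (message index, attachment filename) for suspect attachment names."""
    # create=False avoids creating an empty file if the path is mistyped.
    for index, message in enumerate(mailbox.mbox(mbox_path, create=False)):
        for part in message.walk():
            raw = part.get_filename()
            if raw and has_illegal_chars(decoded_name(raw)):
                yield index, decoded_name(raw)
```

Calling `list(find_bad_attachment_names("Inbox.mbox"))` (the path here is a placeholder) would give the positions and names to compare against the attachments named in the ePADD warnings.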
“Unable to decode quoted printable encoded message…”
I assume this is due to improper text encoding (possibly of non-ASCII or non-printable characters?), as the error message does include information like "Message #223 type text/plain; charset=UTF-8." But I can't tell if this means an entire message was not imported, or just a portion of a message, or if there's something I should be doing to fix this before import.
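For reference, in quoted-printable an "=" must be followed by two hexadecimal digits, or sit at the end of a line as a soft line break. A rough Python sketch for spotting sequences that break that rule in a raw message body is below. The names are hypothetical, and this is only my guess at the kind of malformed input a decoder might reject, not a description of what ePADD actually checks.

```python
import re

# Flags an "=" that is NOT followed by two hex digits and is NOT a soft line
# break (an "=" at the end of a line, optionally followed by trailing whitespace).
# Lowercase hex digits are tolerated here, as many decoders accept them.
MALFORMED_QP = re.compile(rb"=(?![0-9A-Fa-f]{2}|[ \t]*\r?\n|[ \t]*$)")

def malformed_qp_offsets(body: bytes):
    """Return byte offsets of '=' characters that are not valid QP escapes."""
    return [match.start() for match in MALFORMED_QP.finditer(body)]
```

An empty list for a flagged message would suggest the problem lies elsewhere, such as the declared charset not matching the decoded bytes.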
"Skipping message as it seems to have very long words...”
Google's AI feature says “In ePADD, the error message "Skipping message as it seems to have very long words" triggers when the software's text parser encounters unusually long character strings, often exceeding memory or indexing buffers. This is typically caused by parsing errors, base64-encoded attachments, or long digital signatures, rather than actual prose.” But I can’t tell where it is getting this from as that info does not appear in any of the sources it cites, nor any of the other search results. Unclear if there is something I should be doing about this or if that's just a limitation of ePADD that means we'll always lose messages with those characteristics.
“Dirty message part, has conflicting message part headers…”
Are these email headers or file headers? Google AI (again, apologies) says this is because of "irregular MIME type structures" (so, file headers, I assume) but I'd like confirmation on that as none of the pages it cites actually confirm this interpretation.
Sarah Newhouse
Curator of Digital and Audiovisual Archives
Science History Institute