Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Creating PST files using Python

2,185 views
Skip to first unread message

Anthony Papillion

unread,
Nov 3, 2015, 2:14:53 PM11/3/15
to
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Does anyone know of a module that allows the wiring of Outlook PST files using Python? I'm working on a project that will require me to migrate 60gb of maildir mail (multiple accounts) to Outlook.

Thanks
Anthony
- --
Sent from my Android device with K-9 Mail. Please excuse my brevity.

-----BEGIN PGP SIGNATURE-----
Version: APG v1.1.1

iQJJBAEBCgAzBQJWOQZSLBxBbnRob255IFBhcGlsbGlvbiA8YW50aG9ueUBjYWp1
bnRlY2hpZS5vcmc+AAoJEAKK33RTsEsVwdgP/1f15VCI0GKkM54ram8oL3Bh2e5s
PqxhS/qe+9pYwYZrt+Dj2EENtpdwVQxIin67WxsruE1hOYcrrZiURiCaav8aL//3
zMiBwtEIHrYZ+0s54rfQTx1eltlET3c9O/Hehh69TtuLWiCYp1fKUW7pJEV2ZnqV
jLoGLeQ1BkXCukVeRVm8W8q7aBycF9jLP/Hg7r7p7g1RUYoIfUCOd0833AAHrbb4
CpSgvsztUfK7mJR9yxcWl4QPFTq3wpZA9MnXqnS/+UxqLSVRJp66omYn31w1AqQA
7XTuP3TQ4GjhPJC6uGiQUNQslBWj5FYRUQxlH/CzUFMeLpbK+ruG8K3QyNG8tFpj
TdUNxeWeRl+Sf5elgDYDe4Farn82ZborOlZmMspy/S87tENDgs1rND9/wIgnXtrt
g7sK3eVkmbBXkIqA9xXwkmx5GyHn3+o5a8JhFihePy5XnvlfAsnhjzsogTN3pYdO
YSKqR76cTTLzYuOjiF61DIgY1R/HqUsiloGiVXjfghV+ADcFLhADIJGa7wKwSItv
TjZmUKjF2U7CaNqn6mTR83VdtOymeoQyTZw9sB0WmsVVjVinOuExyy3qa9D3E07D
yfaa/RVgGLw6ygq/8JTXQx8B8xJ5lOl7KBxVNqaS9Kwx14F3aQdrDqvZRDCpU7U5
X/FC8f6YIksLIX7y
=254i
-----END PGP SIGNATURE-----

Chris Angelico

unread,
Nov 3, 2015, 5:13:32 PM11/3/15
to
On Wed, Nov 4, 2015 at 6:09 AM, Anthony Papillion
<ant...@cajuntechie.org> wrote:
> Does anyone know of a module that allows the wiring of Outlook PST files using Python? I'm working on a project that will require me to migrate 60gb of maildir mail (multiple accounts) to Outlook.
>

*wince*

I don't, but if there is such a thing, it'll probably be mentioned here:

https://pypi.python.org

But be careful. My gut feeling is that the format will be different in
every version of Outlook, so you'll need something exactly matching
it.

Since you're migrating your mail *to* Outlook, I'd look for some means
of importing mail. Worst case, you could create a bunch of files in
the maildir layout, spin up an off-the-shelf POP3 server, and tell
Outlook to read it all.

Personally, though, I'd refuse the job altogether. Creating a 60GB
blob of mail is going to cause nothing but pain. That might even be
beyond Outlook's current hard limit - I don't know. Bad, bad idea.

ChrisA

Cameron Simpson

unread,
Nov 3, 2015, 7:15:14 PM11/3/15
to
On 04Nov2015 09:12, Chris Angelico <ros...@gmail.com> wrote:
>On Wed, Nov 4, 2015 at 6:09 AM, Anthony Papillion
><ant...@cajuntechie.org> wrote:
>> Does anyone know of a module that allows the wiring of Outlook PST files using Python? I'm working on a project that will require me to migrate 60gb of maildir mail (multiple accounts) to Outlook.
>
>*wince*
>
>I don't, but if there is such a thing, it'll probably be mentioned here:
>https://pypi.python.org
>
>But be careful. My gut feeling is that the format will be different in
>every version of Outlook, so you'll need something exactly matching
>it.

Microsoft have a page here:

https://msdn.microsoft.com/en-us/library/ff385210%28v=office.12%29.aspx

which purports to document the PST format. There are links to the document
sections and also to PDFs.

It seems to be a simple and understandable format of a mere 190 pages. Yes, I
am joking, though not about the page count; it seems like an obscene and
totally nuts way to store email. IIRC, M$ LookOut! doesn't even use
Message-IDs, but some proprietry scheme.

>Since you're migrating your mail *to* Outlook, I'd look for some means
>of importing mail. Worst case, you could create a bunch of files in
>the maildir layout,

The OP has these I think.

>spin up an off-the-shelf POP3 server, and tell
>Outlook to read it all.
>
>Personally, though, I'd refuse the job altogether. Creating a 60GB
>blob of mail is going to cause nothing but pain. That might even be
>beyond Outlook's current hard limit - I don't know. Bad, bad idea.

It is a very strange request. But very corporate/government in mindset.

Alas, I cannot help to the with any tools or module for his task.

Cheers,
Cameron Simpson <c...@zip.com.au>

Microsoft Mail: as far from RFC-822 as you can get and still pretend to care.
- Abby Franquemont-Guillory <abb...@tezcat.com>

ru...@yahoo.com

unread,
Nov 3, 2015, 7:46:55 PM11/3/15
to
On 11/03/2015 12:09 PM, Anthony Papillion wrote:
> Does anyone know of a module that allows the wiring of Outlook PST
> files using Python? I'm working on a project that will require me to
> migrate 60gb of maildir mail (multiple accounts) to Outlook.

I used libpst (http://www.five-ten-sg.com/libpst/) a few years ago
successfully but I was going in the other direction (outlook -> maildir)
so not sure if it does writing of pst files.
At the time I also looked into libpff which (from my notes, I don't
remember much of this work) was newer but the documentation was pretty
thin and it wasn't worth the time to figure out how to use it.
Of course things may have changed since then...

ru...@yahoo.com

unread,
Nov 3, 2015, 8:32:52 PM11/3/15
to
I should have checked the web site before posting, it
appears that both libpst and libpff only read pst files,
no write. Sorry for the noise.

Tim Golden

unread,
Nov 4, 2015, 5:08:58 AM11/4/15
to
On 03/11/2015 22:12, Chris Angelico wrote:
> On Wed, Nov 4, 2015 at 6:09 AM, Anthony Papillion
> <ant...@cajuntechie.org> wrote:
>> Does anyone know of a module that allows the wiring of Outlook PST
>> files using Python? I'm working on a project that will require me
>> to migrate 60gb of maildir mail (multiple accounts) to Outlook.

I don't know if this is quite what you were after but, if automating
Outlook is an option (ie using its COM object model), the code below is
an *extremely* Q&D approach to adding a PST and copying the messages in
a mbox one by one.

Obviously I've done just enough to show that it's viable, and none of
the more complicated work around conversation ids, multipart messages etc.

TJG

#!python3
import os, sys
import gzip
import mailbox
import urllib.request

import win32com.client
dispatch = win32com.client.gencache.EnsureDispatch
const = win32com.client.constants

PST_FILEPATH =
os.path.abspath(os.path.join(os.path.expandvars("%APPDATA%"),
"scratch.pst"))
if os.path.exists(PST_FILEPATH):
os.remove(PST_FILEPATH)
ARCHIVE_URL =
"https://mail.python.org/pipermail/python-list/2015-November.txt.gz"
MBOX_FILEPATH = "archive.mbox"

def download_archive(url, local_mbox):
with gzip.open(urllib.request.urlopen(url)) as archive:
with open(local_mbox, "wb") as f:
print("Writing %s to %s" % (url, local_mbox))
f.write(archive.read())

def copy_archive_to_pst(mbox_filepath, pst_folder):
archive = mailbox.mbox(mbox_filepath)
for message in archive:
print(message.get("Subject"))
pst_message = pst_folder.Items.Add()
pst_message.Subject = message.get("Subject")
pst_message.Sender = message.get("From")
pst_message.Body = message.get_payload()
pst_message.Move(pst_folder)
pst_message.Save()

def find_pst_folder(namespace, pst_filepath):
for store in dispatch(mapi.Stores):
if store.IsDataFileStore and store.FilePath == PST_FILEPATH:
return store.GetRootFolder()

download_archive(ARCHIVE_URL, MBOX_FILEPATH)
outlook = dispatch("Outlook.Application")
mapi = outlook.GetNamespace("MAPI")
pst_folder = find_pst_folder(mapi, PST_FILEPATH)
if not pst_folder:
mapi.AddStoreEx(PST_FILEPATH, const.olStoreDefault)
pst_folder = find_pst_folder(mapi, PST_FILEPATH)
if not pst_folder:
raise RuntimeError("Can't find PST folder at %s" % PST_FILEPATH)
copy_archive_to_pst(MBOX_FILEPATH, pst_folder)
0 new messages