PST, Printing, Search


FreeEed Team

Nov 25, 2011, 1:07:19 AM
to freediscovery
Hi,

I've found a solution for PST processing,
http://shmsoft.blogspot.com/2011/11/how-to-process-microsoft-outlook-pst.html,
one that is architecturally better than anything I have done before.
Here are the things that I want to have ready by the December 15
presentation for Women in eDiscovery:

* solid email processing;
* stable PDF printing;
* search in culled results.

Oskar Cid

Nov 25, 2011, 8:30:27 AM
to freedi...@googlegroups.com

I hope you are doing well. I need your advice: I have to process around 50 GB of email, and I am wondering what the best option would be. I have to run keywords over the PST files and then produce a report.

Thanks in advance; I would appreciate your help.

Oskar Cid


Mark Kerzner

Nov 25, 2011, 10:40:58 AM
to freedi...@googlegroups.com
Oskar,

At the moment you can run FreeEed on 50 GB of email on a single computer (I would prefer Linux, but it runs on Windows too), and my estimate is that it will take about 50 hours. You can compare your case to processing the Enron data, http://shmsoft.blogspot.com/2011/09/freeeed-used-to-process-complete-enron.html.

In my tests, I ran each PST as a separate project and did not combine them. I think I should combine them in my next test. How do you want it done in your case?

Also, at the moment, culling is done as part of processing, but you cannot search the results with keywords. And finally, what kind of reports do you need? FreeEed produces only a basic one right now, with just the number of documents, not yet broken down by type.

I'd be glad to help you with this (we can communicate directly), because I am interested in improving the program. In the near future, you will be able to run it in the cloud, which will make it faster for large volumes, and easier. So many things depend on your goals and deadlines.

Sincerely,
Mark

PS. I've added your requirements as issues for the project on GitHub.

Oskar Cid

Nov 27, 2011, 10:33:21 AM
to freedi...@googlegroups.com

Thanks a lot for your response. I need a frequency report for a list of searched keywords. I also need a communication report showing which people have the most and the fewest communications.

Once I have the keyword results, I just need to review this representative subset of emails.

Thanks in advance.
Oskar Cid

Mark Kerzner

Nov 27, 2011, 10:54:43 AM
to freedi...@googlegroups.com
Oskar,

Do you have sample reports that you can share? I would be glad to add such reports to FreeEed, but I need to be more certain about the details. In particular:

Frequency report

* For each found keyword, how many documents were responsive? How do we count emails with attachments?

For communications

* Do we count To, From, CC fields for each person?
* Do custodians have anything to do with it?
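
To make the counting question concrete, here is a minimal sketch of a keyword frequency report. It is only an illustration and not FreeEed code: the field names `id`, `parent_id`, and `text` are assumptions, and matching is a naive case-insensitive substring test. It reports two numbers per keyword, so that an email and its attachments can be counted either as separate documents or as one family.

```python
from collections import defaultdict


def keyword_frequency(docs, keywords):
    """Count responsive documents and responsive families per keyword.

    docs: iterable of dicts with the hypothetical fields 'id',
    'parent_id' (None for a top-level email) and 'text'.
    """
    doc_hits = defaultdict(set)     # keyword -> ids of matching documents
    family_hits = defaultdict(set)  # keyword -> ids of matching families
    for doc in docs:
        text = doc["text"].lower()
        # An attachment belongs to its parent's family; an email is its own family.
        family = doc["parent_id"] if doc["parent_id"] is not None else doc["id"]
        for kw in keywords:
            if kw.lower() in text:
                doc_hits[kw].add(doc["id"])
                family_hits[kw].add(family)
    return {
        kw: {"documents": len(doc_hits[kw]), "families": len(family_hits[kw])}
        for kw in keywords
    }
```

With this shape, an email whose attachment also matches counts as two documents but one family, which is one possible answer to the attachment question.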

Thank you,
Mark

geekweb

Nov 27, 2011, 12:51:45 PM
to freedi...@googlegroups.com
Hi,
 
Reports -> search keywords
- hits in documents per custodian
- hits per keyword
Then, we select the items to add to the review site.
 
Each item has an ID (pk), and attachments also have a masterParentID that links them to their email.
 
To, From, CC, BCC: the best option is to build a matrix table, which makes social network analysis possible.
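
As a rough sketch of such a matrix table (an illustration only, with assumed field names `from`, `to`, `cc`, and `bcc`, where the recipient fields hold lists of addresses), each email can add one to the cell for every sender/recipient pair. Whether CC and BCC should count the same as To is a policy choice; this version treats them equally and counts a recipient once per email.

```python
from collections import Counter


def communication_matrix(emails):
    """Return a Counter mapping (sender, recipient) to a number of emails."""
    matrix = Counter()
    for email in emails:
        sender = email["from"].strip().lower()
        recipients = set()  # a person listed in both To and BCC counts once
        for field in ("to", "cc", "bcc"):
            for addr in email.get(field, []):
                recipients.add(addr.strip().lower())
        for recipient in recipients:
            matrix[(sender, recipient)] += 1
    return matrix


def totals_per_person(matrix):
    """Sum sent and received counts per person, for a most/least report."""
    totals = Counter()
    for (sender, recipient), count in matrix.items():
        totals[sender] += count
        if recipient != sender:  # do not double count self-addressed mail
            totals[recipient] += count
    return totals
```

Calling `totals_per_person(matrix).most_common()` then lists people from the most to the fewest communications.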
 
It would be a great idea to add these features:
 
- process NSF (Lotus Notes)
 
- recover deleted emails (very important, because a PST file keeps them even after they are deleted from the trash) http://www.paraben.com/email-examiner.html
 
- PST file recovery: http://scanpst.org/
 
 
- deduplicate emails (by subject, content, etc.) and deduplicate files
 
- run an automatic QoC of the processed files.
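
For the deduplication item, one common approach is to hash a normalized form of the fields that define a duplicate. The sketch below is an assumption for illustration, not an existing FreeEed feature: it uses hypothetical `subject` and `body` fields, collapses whitespace and case, and keeps the first email seen for each SHA-256 fingerprint. Which fields belong in the fingerprint (sender, date, attachments) would have to be decided separately.

```python
import hashlib


def email_fingerprint(email):
    """Hash the normalized subject and body of an email."""
    parts = [email.get("subject", ""), email.get("body", "")]
    # Collapse runs of whitespace and ignore case so trivial differences match.
    normalized = "\n".join(" ".join(part.split()).lower() for part in parts)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(emails):
    """Return the emails with later duplicates removed, preserving order."""
    seen = set()
    unique = []
    for email in emails:
        fingerprint = email_fingerprint(email)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(email)
    return unique
```

The same fingerprint idea applies to loose files by hashing their raw bytes instead of normalized text.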
 
 
I would like to participate in developing a review system.
 
Regards,
 
Uriel Rodrigues

Mark Kerzner

Nov 27, 2011, 1:48:23 PM
to freedi...@googlegroups.com
Uriel,

With your help and that of others, we will make FreeEed a really useful software system. I've added your wish list to "Issues" on GitHub.
I combined them under Milestone V 5.0, https://github.com/markkerzner/FreeEed/issues?milestone=6&state=open, and tentatively put the release date at March 12. 

That date might be reasonable if I continue working alone, in my spare time. I don't expect others to join me on a volunteer basis; instead, I would like to hire more developers from the pool of people I have worked with before. If that happens, the release dates will move closer.

Then we will be ready to tackle the review, and your help will be welcome.

Regards,
Mark