Version 3.5 of FreeEed is available for download

FreeEed Team

Feb 29, 2012, 11:13:44 PM
to freediscovery
Hi, all,

The full list of improvements is in the release notes,
https://github.com/markkerzner/FreeEed/blob/release/release_notes.txt,
but in summary:

1. Many bugs are fixed;
2. The program remembers the directory, so it is easier to work with;
3. The program remembers the last eight projects you worked with, so
you don't have to look for them;
4. Each processing run creates a separate folder for its output, so
earlier results are kept, and the folder lock-ups in Windows are gone;
5. DeNISTing, or removing known system files from the processed output,
is now an option;
6. Other options and more fixes.

Thank you, FreeEed team! http://freeeed.org/about-us/team

Sincerely,
Mark

Matt

Mar 6, 2012, 12:52:46 AM
to freediscovery
I have installed v3.5.3 on a Windows server at home, and have begun
testing it out on a small data set. Here are the beginnings of that
project: http://bit.ly/wOkayS

I will share my findings with this group in this forum, when I have
the chance.

-Matt

On Feb 29, 11:13 pm, FreeEed Team <markkerz...@gmail.com> wrote:
> Hi, all,
>
> The list of improvements is given here, https://github.com/markkerzner/FreeEed/blob/release/release_notes.txt,

Mark Kerzner

Mar 6, 2012, 1:06:17 AM
to freedi...@googlegroups.com
Matt,

thank you for a funny and informative post. FreeEed has this over Concordance: we will fix any error you find and add any improvement that will be needed to beat them. So please keep us all posted.

Sincerely,
Mark



Matt

Mar 15, 2012, 11:09:27 PM
to freediscovery
I have posted Part 2 of this project, where I use Concordance to
process my sample data set.

Part 3 (coming sooner, I hope) will give the results of using FreeEed.
I will share. And I will also keep an eye out for feature suggestions.

http://bit.ly/wS8a00

Best,
Matt

Matt

Mar 31, 2012, 11:32:19 PM
to freediscovery
Hi Mark -

I have been working with FreeEed whenever able in the past few weeks.
Below are a few observations and issues.

I have had some _SUCCESS with sample_freeeed_windows.project, but not
with my own data. The only difference I can see initially is that
the test data for the sample project is inside the FreeEed
application folder, while mine is on another drive. It does not appear
to be a permissions issue.

https://docs.google.com/open?id=0BxFq_4HTyS8UX3ZPMUdPYlhUd3k1X09lMHFWNmFWQQ

Above is a link to the processing history for my most recent attempt
at my sample data. It looks like FreeEed staged 214 files, and then
started processing them, but got through only 25.

These source, staging, and output counts have been consistent for this
data set across runs. I have even run processing on a different
machine, with the same result.

Also, for my tests, Native.zip appears to be corrupt; I have tried to
open it with WinZip and the Mac Archive Utility, both to no avail. A
sample is available at:

https://docs.google.com/open?id=0BxFq_4HTyS8USG1zcThDTDZTa1NpWG4zRThCeEJGZw

Native.zip for the sample project opens fine for me.
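
For anyone who wants to narrow down this kind of archive problem before reaching for a GUI tool, a small check with Python's standard zipfile module can tell a damaged container from a single bad member. This is only a sketch: the function name check_zip and the path in the usage note are illustrative, not part of FreeEed.

```python
import zipfile


def check_zip(path):
    """Return a short diagnosis string for the archive at `path`."""
    # is_zipfile() looks for the end-of-central-directory record;
    # a truncated or non-zip file fails here.
    if not zipfile.is_zipfile(path):
        return "not a zip archive (missing or damaged central directory)"
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() reads every member and checks its CRC;
            # it returns the first bad member name, or None.
            bad = archive.testzip()
    except zipfile.BadZipFile as exc:
        return f"unreadable archive: {exc}"
    if bad is not None:
        return f"first corrupt member: {bad}"
    return "ok"
```

Calling check_zip("Native.zip") on a good and a bad output folder side by side should show whether the file was cut short during writing or whether only some entries are damaged.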

Also, the metadata output is not playing nice with database import. It
seems to have 19 fields named, but each record has a varying number of
fields. From my test case, here are the headers from the metadata
file:

"UPI"
"File Name"
"Custodian"
"Source Device"
"Source Path"
"Production Path"
"Modified Date"
"Modified Time"
"Time Offset Value"
"processing_exception"
"To"
"From"
"CC"
"BCC"
"Date Sent"
"Time Sent"
"Subject"
"Date Received"
"Time Received"

These also match the named fields in the Sample_Project metadata file.

And here is record 2 from my case output:

"00002"
"011NewCaseDONE.jpg"
""
""
"011NewCaseDONE.jpg"
""
""
""
""
""
""
" "
""
""
""
""
""
""
""
"011NewCaseDONE.jpg"
"application/msword"
"10"
"Normal.dot"
"Chad M. Wilmer"
""
"1"
"Microsoft Office Word"
"Erin Wrage"
" "
"1"
"269"
"94800000000"
"2006-12-05T21:55:00Z"
""
"10 and 11"
"1536"
""
"BOF
Inc."
"818957243"
""
"2008-01-04T19:12:00Z"
"ewr...@imagecap.com"


As you see, there are many more fields here than are listed in the
header. And other records have more or fewer fields in the same
metadata file.
I am unaware of a convenient way to turn a file like this into an
"Insert Into..." SQL script. That's something I've done a lot of, with
standardized output files where each record has the same number of
fields. This could be a "my problem", but it's the first time I've run
into it.
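
One possible workaround for ragged records like these is to normalize every record to the header width before generating the inserts, and to park any overflow fields in a catch-all column so nothing is lost. The sketch below assumes the metadata file is comma-separated with double-quoted values (the actual delimiter may differ), uses Python's sqlite3 as a stand-in for the target database, and invents the names normalize_rows, load_into_sqlite, the table name "metadata", and the column "extra_fields" purely for illustration.

```python
import csv
import io
import json
import sqlite3


def normalize_rows(text, delimiter=","):
    """Parse delimited text and pad or split each record to the header width.

    Returns (header, rows), where each row is (fixed_fields, extra_fields).
    """
    # csv.reader copes with quoted values that contain line breaks.
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar='"')
    header = next(reader)
    width = len(header)
    rows = []
    for record in reader:
        if not record:
            continue  # skip blank lines
        # Short records are padded with empty strings.
        fixed = record[:width] + [""] * (width - len(record))
        # Anything beyond the named fields is kept separately.
        extra = record[width:]
        rows.append((fixed, extra))
    return header, rows


def quote_ident(name):
    """Quote a column or table name so spaces and quotes are safe."""
    return '"' + name.replace('"', '""') + '"'


def load_into_sqlite(conn, header, rows, table="metadata"):
    """Create a table from the header and insert the normalized rows."""
    columns = ", ".join(f"{quote_ident(name)} TEXT" for name in header)
    conn.execute(
        f"CREATE TABLE {quote_ident(table)} ({columns}, extra_fields TEXT)"
    )
    placeholders = ", ".join("?" for _ in range(len(header) + 1))
    # Parameterized inserts avoid hand-escaping quotes in the values.
    conn.executemany(
        f"INSERT INTO {quote_ident(table)} VALUES ({placeholders})",
        [fixed + [json.dumps(extra)] for fixed, extra in rows],
    )
```

The same normalize-then-insert idea should carry over to other databases by swapping the connection and placeholder style; the overflow column makes it easy to see afterwards which records carried more fields than the header names.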

Below is a link to the entire metadata file:
https://docs.google.com/open?id=0BxFq_4HTyS8UYjFHNFBmQ29RbnFRcTllUDhvbFBmdw

While I continue to work on this project, your assistance would be
most greatly appreciated. Nevertheless, my curiosity is maximally
piqued, and I'm excited to be involved in this interesting endeavor.
So thanks very much.

I hope to hear from you.

-Matt

Mark Kerzner

Mar 31, 2012, 11:42:37 PM
to freedi...@googlegroups.com

Matt, can you share your data?

Matt Toomey

Mar 31, 2012, 11:59:46 PM
to freedi...@googlegroups.com
Mark-

https://docs.google.com/open?id=0BxFq_4HTyS8USXhLXzg1ZmZSU21CN2hWQU8tWFU3UQ

Let me know if that doesn't work. I should be able to get it on my FTP server in the morning.

Thanks!

Mark Kerzner

Apr 1, 2012, 12:08:30 AM
to freedi...@googlegroups.com
Matt,

got your data intact! Beginning to work on it. If your curiosity is piqued, then I feel that I have a challenge.

Thank you very much. Sincerely,
Mark

Mark Kerzner

Apr 1, 2012, 6:11:10 PM
to freedi...@googlegroups.com
Matt,

I am actually running a later version, 3.6.1, which was earlier a release candidate. We did a lot of work with the output fields, and we now allow a choice of 4 different field separators. There are no more quotes (") to separate values (which was only a concession to Excel), so your database import should be fine.

I ran your data, and with the latest version it processed fine and gave me about 600 results in the output. I am attaching my screen shots with project options. Here is the zip of the output results, which you can download from our site: http://shmsoft.com/view/results_3.6.1.zip

You are a great help, and I really like working with you on this. I wonder if we should start accumulating tests in some public place on our site. Do you think that would be a good idea? For example, do you think we can post your test files?

When you have the time, could you try with the latest stable release on the download page, http://freeeed.org/download?

Sincerely,
Mark
01-project.png
02-matt project.png
03 settings.png
04 - processing options.png

Matt Toomey

Apr 1, 2012, 9:46:41 PM
to freedi...@googlegroups.com
Mark - v3.6.1 seems to have worked for me!

I'll now try to get it into a SQL Server table and so on. You'll hopefully hear from me next via my blog.

In the meantime, please do share that data as far and wide as you please. I think a lot can be learned from head-to-head runs.

OK - back to work for me. Thanks so much for your prompt and thorough assistance.

-Matt

Mark Kerzner

Apr 1, 2012, 9:51:00 PM
to freedi...@googlegroups.com
Matt,

great to hear that. I will prepare the sharing structure - I would like to have a great number of tests publicly available, but first I would really love to know how we can create the "standard" answer - if one exists - to compare to.

Thank you for your cooperation.

Sincerely,
Mark

Matt Toomey

Apr 3, 2012, 8:54:00 PM
to freedi...@googlegroups.com
Thanks so much for the help, Mark. I've posted the first look at my work with FreeEed.

eDiscovery - Lower in the Stack pt.III - FreeEed http://bit.ly/HeMpit

-Matt

Mark Kerzner

Apr 3, 2012, 9:00:31 PM
to freedi...@googlegroups.com
Matt,

thank you very much for your awesome work, for the benefit of the eDiscovery community. Also, thanks for your great suggestions; they will be taken into account in upcoming releases. Our plan is to include OCR and imaging (TIF/PDF) in the very near future.

That's it for now...

Sincerely,
Mark

Mark Kerzner

Apr 3, 2012, 9:08:41 PM
to freedi...@googlegroups.com
Of course, much credit goes to the open source projects that FreeEed (TM) incorporates. The countless hours and days of work by the people behind Hadoop, Tika, PDFBox, and Lucene are what make FreeEed shine.

Sincerely,
Mark