Parallel archiver was updated to version 1.98...

aminer

Nov 21, 2013, 2:36:41 PM

Hello,

Parallel archiver was updated to version 1.98...

The LoadFromStream() method was already fault tolerant to archive
damage, power failures, etc., but there was still a problem: the
archive has to have a kind of unique id so that LoadFromStream()
works correctly. So I have added this unique id, and now Parallel
archiver is rock solid and very stable.

You can download parallel archiver 1.98 from:

http://pages.videotron.com/aminer/

PArchiver 1.98 (stable version)

Description: Parallel archiver using my Parallel LZO, Parallel LZ4,
Parallel Zlib, Parallel Bzip and Parallel LZMA compression algorithms.

Supported features:

- Opens and creates archives using my Parallel LZ4 or Parallel LZO or
Parallel Zlib or Parallel Bzip or Parallel LZMA compression algorithms.

- Wide range of Parallel compression algorithms: Parallel LZ4, Parallel
LZO, Parallel ZLib, Parallel BZip and Parallel LZMA with different
compression levels

- Compiles into exe - no dll/ocx required.

- 64 bit support: lets you create archive files over 4 GB,
supports archives up to 2^63 bytes, and compresses and
decompresses files up to 2^63 bytes.

- Now my Parallel Zlib gives 5% better performance than Pigz.

- Supports memory and file streams, adds compressed data
directly from streams and extracts archived files to streams
without creating temp files.

- Save/Load the archive from stream

- Supports in-memory archives

- You can use it as a hashtable from the hard disk or from memory

- Fault tolerant to power failures, etc.

- Creates encrypted archives using Parallel AES encryption with 256 bit
keys.

- Fastest compression levels are extremely fast

- Good balanced compression levels provide both good compression ratio
and high speed

- Maximum compression levels provide a much better compression ratio than
Zip, RAR and BZIP, and the same as 7Zip with an 8 megabyte dictionary.

- Supports both a compression and a decompression rate indicator

- You can test the integrity of your archive

- Easy object programming interface

- Full source code available.

- Platform: Win32, Win64

Please look at the test_pzlib.pas, test_plzo.pas, test_plz4.pas,
test_pbzip.pas and test_plzma.pas demos inside the zip file, then compile
and execute them.

I have tried to do a worst-case scalability prediction for my parallel
archiver with Parallel LZMA, using an HDD drive and a Q6600 quad core,
and I think it's good.

There are four steps in my Parallel LZMA algorithm:

- First we have to copy a stream serially from the hard disk to memory;
this takes 0.2 second on average.

- In the compression method we have to copy a stream to memory; this
takes 0.05 second on average.

- In the compression method we have to compress a stream to another
stream in memory; this takes 13 seconds on average.

- In the compression method we have to copy the compressed stream to a
hard disk file; this takes 0.01 second on average.

So the serial part is: 0.2 second + 0.01 second + 0.05 second = 0.26
second, which is about 2% (a fraction of 0.02) of the 13.26 second
total, and the parallel part is: 13 seconds, about 98% (a fraction of
0.98).

So the worst case scalability scenario using an HDD and using the Amdahl
equation will give us: speedup = 1/(0.02 + 0.98/N), where N is the
number of cores. As N grows this tends to 1/0.02 = 50X.

So this will scale up to about 50X; as you have noticed, with an HDD
drive this is a good scalability.
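
To make the arithmetic above easy to check, here is a small Python
sketch of the Amdahl calculation using the four timings quoted in this
post. The function and variable names are made up for the illustration;
nothing here comes from the Parallel archiver source.

```python
def amdahl_speedup(serial_s: float, parallel_s: float, cores: int) -> float:
    """Speedup over one core when only parallel_s is divided across cores."""
    return (serial_s + parallel_s) / (serial_s + parallel_s / cores)


def amdahl_limit(serial_s: float, parallel_s: float) -> float:
    """Upper bound on the speedup as the number of cores grows without limit."""
    return (serial_s + parallel_s) / serial_s


# Timings quoted in the post for a single HDD, in seconds.
disk_read, mem_copy, disk_write, compress = 0.2, 0.05, 0.01, 13.0
serial = disk_read + mem_copy + disk_write

for cores in (4, 16, 64):
    print(cores, "cores:", round(amdahl_speedup(serial, compress, cores), 1))
print("limit:", round(amdahl_limit(serial, compress)))
```

With the unrounded serial fraction (0.26/13.26) the limit comes out at
about 51X, which the post rounds to 50X. Dividing disk_read and
disk_write by the factor your storage provides gives the corresponding
limit for other disk configurations.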

So what can we do to scale Parallel archiver with Parallel LZMA further?

You can for example use a RAID 10 with a base configuration of 4 HDD
drives. This will cut the 0.2 second and the 0.01 second by 4, so the
serial part becomes 0.05 + 0.0025 + 0.05 = 0.1025 second, and this will
give a scalability of about 128X (13.1025/0.1025), which is better. To
speed things up more we can use SSD drives that are 2X faster than HDD
drives, also in a RAID 10 configuration: the serial part becomes 0.025 +
0.00125 + 0.05 = 0.07625 second, and this will give about 171X worst
case scalability (13.07625/0.07625).

So as you have noticed, if you are using only an HDD with a multicore
system you can get up to about 50X scalability with my parallel archiver
using parallel LZMA, and if you use RAID 10 with SSD drives you can get
up to about 171X scalability.

When you want to delete files inside the archive, you have to call the
DeleteFiles() method. DeleteFiles() will not physically delete the
files; it will only mark them as deleted. When you want to delete the
files completely, call the DeletedItems() method to see how many files
are marked as deleted, and after that use the Clean() method to remove
them completely from the archive. I have implemented it like that
because it's better in my opinion.
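
The two-phase deletion described above can be modelled in a few lines of
Python. This is only a toy model under my own assumptions (a dictionary
standing in for the archive, invented class and method names); it is not
the TPZArchiver implementation, it just shows why marking is cheap and
why space is only reclaimed by the clean step.

```python
class ArchiveModel:
    """Toy model of mark-then-clean deletion; not the real archive format."""

    def __init__(self):
        self.entries = {}      # name -> stored bytes
        self.deleted = set()   # names marked as deleted but still stored

    def add(self, name, data):
        self.entries[name] = data
        self.deleted.discard(name)

    def delete_files(self, names):
        """Mark entries as deleted; no stored data is moved or rewritten."""
        for name in names:
            if name in self.entries:
                self.deleted.add(name)

    def deleted_items(self):
        """Number of entries currently marked as deleted."""
        return len(self.deleted)

    def exists(self, name):
        return name in self.entries and name not in self.deleted

    def clean(self):
        """Physically drop every marked entry; True if anything was removed."""
        removed = bool(self.deleted)
        for name in self.deleted:
            del self.entries[name]
        self.deleted.clear()
        return removed

    def stored_bytes(self):
        return sum(len(data) for data in self.entries.values())


arc = ArchiveModel()
arc.add("a.txt", b"x" * 100)
arc.add("b.txt", b"y" * 50)
arc.delete_files(["a.txt"])
print(arc.deleted_items(), arc.stored_bytes())  # marked, space not yet reclaimed
arc.clean()
print(arc.deleted_items(), arc.stored_bytes())  # space reclaimed after clean
```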

And my parallel archiver uses a hashtable to store the file names and
their corresponding file positions, so that you have direct access to
the files inside the archive when decompressing, deleting, etc., so it's
very fast.
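
As an illustration of the idea, here is a minimal Python sketch of a
name-to-position index over a stream. The class name and the (offset,
length) layout are assumptions made for the example, not the archive
format used by Parallel archiver.

```python
import io


class IndexedBlob:
    """Append-only byte store with a name -> (offset, length) index."""

    def __init__(self):
        self.stream = io.BytesIO()
        self.index = {}

    def add(self, name, data):
        self.stream.seek(0, io.SEEK_END)
        offset = self.stream.tell()
        self.stream.write(data)
        self.index[name] = (offset, len(data))

    def extract(self, name):
        offset, length = self.index[name]   # average O(1) dictionary lookup
        self.stream.seek(offset)            # jump straight to the entry
        return self.stream.read(length)


store = IndexedBlob()
store.add("test_pzlib.pas", b"program one;")
store.add("test_plzo.pas", b"program two;")
print(store.extract("test_plzo.pas"))
```

The lookup cost does not depend on how many entries come before the one
you ask for, which is the difference from scanning the archive
sequentially.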

Please look at the test_pzlib.pas, test_plzo.pas, test_plz4.pas,
test_pbzip.pas and test_plzma.pas demos inside the zip file to see how
to use my Parallel archiver.

And please don't directly use the ParalleZlib.pas that I have included
inside the Parallel archiver zip file, because I have modified it to work
correctly with my Parallel archiver.

If you want to use my ParallelZlib library, just download it from my
website, or download my other Parallel compression library.

You can now use my Parallel archiver as a hashtable from the hard disk
with O(1) access. For example, you can stream your database row with my
ParallelVarFiler into a memory stream or into a string, and store it
with my Parallel archiver in an archive; after that you can access your
rows on the hard disk as a hashtable with O(1) access. You can use it
like that as a database: if you have, for example, id keys that you
want to map to database rows, it is a good idea to use my Parallel
archiver as a hashtable.

Question:

What are the newest ideas behind your parallel archiver?

Answer:

Of course my Parallel Archiver supports Parallel compression etc., but
the newest ideas behind my Parallel Archiver are the following:

I have played with WinZip and 7Zip, and if you give them some files to
extract or to test for integrity, they both (WinZip and 7Zip) will use
sequential access, and that's bad I think. So I have decided to
implement O(1) access, which is very fast for extraction, for testing
integrity, etc., in my Parallel Archiver, and for that I have used an
in-memory hashtable that maintains the file names and their
corresponding file positions. My second idea is that my Parallel
Archiver is fault tolerant to power failures, and also to the file
corruption you get when your hard disk is full, etc. I think 7Zip and
WinZip are not fault tolerant to those kinds of problems.

I have just played with 7Zip: I compressed 3 files into an archive,
then opened the archive with an editor, deleted some bytes and saved
the file. After that, when I tried to open the archive, 7Zip responded
that the file is corrupted, so 7Zip is not fault tolerant; I think that
with WinZip it's the same. I have done the same test with my Parallel
archiver, and it recovers from the file damage, so it's fault tolerant
to this kind of damage, such as that caused by power failures or by
file corruption when the disk is full, etc. I have implemented this
kind of fault tolerance in my Parallel archiver.
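
The post does not describe how the archive format achieves this
recovery. One common technique, shown here purely as an illustration and
not as the method used by Parallel archiver, is to frame every entry
with a marker and a checksum so that a loader can skip a damaged entry
and resynchronise on the next marker. The marker value and record layout
below are hypothetical.

```python
import struct
import zlib

MAGIC = b"\xa5REC\x5a\x01\x02\x03"  # hypothetical 8-byte entry marker


def pack_record(name: bytes, data: bytes) -> bytes:
    """Frame one entry as MAGIC + lengths + name + data + CRC32 of the body."""
    body = struct.pack("<HI", len(name), len(data)) + name + data
    return MAGIC + body + struct.pack("<I", zlib.crc32(body))


def recover(blob: bytes) -> dict:
    """Return every entry whose checksum still matches, skipping damaged ones."""
    found = {}
    pos = blob.find(MAGIC)
    while pos != -1:
        start = pos + len(MAGIC)
        try:
            name_len, data_len = struct.unpack_from("<HI", blob, start)
            end = start + 6 + name_len + data_len
            (crc,) = struct.unpack_from("<I", blob, end)
            body = blob[start:end]
            if zlib.crc32(body) == crc:
                found[body[6:6 + name_len].decode()] = body[6 + name_len:]
                pos = blob.find(MAGIC, end + 4)
                continue
        except struct.error:
            pass  # truncated entry: fall through and look for the next marker
        pos = blob.find(MAGIC, pos + 1)
    return found


blob = (pack_record(b"a.txt", b"alpha")
        + pack_record(b"b.txt", b"bravo")
        + pack_record(b"c.txt", b"charlie"))
damaged = bytearray(blob)
damaged[len(blob) // 2] ^= 0xFF  # flip one byte inside the middle entry
print(sorted(recover(bytes(damaged))))
```

In this sketch the damaged middle entry is lost, but the entries before
and after it are still readable.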

I have updated my Parallel archiver and added the Update() method. It
is overloaded: in the first version you pass a key name and a TStream,
and in the second version you pass a key name and a file name. Please
look at the test_pzlib.pas demo inside the zip file to see how to use
those methods.

So now you have all the methods to use my Parallel archiver as a
hashtable from the hard disk with direct access to the compressed and/or
encrypted data with O(1) complexity and very fast access to the data:
DeleteFiles() has O(1) complexity, ExtractFiles() and Extract() also
have O(1) complexity, GetInfo() is also O(1), and of course AddFiles()
is also O(1), and the Test() method is also O(1). (The O(1) refers to
locating a file through the hashtable, not to the time needed to
compress or decompress its data.) So now it's extremely fast.

When you want to do solid compression with my Parallel archiver using
Bzip, you can use the same method that Tar uses: first archive your
files with compression level 0, and after that compress the whole
archive file using Bzip.

When you want to encrypt your data with Parallel AES encryption, just
give a password by setting the password property; when you don't want
to encrypt, set the password property to an empty string or don't set
it at all. That's all.

Parallel archiver supports the storing and restoring of the following
file attributes:

Hidden, Archive, System, and Read only attributes.

To store and restore them just set the AddAttributes property like this:

pzr.AddAttributes:=[ffArchive,ffReadOnly,ffHidden,ffSystem];

I have added in-memory archive support because this way Parallel
archiver will be much faster than with disk archives, and you will be
able to lower the response time and the load on your server much more.

If you want to use an in-memory archive, pass an empty string as the
file name in the constructor, like this:

pzr :=TPLZ4Archiver.Create('',1000,4);

And if you want to read your in-memory archive, read from the Stream
property that is exposed (a TStream), like this:

pzr.stream.position:=0;
A_Memory_Stream.copyfrom(pzr.stream,pzr.stream.size);

You can also load your archive from a file or memory stream just by
assigning your file or memory stream to the Stream property (a TStream).

I have overloaded the GetKeys() method so that you can now use
wildcards: pass the wildcard in the first argument and the TStringList
in the second argument, like this:

pzr.getkeys('*.pas',st);

and after that call the ExtractFiles() method and pass it the TStringList.
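
For readers unfamiliar with wildcard selection, here is a small Python
sketch of filtering a list of keys with a shell-style pattern. It uses
Python's standard fnmatch module; the exact wildcard rules of GetKeys()
are not stated in this post, so treat the matching semantics here as an
assumption, and get_keys below as a made-up helper name.

```python
from fnmatch import fnmatchcase


def get_keys(keys, pattern):
    """Return the keys matching a shell-style wildcard, in their original order."""
    return [key for key in keys if fnmatchcase(key, pattern)]


names = ["test_pzlib.pas", "readme.txt", "test_plz4.pas", "defines.inc"]
selected = get_keys(names, "*.pas")
print(selected)
```

Note that fnmatchcase compares case-sensitively, which may differ from
how file names are usually matched on Windows.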

As you have noticed, the programming interface of my Parallel archiver
is very easy to use.

And read this:

"We're a video sharing site located in China. We rewrote the PHP
memcached client extension by replacing zlib with QuickLZ. Then our
server loads were dramatically reduced by up to 50%, the page response
time was also boosted. Thanks for your great work!

Jiang Hong"

http://www.quicklz.com/testimonials.html

http://www.quicklz.com/

So as you have noticed, like QuickLZ or Qpress, I have implemented
Parallel archiver to be very fast also.

By using my Parallel Zlib, my Parallel LZ4 or my Parallel LZO
compression algorithms, my Parallel archiver will be very fast, and
as I wrote on my webpage:

"So now you have all the methods to use my Parallel archiver as a
Hashtable from the hard disk with direct access to the compressed and/or
encrypted data with O(1) very fast access to the data, the DeleteFiles()
has a O(1) complexity, the ExtractFiles() and Extract() have also O(1)
complexity and GetInfo() is also O(1) and of course the AddFiles() is
also O(1), the Test() method is also O(1). So now it's extremely fast."

You can even use my Parallel archiver as a hash table database from the
hard disk to further lower the load on your server (internet or
intranet) and improve the response time.

I have used solid compression, like with the tar.lzma format, and I have
found that my Parallel archiver, with maximum level compression that is
clLZMAMax, compresses to the same size as 7Zip with maximum level
compression and a dictionary size of 8 MB, it compresses 13%
better than WinRAR with maximum level compression, and it is much better
than WinZip on compression ratio.

How to use solid compression with my Parallel archiver?

Just archive your files with clLZMANone and after that compress your
archive with clLZMAMax. Parallel archiver will then compress to the same
size as 7Zip with maximum level compression and a dictionary size
of 8 MB, it will compress 13% better than WinRAR with maximum level
compression, and it will compress much better than WinZip with maximum
level compression.
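
To show why the store-first, compress-once approach helps, here is a
small Python sketch using the standard lzma module. The file contents
are made up for the demonstration, and the sketch says nothing about the
actual ratios of Parallel archiver; it only compares compressing each
file separately with compressing one container holding all of them.

```python
import lzma

# Twenty small, similar "files" (invented content for the demonstration).
files = {
    f"unit{i}.pas": (f"unit unit{i};\n" + "procedure Run; begin end;\n" * 20).encode()
    for i in range(20)
}

# Non-solid: every file is compressed on its own.
separate = sum(len(lzma.compress(data)) for data in files.values())

# Solid: store the files back to back first, then compress the container once.
container = b"".join(files.values())
solid = len(lzma.compress(container))

print("separate:", separate, "bytes, solid:", solid, "bytes")
```

The solid result is smaller because the compressor can reuse what it has
seen in earlier files and pays the stream overhead only once. The
trade-off is that you have to decompress the container before you can
reach an individual file.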

I have updated my Parallel archiver to a new version and I have decided
to include the Parallel LZ4 compression algorithm (one of the fastest in
the world) in my Parallel archiver, so to compress bigger data, such as
terabytes of data, you can use my Parallel LZO or my Parallel LZ4
compression algorithms with my Parallel archiver. I have also added the
high compression mode to the Parallel LZ4 compression algorithm: for the
fast mode use clLZ4Fast, and for the high compression mode use clLZ4Max.
The Parallel LZ4 high compression mode is interesting also: it
compresses much better than LZO and it is very fast on decompression,
faster than Parallel LZO. I have included a test_plz4.pas demo inside my
Parallel archiver zip file to show you how to use the Parallel LZ4
algorithm with my Parallel archiver.

Here is the LZ4 website if you want to read about it:

http://code.google.com/p/lz4/


I have also downloaded the IHCA compression algorithm from the
following website:

http://objectegypt.com/

And I have written a Parallel IHCA and begun testing it against my
Parallel LZO and my Parallel LZ4. They say on the IHCA website that it
has the same performance as the LZO algorithm, but I have noticed in
my benchmarks that Parallel IHCA (that I wrote) is much slower than
my Parallel LZO and my Parallel LZ4, so I think the IHCA compression
algorithm is poor quality software that you should avoid. So please use
my Parallel archiver and Parallel compression library, because my
Parallel LZO and my Parallel LZ4 are now among the fastest in the
world.

I have also downloaded the QuickLZ algorithm from:

http://www.quicklz.com/

and I have written a Parallel QuickLZ and tested it against my
Parallel LZO and Parallel LZ4, and I have noticed that Parallel QuickLZ
is slower than my Parallel LZ4 algorithm. Other than that, with QuickLZ
you have to pay for a commercial license, but with my Parallel
archiver and my Parallel compression library you pay $0 for a
commercial license.

My Parallel archiver was updated: I have ported the Parallel LZ4
compression algorithm (one of the fastest in the world) to 64 bit
Windows, so the Parallel LZ4 compression algorithm now works
with both Windows 32 bit and 64 bit. If you want to use Parallel
LZ4 with Windows 64 bit, just copy the lz4_2.dll inside the LZ4_64
directory (that you find inside the zip file) to your
current directory or to the c:\windows\system32 directory, and if you
want to use Parallel LZ4 with Windows 32 bit, use the lz4_2.dll
inside the LZ4_32 directory.

Here is more information about my Parallel archiver:

Parallel LZO supports Windows 32 bit and 64 bit

Parallel Zlib supports Windows 32 bit and 64 bit

Parallel LZ4 supports Windows 32 bit and 64 bit

Parallel Bzip is Windows 32 bit only

Parallel LZMA is Windows 32 bit only

But even if Parallel LZMA and Parallel Bzip are Windows 32 bit only, my
Parallel archiver supports terabyte files, and your archive can grow to
terabyte size even with 32 bit Windows executables, and that's good.

And look also at the prices of the XCEED products:

XCEED Streaming compression library:

http://xceed.com/Streaming_ActiveX_Intro.html

and the XCEED Zip compression library:

http://xceed.com/Zip_ActiveX_Intro.html

http://xceed.com/pages/TopMenu/Products/ProductSearch.aspx?Lang=EN-CA


I don't think the XCEED products support parallel compression as my
Parallel archiver and my Parallel compression library do.

And just look also at the Easy compression library, for example: as you
may have noticed, it's also not a parallel compression library.

http://www.componentace.com/ecl_features.htm

And look at its pricing:

http://www.componentace.com/order/order_product.php?id=4


My Parallel archiver and parallel compression library cost you $0, and
they are parallel compression libraries; they are very fast and very
easy to use, they support Parallel LZ, Parallel LZ4, Parallel
LZO, Parallel Zlib, Parallel Bzip and Parallel LZMA, and they come with
the source code and much more...

Hope you will enjoy my Parallel archiver.

Here are the public methods that I have implemented:

Constructor Create(file1:string;size:integer;nbrprocs:integer);
- Creates a new TPZArchiver ready to use. file1 is the archive file, size
is the hashtable size for the index (the file name keys and their
corresponding file positions), and nbrprocs is the number of cores you
specify to run Zlib, LZ4, LZO, Bzip and LZMA in parallel.

Destructor Destroy;
- Destroys the TPZArchiver object and cleans up.

function AddFiles;
- Adds the files to the archive.

function AddStream;
- Adds the stream to the archive.

function DeleteFiles;
- Deletes the TStringList content from the archive.

function Erase;
- Erases the data inside the archive and inside the hashtable.

function Update;
- Updates the file or the stream inside the archive

function ExtractFiles;
- Extracts the TStringList content from the archive.

function ExtractAll;
- Extracts all the files from the archive.

function Extract;
- Extracts the file to the stream.

function Test;
- Tests the integrity of the files inside the archive.

function GetInfo;
- Gets the file info that is returned in a TZSearchRec record.

function ClearFile;
- Deletes all contents of the archive.

function Clean:boolean
- Cleans the marked deleted items from the file.

function DeletedItems:integer
- Returns the number of items marked deleted.

function LoadIndex:boolean
- Loads the file name keys and their corresponding file position
values from the file passed to the constructor into the hashtable.

function Exists(Name : String) : Boolean;
- Returns True if a file Name exists

procedure GetKeys(Strings : Tstrings);
- Fills up a TStrings descendant with all the file names.

function Count : Integer;
- Returns the number of files inside the archive.


PUBLIC PROPERTIES:

Indicator : boolean
- To show the compression and decompression indicator.
CompressionLevel;
- Sets and reads the compression level.
Overwrite:boolean
- To update and overwrite the file without asking .
Freshen: boolean
- Adds newer files to the archive and extracts newer files from the archive.
AddRecurse: boolean
- AddFiles() method will recurse on subdirectories.
Stream: TStream
- The archive is exposed as a TStream, use it for in-memory archive
or disk archive.
AddAttributes: TAttrOptions
- FindFile attributes for the AddFiles() method, look inside the FindFile
component.

Language: FPC Pascal v2.2.0+ and Lazarus / Delphi 7 to 2007:
http://www.freepascal.org/

Operating Systems: Win32 and Win64

And inside defines.inc you can use the following defines:

{$DEFINE CPU32} and {$DEFINE Win32} for 32 bit systems

{$DEFINE CPU64} and {$DEFINE Win64} for 64 bit systems

Required FPC switches: -O3 -Sd -dFPC -dWin32 -dFreePascal

-Sd for Delphi mode.

Required Delphi switches: -DMSWINDOWS -$H+ -DDelphi



Thank you,
Amine Moulay Ramdane.

