Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.

Dismiss

implementing read/write archive/image format

2 views

Skip to first unread message

cr88192

unread,

Apr 27, 2006, 7:44:18 PM4/27/06

so, a little closer to the start of the month, the issue and idea came up
for implementing a fully read/write archive format.

dunno if the idea (or format/implementation) is actually interesting to
anyone. I guess people can post responses if they feel compelled, or maybe I
can put a copy online so people can mess with it if they want (or I could
email zip files)?...

it may be useful to hear any design criticisms or similar...

core design problems:

as defined, the zip file format is unsuitible for read/write usage in
itself, leading to an inability to just use the format for this in my case
(it is fine for read-only use though).

typically data likes to change size when modified and recompressed. this
requires the ability to move data around dynamically as needed. this
requires some manner of graph-structured, rather than stream-structured file
format (typical of the various archive formats I had looked at).

the archives may be potentially large (100s of MB or maybe multiple GB) and
may contain a potentially large number of files (eg: 100k maybe). likewise,
it is unreasonable to expect everything to be kept in memory (the host app
needs to use ram for other things), and operations need to be completed more
or less in real-time (leading to avoidance of any particularly expensive
operation, partly excluding mounting or unmounting which don't necissarily
need to be done in real-time). the expectation is that the user will notice
and not respond positively to any sudden and noticable delays.

simplifying assumptions:
smaller files are not to be fragmented;
for now, the entire root directory and spans tree are assumed small enough
to be reasonably stored in memory (root with 100k entries will take about
6.4MB, spans tree nodes are presently malloc'ed, but a slab may make more
sense, which would require about 5MB in my case);
4GB is still rather large, so it is presently assumed that contained files
are smaller than this (albeit the archives themselves may be larger).

implementation details:

the present implementation is a command-line tool (rather sad, not showing
the full possibility of the format). a slight improvement could be a
specialized shell, allowing copying data to/from the host filesystem, rather
than just giving a tar or infozip style interface to the format. intent is
to later modify the code and integrate it with my main codebase (however, an
external tool is likely to be highly necessary in making the format
"useful").

presently, files are managed with a cache, where recently accessed files are
kept in cache (via an mru scheme) up to some memory limit. files beyond this
limit will be written back into the archive image and have their memory
freed (this aspect of caching not is presently implemented...).

spans are managed via a spans-tree. this tree is primarily just a variation
of an AVL tree. it is rebuilt at mount time and not saved to the image (it
takes O(n log2 n) time to rebuild, so storing it is presently unlikely to
offer any real speed advantage).

it my case it was either an avl tree (or maybe a b-tree) vs an elaborate
linked-list setup. in my case, the linked list option would be significantly
more complex, and quite possibly much slower to process/rebuild. likewise,
the linked lists would risk having a potentially significant impact on the
design of the rest of the format.
so, in this case, the AVL tree idea won out (as for b-trees, these are much
more complicated to implement in my experience and wouldn't really offer
that much in return).

most of the core algos for space management were designed in terms of
stepping along the avl tree, so most are completed in O(log2 n) time. an
exception presently is locating the "best fit", which should take linear
time (optimizations here would be possible at the cost of increasing tree
size and implementation complexity).

fragmented and sparse files are not yet implemented.

note that dir entries are fixed size to allow quick allocation and freeing
of entries, and also to allow me to read/write the whole directory at once
(and later possibly using a "tile" system). a fixed 32 char name limit was
considered reasonable enough (if needed, longer names could be "faked" via
storing a name prefix and using the last 4 chars as a hash-code for the full
name, allowing retrival via longer names, albeit readdir-like calls would
return the modified name).

"my_filename_as_some_arbitrarily_long_string_or_maybe_even_a_"
"full_paragraph_or_book_or_something.txt"
eg: "my_filename_as_some_ar~AkZ9.txt".

file format:

the file format is fairly simplistic in all, and stores very little "extra"
information. a plain reader or sequential writer is thus unlikely to be all
that complicated to implement (just a full read/write implementation
requires a little more).

all multi-byte numbers are little-endian.

Header
{
FOURCC magic; //'ZPAK'
u32 ents; //size of directory
u64 offs; //offset of directory
}

DirEntry
{
char name[32]; //file name, 0 padded
u32 chain; //next dir entry
u32 date_time; //date modified
u16 flags; //file flags (1&=fragmented, 2&=dir)
u16 method; //method number, 0=store, 8=deflate,
//9=deflate(ext)
u32 crc32; //CRC32 of data (or first entry if dir)
u32 usize; //uncompressed size of data
u32 csize; //compressed size of data
u64 offset; //offset to start of data
}

a name beginning with 0 will be free. note that the name does not include
the path. any names beginning with '$' are reserved for archive metadata.

the date_time field multiplies and adds the values:
60s, 60m, 24h, 31d, 12m, 133.63y (rel 1980, fails in 2113).
so:
s=date_time%60;
m=(date_time/60)%60;
h=(date_time/3600)%24;
...

flags:

1&=the file is fragmented. Offset/csize refers to a table of spans, each
giving the offset and size of each fragment. this could be useful for larger
files, where possibly relocating on each write may not be all that
efficient.

2&=directory. the CRC field will refer to the first directory entry. usize,
csize, and offset, and method are reserved.

method 9 will be deflate with a 64kB window.

the CRC algo will be the same algo used in ZIP and PNG.

fragmented files:
fragmented files will have the contents refer to an array of spans.

FragmentSpan {
u32 _resv0; //reserved, 0
u32 _resv1; //reserved, 0
u16 flags; //fragment flags
u16 method; //method number
u32 crc32; //CRC32 of data
u32 usize; //uncompressed size of data
u32 csize; //compressed size of data
u64 offset; //offset to start of data
}

"sparse" files may be possible, in this case, method, csize, and offset are
0, wheras usize is not. the span is to be interpreted as 0-filled.

for fragmented files, the total file size is defined as the sum of all the
fragment sizes. note that since there is no direct representation of "file
size" for fragmented files, it is possible that they may be allowed
(eventually) to exceed the 4GB limit.

toby

unread,

Apr 27, 2006, 9:54:07 PM4/27/06

cr88192 wrote:
> so, a little closer to the start of the month, the issue and idea came up
> for implementing a fully read/write archive format.

> ...

> the present implementation is a command-line tool (rather sad, not showing
> the full possibility of the format). a slight improvement could be a
> specialized shell, allowing copying data to/from the host filesystem, rather
> than just giving a tar or infozip style interface to the format.

Have you considered using e.g. FUSE to make your archive mountable as a
first class filesystem? (http://fuse.sourceforge.net/)

--T

cr88192

unread,

Apr 27, 2006, 11:48:29 PM4/27/06

"toby" <to...@telegraphics.com.au> wrote in message
news:1146189247.1...@v46g2000cwv.googlegroups.com...

would be cool, apart from the fact I am currently using windows as my main
os...

> --T
>

cr88192

unread,

Apr 30, 2006, 10:55:16 AM4/30/06

"cr88192" <cr8...@NOSPAM.hotmail.com> wrote in message
news:824a0$44515754$ca83b3d3$96...@saipan.com...

> so, a little closer to the start of the month, the issue and idea came up
> for implementing a fully read/write archive format.
>
> dunno if the idea (or format/implementation) is actually interesting to
> anyone. I guess people can post responses if they feel compelled, or maybe
> I can put a copy online so people can mess with it if they want (or I
> could email zip files)?...
>
> it may be useful to hear any design criticisms or similar...
>

still, misc updates on the effort...
still using deflate (but deflate is fairly fast and works pretty well).

since I am implementing a good deal of basic filesystem style functionality
(readdir, file operations, directory operations, ...). the effort is a
little more involved than a typical archiver (which takes a more direct
"building an archive" or "extracting from an archive" approach), so progress
is a little slower (things are done via an external api, with an interface
and expected behavior similar to the os-level api's).

even as such, it seems with the original commandline version of the tool, a
lot of shortcuts were taken. some major operations, eg, adding or extracting
files, interacted fairly directly with the directory and cache managment
code (vs taking the more externalized approach, eg,
opendir/readdir/stat/mkdir, open/read/write, deletion, ...).
likewise, I had relied some on the shells' tendency to do basic strings
matching (dealing with '*', ...).

so, upon implementing the shell-based version of the tool, a lot of things
were altered. I have now split off the code partly into a half-assed library
(closer to a mass of source files that include each other, but oh well, I
have seen a few libs done this way).

implementing a shell-based tool pointed out a lot of bugs in the
implementation as well, which I have been fixing (along the lines of "oh
crap, if the first file in the root is deleted, all the archive contents
disappear", "mkdir is making an empty file instead of a directory", ... type
stuff). also bolted down more of the semantics of the file-format (empty
directories are in fact empty, root entry 0 is special and serves mostly to
point to the real first root file vs having contents, ...).

other major things:
originally the avl trees were built by using malloc to allocate the nodes. I
have changed this to a big array, and now integers are used to refer to
nodes vs pointers (this was partly by realizing that with pointers, calling
realloc risked potentially somewhat messing things up, but this transition
was still a hassle).

more implemented the caching features (so, now, stuff is actually forced out
to disk when the cache fills...). there is also a limit set to the max
number of files that can be in cache (file caches are managed by another
array, albeit the buffers and similar are still allocated with malloc).
fragmented files are still not yet implemented.

the shell version of the tool presently uses readline (fairly convinient and
gives nice input handling, but not exactly all that portable).

ended up doing some strings matching stuff similar to the host shell.
'*' matches a variable number of characters, but only within a single
directory;
'**' matches a variable number of characters potentially between directories
(this idea came partly from VI, dunno if used elsewhere);
likewise, '?' matches a single character, and '\' works as an escape.

eg:
*.txt will match text files within the current directory;
*/*.txt will match text files within a direct subdirectory;
**.txt will match all text files in this and sub-directories;
...

note that this matching is done by the commands and not the shell, meaning
that not all commands will sanely accept patterns (eg: the ls variants
accept paths, but not presently patterns, ...).

the shell also works by maintaining 2 paths, one is the external working
directory, and the other is the virtual working directory. a lot of commands
(import, export, rm, ...) are done in relation to these paths.

still no big deal though, after the lib is cleaned up a little more, I may
go and integrate this with my main codebase.

if made into a standalone lib though, maybe some people might actually care
for all I know...

or such...

0 new messages