AMF Container Format


Jesse McGatha

Jan 10, 2013, 12:09:50 AM
to st...@googlegroups.com

Container Format – While the spec does allow for a ZIP archive container, it is fairly restrictive in its use, particularly with regard to the ZIP item name, which limits the ability of a producer to include additional payloads in the ZIP archive. Why not allow use of the Open Packaging Conventions (OPC) ZIP-based container format (now an ISO standard) to hold these documents and extract the markup part? It also has the advantage of specifying the pack:// URI scheme for addressing parts within the ZIP container (allowing you to download only the AMF markup out of the OPC container without having to download textures and the like). Have you considered relaxing the container format to allow for this form of ZIP container? Aside from this, the ZIP format itself could use additional specification. The format allows for informative ZIP item headers in addition to the central directory at the end of the file. When it comes to streaming scenarios for large files, it would be helpful to recommend that the ZIP items each have appropriate header information to assist streaming consumers. Personally, I believe it would be advantageous to support only an OPC container, which encapsulates all of the details of interacting with ZIP under the onus of another standard that is responsible for the details of the underlying ZIP archive format.
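
For illustration, a minimal sketch in Python of the selective-extraction idea above, using the standard zipfile module. The part name "3D/model.amf" and the package layout are hypothetical placeholders, not anything defined by the AMF spec or OPC, and a real pack:// consumer addressing parts over HTTP would need more machinery than this.

    import zipfile

    def read_amf_part(package_path, part_name="3D/model.amf"):
        # Open the package and decompress only the named part; textures and
        # other parts in the archive are never read or extracted.
        with zipfile.ZipFile(package_path) as pkg:
            with pkg.open(part_name) as part:
                return part.read().decode("utf-8")

    # xml_text = read_amf_part("example_package.zip")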

Hod Lipson

Jan 11, 2013, 2:49:35 PM
to st...@googlegroups.com

I am open to suggestions here regarding container format. Our goal was to keep it simple and familiar. Anyone can zip an AMF file using a variety of free utilities and free libraries. I am not sure we need to have an expanded capability here yet – implementers already have plenty to contend with. But we could consider extending the container format if you think it would be worth the added complexity.

 

--hod


Jesse McGatha

Jan 14, 2013, 6:00:11 AM
to st...@googlegroups.com

On Friday, January 11, 2013 11:49:35 AM UTC-8, Hod Lipson wrote:

I am open to suggestions here regarding container format. Our goal was to keep it simple and familiar. Anyone can zip an AMF file using a variety of free utilities and free libraries. I am not sure we need to have an expanded capability here yet – implementers already have plenty to contend with. But we could consider extending the container format if you think it would be worth the added complexity.

 

I would make 2 suggestions here:
 
1) Explicitly define the extension for an uncompressed AMF file (e.g. ".amf") and explicitly define the extension for a compressed AMF file (e.g. ".amfc"). Also define MIME types for each; this allows web server software to serve all AMF files with consistent MIME types instead of treating them as generic ZIP documents (a minimal sketch follows after these two suggestions). Sniffing is evil and does not really work on the web. It is much better not to follow the precedent created by STL here.
 
2) The restriction on ZIP item naming is the key problematic section of the spec. It should apply only to AMF files archived in the designated container format. This lifts the restriction from other ZIP-based formats (such as OPC) that want to include AMF content.
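
For illustration, a minimal sketch of suggestion 1 in Python. The type strings and the ".amfc" extension are hypothetical placeholders; no such MIME types are registered or defined by the spec.

    import mimetypes

    # Hypothetical registrations so tooling can tell the two forms apart
    # instead of sniffing or falling back to a generic ZIP type.
    mimetypes.add_type("application/x-amf+xml", ".amf")   # uncompressed AMF (placeholder)
    mimetypes.add_type("application/x-amf+zip", ".amfc")  # compressed AMF (placeholder)

    print(mimetypes.guess_type("rook.amfc"))  # -> ('application/x-amf+zip', None)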
 
-Jesse

Reinoud Zandijk

Jan 14, 2013, 7:42:29 AM
to st...@googlegroups.com
Hi folks,

On Mon, Jan 14, 2013 at 03:00:11AM -0800, Jesse McGatha wrote:
> I would make 2 suggestions here:
>
> 1) Explicitly define the extension for an uncompressed AMF
> file (e.g. ".amf") and explicitly define the extension for a compressed AMF
> file (e.g. ".amfc"). Also define MIME types for each (this allows web
> server software to present all AMF file MIME types consistently without
> treating the files as generic ZIP documents). Sniffing is evil and does not
> really work on the web. It is much better to not follow the precedent
> created by STL here.

I'd opt for leaving out the compression altogether and leaving it to the
interchanger. For Unix systems it'll typically be ".amf.gz" or ".amf.Z" or
whatever is wanted. It's very easy for such systems to stream the data since
it's basically just one compressed data stream.

The usage of `zip' is only sensible if we insist on using multiple files
together in one archive. It also doesn't allow for streaming, though, since the
complete zip needs to be present when we start extracting files from it, even
if it's just the main file. This limits streamability.

I'm not a fan of multiple file inclusion since all required data can be
inlined anyway. It also prevents zips inside zips when it's included into
another XML stream/document. Finally, it also ensures the data is there in one
piece and is always one complete snapshot. (Think of AMF files with missing
texture files or referring to missing sub-AMF files.)

> 2) The restriction on the ZIP item naming is the key problematic section of
> the spec. This should be restricted to only AMF specs that
> are archived with the designated format. This lifts the restriction on
> other ZIP-based formats (such as OPC) that want to include AMF content.

I think I just addressed this above.

With regards,
Reinoud

Jesse McGatha

Jan 16, 2013, 7:44:04 AM
to st...@googlegroups.com, rei...@13thmonkey.org

On Monday, January 14, 2013 4:42:29 AM UTC-8, Reinoud Zandijk wrote:

I'd opt for leaving out the compression altogether and leaving it to the
interchanger. For Unix systems it'll typically be ".amf.gz" or ".amf.Z" or
whatever is wanted. It's very easy for such systems to stream the data since
it's basically just one compressed data stream.

The usage of `zip' is only sensible if we insist on using multiple files
together in one archive. It also doesn't allow for streaming, though, since the
complete zip needs to be present when we start extracting files from it, even
if it's just the main file. This limits streamability.
 
I disagree. ZIP is useful for more than aggregating multiple files; it is also useful for compression. In connected systems, bandwidth is a significant problem, so reduction in overall size is a priority. The spec itself notes in X1 that the absence of compression makes AMF files substantially bigger (12MB to 206MB in one case) and larger than even an equivalent STL file. The bandwidth and network traffic costs to an enterprise (especially one that operates over a WAN) would be tremendously prohibitive, so file size should always be as small as possible. It also creates an ongoing storage cost where archival is required (such as the mandatory archival of medical data). Enterprises pushed Microsoft to abandon its binary print spool file format in favor of XPS for these very reasons, among others. In fact, I would argue the reverse: compression should always be required because of these factors.
 
It is not true that a ZIP archive impedes streaming. It is completely possible to stream a ZIP archive and decompress it as you receive the bytes. This is what happens in the Windows print pipeline in fact. Optimizing for streaming does require careful organization of the file, of course, but an intelligent producer can easily do this. It also requires that proper and accurate ZIP item headers are used, so that the streaming consumer does not have to wait for the central directory (located at the end of the file) to arrive before processing ZIP items, but rather relies on the individual ZIP item headers.
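
For illustration, a minimal sketch in Python of the kind of streaming consumer described above. It assumes the producer wrote real sizes into each local ZIP item header (no data descriptors, no ZIP64; exactly the sort of requirement the spec could recommend) and reads members front to back without ever touching the central directory.

    import struct
    import zlib

    LOCAL_HEADER = struct.Struct("<4s5H3I2H")  # 30-byte ZIP local file header

    def stream_zip_members(stream):
        # Yield (name, data) for each ZIP item as its bytes arrive, front to back.
        while True:
            raw = stream.read(LOCAL_HEADER.size)
            if len(raw) < LOCAL_HEADER.size:
                return
            (sig, _ver, flags, method, _mtime, _mdate,
             _crc, comp_size, _size, name_len, extra_len) = LOCAL_HEADER.unpack(raw)
            if sig != b"PK\x03\x04":   # central directory reached: no more items
                return
            if flags & 0x08:           # sizes deferred to a trailing data descriptor
                raise ValueError("not streaming-friendly: local header lacks sizes")
            name = stream.read(name_len).decode("utf-8")
            stream.read(extra_len)     # skip the extra field
            data = stream.read(comp_size)
            if method == 8:            # deflate
                inflater = zlib.decompressobj(-zlib.MAX_WBITS)
                data = inflater.decompress(data) + inflater.flush()
            elif method != 0:          # 0 = stored (no compression)
                raise ValueError("unsupported compression method")
            yield name, data

    # for name, data in stream_zip_members(open("model.amf.zip", "rb")): ...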
 

I'm not a fan of multiple file inclusion since all required data can be
inlined anyway. It also prevents zips inside zips when it's included into
another XML stream/document. Finally, it also ensures the data is there in one
piece and is always one complete snapshot. (Think of AMF files with missing
texture files or referring to missing sub-AMF files.)
 
Inlining all data is actually a negative for the consumer that does not need that data. Many compliant consumers will not use textures, for example, yet all of them are required to parse that data before they can get to the data they actually want. That's an ongoing performance hit each time the file is read. Not to mention that it actually inflates texture data by encoding it as inlined XML data. ZIPs inside ZIPs can (and should) be prohibited by the spec. There is no difference in completeness guarantee between all data in one XML file and all data in one ZIP file. It is just as possible for a producer to fail to include inlined texture data as it is to fail to include the texture data as a referenced ZIP item. As long as the spec calls out the required failure behavior if data is missing, this is no different from a Word .docx (zip archive with multiple items) or an OpenDocument file (ZIP archive with multiple items). The evidence would suggest that data going missing in ZIP archives is a non-existent problem in the real world.
 
-Jesse

Hod Lipson

Jan 16, 2013, 11:31:06 AM
to st...@googlegroups.com, rei...@13thmonkey.org

Streamability is not an issue: AMF files can’t be streamed anyway because they are non-linear – you need the last vertex information before you can print anything.

 

The reason for including compression in the format is that it would be important for a producer to produce a compressed file directly, rather than produce an uncompressed file and then compress it later. Otherwise, the intermediate file could be gigabytes.

 

--hod

 

 

Hod Lipson

Associate Prof. of Mechanical & Aerospace Engineering and Computing & Information Science

Cornell University, 242 Upson Hall, Ithaca NY 14853, USA

Office: (607) 255 1686 Lab: (607) 254 8940 Fax: (607) 255 1222

Email: Hod.L...@cornell.edu

Web: http://www.mae.cornell.edu/lipson

Administrative Assistant:  Craig Ryan  cd...@cornell.edu, (607) 255-6981, Upson 258

Calendar: http://www.mae.cornell.edu/lipson/calendar.htm


Reinoud Zandijk

Jan 18, 2013, 4:43:01 PM
to st...@googlegroups.com
Dear Jesse,

On Wed, Jan 16, 2013 at 04:44:04AM -0800, Jesse McGatha wrote:
> On Monday, January 14, 2013 4:42:29 AM UTC-8, Reinoud Zandijk wrote:
> > I'd opt for leaving out the compression altogether and leaving it to the
> > interchanger. For Unix systems it'll typically be ".amf.gz" or ".amf.Z" or
> > whatever is wanted. It's very easy for such systems to stream the data
> > since it's basically just one compressed data stream.
> >
> > The usage of `zip' is only sensible if we insist on using multiple files
> > together in one archive. It also doesn't allow for streaming, though, since
> > the complete zip needs to be present when we start extracting files from
> > it, even if it's just the main file. This limits streamability.
>
> I disagree. ZIP is useful for more than aggregating multiple files; it is
> also useful for compression. In connected systems, bandwidth is a
> significant problem, so reduction in overall size is a priority. The spec
> itself notes in X1 that the absence of compression makes AMF files
> substantially bigger (12MB to 206MB in one case) and

Well, I am arguing that the compression method can and should be left out when
we stick to a single file. The filename.amf.gz I referred to is a gzipped file,
which is basically a compressed stream. It can be decompressed on the fly when
needed, without an intermediate extraction first. gzip is also free software,
unlike zip, which is (or used to be) a proprietary format.

Even when we're considering multiple files, there is no reason to dictate zip;
we could easily use tar, but that's another story.

> fact, I would argue the reverse: Compression should always be required
> because of these factors.
>
> It is not true that a ZIP archive impedes streaming. It is completely
> possible to stream a ZIP archive and decompress it as you receive the bytes.
> This is what happens in the Windows print pipeline in fact. Optimizing for
> streaming does require careful organization of the file, of course, but an
> intelligent producer can easily do this. It also requires that proper and
> accurate ZIP item headers are used, so that the streaming consumer does not
> have to wait for the central directory (located at the end of the file) to
> arrive before processing ZIP items, but rather relies on the individual ZIP
> item headers.

Yes, and that won't happen... the fact that it is `possible' only means that
it *can* be done, but reality dictates that in most cases this won't happen.

> > I'm not a fan of multiple file inclusion since all required data can be
> > inlined anyway. It also prevents zips inside zips when it's included into
> > another XML stream/document. Finally, it also ensures the data is there in
> > one piece and is always one complete snapshot. (Think of AMF files with
> > missing texture files or referring to missing sub-AMF files.)
> >
>
> Inlining all data is actually a negative for the consumer that does not need
> that data. Many compliant consumers will not use textures, for example, yet
> all of them are required to parse that data before they can get to the data
> they actually want. That's an ongoing performance hit each time the file is
> read. Not to mention that it actually inflates texture data by compressing

Sure, even when files are inlined they can be forgotten, but that's a program
bug. Inlined files can be written out as one stream by the producing program,
whereas with a zip archive this is doubtful, since the producer might need to
update references in the file before it can stream it out.

> required failure behavior if data is missing, this is no different from a
> Word .docx (zip archive with multiple items) or an OpenDocument file (ZIP
> archive with multiple items). The evidence would suggest that data going
> missing in ZIP archives is a non-existent problem in the real world.

I am not arguing that files might get lost inside zip archives, but that the
archives could be opened and files overwritten or modified or whatever. This
is also a problem with the docx/openoffice files. An AMF file can be opened in
a text editor for sure, but it's harder to replace an image or to do whatever.

Cheers,
Reinoud

rei...@13thmonkey.org

Jan 18, 2013, 5:10:21 PM
to st...@googlegroups.com
Dear Hod,

On Wed, Jan 16, 2013 at 04:31:06PM +0000, Hod Lipson wrote:
> Streamability is not an issue: AMF files can't be streamed anyway because
> they are non-linear - you need the last vertex information before you can
> print anything.

Sure, but you can read the vertices while they come in ;) you don't have to
have received the entire file before you can start parsing :-) That's also
streaming!
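
A minimal sketch in Python of what Reinoud describes: consuming vertices incrementally with an event-driven parser while the document is still arriving. The element names assume AMF's <vertex>/<coordinates>/<x>,<y>,<z> structure without namespaces and are meant as an illustration only.

    import xml.etree.ElementTree as ET

    def read_vertices(stream):
        # Yield (x, y, z) as each <vertex> element completes, without waiting
        # for the rest of the file to arrive.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if elem.tag == "vertex":
                coords = elem.find("coordinates")
                if coords is not None:
                    yield tuple(float(coords.findtext(axis)) for axis in ("x", "y", "z"))
                elem.clear()  # discard the processed subtree to bound memory use

    # for x, y, z in read_vertices(open("model.amf", "rb")): ...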

> The reason for including compression in the format is that it would be
> important for a producer to produce a compressed file directly, rather than
> produce an uncompressed file and then compress it later. Otherwise, the
> intermediate file could be gigabytes.

I have to disagree on this. Sure, compression is needed in some cases, and
producing compressed files directly might be needed to avoid very big
intermediate files; that we agree on. What I don't agree with is the explicit
requirement to use ZIP, with all its multi-file functionality, as a
compression format.

When using libz, for example, one can open a compressed file directly just like
a normal file, even on a streaming stdout, and just write the uncompressed data
to it; it will all be compressed on the fly and sent/written out. Its
compression is on par with or better than zip's. It is a single-stream compressor.
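
A minimal sketch in Python of the single-stream approach described above, using the gzip module rather than libz directly: the producer writes its XML straight into the compressed stream, so no uncompressed intermediate file ever exists. The fragment generator passed in is a stand-in for a real exporter.

    import gzip

    def write_amf_gz(path, xml_fragments):
        # xml_fragments: any iterable of str pieces of the AMF document.
        with gzip.open(path, "wt", encoding="utf-8") as out:
            for fragment in xml_fragments:
                out.write(fragment)   # compressed on the fly as it is written

    # write_amf_gz("model.amf.gz", produce_amf_xml())   # produce_amf_xml is hypothetical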

Not dictating the compression format also leaves it to the interfacing parties
to communicate the details. I don't think a compression format should be part
of a standard, since compression formats come and go. What might be hot and new
today is old and obsolete later.

Regardless of the chosen format, libz is free software and is freely available,
if not standard, on all Unix systems, while zip is very Windows-specific. More
importantly, it used to be, and probably still is, proprietary; sure, there are
readers and writers around, but it's hardly used outside Windows systems.

With regards,
Reinoud

Jacob Barhak

Jan 18, 2013, 11:42:13 PM
to st...@googlegroups.com
Hi Hod,

About streaming: Reinoud has a good point about being able to process information out of order. Who knows, it may become useful under certain circumstances, although you are right that it is probably not useful at first.

Nevertheless, if you think of pipelining rather than streaming, then things start making more sense. Consider the Linux approach of having small programs that do small tasks and work together through a pipeline. If you think of the standard this way, then compression can be left out of the standard: the standard produces the text stream, the stream is written to a pipeline input rather than an actual file, and the pipeline output is compressed using a method of choice. Decoupling has several advantages:
1. It keeps the standard simple and dedicated to representing the object as text.
2. It leaves compression preferences to the user.
3. It lets every program do what it does best without replication of effort, and AMF coders need not worry about compression implementation.

The disadvantages will come from:
1. Users who do not compress information and send huge files.
2. Printers that do not recognize the compression format used by a user.

One way to avoid this is to rely on the file name as part of the standard and allow several common compression formats that printers should support at the pipeline level. You can even add error-correcting codes or encryption in the middle of the pipeline if you are afraid of transmission problems or file corruption. There is no end to what you can do in the middle of a pipeline. As long as the extensions are standardized in the file name, you can rely on other existing tools.
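
A minimal sketch in Python of the decoupling Jacob suggests: the producer only emits text to standard output, and whatever comes next in the pipeline decides how, or whether, to compress. The trivial emit_amf_xml generator and the script name in the usage comment are stand-ins.

    import sys

    def emit_amf_xml():
        # Stand-in producer; a real exporter would stream out the full mesh here.
        yield '<?xml version="1.0" encoding="utf-8"?>\n'
        yield '<amf unit="millimeter">\n'
        yield '</amf>\n'

    def main():
        for fragment in emit_amf_xml():
            sys.stdout.write(fragment)

    if __name__ == "__main__":
        main()

    # Example shell usage:  python make_amf.py | gzip > model.amf.gz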

And as for zip and multiple files in an archive: I have found it useful to add a Readme file alongside another file in an archive. Some people may add a license, some other documentation, or even the generating CAD file. So if you choose the pipeline approach with file extensions, make sure that the file that goes to the printer is the AMF file whose name matches the archive name before the zip/tar/arj/other extension. This way the printer will know what to process for printing and what is documentation.

In short, I recommend addressing file names in the standard. This may even help interoperability between operating systems if you restrict some characters and lower/upper case. Yet this may be overdoing things and out of scope – you decide. Nevertheless, formalizing extensions may be helpful.

One question to the implementors: is there any way that processing can be parallelized using multi-core machines or clusters? During AMF creation? During AMF extraction? I understand that the bottleneck is the actual streaming and printing. Yet think about machines with virtually endless memory, or a stream that is processed by an elastic service in the cloud and sent to a machine that has just a controller and a buffer for printing. Does this make sense at all? Does it help reduce costs/time?

I hope you find this to the point.

Jacob


Sent from my iPhone

Jesse McGatha

Jan 19, 2013, 5:40:45 PM
to st...@googlegroups.com

On Friday, January 18, 2013 8:42:13 PM UTC-8, Jacob Barhak wrote:

Nevertheless, if you think of pipelining rather than streaming, then things start making more sense. Consider the Linux approach of having small programs that do small tasks and work together through a pipeline. If you think of the standard this way, then compression can be left out of the standard: the standard produces the text stream, the stream is written to a pipeline input rather than an actual file, and the pipeline output is compressed using a method of choice. Decoupling has several advantages:
1. It keeps the standard simple and dedicated to representing the object as text.
2. It leaves compression preferences to the user.
3. It lets every program do what it does best without replication of effort, and AMF coders need not worry about compression implementation.
 
You could just as easily argue that the job of the first component in the pipeline is to decompress and the job of the last component is to recompress. By fragmenting compression options, you end up with much the same problem you have today with dozens of 3D modeling programs that have mismatched import/export options. The benefit of having a standard is that there is only one form of implementation required and everyone implements it symmetrically. Leaving compression preferences to the user is a recipe for disaster in terms of the file format's diminished utility as a standard and its ubiquity of adoption. There are many standard libraries for decompressing/recompressing ZIP that other programs can leverage.
 
-Jesse McGatha