Alembic 1.0.4 Released

Lucas Miller

Jan 23, 2012, 3:11:26 PM
to alembic-...@googlegroups.com, alembic-d...@googlegroups.com
Hello, Alembic users, thanks to your feedback we have another collection of
API additions and small but useful bug fixes.

Some highlights:

API:

 - Now supports raw reads, and reading as a different precision, for array properties
 via a new method, getAs(). Currently strings, wstrings, and float16_t
 cannot be read as any other type.

 - Added an optional velocity property to polymeshes, subds, nurbs, and curves.
 Velocity is considered to be in units per second.

 - Added a temporary workaround for an HDF5 bug where parts of the file could become
 corrupt. This workaround always stores the HDF5 links in dense storage. This can lead to
 larger file sizes when you have a lot of IObjects.

 - Improved the performance on partial hierarchy traversal by deferring opening of an
 object's HDF5 group until truly needed.
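
As a rough illustration of what "reading as a different precision" means, here is a minimal Python sketch using only the standard library. It is not the Alembic API: `read_as` is a hypothetical helper, and the conversion is done in plain Python rather than through getAs().

```python
from array import array


def read_as(stored: array, typecode: str) -> array:
    """Return a copy of `stored` converted to another numeric typecode.

    Hypothetical stand-in for the idea of reading at a different
    precision; this is not an Alembic function.
    """
    # Going through a list converts each element to the target type.
    return array(typecode, list(stored))


# Samples stored as 32-bit floats ('f'), read back as 64-bit floats ('d').
stored = array("f", [0.5, 1.25, -2.0])
as_double = read_as(stored, "d")
```

The values chosen are exactly representable in both precisions, so the round trip is lossless here; in general, narrowing conversions lose precision.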

Renderman Procedural:

 - Added support for V3dGeomParam, P3dGeomParam, and BoolGeomParam

Houdini SOP:

 - Dramatically increased the performance under Houdini 12 and pleasantly increased
 performance under 11.1.

Maya AbcImport:

 - Don't interpolate when the time values closely match the ceiling value.

 - Don't create a new color set every time an animated color set is evaluated.

 - Interpolating non indexed color values would sometimes cause a crash.

We welcome your comments on the discussion list.

Helge Mathee

Jan 24, 2012, 3:39:53 AM
to alembic-d...@googlegroups.com, Lucas Miller, alembic-...@googlegroups.com
That's fantastic. I will use the velocities to implement support for
dynamic topology of polygon meshes.
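
A minimal sketch of why per-point velocities help when topology changes: if two neighbouring samples have different point counts they cannot be interpolated, but a single sample can be extrapolated along its velocities. The function name, the 24 fps rate, and the sample values are assumptions for illustration; only "units per second" comes from the announcement.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def extrapolate(points: List[Vec3], velocities: List[Vec3], dt: float) -> List[Vec3]:
    """Move each point along its velocity (units per second) for dt seconds."""
    if len(points) != len(velocities):
        raise ValueError("need exactly one velocity per point")
    return [
        (p[0] + v[0] * dt, p[1] + v[1] * dt, p[2] + v[2] * dt)
        for p, v in zip(points, velocities)
    ]


points = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
velocities = [(24.0, 0.0, 0.0), (0.0, -12.0, 0.0)]  # units per second
half_frame = 0.5 / 24.0  # half a frame, assuming 24 fps
moved = extrapolate(points, velocities, half_frame)
```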

Thanks Lucas!

-H

> --
> You received this message because you are subscribed to the Google
> Groups "alembic-discussion" group.
> To post to this group, send email to alembic-d...@googlegroups.com
> To unsubscribe from this group, send email to
> alembic-discuss...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/alembic-discussion?hl=en
>
> For RSS or Atom feeds related to Alembic, see:
>
> http://groups.google.com/group/alembic-dev/feeds
>
> http://groups.google.com/group/alembic-discussion/feeds

evolutionary theory

Feb 4, 2012, 11:06:06 PM
to alembic-discussion
Hi everyone!

After 5 solid days of work I've just successfully compiled Alembic
1.0.4 with Windows VC++ 2010 in Windows 7 64bit, including glew, hdf5,
zlib ( a new 64bit cleanly-compiling Windows version I arranged with
Madler ), boost, ilmbase, openexr and mayasdk, 3Delight and opengl
support, all statically linked in Release mode thus far.

Everything passes all the unit tests, and thus far, all appears to
work perfectly, including the 3Delight procedural, which can now also
be compiled, without change, for prman or 3Delight.

There was a LOT of work to get this running ( more than I care to talk
about ) and a significant amount of code that had to be changed to get
it working on windows, all of which retains the Linux compatibility
via some extra pre-processor macros ( but not THAT many of them ).
I've also fixed the std::min / std::max issue that usually plagues
cross-platform compiles of Alembic, allowing a pretty clean compile
( with fewer warnings now too ).

I'm currently organizing with my employer about how I can sign your
agreement and commit this all back for others to make use of, cause it
also includes some fixes for Linux as well ( there were a few minor
code errors ).
It now compiles absolutely everything, without error, including all
tests, and all of them seem to run OK.

I know a LOT of people ( that I know at least ) are hanging out for
these build scripts and fixes, so how can I get the ball rolling on
this?

regards,

Luke Emrose - Flying Bark Productions
Rendering and Lighting TD

Luke Emrose

Feb 5, 2012, 12:14:24 AM
to alembic-discussion
Also as another note, it compiles flawlessly against hdf5-1.8.8 as well.
cheers,
Luke


bazuka

Feb 6, 2012, 3:23:32 PM
to alembic-discussion
Hi people, first I want to say that you made a big step in the CG world,

and I have a question. I'm a Naiad user (http://www.exoticmatter.com/)
and, as we all know, fluid files can be really big, >1 GB per frame.

I know that Alembic stores all data in one file, but is it
possible to add an option to write a sequence of Alembic files, because of the
fluid files?

I'm just not sure how big an Alembic file could be if I store all frames
inside one file, so is my request logical?

Best Regards

Lucas Miller

Feb 6, 2012, 4:49:23 PM
to alembic-d...@googlegroups.com
Storing all the data in one file allows Alembic to share identical data automatically via HDF5 hard links. With a file per frame, that sharing cannot happen across frames, so you are more likely to end up storing duplicate data.

Depending on your read and write patterns, you may not want to use Alembic at all (exceptionally large and sparse data sets, for example).

Instead you may want to use Field3D:

http://opensource.imageworks.com/?p=field3d
https://sites.google.com/site/field3d/

bazuka

Feb 7, 2012, 6:19:13 AM
to alembic-discussion
Hi Lucas,

I understand that, but I wasn't talking about smoke, fire and other
voxel data.

I was talking more about meshes :)

cheers

On Feb 6, 10:49 pm, Lucas Miller <miller.lu...@gmail.com> wrote:
> Storing the data all in one file allows Alembic to automatically share the
> data with HDF5 hard links, storing a file per frame makes it less likely
> that you'll be storing duplicate data.
>
> Depending on your read and write patterns, you may not want to use Alembic
> at all. (exceptionally large and sparse data sets  for example)
>
> Instead you may want to use Field3D:
>
> http://opensource.imageworks.com/?p=field3d
> https://sites.google.com/site/field3d/

Steve LaVietes

Feb 7, 2012, 1:24:27 PM
to alembic-d...@googlegroups.com
As Lucas mentioned, you lose many of Alembic's advantages when splitting the data into individual frames. Since the library itself provides no mechanism for reading across multiple archives, you'd place that burden on the reader plug-ins. (This is no different than something like .bgeo.)

Depending on your needs, some of the reference reader plug-ins could work mostly as-is with frame sequences -- assuming you control the filename per-frame yourself.

The renderer plugins (and Katana) expect to find all samples relevant to the current shutter window within a single archive. Unless each archive in the sequence had enough samples for motion blur for the current frame, you wouldn't get motion blur. The Houdini reader should work -- although you'll lose automatic interpolation of transforms via the alembic_xform OTL.
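
The shutter-window requirement can be sketched as a simple check on the sample times held by one archive. This is an illustration only: `covers_shutter`, the 24 fps rate, and the half-frame shutter are assumptions, not part of any Alembic reader.

```python
from typing import Sequence


def covers_shutter(sample_times: Sequence[float],
                   shutter_open: float,
                   shutter_close: float) -> bool:
    """True if the archive's samples bracket the whole shutter interval."""
    if not sample_times:
        return False
    return min(sample_times) <= shutter_open and max(sample_times) >= shutter_close


fps = 24.0                           # assumed frame rate
frame = 10
shutter_open = frame / fps
shutter_close = (frame + 0.5) / fps  # assumed half-frame shutter

# A per-frame archive holding only its own frame cannot be motion blurred.
one_sample_ok = covers_shutter([frame / fps], shutter_open, shutter_close)

# An archive that also carries the next frame's sample can.
two_samples_ok = covers_shutter([frame / fps, (frame + 1) / fps],
                                shutter_open, shutter_close)
```

Under these assumptions a one-sample-per-file sequence fails the check, which is why each file in a sequence would need to carry the neighbouring samples as well.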

In any case, it'd be a non-standard setup. In what applications would you want to read this data?

-stevel

bazuka

Feb 7, 2012, 5:13:22 PM
to alembic-discussion
Hi Stevel,

thanks for your time, and Lucas's too.

I just need to split them. I don't know how the Alembic format will handle
1 GB per frame; I didn't have the chance to try, because loading something big
inside Maya may take a lot of time and memory.
I liked the reading speed of the Alembic format, to be honest.

As I already said, my pipeline is based on Maya and V-Ray. I use the Naiad
bgeo exporter for meshes, then I convert them into vrproxy and that's
it, but I would like to skip some steps and speed up the workflow.

cheers



Nicholas Yue

Feb 7, 2012, 6:02:11 PM
to alembic-d...@googlegroups.com
On 8/02/2012 9:13 AM, bazuka wrote:
> Hi Stevel,
>
> thx for ur time and Lucas's too,
>
> i just need to split them, i dont know how Alembic format will handle
> 1gb per frame,
Hi,

Alembic is based on the HDF5 file format.

http://www.hdfgroup.org/why_hdf/

It has been documented to handle multi-terabyte data sets, so 1 GB
per frame should be fine.

> didnt have the chance to try, coz loading something big
> inside Maya may take a lots of time and memory,
> i liked the reading speed of Alembic format to be honest.

Speed is one of the key strengths of HDF5.

Enjoy Alembic, it's awesome !

Regards

--
Nicholas Yue
Graphics - RenderMan, Houdini, Visualization, OpenGL, HDF5
Custom Dev - C++ porting, OSX, Linux, Windows
http://au.linkedin.com/in/nicholasyue

Moritz Moeller

Feb 8, 2012, 5:00:29 AM
to alembic-d...@googlegroups.com
On 02/07/2012 07:24 PM, Steve LaVietes wrote:
> As Lucas mentioned, you lose many of Alembic's advantages when splitting
> the data into individual frames. Since the library itself provides no
> mechanism for reading across multiple archives, you'd place that burden
> on the reader plug-ins. (This is no different than something like .bgeo.)

What about pipeline considerations?

Commonly one frame is rendered on one blade. Or maybe even split between
blades, if very complex.
I can't cache the frame's dependencies on the blade any more if the file
is multi-terabyte as it contains all the samples. Too much network i/o
and likely too much space taken on the blade's cache if the resp. shop
runs more than one show on the same farm (quite common, I'd say).

My blade is forced to read the resp. parts from the server, /every/ time
it renders a new version of this frame.

Or I must write new code that caches the frame's ABC dependencies out on
the blade (e.g. as new ABCs, containing only the data relevant to the
frame's shutter).

All doesn't sound very clever to me, unless network- and server disk i/o
are non-issues (which they never were, in any place I worked).

So what am I missing here?

.mm

Steve LaVietes

Feb 8, 2012, 9:56:07 AM
to alembic-d...@googlegroups.com, alembic-d...@googlegroups.com
That might be a useful tool for us (or someone) to write for inclusion along with the library.

For formats which separate their samples across multiple files, your local caching tools are essentially doing the same thing. They act with knowledge of the file naming conventions to choose which samples are relevant to your frame. That's admittedly simpler but conceptually equivalent.

In our case, we don't locally cache geometry per render blade. We've found the access patterns for geometry data within an hdf5 container to be suitable for random access over the network. Alembic (and its hdf5-based internal predecessor) were designed with that as a goal.

-stevel

Peter Shinners

Feb 8, 2012, 10:39:02 AM
to alembic-d...@googlegroups.com

With Alembic you only pay for the IO you actually use. If you have a 4TB
alembic file and want to read the bounding box data for a few objects on
a single frame, you are only going to transfer a few KB worth of data
over the network (if not less).
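
The "pay only for the I/O you use" idea can be sketched with a plain binary file of fixed-size records, read by seeking. This is not how HDF5 lays out data; the record format and helper names are assumptions chosen to keep the example self-contained.

```python
import os
import struct
import tempfile

# One bounding box per record: min x, y, z then max x, y, z as doubles.
RECORD = struct.Struct("<6d")


def write_boxes(path, boxes):
    """Write all bounding boxes back to back."""
    with open(path, "wb") as f:
        for box in boxes:
            f.write(RECORD.pack(*box))


def read_box(path, index):
    """Seek to one record and request only its bytes.

    The program asks for RECORD.size bytes; buffering in Python or the
    OS may fetch somewhat more, but never the whole file.
    """
    with open(path, "rb") as f:
        f.seek(index * RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "boxes.bin")
    boxes = [(float(i), 0.0, 0.0, float(i) + 1.0, 1.0, 1.0) for i in range(1000)]
    write_boxes(path, boxes)
    file_size = os.path.getsize(path)
    box_500 = read_box(path, 500)
```

Here the file holds 1000 records, yet reading box 500 touches only one record's worth of data.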

Compare this to something like a tiled EXR image, or a renderman TEX
image with mipmaps. Even if you have a 12k texture on a small object,
the renderer will only read the smallest levels of mipmaps over the network.

Moritz Moeller

Feb 8, 2012, 12:26:46 PM
to alembic-d...@googlegroups.com
On 02/08/2012 04:39 PM, Peter Shinners wrote:
> With Alembic you only pay for the IO you actually use. If you have a 4TB
> alembic file and want to read the bounding box data for a few objects on
> a single frame, you are only going to transfer a few KB worth of data
> over the network (if not less).

This detail does not escape me. I was one of the early adopters of HDF5
in VFX. Following Colin Doncaster's advice when I joined Rising Sun
Pictures in 2005, I based their fur baking pipeline on it.
So I have some background with HDF5. :)

Nevertheless, if my fluid data for a single sample is 1GB and the shot
goes through 50 revisions (I've seen shots that went to over 150
revisions) I will have generated 50GB of i/o *for a single blade* with
no gain other than an abstraction change.
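
The arithmetic behind this can be written out as a small sketch. It assumes, as the argument above does, that the sample itself is unchanged between revisions and that a local cache never evicts it; the function name is hypothetical.

```python
def server_read_gb(sample_gb: float, revisions: int, cached: bool) -> float:
    """GB one blade pulls from the server for one frame across all revisions."""
    if cached:
        return sample_gb          # fetched once, then served from the local cache
    return sample_gb * revisions  # fetched again for every revision


uncached_50 = server_read_gb(1.0, 50, cached=False)
uncached_150 = server_read_gb(1.0, 150, cached=False)
cached_50 = server_read_gb(1.0, 50, cached=True)
```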

Mind you, a file system is nothing but an abstraction of the
organization of data on disk.
An ABC file is another layer of abstraction on top. So you added one
layer of abstraction and made the job of the pipeline people harder.

That's why I asked what I am missing here. For me the idea of storing
all samples in a single file is, frankly, bollocks.

> Compare this to something like a tiled EXR image, or a renderman TEX
> image with mipmaps. Even if you have a 12k texture on a small object,
> the renderer will only read the smallest levels of mipmaps over the
> network.

Yes, and nevertheless people have been caching these for years. I
replicated 3Delight's built-in network cache for PRMan while at
DNeg, for that very reason: https://code.google.com/p/jupiterfilecache/

There is some musing about the reasoning behind this
https://code.google.com/p/jupiterfilecache/wiki/Shadeop

3Delight has done the opposite of what you suggest to do with ABC and
has moved individual mip levels into single files organized into a
folder called a 'directory texture'. The renderer deals with these
automatically.

Why?

From the data pov, there is zero difference between having a single
texture that contains 10 mip levels or 10 textures that contain one
each, in a directory.

But from a pipeline perspective, the difference is huge. Because all
operating systems have hundreds of built-in tools that deal with this
thing called 'files'.
But they have zero tools to deal with some unknown data in a proprietary
container like HDF5 or a texture file.

The fact that Steve LaVietes tells me a tool that extracts parts of ABC
files (into new ABC files) to cache them locally would be a "good idea"
seems to further second that the idea of having an ABC file contain all
samples is based on either poor or at least lopsided understanding of
data access patterns on various shows at various places.

At DNeg texture access was giving us trouble on the servers. Guess what:
it was only 12 odd textures on a single creature.
But that creature was instanced some 5k times in a single frame. The
server disks couldn't cope with the amount of random access (bandwidth
and network latency were /not/ an issue).
There was more but basically it came down to: avoid random access of
data living on the server.
Caching on blades is the easiest solution.
And the easiest implementation is at the file level because it does not
require to write new tools, just some minor glue like e.g. the above
Google code project.

That is why I take the liberty of questioning if storing all samples in
a single file makes any sense. To me it does not, quite the opposite.

Maybe buying new hardware every time such issues arise is an option
for ILM. For the vast majority of shops it is not, I can assure you of that.


.mm

Steve LaVietes

Feb 8, 2012, 12:43:16 PM
to alembic-d...@googlegroups.com
I'm reluctant to wade into this further -- as I agree with mostly
everything you've said on the surface. You know your own pipelines
best and I don't presume to try to convince you otherwise.

Yes, there are cases in which long sequences of heavy samples could be
onerous to manage as a single file in pipelines which cache ALL render
inputs locally.

Yes, Alembic ships with an abcstitcher utility. Why not also include
an abcsplitter? That in itself is not an argument against storing
common sequences of geometry data in a single file.

Comparison to texture data is misleading though.

The access patterns for texture (and texture-like) data at render-time
differ considerably from that of regular scene geometry. Texture data
is randomly accessed (paging in and out of the renderer's texture
cache) during shading. Shading vastly outweighs geometry reads (from a
baked format) in a typical render. (I digress. Since Alembic isn't for
texture data, it's not particularly relevant to the discussion
thread.)

No one denies you your liberty to question this design decision based
on your own experiences. But please don't assume that our decisions
were made with some mythical ability to "buy new hardware all the
time."

-stevel

Jonathan Gibbs

Feb 10, 2012, 8:54:33 PM
to alembic-d...@googlegroups.com
This is a good topic, and we've been testing along some similar lines
too. It's all quite a black art and we're finding that which is better
is quite hard to pin down.


On Wed, Feb 8, 2012 at 9:26 AM, Moritz Moeller <virtu...@gmail.com> wrote:
> Yes, and nevertheless people have been caching these for years. I
> replicated 3Delight's built-in network cache for PRMan while at
> DNeg, for that very reason: https://code.google.com/p/jupiterfilecache/

We've been running a very very similar system here since before my
time. And it's clear that such a system and Alembic are not great
bed-fellows. There was a time when turning off our equiv system would
have been really bad, but today we're not so sure. We have a lot more
caching in the network itself than we used to.

One thing I've learned is that all assumptions about network and
servers and such have to really be re-evaluated often!

I love Alembic's structure, and we'll soon see how that plays out on
*our* network. (Which is certainly designed differently than yours.)

> 3Delight has done the opposite of what you suggest to do with ABC and
> has moved individual mip levels into single files organized into a
> folder called a 'directory texture'. The renderer deals with these
> automatically.

I do know that our systems guys would be mad if we did this. They are
constantly telling us to have fewer larger files rather than more
smaller files. It's counter intuitive to me, who always used to think
about bytes transferred as a measure of performance. But in our case
all the NFS ops for all the small files were really hurting things. We
were probably at the extreme of the many-small-files spectrum, but
even we didn't break out mip-levels.


> From the data pov, there is zero difference between having a single
> texture that contains 10 mip levels or 10 textures that contain one
> each, in a directory.

This isn't true!

> But from a pipeline perspective, the difference is huge. Because all
> operating systems have hundreds of built-in tools that deal with this
> thing called 'files'.

I do agree. It feels like we're being dragged painfully into some kind
of post-file-era. It's scary and fun!

--jono

Moritz Moeller

Feb 10, 2012, 11:30:51 PM
to alembic-d...@googlegroups.com
On 02/11/2012 02:54 AM, Jonathan Gibbs wrote:
> I love Alembic's structure, and we'll soon see how that plays out on
> *our* network. (Which is certainly designed differently than yours.)

I have no idea how DNeg's network looks nowadays. I only worked there
for a year and this was just one example. The scenario I described was
encountered numerous times during my career.

I left production 3 years ago and now write software for people who do
the job I did for 15 years, before.
So my pov may be outdated (though I doubt it, really). :]

>> 3Delight has done the opposite of what you suggest to do with ABC and
>> has moved individual mip levels into single files organized into a
>> folder called a 'directory texture'. The renderer deals with these
>> automatically.
>
> I do know that our systems guys would be mad if we did this. They are
> constantly telling us to have fewer larger files rather than more
> smaller files.

Yes, exactly. This only makes sense when you think about caching. Which
your system guys probably don't. Because these only make sense when you
use client-side caching, of course. No one would use directory textures
if they didn't use caching. Because the texture would be accessed on the
server so the amount of data would be the tiles requested. Every time
the frame renders. And that would likely be (much) less than the
texture's size. Every time the frame renders. So it would be exactly
what your system guys do /not/ like.
However, when you cache, the resp. mip level texture gets transferred to
the client /once/ (assuming the client's cache does not overflow). Not
every time a frame renders.

And as far as Alembic goes, its design is counter to what your system guys
tell you, too. Because the ABCs do /not/ get transferred to the client if
you don't cache, only (smallish) parts of them do: those samples that a
frame requests.
Exactly what you don't want because that causes the access patterns that
server disks have trouble with.

Small files or big files of which small chunks are read are the exact
same thing, from the hardware's perspective.
This was my point exactly why storing multiple samples in a single file
is a bad idea.

It leaves you three choices:
1. hammer the server disks with a 'small file access pattern' (because
HDF5 is so good at that) [bad for the server disks]
2. transfer the ABC to the client so you get the best from the network
and disks which behave well when huge, continuous chunks of data are
transferred [bad because most of the data doesn't need to be transferred
for starters, so you waste a shitload of network bandwidth and server
disk access time]
3. write some tool that extracts the relevant sample data and transfers
it to the client [best, but requires writing tools for something that
could be done at the file level with tools that are part of the OS already]
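
File-level caching of the kind described here (a per-frame file copied to the blade once and reused) can be sketched in a few lines. This is an illustration under assumptions, not the jupiterfilecache code: `cached_path`, the directory names, and the size/mtime freshness test are all hypothetical choices.

```python
import os
import shutil
import tempfile


def cached_path(server_path: str, cache_dir: str) -> str:
    """Return a local copy of server_path, copying only when missing or stale."""
    os.makedirs(cache_dir, exist_ok=True)
    local_path = os.path.join(cache_dir, os.path.basename(server_path))
    src = os.stat(server_path)
    try:
        dst = os.stat(local_path)
        # Treat the copy as fresh if the size matches and it is not older.
        fresh = dst.st_size == src.st_size and dst.st_mtime >= src.st_mtime
    except FileNotFoundError:
        fresh = False
    if not fresh:
        shutil.copy2(server_path, local_path)  # copy2 also copies the mtime
    return local_path


with tempfile.TemporaryDirectory() as tmp:
    server_dir = os.path.join(tmp, "server")
    os.makedirs(server_dir)
    src_file = os.path.join(server_dir, "mesh.0101.abc")
    with open(src_file, "wb") as f:
        f.write(b"frame 101 data")

    cache_dir = os.path.join(tmp, "blade_cache")
    local = cached_path(src_file, cache_dir)        # first call copies
    local_again = cached_path(src_file, cache_dir)  # later calls reuse the copy
    with open(local, "rb") as f:
        cached_bytes = f.read()
```

Keying the cache on the base name alone is a simplification; a real cache would need to distinguish files with the same name in different directories and to evict old entries.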

>> From the data pov, there is zero difference between having a single
>> texture that contains 10 mip levels or 10 textures that contain one
>> each, in a directory.
>
> This isn't true!

Can you elaborate why you think this isn't true?

>> But from a pipeline perspective, the difference is huge. Because all
>> operating systems have hundreds of built-in tools that deal with this
>> thing called 'files'.
>
> I do agree. It feels like we're being dragged painfully into some kind
> of post-file-era. It's scary and fun!

Fun? When I worked in Aus we sometimes had more than one beer in the
lunch break. Made even the most weird TD work 'fun' afterwards. :)

But seriously: can you think of any single good reason why one should
store all these samples in a single file?
I can't come up with any so far. Hence my original question what I may
be missing. So far I don't feel it has been answered at all.


.mm

Steve LaVietes

Feb 10, 2012, 10:42:27 PM
to alembic-d...@googlegroups.com
This assumes that it happens as you'd expect with Alembic archives in practice.

We access our Alembic (and Alembic-like) geometry from relatively inexpensive uncached network drives and it's never been the problem as our core count has dramatically increased. (Texture and f3d data is another story altogether).

-stevel

Doug Epps

Feb 10, 2012, 10:43:24 PM
to alembic-d...@googlegroups.com

On Feb 10, 2012, at 8:30 PM, Moritz Moeller wrote:

> But seriously: can you think of any single good reason why one should
> store all these samples in a single file?
> I can't come up with any so far. Hence my original question what I may
> be missing. So far I don't feel it has been answered at all.


Naive question, but aren't "many files" (e.g. one per frame) in a directory a bad access pattern from a server's perspective? I think I've run across many a server that stores inodes in a linear list. So finding frame N out of 1000s isn't "fun".

Whatever happened to cachefs ?


Moritz Moeller

Feb 10, 2012, 11:52:54 PM
to alembic-d...@googlegroups.com, Steve LaVietes
On 02/11/2012 04:42 AM, Steve LaVietes wrote:
> This assumes that it happens as you'd expect with Alembic archives in
> practice.

The topic that brought up the debate was single samples, inside the ABC,
in the GB range.
Is this the kind of data size you are talking about?

And if so, what would be a reason that I should not see happening what I
expect? If anyone who has been storing large geo samples in ABCs and can
talk about the effects this had (or, surprisingly, didn't have) on their
pipeline, I'd be very interested to hear what they have to tell.

.mm

Moritz Moeller

Feb 10, 2012, 11:54:05 PM
to alembic-d...@googlegroups.com
On 02/11/2012 04:43 AM, Doug Epps wrote:
> Naive question, but aren't "many files" (e.g. one per frame) in a
> directory a bad-access pattern from a server's perspective ? I think
> I've run across many a server that stores inodes in a linear-list. So
> finding frame N out of 1000s isn't "fun".

Maybe an even more naive question but: 1000s? Most sequences I worked on
were in the range of secs.
To get into 1000s, in a single folder, you'd need a sequence that is >83
seconds. :]

Is that the common case, nowadays?

.mm

Steve LaVietes

Feb 10, 2012, 11:03:41 PM
to Moritz Moeller, alembic-d...@googlegroups.com
Specifically framed to that, no, our typical asset is rarely in the GB-per-sample ballpark.

For the more common cases, the benefits of storing the sequence within a single archive:

1) greater likelihood of *automatic* data de-duplication. This has had a significant impact on the overall disk usage at both contributing facilities.
2) more convenient detection of available samples without a table-of-contents file or scanning directories.
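
For contrast with point 2, this is roughly what sample detection looks like when each frame lives in its own file: the reader has to scan a directory and parse file names. The `name.NNNN.abc` pattern and the helper name are assumptions for illustration, not an Alembic convention.

```python
import os
import re
import tempfile

# Assumed naming convention: <base>.<frame digits>.abc
FRAME_RE = re.compile(r"^(?P<base>.+)\.(?P<frame>\d+)\.abc$")


def available_frames(directory: str, base: str) -> list:
    """Scan a directory and return the sorted frame numbers found for `base`."""
    frames = []
    for name in os.listdir(directory):
        match = FRAME_RE.match(name)
        if match and match.group("base") == base:
            frames.append(int(match.group("frame")))
    return sorted(frames)


with tempfile.TemporaryDirectory() as tmp:
    for name in ("mesh.0001.abc", "mesh.0002.abc", "mesh.0004.abc",
                 "other.0003.abc", "notes.txt"):
        open(os.path.join(tmp, name), "wb").close()

    frames = available_frames(tmp, "mesh")
    # Gaps in the sequence only show up after the scan.
    missing = [f for f in range(frames[0], frames[-1] + 1) if f not in frames]
```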

Nothing about the format prevents you from storing the samples in separate archives. As previously noted, the library itself and existing reference implementations don't currently provide direct support for that workflow. But they don't rule it out either.

-stevel

Rob Bredow

Feb 11, 2012, 2:13:45 AM
to alembic-discussion

The decision to migrate from a system using many individual files to
fewer larger HDF files was made after consulting with multiple parties
across the industry including people designing current and next
generation file-systems. This is not to say it will be better in every
single case, but in our environment the performance improvement is
dramatic. Alembic read/writes are responsible for only a trivial
amount of load on our filers (a big improvement from our past
experiences.)

But, like Steve says, if it works better for your workflow to break
the archive up into a file per frame (or for that matter in any way
you see fit) feel free to try it out. I'll be interested to hear the
results of any testing.

Rob

Luke Emrose

Feb 11, 2012, 5:11:58 AM
to alembic-d...@googlegroups.com
Just out of curiosity, you guys mentioned "*automatic* data de-duplication" quite a few times in posts and Alembic information.
What do you actually mean by this? Do you mean instancing? Or some other type of system for removing "duplicate" data?

I'm quite curious, as I've thought about this for a while, and aside from compression ( which in some ways does that anyway ), I can't imagine how you would do this or what you would do this on?

Thanks in advance.


Colin Doncaster

Feb 11, 2012, 9:19:13 AM
to alembic-d...@googlegroups.com
If a sample doesn't change Alembic will share the data across samples vs. storing the same data twice.  It's an extremely elegant solution and I'm sure it wasn't easy to implement. 

Aghiles

Feb 11, 2012, 10:28:57 AM
to alembic-discussion


On Feb 11, 9:19 am, Colin Doncaster <colin.doncas...@gmail.com> wrote:
> If a sample doesn't change Alembic will share the data across samples vs. storing the same data twice.  It's an extremely elegant solution and I'm sure it wasn't easy to implement.


It is called a hash table! :) So I don't think it is a real challenge
to implement but it is a nice idea indeed in this context.

-- aghiles



Jonathan Gibbs

Feb 11, 2012, 11:44:29 AM
to Moritz Moeller, alembic-d...@googlegroups.com
On Fri, Feb 10, 2012 at 8:29 PM, Moritz Moeller
<real...@virtualritz.com> wrote:
> Yes, exactly. This only makes sense when you think about caching. Which
> your system guys probably don't. Because these only make sense when you
> use client-side caching, of course.

Our systems guys certainly think about caching a LOT. In our network
there are many forms of caching besides local-on-the-box, and most of
those, I believe, are block-level caching, not file-level caching. So
you don't have all the problems you are worried about.

> Small files or big files of which small chunks are read are the exact
> same thing, from the hardware's perspective.

But not from NFS's perspective. For instance, the caches all need to
be checking to see if they have the up-to-date version of the file. If
there are many small files, they have many checks to do. If there are
fewer larger files, fewer checks. I'm told that while these modern
systems can move a lot of bytes, they can get easily swamped by these
kinds of checks. NFS ops, etc, and at this point it's clear to
everyone I'm not an NFS expert!

>>> From the data pov, there is zero difference between having a single
>>> texture that contains 10 mip levels or 10 textures that contain one
>>> each, in a directory.
>>
>> This isn't true!
>

> Can you elaborate why you think this isn't true?

Mostly what I said above. 10 files require more NFS ops, more checks
from the caches, than a single file.

> But seriously: can you think of any single good reason why one should
> store all these samples in a single file?

I can think of many. Makes file management and version control easier.
Better performance over the network in "playback" situations (more
just raw data transfer, less network overhead), and I think a big
reason for us is that it saves a lot of disk space. In a run of 250
subd models, there is lots of redundant information and Alembic
naturally eliminates that. That's less data to transfer too.

I'm sure those using Alembic more than we have yet can give you more.

--jono

Lucas Miller

Feb 11, 2012, 11:52:30 AM2/11/12
to alembic-d...@googlegroups.com
Yes!
It uses a hash table on write to decide which samples across the entire file are the same, and uses HDF5 links to avoid rewriting the data.

The key is written with the data for possible use at read time across files.
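
A minimal sketch of that write-time scheme, for readers who want to see the mechanics. It is not Alembic's code: the `DedupStore` class, the property names, and the use of SHA-1 as the key are assumptions made for illustration, and a plain dictionary stands in for HDF5 links.

```python
import hashlib


class DedupStore:
    """Toy sample store: identical payloads are written once and then linked."""

    def __init__(self):
        self.blobs = {}   # digest -> payload bytes (the data written once)
        self.links = {}   # (property name, sample index) -> digest

    def write_sample(self, prop, index, payload):
        # The digest plays the role of the hash-table key.
        key = hashlib.sha1(payload).hexdigest()
        is_new = key not in self.blobs
        if is_new:
            self.blobs[key] = payload
        # A repeated sample only records a link to the existing data.
        self.links[(prop, index)] = key
        return is_new

    def read_sample(self, prop, index):
        return self.blobs[self.links[(prop, index)]]


store = DedupStore()
topology = bytes([1, 2, 3, 4])
for frame in range(3):
    store.write_sample("topology", frame, topology)             # same every frame
    store.write_sample("positions", frame, bytes([frame] * 4))  # changes every frame

print(len(store.links), "samples,", len(store.blobs), "stored blobs")
```

Six samples are written but only four payloads are stored: one for the unchanging property and one per frame for the changing one.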

Lucas

Aghiles

Feb 11, 2012, 11:59:55 AM2/11/12
to alembic-discussion
Hello Rob,

I happen to have worked a long time on file systems as well
as designed one from scratch for embedded Unix systems. That
was some years ago though, before starting the 3Delight project.
I would be interested to hear what was the argument of the
people saying that having one file is better than many files.

I think it depends.

And in our particular field I tend to think that these people are
wrong by a large margin. They could be wrong because they
might have been asked the wrong question though. I am not
an Alembic expert but the question I would ask is: what is the
best architecture to access data through a _network_.  This
is obvious because 99% of the cases network will be your
bottleneck (if you have one computer, having HDF5 or many
files won't make a noticeable difference anyway). So the question
shouldn't even land in the hands of people writing file systems
in the first place! ;)

1. One file scenario
- An artist accesses some data on frame X
- Message goes through NFS/SMB or some other protocol to server
- Server checks cache for data and returns it if possible.
- Server accesses HDF5 file in the desired range
- Server caches data for future use
- Server returns data  (most time consuming op)

2. Many files scenario
- An artist accesses some data on frame X
- Message goes through NFS/SMB or some other protocol to server
- Server checks cache for data and returns it if possible.
- Server accesses file in the desired range.
- Server caches data for future use
- Server returns data  (most time consuming op)

So basically no difference at all! There won't be _any_ speed
difference either. But now, having many files makes it easy
to do the following:

2bis. Many files scenario

- An artist accesses some data on frame X using Alembic
- If file is already in local cache, return data (see last step)
- Message goes through NFS/SMB or some other protocol to server
- Server checks cache for data and returns it if possible.
- Server accesses file in the desired range
- Server caches data for future use
- Server returns data (most time consuming op but now only once in
a while)
- Alembic stores file in an LRU local cache for future use.
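
The two extra steps in scenario 2bis can be sketched as a small least-recently-used cache in front of the slow read. This is only an illustration: `FrameCache` and `fake_server_read` are hypothetical names, and the "server" here is a local function rather than a real NFS/SMB read.

```python
from collections import OrderedDict


class FrameCache:
    """Least-recently-used cache of per-frame payloads, keyed by frame number."""

    def __init__(self, fetch, capacity):
        self.fetch = fetch          # callable: frame -> bytes (the slow network read)
        self.capacity = capacity
        self.entries = OrderedDict()
        self.misses = 0

    def get(self, frame):
        if frame in self.entries:
            self.entries.move_to_end(frame)   # mark as most recently used
            return self.entries[frame]
        self.misses += 1
        data = self.fetch(frame)              # goes to the server
        self.entries[frame] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        return data


def fake_server_read(frame):
    # Hypothetical stand-in for an NFS/SMB read of one frame file.
    return ("frame %04d" % frame).encode()


cache = FrameCache(fake_server_read, capacity=2)
for frame in [1, 1, 2, 1, 3, 2]:
    cache.get(frame)
print("server reads:", cache.misses)
```

With a capacity of two frames, the six accesses above cost four server reads; the other two are served locally.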

So we basically added two operations, one on top and one at the
end and guess what happens: you just accelerated Alembic by
an order of magnitude at least, in most cases. Because you are
able to cache a small part of your data very easily and
_naturally_.

Now, I am not saying that you can't do this with HDF5, but why
all the hassle ?

Most importantly -and regardless of the final solution, I think
it is a very urgent matter to position the problem in the realm
of network IO performance. This will make all the difference
to Alembic and to the end users.

-- aghiles

P.S. If you want to do something extreme, transform an Alembic
file into a git repository. It handles duplication naturally,
it is compressed, it has logs and history, it can be accessed
concurrently, it is maintained by people who know what they are
doing and it has very good performance. You can check out
parts of it for local caching if you want to (sparse checkouts).
That would be like waking up in the 21st century. :)

Jonathan Gibbs

Feb 11, 2012, 1:47:53 PM2/11/12
to alembic-d...@googlegroups.com
To be more specific, imagine a series of polygon models. Positions and
normals probably change from frame to frame, but UVs probably do not.
Alembic will automatically store the UVs only a single time, and the
positions and normals once per frame.

But if in one case the UVs do change, it will store them more than once.

This is all automatic, and is one of the big wins of Alembic.
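
A rough worked estimate of what that saves, under made-up numbers: the 250-frame run echoes the earlier example in this thread, but the vertex count and per-property sizes below are assumptions, not measurements.

```python
def stored_bytes(frames, bytes_per_property, changing):
    """Bytes on disk with and without de-duplication of unchanged properties."""
    naive = frames * sum(bytes_per_property.values())
    deduped = sum(
        size * (frames if name in changing else 1)
        for name, size in bytes_per_property.items()
    )
    return naive, deduped


# Hypothetical mesh: 100,000 vertices, 12 bytes each for P and N, 8 bytes for uv.
sizes = {"P": 1_200_000, "N": 1_200_000, "uv": 800_000}
naive, deduped = stored_bytes(250, sizes, changing={"P", "N"})
print(naive, deduped)   # 800000000 600800000
```

In this toy case the UVs account for a quarter of each frame, so storing them once trims roughly a quarter of the total.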

--jono

Jonathan Gibbs

Feb 11, 2012, 1:56:43 PM2/11/12
to alembic-d...@googlegroups.com
Two things stand out to me:

On Sat, Feb 11, 2012 at 8:59 AM, Aghiles <aghi...@gmail.com> wrote:
> - Server returns data  (most time consuming op)

This makes intuitive sense, but may not always be true. We've found
that networks are designed to serve lots of data, but less robust at
handling all the other NFS ops. With many small files, the other NFS
ops can be the bottleneck.

In the extreme, we've been repeatedly told that transferring one large
file to, say, India, is much more efficient than transferring many
small files.

> - An artist access some data on frame X using Alembic
> - If file is already in local cache, return data (see last step)

This also assumes that caching is by-file. Most of the caches in our
network (and there are many, the blades do not just talk directly to
the file server) are block-based caches. These actually perform better
with the larger files than the many smaller files as well.

IANASG (I am not a systems guy).
--jono

Aghiles

Feb 11, 2012, 5:10:29 PM2/11/12
to alembic-discussion


On Feb 11, 1:56 pm, Jonathan Gibbs <jonogi...@gmail.com> wrote:
> Two things stand out to me:
>
> On Sat, Feb 11, 2012 at 8:59 AM, Aghiles <aghil...@gmail.com> wrote:
> > - Server returns data  (most time consuming op)
>
> This makes intuitive sense, but may not always be true. We've found
> that networks are designed to serve lots of data, but less robust at
> handling all the other NFS ops. With many small files, the other NFS
> ops can be the bottleneck.

I am not talking from "intuitiveness" :) I am talking from years of
experience writing system and network applications, down to the BIOS
level (and unfortunately, even lower).

Your argument about the number of ops is not correct: you will have the
same number of ops regardless of whether it is one file or many files,
since the NFS client asks the NFS server for the same number of
operations. That is not counting the first open operations, which would
be <1% of all operations. Your argument is valid in one way though: you
should always minimize the number of IO ops, be it NFS or other. For
example, instead of reading one file a byte at a time, read it in one
big read :)
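
That last point is easy to check by counting calls. The sketch below reads a small local temporary file with unbuffered `os.read`; `count_reads` is a helper made up for this example, and a local file only shows the number of operations, not NFS behaviour.

```python
import os
import tempfile


def count_reads(path, chunk_size):
    """Read the whole file with unbuffered os.read calls and count them."""
    fd = os.open(path, os.O_RDONLY)
    calls = 0
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            calls += 1
            if not chunk:       # empty read signals end of file
                break
    finally:
        os.close(fd)
    return calls


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 4096)
    path = tmp.name

byte_calls = count_reads(path, 1)         # one byte per call
bulk_calls = count_reads(path, 1 << 20)   # one big read
os.remove(path)
print(byte_calls, bulk_calls)
```

For a 4096-byte file the byte-at-a-time loop issues 4097 reads (including the final empty one), against 2 for the bulk read.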

> In the extreme, we've been repeatedly told that transferring one large
> file to, say, India, is much more efficient than transferring many
> small files.

This example cannot be taken to design Alembic. You are talking
about sporadic file transfers and we are talking about sustained
access in a networked studio environment.

(I would like to add that there is no scientific reason on why
transferring many files is slower than transferring one big file,
assuming the files are large enough.
The only reason I can see is the work _you_ would have to do by
hand. )

>
> > - An artist access some data on frame X using Alembic
> > - If file is already in local cache, return data (see last step)
>
> This also assumes that caching is by-file.

No, it assumes that when you work, you don't access the entire
Alembic file all the time, but parts of it. Such as working on one
frame at a time. Which is logical but not certain, I agree. In this
scenario, you can cache one part of the file instead of copying
the entire HDF5 file!

> network (and there are many, the blades do not just talk directly to
> the file server) are block-based caches. These actually perform better
> with the larger files than the many smaller files as well.

Not true! Block based caches, are file agnostic because they
are .............. block based caches !!! A good example of a block
based cache is the Linux disk cache. Do you think it really cares
if you have one or many files ? If you can be more specific about
the block-based caches you are using, we can analyse with more
seriousness what would be the impact in that particular case.

> IANASG (I am not a systems guy).

I am. :)

I fear that the decision on working with one big file came out of
"intuitiveness" and not out of hard numbers ... which is akin to
taking a bad fork in the road.

-- aghiles

Aghiles

Feb 11, 2012, 5:22:53 PM2/11/12
to alembic-discussion


> I fear that the decision on working with one big file came out of
> "intuitiveness" and not out of hard numbers ... which is akin to
> taking a bad fork in the road.

All this discussion reminds me of a very famous program that once had
all its data in one database. It ended in a series of mini-disasters.
And the programmers, after some heated arguments, switched to a very
simple directory/file format for the application, as the default.

The program ? SVN.

-- aghiles

Rob Bredow

Feb 11, 2012, 11:06:00 PM2/11/12
to alembic-discussion


On Feb 11, 2:22 pm, Aghiles <aghil...@gmail.com> wrote:
> > I fear that the decision on working with one big file came out of
> > "intuitiveness" and not out of hard numbers ... which is akin to
> > taking a bad fork in the road.
>

Our decisions were informed by real-world production testing, which
was conclusive for our environments. You are free to implement
Alembic in any way you see fit for your environment and I'm sure we
would all be interested in seeing your results.

Rob




Aghiles

Feb 11, 2012, 11:59:59 PM2/11/12
to alembic-discussion


On Feb 11, 11:06 pm, Rob Bredow <rob.bre...@gmail.com> wrote:
> On Feb 11, 2:22 pm, Aghiles <aghil...@gmail.com> wrote:
>
> > > I fear that the decision on working with one big file came out of
> > > "intuitiveness" and not out of hard numbers ... which is akin to
> > > taking a bad fork in the road.
>
> Our decisions were informed by real-world production testing, which
> was conclusive for our environments.

As explained in my post, you won't see any difference in speed between
one file/many files. But the many files scenario is a much more
flexible representation when considering performance enhancements. It
would be interesting to see the numbers in your particular case.

> You are free to implement Alembic in any way you see fit for your environment and I'm sure we
> would all be interested in seeing your results.

I understand : "Just shut up and code it if you want". :)

I wouldn't touch one line of code in Alembic since I have 0 time. But
I thought that an informed opinion, from someone who actually knows
quite a lot about the specifics of network IO, disk IO, caching and
general OS architecture, might be of some help.

I am afraid that you are making the wrong design decision for
Alembic.

-- aghiles

Moritz Moeller

Feb 12, 2012, 6:42:17 AM2/12/12
to alembic-d...@googlegroups.com
On 02/12/2012 05:06 AM, Rob Bredow wrote:
> Our decisions were informed by real-world production testing, which
> was conclusive for our environments. You are free to implement
> Alembic in any way you see fit for your environment and I'm sure we
> would all be interested in seeing your results.

Unless I am missing something, Alembic has no support for eliminating
redundancy when samples are spread over multiple files.
Meaning, if I have static data (UVs, topology etc.) and I spread my
samples over multiple ABC files, such static data is duplicated in each
sample file.

Is this assumption correct?

.mm

Simon Legrand

Feb 12, 2012, 11:16:16 AM2/12/12
to alembic-d...@googlegroups.com
With all due respect,

I'm not sure you could call this a 'wrong design decision'.
Alembic is being adopted by a lot of studios at the moment and the reason for that is that it's achieving the results people want from it.
There are many ways to split scene data. One of them is dividing the data on a per frame basis, sure, but you can also divide it on a per object/group/asset basis.

This discussion started with someone bringing up the test case of a large Naiad mesh. It is true that if that is what one wants to do with Alembic, one very large file seems like a very counterintuitive thing. However, there are other ways to bring a Naiad mesh into your favourite renderer. The EMP format was designed for this very purpose, so why not use it?

As far as I'm aware Alembic was designed more with the creature/animated asset scenario in mind. The studios I've worked at who had designed their own bake format (I won't name them but they're up there) all had a single file approach to baking creatures and assets and a multiple file approach to baking FX mesh data.

It's pretty clear that the two approaches apply well to different purposes. Alembic was not designed as a Naiad mesh cache.

Calling it wrongly designed is like calling a butter-knife wrongly designed because it doesn't cut steak. (Have you tried applying butter on toast with a steak knife?) :)

All silly analogies aside, I believe the single file approach of Alembic helps save huge amounts of data on constant topology by saving a duplication of mesh description per frame. That's kind of the point of Alembic really.

On non-constant topology meshes these advantages don't apply. Therefore, don't use Alembic for Naiad meshes. It's that simple really.

Simon.

--
Simon Legrand
http://slegrand.blogspot.com/

Aghiles

Feb 12, 2012, 2:31:34 PM2/12/12
to alembic-discussion

On Feb 12, 11:16 am, Simon Legrand <legrand.si...@gmail.com> wrote:

> I'm not sure you could call this a 'wrong design decision'.

In terms of _performance_, you certainly can call it that way. It
doesn't mean that it makes the software worse in other aspects (it
could make it better even, but not in this case I am afraid because of
the additional level of abstraction).

> Alembic is being adopted by a lot of studios at the moment and the reason
> for that is that it's achieving the results people want from it.

Absolutely. Which is great. One more reason to care about performance.

> This discussion started with someone bringing up the test case of a large
> Naiad mesh.

In this same discussion I see some talks about performance of
accessing many small files vs. one big file. There were wrong
assumptions and I thought it might be a good idea to say something
about that.

> All silly analogies aside. I believe the single file approach of alembic
> helps save huge amounts of data on constant topology by saving a
> duplication of mesh description per frame. That's kind of the point of
> Alembic really.

You can have exactly the same nice properties with the many file
scenario. For example, if an Alembic project was a directory, each
subdirectory could represent one frame. If there is a geometry in
frame X that is the same as in frame Y then simply reference that
geometry. Now, with this structure, you can easily accelerate your
software for network IO as explained in a previous post.

If performance is not a design principle in Alembic, please disregard
my posts. If you do _really_ care about performance, all major design
decisions should be backed by hard numbers (isn't that obvious?). Those
are difficult to obtain and necessitate a large number of tests. Most
important is to properly test the network IO scenario.

-- aghiles

Moritz Moeller

Feb 12, 2012, 5:26:17 PM2/12/12
to alembic-d...@googlegroups.com, Simon Legrand
On 02/12/2012 05:16 PM, Simon Legrand wrote:
> With all due respect,
>
> I'm not sure you could call this a 'wrong design decision'.
> Alembic is being adopted by a lot of studios at the moment and the
> reason for that is that it's achieving the results people want from it.

Frankly, what did we have before? OBJ, the craptacular FBX and your
garden variety GTO (because all studios had their own GTO variety and
never pushed their changes back to the community upstream, they could
never exchange data with any other place using GTO). These are the
formats most apps support out of the box or through publicly available
plugins.

My personal opinion is that anything that is better than the above three
would have a good chance of being adopted, even more so if it has ILM
and SPI behind it as names. And clearly, Alembic is a big step forward
from that pov.

How you reason from this that Alembic is well designed/thought out in all
regards, particularly the one we are debating here and whose
understanding requires a lot of special knowledge, eludes me though.
I think it is a fallacy.

I've seen this industry adopt many things over the 16 years I've been
working in it. And a lot of them were very flawed. It was just that
there wasn't anything better.

If people adopt your stuff it means you are doing /something/ right. And
Alembic gets a lot of things right.
But it is also short-sighted to infer that you are doing /all/ things right.
Maya 1.0 was full of bugs and nevertheless people jumped on it. There
simply wasn't anything better at the time.

> As far as I'm aware Alembic was designed more with the creature/animated
> asset scenario in mind. The studios I've worked at who had designed
> their own bake format (I won't name them but they're up there) all had a
> single file approach to baking creatures and assets and a multiple file
> approach to baking FX mesh data.

And no one ever had issues with this approach. The issue was mostly the
lack of de-duplication (static, temporally sparse data was duplicated
for every sample).

> It's pretty clear that the two approaches apply well to different
> purposes. Alembic was not designed as a Naiad mesh cache.
>
> Calling it wrongly designed is like calling a butter-knife wrongly
> designed because it doesn't cut steak. (have you tried applying butter on
> toast with a steak knife?) :)

As far as I recall the people who spoke up here against the single file
approach did not even mention the Naiad case in recent mails any more
because it didn't matter.
The point was simply that you can do at the file system level just as
well what you can do at the file level.

And that indeed covers the other cases (e.g. an animated creature) too.

> All silly analogies aside. I believe the single file approach of alembic
> helps save huge amounts of data on constant topology by saving a
> duplication of mesh description per frame. That's kind of the point of
> Alembic really.

You can do that in the single sample per file case easily too. Just
reference the single file that contains all the static data in each
sample.
Or if you have data that changes with a different time frequency than
other data, write that into file samples spaced at that very frequency
and reference them in the resp. samples at the highest frequency
(usually the vertex animation).

Basically do exactly what Alembic does, but at the file level. This also
means that you save even more data because all samples of the same
animated mesh can reference *one* topology/uv/other static data file for
the whole sequence/show!

Or to use the file analogy: ever heard of symbolic links? :)
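
A minimal sketch of that file-level layout, assuming a POSIX file system with symbolic links. The directory and file names (`frame.0001`, `static.bin`, `positions.bin`) and the `write_sequence` helper are invented for the example; nothing here is an Alembic feature.

```python
import os
import tempfile


def write_sequence(root, static_data, frames):
    """Write one directory per frame; static data is stored once and linked."""
    shared = os.path.join(root, "static.bin")
    with open(shared, "wb") as f:
        f.write(static_data)
    for number, positions in frames.items():
        frame_dir = os.path.join(root, "frame.%04d" % number)
        os.mkdir(frame_dir)
        with open(os.path.join(frame_dir, "positions.bin"), "wb") as f:
            f.write(positions)
        # Relative link, so the whole tree can be moved or copied intact.
        os.symlink(os.path.join("..", "static.bin"),
                   os.path.join(frame_dir, "static.bin"))


root = tempfile.mkdtemp()
write_sequence(root, b"topology+uvs", {1: b"p1", 2: b"p2"})
with open(os.path.join(root, "frame.0002", "static.bin"), "rb") as f:
    print(f.read())   # the shared static data, read through the link
```

Each frame directory looks complete to a reader, while the static data exists only once on disk.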

In conclusion I have to agree with Aghiles.
So far I have not seen anyone making a sound point that backs up the
single file for all samples decision. Creature or fluid mesh makes no
difference, really.

.mm

Jonathan Gibbs

Feb 12, 2012, 10:36:28 PM2/12/12
to alembic-d...@googlegroups.com
On Sat, Feb 11, 2012 at 2:10 PM, Aghiles <aghi...@gmail.com> wrote:
> I am not talking from "intuitiveness" :)

I did not mean any insult there. It's hard enough to do this analysis
on a single network, let alone generalize for all network setups and
configurations.

I personally think the Alembic team made the right choice wrt the
areas we've been discussing, but if others have different needs I look
forward to seeing alternative development, perhaps even in a
compatible way. For instance the Alembic API could be preserved but
under the hood maybe there are actually more files (or no files!)

It does impose some changes on how we do things (the local file-based
cache on render farm machines makes less sense, I'd want all caches to
be block-based), there are some issues when updating selective frames,
but so far it seems like the right trade-off to me.

> Your argument about amount of ops is not correct: you will have the
> same amount of ops regardless if that is one file or many files since the
> NFS client asks the same amount of operations to the NFS server.

This isn't my experience, but perhaps I'm using the wrong language.
When you have a network cache, it has to periodically check back with
the origin server to see if it has the correct version of the file.
When there are large numbers of small files, there are a lot more
questions to ask. Whether you think this is significant or not, there
are differences.

--jono

Drew Whitehouse

Mar 28, 2012, 8:36:04 PM3/28/12
to alembic-d...@googlegroups.com
Hi all,

I had a chat with some of our system programmers (we deploy Lustre+DMF/CXFS+Sam/QFS) about this and apparently the critical bottleneck in large distributed filesystems these days is access to the metadata servers (where the information about file locations etc. is stored), not bandwidth to the actual data and the disk it's located on. It's for this reason that large files are better than lots of smaller files. For example, say an application needs to check the size of its input file with a stat() operation: for 1 file and 1 user that's 1 operation. For 1000 files and 1 user it's 1000 operations, and for 1000 files and 1000 users it's 1,000,000 operations. 1000 users could be e.g. a renderfarm doing a comp render and checking the existence and size of 1000 images on startup; you get an idea of the scaling problems involved. So it's not about the performance or ease of use of a particular scheme for one or a few clients, it's when you get thousands that these issues become important. Unfortunately past experiences with efficient operations are not necessarily relevant in the modern hardware environment.
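
The scaling arithmetic above, written out. `metadata_ops` is a made-up helper and the model is deliberately crude: it assumes every client issues one stat()-like request per file and ignores any client-side attribute caching.

```python
def metadata_ops(files, clients, ops_per_file=1):
    """Metadata requests issued when every client checks every file."""
    return files * clients * ops_per_file


for files, clients in [(1, 1), (1000, 1), (1000, 1000), (1, 1000)]:
    print(files, "files x", clients, "clients ->", metadata_ops(files, clients))
```

The last row is the single-large-file case for the same 1000 clients: 1000 requests instead of 1,000,000.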

The issue is even more important  when you use tape backed hierarchical storage where all the new data on the disks is regularly streamed to tape and snapshots of the inodes are recorded and indexed for future rebuilds, usually multiple copies to multiple devices/sites for safety.

I think the issue of caching is ripe for improvement. How many CG applications read data according to fully qualified file paths? It's far better to have *uniquely named assets* that can live anywhere from slow network storage to local SSDs or even /dev/shm. Then applications can read an asset from the closest available location, and stash a local copy if they are likely to need it again, ideally *without* ever hitting the critical metadata servers. This is particularly useful in the case of read-only assets. See http://en.wikipedia.org/wiki/Content-addressable_storage for more info on this concept.
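
A small sketch of the uniquely-named-asset idea, assuming assets are named by a SHA-256 of their content and that storage tiers are plain directories listed nearest first. `asset_id`, `open_asset`, and the two temporary directories are hypothetical stand-ins; a real system would also need eviction and a way to avoid probing remote tiers.

```python
import hashlib
import os
import shutil
import tempfile


def asset_id(payload):
    """Name an asset by the hash of its content, so the name is location-free."""
    return hashlib.sha256(payload).hexdigest()


def open_asset(name, tiers):
    """Return the asset path from the nearest tier, stashing a local copy."""
    for tier in tiers:                      # tiers are ordered nearest first
        candidate = os.path.join(tier, name)
        if os.path.exists(candidate):
            nearest = os.path.join(tiers[0], name)
            if candidate != nearest:
                shutil.copyfile(candidate, nearest)
            return nearest
    raise FileNotFoundError(name)


# Stand-ins for a local SSD directory and slow network storage.
local, remote = tempfile.mkdtemp(), tempfile.mkdtemp()
payload = b"read-only mesh data"
name = asset_id(payload)
with open(os.path.join(remote, name), "wb") as f:
    f.write(payload)

first = open_asset(name, [local, remote])    # copied from the remote tier
second = open_asset(name, [local, remote])   # now served from the local tier
```

Because the name is derived from the content, a local copy can never be stale, which is what makes this attractive for read-only assets.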

-Drew

--
Drew Whitehouse
NCI Vizlab
Australian National University


Luke Emrose

Mar 28, 2012, 8:50:14 PM3/28/12
to alembic-d...@googlegroups.com
Hiya,

With regards to this comment:
"It's for this reason that large files are better than lots of smaller files."

I just finished working on a 4 and a half year project employing 650 people that ran thousands of renders every single night, whose entire set of render-files was based entirely on a custom implementation of an HDF5 file-system (similar to, but significantly further down the line in terms of features than Alembic), and our data indicated that whilst your argument makes sense on paper, it just honestly doesn't happen that way in real life.  We used some pretty impressively large files, versus a lot of small ones.  One could say instead we used a lot of very large files, which is kind of, in your scenario, the worst of both worlds ;-)

What we found was that we wished we had instead had a way to cache single frame data to the render machines (i.e. that we had used a lot of small files), as this provided, in all cases, the best performance in terms of turnaround wall-time, server load, and disc load.  Put simply, regardless of what theoretical data says (and we had a LOT of theoretical data to back up our original design), access to the large files was painful.  Full stop.  That's not theory, that was practice, and practice exercised over more actual hours of network usage and farm rendering than I'm ever likely to see again.  What your distributed file system theory indicates is sound, but how do you GET that data to a single machine that requests it?  Over a network.  The network is generally the bottleneck in a large facility, since even with a super fast disc, you have to squeeze all that data through only a few physical cables.

Basically, for a large production, you have to take all common sense and throw it out the window.  When you make a very expensive, and very large system perform to the very edge of its limits, all bets are off, and some REALLY strange things can happen, with some even weirder solutions.

Now I understand that YMMV, but we did everything we could to fix this issue, but it doesn't go away, and real systems simply don't exhibit the behaviour your text would suggest.

IMO for a VERY LARGE project, being able to cache smaller chunks of data, (which you seem to also suggest) would drastically improve the performance of a large system, but the secret would be to find a way to access that portion of data without the need to access or cache the whole - HDF5 contains some functionality to assist this, but it's generally circumvented by the file system of the OS you are working with (I'm talking entirely from a Linux perspective).

If you have some actual figures to back up that text I'm all ears, but I've never seen that result happen in a real system.

my2c,

Luke

Drew Whitehouse

Mar 28, 2012, 9:22:58 PM3/28/12
to alembic-d...@googlegroups.com
Hi Luke,

I'm going to send your reply to a few guys here and get their opinion. Are you using a high-speed network (IB / 10GigE)? Because we're an HPC facility we have different constraints, but all the computational and storage nodes are accessed via IB. Also, what filesystem were you using, i.e. how did artist workstations and render nodes see assets?

-Drew

Luke Emrose

Mar 28, 2012, 9:33:43 PM3/28/12
to alembic-d...@googlegroups.com
As I am no longer employed by that project, I think it's probably best not to discuss much about the specific details of the hardware in use at that facility, however, I was not involved with the disc or network systems directly, I just was involved from a high level since I wrote and maintained our file-format, and was therefore responsible, at that level, in getting the best performance out of it, which involved a lot of meetings.

Most of my involvement was related to ways to restructure our code to "get around" the limitations of the network and discs, a lot of it relying on modifying the way our caching systems worked and, in a lot of cases, on making the files smaller and more numerous.

Networking is definitely not my forte, but I would be happy to ask our original systems guys a few more specific questions however, since figuring out some other reasons for this would be good too.

Thanks for your interest, it's a very interesting problem!  Large scale rendering seems to really push systems that are designed to handle it into chaos. I'd love to talk more to people at facilities that can light up a giant beast without it complaining too much.

regards,

L

Drew Whitehouse

Mar 29, 2012, 2:08:25 AM3/29/12
to alembic-d...@googlegroups.com
OK, 

After some more chatting with one of our filesystem experts, I'll take back what I said about the metadata servers, which was informed more by another one of our people who worries more about backing up than application performance. Not to say backing up isn't also important :-)

With Lustre it does help to have smaller files, which will automatically be spread across the object storage targets (OSTs). So with many smaller files you won't clobber the single OST holding the one big HDF file. Your throughput to multiple simultaneous clients is then going to be limited by the network bandwidth you can sustain between your multiple servers and clients, and if that's a few network cables you're hosed. Interconnect is where HPC clusters spend the big dollars... (our current peak machine - http://nf.nci.org.au/facilities/vayu/hardware.php). And each distributed file system has its own gotchas that application writers need to be aware of; no silver bullets, it would seem.

-Drew

Leonid Onokhov

Mar 29, 2012, 2:38:45 AM3/29/12
to alembic-d...@googlegroups.com
AFAIK Lustre allows striping a file across several OST objects, like
RAID 0. At least Wikipedia says so. Can't really say anything about
its performance though.
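
For what RAID 0 style striping means in practice, here is a simplified round-robin model. It is not Lustre's layout code or API: the function names and the `stripe_size` / `stripe_count` parameters are assumptions chosen for the illustration.

```python
def ost_for_offset(offset, stripe_size, stripe_count):
    """Index of the storage target holding the byte at `offset` (round-robin)."""
    return (offset // stripe_size) % stripe_count


def bytes_per_ost(file_size, stripe_size, stripe_count):
    """How many bytes of a file land on each storage target."""
    totals = [0] * stripe_count
    offset = 0
    while offset < file_size:
        chunk = min(stripe_size, file_size - offset)
        totals[ost_for_offset(offset, stripe_size, stripe_count)] += chunk
        offset += chunk
    return totals


MIB = 1 << 20
print(bytes_per_ost(10 * MIB, stripe_size=MIB, stripe_count=4))
```

In this model a 10 MiB file striped in 1 MiB chunks over four targets puts 3, 3, 2 and 2 MiB on them, so one big file no longer sits on a single target.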