Mutable or immutable assets?

Tommi Laukkanen

Jun 10, 2009, 1:49:08 PM
to kyoryoku
It is a good question whether assets should be considered mutable or
immutable in the URI based identification scheme of assets.

Caching becomes simple if assets are immutable, as there is no need to
worry whether the asset in the cache is current or not.

On the other hand, the reality is that assets change over time, and if
we look at the lifetime of a single asset it is likely to have tens or
even hundreds of versions. Over time practically all versions of the
asset will have been referenced. If the asset URI does not refer to
the recommended version of the asset but there is a different URI for
each version, then the result is that there are many versions of the
same asset out there taking up memory and bandwidth.

In some cases the builder would like to refer to the recommended
version of the asset. In some cases he might want to get a specific
version. Optimal would probably be to allow referencing both the
recommended version of an asset and any version of interest. This kind
of approach would be in line with the WebDAV versioning extensions and
the internal models of content repositories with versioning support.
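
A minimal sketch of that dual addressing, in Python. The class name
VersionedAssetStore, the "?v=N" query syntax, and the in-memory storage
are assumptions made for illustration; nothing in this thread fixes a
URI syntax for versions.

```python
class VersionedAssetStore:
    """Toy in-memory store: each asset path keeps an ordered list of versions."""

    def __init__(self) -> None:
        self._versions: dict[str, list[bytes]] = {}
        self._recommended: dict[str, int] = {}

    def publish(self, path: str, data: bytes, recommend: bool = True) -> int:
        """Append a new version (numbered from 1) and optionally recommend it."""
        versions = self._versions.setdefault(path, [])
        versions.append(data)
        number = len(versions)
        if recommend or path not in self._recommended:
            self._recommended[path] = number
        return number

    def resolve(self, uri: str) -> bytes:
        """'path' returns the recommended version, 'path?v=N' a specific one."""
        path, _, query = uri.partition("?v=")
        number = int(query) if query else self._recommended[path]
        versions = self._versions[path]
        if not 1 <= number <= len(versions):
            raise KeyError(f"{path} has no version {number}")
        return versions[number - 1]
```

With this shape, a response for "path?v=N" never changes and can be
cached indefinitely, while the bare path has to be revalidated because
the recommended version can move.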

Allowing references to the recommended version of an asset would also
enable implementations based on a traditional folder hierarchy. You
could, for example, implement this using SVN and WebDAV with some
additional functionality to insert container documents as discussed in
previous emails.

It might be good to avoid forcing the URIs into a flat, ID-based
approach unless there are clear advantages over a more human-readable
folder hierarchy. All the modeling tools currently rely on the file
system, and all large existing content libraries and sets are
hierarchical by nature. It is possible to carry the hierarchy
information in metadata, but it is also embedded in the native format
references. Flattening these native references by moving all content
into the same folder can be automated, but is it worth the effort? Some
projects might want to use the asset repository as primary storage for
original data and thus allow moving content from tools to the
repository and back. The question is which use cases are to be
supported and whether the asset server system is meant solely for
distribution of assets to viewers.

-tommi

Hurliman, John

Jun 10, 2009, 4:31:25 PM
to kyor...@googlegroups.com
Both.

Ryan McDougall

Jun 11, 2009, 1:35:35 AM
to kyor...@googlegroups.com

Must resist top-posting...

I have to admit a bit of low-level/asset-level bias in my thinking,
which is perhaps natural. Turning that bias on its head, why not
leap-frog WebDAV and SVN, and go straight to git?

It has every feature you can think of. It's a fast and
easy-to-understand abstraction.

The only down side is that you'd need a web service to make it URL
addressable, and with binary formats the diffs would be larger than
text, and thus the overhead of storing deltas from previous assets
might cause your cache to be twice the size it would be without
version history.

Thoughts?

Patrick Horn

Jun 11, 2009, 2:22:47 AM
to kyor...@googlegroups.com
Ryan McDougall said in a message sent at 10.06.2009 22:35:
As for git in particular, one of the reasons that makes it such a useful
tool is that if one person produces a diff from A<->B and another person
produces A<->C, they can be merged in a meaningful way and two
independent branches can be combined transparently. This would be
possible with materials/shaders and structured image content like SVG,
but with meshes and binary images, I don't really see the point.

I think the git discussion may be more helpful with regard to asset
references/containers and metadata, rather than binary hash-indexed
blobs themselves, which will likely be changed in their entirety if they
are changed at all. True, we can store diffs here, but I'm thinking on
an individual file basis, rather than an entire repository.

I would think that versioning of metadata and structure itself may be
useful in some cases, but even for that, I think the discussion may make
more sense in terms of a "container format", since that is essentially
what a git repository is.

My opinion is that it is best to make this asset storage as simple as
possible, unless there is a compelling reason to do otherwise.
Versioning is definitely important, but in my opinion, the versioning
that I would probably be most interested in is of an object as a whole.

If individual assets need to be versioned, I think that is better dealt
with using metadata (perhaps by providing a URI to the source asset, and
then these links can be followed by tracing back the versions). For
example, this may be useful for someone designing libraries of
materials/textures or skeletons/avatars that other people may want to
modify.

Of course, if we start to get into the discussion about deleting or
altering content, this becomes a lot more tricky. Some things would be
greatly simplified if we prevent altering an asset altogether. The other
option would be to have some standard way of storing previous versions
and linking to them through metadata.


With respect to the flattening of URIs discussion, there is no way in
real life that URIs will be flat. Even if the paths are flat, there may
need to be cross references to other CDN domains altogether.
I am pretty interested in allowing or maybe even enforcing relative URIs
within a logical unit (i.e. URIs that do not have to change when an
asset is copied). The key advantage is that this makes
duplication/versioning of assets a lot simpler.

And it also simplifies the production pipeline as well: instead of
having to change the URI references in the process of uploading, I can
upload the raw files as they are, and the relative URIs will be handled
by the downloader (computing relative URIs is relatively simple anyway).
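
As a rough illustration of what the downloader would do, here is a
sketch using Python's standard urllib.parse.urljoin. The function name
resolve_reference and the example hosts are made up for the example;
only urljoin itself is a real library call.

```python
from urllib.parse import urljoin


def resolve_reference(container_uri: str, reference: str) -> str:
    """Resolve a reference found inside a container against the container's own URI.

    Relative references follow the container wherever it is copied;
    references that already carry a scheme are returned unchanged.
    """
    return urljoin(container_uri, reference)


# The same raw file, uploaded unchanged to two locations (hypothetical hosts).
original = "http://cdn.example.com/assets/house/dog.mesh"
copy = "http://mirror.example.net/v2/house/dog.mesh"

print(resolve_reference(original, "textures/brownfur.jpg"))
print(resolve_reference(copy, "textures/brownfur.jpg"))
```

The reference stored in the file is identical in both cases, so nothing
has to be rewritten at upload time or when the logical unit is copied.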

I'm curious what you think about relative URIs. Any compelling reasons
for or against these?

(Hope I didn't bore you all too much, I get kind of carried away)
-Patrick

Ryan McDougall

Jun 11, 2009, 2:56:45 AM
to kyor...@googlegroups.com

How often would that be the case in an asset distribution environment?
Just because git allows that doesn't mean we can't pick a subset and
enforce that within our toolchain.

We'd put git behind SSH, and if you are not *the* publisher, you don't
get to try and merge A->C. If you're two people within the publishing
house, git complains it can't handle the job, and you have your
publishing tool do a rebase.

> possible with materials/shaders and structured image content like SVG.
> but with meshes and binary images, I don't really see the point.
>
> I think the git discussion may be more helpful with regards asset
> references/containers and metadata, rather than binary hash-indexed
> blobs themselves, which will likely be changed in their entirety if they
> are changed at all. True, we can store diffs here, but I'm thinking on
> an individual file basis, rather than an entire repository.
>
> I would think that versioning of metadata and structure itself may be
> useful in some cases, but even for that, I think the discussion may make
> more sense in terms of a "container format", since that is essentially
> what a git repository is.
>
> My opinion is that it is best to make this asset storage as simple as
> possible, unless there is a compelling reason to do otherwise.
> Versioning is definitely important, but in my opinion, the versioning
> that I would probably be most interested in is of an object as a whole.
>
> If individual assets need to be versioned, I think that is better dealt
> with using metadata (perhaps by providing a URI to the source asset, and
> then these links can be followed by tracing back the versions). For
> example, this may be useful for someone designing libraries of
> materials/textures or skeletons/avatars that other people may want to
> modify.

The only reason I mentioned git is that it is simpler in a manner of
speaking: it already implements a file system on top of an efficient
low-level content-hash-based index. The only difference is that when
it was designed, binary content wasn't the main use case. Versioning is
just some nice gravy.
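
To make the "file system on top of a content-hash index" idea concrete,
here is a toy sketch in Python. It is not git's object format (git
hashes a header plus the content and stores trees and commits as well);
the class name ContentStore and the choice of SHA-256 are assumptions
for illustration only.

```python
import hashlib


class ContentStore:
    """Two layers: blobs keyed by content hash, and names pointing at hashes."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}   # asset layer: hash -> bytes
        self.names: dict[str, str] = {}     # file layer: path -> hash

    def put(self, name: str, data: bytes) -> str:
        """Store data under its hash and point the name at it."""
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # identical content is stored once
        self.names[name] = digest
        return digest

    def get(self, name: str) -> bytes:
        """Look up a path in the file layer, then fetch the blob it points at."""
        return self.blobs[self.names[name]]
```

Blobs are immutable by construction, since a changed file hashes to a
new key, while the name layer stays mutable and can be re-pointed.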

>
> Of course, if we start to get into the discussion about deleting or
> altering content, this becomes a lot more tricky. Some things would be
> greatly simplified if we prevent altering an asset altogether. The other
>  option would be to have some standard way of storing previous versions
> and linking to them through metadata.
>
>
> With respect to the flattening of URIs discussion, there is no way in
> real life that URIs will be flat. Even if the paths are flat, there may
> need to be cross references to other CDN domains altogether.

I didn't mean to imply no cross-domain linking. I think it's kinda simple:

if path doesn't have scheme (ie. HTTP:), then rebase to public domain;
else leave it.
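
A sketch of that rule in Python, assuming paths have already been
normalized to forward-slash form at the publish step. The function name
rebase and the example base URL are hypothetical.

```python
from urllib.parse import urlsplit


def rebase(path: str, public_base: str) -> str:
    """Leave paths that carry a scheme alone; prefix everything else with public_base."""
    # Caveat: a Windows path such as C:\Content may be parsed as having
    # scheme "c", so local names are assumed to be relative already.
    if urlsplit(path).scheme:
        return path
    return public_base.rstrip("/") + "/" + path.lstrip("/")


print(rebase("house/dog.mesh", "http://cdn.example.com/assets"))
print(rebase("http://other.example.org/fur.jpg", "http://cdn.example.com/assets"))
```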

Relative paths are nice too, but as long as you enforce something
common at the publish step, you can handle things uniformly later IMO.

> I am pretty interested in allowing or maybe even enforcing relative URIs
> within a logical unit (i.e. URIs that do not have to change when an
> asset is copied). The key advantage is that this makes
> duplication/versioning of assets a lot simpler.
>
> And it also simplifies the production pipeline as well: instead of
> having to change the URI references in the process of uploading, I can
> upload the raw files as they are, and the relative URIs will be handled
> by the downloader (computing relative URIs is relatively simple anyway).
>
> I'm curious what you think about relative URIs. Any compelling reasons
> for or against these?
>
> (Hope I didn't bore you all too much, I get kind of carried away)
> -Patrick

I'm up for this kind of detail/discussion -- I just hope no one else
is getting turned off already. :)

If so, let us know -- we're a small group here.

Cheers,

Hurliman, John

Jun 11, 2009, 3:13:37 AM
to kyor...@googlegroups.com
I'm confused about the direction you're going with this. Are you suggesting a more efficient way to implement content storage, or a new set of interfaces to access and manipulate content? If the conversation is about the latter, there is a wealth of research in the area of content-based storage and git is just the tip of the iceberg. If we are discussing the interfaces that will be exposed to content development tools and viewing clients then I think it would make sense to focus specifically on that portion from a user experience and use case perspective.

I think the collaborative process of distributed version control systems shares a lot in common with the collaborative process of virtual world content creation. Definitely a lot to explore there, but I'd rather explore the use cases and then pick a technology rather than working the other way.

John

Ryan McDougall

Jun 11, 2009, 3:24:03 AM
to kyor...@googlegroups.com
I think I am making a fairly shallow observation with admittedly not
much thought behind it:

Asset Layer (content hash based) + File Layer (WebDAV) = CB Inventory
Asset Layer + File Layer (WebDAV) + deltas + versioning = SVN
Asset Layer (content hash based) + File Layer + deltas + versioning = git
=> CB Inventory is a special case of git :D

Also, afaik, Unity3D uses git as an option for its asset handling.

While there are down sides to git -- specifically it's really only
designed to work well with text -- the up side is that it's fast and
already done.

Cheers,

Ryan McDougall

Jun 11, 2009, 3:36:26 AM
to kyor...@googlegroups.com
On Thu, Jun 11, 2009 at 10:13 AM, Hurliman, John<john.h...@intel.com> wrote:
>
> I'm confused about the direction you're going with this. Are you suggesting a more efficient way to implement content storage, or a new set of interfaces to access and manipulate content? If the conversation is about the latter, there is a wealth of research in the area of content-based storage and git is just the tip of the iceberg. If we are discussing the interfaces that will be exposed to content development tools and viewing clients then I think it would make sense to focus specifically on that portion from a user experience and use case perspective.
>
> I think the collaborative process of distributed version control systems shares a lot in common with the collaborative process of virtual world content creation. Definitely a lot to explore there, but I'd rather explore the use cases and then pick a technology rather than working the other way.
>
> John


Warning: thinking out loud going on!

Let's for the sake of discussion say that you build your asset
pipeline entirely around git.

The content creation tool hides the complexity of git from the user;
they see only a file system to which they add, remove, or change files
-- text or binary. All file names within container files are in local
terms, ie. "C:\Content".

On the publish step, the app does a save (git commit), a branch (git
checkout -b pre-publish), a rebase of local names (sed
's|C:\\Content|git://example.com/assets|g' *), and a push to the global
repo (with a delete of the temp branch).

When simulating, the client does a git clone (with working copy). The
simulator sends updates with asset links of the form
git://example.com/assets/house/dog.mesh. The client looks up the
working copy for git://example.com/assets for the local file
house/dog.mesh.
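
The client-side lookup could be sketched like this in Python. The
function name local_path and the working copy location are assumptions;
the thread does not say where the clone would live.

```python
from pathlib import PurePosixPath


def local_path(asset_uri: str, repo_uri: str, working_copy: str) -> PurePosixPath:
    """Map an asset link sent by the simulator onto a file in the cloned working copy."""
    prefix = repo_uri.rstrip("/") + "/"
    if not asset_uri.startswith(prefix):
        raise ValueError(f"{asset_uri} is not inside {repo_uri}")
    relative = PurePosixPath(asset_uri[len(prefix):])
    # Refuse links that try to climb out of the working copy.
    if ".." in relative.parts:
        raise ValueError(f"unsafe asset path: {relative}")
    return PurePosixPath(working_copy) / relative


print(local_path(
    "git://example.com/assets/house/dog.mesh",
    "git://example.com/assets",
    "/home/user/worlds/example-assets",  # hypothetical clone location
))
```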

Thoughts?

Cheers,

dani...@graphics.stanford.edu

Jun 11, 2009, 5:56:21 AM
to kyor...@googlegroups.com
I'm a little out of the loop here, being away from the country and solid
internet access, and only have been able to skim the mails so far...
but for the record: can we use a canonical 'user experience' example that
includes at least one instanced asset...

maybe a texture that 2 models share :-) Better yet a texture that's named
2 different things but holds the same content that 2 models share

that way I can see how instancing fits into this whole thing--- maybe a
client can recognize the hash is the same and not download that file from
the repository

for instance
dog.mesh needs dogeyes.jpg (hash:0xbadf00d) brownfur.jpeg (hash:0xdeadbeef)
cat.mesh needs cateyes.jpg (hash:0xbadf00d) whitefur.jpeg (hash:0xca1f00d5)

and maybe they both refer to an animation that targets the same skeleton
(fourlegs.skeleton has the idle animation and dogwalk.skeleton and
catwalk.skeleton modify the fourlegs.skeleton in different ways)
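
One way a client might use the hashes in the example above to avoid the
duplicate download, sketched in Python. The manifest layout (mesh name
to a mapping of texture name to hash) and the function name
plan_downloads are assumptions, not an agreed format.

```python
def plan_downloads(manifests: dict[str, dict[str, str]], cached: set[str]) -> list[str]:
    """Return the file names to fetch: at most one per content hash not yet cached."""
    seen = set(cached)
    to_fetch = []
    for textures in manifests.values():
        for name, digest in textures.items():
            if digest not in seen:
                seen.add(digest)
                to_fetch.append(name)
    return to_fetch


manifests = {
    "dog.mesh": {"dogeyes.jpg": "0xbadf00d", "brownfur.jpeg": "0xdeadbeef"},
    "cat.mesh": {"cateyes.jpg": "0xbadf00d", "whitefur.jpeg": "0xca1f00d5"},
}
print(plan_downloads(manifests, cached=set()))
```

cateyes.jpg is skipped because its hash matches dogeyes.jpg, which is
already in the plan; the two names end up sharing one cached blob.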

Ryan McDougall

Jun 11, 2009, 7:03:07 AM
to kyor...@googlegroups.com
On Thu, Jun 11, 2009 at 12:56 PM, <dani...@graphics.stanford.edu> wrote:
>
> I'm a little out of the loop here, being away from the country and solid
> internet access, and only have been able to skim the mails so far...
> but for the record: can we use a canonical 'user experience' example that
> includes at least one instanced asset...
>
> maybe a texture that 2 models share :-)  Better yet a texture that's named
> 2 different things but holds the same content that 2 models share
>
> that way I can see how instancing fits into this whole thing--- maybe a
> client can recognize the hash is the same and not download that file from
> the repository
>
> for instance
> dog.mesh needs dogeyes.jpg (hash:0xbadf00d) brownfur.jpeg (hash:0xdeadbeef)
> cat.mesh needs cateyes.jpg (hash:0xbadf00d) whitefur.jpeg (hash:0xca1f00d5)
>
> and maybe they both refer to an animation that targets the same skeleton
> (fourlegs.skeleton has the idle animation and dogwalk.skeleton and
> catwalk.skeleton modify the fourlegs.skeleton in different ways)

Good idea, though it might be better for you to define the use case,
and maybe put it on the wiki.

Let me give it a shot though. Not sure if it's what you had in mind.
Please update where I am missing something.

== Use Cases

Actors
- Client (C)
- Simulator (S)
- Content Creator (CC)
- Content Distribution Network (CDN)

Objects
- Set of assets (A)
- Set of asset meta-data (M)
- Layout of assets in a space (L)

Assets
dogeyes.jpg
cateyes.jpg
brownfur.jpg
whitefur.jpg

fourlegs.skeleton
dogwalk.skeleton
catwalk.skeleton

dog.mesh -> dogeyes.jpg, whitefur.jpg, catwalk.skeleton -> fourlegs.skeleton
cat.mesh -> cateyes.jpg, brownfur.jpg, dogwalk.skeleton -> fourlegs.skeleton
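
The listing above can be read as a dependency graph. A small Python
sketch of walking it, with the edges copied from the listing as written
(including which mesh points at which skeleton); the dictionary layout
and the function name closure are assumptions for illustration.

```python
def closure(asset: str, deps: dict[str, list[str]]) -> list[str]:
    """Return the asset plus everything it depends on, directly or indirectly."""
    ordered: list[str] = []
    seen: set[str] = set()
    stack = [asset]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        ordered.append(current)
        # Push in reverse so dependencies are visited in listed order.
        stack.extend(reversed(deps.get(current, [])))
    return ordered


deps = {
    "dog.mesh": ["dogeyes.jpg", "whitefur.jpg", "catwalk.skeleton"],
    "cat.mesh": ["cateyes.jpg", "brownfur.jpg", "dogwalk.skeleton"],
    "catwalk.skeleton": ["fourlegs.skeleton"],
    "dogwalk.skeleton": ["fourlegs.skeleton"],
}

shared = set(closure("dog.mesh", deps)) & set(closure("cat.mesh", deps))
print(shared)
```

The intersection of the two closures is the set of assets a client only
needs to fetch once when both meshes are in the scene.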


== Creating a World from Nothing, One-step Publish

Goal: create a new world.

0. CC creates A. All are regular files, relative to C:\Content on the
CC's hard drive.
1. CC places (realizes) A within the space such that cat.mesh is
opposite dog.mesh, and they're facing each other.
2. CC iterates over the content and layout until he is satisfied with
the result, at which time he decides to publish.
3. CC gains appropriate permissions to CDN and S.
3. CC places A in the chosen globally accessible CDN, and L is placed
in the chosen S.
4. C visits S, which informs it of {(A,M),L}.
5. C reads M, places A in a local cache, and uses L to render the
completed scene.

== Creating a World from Nothing, Two-step Publish

Goal: create a new world (may lead to simplified content publishing tools).

0. CC creates A. All are regular files, relative to C:\Content on the
CC's hard drive.
1. CC iterates over the content and layout until he is satisfied with
the result, at which time he decides to publish.
3. CC gains appropriate permissions to CDN
3. CC places A in the chosen globally accessible CDN.
3. CC visits S and gains appropriate permissions.
3. CC places (realizes) A within the space such that cat.mesh is
opposite dog.mesh, and they're facing each other.
4. C visits S, which informs it of {(A,M),L}.
5. C reads M, places A in a local cache, and uses L to render the
completed scene.

== Modifying an Existing World

Goal: Make the cat's fur brown-with-spots, and place it beside
the dog, facing the same direction.

6. CC visits S and gains appropriate permissions.
7. CC alters L so that the cat and dog are beside each other, facing
the same direction.
7. CC modifies brownfur.jpg (F) to add spots, creating brownspottedfur.jpg (F').
3. CC gains appropriate permissions to CDN.
3. CC places F' in the chosen globally accessible CDN.
11. CC modifies {(A,M),L} such that F -> F'.

Help at all?

Cheers,

Ryan McDougall

Jun 11, 2009, 7:08:32 AM
to kyor...@googlegroups.com
D'oh, the numbering is messed up, but I'm tracking these here:
http://wiki.realxtend.org/index.php/Asset_Service_Working_Group

Cheers,