Detect change in files


Michel Lerenard

Mar 29, 2013, 6:00:08 AM
to alembic-d...@googlegroups.com
Hi

I need to check whether a file has been modified so that I know, before
parsing it, whether I need to update the data I have stored in my
application.

The use case is quite simple:
1. I import data and create objects linked to Alembic data.
2. I save the project I'm working on.
3. The Alembic file is modified.
4. I reopen my project and want to update if necessary.

I used timestamps at first (synchronize if the timestamp is different),
but I get many false positives when the file has been copied from place
to place and the modification dates are lost.

Before resorting to a heavy process (like an MD5 sum), I'd like to know
whether there is a date somewhere in the archive that I could check to
know if the file really has been modified.

Obviously I'm looking for mandatory information; I need to be sure it
will always be there.



Michel
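
For reference, the "heavy process" mentioned above could look like the minimal sketch below. It hashes the raw bytes of the file in chunks, so a copy that lost its modification date still compares equal to the original. `file_digest` and `content_changed` are hypothetical helper names, not part of Alembic, and the default of SHA-256 rather than MD5 is an assumption (pass `algorithm="md5"` for an MD5 sum). It reads the whole file, so it is the fallback rather than the cheap check being asked for.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file's bytes, read in fixed-size chunks."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so a large cache is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def content_changed(path, stored_digest):
    """True when the file's bytes differ from the digest saved with the project."""
    return file_digest(path) != stored_digest
```

The digest would be saved with the project in step 2 of the use case and compared in step 4.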

Luke Emrose

Mar 29, 2013, 9:13:33 AM
to alembic-d...@googlegroups.com

One option is the h5diff executable.

Since Alembic files are just HDF5 files, the standard suite of HDF5 introspection tools will work with them.

Another useful one is HDFView, which allows visual introspection of HDF5/Alembic files.

Regards,

--
You received this message because you are subscribed to the Google Groups "alembic-discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to alembic-discussion+unsub...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Alex Suter

Mar 29, 2013, 10:50:32 AM
to alembic-d...@googlegroups.com, alembic-d...@googlegroups.com

This requires parsing, or at least pre-parsing, but you can generate a hash of the cache values you care about in the archive (hierarchy, geometry, topology, etc.) and store it alongside the cache for comparison.

Then you can know, for example, that the hierarchy is the same between caches but the topology has changed, so you could reload only that part.

             -- Alex
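
The per-category hashing described above can be sketched as follows. This is only an illustration under assumptions: the section names and the byte strings are supplied by the application after it has pre-parsed the archive, and `section_hashes` and `changed_sections` are hypothetical helpers, not Alembic API.

```python
import hashlib


def section_hashes(sections):
    """Map each section name to a digest of its serialized bytes."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in sections.items()}


def changed_sections(stored, current):
    """Names whose digest differs, was added, or was removed, in sorted order."""
    names = set(stored) | set(current)
    return sorted(name for name in names if stored.get(name) != current.get(name))


# Example with made-up data: only the topology bytes differ between the caches.
old = section_hashes({"hierarchy": b"/root/mesh", "topology": b"0 1 2"})
new = section_hashes({"hierarchy": b"/root/mesh", "topology": b"0 1 2 3"})
print(changed_sections(old, new))  # ['topology']
```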


Michel Lerenard

Mar 29, 2013, 11:02:41 AM
to alembic-d...@googlegroups.com
Hi,

That I already do.
If my app is running and has data loaded, I check whether the data has really changed using the property hash keys, to know whether I need to update, for example, a PolyMesh.
What I'm really interested in is the step before: a check that spares me parsing the file if it didn't change.

Say you need to update a bunch of files in a particular folder on your hard drive from the network (or elsewhere); some files have changed, others have not. Most of the time you copy everything, overwriting all the files, and in the process you change the timestamps even though the contents are the same.
I would like to be able to detect that the whole file is the same, without parsing all of it and checking the hash keys one by one.
At the moment I can check the timestamp and the size, but that's not enough: too often I recheck the whole file for no reason.

I checked the source (Alembic and HDF5) and found no checksum function that reflects the content of the file. I saw that we could write metadata with the write date on the top node, but not all exporters use it, so that won't be very effective.
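
One way to reduce those needless rechecks without touching the archive format is a staged check, sketched below under assumptions: the application keeps a small record (size, modification time, content digest) from the last parse, and only hashes the file when the size matches but the timestamp does not. `snapshot` and `needs_reparse` are hypothetical names, and treating "same size and same timestamp" as unchanged is a heuristic rather than a guarantee.

```python
import hashlib
import os


def _digest(path, chunk_size=1 << 20):
    """Hex SHA-256 of the file's bytes, read in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def snapshot(path):
    """Record saved with the project: size, modification time, content digest."""
    info = os.stat(path)
    return {"size": info.st_size, "mtime_ns": info.st_mtime_ns, "digest": _digest(path)}


def needs_reparse(path, record):
    """Return (changed, record), using the cheapest check that can decide.

    record is the dict returned by snapshot() when the file was last parsed.
    """
    info = os.stat(path)
    if info.st_size != record["size"]:
        # A different size always means different content; no hashing needed.
        return True, record
    if info.st_mtime_ns == record["mtime_ns"]:
        # Same size and timestamp: assumed unchanged (heuristic fast path).
        return False, record
    # Timestamp differs (for example after a copy): let the content decide.
    if _digest(path) == record["digest"]:
        # Same bytes; remember the new timestamp so the next check is cheap.
        return False, dict(record, mtime_ns=info.st_mtime_ns)
    return True, record
```

When `changed` is true the caller would parse the file and store a fresh `snapshot`; when it is false the returned record should be saved so that a file copied once is hashed only once.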


Barnaby Robson

Mar 29, 2013, 11:39:30 AM
to alembic-d...@googlegroups.com
Hi,

We don't recommend using HDF5 tools for introspection, because in the future other technologies could back Alembic and those workflows would then cease to function. The other drawback of viewing raw HDF5 is that you see "too much": many things in an HDF5 file are specific to the HDF5 implementation of Alembic and can confuse you.

That's why we rely on abcecho and we built abcview to handle introspection via a UI.

barnaby.

Ryan Galloway

Mar 29, 2013, 12:39:02 PM
to alembic-d...@googlegroups.com

Until Alembic has a feature like this, I suspect you'll have to either process the file, use file naming conventions, or rely on something upstream to trigger a callback in your app (which is Clarisse?). You could start by checking high-level things like topology or the number of samples, then get the property hash key values for the objects and store those in a sidecar file.
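
The sidecar idea could be sketched roughly like this. It assumes the application has already read the hash keys from the archive and converted them to strings (that step is not shown), and the `.hashes.json` suffix and all function names are hypothetical conventions chosen for the example.

```python
import json
import os


def sidecar_path(cache_path):
    """Hypothetical naming convention: <cache file>.hashes.json next to the cache."""
    return cache_path + ".hashes.json"


def save_sidecar(cache_path, hash_keys):
    """Store {object path: hash key string} next to the cache file."""
    tmp_path = sidecar_path(cache_path) + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(hash_keys, handle, indent=2, sort_keys=True)
    # Replace in one step so a crash never leaves a half-written sidecar.
    os.replace(tmp_path, sidecar_path(cache_path))


def load_sidecar(cache_path):
    """Return the stored mapping, or an empty dict if no sidecar exists yet."""
    try:
        with open(sidecar_path(cache_path), encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {}


def objects_to_update(cache_path, current_hash_keys):
    """Object paths whose hash key is new or differs from the sidecar."""
    stored = load_sidecar(cache_path)
    return sorted(p for p, key in current_hash_keys.items() if stored.get(p) != key)
```

This still requires opening the archive to read the current keys, so it narrows what has to be reloaded rather than avoiding the parse altogether.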

Ben Houston

Mar 29, 2013, 2:18:45 PM
to alembic-d...@googlegroups.com
Having a hierarchical hash that is accessible at each node, sort of like how git works (the hash at a node is composed of that node's data and the hashes of its children), would allow easy detection of whole-file changes and would also make it very easy to traverse only the branches that have changed. I've seen this type of system used before; it is awesome and fairly fast, given that Alembic is already hashing the data blocks.
-ben
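
A toy illustration of the hierarchical hash described above, not a description of anything Alembic provides: `Node` is a hypothetical stand-in for an archive object, and the exact way the data and child digests are combined is an assumption made for the sketch.

```python
import hashlib


class Node:
    """Hypothetical stand-in for an archive object: a name, raw data, children."""

    def __init__(self, name, data=b"", children=()):
        self.name = name
        self.data = data
        self.children = list(children)
        self.digest = None


def compute_digests(node):
    """Set node.digest from the node's own data plus its children's digests."""
    hasher = hashlib.sha256()
    hasher.update(hashlib.sha256(node.data).digest())
    # Sort by name so sibling order in memory does not affect the result.
    for child in sorted(node.children, key=lambda c: c.name):
        hasher.update(child.name.encode("utf-8") + b"\0")
        hasher.update(compute_digests(child))
    node.digest = hasher.digest()
    return node.digest


def changed_paths(old, new, prefix=""):
    """List paths whose digest changed, descending only into changed branches."""
    path = f"{prefix}/{new.name}"
    if old is not None and old.digest == new.digest:
        return []  # identical subtree: skipped without visiting its children
    paths = [path]
    old_children = {c.name: c for c in old.children} if old is not None else {}
    for child in sorted(new.children, key=lambda c: c.name):
        paths += changed_paths(old_children.get(child.name), child, path)
    return paths


def build(points):
    """Build a tiny two-object tree with made-up data and hash it."""
    root = Node("root", children=[Node("mesh", points), Node("camera", b"fov=40")])
    compute_digests(root)
    return root


old, new = build(b"0 0 0"), build(b"0 0 1")
print(old.digest == new.digest)  # False: the root digest alone shows the file changed
print(changed_paths(old, new))   # ['/root', '/root/mesh']
```

Comparing the two root digests answers the whole-file question in one step, and the camera branch is never visited because its digest is unchanged. A removed child shows up only as a change in its parent.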


--
Best regards,
Ben Houston
CTO, Exocortex Technologies, Inc.
http://www.exocortex.com