Project AToM


Vladimír Šeděnka

Jan 6, 2015, 1:17:42 PM
to amele...@googlegroups.com
Hello (Nathanaël),
I have created this topic to continue the discussion (started with Cyril) about the possible use of Amelet-HDF in our new project (called AToM). Cyril told me about some interesting features planned for future releases of the Amelet-HDF spec and started to convince me to use Amelet instead of creating our own "Amelet-like" specification. He has forwarded our previous conversation, so I hope I do not have to explain it from the beginning :-).

Best regards,
  Vlada

cyril giraudon

Jan 8, 2015, 12:42:59 PM
to amele...@googlegroups.com
Vladimir,

Nathanaël and I had a talk about the usage of Amelet-HDF in the context of AToM. All in all, we do not see major limitations in using Amelet-HDF as is.

32/64/128-bit floats:
First, the problem of the storage of floats. We think the Amelet-HDF library could be compiled in 32, 64... 128-bit (for float handling) without any dramatic consequence for the spec.
The 32-bit constraint could be seen as a recommendation (for collaborative work), but it is not essential.
With some tests, Nathanaël showed that a 64-bit float can be read seamlessly by a client compiled in 32-bit or 64-bit (see the source at the end of the message).

Visualizing the mesh:
During the HIRF-SE project, a ParaView plugin allowed mesh and data to be visualized in ParaView. I think it is the way to go to inspect meshes, leaving HDFView for items without large datasets.

You could also write a very small tool in MATLAB to show the data the way you would like to inspect it.
Amelet-HDF is a format.

The geometry:
We do not see exactly what a geometry is. For us, it is a set of polygons (triangles, quads, ...), as we can see in X3D, VRML, ....
In Amelet-HDF, it is a mesh with only 1D/2D groups (edge groups, face groups). But apparently, you would like to decorate some parts of the mesh to say "ooooh, there is a hole".

The way to decorate elements is through a simulation, by means of links. You are right, a file is not always read in the context of a simulation.
We are thinking about the notion of a "system".
A system would be a simulation but without the module data: a system could be a mesh associated with materials, for instance.
A system could link a mesh with hole properties.

The process:
It seems you would like to trace the life of the data (geometry, mesh, results...).
Are you sure that storing all data in the same file (whatever the format is) is a good thing?
We rather think about a file system which would contain directories for each kind of data; then a file could maintain a graph of relations between the data.
Relations could be ancestor / producer / consumer ....
That file could handle the versioning strategy.
The entire project could be a .zip file of that file system (like an ODF document from LibreOffice).
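For instance, the project file system could look like this (just a hypothetical sketch, the directory and file names are invented):

project/
|-- geometry/
|   `-- cabinet.step
|-- mesh/
|   `-- cabinet_mesh.h5
|-- result/
|   `-- run_001.h5
`-- relations.xml  # graph of ancestor / producer / consumer links between the files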
This is what CuToo does at another level, with a database. The entire Amelet-HDF simulation file is temporary: it is generated by CuToo from the infotypes involved in the simulation.
When the simulation terminates, the output Amelet-HDF file is created by the post-converter; then CuToo parses the file and splits it into elements matching the infotypes declared in the platform.
You could also use a revision control system like git.

Best regards,

Cyril.


############################
Python code (build input data)
import tables as tb
import numpy as np

# write the same values as a 64-bit and a 32-bit float dataset
h5 = tb.open_file("32or64.h5", "w")
h5.create_array("/", "ddset", np.linspace(0, 1, 6).astype(np.float64))
h5.create_array("/", "fdset", np.linspace(0, 1, 6).astype(np.float32))
h5.close()

C code (read data)

#include "hdf5.h"
#include "hdf5_hl.h"

int main(void)
{
hid_t file_id;
float data32_32[6];
float data64_32[6];
double data32_64[6];
double data64_64[6];
herr_t status;
size_t i;

/* open file from ex_lite1.c */
file_id = H5Fopen("32or64.h5", H5F_ACC_RDONLY, H5P_DEFAULT);

/* read dataset */
status = H5LTread_dataset_float(file_id, "/fdset", data32_32); /*read from float data*/
status = H5LTread_dataset_float(file_id, "/ddset", data64_32); /*read from double data*/
status = H5LTread_dataset_double(file_id, "/fdset", data32_64);
status = H5LTread_dataset_double(file_id, "/ddset", data64_64);

/* print it */
for (i = 0; i < 6; ++i)
printf("%f %f %f %f\n",
data32_64[i], data64_64[i],
data32_32[i], data64_32[i]);

/* close file */
status = H5Fclose (file_id);

return 0;
}


Output:


0.000000 0.000000 0.000000 0.000000
0.200000 0.200000 0.200000 0.200000
0.400000 0.400000 0.400000 0.400000
0.600000 0.600000 0.600000 0.600000
0.800000 0.800000 0.800000 0.800000
1.000000 1.000000 1.000000 1.000000

Vladimír Šeděnka

Jan 9, 2015, 8:07:34 AM
to amele...@googlegroups.com
Floats:
I know that HDF is capable of storing 64-bit floats. I just feel limited by the Amelet spec. What do you mean by "dramatic consequence"? :-) I think the spec must be a law... not a recommendation. The spec should tell me what to expect. If it says that something is a 32-bit float, nobody will be prepared for doubles. Okay, I believe that your code can read 64-bit data using automatic conversion to 32 bits... but are you sure that it will also work in the case of H5Dread (while specifying a 32-bit mem_type_id)?
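To make my question concrete, here is roughly what I have in mind (an untested sketch, reusing the 32or64.h5 file from Cyril's example):

#include <stdio.h>
#include "hdf5.h"

int main(void)
{
    hid_t  file_id, dset_id;
    float  buf[6];   /* 32-bit memory buffer for the 64-bit dataset */
    herr_t status;
    size_t i;

    file_id = H5Fopen("32or64.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
    dset_id = H5Dopen2(file_id, "/ddset", H5P_DEFAULT);

    /* mem_type_id is a 32-bit float while the file stores 64-bit doubles */
    status = H5Dread(dset_id, H5T_NATIVE_FLOAT,
                     H5S_ALL, H5S_ALL, H5P_DEFAULT, buf);

    for (i = 0; i < 6; ++i)
        printf("%f\n", buf[i]);

    status = H5Dclose(dset_id);
    status = H5Fclose(file_id);
    return 0;
}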

Visualization:
Ooohhh... please do not remind me of ParaView :-D. I believe that ParaView itself is pretty good... however, I have bad experience with very frequent crashes caused (I believe) by the Amelet plug-in. This is actually the reason why I spent a month or two learning OpenGL basics to write my own post-processing tool.
However, here is why I am mentioning post-processing data:
Common practice is to save only the results of the simulation. The visualization tool reads the results and performs some post-processing while keeping all the post-processing data in RAM. The user displays the visualization, optionally makes some screen captures, then closes and forgets. Our problem is that our post-processing will require an extensive amount of computational power to compute the post-processing data, and we do not want to throw it away when the user finishes his work. We do not want to compute everything again when opening the visualization tool the next day.

Geometry:
Well, it would be possible to do that, but it is not very elegant. I think that the HDF file structure should be straightforward (e.g. /geometry, /mesh, /solver, /post-processing).
I am sorry, but I do not understand your idea of how to mark some polygons as holes (using links), nor the "system" entity. Could you please try to explain them again? Maybe the "system" is related to the /simulation category rather than to the geometry?

Process/history:
I would not say sure, but pretty convinced.
You are proposing directories for each kind of data. What is the function of "your" directories? Hierarchical structuring. But HDF is already hierarchical, so why not use its capabilities?
You need a compression program to make a single file... I already have one: the HDF file. I do not have to extract the *.zip to work with the data... I just open it and work with it.
Compression is already included in HDF.
Your approach seems to use various file formats (not only HDF). I would have to care about their limitations (charset, file size, data types...). By keeping all the data in HDF, I only have to care about the limitations of HDF.
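Just to illustrate the point, here is a minimal sketch (the group and dataset names are invented) of hierarchical groups plus a deflate-compressed dataset living in a single HDF5 file:

#include "hdf5.h"

int main(void)
{
    hid_t   file_id, grp_id, dcpl_id, space_id, dset_id;
    hsize_t dims[1]  = {6};
    hsize_t chunk[1] = {6};
    float   data[6]  = {0.0f, 0.2f, 0.4f, 0.6f, 0.8f, 1.0f};

    /* one file, hierarchical groups inside it */
    file_id = H5Fcreate("project.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    grp_id  = H5Gcreate2(file_id, "/mesh", H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* chunked + gzip-compressed dataset, no external zip needed */
    dcpl_id = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl_id, 1, chunk);
    H5Pset_deflate(dcpl_id, 6);

    space_id = H5Screate_simple(1, dims, NULL);
    dset_id  = H5Dcreate2(grp_id, "nodes", H5T_NATIVE_FLOAT, space_id,
                          H5P_DEFAULT, dcpl_id, H5P_DEFAULT);
    H5Dwrite(dset_id, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);

    H5Dclose(dset_id);
    H5Sclose(space_id);
    H5Pclose(dcpl_id);
    H5Gclose(grp_id);
    H5Fclose(file_id);
    return 0;
}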

Best regards
 Vlada

Nathanaël Muot

Jan 26, 2015, 3:49:11 AM
to amele...@googlegroups.com
Hello all,

Sorry for not giving a faster answer, and be assured we are not trying to
discourage anyone from using this file format.

Concerning floating-point storage in Amelet-HDF. A specification is there to define
how data must be stored so that anyone can understand the data without
ambiguity. To this first point, some goals can be added (memory usage,
accuracy, portability, etc.). The specification must define just what is
needed, not too much.

The usage of 64-bit has no dramatic consequence because HDF5 casts the data into
the right type automatically. So data written in 64-bit stays compatible with a
32-bit application (without modification), and vice versa.

Then why encourage 32-bit usage? At the beginning, Amelet-HDF was defined
to store input/output data for EM numerical simulations. In the main case,
32-bit provides acceptable accuracy and has the good property of saving memory
(a good property for an exchange file format).

Concerning accuracy. As a reminder, the machine epsilon for 32-bit floats
is near 10^-7, and it is about 10^-16 for 64-bit. It is not so easy to
have a measurement procedure that guarantees an accuracy of 10^-7. It is also
hard for a numerical calculation to provide 64-bit accuracy even with 64-bit
arithmetic (numerical noise). It is for all these points that 32-bit accuracy
is usually good enough for input and output data.

Even so, there are some specific cases, for example storing intermediate data for
computation restarts, where 64-bit storage is required.

For all these reasons, softening the Amelet-HDF specification is a good idea:
there is no consequence for existing software, it keeps backward compatibility,
and it enlarges the file format's perimeter. But 32-bit floating-point storage remains
encouraged because in many applications it is enough and it saves memory.



I do not know when you tested the ParaView plug-in. Today the ParaView
plug-in has been rewritten to use the latest Amelet-HDF C library (more
stable). The Amelet-HDF C library API has changed between versions, so not all
functionalities have been re-implemented yet. At this time, the ParaView
plug-in can read meshes and data on a mesh.



Concerning geometry. I think in Amelet-HDF the 'category' name is related to
the kind of data, not to the data usage. For example, mesh is used to
store triangles, polygons, etc. (a discrete geometry representation). 'geometry' is
for us considered a reserved name for the most general geometrical notions,
like B-splines, NURBS, geometrical boolean operations and so on. But at this
time no one needs this capability in Amelet-HDF.



To finish, the last point. You are right, HDF5 has a hierarchical structure and
compression. However, using a single technology and a single file has some
drawbacks: it is not flexible and it seems complex. But I do not know all your
requirements. Still, it is possible to define all your data in one
file, if you define two levels in your data file hierarchy. The first level
is used to manage your process, history and so on. The leaves of this first
level are stored in strict Amelet-HDF, and you can use links at the first
level. I will try to give an example:



projects.h5
|-- first_project[@attribute decorator]
|   |-- metadata
|   |-- pres[@attribute decorator]
|   |   |-- metadata
|   |   `-- data[@FILEFORMAT=AMELET_HDF]  # Start an Amelet-HDF node
|   `-- mesh[@attribute decorator]
|       |-- metadata
|       `-- data[@FILEFORMAT=AMELET_HDF]
`-- second_project[@attribute decorator]
    |-- metadata
    |-- pres[@attribute decorator]
    |   |-- metadata
    |   `-- data[@FILEFORMAT=AMELET_HDF]
    `-- mesh[@attribute decorator]
        |-- metadata
        `-- data[@FILEFORMAT=AMELET_HDF] -> link to /first_project/mesh









--
Nathanaël MUOT
email : nathana...@gmail.com

Vladimír Šeděnka

Jan 26, 2015, 8:56:18 AM
to amele...@googlegroups.com
Hello Nathanaël,
you have already convinced me that H5LT can cast the data automatically. However, I am still not sure whether it works if you specify the datatype manually. How about a compound datatype (see www.hdfgroup.org/ftp/HDF5/examples/examples-by-api/hdf5-examples/1_8/C/H5T/h5ex_t_cmpd.c )? I still have some concerns about relying on automatic data casts... but I can suppose that Amelet allows both 32- and 64-bit floats and that HDF I/O will/should take care of it. The intermediate data are the reason for requesting 64 bits.
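To make the compound question concrete, here is an untested sketch of what I mean (the file, dataset and member names are made up): the memory compound declares "value" as a 32-bit float, while the file would store that member as a 64-bit double.

#include "hdf5.h"

typedef struct {
    int   id;
    float value;          /* the file would store this member as a double */
} mem_record_t;

int main(void)
{
    hid_t        file_id, dset_id, mem_type;
    mem_record_t buf[6];
    herr_t       status;

    /* memory compound datatype with a float member */
    mem_type = H5Tcreate(H5T_COMPOUND, sizeof(mem_record_t));
    H5Tinsert(mem_type, "id",    HOFFSET(mem_record_t, id),    H5T_NATIVE_INT);
    H5Tinsert(mem_type, "value", HOFFSET(mem_record_t, value), H5T_NATIVE_FLOAT);

    file_id = H5Fopen("records.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
    dset_id = H5Dopen2(file_id, "/records", H5P_DEFAULT);

    /* does HDF5 also convert the double member to float here? */
    status = H5Dread(dset_id, mem_type, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf);

    H5Dclose(dset_id);
    H5Fclose(file_id);
    H5Tclose(mem_type);
    return 0;
}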

I tested the ParaView plug-in 3-4 years ago... but as I have already mentioned, the post-processing data was only an example of intermediate results to be saved.

Concerning geometry. I partially agree about the relation to the 'category' name. Polygons are basically a mesh. On the other hand, there is a difference in meaning which should be clearly distinguished. The same two polygons may mean an object with a hole in the geometric representation and just two wire loops in the mesh representation. How do you say "this is a hole" in the "mesh" category?

Mixing Amelet and non-Amelet data in a single file is an interesting idea... but what would be the advantage of such an approach? The file would not be Amelet-compliant anyway. At this time, I am considering only some of the Amelet categories (mesh, link, floatingType, physicalModel, maybe something from electromagneticSource). Moreover, some (if not all of them) are going to be modified (H5L, H5R), so the pure Amelet part will occupy only a small percentage of the file. What would be the advantage of specifying which part is pure Amelet and which is not? Are you thinking of reading partial data using a universal Amelet reader? I would prefer using our spec and exporting to Amelet.

May I ask what the current status of Amelet development is? The last public release of the spec is 3 years old. Cyril has told me several times what is about to change. I remember he was even mentioning Amelet 2.0 :-). Are any of you working, or planning to work, on the spec right now? Does the current spec meet your needs? Are you getting feature requests from users?

Best regards
 Vladimir


On Monday, January 26, 2015 at 9:49:11 AM UTC+1, nath wrote: