RE: APT-HDF5 vote / Matlab/HDF5 indexing issues

Markus Kühbach

Oct 31, 2020, 6:59:04 PM
to atompr...@googlegroups.com
This needs several replies, so here is the first round. Try this Matlab code (I used Matlab R2019a):

%% exemplify problems with naive usage of HDF5 and Matlab
% key issue: Matlab uses column-major ordering (aka Fortran-style indexing),
% while HDF5 builds on C and therefore uses row-major (aka C-style) indexing internally
% see the following thread on Matlab and the high-level library of HDF5,
% in particular the comment from James towards the end:
% https://www.mathworks.com/matlabcentral/answers/308303-why-does-matlab-transpose-hdf5-data
% tested here with Andy's slightly modified example, always clearing memory
% and files to reduce possible caching effects
N = 4; %4;
M = 7.5 * 1e7; %7;
h5fn = 'HDF5SpeedIssueMatlab_01.h5';
nruns = 3;
for i = 1:nruns
    clearvars w_arr;
    delete(h5fn); % only warns on the first run, when the file does not exist yet
    % no 'Datatype' is given, so h5create falls back to its default, double
    h5create(h5fn, '/dataset', [N, M]);
    %w_arr = single(reshape(1:N*M,[N, M]));
    % some random floating point data
    % (unifrnd needs the Statistics and Machine Learning Toolbox)
    w_arr = single(unifrnd(0.0, 1.0, N, M));
    %disp('writing')
    % profile the writing part
    % to venus...
    disp('w');
    tic
    h5write(h5fn, '/dataset', w_arr);
    toc
    clearvars r_arr;
    % ... back again
    disp('r');
    tic
    r_arr = h5read(h5fn, '/dataset');
    % OOPS, Matlab high-level lib call internally transposes again
    toc
end

h5disp(h5fn)
% OOPS, so the dataset ends up as double even though w_arr is single
% (the high-level lib converts on write), and it is transposed...
disp('reading')
tic
r_arr = h5read(h5fn, '/dataset');
% OOPS, Matlab high-level lib call internally transposes again
toc
% the R2019a used here transposes internally twice! Check with HDFView:
% the array ends up as 7.5e7 x 4, but w_arr and r_arr are 4 x 7.5e7!
This is what I get: a file of 2,400,002,048 bytes on disk, with I/O to a local Seagate ST2000DM008 2 TB drive, not a network drive:
w
Elapsed time is 1.316264 seconds.
r
Elapsed time is 1.269097 seconds.
w
Elapsed time is 1.247720 seconds.
r
Elapsed time is 1.074530 seconds.
w
Elapsed time is 1.245274 seconds.
r
Elapsed time is 0.880468 seconds.

So this confirms what I would expect:
>> Matlab internally transposes.
>> Achieved substantially more speed than 40MB/s
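
To put the last point in numbers, here is a minimal Matlab sketch of the throughput arithmetic for the first write. It uses only the file size and the elapsed time reported in this message; the variable names are made up for illustration. A rate this far above 40 MB/s may well include operating-system caching, which is an assumption to verify rather than something measured here.

```matlab
% Throughput of the first write, from the numbers reported in this message.
file_bytes = 2400002048;        % file size on disk, in bytes
t_write = 1.316264;             % elapsed time of the first 'w' run, in seconds

rate_MB = file_bytes / 1e6 / t_write;       % decimal megabytes per second
rate_MiB = file_bytes / 1024^2 / t_write;   % binary mebibytes per second

fprintf('write: %.0f MB/s (%.0f MiB/s)\n', rate_MB, rate_MiB);
```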

What do we learn?
>> We are asking ourselves the right questions; I really like and appreciate this discussion!
>> We should better understand why our benchmarks differ.
>> Dan/Andy, when using Matlab we could in principle go with either 3 x N or N x 3, but I at least would always get an N x 4 within the HDF5 file; the game below shows why.
>> We should not let Matlab fool us with h5disp: internally (please check with HDFView) the order is different from what h5disp indicates.
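
One way to read the "transpose" is that no data needs to move at all: a column-major N x M array and a row-major M x N array have the same linear element order, which is consistent with Matlab merely reversing the dimension order. The sketch below checks this for a tiny stand-in array; the helper names colmajor and rowmajor are hypothetical and not part of Matlab or HDF5.

```matlab
% Sketch: a column-major [N, M] array and a row-major [M, N] array
% share the same linear element order.
N = 4; M = 3;                       % tiny stand-in for 4 x 7.5e7
matlab_dims = [N, M];               % dimensions as Matlab reports them
h5_dims = fliplr(matlab_dims);      % dimensions as a C program sees them

% zero-based linear offset of the element at one-based position (r, c)
colmajor = @(r, c, dims) (c - 1) * dims(1) + (r - 1);   % Matlab / Fortran style
rowmajor = @(r, c, dims) (r - 1) * dims(2) + (c - 1);   % C / HDF5 style

same = true;
for r = 1:N
    for c = 1:M
        % element (r, c) in the Matlab view is element (c, r) in the C view
        same = same && (colmajor(r, c, matlab_dims) == rowmajor(c, r, h5_dims));
    end
end
disp(same)
```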

Now play this game and swap the two values:
N = 4; %4;
N = 7.5 * 1e7; % overrides the line above
M = 7.5 * 1e7; %7;
M = 4; % overrides the line above

I get a 7.5e7 x 4 with h5disp, as expected, and voilà, again:
w
Elapsed time is 1.240888 seconds.
r
Elapsed time is 0.959843 seconds.
w
Elapsed time is 1.338568 seconds.
r
Elapsed time is 1.050437 seconds.
w
Elapsed time is 1.841557 seconds.
r
Elapsed time is 1.107122 seconds.

>> So I do not see the point of confusing people by using 4 x N instead of N x 4.
>> I agree with you that speed is not the primary concern for these 2D examples and use cases.
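
The layout argument behind the N x 4 preference (reading x, y, z, m for a range of ion sequence IDs) can be sketched by counting the contiguous runs such a read touches in a row-major file. This is only an illustration under stated assumptions: the ID range is hypothetical, offsets are counted in elements, and a contiguous (not chunked) dataset is assumed.

```matlab
% Sketch: contiguous runs touched when reading a range of ion IDs.
n_ions = 7.5e7; n_cols = 4;
first = 1000; last = 1999;          % hypothetical one-based ion ID range
count = last - first + 1;

% file shape n_ions x 4 (row-major): all four values of one ion are adjacent
start_nx4 = (first - 1) * n_cols;           % zero-based element offset
runs_nx4 = [start_nx4, count * n_cols];     % one run: [offset, length]

% file shape 4 x n_ions (row-major): each quantity is one long row
starts_4xn = (0:n_cols-1)' * n_ions + (first - 1);
runs_4xn = [starts_4xn, repmat(count, n_cols, 1)];   % four runs

fprintf('N x 4: %d run(s), 4 x N: %d run(s)\n', size(runs_nx4, 1), size(runs_4xn, 1));
```

Both layouts read the same number of elements; they differ only in how far apart the pieces are.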

What is your take on this?

But... this is when using the high-level library. Now let's try the low-level interface from Matlab and write out
data of the same shape and size...


Bests,
Markus

-----Original Message-----
From: atompr...@googlegroups.com <atompr...@googlegroups.com> On behalf of Daniel Haley
Sent: Saturday, October 31, 2020 15:20
To: atompr...@googlegroups.com
Subject: Re: APT-HDF5 vote

Hi Markus,

GB/s is quite high for many systems. Here is me writing random data to disk, using a random (non-blocking) data source and a file as the target.

$ time dd if=/dev/urandom of=benchmark bs=1M count=2048 ...
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 32.6801 s, 65.7 MB/s

real 0m32,735s
user 0m0,000s
sys 0m13,578s

As you can see, random data write for me is around 65 MB/s, not GB/s.
I'd hazard that none of Andy's benchmarks are unexpected, and that they could be I/O bound rather than bound at the data encoding stage (though more tests via fopen + random data on Andy's system would be needed).

Ions        7.50E+07
Size/Ion    16
Bytes       1200000000

KB          1171875
MB          1144.4091796875
GB          1.11758708953857

Seconds     28.77
MB/s        39.7778651264338
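
The arithmetic in this table can be reproduced with a few lines. The following is a sketch in Matlab with illustrative variable names; it assumes, as the table does, 16 bytes per ion and units in powers of 1024.

```matlab
% Reproduce the table: data volume and write rate for 7.5e7 ions.
n_ions = 7.5e7;
bytes_per_ion = 16;                 % as in the table (4 values of 4 bytes each)
n_bytes = n_ions * bytes_per_ion;

kib = n_bytes / 1024;               % labelled KB in the table
mib = n_bytes / 1024^2;             % labelled MB in the table
gib = n_bytes / 1024^3;             % labelled GB in the table

seconds = 28.77;                    % write time taken from Andy's test
rate = mib / seconds;               % labelled MB/s in the table

fprintf('%.1f MiB in %.2f s gives %.2f MiB/s\n', mib, seconds, rate);
```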

I'd be interested to see the values for (2), and if there is a big difference for (4), why does this not show in Andy's test? Why would the data not be contiguous?

I am having trouble following your explanation, and a code sample would greatly help to narrow down your concern here, and allow us to all reproduce the exact problem.

I've added experimental code to libatomprobe to help provide an interface to these files, so you can see the sample code I am using there to do this access.

Thanks,

Dan

On 30.10.20 21:46, Markus Kühbach wrote:
> Dear Andy, dear Daniel, and others,
>
> we do not necessarily train people how to think closer about which use
> cases they want to touch with this preferentially.
>
> Quite frankly, the Matlab test here referred is not convincing to me:
> Were these contiguous datasets? Yes, because this is the default
> behaviour when using the high-level H5 lib in Matlab.
> So, then you would typically expect the HDF5 library to write ~ 0.5 -
> a few GB/s ...
> So why does 4 x 75 mio x likely 4 B take in both strategies >20sec?
> This is flabbergasting, network drive in use?
> Also I/O test should ALWAYS be executed multiple times plus caching is
> an issue whose effects are admittedly not captured by such simple
> test.
>
> 1.) However, yes, the test shows that for most APT datasets speed
> effects are minor.
> So pick the faster: 2 seconds every time you read and write will add up.
> 2.) Do the same test for a 3D dataset. The numbers will very likely
> change and this is because of the nitty gritty details of slice
> indexing. I have the feeling that there is a misconception of
> HDF5 inasmuch as that one does not really have to worry how to shape
> the arrays.
> Yes, different shaping works, so no necessity in principle to worry.
> But for different use cases not with the same speed.
> 3.) So what are the use cases how we want to read/write array data
> with our standard?
> Think as follows, we have e.g. X, Y, Z coordinates of the recon, let's
> say take the 75mio dataset referred to.
> Use case:Let's pick a specific range of ion sequence IDs.
> Then if you store 4 x 75mio shape you will have at least four memory
> and disk jumps over larger distances if you want to filter x,y,z for a
> specific range of ion sequence IDs.
> By contrast, for 75mio x 4 interleaved HDF5 would give you a
> continuous block with on average fewer jumps.
> The rest depends on how the file is actually laid out on the disk, a
> level I don't want to touch, because to be fair, yes, here hardware
> details matter and benchmarking with such small datasets is rather
> useless.
>
> 4.) Now think about we are going to store 3d data in our format, how
> should we train people to do it?
> Let's take N>=M>=L, take a slice at fixed L: That slice (shape NxM)
> will be continuous.
> But, now take a slice in orthogonal direction (e.g. fixed N, worst MxL):
> you will get substantially more jumps
> for both memory and disk. So we are not talking about premature
> optimization, then.
> So why would we then use a different indexing for 2d data? At this
> point I am exactly in line with Anna's concern. Hopefully, we remain
> consistent.
>
> For these reasons, I have recommended to shape N>=M (>=L) with N,M,L > 0.
> By the way this is regardless of whether we use contiguous or chunked
> layout.
> For the second the discussion is even more involved.
>
>
> Bests,
> Markus
>
>
>
> ----------------------------------------------------------------
> Dr.-Ing. Markus Kühbach
>
> Max-Planck-Institut für Eisenforschung GmbH Department "Microstructure
> Physics and Alloy Design"
> Research group "Theory and Simulation"
> Room 649
> +49 211 6792 385
> m.kue...@mpie.de
>
>
>
> *From: * "London, Andy" <andy....@ukaea.uk>
> *To: * "atompr...@googlegroups.com" <atompr...@googlegroups.com>
> *Sent: * 10/30/2020 6:29 PM
> *Subject: * RE: APT-HDF5 vote
>
> Hi all,
>
> I did a quick test in matlab just to check that the order didn’t
> matter for writing real data:
>
> Write 4 by 7.5E7 ions Elapsed time is 28.779395 seconds.
>
> Write 7.5E7 by 4 ions Elapsed time is 26.822038 seconds.
>
> >> hdf5_speed_test
>
> Read 4 by 7.5E7 ions Elapsed time is 1.156697 seconds.
>
> Read 7.5E7 by 4 ions Elapsed time is 1.058228 seconds.
>
>  
>
> Markus, was your concern about reading chunks of ions? I.e. you want
> [x1 y1 z1 m1] [x2 …] etc.?
>
> Andy
>
>  
>
> *From:*'Anna Ceguerra' via AtomProbe TC <atompr...@googlegroups.com>
> *Sent:* 27 October 2020 22:07
> *To:* atompr...@googlegroups.com
> *Subject:* Re: APT-HDF5 vote
>
>  
>
> Vote Option 1 (Disseminate)
>
>  
>
> I am hoping the community can give feedback on my concern, which is
> the order of the dimensions (I prefer n x 3 for example, not 3 x n
> which is in the document).
>
>  
>
> Regards,
>
> Anna.
>
>  
>
> *From: *atompr...@googlegroups.com
> *Date: *Tuesday, 27 October 2020 at 9:20 pm
> *To: *atompr...@googlegroups.com
> *Subject: *APT-HDF5 vote
>
> Hi all,
>
> As required by our charter,  I am calling for a renewed vote for the
> following action:
>
> To disseminate the current draft standard on APT-HDF, for a "Request for
> comments" phase from the community, where we invite the wider community
> to review the document.
>
> The proposed standard and the associated software links will be provided
> as part of the mailout. Dissemination will be via the ISC, likely the
> IFES mailing list.
>
> The publicly visible standard document is here:
>
>
> https://protect-au.mimecast.com/s/hx66C81V0PT1RqrLfnmWKL?domain=docs.google.com
>
> Option 1: Disseminate the Draft standard
> Option 2: Withhold the Draft standard for further revisions.
>
> Please indicate in your response:
>
> "Vote : Option X", where X is 1 or 2
>
> If there are small errors that should be corrected, please directly
> modify the document, rather than on-list discussion. If you believe
> further revisions are needed, we can discuss this separately to your
> vote.
>
> I'm told that the document above can be hosted on the IFES website, but
> am waiting for the SC to actually upload.
>
> Please do submit your vote as soon as possible.
>
> Thanks,
>
> Daniel
>
> --
> You received this message because you are subscribed to the Google
> Groups "AtomProbe TC" group.
> To unsubscribe from this group and stop receiving emails from it,
> send an email to atomprobe-tc...@googlegroups.com
> <mailto:atomprobe-tc...@googlegroups.com>.
> To view this discussion on the web, visit
> https://protect-au.mimecast.com/s/7oHKC91WPRTQBq19tEdM-0?domain=groups.google.com.
>
>
>
> -------------------------------------------------
> Max-Planck-Institut für Eisenforschung GmbH
> Max-Planck-Straße 1
> D-40237 Düsseldorf
>
> Commercial register: B 2533
> Local court: Amtsgericht Düsseldorf
>
> Management
> Prof. Dr. Gerhard Dehm
> Prof. Dr. Jörg Neugebauer
> Prof. Dr. Dierk Raabe
> Dr. Kai de Weldige
>
> VAT ID no.: DE 11 93 58 514
> Tax number: 105 5891 1000
>
>
> Please consider that invitations and e-mails of our institute are only
> valid if they end with …@mpie.de.
> If you are not sure of the validity please contact r...@mpie.de
> -------------------------------------------------
>

Markus Kühbach

Oct 31, 2020, 7:35:24 PM
to atompr...@googlegroups.com
Part 2, now with the low-level library:

% following https://www.mathworks.com/matlabcentral/answers/308303-why-does-matlab-transpose-hdf5-data
clearvars;
N = 4; %4;
%N = 7.5 * 1e7;
M = 7.5 * 1e7; %7;
%M = 4;
h5fn = 'HDF5SpeedIssueMatlab_02.h5';
nruns = 3;
for i = 1:nruns
    clearvars w_arr;
    delete(h5fn); % only warns on the first run, when the file does not exist yet
    fid = H5F.create(h5fn,'H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');
    %w_arr = single(reshape(1:N*M,[N, M]));
    w_arr = single(unifrnd(0.0, 1.0, N, M));
    %disp('writing')
    % profile the writing part
    disp('w');
    tic
    % file datatype: IEEE 64-bit floating point, little-endian;
    % w_arr is single, so the values are converted to double on write
    dtypid = H5T.copy('H5T_IEEE_F64LE');
    dims = size(w_arr);
    maxdims = dims;
    % flip the Matlab dimensions into the C-style order of the HDF5 library
    h5dims = fliplr(dims);
    %h5dims = dims;
    h5maxdims = fliplr(maxdims);
    %h5maxdims = maxdims;
    dspcid = H5S.create_simple(2, h5dims, h5maxdims);
    dsetid = H5D.create(fid,'/dataset',dtypid,dspcid,'H5P_DEFAULT');
    H5D.write(dsetid,'H5ML_DEFAULT','H5S_ALL','H5S_ALL','H5P_DEFAULT',w_arr);
    H5D.close(dsetid);
    H5S.close(dspcid);
    H5T.close(dtypid);
    H5F.close(fid);
    toc
    % clearvars r_arr;
    % disp('r');
    % tic
    % r_arr = h5read(h5fn, '/dataset');
    % % OOPS, Matlab high-level lib call internally transposes again
    % toc
end

w
Elapsed time is 1.290342 seconds.
w
Elapsed time is 1.259551 seconds.
w
Elapsed time is 1.224056 seconds.

Again the data get laid out as 7.5e7 x 4 in the HDF5 file, but h5disp keeps telling us they are 4 x 7.5e7.
OOPS... but hey, we can force Matlab to write the data as 4 x 7.5e7 into the HDF5 file as well, by uncommenting h5dims = dims and h5maxdims = maxdims:
w
Elapsed time is 1.235353 seconds.
w
Elapsed time is 1.292826 seconds.
w
Elapsed time is 1.285740 seconds.

What do we learn?
High- and low-level libs perform about the same; Matlab plays internal transpose tricks to go from Fortran-style to C-style layout automatically.
If I used the high-level functions, which are meant for end users, I would create the awkward situation
that what I see in HDFView and in h5disp is always inconsistent, although Matlab takes care of the transposes
and speed is not an issue. In my opinion, this just creates unnecessary inconsistencies, especially for those users who are
less experienced in handshaking between Matlab, Python, C, Fortran, R tools...
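
For anyone who wants to see both views side by side without HDFView, here is a small sketch that compares the dimensions reported by the high-level interface (h5info) with those returned by the low-level H5S.get_simple_extent_dims, which, as far as I understand the MathWorks documentation, follows the C-style order of the HDF5 library. The file name and the tiny 4 x 7 shape are made up for illustration, and passing 'Datatype','single' to h5create is one way to keep single precision on disk.

```matlab
% Sketch: compare the dimensions seen by the high- and low-level interfaces.
fn = [tempname, '.h5'];                     % throwaway file, hypothetical name
h5create(fn, '/dataset', [4, 7], 'Datatype', 'single');
h5write(fn, '/dataset', single(rand(4, 7)));

% high-level view: dimensions in Matlab (Fortran-style) order
info = h5info(fn, '/dataset');
matlab_dims = info.Dataspace.Size;

% low-level view: dimensions in the C-style order of the HDF5 library
fid = H5F.open(fn, 'H5F_ACC_RDONLY', 'H5P_DEFAULT');
dsetid = H5D.open(fid, '/dataset');
dspcid = H5D.get_space(dsetid);
[~, h5_dims] = H5S.get_simple_extent_dims(dspcid);
H5S.close(dspcid);
H5D.close(dsetid);
H5F.close(fid);
delete(fn);

disp(matlab_dims);
disp(h5_dims);
```

If the two vectors are mirror images of each other, that matches the difference between h5disp and HDFView described above.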


Bests,
Markus

-----Original Message-----
From: atompr...@googlegroups.com <atompr...@googlegroups.com> On behalf of Markus Kühbach
Sent: Saturday, October 31, 2020 23:59
To: atompr...@googlegroups.com
Subject: RE: APT-HDF5 vote / Matlab/HDF5 indexing issues

Markus Kühbach

Oct 31, 2020, 7:38:10 PM
to atompr...@googlegroups.com
Part 3:

I vote yes for disseminating the standard,
although my substantiated opinion is clearly against using M << N.
I learned something new: yes, the question is in fact more a matter of being consistent and clear
than of speed.


Bests,
Markus

-----Original Message-----
From: atompr...@googlegroups.com <atompr...@googlegroups.com> On behalf of Markus Kühbach
Sent: Sunday, November 1, 2020 00:35

Daniel Haley

Nov 1, 2020, 6:14:24 PM
to atompr...@googlegroups.com

Dear TC,

With Markus' vote, I believe the motion is now carried. I will forward
our response to the ISC, to explain the TC's outcome here.

Though I did not vote, I would like to thank the TC for the work here,
and believe this draft standard will provide a solid footing for
improved interoperability and capacity for researchers. I hope that we
will see good engagement and take-up of this standard from our partners
both in academia and industry.

Thanks,

Daniel

London, Andy

Nov 2, 2020, 5:15:06 AM
to atompr...@googlegroups.com
Markus, it looks like you have a much faster hard drive than me! Maybe time for an upgrade?
Andy

For:
N = 4; %4;
M = 7.5 * 1e7; %7;

>> hdf5_speed_test_markus
w
Elapsed time is 28.482847 seconds.
r
Elapsed time is 1.594164 seconds.
w
Elapsed time is 29.007297 seconds.
r
Elapsed time is 1.397504 seconds.
w
Elapsed time is 30.325398 seconds.
r
Elapsed time is 1.413091 seconds.
HDF5 HDF5SpeedIssueMatlab_01.h5
Group '/'
    Dataset 'dataset'
        Size: 4x75000000
        MaxSize: 4x75000000
        Datatype: H5T_IEEE_F64LE (double)
        ChunkSize: []
        Filters: none
        FillValue: 0.000000
reading