ESRF Data File Support

Joshua Lande

Feb 27, 2008, 2:07:28 PM
to area-diffrac...@googlegroups.com
I was informed by Apurva of an email by John Daniels who works at the
European Synchrotron Radiation Facility. John (and apparently others)
have requested that the Area Diffraction Machine be able to read in
files of the ESRF Data File format. These files have an .edf
extension. John described this format in an email:

> Hi,
>
> It would be great to be able to put the edf data format directly into
> your program for testing. Unfortunately there is no strict standard,
> but generally there is a 1 kbyte ASCII header which contains information,
> most importantly the array size, data type, and byte order. For the
> Pixium detector operated on ID15 an example header would be the
> following (32bit example),
>
> {
> HeaderID = EH:000001:000000:000000 ;
> Image = 1;
> ByteOrder = LowByteFirst ;
> DataType = UnsignedLong ;
> Dim_1 = 2640;
> Dim_2 = 1920;
> Size = 20275200;
> count_time = Na ;
> point_no = 0 ;
> preset = Na ;
> col_end = 1919;
> col_beg = 0;
> row_end = 2639;
> row_beg = 0;
> col_bin = 1;
> row_bin = 1;
> time = Wed Dec 05 18:14:43 2007;
> acq_timestamp = 1196874880.645389;
> dir = N:/inhouse/JohnD/December07/Standards;
> suffix = .edf;
> prefix = S5-12-1810_;
> run = 1;
> title = PIXIUM4700 Detector Image ;
> }
>
> The header always ends with a "}"; it is best to search for this and
> then begin loading the data from the next byte (the 1 kbyte header size
> is not always adhered to). This should then be able to load images
> from all .edf type files.
>
> ...
>
> Look forward to seeing how it works.
>
> Cheers
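
For readers following along, the header rule John describes (look for the closing "}" rather than trusting the 1 kbyte size) can be sketched in Python roughly as follows. This is a minimal sketch, not the reader that ended up in the Area Diffraction Machine: `parse_edf_header` is a hypothetical helper name, and skipping one newline right after the "}" is an assumption about how the files are usually written, not something stated in the email.

```python
def parse_edf_header(raw):
    """Return (fields, data_offset) for the first '{ ... }' block in raw bytes."""
    start = raw.find(b"{")
    end = raw.find(b"}", start + 1)
    if start < 0 or end < 0:
        raise ValueError("no '{ ... }' header block found")

    # Entries look like "key = value ;" so split on ';' and then on the first '='.
    text = raw[start + 1:end].decode("ascii", errors="replace")
    fields = {}
    for entry in text.split(";"):
        key, sep, value = entry.partition("=")
        if sep:
            fields[key.strip()] = value.strip()

    # Data starts on the byte after '}'.
    # Assumption: a single newline directly after '}' still belongs to the header.
    offset = end + 1
    if raw[offset:offset + 1] == b"\n":
        offset += 1
    return fields, offset


# Tiny made-up file: a shortened header followed by 8 bytes of pixel data.
example = (b"{\nByteOrder = LowByteFirst ;\nDataType = UnsignedLong ;\n"
           b"Dim_1 = 2640;\nDim_2 = 1920;\n}\n" + b"\x00" * 8)
fields, offset = parse_edf_header(example)
print(fields["DataType"], fields["Dim_1"], fields["Dim_2"], offset)
```

All values come back as strings, so fields such as Dim_1 and Dim_2 still need an int() before use.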

I emailed John asking him a few questions about the format:

> Hi John. My name is Joshua Lande and I coded up most of the Area
> Diffraction Machine. I am an undergraduate at Marlboro College in
> Vermont and I am happy to work with you to get my program to read in
> whatever file formats you are interested in.
>
> Although I haven't yet coded up .edf support, based on your
> description it doesn't look like it should be too hard to get it to
> work. Thanks for the files.
>
> I have a few questions for you at this point. The first is that some
> of the files you sent me store data as 4byte unsigned integers.
> Right now, my program stores any data I load into the program as 4
> byte signed integers because I wrote the program in Python and
> Python's Numeric library does not handle unsigned values very well.
> In principle, I could migrate my code over to unsigned 8 byte
> integers, but this would slow my code down and would require a bit
> of overhead to implement. So I guess my question is: would there be
> any problem with clipping any of the pixels greater than or equal to
> 2^31 = 2,147,483,648 and setting them all to 2^31-1 so that I can
> fit the files you sent me into my array of signed 4 byte integers?
> Or are pixels with these really high intensities important so that
> clipping any of them would be a bad idea? I can always make the
> program spit out warnings (probably to the console) if it finds any
> values that are too large to store in the program.
>
> Second, I see that the data you sent only has the header field
> DataType set to "UnsignedShort" and "UnsignedLong". Are these the
> only possible values? Or should I expect to also see things like
> floating point numbers in the data?
>
> Is the field ByteOrder always set to "LowByteFirst"?
>
> ...
>
> Josh

He replied with the answer

> ...
>
> No problem to convert the data to 4 byte signed. It is certainly
> unlikely with all the detectors I use (or any for that matter) that
> this value will be exceeded, but in the unlikely case a warning
> would be great.
>
> The .edf format is a bit of an "anything goes" type format, so
> expecting any data type may be needed for complete compatibility. Is
> the field ByteOrder always set to "LowByteFirst"? I don't know...
> again I guess my advice would be expect anything.
>
> So I apologize for the randomness of the .edf format and hope it
> isn't too much of a pain to add.
>
> Cheers
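
Since the advice is to "expect anything", one way to cope is a lookup table from the DataType and ByteOrder header fields to a `struct` format, failing loudly on anything unknown. The sketch below is only an illustration: `decode_pixels` is a hypothetical helper, the DataType names are the ones mentioned in this thread, the byte widths are assumptions (UnsignedLong is taken as 4 bytes because the example header has Size = 2640 * 1920 * 4), and "HighByteFirst" is assumed to be the big-endian counterpart of "LowByteFirst".

```python
import struct

# DataType names as they appear in this thread; the widths are assumptions.
FORMAT_CODES = {
    "UnsignedShort": "H",    # 2 bytes
    "UnsignedLong": "I",     # 4 bytes, matching Size in the example header
    "UnsignedInteger": "I",  # 4 bytes
    "SignedInteger": "i",    # 4 bytes
    "FloatValue": "f",       # 4 bytes
    "DoubleValue": "d",      # 8 bytes
}

# "HighByteFirst" is an assumed name for the big-endian case.
BYTE_ORDERS = {"LowByteFirst": "<", "HighByteFirst": ">"}


def decode_pixels(fields, payload):
    """Unpack Dim_1 * Dim_2 values from payload according to the header fields."""
    try:
        code = FORMAT_CODES[fields["DataType"]]
        order = BYTE_ORDERS[fields["ByteOrder"]]
    except KeyError as missing:
        raise ValueError("unsupported or missing header value: %s" % missing)

    count = int(fields["Dim_1"]) * int(fields["Dim_2"])
    fmt = "%s%d%s" % (order, count, code)
    needed = struct.calcsize(fmt)
    if len(payload) < needed:
        raise ValueError("expected %d bytes of data, got %d" % (needed, len(payload)))
    return struct.unpack(fmt, payload[:needed])


header = {"DataType": "UnsignedShort", "ByteOrder": "LowByteFirst",
          "Dim_1": "2", "Dim_2": "2"}
pixels = decode_pixels(header, bytes([1, 0, 2, 0, 3, 0, 4, 1]))
print(pixels)
```

For full-size detector images an array library would be far faster than `struct.unpack`; the table-driven lookup is the part that carries over.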


I will see what I can do to implement this format and respond after I
make some progress.

Josh

joshu...@gmail.com

Feb 27, 2008, 10:53:24 PM
to Area Diffraction Machine
Ok, I coded up the program so that it can read in all the .edf files
I've been given. Right now, it can only open files that are 2 byte or
4 byte integers. If you have any .edf files with the data stored as
anything else, I can try to get my program to read them in. Actually,
if you find any .edf files that my program can't open, just send them
to me and I will see what I can do.

Josh

joshu...@gmail.com

Feb 27, 2008, 11:00:02 PM
to Area Diffraction Machine
One more thing: I put this new feature in version 1.0.1.

joshu...@gmail.com

Feb 28, 2008, 7:28:49 PM
to Area Diffraction Machine
Today I got some interesting information from a person named Armando:

> Hello,
>
> ESRF users need to rely on FIT2D to access our ESRF formatted files. I
> guess it would be very interesting for you and for us if your program
> would be able to read and to write ESRF data formatted files. That
> would not require much effort from your side because I am sending to
> you a python module that is able to read and to write those files.
>
> Just running "python EdfFile.py" will generate several edf files.
>
> To read a single image edf:
>
> f = EdfFile.EdfFile(EDFFILENAME)
> data = f.GetData(0)
>
> data will be a 2D numpy array with the image.
>
> To write a single 2D image contained in array data:
>
> edf = EdfFile(EDFFILENAME)
> # You can write any relevant information in the dictionary.
> edf.WriteImage({}, data)
> del edf  # to force file close
>
>
> If you have any additional question or comment, please do not hesitate
> to contact me.
>
> Best regards,
>
> Armando

This was shortly followed by:

> Hi again,
>
> It seems you have added your own ESRF.py module.
>
> I would recommend you to use the official module EdfFile.py because it
> is ready for 64 bits and is freely accessible via the PyMca package
> (pymca.sourceforge.net).
>
> If you need the old Numeric version, I can also provide it.
>
> Best regards,
>
> Armando

I replied to him with:

> Hi Armando. I am happy to hear interest among ESRF users for my program
> and I too really want to see it get well implemented. Thanks for your
> informative feedback.
>
> Your EdfFile.py file looks great and will be easy to integrate because
> it is GPL'd code like ours. I trust that you guys know better than I do
> how to read in these files. I only wrote my hack of a file reader
> because after googling for a while I couldn't find anything like it.
> I'll be happy to rewrite my code to use EdfFile.py instead.
>
> And I am glad that you noticed I am using the antiquated Numeric
> package in my program. I've been wanting to switch everything over to
> numpy for a while but have been putting it off to work on more
> noticeable improvements because I am afraid that it will introduce
> weird bugs into the program. So if you could send me the Numeric
> version of the file, that would be great.
>
> The Area Diffraction Machine stores all of its data as 32bit integers,
> so I am not really sure what I would want to do with 64 bit images? Is
> there ever any interesting data in these files with values above
> 2^31-1? Would it be a problem if I just clipped all the data to fit
> into my arrays?
>
> Also, I am not really sure what to do if I found multiple images in one
> of the files. My program can only handle one at a time. Do you think
> the best thing to do would be to add them all together?
>
> Thanks,
>
> Josh

He replied to me with the response

> Hi Joshua,
>
>> Hi Armando. I am happy to hear interest among ESRF users for my
>> program and I too really want to see it get well implemented. Thanks
>> for your informative feedback.
>>
>> Your EdfFile.py file looks great and will be easy to integrate because
>> it is GPL'd code like ours. I trust that you guys know better than I do
>> how to read in these files. I only wrote my hack of a file reader
>> because after googling for a while I couldn't find anything like it.
>> I'll be happy to rewrite my code to use EdfFile.py instead.
>
> Great!
>
> A thing that would be really interesting for us is that you also offer
> the possibility to save the different images in our format too. As you
> have noticed it is pretty straightforward :-)
>
> WriteImage({'Title': "your own informative message"}, 2Dimagedata,
> DataType = "FloatValue", Append=0)
>
> or
>
> WriteImage({'Title': "your own informative message"}, 2Dimagedata,
> DataType = "DoubleValue", Append=0)
>
> should do the job.
>
>
>> And I am glad that you noticed I am using the antiquated Numeric
>> package in my program. I've been wanting to switch everything over to
>> numpy for a while but have been putting it off to work on more
>> noticeable improvements because I am afraid that it will introduce
>> weird bugs into the program.
>
> I ported the whole PyMca in one day. The most important thing was that
> calculation results were identical to the very last digit. That made
> me feel quite confident. Some really minor bugs appeared and were
> solved a couple of weeks later.
>
> Many, many things are converted just by replacing "import Numeric" with
> "import numpy.oldnumeric as Numeric". They even offer an automatic
> conversion tool.
>
> The C code turned out to be even easier: just one include to change.
>
>> So if you could send me the Numeric
>> version of the file, that would be great.
>
> It's attached. It uses the old fashion of raising exceptions and may
> give you some unwanted deprecation warnings under python 2.5 when
> having an exception (a diff with the previous numpy version will
> illustrate what I mean) but it will work.
>
>> The Area Diffraction Machine stores all of its data as 32bit integers,
>> so I am not really sure what I would want to do with 64 bit images?
>
> Ok, then use the types "SignedInteger" or "UnsignedInteger" when saving
> edf files. I guess the result of calculations will nevertheless be
> floats or doubles.
>
> We do not use 64-bit images either. The version I am sending to you
> does not support them. The problem with that Numeric version comes when
> reading 32-bit unsigned longs on 64-bit machines. The newer numpy
> version of EdfFile.py deals with that problem properly. I do not think
> you should worry if you generate your EDF images with the FloatValue,
> DoubleValue, SignedInteger or UnsignedInteger datatypes.
>
>> Is
>> there ever any interesting data in these files with values above
>> 2^31-1? Would it be a problem if I just clipped all the data to fit
>> into my arrays?
>
> As I said, that will never happen. Our experimental data are 32 bit.
>
>
>> Also, I am not really sure what to do if I found multiple images in
>> one of the files. My program can only handle one at a time. Do you
>> think the best thing to do would be to add them all together?
>
> I would suggest you read just the first one (GetData(0)). Our users
> will always be able to generate a single image file if needed.
>
>> Also, I've created a newsgroup for my program to discuss issues
>> like this. Do you mind if I repost your email to the group?
>
> No problem. By the way, my own code, PyMca, is also used at SSRL
> (Sean Brennan).
>
>
>> Thanks,
>
> Thanks to you,
>
> Best regards,
>
> Armando


I replied to Armando with the following response

> Thanks for your helpful advice. I took a look at PyMca and although I
> have no idea what it does it looks pretty slick. I especially like how
> it uses PyQt. It looks much nicer on my mac than my program does using
> Tkinter (which IMHO is kind of ugly). Your plots look really nice.
>
> EdfFile.py looks like it does just what I want and I was surprised at
> how little work it took to implement in my code. I want to find a nice
> way to clip all the values that could possibly be above 2^31-1 and it
> doesn't look like Numeric's clip() function is compatible with UInt32
> data. I guess what I am going to have to do is convert whatever data
> the EdfFile object returns into Int32s and then set any values I find
> below 0 to 2^31-1. Anyway, if you know of a better way to clip data
> arrays holding UInt32s, I am all ears.
>
> It would be nice if my program could convert between file formats.
> Right now my program can only save data out as formats that the PIL
> recognizes but it doesn't look like saving as edf data would be
> particularly hard so I will try to add that feature when I have a
> chance.


joshu...@gmail.com

Feb 28, 2008, 7:32:23 PM
to Area Diffraction Machine
I got the following response from Armando:

> Hi Joshua,
>
> Quoting Joshua Lande <jol...@marlboro.edu>:
>
>> Thanks for your helpful advice. I took a look at PyMca and although I
>> have no idea what it does it looks pretty slick. I especially like how
>> it uses PyQt. It looks much nicer on my mac than my program does using
>> Tkinter (which IMHO is kind of ugly). Your plots look really nice.
>
> If one day you decide to forget about TkInter I would say PyQt is a very
> good way to go. I "suffered" TkInter for three years and Qt is way
> better (and faster). I am originally a scientist and my graphical user
> interfaces are functional/practical but still far from being of high
> aesthetic quality. You should see what my computer science colleagues are
> able to do with the very same tools :-) (sometimes I get depressed)
>
>>
>> EdfFile.py looks like it does just what I want and I was surprised at
>> how little work it took to implement in my code. I want to find a nice
>> way to clip all the values that could possibly be above 2^31-1 and it
>> doesn't look like Numeric's clip() function is compatible with UInt32
>> data.
>> I guess what I am going to have to do is convert whatever data
>> the EdfFile object returns into Int32s and then set any values I find
>> below 0 to 2^31-1. Anyway, if you know of a better way to clip data
>> arrays holding UInt32s, I am all ears.
>
>
> Do you internally work with UInt32? I thought you would do all
> calculation with doubles.
>
> I would not say it is a nice way to clip the data but this should be a
> possibility (untested):
>
> mask1 = original_data <= (pow(2,31)-1)
> # other way to write it: mask2 = (1 - mask1)
> mask2 = original_data > (pow(2,31)-1)
> new_data = original_data * mask1 + (pow(2,31)-1) * mask2
> new_data = new_data.astype(Numeric.UInt32)
>
>>
>> It would be nice if my program could convert between file formats.
>> Right now my program can only save data out as formats that the PIL
>> recognizes but it doesn't look like saving as edf data would be
>> particularly hard so I will try to add that feature when I have a
>> chance.
>
> It should be very easy. In my case the user chooses the file format from the
> file dialog itself.
>
>>
>> By the way, I am not actually at SSRL. I am a senior at Marlboro
>> College in Vermont :) I got involved with Apurva over the summer doing
>> a SULI internship. I have applied to Stanford for graduate school but
>> am still waiting to hear back from them.
>
> Sorry, I thought you were at SSRL. I hope you will be there soon ;-)
>
> Best regards,
>
> Armando
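
With numpy (the successor to Numeric that Armando mentions), the masking idea above could be written roughly like this. It is a minimal sketch under stated assumptions: `clip_to_int32` and `INT32_MAX` are names made up for illustration, the input is assumed to be an unsigned 32-bit (or wider) integer array, and returning the number of clipped pixels is just one way to drive the console warning John asked for.

```python
import numpy as np

INT32_MAX = 2**31 - 1  # 2147483647, the largest value a signed 32-bit pixel can hold


def clip_to_int32(data):
    """Clip values above INT32_MAX and return (int32 array, number of clipped pixels)."""
    data = np.asarray(data)
    too_big = data > INT32_MAX      # boolean mask, like mask2 in the email
    clipped = data.copy()           # leave the caller's array untouched
    clipped[too_big] = INT32_MAX
    return clipped.astype(np.int32), int(too_big.sum())


raw = np.array([0, 5, 2**31 - 1, 2**31, 2**32 - 1], dtype=np.uint32)
image, n_clipped = clip_to_int32(raw)
if n_clipped:
    print("warning: %d pixel(s) were clipped to %d" % (n_clipped, INT32_MAX))
```

The clipping happens while the data is still unsigned, so the final conversion to int32 never has to wrap a value around.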

Followed shortly by:

> Sorry for the mail bombing. I have just realised a potential issue:
>
> Quoting Joshua Lande <jol...@marlboro.edu>:
>
>> I guess what I am going to have to do is convert whatever data
>> the EdfFile object returns into Int32s and then set any values I find
>> below 0 to 2^31-1.
>
> Some images are background subtracted. Therefore they may have some
> negative values (close to 0).
>
> Some acquisition programs on our side also mark some pixels to be ignored
> by giving them a relatively high negative value.
>
> As far as I know, your program allows automatic masking below and above
> user defined limits. So, I do not expect many problems, but I just wanted
> to point it out to you.
>
> Armando
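
The issue Armando raises can be seen on a three-pixel example: treating every negative value as an overflow is only safe when the file really stored unsigned data. The sketch below uses numpy and two hypothetical helpers, `naive_clip` and `dtype_aware_clip`; it assumes the array's dtype reflects the DataType declared in the file and that signed input is at most 32 bits wide.

```python
import numpy as np

INT32_MAX = 2**31 - 1


def naive_clip(data):
    """Convert to int32, then treat every negative value as an overflow."""
    out = np.asarray(data).astype(np.int32)  # astype returns a new array
    out[out < 0] = INT32_MAX
    return out


def dtype_aware_clip(data):
    """Clip only when the stored type is unsigned and at least 32 bits wide."""
    data = np.asarray(data)
    if data.dtype.kind == "u" and data.dtype.itemsize >= 4:
        data = data.copy()
        data[data > INT32_MAX] = INT32_MAX
    # Signed input falls through unchanged, so negative pixels survive.
    return data.astype(np.int32)


# Background-subtracted pixel, an "ignore me" marker, and an ordinary pixel.
signed = np.array([-3, -100000, 42], dtype=np.int32)
print(naive_clip(signed))        # both negative pixels are overwritten
print(dtype_aware_clip(signed))  # negative pixels are kept
```

Keeping the negative markers intact leaves them available for the program's own below-threshold masking.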

joshu...@gmail.com

Feb 28, 2008, 7:33:57 PM
to Area Diffraction Machine
Finally, I had the following exchange with John from above:

> Hi Josh, I can't load the .edf's, attached is a screenshot of the error.
> Sorry, I'll start using your discussion board next time.

I replied to this with the comment

> You are right. I am sorry about this. I tested my program on the Mac but
> not on Windows. Usually there won't be bugs on one version that don't
> show up on the other, but it was just a mistake on my part.
>
> Anyway, I've been talking to V. Armando Sole (I don't know if you know
> him?) and he provided a much better library that ESRF wrote for reading
> in edf data. I added in his code and it seems to be working. I will test it a
> little bit more before I upload a new version. I will try to upload it in a
> couple of hours.
>
> Sorry again about not catching the bug.
>
> Best,
>
> Josh

joshu...@gmail.com

Feb 28, 2008, 7:50:18 PM
to Area Diffraction Machine
I replied to Armando

> I figured out what was messing me up (and
> also what is wrong with your code sample). pow(2,31) returns a Long
> (and not an Int), so when you subtract 1 you end up with a Long with
> value pow(2,31)-1. The Numeric library can't deal with Long values and
> that is what was causing errors to be thrown. I hardcoded 2147483647
> into your code sample and afterwards everything worked well. Thanks
> for the advice.
>
> I've added into my code the ability to save out files as edf and I
> think everything is working well but I'm going to test it a little bit
> more. I will soon upload the new version and it will be called 1.0.2.
> The only potential issue that may arise is that my program insists
> that all diffraction data it holds is square. It therefore pads
> everything that it reads in with a bunch of 0s to make it square (if
> necessary). Because of this, any data that is saved out will also be
> square (with possibly a lot of blank space in the image). In principle,
> I could try to improve my program at some point to allow for non-square
> images, but it would take a lot of overhead on my part and I am not
> sure if it is really worth it.
>
> Don't worry about sending me too many emails. I don't really have many
> people to talk to about this program in Vermont and I am just happy to
> know that there are other people out there who are interested in this
> sort of thing :)
>
> Take care,
>
> Josh
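
The zero padding described above can be pictured with a few lines of numpy. This is only a sketch: `pad_to_square` is a hypothetical helper, and placing the original image in the top-left corner of the square is an assumption, since the message does not say where the program puts the padding.

```python
import numpy as np


def pad_to_square(image):
    """Return a square copy of a 2D image, filling the extra area with zeros."""
    rows, cols = image.shape
    side = max(rows, cols)
    square = np.zeros((side, side), dtype=image.dtype)
    square[:rows, :cols] = image  # assumption: original sits in the top-left corner
    return square


small = np.arange(6, dtype=np.int32).reshape(2, 3)  # stand-in for a 2 x 3 detector image
padded = pad_to_square(small)
print(padded.shape)
```

For the 2640 x 1920 Pixium example earlier in the thread, this would give a 2640 x 2640 array, which is why a saved file can contain a lot of blank space.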