Image cleaning and atom finding notebook


suhas....@gmail.com

Sep 14, 2018, 9:17:40 AM
to pycro...@googlegroups.com
On behalf of Prof. Yang Liu (NC State University):

This is Yang Liu from North Carolina State University. First of all, thank you for your short course on Pycroscopy at the M&M meeting; I found it very helpful. I have been using your code for image analysis, and I have some questions that I hope you can advise on.

1. Image size limit for cleaning. I tried to clean a 1024x1024 pixel STEM image. However, the cached files kept growing to more than 100 GB (my Mac has 32 GB of physical memory), and the kernel died and restarted, losing all the data. If I bin the image to 512x512 or 256x256, the code works very well. I did set the maximum memory usage to 16 GB (max_mem = 1024*16 in the code), but this does not seem to have any effect. Is there a way to avoid the overflow?

2. How should the number of components for the SVD analysis be chosen? For example, after breaking the image into a sequence of small windows, "the Raw data was of shape (65536, 1) and the windows dataset is now of shape (50176,  1089)". Should we set the number of components to 1089, or can it be any sufficiently large number, such as 1024?

3. The download (save) buttons on the images plotted in the notebook are not working; clicking them does not save the images. Is this related to the notebook environment settings?

1. After choosing the motifs, the image is smaller than the cleaned image (i.e. the image reconstructed from the first N components). For example, the raw image and the cleaned image are both 512x512, but the images in the pattern matching scores step and in the atom center finding step are only about 380x380. Is this due to the images being cropped at some step? I have tested several images, and this happens every time.

2. I would like to find the locations of the atoms in the cleaned image and export them. I tried to use the data stored in "atom_centroids". However, because of the size difference described in the previous question, these locations are all shifted by some amount. Also, if there are two sets of atoms (A site and B site) in the image, "atom_centroids" only holds one set of atom locations. Is there an easy way to export both sets of atom locations with respect to the cleaned image?

Thank you very much. I look forward to hearing from you.

yli...@ncsu.edu

Sep 14, 2018, 11:35:35 AM
to pycroscopy
Thanks, Suhas, for posting my questions in the group. I spent more time on the code and figured out the last two questions.

1. After choosing the motifs, the image is smaller than the cleaned image (i.e. the image reconstructed from the first N components). For example, the raw image and the cleaned image are both 512x512, but the images in the pattern matching scores step and in the atom center finding step are only about 380x380. Is this due to the images being cropped at some step? I have tested several images, and this happens every time.

The image is cropped twice in the process. The first crop is in the clustering step, which removes half the window size from each edge: cropped_clean_image = clean_image_mat[half_wind:-half_wind + 1, half_wind:-half_wind + 1]
The second crop is in the pattern matching step: double_cropped_image = cropped_clean_image[half_wind:-half_wind, half_wind:-half_wind]
Thus, to map atom positions in the double-cropped image back to the raw data, add the total crop offset, 2*half_wind pixels (about one window size), to each coordinate.
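The offset implied by the two crops can be checked with a small NumPy sketch. This is only an illustration: the 512x512 array is a stand-in for the cleaned image, half_wind = 16 is a hypothetical value, and crop_offset and to_raw_coords are helper names made up here, not part of pycroscopy. Note that if half_wind is the window size divided by two and rounded down, 2*half_wind is one pixel less than an odd window size.

```python
import numpy as np

def crop_offset(half_wind):
    """Top/left offset of the double-cropped image inside the original image."""
    # Each of the two crops removes half_wind pixels from the top and left edges.
    return 2 * half_wind

def to_raw_coords(positions, half_wind):
    """Shift positions from the double-cropped image back to the original image."""
    return np.asarray(positions) + crop_offset(half_wind)

half_wind = 16  # hypothetical value, for illustration only

# Stand-in for the cleaned image: every pixel stores its own row index,
# so the value at [0, 0] after cropping reveals the row offset directly.
clean_image_mat = np.arange(512)[:, None] * np.ones((1, 512))

# The two crops quoted in the thread.
cropped_clean_image = clean_image_mat[half_wind:-half_wind + 1, half_wind:-half_wind + 1]
double_cropped_image = cropped_clean_image[half_wind:-half_wind, half_wind:-half_wind]

print(double_cropped_image.shape)   # shape after both crops
print(double_cropped_image[0, 0])   # original row index of the new first row
print(to_raw_coords([[0, 0], [10, 5]], half_wind))
```

The same shift applies to both axes, so it does not matter whether the positions are stored as (row, column) or (x, y).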

2. I would like to find the locations of the atoms in the cleaned image and export them. I tried to use the data stored in "atom_centroids". However, because of the size difference described in the previous question, these locations are all shifted by some amount. Also, if there are two sets of atoms (A site and B site) in the image, "atom_centroids" only holds one set of atom locations. Is there an easy way to export both sets of atom locations with respect to the cleaned image?

"atom_labels" contains the two sets of atom positions. We can use siteA=atom_labels[0] to get the atom positions assigned to motif 0, and siteB=atom_labels[1] to get those assigned to motif 1. However, to match the atom coordinates in the raw image, a shift (the window size) has to be added to the positions in atom_labels.
I also found that the atom centers located by this method are shifted by about 1-2 pixels relative to those found by 2D Gaussian fitting. I am not sure yet what causes this.
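Exporting the per-motif positions with the shift applied might look like the sketch below. It assumes that each entry of atom_labels is an (N, 2) array of pixel positions in the double-cropped image; that layout is an assumption based on the description above, not something checked against the notebook. export_sites, the file names, and the demo data are hypothetical.

```python
import os
import tempfile

import numpy as np

def export_sites(atom_labels, half_wind, prefix="atoms"):
    """Write one CSV per motif, with positions shifted to raw-image coordinates.

    Assumes atom_labels[i] is an (N, 2) array of positions for motif i,
    measured in the double-cropped image (an assumption, see the text).
    """
    offset = 2 * half_wind  # two crops, each removing half_wind pixels
    paths = []
    for motif_index, positions in enumerate(atom_labels):
        shifted = np.asarray(positions, dtype=float) + offset
        path = f"{prefix}_motif{motif_index}.csv"
        np.savetxt(path, shifted, delimiter=",", header="axis0,axis1", comments="")
        paths.append(path)
    return paths

# Made-up positions standing in for the notebook's atom_labels.
demo_labels = [np.array([[1, 2], [3, 4]]), np.array([[5, 6], [7, 8]])]
out_dir = tempfile.mkdtemp()
paths = export_sites(demo_labels, half_wind=16, prefix=os.path.join(out_dir, "atoms"))

site_a = np.loadtxt(paths[0], delimiter=",", skiprows=1)
site_b = np.loadtxt(paths[1], delimiter=",", skiprows=1)
print(site_a)
print(site_b)
```

Writing one file per motif keeps the A-site and B-site positions separate, which matches the siteA/siteB split described above.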

Thank you!
Yang

Chris Smith

Sep 17, 2018, 12:21:45 PM
to pycroscopy
Yang,

Is it the SVD decomposition where you're running into the out-of-memory error?  If so, then reducing the number of components is a good idea.  From the tests I've done, 128 components is usually plenty.  You can always try more or fewer as needed.  Because of the way that code works, it has to load the entire dataset into memory at once rather than dividing it into chunks that fit easily in memory as we do in many of our other tools.  Unfortunately, we don't have a check to see if you have enough memory to do the SVD at the moment.  It's something that we will be adding though.
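To get a feel for the sizes involved, the sketch below estimates how much memory one copy of the windows matrix needs and shows how keeping only the first few SVD components works. It uses plain NumPy rather than the pycroscopy routines. The window count assumes a step of one pixel, which matches the (50176, 1089) shape quoted earlier for a 256x256 image with 33x33 windows; the 4 bytes per value (float32) and the helper names are assumptions. The SVD of an (n_windows, window_pixels) matrix has at most min(n_windows, window_pixels) components, so 1089 is the upper bound in that example.

```python
import numpy as np

def n_windows(image_side, window_side):
    """Number of windows for a square image, assuming a step of one pixel."""
    return (image_side - window_side + 1) ** 2

def window_matrix_gib(image_side, window_side, bytes_per_value=4):
    """Memory for ONE copy of the windows matrix, in GiB (float32 assumed)."""
    return n_windows(image_side, window_side) * window_side ** 2 * bytes_per_value / 2 ** 30

for side in (256, 512, 1024):
    print(side, n_windows(side, 33), round(window_matrix_gib(side, 33), 3))

# Truncated reconstruction on a small random stand-in for the windows matrix.
rng = np.random.default_rng(0)
windows = rng.standard_normal((200, 49))
u, s, vh = np.linalg.svd(windows, full_matrices=False)

n_components = 16  # hypothetical choice; the thread suggests 128 for real data
approx = (u[:, :n_components] * s[:n_components]) @ vh[:n_components]

# Fraction of the total variance captured by the kept components.
variance_kept = np.sum(s[:n_components] ** 2) / np.sum(s ** 2)
print(approx.shape, variance_kept)
```

The estimate covers a single copy of the input only; the decomposition itself needs additional working memory on top of that, so the real footprint is larger.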

The positions of the centroids are relative to the double-cropped image.  I don't know of anyone who has tried to translate them back to the raw image, but I will look into it.  The shift that you're performing should be the correct method.

For the plots, the matplotlib plotting tools mostly work, but the save figure button is one that we've had many problems with.  In our pyUSID package, we have a widget function that will allow you to save any figure.  Simply import pyUSID and call pyUSID.viz.jupyter_utils.save_fig_filebox_button(*figure object*, *filename to save as*).  I'll be adding these to the notebook in the repository if you would prefer to wait.
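As a fallback that does not depend on the toolbar button or on the widget, a figure can also be written to disk directly with matplotlib's own savefig. The sketch below is a generic matplotlib example, not part of the notebook; the random image, the file name, and the temporary output folder are placeholders.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # lets the sketch run without a display; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Placeholder image standing in for a cleaned image from the notebook.
image = np.random.default_rng(0).random((32, 32))

fig, ax = plt.subplots()
ax.imshow(image, cmap="gray")
ax.set_title("Cleaned image (placeholder data)")

out_path = os.path.join(tempfile.mkdtemp(), "cleaned_image.png")
fig.savefig(out_path, dpi=300, bbox_inches="tight")  # write the figure to disk
plt.close(fig)
print(out_path)
```

In the notebook, the same savefig call can be made on any figure object that a plotting cell created.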

Please let us know if you have any more questions, and thanks for using Pycroscopy.

Chris Smith