Potential speed increase

324 views
Skip to first unread message

Robin van Zutphen

unread,
Oct 31, 2024, 9:34:57 AM10/31/24
to mcx-users

Dear Dr. Fang,

I am currently working on a research project that requires running a large number of mcxlab simulations. Given the broad range of optical properties and source/detector configurations we need to test, the estimated runtime is several months. We have already minimized the number of required simulations, and simply reducing the number of launched photons is not an option, as we have determined a necessary photon count to ensure negligible statistical error. Furthermore, we intend to apply weighting afterward for different absorption values and therefore run the initial simulation with μ_a = 0, so setting cfg.minenergy also does not help. Thus, we are hoping there might be further optimizations to reduce runtime.

In our simulations, we are only interested in storing detected photon information. Flux and trajectory data are unnecessary for our analysis. Although we have already minimized the simulation volume resolution, the resulting speed improvement was marginal. This has led us to wonder if there might be additional calculations—either in settings or within the source code—that we could suppress to enhance simulation speed, given that our sole interest is the det.data output.

To provide a clearer picture of our approach, I’ve included some matlab code below that outlines the core of our simulation setup. Any guidance on potential optimizations to reduce runtime would be greatly appreciated.

Kind regards,
Robin

% Basic pencil beam simulation to calculate the reflectance as a function of radial distance later on for different absorption and detector sizes (SFR)


% General params

cfg.nphoton         = 1e8;

cfg.vol             = uint8(ones(60,60,60)); % Not reduced here

cfg.unitinmm        = 1;

cfg.gpuid           = 1;

cfg.autopilot       = 1;

cfg.tstart          = 0;

cfg.tend            = 5e-9;

cfg.tstep           = 5e-9;

cfg.isreflect       = 0;


% Sim specific settings
% Different phasefunctions not included here

mu_sp = 5; % per mm

NA = 0.22; 


% General source and detector params

cfg.issrcfrom0      = 1;

cfg.respin          = 1;

cfg.srctype         = 'pencil';

cfg.srcpos          = [30 30 0];

cfg.srcdir          = [0 0 1];

cfg.bc              = 'aa_aaa001000';

cfg.maxdetphoton    = cfg.nphoton;

cfg.maxjumpdebug    = 1e5;             

cfg.savedetflag     = 'dspxv';

max_NA   = NA; 

theta_acc_max = asin(max_NA / 1.35);    

cfg.angleinvcdf = linspace(0, asin(max_NA / 1.35) / pi, 5);



  

cfg.prop = [0 0 1 1; 0 mu_sp / (1 - 0.9) 0.9 1.35]; 


cfg.seed = -1; 


[~, det, ~, ~, ~] = mcxlab(cfg);


r_max = fiber_diameter;  % Not defined here

r_tmp = sqrt(sum((tmp_det.p(:, 1:2) - cfg.srcpos(1:2)).^2, 2));   

det.data(:, r_tmp > r_max) = [];  % Drop outside radius


% Remove photons outside acceptance angle

tmp_angles = acos(-(tmp_det.data(9, :)));

det.data(:, tmp_angles > theta_acc_max) = [];


save(‘detected photons’, ‘det.data’)


Fang, Qianqian

unread,
Nov 3, 2024, 12:45:14 PM11/3/24
to mcx-users
Sorry for the delayed response, have been quite busy.

It seems you have put a lot of thoughts on accelerating this task already.

I have a few quick comments

  1. Do not use ~. MATLAB/octave both supports ~, but MATLAB does not have a function like the isargout() in octaveto tell a function to tell a function (in this case, mcxlab), to skip any specific output. Because of this, mcxlab has to compute all outputs based on nargout, which counts ~, and just not to pass the result to the caller - which results in wastful computation
  2. If you type [~, det, ~, ~, ~] = mcxlab(cfg) as in your example, mcxlab will compute all 5 outputs because nargout = 5. If you want to skip the last 3, you just use [flux, det]=mcxlab(). If you do not want to allocate flux, just clear it afterwards. [~, det] won't save any computation

  3. You can manually disable volumetric fluence output by setting cfg.issave2pt=0. This way, even you request flux in the output, mcxlab won't compute it, resulting in significant speed increase, this was enabled by this commit in 2022: https://github.com/fangq/mcx/commit/d45f08495739c85293dd8cda0ecb91ab859afbf6
  4. If your goal is not to save fluence, then there is no need to use a 60x60x60 voxelated grid - you can just create a 1x1x1 domain with cfg.unitinmm=60 to avoid all the overheads of ray-tracing cross voxel walls. I have never tested 1x1x1, I am not sure if this will trigger the 2D simulation mode - I suppose it will according to this line, you should create a 2x2x2 domain with cfg.unitinmm=30 and you should get almost identical detected photon outputs as the physical space of the domain is the same as 60x60x60 - this should give you much higher speed up.
  5. The other thing to consider creating major speed up is to look at your repetitions - can the quantity you are looking for can be converted from a single simulation? Because mcx does not change scattering path if mua is the only parameter that need to be changed; however, if you alter scattering and g, then, you may have to simulate individually
  6. Do not start with a dense sweep - do a sparse sampling of your parameter space at the beginning and see how smooth is the response, and only refine if it is necessary.
Let me know if I can be of further help.



From: mcx-...@googlegroups.com <mcx-...@googlegroups.com> on behalf of Robin van Zutphen <robinvan...@gmail.com>
Sent: Thursday, October 31, 2024 9:34 AM
To: mcx-users <mcx-...@googlegroups.com>
Subject: [mcx-users] Potential speed increase
 
--
You received this message because you are subscribed to the Google Groups "mcx-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mcx-users+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/mcx-users/718031ba-d572-4d5b-b46c-cd320647ba9dn%40googlegroups.com.

Robin van Zutphen

unread,
Nov 4, 2024, 8:08:33 AM11/4/24
to mcx-users
Dear Dr. Fang,

Thank you for your quick response!

As you noted, a 1x1x1 grid does not work, but a 2x2x2 grid produces identical results. All your other insights are valuable as well. Your help is, as always, much appreciated.

Best regards,  
Robin
Op zondag 3 november 2024 om 18:45:14 UTC+1 schreef Fang, Qianqian:

Matthew S

unread,
Nov 5, 2024, 3:43:42 PM11/5/24
to mcx-users
Dear Dr. Fang,

My lab group is also working on a research project that requires a high volume of mcxlab simulations. Compared to Robin, our needs are similar, except our simulations have a randomized inclusion in the volume with different optical properties. We're simulating SFDI images, and using photon sharing to loop over SFDI illumination patterns for each inclusion. The main output we're interested in is diffuse reflectance.

Your point about disabling volumetric fluence output by setting cfg.issave2pt sounded like it could be very useful in our case. However upon trying it, MATLAB crashes during execution of mcx.mexw64. The crash log indicates an "Access Violation." This problem only seems to occur when using photon sharing, since if I use pattern source with only one pattern, or use a different source type, the program runs fine. Is this a known bug, or are there any steps you would recommend taking to troubleshoot this?

Also, if there are any additional recommendations you have for speeding up simulations like the one I've described, I would really appreciate your input.

Best regards,
Matthew

Princess Margaret Cancer Centre
Toronto, ON

P.S. I've attached the crash report in case that might be useful.
Screenshot 2024-11-04 145124.pngScreenshot 2024-11-05 154145.png

Fang, Qianqian

unread,
Nov 8, 2024, 11:42:37 PM11/8/24
to mcx-users
Hi Robin,

I fixed the 1x1x1 grid issue, and you should be able to set the volume to 1x1x1 to run a 3D simulation. I also added two built-in benchmarks - onecube (1x1x1 with unitinmm=60) and twocube (2x2x2 with unitinmm=30) in mcx binary, both seem to run properly



Compared to the cube60b benchmark, both onecube and twocube runs about 3x faster after disabling the fluence output and using smaller number of voxels.

Qianqian


Sent: Monday, November 4, 2024 8:08 AM
To: mcx-users <mcx-...@googlegroups.com>
Subject: Re: [mcx-users] Potential speed increase
 

Fang, Qianqian

unread,
Nov 9, 2024, 12:00:23 AM11/9/24
to mcx-users
Hi Matthew,

Meant to get back to you sooner, but the week has been busy.

I was able to reproduce the crash in the photon-sharing demo. I will debug it and let you know.

For accelerating your photon sharing simulation, I need to know more what you intend to use the simulation - do you need detected photons data only? Or you need the fluence? When you set randomized inclusion, do you alter absorption properties only? Or you change both mua/mus?

There are a few additional ideas to gain squeeze more speed from mcx
  • if your media has a high g value, you can set cfg.gscatter to 100 or an even smaller number to only consider mus/g within cfg.gscatter number of scattering events, and after that, use mus' and g=0 for subsequent scattering, this can give noticable speed up, but test different gscatter values and pick one that does not change your result to much
  • You can set cfg.minenergy to 1e-4 or 1e-5 to trigger Russian roulette, this will also speedup simulation quite a bit
  • If your intent is to get fluence outputs, you can use our denoiser, either the ANLM filter or our machine learning based denoiser to gain 5x to 50x equivalent more photons
  • if you only change mua in the inclusion, you can just run one baseline simulation and output photon trajectories, and then use the trajectories to rescale your output - this should be much faster than running many simulations
Qianqian


From: mcx-...@googlegroups.com <mcx-...@googlegroups.com> on behalf of Matthew S <msira...@gmail.com>
Sent: Tuesday, November 5, 2024 3:43 PM
To: mcx-users <mcx-...@googlegroups.com>
Subject: Re: [mcx-users] Potential speed increase
 

Fang, Qianqian

unread,
Nov 10, 2024, 2:09:25 PM11/10/24
to mcx-users
Hi Matthew, just want to let you know this bug has been fixed with the following commit I made a moment ago




From: 'Fang, Qianqian' via mcx-users <mcx-...@googlegroups.com>
Sent: Saturday, November 9, 2024 12:00 AM
To: mcx-users <mcx-...@googlegroups.com>
Subject: Re: [mcx-users] Potential speed increase
 

Matthew S

unread,
Nov 14, 2024, 10:01:03 AM11/14/24
to mcx-users
Hi Dr. Fang

Thank you very much for your detailed response, and the bug fix!

We are simulating SFDI images of a 'tumor' inclusion that has a concentration of fluorophore. We first perform an excitation simulation, where we get fluence from the SFDI light source. Then using that fluence we define a 3D pattern light source for a follow-up simulation that is meant to show fluorescence emission. We are primarily interested in receiving diffuse reflectance images of these two simulations. Our application currently only modifies the absorbance between inclusions, but we are hoping to start varying scattering as well.

I was hoping to respond to you sooner, but I wanted to try out the fix for myself first to see what the time savings were. I've been having quite a bit of trouble getting the built version of the new commit, since it seems like the windows version of the overnight build didn't build the mcx.mex file. So I tried building the source code on my end but the process has been quite a pain - I've read some of the old discussion posts and followed the steps in Setup_MCX (under mcx/doc) but still haven't been able to get it built. After jumping through some hoops I was able to get msys64 running and use a make command in mcx/src but get the following error:

nvcc -c -g -lineinfo -Xcompiler -Wall -Xcompiler "-openmp" -DSAVE_DETECTORS -use_fast_math -arch=sm_35 -DMCX_TARGET_NAME='"Jumbo Jolt"' -DUSE_ATOMIC -use_fast_math -DSAVE_DETECTORS -o mcx_core.obj  mcx_core.cu
nvcc fatal   : Value 'sm_35' is not defined for option 'gpu-architecture'
make: *** [mcx_core.obj] Error 1

Not sure where to go from there...

Additionally, I was wondering, are there any plans to add photon sharing support to 3d pattern sources? We can currently photon share to speed up the excitation simulations (which has been really useful), and it would be nice if we could use photon sharing for the emission simulations too.

Best regards,
Matthew

Fang, Qianqian

unread,
Nov 14, 2024, 1:09:39 PM11/14/24
to mcx-users
Hi Matthew,

I just want to make sure - do you need to use octave on window? The mmc.mex is the file name needed for octave on windows, but it is not possible to build - this is because octave mkoctfile uses gcc, while nvcc on windows can only use visual studio, so, these files can not be linked.

If you use matlab instead, you should look for mcx.mexw64. I just checked, and yes, the nightly build mcxlab package does not have this file, likely because I have two nightly build servers are both uploading, and causing trouble. I just turned off one of them, and manually build the mcxlab package, the mex.mexa64 is in the package now.
 
I also fixed the github CI build for mcx two days ago, you should be able to download the github action created packages from https://mcx.space/nightly/github/

Let me know if these mex files are what you are looking for.

For your other question, I believe, photon-sharing should already be supported for 3d patterns, 


If your tests show otherwise, please send me a small reproducer so I can investigate.

Qianqian


Sent: Thursday, November 14, 2024 10:00 AM

Matthew S

unread,
Nov 20, 2024, 4:32:37 PM11/20/24
to mcx-users

Hi Dr. Fang,

Thank you very much for your quick response. I've been running various tests since Friday and wanted to give you an update. First off, I use MATLAB not Octave, so you're right that mcx.mexw64 is the one I really needed, thank you very much for building the windows version.

I downloaded the nightly build and got it up and running, with some very impressive time savings - it was able to decrease times by up to 80% in some of my tests! However, I realized that this actually doesn't solve my problem, since photon sharing seems to conflict with the way detphoton saves a photon's initial weights (for example, detphoton.w0 becomes values in the 1000s rather than between 0 and 1). This gave me an idea though: I was thinking that I could perform post-processing photon sharing by doing the following:
  1. Run one simulation with each tile in the srcpattern having a unique weight,
  2. Use w0 from the detected photons to determine their initial position.
  3. Knowing the initial position of the photon, determine what its initial weight would be for an arbitrary pattern (for example, an SFDI source), and scale w0 accordingly.
  4. Use mcxdetweights with the scaled w0 to get photon weights and bin them into pixels based on their end position to get diffuse reflectance.
I tried it out and got some results - in general the images come out more noisy (see below), and the scaling is much different, but it does give a believable image. I was wondering what your thoughts on this method and its validity might be?

Screenshot 2024-11-20 153041.png

As for 3D photon sharing I tried it on my computer and couldn't get it to work. 3D pattern works fine on its own but when I add a second pattern and change cfg.srcnum to 2, the script breaks with the following error:

      GPU=1 (NVIDIA GeForce GTX 1660 Ti) threadph=203 extra=22144 np=10000000 nthread=49152 maxgate=1 repetition=1
      initializing streams ... Unknown Exception from thread (0)C++ Error: MCXLAB Terminated due to an exception!

I've added my code below if you would like to try to replicate the error on your end.


Best regards and many thanks again,
Matthew


//////////////////////////////////////////////////////////////////
clear cfg

% define source pattern
Rsrc = 20;
[xi, yi, zi] = ndgrid(-Rsrc:Rsrc, -Rsrc:Rsrc, -Rsrc:Rsrc);
sphsrc = ((xi .* xi + yi .* yi + zi .* zi) <= Rsrc * Rsrc);

sphsrc_inv = ones(size(sphsrc)) - sphsrc;

pattern = permute(cat(4, sphsrc, sphsrc_inv), [4 ,1, 2, 3]); % size is 2 x 41 x 41 x41

dim = 60;

% basic settings
cfg.nphoton = 1e7;

cfg.tstart = 0;
cfg.tend = 5e-9;
cfg.tstep = 5e-9;
cfg.respin = 1;
cfg.seed = 99999;
cfg.outputtype = 'energy'; % should get the energy deposits in each voxel

cfg.srctype = 'pattern3d';
% cfg.srcpattern = sphsrc;
cfg.srcpattern = pattern;
cfg.srcnum = 2;
cfg.srcparam1 = [41, 41, 41];
cfg.srcpos = [10, 10, 20];
cfg.srcdir = [0 0 1 nan];
cfg.autopilot = 1;

% define volume and inclusions
cfg.vol = ones(dim, dim, dim);
cfg.prop = [0      0    1    1     % Boundary
            0.003  0.03 0.8  1     % Tissue mua = 0.003 mus = 0.03 n = 1.85/1.37
            0.006  0.09 0.8  1];   % Cancer mus  = 1.09 n = 1.37
cfg.unitinmm = 1e-3;

flux = mcxlab(cfg);

////////////////////////////////////////////////////////////////////////////

Fang, Qianqian

unread,
Nov 21, 2024, 1:40:40 AM11/21/24
to mcx-users
Hi Mathew,

The detp.w0 output in the photon-sharing mode does exactly what you expected - it is an integer 1D index to the srcpattern array, so you can look up the values to retrieve the launch weight for all srcpatterns


> As for 3D photon sharing I tried it on my computer and couldn't get it to work. 3D pattern works fine on its own but when I add a second pattern and change cfg.srcnum to 2, the script breaks with the following error:

Thanks for the test script. I was able to reproduce the issue, and commit a fix


Please try it again

Qianqian




Sent: Wednesday, November 20, 2024 4:32 PM

Matthew S

unread,
Dec 17, 2024, 10:27:20 AM12/17/24
to mcx-users
Hi again Dr. Fang,

My apologies for not replying sooner, but I wanted to thank you for the fix. I've been running tests with 3D photon sharing and it has worked great. Thank you for all your help!

Best,
Matthew
Reply all
Reply to author
Forward
0 new messages