Possible bug in the Steinhardt module qlm calculation (v2.2.0 through v.2.5.1)

102 views
Skip to first unread message

robin...@gmail.com

unread,
Jun 10, 2021, 6:35:33 AM6/10/21
to freud-users
Dear Freud team, 

hello again! I've been happily using Freud for the last while and I'm quite pleased about the improvements you've been adding. Recently, however, a colleague and I had to take a very close look at a set of Steinhardt bond order parameters and we found a discrepancy between Freud's results (v2.2.0 up to the most recent v2.5.1 and the github master build) and our custom implementation even while looking at the exact same system with the exact same set of parameters. After quite a bit of digging I believe we have narrowed it down to a bug in the Steinhardt module, specifically to the loops used in the Steinhardt.cc file:

for (freud::locality::NeighborBond nb = ppiter->next(); !ppiter->end(); nb = ppiter->next())

Freud's neighbour lists include the particle itself, which by the looks of it should be cleverly skipped over by starting the loop with ppiter->next(), if my admittedly somewhat rusty C++ iterator knowledge is correct. However, for some reason I haven't been able to figure out, this doesn't happen. The loop actually starts with the particle itself, yielding a delta vector of (0,0,0). The safeguards for zero distance then makes the Ylm contribution to the sum be zero, but it is still counted as a neighbour for the normalization using total_weight. Effectively, this gives all the qlm a scaling error of n_nb / (n_nb-1), which if I have my math right propagates to the same error in the ql. As a workaround, I've added a single ppiter->next(); to advance the iterator by a single step before the loop starts. With this workaround the qlm between Freud and our custom implementation agree, at least to within a global phase difference of pi, which I believe should be irrelevant.

This loop also occurs in the calculation of the averaged qlm, but there I believe its current behaviour is correct since the particle itself should be included. However, if starting the loops from *->next() is supposed to skip the first neighbour, it shouldn't be in the averaging loop.

Even with the workaround and the qlm agreeing, the ql and wl between Freud and our custom implementation appear to be different by a small but still significant factor, so there may be further problems either on either side. Haven't tested the averaged ql/wl yet. 

Both my colleague and I will be busy the next few days with little chance to work on this, so I figured I'd report this bug and ask you guys to take a look :) I'll update this thread if I find something else.

Kind regards,
Robin van Damme

Bradley Dice

unread,
Jun 10, 2021, 3:26:03 PM6/10/21
to freud-users
Hi Robin,

Thanks for the issue report. We're happy to look into this and figure out the root causes. Could you please share more information about your reference code or any tests you've written for checking the behavior? It sounds like there might be multiple problems here. The first issue just sounds like `exclude_ii` might not be enabled (or isn't working), which could be a problem in your user script with how the neighbor list is being created. The second issue with a "small factor" will probably be harder to track down, and we'll want to cross-check with your reference data and code used to compute it if possible.

We've already adopted some of your code as a reference implementation for Minkowski Structure Metrics (test file, reference data). If you read that test file, you'll see there are a couple cases that we ignore because the values differ. It would be helpful to identify whether the "small factor" is related to that or not.

It might be best to create a GitHub issue to track these discussions: https://github.com/glotzerlab/freud/issues

Thanks,
Bradley

Bradley Dice

unread,
Jun 10, 2021, 3:33:22 PM6/10/21
to freud-users
Also of interest to you: we have an open pull request that will make it possible to compute Steinhardt order parameters for multiple values of L at once. It should be substantially faster for those use cases. I don't think this will affect the issues you're seeing; the behavior of the code is very similar, but performance is higher.

robin...@gmail.com

unread,
Jun 16, 2021, 4:43:22 AM6/16/21
to freud-users
Hi Bradley,

thanks for the response. After digging further, it appears I jumped the gun at calling this a bug, but we're not out of the woods yet. You're completely right on the `exclude_ii` call: I was making a custom neighbour list so that I could reuse it for multiple computations, and I hadn't set that flag on the neighbour query for that one, i.e.:

nbQueryDict = dict(mode='ball',r_max=1.4)
nq = freud.locality.AABBQuery.from_system(snap)
nlist = nq.from_system(snap).query(snap.particles.position, nbQueryDict).toNeighborList()
q6calc.compute(snap, neighbors=nlist)
====> includes ii, smaller qlm

 What tripped me up is that apparently that flag gets set automatically if you don't make a custom neighbour list first:

nbQueryDict = dict(mode='ball',r_max=1.4)
q6calc.compute(snap, neighbors=nbQueryDict)
====> excludes ii, correct qlm!

And of course I checked for correctness with the latter rather than the former. My bad! Including `exclude_ii` fixes the former. That's the first issue. However, even though with that fixed the qlm and ql agree with both my colleague's custom code and my own structure metric code, there's still differences between my wl and Freud's, as well as between the average ql/wl of Freud and both custom codes on our end. Comparing these is a bit of a careful exercise, so let me write it down completely and post it in another reply.

robin...@gmail.com

unread,
Jun 16, 2021, 5:32:37 AM6/16/21
to freud-users
Alright, so let me describe the comparisons we've made. This'll be a bit long, apologies.

 For hard spheres, you can distinguish between the FCC and HCP crystal structures by looking at the Steinhardt w4 or average w4 parameters [1]. My colleague and I noticed we found different numbers of FCC-like and HCP-like particles even when we were looking at the same configurations. For reference I've attached one such configuration both as a text and as a gsd file. This is our test configuration: it's a hard sphere system at pressure beta p sigma^3 = 17.0 that's in the middle of nucleation from a fluid to a crystal. Prompted by this difference we compared between three codes that calculate these order parameters: my colleague's (which I'll indicate GC), mine (RvD) and Freud. For my colleague's code we use a cutoff radius of r_c / sigma = 1.4 to define neighbours and it calculates unweighted order parameters, while my code uses the Voronoi definition and calculates the structure metrics. Freud can do both, so we made two sets of comparisons: GC/Freud for neighbours with rc=1.4, and RvD/Freud for the Minkowski structure metrics. My own code is the structure metric one you are already aware of. I can ask my colleague to share his code if we need to. The Freud results are with Freud 2.4.1/2.5.1, but I believe I'd get the same results for any version from 2.2.0 to 2.5.1. For comparison, we calculated q4, q6, w4 and w6 and their averaged versions with each code on the configuration in the .dat file (which is the last snapshot from the .gsd file). Below are histograms of the q's and w's. These compare GC/Freud:

Freud_vs_GC.pngFreud_vs_GC_av.png

As you can see, the non-averaged ql and wl are identical between GC and Freud, which is good news. The averaged ones are different, however, with similar distributions but the avql being slightly larger for GC than for Freud. For the comparison between RvD/Freud:

Freud_vs_RvDMSM.pngFreud_vs_RvDMSM_av.png

The non-averaged ql are in agreement, but the non-averaged wl are not. I suspect that issue is on my side, actually, because for this configuration I would expect wl distributions to be around zero since most of it should be a fluid. The averaged ql are different again, with a similar picture as with the other comparison: RvD produces slightly larger avql than Freud. The avwl are also different, but again I expect that to be on my side. I've also included the Python/Freud script used to generate these BOPs, as well as data files with the ql/wl generated by the other two codes. 

Given that the ql/wl of GC/Freud agree, the ql of RvD/Freud agree, and the avql of GC/RvD agree (as far as they can), my current suspicion is that my own code has a bug in the wl, and possibly Freud has one with the averaging of the ql/wl. I'll be happy to provide more information, should you need it. And thanks again for having a look at this and for pointing me to that pull request, that looks like something I could make good use of.

Cheers,
Robin

PS: I got a rather unhelpful error message trying to post this with images and files attached, so perhaps it will work if I do not attach files and just give a google drive link to the files: https://drive.google.com/drive/folders/13tZtxcYZTZk4V6mdLuKS-smIEvgvf6-Z?usp=sharing

Bradley Dice

unread,
Jun 16, 2021, 12:30:48 PM6/16/21
to freud-users
> What tripped me up is that apparently that flag gets set automatically if you don't make a custom neighbour list first

Yes, this can be a little confusing. We tried to be as self-consistent as possible with the wide range of analysis methods offered in freud. The `exclude_ii` flag has some complicated rules governing its default behavior of True/False. Basically `exclude_ii` defaults to True if the array of "query_points" is not provided (it is assumed to be the same as the "points" array). Since Steinhardt only accepts one array of points, that's always the case. However, custom NeighborLists using the methods in "nlist = nq.from_system(snap).query(snap.particles.position, nbQueryDict).toNeighborList()" cannot make that assumption about whether to perform self-exclusion. It might be possible with a check like `np.allclose(points, query_points)` but we wanted to avoid that potentially expensive check and the complexity of floating-point equality (the NeighborQuery/AABBQuery instance is in single-precision, while the input array of query points might be double precision -- what numerical tolerances are appropriate?).

I'm looking into the Google Drive files you shared now and will follow up. I appreciate your help in identifying the issue!

Best,
Bradley

Bradley Dice

unread,
Jun 16, 2021, 12:33:55 PM6/16/21
to freud-users
That behavior of self-exclusion is somewhat hidden in the docs. We could definitely improve the clarity of how we describe the default values for `exclude_ii`.

Bradley Dice

unread,
Jun 16, 2021, 4:04:32 PM6/16/21
to freud-users
Robin,

I created a GitHub repository that we can share as we investigate this further. I refactored a lot of the code to make it easier to write tests and check the tolerances of each array with `numpy.testing.assert_allclose`. I gave you write permissions to the repo. https://github.com/bdice/freud_boops_accuracy

I'm going to start with comparing the averaged MSM values between freud and your code. We can look into why the averaged MSM values q'4 and q'6 differ, and ignore the w'4/w'6 for now if you think your code has a bug for w'ls. Is this repository the most up-to-date version of your code? https://github.com/ArvDee/Minkowski-structure-metrics-calculator

It's very possible that the same issue is causing the differences in averaged MSM comparisons with your code as the discrepancy with the GC code's averaged data. If it's not the same problem, then I might need to have the GC code available. The fastest way to check the GC/freud discrepancy on the averaged ql data would probably be to check whether the same neighbors are being found by both codes. Since the non-averaged ql/wls agree between GC/freud, I think the neighbor code would be the next thing to investigate.

Thanks,
Bradley

Bradley Dice

unread,
Jun 16, 2021, 9:04:23 PM6/16/21
to freud-users
Robin,

I identified and fixed a bug in the Steinhardt averaging feature. It appears that freud was averaging over two neighbor shells instead of one. See the branch here: https://github.com/glotzerlab/freud/tree/fix/steinhardt-averaging

This branch of the code fixes the GC (rcut=1.4) discrepancies, and also fixes the MSM q'4 / q'6 data (with some catches, mentioned below). You can check it with the test files here: https://github.com/bdice/freud_boops_accuracy

I'm going to try and get some other open pull requests affecting the Steinhardt order parameter merged first, and then re-apply this fix and make a new release of freud. I hope to have a fix released in the next week.

This error might be a remnant of a previous implementation of Steinhardt averaging, where two neighbor shells were needed because the second neighbor shell was used to compute qlmi before averaging that into the first neighbor shell. Currently we compute qlmi for all particles before performing the averaging in a separate loop, which now only should need to iterate over one neighbor shell to compute averaged quantities from the qlmi values previously computed for all particles. The code has been re-implemented a few times and this was likely an oversight. Unfortunately, this bug was not caught by our tests.

It was kind of hard to construct good tests for the MSM data. Some of the Voronoi bonds in your test file have tiny facet weights. For an example, see the bond between particle 136 and 2532 in the debugging output below. That neighbor pair may not even show up for you depending on whether you read the dat or the gsd file, but changing the neighbor counts will affect the averaged parameters by a factor of sqrt(Nb/(Nb+1)) or something like that. Numerical issues between float/double and truncated precision in the dat file made it hard to create a perfect set of reference data for freud. I now get agreement between your MSM reference code and freud for averaged Minkowski Structure Metrics q'l from l=0 to l=6, within an absolute tolerance of 5e-5 for all but 37 of the 3288 particles. I got around this by enforcing that ~98.5% of the particles have to match within the specified tolerance. I will adapt this new testing code into freud's tests, so that we have validation data for the future.

Thanks again for raising this issue. I sincerely appreciate your effort in making this bug reproducible.

---------------------------------------------------
Testing output taken from hours of print debugging. Not very easy to read, sorry.
Particle 136 has 14 neighbors.
lm (6, 0) SELF qlm (-0.0042036,0)
lm (6, 0) n 2408 qlm (-0.202819,0) weight 0.0905396
lm (6, 0) n 2009 qlm (-0.0478929,0) weight 0.0464807
lm (6, 0) n 2298 qlm (-0.200785,0) weight 0.000170854
lm (6, 0) n 2532 qlm (-0.145161,0) weight 7.83829e-09
lm (6, 0) n 182 qlm (-0.0890809,0) weight 0.0964913
lm (6, 0) n 1936 qlm (-0.245068,0) weight 0.112678
lm (6, 0) n 2038 qlm (-0.122834,0) weight 0.0642128
lm (6, 0) n 364 qlm (-0.0757628,0) weight 0.0921617
lm (6, 0) n 781 qlm (-0.186143,0) weight 0.0957721
lm (6, 0) n 2074 qlm (-0.207613,0) weight 0.0307045
lm (6, 0) n 2040 qlm (0.00178278,0) weight 0.0919968
lm (6, 0) n 2043 qlm (-0.118574,0) weight 0.0891164
lm (6, 0) n 2904 qlm (0.00736233,0) weight 0.10411
lm (6, 0) n 1883 qlm (0.057154,0) weight 0.0855653
Particle 136 has q_{6,0} ave (-1.57964,0) before dividing by the neighbor count
Particle 136 has q_{6,0} ave (-0.105309,0)
---------------------------------------------------

Best,
Bradley

robin...@gmail.com

unread,
Jun 17, 2021, 11:28:51 AM6/17/21
to freud-users
Hi Bradley,

glad to see you managed to track down the issue and that the fix appears to be simple. 

> Is this repository the most up-to-date version of your code?

No, I have a newer version that I've only been updating locally. I've added it as a separate branch (https://github.com/ArvDee/Minkowski-structure-metrics-calculator/tree/qlm_dot_product) should you still need it. There's no changes to the core ql/wl calculations compared to the one you linked, though.

>  It appears that freud was averaging over two neighbor shells instead of one. 
> This error might be a remnant of a previous implementation of Steinhardt averaging, where two neighbor shells were needed because the second neighbor shell was used to compute qlmi before averaging that into the first neighbor shell. Currently we compute qlmi for all particles before performing the averaging in a separate loop, which now only should need to iterate over one neighbor shell to compute averaged quantities from the qlmi values previously computed for all particles. The code has been re-implemented a few times and this was likely an oversight. Unfortunately, this bug was not caught by our tests.

I see, yeah that makes perfect sense. Averaging over an extra neighbour shell would make the averages smaller for this configuration, but not for perfect crystal configurations, so it could've slipped past the tests that way. Thanks for the quick fix!

> It was kind of hard to construct good tests for the MSM data. Some of the Voronoi bonds in your test file have tiny facet weights. For an example, see the bond between particle 136 and 2532 in the debugging output below. That neighbor pair may not even show up for you depending on whether you read the dat or the gsd file, but changing the neighbor counts will affect the averaged parameters by a factor of sqrt(Nb/(Nb+1)) or something like that. Numerical issues between float/double and truncated precision in the dat file made it hard to create a perfect set of reference data for freud. I now get agreement between your MSM reference code and freud for averaged Minkowski Structure Metrics q'l from l=0 to l=6, within an absolute tolerance of 5e-5 for all but 37 of the 3288 particles. I got around this by enforcing that ~98.5% of the particles have to match within the specified tolerance. I will adapt this new testing code into freud's tests, so that we have validation data for the future.

Good point, I hadn't considered that. I did notice the small Voronoi neighbour differences when I compared the neighbour pairs myself, but I had dismissed it because the weighting makes the small facets contribution negligibly small. But that's only for the non-averaged qlm: for the averaged qlm we're both doing an unweighted sum and normalizing by the number of neighbours, so there it'll matter again. For the non-av MSM this weighting reduces the noise due to small bond fluctuations and the numerical issues you described, so we would likely get a similar gain if we were to do the qlm averaging in a weighted way as well. Since all the neighbours, weights and qlm are already calculated, it should be cheap. Might be a nice addition.

Anyway, happy to help. Thanks for the quick response/fix :)

Cheers,
Robin

Bradley Dice

unread,
Jun 17, 2021, 11:34:08 AM6/17/21
to freud-users
Robin,

One last thing, would you and your colleague who provided the "GC" data grant us permission to include the Test_Configuration files and the corresponding Steinhardt/MSM reference data in freud's source code for use as validation tests? freud uses a BSD-3 license.

Thanks,
Bradley

Bradley Dice

unread,
Jun 17, 2021, 11:34:55 AM6/17/21
to freud-users
Oops, just saw your reply on GitHub granting permission. Thanks!
Reply all
Reply to author
Forward
0 new messages