High Q noise reduction


Ripan Biswas

Jan 30, 2021, 1:49:13 PM
to diffpy...@googlegroups.com
Hi all,

How can we reduce noise at higher Q?
Relatively high noise appears in S(Q) (screenshot attached) for amorphous systems (amorphous aluminium hydroxide in this case). The data were collected on a laboratory-based PDF setup using Ag radiation in transmission geometry. How can the noise in S(Q) be reduced using PDFgetX3/getX2?
Any help will be appreciated.

Thanking all,

Regards,
Ripan



--
Ripan Kumar Biswas,
Senior Research Fellow (DST-INSPIRE), 
XRD Group, Materials Characterization and Instrumentation Division,
CSIR-Central Glass and Ceramic Research Institute,
196, Raja S.C. Mullick Road, Jadavpur, Kolkata - 700032, India.
[Attachment: Screenshot from 2021-01-31 02-08-28 cut.png]

Peter Metz

Jan 30, 2021, 2:43:54 PM
to diffpy-users
Hi Ripan,

The tongue-in-cheek answer is "collect better data!" but the reality for weakly scattering materials on flux-limited systems is that this will always be a factor.

Two commonly employed strategies are to use smoothing filters on the scattered intensity, and to use the "Lorch filter" (https://doi.org/10.1088/0022-3719/2/2/305), which damps the high-Q intensity and forces F(Q) to zero at Q = Qmax. Both of these strategies are options in the legacy code PDFgetX2, but I don't believe either is implemented in PDFgetX3.

You could conveniently apply a Savitzky-Golay filter, available in SciPy (https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.savgol_filter.html), if you're familiar with some basic Python.
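
For instance, a minimal sketch (the file name "iq.dat" and the filter parameters are placeholders to adapt to your own data):

import numpy as np
from scipy.signal import savgol_filter

# Load Q and I(Q) from a two-column text file (placeholder name).
q, iq = np.loadtxt("iq.dat", unpack=True)

# window_length must be odd and greater than polyorder; a wider window
# smooths more aggressively.
iq_smooth = savgol_filter(iq, window_length=11, polyorder=3)

np.savetxt("iq_smoothed.dat", np.column_stack([q, iq_smooth]))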

-Peter

Peter Metz

Jan 30, 2021, 2:53:37 PM
to diffpy-users
Post-script: if you have a known amorphous sample, you could also consider performing the same X-hour measurement but binning the data in coarser 2-theta steps (i.e., defining a larger experimental step size) to improve the counting statistics at each point. The profile shown here is probably oversampled.
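
Schematically, rebinning could look something like this (a sketch only; it assumes raw counts rather than normalized intensities, and the file names and merge factor are made up):

import numpy as np

tth, counts = np.loadtxt("raw_counts.dat", unpack=True)

factor = 4  # how many original steps to merge into one coarser bin
n = (len(tth) // factor) * factor  # drop any leftover points at the end

tth_coarse = tth[:n].reshape(-1, factor).mean(axis=1)
counts_coarse = counts[:n].reshape(-1, factor).sum(axis=1)  # summed counts

np.savetxt("rebinned.dat", np.column_stack([tth_coarse, counts_coarse]))

Summing the raw counts (rather than averaging normalized intensities) is what actually improves the counting statistics per point.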

Simon Billinge

Jan 30, 2021, 3:21:41 PM
to diffpy...@googlegroups.com
Thanks Peter.

Actually, Peter's tongue-in-cheek response is the right one. The best approach is to use a variable counting scheme so that you spend more time counting in the high-Q region. To do this really properly the count time would have to increase exponentially with Q, but this is pretty unrealistic. One approach that is easier with normal diffractometer software is to stack up recounts, i.e., use a counting strategy such as measuring over 0 < 2theta < 120 (let's say), followed by 40 < 2theta < 120, then 60 < 2theta < 120, 80 < 2theta < 120, and finally 100 < 2theta < 120.
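
Merging such stacked scans amounts to summing the raw counts wherever the scans overlap; a sketch, assuming all scans share the same 2-theta grid (the file names are made up):

import numpy as np

scans = ["scan_0_120.dat", "scan_40_120.dat", "scan_60_120.dat",
         "scan_80_120.dat", "scan_100_120.dat"]

total = {}
for fname in scans:
    for tth, cts in np.loadtxt(fname):
        key = round(float(tth), 4)  # guard against floating-point grid jitter
        total[key] = total.get(key, 0.0) + cts

tth_merged = sorted(total)
counts_merged = [total[t] for t in tth_merged]
np.savetxt("merged_counts.dat", np.column_stack([tth_merged, counts_merged]))

The high-angle points then carry several times the counting time of the low-angle ones, which is the point of the strategy.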

A quick comment on Peter's other suggestions: neither smoothing nor applying the Lorch function improves the data.

On the smoothing: the Fourier transform itself acts like a low-pass filter, so in the low-r region, where most of the features of the PDF of a disordered material such as Ripan's are found, the FT is already applying an aggressive smoothing; smoothing the data beforehand therefore only serves to possibly introduce bias. Don't do this. (This argument doesn't apply to smoothing a background before subtracting it, but that is a conversation for another day.) If you want your F(Q) to look less noisy you can rebin it on a coarser grid, but you are NOT improving the data. At best you are having no effect on the resulting PDF, and at worst you are introducing artefacts.
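
To see the low-pass behavior, recall that G(r) is obtained from F(Q) by a sine transform over a finite Q range, G(r) = (2/pi) * integral from 0 to Qmax of F(Q) sin(Qr) dQ. Numerically (a schematic sketch, with a placeholder file name):

import numpy as np

q, fq = np.loadtxt("fq.dat", unpack=True)
r = np.arange(0.0, 30.0, 0.01)  # r grid in angstroms

# Trapezoid-rule sine transform for each r point.
gr = (2.0 / np.pi) * np.array([np.trapz(fq * np.sin(q * ri), q) for ri in r])

In the low-r region the kernel sin(Qr) oscillates slowly, so rapidly varying noise in F(Q) largely averages out of the integral, which is the aggressive smoothing described above.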

The Lorch function is a slightly different story. The main effect of the Lorch is simply to reduce the range of Q that you are Fourier transforming over, so it has a similar effect to lowering your Qmax. Of course, this is what we all do, setting Qmax at the point where our signal/noise becomes unfavorable in the F(Q). The only difference between "applying the Lorch" and "applying the Heaviside" functions (the latter, for the less mathy people, means cutting at a lower Qmax) is that the Lorch does the cut smoothly rather than discontinuously. It systematically deweights the high-Q data and introduces an unphysical "temperature factor" effect.

The result is that the PDF is convolved with a more Gaussian-like termination function rather than a sinc function. This termination function has less ringing, which makes people seem to like it more than the Heaviside because they see fewer "ripples" in the PDF, but it doesn't mean the PDF is better. It is a matter of aesthetics on the part of the observer whether you think the resulting PDF looks better. Of course we are sometimes tricked by the ripples, so it may be a good tradeoff, especially if your data are bad (did I mention that the right approach is actually to get better data?). I am not saying don't do it, but I am saying, first, don't think it is a trick to improve your data, and second, understand fully what the implications are when you do apply it.
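
In formulas, the Lorch modification function is M(Q) = sin(pi*Q/Qmax) / (pi*Q/Qmax) for Q <= Qmax, applied to F(Q) before the transform. Schematically (the Qmax value and file name here are illustrative):

import numpy as np

q, fq = np.loadtxt("fq.dat", unpack=True)
qmax = 18.0  # illustrative cutoff in inverse angstroms

mask = q <= qmax
fq_heaviside = fq[mask]                        # hard cut at Qmax
fq_lorch = fq[mask] * np.sinc(q[mask] / qmax)  # np.sinc(x) = sin(pi x)/(pi x)

Transforming the two versions (e.g., with the sine-transform snippet above) shows a sinc-like termination function for the hard cut and a smoother, broader one for the Lorch.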

Sorry for the long email.  I often see gaps in understanding on these points in the literature.

S



--
Simon Billinge
Professor, Columbia University
Physicist, Brookhaven National Laboratory

Peter Metz

Jan 30, 2021, 3:36:12 PM
to diffpy-users
Hi Simon,

Thanks for your clarifications. I suppose my recommendations were more or less cosmetic ones; I agree that neither increases the information content nor conjures the true S(Q) values from the noise.

Regarding the variable counting strategy: between the counting time and the fall-off of the atomic form factor, VCT data usually have different noise levels in each discrete 2-theta range. Do you know whether anyone has worked through the real-space effect of this, or at what level this discontinuity becomes an issue?

-Peter

Simon Billinge

Jan 30, 2021, 4:00:49 PM
to diffpy...@googlegroups.com
That's a no on your second-paragraph question, Peter; at least I am not aware of anyone having done it. If not, it needs to be done, if there are any volunteers. The process is straightforward (B. H. Toby and S. J. L. Billinge, "Determination of standard uncertainties in fits to pair distribution functions", Acta Crystallogr. A 60 (2004), pp. 315-317, http://scripts.iucr.org/cgi-bin/paper?th5001), but it would require simulating data and propagating the errors to F(Q) and G(r) (PDFgetX2 does this, but I think getX3 does not).

The starting point is that we need good estimates of the uncertainties on the initial intensity data, which is harder than it seems these days with integrating detectors, and it is not worth doing at all unless we have this. But for lab instruments it might be reasonable to assume, as a starting point, that the standard deviation of the signal is the square root of the signal. This should be close to true for photon-counting detectors in any case. Then go from there. And if the data are simulated you could impose that the intensities are sampled from a normal distribution with the right width.
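
For instance, simulated intensities under that assumption might look like this (a toy sketch; the "true" profile is invented purely for illustration):

import numpy as np

rng = np.random.default_rng(0)

q = np.linspace(0.5, 20.0, 2000)
i_true = 1.0e4 * np.exp(-0.15 * q) + 50.0  # made-up smooth "true" intensity

sigma = np.sqrt(i_true)            # sqrt(N) counting-statistics assumption
i_obs = rng.normal(i_true, sigma)  # one simulated noisy measurement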

If you want my gut feeling, I think it won't have much effect beyond improving the data quality. The FT spreads out the uncertainties so that they are almost constant in r in the G(r) function, even when they are highly structured in reciprocal space (for example, from a pattern with Bragg peaks).
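
A quick way to see this: the discrete sine transform is linear in F(Q), so independent point errors propagate as var[G(r_j)] = (2*dq/pi)^2 * sum_i sin(Q_i r_j)^2 * var[F(Q_i)], and sum_i sin^2 is close to N/2 for almost every r. A sketch, with an invented Q-dependent noise level:

import numpy as np

q = np.linspace(0.5, 20.0, 2000)
dq = q[1] - q[0]
sigma_f = 0.01 * (1.0 + q**2 / 50.0)  # invented, strongly Q-dependent sigma

r = np.arange(0.5, 30.0, 0.05)
s = np.sin(np.outer(r, q))                       # sin(Q_i * r_j) matrix
var_g = (2.0 * dq / np.pi) ** 2 * (s**2 @ sigma_f**2)
sigma_g = np.sqrt(var_g)  # comes out nearly flat in r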

S
