Calculate the chi square with RAW API


Feng Yu

Oct 2, 2024, 4:23:30 PM
to BioXTAS RAW
Hi Jesse,
Could you tell me how to average multiple .dat profiles and calculate the chi-square between different profiles with the RAW Python API?
Thank you.
Best
Feng Yu

Jesse Hopkins

Oct 2, 2024, 5:09:18 PM
to bioxt...@googlegroups.com
Hi Feng Yu,

This example shows how to average profiles using the API:

At the moment RAW does not have a chi^2 calculator that is accessible through the API. If you want to do that, you'll have to write your own. You can get the q, I, and uncertainty data as numpy arrays as described here: https://bioxtas-raw.readthedocs.io/en/latest/api/getting_started.html#accessing-q-i-and-uncertainty-data

And then you will have to either write your own function or use one you find, such as the chisquare function in the scipy stats package.
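
As a minimal sketch of the write-your-own route, assuming both profiles are already on the same q grid and that their q, I, and uncertainty values have been pulled out as numpy arrays (for example with the getQ(), getI(), and getErr() accessors), a reduced chi-square could look like the following. The function name `reduced_chi2` and the synthetic data are illustrative only and are not part of RAW. Note that `scipy.stats.chisquare` computes Pearson's statistic on observed versus expected counts and does not take per-point uncertainties, which is why a hand-written function is shown here.

```python
import numpy as np

def reduced_chi2(i_a, i_b, err, n_params=0):
    """Reduced chi-square between two intensity arrays on the same q grid.

    err is the one-sigma uncertainty on the difference i_a - i_b.
    n_params is the number of fitted parameters (0 if nothing was fitted).
    """
    i_a, i_b, err = (np.asarray(x, dtype=float) for x in (i_a, i_b, err))
    valid = err > 0  # skip points with zero or negative uncertainty
    resid = (i_a[valid] - i_b[valid]) / err[valid]
    dof = valid.sum() - n_params
    return float(np.sum(resid**2) / dof)

# Synthetic example: two profiles that differ by one sigma at every point.
q = np.linspace(0.01, 0.3, 50)
i_1 = np.exp(-(q * 20.0)**2 / 3.0)
sigma = np.full_like(q, 0.01)
i_2 = i_1 + sigma
chi2 = reduced_chi2(i_1, i_2, sigma)
```

When both profiles are measured data with their own error bars, one common choice for the uncertainty on the difference is np.sqrt(err_1**2 + err_2**2).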

I hope that helps.

All the best.

- Jesse


----
Jesse Hopkins, PhD
Deputy Director
BioCAT, Sector 18
Advanced Photon Source


--
You received this message because you are subscribed to the Google Groups "BioXTAS RAW" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bioxtas_raw...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bioxtas_raw/0d95233e-10bb-4e48-bdd6-d7a7f7e49424n%40googlegroups.com.

Feng Yu

Oct 2, 2024, 5:14:49 PM
to bioxt...@googlegroups.com
Hi Jesse,
Thank you. Another quick question: what function would you use to compare P(r) functions? I am thinking about the KL divergence, but I am not sure.
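
For concreteness, here is a rough sketch of what a KL divergence between two P(r) curves could look like, assuming both curves are sampled on the same uniformly spaced r grid. The name `kl_divergence`, the clipping of negative values, and the small `eps` floor are choices made for this illustration only. The measure ignores the uncertainties on P(r) and is not symmetric in its two arguments.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL divergence D(p || q) between two P(r) curves on the same r grid.

    Negative values are clipped to zero and a small floor (eps) is added so
    the logarithm stays finite; both curves are then normalized to sum to 1.
    """
    p = np.clip(np.asarray(p, dtype=float), 0.0, None) + eps
    q = np.clip(np.asarray(q, dtype=float), 0.0, None) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

# Toy example with two short, already normalized curves.
p_r1 = np.array([0.1, 0.4, 0.5])
p_r2 = np.array([0.3, 0.3, 0.4])
kl_12 = kl_divergence(p_r1, p_r2)
kl_21 = kl_divergence(p_r2, p_r1)
```
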
Best
Feng Yu

Thomas Grant

Oct 2, 2024, 6:11:58 PM
to bioxt...@googlegroups.com
Hi Feng Yu,

Actually, there is now a function in the DENSS.py file in RAW 2.3.0 called calc_chi2() that does what you want. It performs all the scaling, fitting, and interpolation. It just takes as input the numpy arrays of the experimental scattering profile (an Nx3 array of q_exp, I_exp, error_exp) and the calculated scattering profile (q_calc, I_calc). It's not directly part of the API, but at least you wouldn't have to write it yourself. The API code has a couple of examples of how to use it when importing the DENSS module, so you could use it as DENSS.calc_chi2(Iq_exp, Iq_calc). There are also some options there for turning scaling and offset on or off, and for returning the fit.
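
To make the three steps concrete (interpolation, scaling, chi-square), here is a standalone numpy sketch. It is not the DENSS implementation; the name `scaled_chi2`, the `fit_offset` flag, and the synthetic profiles are assumptions made for illustration. It interpolates the calculated profile onto the experimental q values, fits a scale factor (and optionally a constant offset) by error-weighted least squares, and returns the reduced chi-square. It assumes the experimental q range lies inside the calculated q range.

```python
import numpy as np

def scaled_chi2(iq_exp, iq_calc, fit_offset=False):
    """Reduced chi-square of a calculated profile against experimental data.

    iq_exp:  (N, 3) array with columns q, I, error.
    iq_calc: (M, 2) array with columns q, I (q must be increasing).
    """
    iq_exp = np.asarray(iq_exp, dtype=float)
    iq_calc = np.asarray(iq_calc, dtype=float)
    q_exp, i_exp, err_exp = iq_exp[:, 0], iq_exp[:, 1], iq_exp[:, 2]

    # Step 1: put the calculated curve on the experimental q grid.
    i_calc = np.interp(q_exp, iq_calc[:, 0], iq_calc[:, 1])

    # Step 2: weighted linear least squares for scale (and optional offset).
    columns = [i_calc, np.ones_like(i_calc)] if fit_offset else [i_calc]
    design = np.column_stack(columns) / err_exp[:, None]
    target = i_exp / err_exp
    params, *_ = np.linalg.lstsq(design, target, rcond=None)

    # Step 3: reduced chi-square of the error-weighted residuals.
    resid = target - design @ params
    dof = len(q_exp) - len(params)
    return float(np.sum(resid**2) / dof), params

# Synthetic check: the "experimental" curve is the calculated one times 2.5.
q_calc = np.linspace(0.0, 0.5, 501)
i_calc = np.exp(-(q_calc * 25.0)**2 / 3.0)
iq_calc = np.column_stack([q_calc, i_calc])

q_exp = np.linspace(0.01, 0.3, 60)
i_exp = 2.5 * np.interp(q_exp, q_calc, i_calc)
err_exp = 0.02 * i_exp + 1e-4
iq_exp = np.column_stack([q_exp, i_exp, err_exp])

chi2, params = scaled_chi2(iq_exp, iq_calc)
```

In this noise-free check the fitted scale comes back as 2.5 and the chi-square is essentially zero; with real data a value near 1 would indicate agreement within the error bars.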

Hope that helps,
Tom

Feng Yu

Oct 2, 2024, 8:13:47 PM
to bioxt...@googlegroups.com
Hi Jesse and Thomas,
Thank you for your help.
For those two functions, is it necessary to install ATSAS? I ask because I encountered an error message: /usr/bin/which: no dammif in "...PATH...". I am running the analysis on a server, so I may have a hard time compiling ATSAS.
In addition, is there a weighted average function in RAW? If not, I will use numpy to do a weighted average.
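
For reference, a plain numpy weighted average could be sketched as below, assuming all profiles share the same q grid and that inverse-variance weights (1/error^2) are wanted. The function name `inverse_variance_average` is made up for this example and is not RAW's own function.

```python
import numpy as np

def inverse_variance_average(intensities, errors):
    """Inverse-variance weighted average of profiles on a common q grid.

    intensities, errors: arrays of shape (n_profiles, n_q).
    Returns the averaged intensity and its propagated one-sigma error.
    """
    i = np.asarray(intensities, dtype=float)
    e = np.asarray(errors, dtype=float)
    w = 1.0 / e**2                       # weight each point by 1/sigma^2
    w_sum = np.sum(w, axis=0)
    avg = np.sum(w * i, axis=0) / w_sum
    avg_err = np.sqrt(1.0 / w_sum)       # error of the weighted mean
    return avg, avg_err

# Two profiles, two q points each.
i_stack = np.array([[10.0, 5.0], [12.0, 7.0]])
e_stack = np.array([[1.0, 1.0], [1.0, 2.0]])
avg, avg_err = inverse_variance_average(i_stack, e_stack)
```

At the first q point the errors are equal, so the result is the plain mean; at the second, the profile with the smaller error dominates.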
Thank you.
Best
Feng Yu

Jesse Hopkins

Oct 3, 2024, 7:51:41 AM
to BioXTAS RAW
Hi Feng,

There is a weighted average function:

In general the API is (I hope) pretty well documented; I’d recommend giving it a read:

As far as your question about the statistical comparison of P(r) functions, I’m afraid I don’t have a good suggestion there. It’s not something I’ve ever really given much thought to.

All the best.

- Jesse


Richard Gillilan

Oct 3, 2024, 8:15:35 AM
to bioxt...@googlegroups.com
Statistical comparison of P(r)'s would be problematic to get right, I think. Slight changes in choice of Dmax, stiffness, etc. can lead to oscillations. I notice that GNOM does generate error bars on the P(r) function, but I haven't read how they do that, and I'll bet they are sensitive to choice of Dmax, etc. So, in principle, you could do a chi-square comparison of P(r) functions using those errors, if you trust them. It wouldn't be hard to write a short Python script to do that.
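
A short script along those lines might look like the sketch below. It assumes the two P(r) curves have already been normalized consistently (for example to the same area), that their error bars are trusted and treated as independent, and that P(r) is zero beyond each curve's Dmax. The name `pr_chi2` and the toy curves are illustrative only.

```python
import numpy as np

def pr_chi2(r_a, p_a, err_a, r_b, p_b, err_b):
    """Chi-square per point between two P(r) curves with error bars.

    Curve b is linearly interpolated onto the r grid of curve a and is
    taken to be zero beyond its own Dmax. Errors are added in quadrature.
    """
    p_b_on_a = np.interp(r_a, r_b, p_b, right=0.0)
    err_b_on_a = np.interp(r_a, r_b, err_b, right=0.0)
    sigma2 = err_a**2 + err_b_on_a**2
    valid = sigma2 > 0  # skip points where both errors are zero
    chi2 = np.sum((p_a[valid] - p_b_on_a[valid])**2 / sigma2[valid])
    return float(chi2 / valid.sum())

# Toy curves on a shared grid with Dmax = 100.
r = np.linspace(0.0, 100.0, 51)
p_1 = (r / 100.0) * (1.0 - r / 100.0)
p_2 = p_1 + 0.02
err = np.full_like(r, 0.01)

chi2_same = pr_chi2(r, p_1, err, r, p_1, err)
chi2_shift = pr_chi2(r, p_1, err, r, p_2, err)
```

Linear interpolation of the error bars is a simplification, and the result inherits whatever sensitivity to Dmax and regularization the input curves have.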

Another alternative is to compare the I(q) curves obtained from the GNOM calculation ... these are the P(r) functions transformed back to reciprocal space. There, you could use the more rigorous errors measured for the experimental I(q) to see if different P(r) functions produce I(q) that can be statistically distinguished. 

Richard


Thomas Grant

Oct 3, 2024, 8:39:07 AM
to bioxt...@googlegroups.com
On that point, the DENSS IFT (also now in RAW) uses a proper least squares error propagation, so you get real error bars on the P(r) curve (rather than the made-up errors that GNOM generates using polynomial fitting). But, like Richard said, the P(r) itself, as well as the errors, are sensitive to things like Dmax and alpha regularization. So take the calculations with some skepticism.

Tom

Jesse Hopkins

Oct 3, 2024, 8:40:26 AM
to bioxt...@googlegroups.com
Richard,

I like the idea of comparing against the original data (which RAW does in the P(r) window), but that only works if you are generating P(r) functions from the same data. I was assuming the application here was comparing P(r) functions from different datasets to see if they could be statistically distinguished, but I could be wrong.

As far as the uncertainty in GNOM, it's from a Monte Carlo estimation: they vary the random seed of the calculation and see how much the P(r) function changes.

I agree that differences in Dmax and regularization parameters are more likely to dominate the difference in P(r) functions than statistical errors.

All the best.

- Jesse


Richard Gillilan

Oct 3, 2024, 8:52:07 AM
to bioxt...@googlegroups.com

They just need to share the same q values, right? I've always thought it would be handy to have an interpolate function in RAW that takes one dataset and uses interpolation to place it on a different set of q values so that you could then do quantitative comparisons of different datasets. Maybe you already have something like that somewhere.
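
Outside of RAW, the idea can be sketched with numpy alone. The helper below (the name `map_onto` is hypothetical) linearly interpolates one dataset onto the q values of a reference, keeping only the reference points that fall inside the dataset's own q range so nothing is extrapolated. Interpolating the error bars linearly is an assumption of this sketch, not a rigorous propagation.

```python
import numpy as np

def map_onto(q_ref, q, i, err):
    """Interpolate (q, i, err) onto the reference q values it overlaps.

    Returns the retained q values, the interpolated intensity and error,
    and a boolean mask marking which reference points were kept.
    """
    q_ref, q, i, err = (np.asarray(x, dtype=float) for x in (q_ref, q, i, err))
    inside = (q_ref >= q[0]) & (q_ref <= q[-1])  # no extrapolation
    q_new = q_ref[inside]
    return q_new, np.interp(q_new, q, i), np.interp(q_new, q, err), inside

# Dataset measured on one grid, reference measured on another.
q_data = np.array([0.0, 0.1, 0.2, 0.3])
i_data = np.array([4.0, 3.0, 2.0, 1.0])
err_data = np.full_like(q_data, 0.1)
q_ref = np.array([0.05, 0.15, 0.25, 0.35])

q_new, i_new, err_new, kept = map_onto(q_ref, q_data, i_data, err_data)
```

The returned mask can be applied to the reference arrays so that both datasets end up with identical q values before any point-by-point comparison.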

Richard

Richard Gillilan

Oct 3, 2024, 8:55:11 AM
to bioxt...@googlegroups.com
Years ago, I worked with a visualization system that had that functionality. They called it "mapping." So, you could map one dataset onto another with different geometry.

Stephen Paul Meisburger

Oct 3, 2024, 9:05:57 AM
to bioxt...@googlegroups.com
I think the errors in p(r) are correlated with each other, due to the properties of the band-limited Fourier transform. That might complicate any comparison in p(r) space.
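
To illustrate why correlation matters, the sketch below compares the usual diagonal chi-square with the generalized form d^T C^-1 d, where d is the difference between two curves and C is the covariance matrix of that difference. The 2x2 matrices are toy numbers, and the function name `chi2_with_covariance` is made up; a real covariance matrix would have to come from the IFT calculation, and the thread does not say whether one is available.

```python
import numpy as np

def chi2_with_covariance(diff, cov):
    """Generalized chi-square d^T C^-1 d for correlated errors."""
    diff = np.asarray(diff, dtype=float)
    cov = np.asarray(cov, dtype=float)
    # Solve C x = d rather than forming the inverse explicitly.
    return float(diff @ np.linalg.solve(cov, diff))

# Same difference vector, with and without correlation between the points.
diff = np.array([1.0, 1.0])
cov_uncorr = np.array([[1.0, 0.0], [0.0, 1.0]])
cov_corr = np.array([[1.0, 0.9], [0.9, 1.0]])

chi2_uncorr = chi2_with_covariance(diff, cov_uncorr)
chi2_corr = chi2_with_covariance(diff, cov_corr)
```

With strongly positively correlated errors, two neighboring points that deviate in the same direction count for much less than two independent deviations, so a diagonal-only chi-square would overstate the disagreement in this toy case.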
-Steve 




Jesse Hopkins

Oct 3, 2024, 9:54:42 AM
to BioXTAS RAW
Richard, RAW does have interpolation available.

Jesse 

Sent from a small mobile device. Please excuse errors and brevity.


Richard Gillilan

Oct 3, 2024, 10:05:53 AM
to bioxt...@googlegroups.com
Great! I should get more familiar with the latest version.

Feng Yu

Oct 9, 2024, 6:18:09 PM
to bioxt...@googlegroups.com
Hi,
Thank you for everyone's help. I have one more question about using DENSS.calc_chi2. I have a SAXS profile object named 'reference', and I interpolated and superimposed the computational curve onto the reference. I then used vstack to compose the array: np.vstack([super_imposed_curve.getQ(), super_imposed_curve.getI(), super_imposed_curve.getErr()])
But the returned chi-square is very small. Is there anything I need to change?
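
One thing that may be worth checking is the array shape. The experimental input was described earlier in the thread as an Nx3 array, whereas np.vstack on three 1D arrays produces a 3xN array. Whether this explains the small value is not established here; the snippet below only shows the shape difference, using made-up data.

```python
import numpy as np

# Made-up 1D arrays standing in for getQ(), getI(), and getErr().
q = np.linspace(0.01, 0.3, 5)
i = np.ones_like(q)
err = np.full_like(q, 0.1)

stacked_rows = np.vstack([q, i, err])        # one row per quantity: (3, N)
stacked_cols = np.column_stack([q, i, err])  # one column per quantity: (N, 3)
```

Transposing the vstack result (stacked_rows.T) gives the same Nx3 layout as np.column_stack.
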
Thank you.
Best
Feng Yu

Thomas Grant

Oct 10, 2024, 2:03:26 PM
to bioxt...@googlegroups.com
Hi Feng,

DENSS.calc_chi2 will do the interpolation and scaling for you. However, it’s fine to do those steps on your own too. But be sure to scale the error bars by the same factor that the intensities are scaled by for superposition. If you don’t scale the errors, the chi2 will be wrong.
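
A small numpy illustration of that point, using made-up numbers: a profile with error bars is multiplied by a scale factor of 10 to superimpose it on a reference, and the chi-square is computed once with the errors scaled by the same factor and once with the original errors. The helper name `mean_chi2` is hypothetical, and the reference is treated as error-free to keep the example short.

```python
import numpy as np

def mean_chi2(i_ref, i_other, err):
    """Mean squared, error-normalized difference between two curves."""
    return float(np.mean(((i_ref - i_other) / err)**2))

i_ref = np.array([100.0, 80.0, 60.0, 40.0])

# The other profile is the reference at 1/10 scale, off by one sigma per point.
err_other = np.array([0.5, 0.4, 0.3, 0.2])
i_other = i_ref / 10.0 + err_other

scale = 10.0
i_scaled = scale * i_other
err_scaled = scale * err_other  # errors scaled together with intensities

chi2_scaled_err = mean_chi2(i_ref, i_scaled, err_scaled)
chi2_unscaled_err = mean_chi2(i_ref, i_scaled, err_other)
```

With the errors scaled, the one-sigma deviations give a chi-square of about 1; with the original errors the value is off by the square of the scale factor. A scale factor below 1 would push the unscaled-error result in the opposite direction, making it too small.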

Best,
Tom
