Calculate the chi square with RAW API

Feng Yu

Oct 2, 2024, 4:23:30 PM

to BioXTAS RAW

Hi Jesse,

Could you tell me how to average multiple .dat profiles and calculate the chi-square between different profiles using the RAW Python API?

Thank you.

Best

Feng Yu

Jesse Hopkins

Oct 2, 2024, 5:09:18 PM

to bioxt...@googlegroups.com

Hi Feng Yu,

This example shows how to average profiles using the API:

https://bioxtas-raw.readthedocs.io/en/latest/api/ex_batch_profile.html

At the moment RAW does not have a chi^2 calculator that is accessible through the API. If you want to do that, you'll have to write your own. You can get the q, I, and uncertainty data as numpy arrays as described here: https://bioxtas-raw.readthedocs.io/en/latest/api/getting_started.html#accessing-q-i-and-uncertainty-data

And then you will have to either write your own function or use one you find, such as the chisquare function in the scipy.stats package.
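A hand-written version can be quite short. The sketch below is one possible form, not RAW code: it assumes both profiles are already on the same q grid and the same scale, that the uncertainties belong to the first profile, and the name `reduced_chi2` is made up for illustration.

```python
import numpy as np

def reduced_chi2(i_ref, err_ref, i_other, n_params=0):
    """Reduced chi-square between two profiles sampled at the same q values.

    Hypothetical helper, not part of the RAW API.
    """
    i_ref = np.asarray(i_ref, dtype=float)
    err_ref = np.asarray(err_ref, dtype=float)
    i_other = np.asarray(i_other, dtype=float)

    # Residuals weighted by the uncertainty of the reference profile.
    resid = (i_ref - i_other) / err_ref
    dof = i_ref.size - n_params  # degrees of freedom
    return np.sum(resid**2) / dof

# Synthetic data: every point is off by one sigma, so the result is close to 1.
i_a = np.array([10.0, 8.0, 5.0, 2.0])
err_a = np.array([0.5, 0.4, 0.3, 0.2])
i_b = i_a + err_a
chi2 = reduced_chi2(i_a, err_a, i_b)
print(chi2)
```

With real data, the q, I, and uncertainty arrays would come from the profiles as described in the linked getting-started page.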

I hope that helps.

All the best.

- Jesse

----

Jesse Hopkins, PhD

Deputy Director

BioCAT, Sector 18

Advanced Photon Source

--
You received this message because you are subscribed to the Google Groups "BioXTAS RAW" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bioxtas_raw...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bioxtas_raw/0d95233e-10bb-4e48-bdd6-d7a7f7e49424n%40googlegroups.com.

Feng Yu

Oct 2, 2024, 5:14:49 PM

to bioxt...@googlegroups.com

Hi Jesse,

Thank you. Another quick question: what function would you use to compare P(r) functions? I am thinking about the KL divergence, but I am not sure.

Best

Feng Yu

Thomas Grant

Oct 2, 2024, 6:11:58 PM

to bioxt...@googlegroups.com

Hi Feng Yu,

Actually, there is now a function in the DENSS.py file in RAW 2.3.0 called calc_chi2() that does what you want. It performs all the scaling, fitting, and interpolation. It takes as input the numpy arrays of the experimental scattering profile (an Nx3 array of q_exp, I_exp, error_exp) and the calculated scattering profile (q_calc, I_calc). It's not directly part of the API, but at least you wouldn't have to write it yourself. The API code has a couple of examples of how to use it when importing the DENSS module, so you could use it as DENSS.calc_chi2(Iq_exp, Iq_calc). There are also some options there for turning scaling and offset on or off, and for returning the fit.
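The steps listed above (interpolation, scaling, an optional offset, then chi^2) can be sketched in plain numpy. This is not the DENSS implementation and says nothing about its internals or keyword names; `fit_and_chi2` and its arguments are hypothetical, and the sketch assumes the calculated q range covers the experimental one.

```python
import numpy as np

def fit_and_chi2(iq_exp, iq_calc, fit_offset=False):
    """Illustrative only: fit scale (and optional offset), return reduced chi^2.

    iq_exp  : (N, 3) array of q, I, sigma
    iq_calc : (M, 2) array of q, I
    """
    q, i_exp, sigma = iq_exp[:, 0], iq_exp[:, 1], iq_exp[:, 2]

    # Put the calculated curve on the experimental q values.
    i_calc = np.interp(q, iq_calc[:, 0], iq_calc[:, 1])

    # Weighted linear least squares: i_exp ~ scale * i_calc (+ offset).
    columns = [i_calc / sigma]
    if fit_offset:
        columns.append(1.0 / sigma)
    design = np.column_stack(columns)
    params, *_ = np.linalg.lstsq(design, i_exp / sigma, rcond=None)

    fit = (design @ params) * sigma  # fitted curve back in intensity units
    resid = (i_exp - fit) / sigma
    dof = q.size - params.size
    return np.sum(resid**2) / dof, params, fit

# Synthetic check: the "experimental" curve is 5x the calculated one.
q_calc = np.linspace(0.0, 0.3, 61)
i_calc = np.exp(-(q_calc * 20.0) ** 2 / 3.0)
q_exp = np.linspace(0.01, 0.25, 25)
i_true = 5.0 * np.exp(-(q_exp * 20.0) ** 2 / 3.0)
sigma = 0.01 * np.ones_like(q_exp)

iq_exp = np.column_stack([q_exp, i_true, sigma])
iq_calc = np.column_stack([q_calc, i_calc])
chi2, params, fit = fit_and_chi2(iq_exp, iq_calc)
print(params[0], chi2)
```

Note that the scale factor is applied to the calculated curve here, so the experimental uncertainties are used unchanged.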

Hope that helps,

Tom

Feng Yu

Oct 2, 2024, 8:13:47 PM

to bioxt...@googlegroups.com

Hi Jesse and Thomas,

Thank you for your help.

For those two functions, is it necessary to install ATSAS? I ask because I encountered this error message: /usr/bin/which: no dammif in "...PATH...". I am running the analysis on a server, so I may have a hard time compiling ATSAS.

In addition, is there a weighted average function in RAW? If not, I will use numpy to do a weighted average.
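A numpy-only weighted average could look like the sketch below. It assumes inverse-variance weights (1/sigma^2) and profiles that share one q grid; the function name is hypothetical, and no claim is made that RAW weights profiles the same way.

```python
import numpy as np

def inverse_variance_average(intensities, errors):
    """Average profiles point by point, weighting each by 1/sigma^2.

    intensities, errors : (n_profiles, n_q) arrays on a shared q grid.
    Hypothetical helper, not a RAW function.
    """
    intensities = np.asarray(intensities, dtype=float)
    weights = 1.0 / np.asarray(errors, dtype=float) ** 2

    avg = np.sum(weights * intensities, axis=0) / np.sum(weights, axis=0)
    # Propagated uncertainty of an inverse-variance weighted mean.
    avg_err = 1.0 / np.sqrt(np.sum(weights, axis=0))
    return avg, avg_err

# Two profiles, two q points each.
i_stack = np.array([[10.0, 5.0], [12.0, 7.0]])
e_stack = np.array([[1.0, 1.0], [1.0, 2.0]])
avg, avg_err = inverse_variance_average(i_stack, e_stack)
print(avg, avg_err)
```

At the second q point the noisier profile (sigma = 2) contributes a quarter of the weight of the other one, so the average sits closer to 5 than to 7.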

Thank you.

Best

Feng Yu

Jesse Hopkins

Oct 3, 2024, 7:51:41 AM

to BioXTAS RAW

Hi Feng,

There is a weighted average function:

https://bioxtas-raw.readthedocs.io/en/latest/api/main_api.html#bioxtasraw.RAWAPI.weighted_average

In general, the API is (I hope) pretty well documented, and I’d recommend giving it a read:

https://bioxtas-raw.readthedocs.io/en/latest/api.html

As for your question about the statistical comparison of P(r) functions, I’m afraid I don’t have a good suggestion there; it’s not something I’ve ever really given much thought to.

All the best.

- Jesse

----

Jesse Hopkins, PhD

Deputy Director

BioCAT, Sector 18

Advanced Photon Source

Richard Gillilan

Oct 3, 2024, 8:15:35 AM

to bioxt...@googlegroups.com

Statistical comparison of P(r) functions would be problematic to get right, I think. Slight changes in the choice of Dmax, stiffness, etc., can lead to oscillations. I notice that GNOM does generate error bars on the P(r) function, but I haven't read how they do that, and I'll bet they are sensitive to the choice of Dmax, etc. So, in principle, you could do a chi-square comparison of P(r) functions using those errors, if you trust them. It wouldn't be hard to write a short Python script to do that.
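Such a script might look like the sketch below. It rests on several assumptions that may not hold in practice: both P(r) curves are on the same r grid with the same normalization, both carry their own error bars, and those errors are independent so they can be combined in quadrature. The name `pr_chi2` is hypothetical.

```python
import numpy as np

def pr_chi2(p1, err1, p2, err2):
    """Reduced chi-square between two P(r) curves on the same r grid.

    Assumes independent errors on both curves (combined in quadrature).
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    var = np.asarray(err1, dtype=float) ** 2 + np.asarray(err2, dtype=float) ** 2

    # Skip points with zero reported error (this can happen where P(r)
    # is forced to 0), since they would cause a division by zero.
    keep = var > 0
    return np.sum((p1[keep] - p2[keep]) ** 2 / var[keep]) / np.count_nonzero(keep)

# Toy curves that differ by 0.2 at two of the three interior points.
p_a = np.array([0.0, 1.0, 2.0, 1.0, 0.0])
p_b = np.array([0.0, 1.2, 2.0, 0.8, 0.0])
e_a = np.array([0.0, 0.1, 0.1, 0.1, 0.0])
e_b = np.array([0.0, 0.1, 0.1, 0.1, 0.0])
chi2_pr = pr_chi2(p_a, e_a, p_b, e_b)
print(chi2_pr)
```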

Another alternative is to compare the I(q) curves obtained from the GNOM calculation ... these are the P(r) functions transformed back to reciprocal space. There, you could use the more rigorous errors measured for the experimental I(q) to see if different P(r) functions produce I(q) that can be statistically distinguished.

Richard

Thomas Grant

Oct 3, 2024, 8:39:07 AM

to bioxt...@googlegroups.com

On that point, the DENSS IFT (also now in RAW) uses a proper least squares error propagation, so you get real error bars on the P(r) curve (rather than the made up errors that GNOM generates using polynomial fitting). But, like Richard said, the P(r) itself, as well as the errors, are sensitive to things like Dmax and alpha regularization. So take the calculations with some skepticism.

Tom

Jesse Hopkins

Oct 3, 2024, 8:40:26 AM

to bioxt...@googlegroups.com

Richard,

I like the idea of comparing against the original data (which RAW does in the P(r) window), but that only works if you are generating P(r) functions from the same data. I was assuming the application here was comparing P(r) functions from different datasets to see if they could be statistically distinguished, but I could be wrong.

As far as the uncertainty in GNOM goes, it's from a Monte Carlo estimation: they vary the random seed of the calculation and see how much the P(r) function changes.

I agree that differences in dmax and regularization parameters are more likely to dominate the difference in P(r) functions than statistical errors.

All the best.

- Jesse

----

Jesse Hopkins, PhD

Deputy Director

BioCAT, Sector 18

Advanced Photon Source

Richard Gillilan

Oct 3, 2024, 8:52:07 AM

to bioxt...@googlegroups.com

> On Oct 3, 2024, at 8:40 AM, Jesse Hopkins <jesse.b...@gmail.com> wrote:
>
> Richard,
>
> I like the idea of comparing against the original data (which RAW does in the P(r) window), but that only works if you are generating P(r) functions from the same data, I was assuming the application here was comparing P(r) functions from different datasets to see if they could be statistically distinguished, but I could be wrong.
>

They just need to share the same q values, right? I've always thought it would be handy to have an interpolate function in RAW that takes one dataset and uses interpolation to place it on a different set of q values, so that you could then do quantitative comparisons of different datasets. Maybe you already have something like that somewhere.
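As a plain-numpy illustration of that idea (RAW's own interpolation is not shown here), the sketch below places one profile onto another profile's q values using linear interpolation. The function name is hypothetical, q_src is assumed to be increasing, and interpolating the uncertainties directly is a simplification.

```python
import numpy as np

def map_onto_q(q_target, q_src, i_src, err_src):
    """Linearly interpolate a profile onto q_target, restricted to the overlap.

    Returns the q values actually used plus the interpolated I and sigma.
    """
    q_target = np.asarray(q_target, dtype=float)
    q_src = np.asarray(q_src, dtype=float)

    # np.interp holds the end values constant outside the source range,
    # so keep only the target points inside the shared q range.
    inside = (q_target >= q_src[0]) & (q_target <= q_src[-1])
    q_new = q_target[inside]

    i_new = np.interp(q_new, q_src, i_src)
    # Simplification: this ignores the correlation that interpolation
    # introduces between neighbouring points.
    err_new = np.interp(q_new, q_src, err_src)
    return q_new, i_new, err_new

q_src = np.array([0.01, 0.02, 0.03])
i_src = np.array([3.0, 2.0, 1.0])
err_src = np.array([0.3, 0.2, 0.1])
q_target = np.array([0.005, 0.015, 0.025, 0.035])

q_new, i_new, err_new = map_onto_q(q_target, q_src, i_src, err_src)
print(q_new, i_new, err_new)
```

The first and last target points fall outside the source range and are dropped rather than extrapolated.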

Richard

Richard Gillilan

Oct 3, 2024, 8:55:11 AM

to bioxt...@googlegroups.com

Years ago, I worked with a visualization system that had that functionality. They called it "mapping." So, you could map one dataset onto another with different geometry.

Stephen Paul Meisburger

Oct 3, 2024, 9:05:57 AM

to bioxt...@googlegroups.com

I think the errors in P(r) are correlated with each other, due to the properties of the band-limited Fourier transform. That might complicate any comparison in P(r) space.
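To illustrate why correlation matters, the sketch below uses the generalised chi-square d^T C^-1 d, where C is a full covariance matrix rather than a list of independent error bars. The covariance values are invented for the example; how to obtain a real covariance matrix for P(r) is not addressed here.

```python
import numpy as np

def chi2_with_covariance(p1, p2, cov):
    """Generalised chi-square d^T C^-1 d for correlated errors."""
    d = np.asarray(p1, dtype=float) - np.asarray(p2, dtype=float)
    # Solve C x = d instead of forming the inverse explicitly.
    return float(d @ np.linalg.solve(cov, d))

# Two points that both differ by one sigma in the same direction.
p_first = np.array([1.0, 1.0])
p_second = np.array([0.0, 0.0])

cov_indep = np.array([[1.0, 0.0], [0.0, 1.0]])  # uncorrelated errors
cov_corr = np.array([[1.0, 0.9], [0.9, 1.0]])   # strongly correlated (made up)

chi2_indep = chi2_with_covariance(p_first, p_second, cov_indep)
chi2_corr = chi2_with_covariance(p_first, p_second, cov_corr)
print(chi2_indep, chi2_corr)
```

In this toy case, treating positively correlated points as independent roughly doubles the chi-square, so the same difference looks more significant than it is.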

-Steve

Jesse Hopkins

Oct 3, 2024, 9:54:42 AM

to BioXTAS RAW

Richard, RAW does have interpolation available.

Jesse

Sent from a small mobile device. Please excuse errors and brevity.

Richard Gillilan

Oct 3, 2024, 10:05:53 AM

to bioxt...@googlegroups.com

Great! I should get more familiar with the latest version.

Feng Yu

Oct 9, 2024, 6:18:09 PM

to bioxt...@googlegroups.com

Hi,

Thank you, everyone, for your help. I have one more question about using DENSS.calc_chi2. I have a SAXS profile object named 'reference', and I interpolated and superimposed the computational curve onto the reference. I then used vstack to compose the array: np.vstack([super_imposed_curve.getQ(), super_imposed_curve.getI(), super_imposed_curve.getErr()]).

But the returned chi-square is very small. Is there anything I need to change?
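One thing that can be checked quickly is the array layout. The earlier reply in this thread describes the experimental input as an Nx3 array, whereas np.vstack on three 1-D arrays produces a 3xN array. The numpy-only sketch below shows the difference using made-up data; whether this accounts for the small chi-square is not established here.

```python
import numpy as np

# Made-up profile with N = 5 points.
q = np.linspace(0.01, 0.3, 5)
i = np.exp(-q * 10.0)
err = 0.02 * i

stacked_rows = np.vstack([q, i, err])        # shape (3, N): one row per quantity
stacked_cols = np.column_stack([q, i, err])  # shape (N, 3): one row per q point

print(stacked_rows.shape)  # (3, 5)
print(stacked_cols.shape)  # (5, 3)

# The two layouts are transposes of each other.
assert np.array_equal(stacked_rows.T, stacked_cols)
```

Either np.column_stack or a transpose of the vstack result gives the Nx3 layout.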

Thank you.

Best

Feng Yu

Thomas Grant

Oct 10, 2024, 2:03:26 PM

to bioxt...@googlegroups.com

Hi Feng,

DENSS.calc_chi2 will do the interpolation and scaling for you. However, it’s fine to do those steps on your own too. Just be sure to scale the error bars by the same factor used to scale the intensities for superposition. If you don’t scale the errors, the chi2 will be wrong.
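The effect of forgetting to scale the errors can be seen with synthetic data. In the sketch below, the scale factor, the noise level, and the helper `chi2` are all invented for illustration; it only demonstrates the arithmetic, not any particular RAW or DENSS call.

```python
import numpy as np

def chi2(i_exp, err_exp, i_model):
    """Chi-square per point for curves on the same q grid."""
    return np.sum(((i_exp - i_model) / err_exp) ** 2) / i_exp.size

rng = np.random.default_rng(0)
i_model = np.linspace(10.0, 1.0, 200)
err = 0.05 * np.ones_like(i_model)
i_exp = i_model + rng.normal(0.0, err)  # noisy "experimental" curve

scale = 0.01  # hypothetical factor used to superimpose onto a weaker curve
i_scaled = scale * i_exp
err_scaled = scale * err
model_scaled = scale * i_model

chi2_before = chi2(i_exp, err, i_model)
chi2_right = chi2(i_scaled, err_scaled, model_scaled)  # errors scaled too
chi2_wrong = chi2(i_scaled, err, model_scaled)         # errors left unscaled

print(chi2_before, chi2_right, chi2_wrong)
```

Scaling intensities and errors together leaves the chi-square unchanged, while leaving the errors unscaled multiplies it by scale squared, which for a factor below 1 gives a value that is far too small.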

Best,

Tom
