2D peak fitting

188 views
Skip to first unread message

Michael

unread,
Nov 25, 2013, 11:17:19 AM11/25/13
to nmrglue...@googlegroups.com
Hi,

I am very interested to use nmrglue for 2D, and eventually pseudo-3D, peak fitting.  Your pseudo-Voight and Voight lineshapes should be superior to the Gaussian or Lorentzian approximations that other programs make.  However, I am running into some problems getting the variable set up for fit_spectrum.  Basically, I intend to use a NMRPipe peak list to get the starting points for fit_spectrum.  I think I do not have one or more of the lists set up properly, so I am hoping you can help out with this.  I have attached my program below.

#!/usr/bin/python

import sys
import argparse
import ConfigParser

import scipy as sp
import nmrglue as ng

# Argument Parser
parser = argparse.ArgumentParser(description='Peak Fitting')
parser.add_argument('-par', type=str, required=True, nargs=1, help='Config file')

if len(sys.argv) == 1:
    parser.print_help()
    sys.exit(1)

args = parser.parse_args()

# Config parser
config = ConfigParser.ConfigParser()
config.read(args.par)

if 'default' not in config.sections():
    exit('Selection \'default\' must be in {:s}.\n'.format(args.par[0]))

default = dict()
for name, val in config.items('default'):
    default[name] = str(val)

dic, data = ng.pipe.read(default['spectra'])
pcomments, pformats, pdata = ng.pipe.read_table(default['table'])

lineshapes = []
params = []
params_bounds = []
amps = []
amps_bounds = []
centers = []
clusters = []
box_width = []

for peak in pdata:

    lineshapes.append('pv')

    params.append([(peak['XW'],peak['YW'])])
    params_bounds.append([(None,None)])
    amps_bounds.append(None)

    centers.append((peak['X_AXIS'],peak['Y_AXIS']))

    clusters.append(peak['CLUSTID'])

    x1, x3 = peak['X1'], peak['X3']
    y1, y3 = peak['Y1'], peak['Y3']
    box_width.append((abs(x3 - x1),abs(y3 - y1)))


fit_args = (data, lineshapes, params, amps, params_bounds, amps_bounds,
            centers, clusters, box_width)

print len(fit_args)

for i in range(len(fit_args)):
    print fit_args[i]
9
[[ 2865.02758789  2520.3059082   2312.50317383 ...,  -205.16036987
   -354.90591431  -398.24462891]
 [ 1395.5300293    991.20330811   928.16766357 ...,  -175.32125854
   -315.77874756  -227.38514709]
 [ -220.14697266  -354.62313843  -161.32113647 ...,  -205.51899719
   -300.23153687   -70.73509979]
 ...,
 [-1862.59765625 -2096.47631836 -2303.00219727 ...,   444.19104004
    451.32055664   450.04867554]
 [-2745.17382812 -2980.73266602 -3092.66162109 ...,   369.76657104
    443.45373535   543.30914307]
 [-3316.6796875  -3295.82275391 -3192.07324219 ...,   280.27035522
    404.68154907   521.03979492]]
['pv', 'pv', 'pv', 'pv', 'pv']
[[(7.5529999999999999, 5.8760000000000003)], [(3.391, 3.8570000000000002)], [(5.782, 4.476)], [(5.5830000000000002, 4.21)], [(3.7370000000000001, 5.3739999999999997)]]
[188062.79999999999, 2070200.0, 195643.79999999999, 213270.70000000001, 236937.20000000001]
[[(None, None)], [(None, None)], [(None, None)], [(None, None)], [(None, None)]]
[None, None, None, None, None]
[(447.91199999999998, 116.292), (443.58199999999999, 131.04400000000001), (474.767, 132.797), (482.84399999999999, 134.11099999999999), (452.03899999999999, 146.386)]
[1, 1, 2, 2, 3]
[(10, 8), (4, 6), (8, 6), (8, 6), (6, 8)]
Traceback (most recent call last):
  File "./glueFit.py", line 70, in <module>
    fit = ng.analysis.linesh.fit_spectrum(*fit_args, error_flg=True, verb=True)
TypeError: fit_spectrum() takes at least 10 arguments (10 given)


As you can see, I am looping over and printing the arguments for fit_spectrum.  The final TypeError thrown makes me think I do not have one of the initial lists set up right.  Any help would be greatly appreciated.

Cheers,
Mike
fit = ng.analysis.linesh.fit_spectrum(*fit_args, error_flg=True, verb=True)

#params_best, amp_best, param_err, amp_err, iers = fit

#print params_best


When I run this, however, I get the following


    amps.append(peak['HEIGHT'])

Jonathan Helmus

unread,
Nov 25, 2013, 12:57:53 PM11/25/13
to nmrglue...@googlegroups.com
Michael,

You email got a little mangled on the list but I was able to follow
it. Voight lineshapes may give you additional information, but they have
additional parameters and are a more complicated model, which may or
may-not be appropriate.

You do not specify a error_flag parameter in your call to
ng.analysis.fit_spectrum[1], change the line to:

fit = ng.analysis.linesh.fit_spectrum(*fit_args, error_flag=True, verb=True)

Also you need to adjust the formatting of a number of your parameters as
required by the fit_spectrum function. It looks like you have 5 peaks
(P=5) and a two dimensional spectrum (N=2) so the parameters should be
as follows:

spectrum : 2 dimensional array, looks to be okay.

lineshapes : list of length 5, looks okay.

params : list of length 5 (P). Each element of this list should be a
length 2 (N) list. Each element of these sublist should have the
starting parameters for the lineshape fitting for that peak/dimension.
nmrglue's pseudo voigt lineshapes [2] need three parameters; x0, fwhm,
and eta (you don't need to specify x), so each sublist should be a
3-tuple. You will need to correct his, 0 is a good estimate for x0
since we assume that the centers are correct, 0.5 might be reasonable
for eta assuming a 50/50 Lorentzian/Gaussian mixture.

bounds : same formatting as params (a length 5 list with each element
bing a length 2 list, each element of that a 3-tuple). Each element of
the 3-tuple should be a 2-tuple with the min and max for the specific
parameter, this can be None if you do not want to limit the parameter.
If you use a Voigh lineshape you may want to bound the eta parameters to
be between 0 and 1, and the fwhm to be positive (min is 0, max None).

centers : list of length 5, each list being a 2-tuple, looks to be okay.

rIDs: list length 5, looks okay.

box_width: 2-tuple specifying the boxes around the peaks. The box size
MUST be uniform for all boxes, fix this. If you want different box
sizes around the various peaks you will need to write you own function
which selects out the cluster regions and fit these with fit_NDregion.

Hope this helps,

- Jonathan Helmus

[1]
http://jjhelmus.github.io/nmrglue/current/reference/generated/nmrglue.analysis.linesh.fit_spectrum.html
[2]
http://jjhelmus.github.io/nmrglue/current/reference/generated/nmrglue.analysis.lineshapes1d.sim_pvoigt_fwhm.html
> --
> You received this message because you are subscribed to the Google
> Groups "nmrglue-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to nmrglue-discu...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.

Michael

unread,
Nov 25, 2013, 4:33:05 PM11/25/13
to nmrglue...@googlegroups.com
Hi Jonathan,

Sorry my previous email was garbled.  Thank you very much for your reply.  I seem to have made a slight amount of progress, but I am getting hung up when fit_spectrum hands things off to fit_NDregion.  I now have output for the spectrum, lineshape, parameters, amplitudes, bounds, ampbounds, centers, rIDs and box width as follows.


[[ 2865.02758789  2520.3059082   2312.50317383 ...,  -205.16036987
   -354.90591431  -398.24462891]
 [ 1395.5300293    991.20330811   928.16766357 ...,  -175.32125854
   -315.77874756  -227.38514709]
 [ -220.14697266  -354.62313843  -161.32113647 ...,  -205.51899719
   -300.23153687   -70.73509979]
 ...,
 [-1862.59765625 -2096.47631836 -2303.00219727 ...,   444.19104004
    451.32055664   450.04867554]
 [-2745.17382812 -2980.73266602 -3092.66162109 ...,   369.76657104
    443.45373535   543.30914307]
 [-3316.6796875  -3295.82275391 -3192.07324219 ...,   280.27035522
    404.68154907   521.03979492]]

['pv', 'pv']

[[(0.0, 7.5529999999999999, 0.5), (0.0, 5.8760000000000003, 0.5)], [(0.0, 3.391, 0.5), (0.0, 3.8570000000000002, 0.5)], [(0.0, 5.782, 0.5), (0.0, 4.476, 0.5)], [(0.0, 5.5830000000000002, 0.5), (0.0, 4.21, 0.5)], [(0.0, 3.7370000000000001, 0.5), (0.0, 5.3739999999999997, 0.5)]]


[188062.79999999999, 2070200.0, 195643.79999999999, 213270.70000000001, 236937.20000000001]

[[((None, None), (0.0, None), (0.0, 1.0)), ((None, None), (0.0, None), (0.0, 1.0))], [((None, None), (0.0, None), (0.0, 1.0)), ((None, None), (0.0, None), (0.0, 1.0))], [((None, None), (0.0, None), (0.0, 1.0)), ((None, None), (0.0, None), (0.0, 1.0))], [((None, None), (0.0, None), (0.0, 1.0)), ((None, None), (0.0, None), (0.0, 1.0))], [((None, None), (0.0, None), (0.0, 1.0)), ((None, None), (0.0, None), (0.0, 1.0))]]

[(None, None), (None, None), (None, None), (None, None), (None, None)]


[(447.91199999999998, 116.292), (443.58199999999999, 131.04400000000001), (474.767, 132.797), (482.84399999999999, 134.11099999999999), (452.03899999999999, 146.386)]

[1, 1, 2, 2, 3]

(8.0, 8.0)

As I said above I get the following error.

Traceback (most recent call last):
  File "./glueFit.py", line 69, in <module>
    fit = ng.analysis.linesh.fit_spectrum(*fit_args, error_flag=True, verb=True)
  File "/home/latham/.local/lib/python2.7/site-packages/nmrglue/analysis/linesh.py", line 275, in fit_spectrum
    campbounds, wmask, error_flag, **kw)
  File "/home/latham/.local/lib/python2.7/site-packages/nmrglue/analysis/linesh.py", line 428, in fit_NDregion
    raise ValueError(err % (i, j))
ValueError: Incorrect number of parameters in peak 0 dimension 0

It seems that fit_spectrum is only passing 2 parameters to fit_NDregion, which knows that it needs 3.

Thoughts???

Thanks again for your help!
-Mike

Jonathan Helmus

unread,
Nov 26, 2013, 11:29:01 AM11/26/13
to nmrglue...@googlegroups.com
Michael,

    The cause of your error was a bug in nmrglue that deals with the adjusting the location parameter in the Vought lineshapes.   I've fixed the bug in the GitHub nmrglue repository [1], if you update nmrglue using git or download the zip file from GitHub you should be able to get things working.  The fix was only two lines [2] in the lineshapes1d.py file if you want to modify your nmrglue install by hand.

    Also you might want to change the error_flag to False initially.  I was getting an error with the fake data I was testing with when I get the error_flag to True, but it may be due to having data consisting of all zeros.

Cheers,

    - Jonathan Helmus

[1] https://github.com/jjhelmus/nmrglue
[2] https://github.com/jjhelmus/nmrglue/commit/ce094a3ae22a200ca8ee9ac44396684f42cfb263

Michael

unread,
Dec 3, 2013, 2:35:24 PM12/3/13
to nmrglue...@googlegroups.com
Hi Jonathan,

Thank you for getting back to me with that fix.  Sorry I have not written back sooner.  I just got back to the lab from Thanksgiving holiday...

I applied your fix, and now I'm getting the following error.  Is this the same as what you were getting with your fake data?


Traceback (most recent call last):
  File "./glueFit.py", line 69, in <module>
    fit = ng.analysis.linesh.fit_spectrum(*fit_args, error_flag=False, verb=True)
  File "/usr/lib/python2.7/site-packages/nmrglue/analysis/linesh.py", line 275, in fit_spectrum
    campbounds, wmask, error_flag, **kw)
  File "/usr/lib/python2.7/site-packages/nmrglue/analysis/linesh.py", line 519, in fit_NDregion
    r = f_NDregion(region, ls_classes, p0, p_bounds, n_peaks, wmask, **kw)
  File "/usr/lib/python2.7/site-packages/nmrglue/analysis/linesh.py", line 806, in f_NDregion
    p_best = leastsqbound(err_NDregion, p0, bounds=p_bounds, args=args, **kw)
  File "/usr/lib/python2.7/site-packages/nmrglue/analysis/leastsqbound.py", line 253, in leastsqbound
    m = _check_func('leastsq', 'func', func, x0, args, n)[0]
  File "/usr/lib64/python2.7/site-packages/scipy/optimize/minpack.py", line 13, in _check_func
    res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
  File "/usr/lib/python2.7/site-packages/nmrglue/analysis/linesh.py", line 770, in err_NDregion
    sim_region = s_NDregion(list(p), shape, ls_classes, n_peaks)
  File "/usr/lib/python2.7/site-packages/nmrglue/analysis/linesh.py", line 723, in s_NDregion
    r = s_single_NDregion([A] + curr_p, shape, ls_classes)
  File "/usr/lib/python2.7/site-packages/nmrglue/analysis/linesh.py", line 762, in s_single_NDregion
    r = np.kron(r, ls)   # vector direct product flattened
  File "/usr/lib64/python2.7/site-packages/numpy/lib/shape_base.py", line 756, in kron
    result = concatenate(result, axis=axis)
ValueError: concatenation of zero-length sequences is impossible

Sorry to keep spamming the group list.  I am very keen to get this fitting to work.  Thanks again for all your help.

Cheers,
Mike

Jonathan Helmus

unread,
Dec 4, 2013, 1:21:32 PM12/4/13
to nmrglue...@googlegroups.com
Michael,

    It looks likes the length of one of the simulated dimensions becomes zero during the fitting.  Make sure the values in your box_width parameter are not zero and none of the peaks are close to the edge of the spectrum.  If you have any 9999's in the NMRPipe peak list you may need to prune them or give better values.  If you can attach or upload the script and additional files somewhere it would be easier to debug this further. 

    - Jonathan Helmus
Reply all
Reply to author
Forward
0 new messages