Error using flux sampling

228 views
Skip to first unread message

Pablo Teixeira

unread,
Feb 25, 2018, 12:38:32 PM2/25/18
to cobra pie
Hello,

I'm trying to use COBRApy to get samples of steady-state fluxes for a metabolic model. I've been using the most simple model I could come up with to try to understand how this works, so I've decided to reproduce the flux split found on figure 1A of this article: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1304643/ (I'm attaching it here as well).

The code I'm using is as follows:

from cobra import Model, Reaction, Metabolite
model=Model('flux_split')

reaction1 = Reaction('V1')
reaction2 = Reaction('V2')
reaction3 = Reaction('V3')

reaction1.lower_bound = 0
reaction2.lower_bound = 0
reaction3.lower_bound = 0

reaction1.upper_bound = 6
reaction2.upper_bound = 8
reaction3.upper_bound = 10

A=Metabolite('A')

reaction1.add_metabolites({A:-1})
reaction2.add_metabolites({A:-1})
reaction3.add_metabolites({A:1})

model.add_reactions([reaction1])
model.add_reactions([reaction2])
model.add_reactions([reaction3])

Then, I've tried using the sampling commands in COBRApy, however I don't get the results I expected (many of the sample points I get are not in the null space of my stoichiometric matrix, and plotting the results I don't see the solution space that I'm supposed to get). I've done:

from cobra.flux_analysis import sample
s = sample(model, 100)

And I've also tried doing:

from cobra.flux_analysis.sampling import OptGPSampler, ACHRSampler
optgp = OptGPSampler(model, processes=3, thinning=1000)
optgp_samples = optgp.sample(1000)
optgp_samples = pd.DataFrame(optgp_samples, columns=[r.id for r in model.reactions])

import numpy as np
b=np.asmatrix(optgp_samples)

I'm attaching the plot I get from this last one here, so you can see the samples I get don't make sense (they should be evenly distributed and in only one plane, and also they don't match the solution space described in the paper).

I'll appreciate any help you can give me. I assume the way I'm building my model is probably wrong, but I've tried to do this in the simplest way possible and I just can't see what's wrong.

Thank you,
Pablo
biophysj00043000F01_HT.jpg
plot.png

Christian Diener

unread,
Feb 26, 2018, 4:11:25 PM2/26/18
to cobra pie
Hi Pablo,

how you construct the model is completely correct. That particular model is pretty difficult for artificial centering since it gets stuck easily and due to the lower dimension there is a chance of choosing a center which makes the sampler walk exclusively along a line.

I will try to find a fix and add the model to the tests.

Cheers
Christian

Christian Diener

unread,
Feb 26, 2018, 4:26:32 PM2/26/18
to cobra pie
Okay, just to let you know I found the issue. This actually only occurs if the nullspace dimnsion of your problem is 2 or less which is not too common. I will add a fix. In the appendix the sample with the fix, which looks pretty much like the figure you includes (just rotated).

Cheers
Christian
optgp.png

Pablo Teixeira

unread,
Feb 26, 2018, 5:08:09 PM2/26/18
to cobra pie
Hi Christian,

Thank you so much! I think your plot looks exactly like what it's supposed to. Would you mind letting me know how I can fix it?

Thanks again,
Pablo

Christian Diener

unread,
Feb 27, 2018, 3:51:30 PM2/27/18
to cobra pie
Hi Pablo, I am completing my fix and will send a PR today. So it should be fixed in a new release soon. The major problem was that there were only two unique sampling directions in the problem setup and thus the mean of those would fall on a line between the two mich made sampling only walk along that line. I fixed that now by checking for that condition. The model actually turned out to have several other little numerical instabilities so it helped me to fix a variety of issues in the sampling code, so thanks for providing this model :)

Note that even with all of the mentioned changes the sampler still generates infeasible samples from time to time (|S*v| > 1e-6). Those are pretty close to feasible, but you can exclude them anyways by doing something like:

```Python
s = optgp.sample(100)
valid = optgp.validate(s) == "v"
s_valid = s[valid]
```

Alternatively you can set the `nproj` parameter to 1 when creating the `OptGPSampler` object. This will check feasibility on every iteration but will also make sampling pretty slow. Thus, I recommend the above solution since for your particular model the infeasible samples are pretty close to feasibility and only occur in less than 5% of all samples...

Cheers
Christian

Pablo Teixeira

unread,
Feb 27, 2018, 10:20:44 PM2/27/18
to cobra pie
Awesome, thank you Christian!

Pablo Teixeira

unread,
Mar 1, 2018, 2:39:19 PM3/1/18
to cobra pie
Hello Christian,

Sorry to bother you again. I tried using a bigger toy model, this time with 12 reactions and 7 metabolites. For one of the fluxes in particular, the samples I get are always fixed to the maximum value allowed, which should not be the case. I can email you my model if necessary. Is this a known issue with sampling in COBRApy?

Thanks a lot!

Best,
Pablo

Christian Diener

unread,
Mar 1, 2018, 8:21:05 PM3/1/18
to cobra pie
Yes, send it to me and I will take a look.

Thanks
Christian

Christian Diener

unread,
Mar 2, 2018, 1:31:44 PM3/2/18
to cobra pie
Ok thanks. Again a problem with the sampler :( But it should be fixed now in https://github.com/opencobra/cobrapy/pull/672. So basically there was a problem if you sampled a completely irreversible model since it would never include any minimizations in the search directions.

Cheers
Christian

Christian Diener

unread,
Mar 6, 2018, 12:28:53 PM3/6/18
to cobra pie
Ok, the fixes have been merged but it will probably take a little longer for a new release. You can test it installing the development version of cobrapy (for instance in a virtual environment) using:

```bash
pip install git+https://github.com/opencobra/cobrapy@devel
```

Cheers
Christian
Reply all
Reply to author
Forward
0 new messages