How to go from a protein FASTA file to a model in COBRApy

hakon dahle

Nov 17, 2017, 1:10:14 PM
to cobra pie
Hi

I have a newly sequenced genome that I want to use for FBA through COBRApy. What is the best way to go from a protein FASTA file to a model in COBRApy?

So far I've made a model at http://modelseed.org/my-models/ based on a genome uploaded to RAST. I've also installed mackinac on my computer. I see how I can use mackinac to build a COBRApy model from a genome in PATRIC, but can I use it to get my new genome into COBRApy? If not, are there alternative solutions to using mackinac?

Cheers
Hakonda

Joshua Lerman

Nov 17, 2017, 2:21:53 PM
to hakon dahle, cobra pie
Best,
Josh

--
You received this message because you are subscribed to the Google Groups "cobra pie" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cobra-pie+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

hakon dahle

Nov 17, 2017, 3:14:26 PM
to cobra pie
He, he... Thanks for the paper Josh, but it doesn't really answer my question.

Ali Ebrahim

Nov 17, 2017, 4:21:37 PM
to hakon dahle, cobra-pie
Josh's point is that there isn't a "one command" option to do this. A FASTA file doesn't have most of the information you need for a model. It takes a lot of inference to figure out which metabolic reactions occur from just a sequence, and then lots of curation on top of it to make it computationally sound.

Going from a FASTA file to a model is like going from a blueprint to a fully built house 🏠. It's done a lot, but it's not an easy step by step task.


Joshua Lerman

Nov 17, 2017, 5:12:27 PM
to Ali Ebrahim, hakon dahle, cobra-pie
I haven't used any of the automated tools that Hakonda mentioned. It might be easier these days than it was just a few years ago!

Best,
Josh

Ali Ebrahim

Nov 17, 2017, 8:13:42 PM
to Joshua Lerman, hakon dahle, cobra-pie
I'm very curious how far automatic "de novo" model generation has come. If you don't mind, Hakonda, I'd love to hear your opinion on how well these tools work.

I don't have any answers for you. I just chimed in to explain Josh's joke (explaining jokes makes them funnier, right?) and to warn that I think a manual process will likely still be needed.

hakon dahle

Nov 18, 2017, 6:02:10 AM
to cobra pie
Well, first I should say that I am totally new to FBA and at the moment I'm just trying to figure out the 'big picture'. I realize that there is a lot of refinement involved in FBA. On the other hand, I imagine that if one can at least make initial draft models automatically, that would be a good starting point. Through http://modelseed.org I was able to make a model and even to run an FBA analysis on it. So my question is really how I can get my model from modelseed.org into COBRApy for further refinement and more advanced analyses. There is an option to download the model from my workspace on modelseed.org, and I can choose among several formats (e.g. JSON). However, I've not been able to load any of these files into COBRApy (every attempt gives me an error message).

Best
Håkon

Nikolaus Sonnenschein

Nov 18, 2017, 12:49:21 PM
to hakon dahle, cobra pie
Hi Hakon, 

Check out https://www.ncbi.nlm.nih.gov/m/pubmed/28379466/; it provides exactly what you need.

Best, 

Niko




hakon dahle

Nov 18, 2017, 2:53:00 PM
to cobra pie
Thanks Nikolaus.

I've downloaded mackinac, but I can't find any command that gives me the model stored in my ModelSEED workspace under the id 'test2' (http://modelseed.org/my-models/).

For example, I've tried 'create_cobra_model_from_modelseed_model', but that doesn't work (see below).

#########
mackinac.create_cobra_model_from_modelseed_model("test2")


Traceback (most recent call last):
  File "/Users/nimhd/COPY_folder/COPY/lib/python3.5/site-packages/mackinac/modelseed.py", line 233, in get_modelseed_model_data
    return ms_client.call('get_model', {'model': reference, 'to': 1})
  File "/Users/nimhd/COPY_folder/COPY/lib/python3.5/site-packages/mackinac/SeedClient.py", line 310, in call
    raise ServerError(**err['error'])
mackinac.SeedClient.ServerError: Object not found!

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/nimhd/COPY_folder/COPY/lib/python3.5/site-packages/mackinac/modelseed.py", line 576, in create_cobra_model_from_modelseed_model
    data = get_modelseed_model_data(model_id)
  File "/Users/nimhd/COPY_folder/COPY/lib/python3.5/site-packages/mackinac/modelseed.py", line 235, in get_modelseed_model_data
    handle_server_error(e, [reference])
  File "/Users/nimhd/COPY_folder/COPY/lib/python3.5/site-packages/mackinac/SeedClient.py", line 212, in handle_server_error
    raise ObjectNotFoundError(msg, e.data)
mackinac.SeedClient.ObjectNotFoundError: An object was not found in workspace: "/hak...@patricbrc.org/modelseed/test2"

Mike Mundy

Nov 20, 2017, 9:53:38 AM
to cobra pie
For the workspace error, you can try a couple of things to confirm that the model exists in the workspace. You can get a summary list of models from the ModelSEED server with this command:

mackinac.list_modelseed_models(print_output=True)

Or you can get the details on all of the objects in your workspace from the Workspace server with this command:

mackinac.list_workspace_objects('/hak...@patricbrc.org/modelseed', print_output=True)

This will show you all of the objects in your workspace. You should see a "test2" object of type "modelfolder" if the model exists. It's also possible that something changed on the ModelSEED server and I need to update mackinac.
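If the server route keeps failing, a possible fallback is the file downloaded from the ModelSEED web page. The sketch below picks a COBRApy reader from the file extension. The helper names and the file name in the usage note are hypothetical; the `cobra.io` functions are the standard COBRApy readers. It assumes an SBML export is among the download formats offered, since that is the format COBRApy is most likely to read directly.

```python
from pathlib import Path

# File extension -> name of the cobra.io function that reads that format.
LOADERS = {
    ".xml": "read_sbml_model",
    ".sbml": "read_sbml_model",
    ".json": "load_json_model",
    ".mat": "load_matlab_model",
}


def loader_name_for(path):
    """Return the name of the cobra.io reader matching the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in LOADERS:
        raise ValueError("no COBRApy reader known for '%s'" % suffix)
    return LOADERS[suffix]


def load_downloaded_model(path):
    """Load a downloaded model file with the matching cobra.io reader."""
    import cobra.io  # imported here so loader_name_for() works without cobra

    return getattr(cobra.io, loader_name_for(path))(str(path))
```

For example, `model = load_downloaded_model("test2.xml")` would go through `cobra.io.read_sbml_model`. Note that `load_json_model` expects COBRApy's own JSON layout, so a JSON file written by another system may well be rejected even though the extension matches.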

On the big picture of automated reconstruction, there are a couple of different methods. First, ModelSEED / PATRIC / KBase all use the same method under the covers. A draft reconstruction is created by matching the functional roles in the annotated features of a genome to a database of known functional roles. The database also links functional roles to reactions. So if a functional role in a feature from the organism's genome matches a functional role in the database, then the linked reactions are added to the draft model. Typically the matching process does not create a model that produces biomass, so you need to run gap filling to get a working model. The key point here is that you must annotate the organism's genome with the same system that was used to create the database, because the matching is simply string matching of the functional role. With ModelSEED / PATRIC / KBase the annotation must be done by RAST.
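To make the string-matching point concrete, here is a toy sketch of the idea. It is not the ModelSEED code: the role table, reaction IDs, and feature IDs are all made up for illustration.

```python
# Toy role -> reactions table standing in for the reconstruction database.
# Role names and reaction IDs are invented for this example.
ROLE_TO_REACTIONS = {
    "Hexokinase (EC 2.7.1.1)": ["rxn_A"],
    "Pyruvate kinase (EC 2.7.1.40)": ["rxn_B", "rxn_C"],
}


def draft_reactions(annotated_features, role_to_reactions):
    """Collect reactions whose functional role string matches exactly."""
    reactions = set()
    unmatched = []
    for feature_id, role in annotated_features.items():
        if role in role_to_reactions:
            reactions.update(role_to_reactions[role])
        else:
            unmatched.append(feature_id)
    return sorted(reactions), unmatched


features = {
    "peg.1": "Hexokinase (EC 2.7.1.1)",
    "peg.2": "pyruvate kinase",  # same enzyme, different wording: no match
}
reactions, unmatched = draft_reactions(features, ROLE_TO_REACTIONS)
print(reactions)  # ['rxn_A']
print(unmatched)  # ['peg.2']
```

The second feature is dropped only because its role is worded differently, which is why an annotation from a different system tends to give a much thinner draft model.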

Second, carveme (https://github.com/cdanielmachado/carveme) creates a draft reconstruction by matching protein sequences from the organism to a database of known proteins that are linked to reactions. Again, the draft reconstruction typically needs to be gap filled to produce biomass. The advantage with carveme is that it doesn't rely on the annotation.

In both methods, which reactions are put in the draft reconstruction depends on the underlying database, which is based on already known links between genes, proteins, and reactions.

You'll probably still need to make manual updates to the model. For example, the AGORA models of human gut bacteria (https://vmh.uni.lu/#microbes/search) were built from ModelSEED and then manually curated for the conditions in the human gut.

And also keep in mind that different systems use different databases. ModelSEED uses its own biochemistry, carveme uses BiGG biochemistry, and AGORA uses the Virtual Metabolic Human biochemistry (which is close to BiGG).

Rodrigo Colpo

Nov 24, 2017, 1:35:57 PM
to cobra pie
Hi Mike,

I'm trying to do something very similar to what Hakonda is asking about. I couldn't create a model with the command "reconstruct_modelseed_model". So I created a model using the PATRIC webpage and stored it in a folder named "modelseed". Then, with the command "mackinac.create_cobra_model_from_modelseed_model", I imported the model into COBRApy. However, I want to work with this model in the COBRA Toolbox, so I removed the dot "." that appeared as the first character of the model ID and exported it as a ".mat" file. Then I ran the fastGapFill function, but it didn't work, probably because of a naming incompatibility: the COBRA Toolbox expects metabolite names to match the expression ^[^\[]*\[[a-z]\]$, but the metabolite names after exporting look like cpd00443_c.

Is there something I can do to be able to work with the model in the COBRA Toolbox?

Kind regards,
Rodrigo.

Mike Mundy

Nov 27, 2017, 12:43:41 PM
to cobra pie
You can change the format of the IDs when creating the cobra model with the id_type parameter to create_cobra_model_from_modelseed_model(). For example:

model = mackinac.create_cobra_model_from_modelseed_model('226186.12', id_type='bigg')

In the returned model, the metabolite and reaction IDs will use the [c] suffix.
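If the model has already been built with the underscore-style IDs, the same renaming can be approximated on the ID strings themselves. This is a minimal sketch, assuming every ID ends in an underscore plus a single-letter compartment code; `to_bracket_id` is a hypothetical helper, not part of mackinac or COBRApy.

```python
import re

# Matches an ID that ends in an underscore plus a one-letter compartment code.
SUFFIX = re.compile(r"^(?P<base>.+)_(?P<comp>[a-z])$")


def to_bracket_id(met_id):
    """Turn 'cpd00443_c' into 'cpd00443[c]'; leave other IDs unchanged."""
    match = SUFFIX.match(met_id)
    if match is None:
        return met_id
    return "%s[%s]" % (match.group("base"), match.group("comp"))


print(to_bracket_id("cpd00443_c"))  # cpd00443[c]
```

The result matches the ^[^\[]*\[[a-z]\]$ pattern mentioned above. IDs with a different suffix style (for example a compartment code followed by a digit) are returned unchanged and would need their own rule.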

Moritz Beber

Nov 28, 2017, 1:17:59 PM
to cobr...@googlegroups.com

Hello Håkon,

I cannot offer you an opinion on how to best construct a model automatically but I would like to draw your attention to a tool we're working on that will allow you to quality control your model as you make manual adjustments to it: https://memote.readthedocs.io/en/latest/

All the best for your effort,

Moritz

Christian Diener

Dec 4, 2017, 10:24:47 AM
to cobra pie
For the sake of completeness: there is also Daniel Machado's CarveME (https://github.com/cdanielmachado/carveme).

Mike Mundy

Dec 5, 2017, 9:39:50 AM
to cobra pie
Rodrigo,

I've been experimenting with running fastGapFill from the COBRA Toolbox too. It looks like fastGapFill expects a universal database that uses KEGG IDs. I've tried creating a universal database file and a dictionary file with ModelSEED reactions, but the createUniversalReactionModel2 function does lots of reformatting and parsing that assumes KEGG IDs and a particular format for the reaction definition strings. I have a few more ideas to try, but I'm not sure that using a non-KEGG universal database will work.

Mike