On protein expression in eukaryotes


Sebastian Cocioba

Mar 11, 2014, 9:37:34 PM
to diybio
Hello everyone!

Hope all is well in your parts of the world. I had a question that one of the many bright minds on the list may have an answer to. I've been thinking about what's important, or rather most important, in the successful expression of eukaryotic proteins. In what order would you rank the following (10 = most important, 1 = least important) when it comes to optimal protein expression:

Promoter strength
Transcription factors
Codon optimization
mRNA stability (UTRs, polyA, etc)
Kozak sequence
Terminator efficiency

The idea for this question came to me in the shower this morning and I've been stumped since. If you print a ton of functional mRNA via a strong constitutive promoter, but it's not very stable and thus degrades quickly, how would it compare in terms of total protein content to a weaker promoter with a nicely stabilized (native-like) mRNA?
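One way to frame the comparison is a minimal two-stage model (mRNA is made and decays, protein is made from mRNA and decays). This is only a sketch under simplifying assumptions: constant rates, no feedback, no ribosome limits. The function name and all the numbers are made up for illustration, not measured values.

```python
import math

def steady_state_protein(k_tx, mrna_half_life, k_tl, protein_half_life):
    """Steady-state protein level in a two-stage (mRNA -> protein) model.

    k_tx: mRNAs made per minute (promoter strength)
    k_tl: proteins made per mRNA per minute
    Half-lives are in minutes. All inputs are illustrative assumptions.
    """
    d_m = math.log(2) / mrna_half_life      # mRNA decay rate constant
    d_p = math.log(2) / protein_half_life   # protein decay rate constant
    mrna = k_tx / d_m                       # steady-state mRNA copies
    return k_tl * mrna / d_p                # steady-state protein copies

# Strong promoter with a short-lived mRNA vs. a 10x weaker promoter
# with a 10x longer-lived mRNA (same translation rate, same protein).
strong_unstable = steady_state_protein(10.0, 5.0, 1.0, 600.0)
weak_stable = steady_state_protein(1.0, 50.0, 1.0, 600.0)
print(strong_unstable, weak_stable)
```

In this toy model only the product of transcription rate and mRNA half-life matters, so the two constructs come out the same. Real cells will deviate wherever the assumptions break down.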

I see super viral promoters as a quick and dirty way to increase protein content. In plants, for example, the cauliflower mosaic virus 35S promoter is the most widely used (especially since the final Monsanto patent expired last year or so) and is the cornerstone of basic transient and stable expression vectors. That's all fine and dandy, but no viral promoter-driven protein can match rubisco in terms of total protein by mass. Since it's the most important protein in plants, it's produced ad nauseam and maintained at a very high concentration. Attempts to replicate that kind of protein concentration have failed, and IIRC 10% is the current ceiling in transient plant-based protein expression. Simply overdriving transcription is not enough, but out of all the factors contributing to protein production, what percentage comes down to a strong transcription rate?

On the other hand, would a ton of stable mRNA floating around be detrimental to the system? It's hogging ribosomes, so to speak, and the whole cytosol gets flooded with a non-stop ticker-tape parade of recipes for making this one protein. Could that actually limit the production of other proteins by occupying more ribosomes than the rest, statistically speaking? It seems like we tend to underestimate just how large the cytosolic space is in comparison to a strand of RNA, but also how crowded it is when you take into account the cytoskeleton and its multi-lane superhighway of kinesin-bound proteins. Not sure how to see it. Kinda like the whole argument about spaceship battle scenes in Star Wars, especially those involving bobbing and weaving around asteroids, where in reality the average distance between any two bodies in our solar system alone is about 15-20km. Anything closer would begin to interact via gravity and accrete. Digression...
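Back on the ribosome-hogging question, here is a back-of-envelope sketch. It assumes a fixed ribosome pool that is limiting and shared in proportion to mRNA copy number, with every mRNA initiating equally well. Both assumptions are crude, and the function names and copy numbers are invented for illustration.

```python
def ribosome_share(transgene_mrna, native_mrna):
    """Fraction of a fixed ribosome pool sitting on transgene mRNA.

    Assumes ribosomes are split in proportion to mRNA copy number.
    """
    total = transgene_mrna + native_mrna
    if total <= 0:
        raise ValueError("need at least one mRNA in the cell")
    return transgene_mrna / total

def native_output_scale(transgene_mrna, native_mrna):
    """Relative output of all native proteins once the transgene is present."""
    return 1.0 - ribosome_share(transgene_mrna, native_mrna)

# Hypothetical cell: 9000 native mRNAs plus 1000 copies of the transgene mRNA.
print(ribosome_share(1000, 9000))       # share taken by the transgene
print(native_output_scale(1000, 9000))  # what is left for everything else
```

If ribosomes are not actually limiting, the penalty to native proteins would be smaller than this suggests.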

Is there some synergistic property that allows for speed greater than the individual effects combined? Assume we are talking about a simplified or minimal promoter just for argument's sake. No fancy 1000-base upstream enhancer sequence that magically folds onto RNAPII and does amazing things. :P

I'm on a hard promoter design kick but want to toy with UTRs and Kozaks to see what changes what. It's a long-term project with tons of work, but I think it's worth it, if only for personal discovery. I'll share my results if I get anything interesting, but until then, could someone spare their two cents on the matter? Thanks!



Sebastian S. Cocioba
CEO & Founder
New York Botanics, LLC
Plant Biotech R&D

Sebastian Cocioba

Mar 11, 2014, 9:45:44 PM
to diybio
I meant 20,000 km, not 20 km. Pardon the typos, my thumbs are dumb today.



Josiah Zayner

Mar 12, 2014, 2:27:22 PM
to diy...@googlegroups.com
You probably can't compare something like RuBisCo to other proteins. What I think you are talking about is how much of a protein can be in a cell. This is a different question than how much protein we can express.

This is governed by:
#1 the promoter & RNAP binding site
#2 the protein itself

Sure, things such as transcription factors and such play a role but let's assume everything like that is the same between RuBisCo and our protein of interest.

Why, how is there so much RuBisCo?
Proteins need to be able to fold properly, not have many off-pathway intermediates, not aggregate when folded, and not have a penchant for being marked for degradation. They need to not have many nonspecific interactions and to contribute positively to the fitness of the organism. I think once promoters and RNAP binding sites are the same, this is what it boils down to: how stable is the protein in the cell, and what does it interact with? This will keep them around. In protein accumulation, staying power is the most important.

People have generated >10% of cell mass as a single protein in bacteria and >10% in plants (http://www.plantcell.org/content/25/7/2429.full and http://www.plantphysiol.org/content/148/3/1212.abstract?ijkey=be8dbdf615bc1e3a8de904ba45a8319cb0810bd1&keytype2=tf_ipsecsha), similar to what RuBisCo is, maybe? (It is hard to find definitive numbers, but people who say 50% are crazy because that would mean there is more RuBisCo than lignin? But who knows.) So it seems to have been done.

In my knowledgeable but sometimes faulty opinion, it comes down to two things:
How fast can you make the protein?
But more specifically, how long can you keep the protein around?
You want protein to accumulate and not be degraded.
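To put rough numbers on "how fast" versus "how long", here is a minimal sketch assuming constant synthesis and first-order degradation, so that P(t) = (s/d)(1 - e^(-dt)) with d = ln 2 / half-life. The function name and the rates are illustrative assumptions, not data.

```python
import math

def protein_level(synthesis_rate, half_life, t):
    """Protein copies at time t, starting from zero.

    synthesis_rate: copies made per hour (assumed constant)
    half_life: protein half-life in hours
    """
    d = math.log(2) / half_life  # first-order degradation rate constant
    return synthesis_rate / d * (1.0 - math.exp(-d * t))

# Fast synthesis of a short-lived protein vs. 10x slower synthesis
# of a protein that lasts 100x longer, both after 1000 hours.
fast_unstable = protein_level(100.0, 1.0, 1000.0)
slow_stable = protein_level(10.0, 100.0, 1000.0)
print(fast_unstable, slow_stable)
```

With these made-up numbers the slowly made but long-lived protein ends up roughly ten times more abundant, which is the "staying power" argument in miniature.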

Though some people rave about codon optimization, I have never seen it significantly increase the quantity of overexpressed protein (mostly in bacteria, though). I think its applicability is in unique circumstances.

mRNA stability and terminator efficiency again probably have their place, but I don't think these would be major factors in the majority of cases. If you are talking about squeezing every last ounce of protein out, then yes, take everything possible into account, but the majority of the time I think you can focus on just promoter/RNAP and protein stability.


Josiah Zayner, Ph.D.
NASA Ames Research Center
http://DoItOurselfScience.blogspot.com

Mega [Andreas Stuermer]

Mar 13, 2014, 7:34:07 AM
to diy...@googlegroups.com
Josiah, so what you mean (or part of it) would be the N-rule then? 

Some first or second amino acids on the N-terminus make the protein unstable... But this may differ in plant/bacteria/fungi/mammals? 

Mega [Andreas Stuermer]

Mar 13, 2014, 7:40:50 AM
to diy...@googlegroups.com
What is always helpful: adding the N-terminus of a native protein to the N-terminus of your intended protein...

Expression signals, ribosome ramp, codon usage, stability... All enhanced if you choose the right protein... 

For example, add the first 7-15 amino acids of a very stable cytosolic protein to your protein. (Rubisco is a bad example because it has to be stable in the chloroplast rather than in the cytosolic environment.)

Josiah Zayner

Mar 13, 2014, 11:39:09 AM
to diy...@googlegroups.com
Do you know of any papers on this N-terminus phenomenon? I have never heard of it.



Koeng

Mar 13, 2014, 12:20:45 PM
to diy...@googlegroups.com
I have kinda heard of this "rule", but only in prokaryotes, since that sequence can influence how the RBS binds to the ribosome. I haven't heard of it as a rule for proteins, though. I only have experience with prokaryotic cells, so I don't know about eukaryotes.

Sebastian Cocioba

Mar 13, 2014, 1:41:26 PM
to diy...@googlegroups.com
You mean adding the 5' UTR of a "stable" cytosolic protein? It makes sense, since the cap and polyA tail are important to stability and protect against degradation. Although ubiquitous proteins and long-lived mRNA may be two different things. I was planning to do some digging in UniProt and stability (functional, not thermodynamic) papers to see if anything has been done on the matter. Just have to remember not to use both the 5' and 3' UTRs from the same cytosolic (nuclear-expressed) protein so it won't recombine. Last thing I want is to replace a tubulin subunit with my GFP lol. Worst case I'll slap in some changes and transform some plants with agro. I already have a few combinatorial experiments lined up, so a few more won't be too bad. Just really need some dwarf tobacco. Mine is still too big for my tiny lab and Arabidopsis still sucks. :P



Cory Tobin

Mar 13, 2014, 2:13:05 PM
to diybio
> You mean adding the 5' UTR of a "stable" cytosolic protein?

I think he meant the N-End rule http://en.wikipedia.org/wiki/N-end_rule
http://www.ncbi.nlm.nih.gov/pubmed/22524314

The amino acids at the N-terminus determine how long the protein will
exist before being degraded by the proteasome. There's way more to it
than that, but that's the general concept.
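As a rough illustration of the general concept, here is a toy lookup that classifies the exposed N-terminal residue. The residue sets are an assumed, simplified version of the classification usually quoted for S. cerevisiae, and they are deliberately incomplete. Other organisms differ, so treat the sets and the function name as placeholders to replace with values for your own target.

```python
# Assumed, simplified yeast (S. cerevisiae) classification; NOT universal.
# Residues left out of both sets are reported as "unclassified".
STABILIZING = set("MGASTV")
DESTABILIZING = set("RKLFWY")

def n_end_class(mature_protein_seq):
    """Classify a protein by its exposed N-terminal residue.

    mature_protein_seq: one-letter amino acid sequence AFTER any
    N-terminal processing (e.g. initiator Met or signal peptide removal).
    """
    if not mature_protein_seq:
        raise ValueError("empty sequence")
    first = mature_protein_seq[0].upper()
    if first in STABILIZING:
        return "stabilizing"
    if first in DESTABILIZING:
        return "destabilizing"
    return "unclassified"

# Made-up example sequences.
print(n_end_class("MSKGEELFTG"))
print(n_end_class("RSKGEELFTG"))
```

This only looks at the first residue. As noted, there is way more to it than that.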

-cory

Cathal Garvey

Mar 15, 2014, 5:19:07 AM
to diy...@googlegroups.com
As far as N-terminal amino rule, yes; it differs by kingdom, sometimes
even at lower branches in the tree. There are scripts online that
calculate relative stability but they all require you to select your target!

> Expression signals, ribosome ramp, codon usage, stability... All
> enhanced if you choose the right protein...

There might be cleaner ways to do ribosome ramp and codon usage.
PySplicer does both, DNA2.0's app does at least codon usage properly,
may also do Ribosome Ramp by now.

As far as more cryptic expression signals, not sure of any tools right
now, so in difficult cases the hacky way of stealing some N-term from
another protein may still be useful.

This does make me think that PySplicer needs a patch to treat the first
codon specially, to try and enhance the RBS without changing the codons,
if possible. Would require additional info for each species' codon
table, but I could always implement it as "if there's RBS info in the
species table, use that, otherwise skip this optimisation".
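A minimal sketch of that "use it if present, otherwise skip" idea. This is not PySplicer's actual code or API. The `start_region_codons` key, the function name, and the example preference are all hypothetical, and it reads "without changing the codons" as "without changing the encoded amino acids".

```python
# Standard genetic code, built in the usual TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def apply_start_region_hint(cds, species_table, window=5):
    """Swap in preferred synonymous codons just after the start codon.

    species_table: dict that MAY contain a hypothetical key
    "start_region_codons" mapping amino acid -> preferred codon.
    If the key is missing or empty, the CDS is returned unchanged.
    """
    preferred = species_table.get("start_region_codons")
    if not preferred:
        return cds  # no RBS info for this species: skip the optimisation
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    # Leave the start codon (index 0) alone; touch only the next `window` codons.
    for idx in range(1, min(window + 1, len(codons))):
        aa = CODON_TABLE.get(codons[idx])
        swap = preferred.get(aa)
        # Only accept the swap if it really encodes the same amino acid.
        if swap and CODON_TABLE.get(swap) == aa:
            codons[idx] = swap
    return "".join(codons)

# Hypothetical species table preferring AAA for lysine near the start.
print(apply_start_region_hint("ATGAAGAAGTAA", {"start_region_codons": {"K": "AAA"}}))
print(apply_start_region_hint("ATGAAGAAGTAA", {}))
```

Which codons actually help initiation is species-specific, so the table contents would need to come from real data per organism.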
--
Please help support my crowdfunding campaign, IndieBB: Currently at
73.2% of funding goal, with 0 time left:
http://igg.me/at/yourfirstgmo/x/4252296
T: @onetruecathal, @IndieBBDNA
P: +3538763663185
W: http://indiebiotech.com

Patrik D'haeseleer

May 7, 2014, 5:52:15 PM
to diy...@googlegroups.com, cathal...@cathalgarvey.me
Hey Cathal - to what extent do the various tricks built into pysplicer hold for expression in a yeast like S. cerevisiae, rather than bacteria? 

Some of the basic principles, such as matching codon frequencies and avoiding hairpins I would assume to be universal. But is the "onramp" phenomenon something that has been observed in yeast as well? How about the association of NGG codons with failed transcription?

Patrik

Cathal Garvey

May 8, 2014, 4:57:41 AM
to diy...@googlegroups.com
Hey Patrik;
WRT On-ramp, that was down to the physics of translation initiation and
ribosomal collision, so although at-last-glance there was little study
outside bacteria, I'd expect it to hold.

As far as NGG codons, I have no idea; I'm not sure the mechanism in
bacteria is well understood yet, and I don't know whether it's been
observed elsewhere yet.
--
T: @onetruecathal, @IndieBBDNA
P: +353876363185
W: http://indiebiotech.com