Targeting Compounds for Biosynthesis using Data

everymanbio.com

Jul 22, 2020, 3:28:24 PM

to DIYbio

Hi friends. First and foremost, I want to wish everyone reading this my best wishes during these challenging times.

After spending 15 years working in tech as a software engineer + exec and an eye-opening psychedelic experience, I've decided to pivot my life in a new direction: documenting my journey to self-learn genetic engineering and develop a new product that helps mankind.

It's early days, but I have a working site and social media up here if you'd like to learn more: https://everymanbio.com/ + https://www.instagram.com/everymanbio/

One of the projects I'm using to concentrate and advance my learning is to engineer an organism and use it to produce a valuable compound en masse. The idea isn't new; a couple of youngsters recently utilized this technique to create a $200M company that produces hydrogen peroxide with yeast.

I'm enthralled by the idea of using something like Saccharomyces cerevisiae or E. coli to produce a highly valued compound at low cost and with reasonable biological efficiency.

My question is this: how can I use public data sources to identify potential targets for biosynthesis? Given that I can code, and that things like crawling, big data collection and analysis are well within my wheelhouse, I can't help but think of the value it might offer to me and the community in narrowing in on which compounds would be best suited for biosynthesis from a GMO'd organism.

But! I don't know what I don't know and I could use some insights into the following:

What features or attributes make a compound a feasible target for biosynthesis? What makes it completely infeasible?
What data sources can be used to search and assess compound targets? I'm looking for ways to quantify both high-demand/valuable compounds and compounds that are suited for biosynthesis.

I realize this is probably fairly wide in scope of a question, but I hope to at least get the conversation going and to be pointed in the right direction.

If you'd like to reach out to me directly to discuss the idea or explore a collaboration, please feel free to email me jo...@everymanbio.com

With much gratitude,

Josh McGinnis, Founder of EverymanBio

https://everymanbio.com

https://www.instagram.com/everymanbio/

Jonathan Cline

Jul 29, 2020, 4:26:11 PM

to DIYbio

https://www.theguardian.com/science/blog/2011/jun/21/scientists-make-lsd-from-microbes

Hint: it didn't work.

Dakota Hamill

Jul 29, 2020, 4:35:21 PM

to diy...@googlegroups.com

https://en.wikipedia.org/wiki/Raspberry_ketone

Raspberry ketone is sometimes used in perfumery, in cosmetics, and as a food additive to impart a fruity odor. It is one of the most expensive natural flavor components used in the food industry. The natural compound can cost as much as $20,000 per kg [5]. Synthetic raspberry ketone is cheaper, with estimates ranging from a couple of dollars per pound [9] to one fifth of the cost of the natural product.

Not beating a dead horse, but that is an example of a flavoring that's used a lot. It's the same chemical compound of course, but because it isn't made in a test tube by chemical synthesis, it can be marketed as "natural".

Plenty of high-value flavorings and fragrances I'm sure, and I know there are companies trying to make them with microbes as well.

There are also high-value natural products, cancer drugs, etc. that are made by some really crazy chemistry in nature but that can't be made synthetically at a practical scale due to stereochemistry. The more chiral centers, the more miserable the synthetic route.

Taxol/Paclitaxel is a good example, starting with Gary Strobel.

They used to legit harvest tree bark from the Pacific yew and extract it; talk about a slow-growing organism. Then Dr. Strobel found a fungal endophyte that produces the same compound. Now I believe it's done via plant-cell culture in reactors. https://en.wikipedia.org/wiki/Paclitaxel

An interesting journey following it though from bark to bioreactor.

There have been a few natural products I've stumbled across in reading but whose names escape me, that were given up on by pharma because they couldn't be produced at scale. One was from a marine tunicate I think...can't go rip up all the tunicates off this little island to make 1 compound. Sometimes they find those products are produced by endosymbiotic bacteria or fungi, but not always. Idk who is doing tunicate cell expression either, ha.

Another high-value biological is horseshoe crab blood: LAL, for ensuring medical devices are free from pyrogens. That's all off the top of my head. https://en.wikipedia.org/wiki/Limulus_amebocyte_lysate


--
-- You received this message because you are subscribed to the Google Groups DIYbio group. To post to this group, send email to diy...@googlegroups.com. To unsubscribe from this group, send email to diybio+un...@googlegroups.com. For more options, visit this group at https://groups.google.com/d/forum/diybio?hl=en
Learn more at www.diybio.org
---
You received this message because you are subscribed to the Google Groups "DIYbio" group.
To unsubscribe from this group and stop receiving emails from it, send an email to diybio+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/diybio/e5ba56c9-e575-46ce-9b55-caec693bc955o%40googlegroups.com.

S James Parsons Jr

Jul 29, 2020, 6:07:37 PM

to diy...@googlegroups.com

Hi Josh, let me start off by wishing you the best of luck.

I know the high-value compound you are looking for: it's synthetic “FBS”.

Read the literature, then identify each note of the symphony: proteins, salts, growth factors, hormones. After you’ve classified each part, start a series of controls removing notes in different combinations until you find the minimum number of notes needed to create FBS. Then you’ll know which proteins and growth factors you need to focus on.

This is similar to the process Shinya Yamanaka used to isolate the correct transcription factors to create iPSCs, and win himself the Nobel Prize.
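The "remove one note at a time" idea can be sketched in code as a greedy backward elimination. This is only an illustration under loud assumptions: the component names are placeholders, and `assay_passes` is a hypothetical stand-in for a real wet-lab readout (for example, whether cells still grow). Real assays are noisy and components interact, so a greedy pass like this finds a minimal set, not necessarily the smallest possible one.

```python
from typing import Callable, FrozenSet, List, Sequence


def minimal_component_set(
    components: Sequence[str],
    assay_passes: Callable[[FrozenSet[str]], bool],
) -> List[str]:
    """Greedy backward elimination.

    Try dropping each component in turn; keep the drop whenever the
    (hypothetical) assay still passes without it. Repeat until no
    single component can be removed.
    """
    current = list(components)
    changed = True
    while changed:
        changed = False
        for component in list(current):
            trial = [c for c in current if c != component]
            if assay_passes(frozenset(trial)):
                current = trial
                changed = True
    return current


# Placeholder component names, NOT a real FBS formulation.
ALL_COMPONENTS = ["factor_1", "factor_2", "factor_3", "factor_4"]


def toy_assay(mix: FrozenSet[str]) -> bool:
    # Hypothetical ground truth for the demo: only two factors matter.
    return {"factor_1", "factor_3"} <= mix


if __name__ == "__main__":
    print(minimal_component_set(ALL_COMPONENTS, toy_assay))
```

Each call to `assay_passes` corresponds to one experiment, so the number of calls is a rough proxy for bench time.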


Jonathan Cline

Jul 30, 2020, 11:05:43 AM

to DIYbio

The topic of valuable targets is a good one however given that it's
being asked by someone who is a "software engineer and former exec
looking to focus on synbio to help mankind" I would have to say the
most expeditious route would best be to leave expert biology to the
expert biologists who have individually studied the topic rigorously
for at least a decade via graduate and/or postdoc studies, since you
can always partner with an expert biologist (especially if they have
complementary skills compared to a technology exec), and instead focus
on the growing number of biosoftware areas which need innovation and
expansion, for example HealthKit, or protein folding gamification such
as Fold.it, or bioinformatics data science algorithms for finding a
new antibiotic.

--
## Jonathan Cline
## jcl...@ieee.org
## Mobile: +1-805-617-0223
########################

唐大綬

Jul 30, 2020, 9:17:25 PM

to diy...@googlegroups.com

On Thu, Jul 23, 2020 at 3:28 AM, everymanbio.com <jos...@mcginn.is> wrote:

--

Sean Sullivan

Aug 3, 2020, 4:09:31 PM

to DIYbio

I know you are interested in finding economically attractive targets to synthesize, so I wanted to provide links for a few resources that the government has published where they analyze what chemicals are most attractive to be produced biotechnologically:

Foundational report from 2004 about chemicals that could be made from biomass - https://www.nrel.gov/docs/fy04osti/35523.pdf and another focusing on what can be made out of lignin - https://www.pnnl.gov/main/publications/external/technical_reports/PNNL-16983.pdf

A good community that is very focused on this space - Biofuels Digest - https://www.biofuelsdigest.com/

The caveat to these is that these chemicals would need to be produced at scale, so maybe not the type of target that you need.

The margin on a biotech product tends to be proportional to how difficult it is to synthesize and inversely proportional to the volume that can be produced. Biological therapeutics are low volume/high margin, but there are a lot of regulations around this space. Something I haven't looked into much, but where I know the margins are good and the regulatory regime much softer, is the Flavors and Fragrances (F&F) industry, especially fragrances.
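Since the original question was about ranking targets with data, here is a minimal sketch of that rule of thumb as a scoring function. Everything in it is an assumption for illustration: the `Candidate` fields, the 1-10 difficulty scale, and the example entries are made-up placeholders, not market data.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Candidate:
    name: str
    synthesis_difficulty: float  # hypothetical unitless 1-10 scale
    annual_volume_tonnes: float  # placeholder figure, not real data


def margin_proxy(c: Candidate) -> float:
    """Rule-of-thumb proxy: margin ~ difficulty / volume."""
    if c.annual_volume_tonnes <= 0:
        raise ValueError("volume must be positive")
    return c.synthesis_difficulty / c.annual_volume_tonnes


def rank_candidates(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Highest margin proxy first."""
    return sorted(candidates, key=margin_proxy, reverse=True)


# Made-up entries purely to exercise the function.
EXAMPLES = [
    Candidate("bulk_chemical_x", 2.0, 100_000.0),
    Candidate("fragrance_y", 6.0, 50.0),
    Candidate("therapeutic_z", 9.0, 0.5),
]

if __name__ == "__main__":
    for c in rank_candidates(EXAMPLES):
        print(c.name, round(margin_proxy(c), 5))
```

A real version would want a regulatory-burden term as well, since that is what separates therapeutics from F&F in the paragraph above.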

One idea I had would be to produce the active ingredients in saffron, the most prominent of which is safranal - https://en.wikipedia.org/wiki/Safranal

Saffron is currently painstakingly harvested from wild-growing flowers and has very low yield which results in a very high price. Check out the following video - Why Saffron Is The World's Most Expensive Spice

You would need to figure out which subset of the compounds the saffron plant makes you would need to produce to recapitulate the flavor profile, but this can often be surprisingly few of the hundreds of compounds a plant makes. Researchers associated with Jay Keasling found they could make a delicious hoppy beer using no hops by having the brewing yeast make just two (I think) of the natural oils that the hop plant makes - https://www.nature.com/articles/s41467-018-03293-x

My (not current) understanding is that there's a bit of a hop shortage world-wide, could be an opportunity to recreate different hop flavors by having the yeast make their most important chemical components during the brewing process…

In terms of really cool software tools you could develop, the first big idea I came up with would be a single software package or web interface that made it very easy to collect data from disparate biological databases to answer the types of questions a bioengineer might have such as:

Upon seeing the name of a gene/protein that they haven't heard of before - "What does it do? Has it been characterized in vitro or in vivo?"

This could involve pulling information on the substrates that the protein acts on from KEGG, then cross-referencing the BRENDA enzyme database to see if there is any published activity data. For a mammalian protein you might also be very interested in knowing where the enzyme localizes in the cell, which could be extracted from UniProt…

Having a target molecule they want to biosynthesize - "What reactions are necessary to synthesize this metabolite in my chosen organism? What different homologous enzymes are known that I could express to catalyze these reactions? Have any been characterized?"
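The second question ("what reactions are necessary?") is at heart a graph search. Below is a minimal sketch, assuming a toy reaction table with one substrate and one product per reaction. The reaction IDs, metabolite names and enzyme names are all hypothetical placeholders; a real tool would populate the table from a database such as KEGG and would have to handle multi-substrate reactions and cofactors.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy table: reaction id -> (substrate, product, enzyme).
REACTIONS: Dict[str, Tuple[str, str, str]] = {
    "R1": ("precursor", "intermediate_1", "enzyme_a"),
    "R2": ("intermediate_1", "intermediate_2", "enzyme_b"),
    "R3": ("intermediate_2", "target", "enzyme_c"),
    "R4": ("precursor", "side_product", "enzyme_d"),
    "R5": ("intermediate_1", "target", "enzyme_e"),
}


def shortest_pathway(
    reactions: Dict[str, Tuple[str, str, str]],
    native_metabolites: Set[str],
    target: str,
) -> Optional[List[str]]:
    """Breadth-first search from metabolites the host already makes.

    Returns the reaction ids of one shortest route to `target`,
    an empty list if the host already makes it, or None if no
    route exists in the table.
    """
    queue = deque((m, []) for m in sorted(native_metabolites))
    seen = set(native_metabolites)
    while queue:
        metabolite, path = queue.popleft()
        if metabolite == target:
            return path
        for rid, (substrate, product, _enzyme) in sorted(reactions.items()):
            if substrate == metabolite and product not in seen:
                seen.add(product)
                queue.append((product, path + [rid]))
    return None


if __name__ == "__main__":
    route = shortest_pathway(REACTIONS, {"precursor"}, "target")
    print(route)
    print([REACTIONS[r][2] for r in route])  # enzymes to express
```

The enzymes along the returned route are the ones you would then look up for homologs and published characterization data.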

Also an amazing thing would be enabling better natural language searching of the primary literature. Right now you have to know what keywords you are looking for and go through lots of papers to find relevant information. Being able to query something like, "What gene expression changes happen when yeast is placed in a hyperosmotic solution?" and get papers that focus on that question would be so helpful.

everymanbio.com

Aug 4, 2020, 11:58:50 PM

to DIYbio

To those so gracious enough to invest your valuable time in contributing your thoughts towards my original post, I extend my sincerest gratitude.

I'll add my replies to a few of the responses below and then leave the thread as it sits, unless there is a direct question or response that warrants my return. The responses have given me ample information to sit with and explore.

> I would have to say the most expeditious route would best be to leave expert biology to the expert biologists who have individually studied the topic rigorously for at least a decade via graduate and/or postdoc studies

I'd say your reply was the most practical and would likely be the same advice I'd give to someone else in my shoes. This approach would certainly optimize my time and skills for success (success being defined as making some sort of valuable contribution or progress towards the end objective of creating a synthetically engineered compound of high value).

It does not, however, address other personal goals and needs in the venture that correlate to my desire for meaningful work. The details are personal in nature, but I'll just say that my definition of success is not contingent upon some end state, but rather on embarking on the journey itself.

Thus far, the journey of self-learning synthetic biology and reaching towards an arguably unobtainable goal has already yielded experiences, connections and knowledge that I couldn't have imagined and provided a level of satisfaction in my life that I had yet to obtain from my many years grinding it out in business + tech.

This conversation reminds me of something I share with folks from time-to-time.

I grew up playing a brass instrument, and for many years I really struggled to hit the high C. My band director once told me that instead of aiming for the high C, I should aim higher: aim for the E whenever I practiced. Lo and behold, after a few tries, I broke past the barrier, and not only did hitting the high C become a non-issue, I found myself playing in the upper registers with relative ease.

I have embraced this idea of aiming higher than what I think I'm capable of in life and it has yielded great results. I can't say that I've always hit the upper register in all attempts, but aiming for something beyond my reach has made me a better person.

> Hi Josh, let me start off by wishing you the best of luck. I know the high-value compound you are looking for, it's synthetic “FBS”

I did a little digging into this and I think this is simply beyond the capabilities of any one man, certainly this one! There are many fine folks working on this, some of whom have made progress, but the sheer diversity of uses for FBS across a variety of cell types makes a wholesale replacement infeasible for what I'm after. That said, there was enormous personal value in researching FBS, its uses, challenges, etc. - I learned a ton. So thanks for the recommendation and, of course, the well-wishes.

--

Sean, you've presented several interesting ideas and resources and you're onto something on the software side that perhaps we'll need to dig into further when we next chat.

Re: safranal

It's a great idea. I did find some relatively recent work on safranal biosynthesis, in particular the identification of genes and metabolic pathways for the production of the various apocarotenoid compounds found in saffron (crocins, crocetin, picrocrocin, safranal, etc.).

As recently as June, this paper demonstrated how CRISPR/Cas9 was used to transform yeast to produce crocetin:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7339864/

This could be a great jumping off point for further research.

Your software ideas are much aligned with where my original question was headed, and I could ask a million follow-on questions, so I'll need to dig into these a bit more. The idea of creating a bioengineering platform that provides a library of components and a tool for linking them together is quite intriguing. That said, I believe there are already some biotech startups working to develop ML for doing some variation of this at scale. Your idea is more of a self-serve tool, whereas I feel like the investment community is more interested in large-scale automated approaches to the same problem space, and I can see why.

> Also an amazing thing would be enabling better natural language searching of the primary literature. Right now you have to know what keywords you are looking for and go through lots of papers to find relevant information.

There are really two parts to this idea in my mind. One is the NLP component, which is challenging given the varied and unstructured way in which research is published. Some papers also leave out or obfuscate important data, which makes it harder still. I'm also quite surprised that there isn't already some sort of structured way in which research data is shared and published back to the broader community.

> Being able to query something like, "What gene expression changes happen when yeast is placed in a hyperosmotic solution?" and get papers that focus on that question would be so helpful.

Now this is actually a more tractable problem that could be solved with existing search engine technology. I suspect the challenge here will be in gaining access to the research itself, as most of it is behind a paywall. Presuming access to the research could be addressed, it would then be a matter of identifying metadata or annotating the research to augment the search index. Once that is done, existing search engines are already adept at responding to question-based queries.
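To make the "annotate, then index" idea concrete, here is a minimal sketch of an inverted index built over titles plus annotation fields. It is plain keyword matching, not what a production search engine does, and the two documents, their IDs and their annotation strings are invented placeholders, not real papers.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

STOPWORDS = {"a", "an", "the", "is", "in", "of", "to", "what", "when"}


def tokenize(text: str) -> List[str]:
    """Lowercase, split on non-alphanumerics, drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_index(docs: Dict[str, Dict[str, str]]) -> Dict[str, Set[str]]:
    """Map each token to the ids of documents whose fields contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, fields in docs.items():
        for text in fields.values():
            for token in tokenize(text):
                index[token].add(doc_id)
    return index


def search(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Rank documents by how many distinct query tokens they match."""
    scores: Dict[str, int] = defaultdict(int)
    for token in set(tokenize(query)):
        for doc_id in index.get(token, ()):
            scores[doc_id] += 1
    return sorted(scores, key=lambda d: (-scores[d], d))


# Invented placeholder documents with hypothetical annotations.
DOCS = {
    "paper_1": {
        "title": "Yeast transcriptional response to osmotic stress",
        "annotations": "organism:yeast condition:hyperosmotic "
                       "measurement:gene expression",
    },
    "paper_2": {
        "title": "Bacterial growth in rich media",
        "annotations": "organism:bacteria condition:rich media "
                       "measurement:growth rate",
    },
}

if __name__ == "__main__":
    idx = build_index(DOCS)
    question = ("What gene expression changes happen when yeast "
                "is placed in a hyperosmotic solution?")
    print(search(idx, question))
```

Note that "hyperosmotic" never appears in the first title; the match comes from the annotation field, which is the point of augmenting the index with metadata.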

--

OK folks. I'll leave it at that. Thank you again to all who responded and expressed support. The journey continues and I look forward to sharing updates on the site, social and my YouTube channel (under construction) in the coming weeks.
