A request from SBOL Industrial


Jake Beal

May 3, 2022, 5:09:05 PM
to SBOL Developers
Hi, folks:

The SBOL Industrial Consortium is interested in getting some help with a need. 

Some companies in the consortium need to communicate about combinatorial libraries of site variants. These are sequences of amino acids (or sometimes nucleic acids) with large degrees of variation in individual residues.  An example might be: "here is a 20 amino-acid sequence, but replace the 14th with any other amino acid, the 12th with one of these three acids, and do these insertions and deletions at the 17th and 20th sites".

The questions posed are:
 - What are some best practices for expressing such things in SBOL?
 - Are there tools, such as Excel-to-SBOL, that can be readily adapted to address this pain point?

Thanks,
-Jake

Chris Myers

May 3, 2022, 5:23:54 PM
to sbol...@googlegroups.com
Hi Jake,

This sounds like a combinatorial derivation of a protein sequence.  In theory, you could construct this with SBOLCanvas, but I just tried and we do not currently support combinatorial derivations for proteins, though that may not be difficult to add.

It could also possibly be done with Excel-to-SBOL, but I think it may need some special-purpose code.  Jet: thoughts?

Thanks,
Chris

Gonzalo Vidal

May 3, 2022, 7:41:49 PM
to sbol...@googlegroups.com
Hi Jake,

I would solve this problem using pySBOL. First, define the original sequence as a string. Define the changes using lists and, in a for loop, apply them to the string. Then make an SBOL component for each resulting string inside the loop, giving each a sensible name and description built with f-strings. Here you can use the higher-level constructors to create the sequence in the same step. At the end, add all of the components to a collection inside a document. Finally, check and validate the document using sbol-utilities functions.
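
As a rough illustration of the first steps of that approach, here is a minimal sketch in plain Python. It is not a tested pySBOL workflow: the template sequence, the site choices, and the names `enumerate_variants` and `site_options` are all hypothetical, and only substitutions are handled (no insertions or deletions). Creating the SBOL components, adding them to a collection, and validating the document are left out, because the exact library calls are not given in this thread.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def enumerate_variants(template, site_options):
    """Yield (name, sequence) for every combination of per-site substitutions.

    site_options maps a 1-based position to the residues allowed at that site.
    """
    positions = sorted(site_options)
    for choice in product(*(site_options[p] for p in positions)):
        residues = list(template)
        for pos, residue in zip(positions, choice):
            residues[pos - 1] = residue
        # Name each variant after its substitutions, e.g. "variant_I12A_F14A"
        label = "_".join(
            f"{template[p - 1]}{p}{r}" for p, r in zip(positions, choice)
        )
        yield f"variant_{label}", "".join(residues)


# Hypothetical 20-residue template and variation sites (assumptions for illustration)
template = "MKTAYIAKQRQISFVKSHFS"
site_options = {
    14: [aa for aa in AMINO_ACIDS if aa != template[13]],  # any other amino acid
    12: ["A", "G", "S"],  # one of three chosen residues
}

variants = dict(enumerate_variants(template, site_options))
```

Each `(name, sequence)` pair could then be turned into one SBOL component. Note that the library grows multiplicatively with the number of variable sites, which is one reason to describe the library compactly rather than listing every member.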

We can discuss implementation details.

Kind regards,

Bryan Bartley

May 4, 2022, 12:56:00 PM
to sbol...@googlegroups.com
Hi Jake,

I think the Sequence Ontology contains terms to describe most of these sequence variants, e.g. amino acid substitution, deletion, insertion, etc.  I think it ought to be possible to use these SO terms in conjunction with a CombinatorialDerivation to specify the use cases described.

Bryan

Jake Beal

May 4, 2022, 1:02:25 PM
to SBOL Developers
Thank you all --- I agree that the CombinatorialDerivations ought to be able to be pushed in this direction, and I think it would be really cool if we could figure out how to do this with Excel-to-SBOL or another biologist-friendly interface.

Thanks,
-Jake

Nicholas Roehner

May 4, 2022, 2:53:24 PM
to <sbol-dev@googlegroups.com>
Hi Jake, Chris, and Bryan,

Another set of applicable tools are the Densmore lab's Constellation software and the associated GOLDBAR formalism that I worked on. The attached text file contains a GOLDBAR specification similar to what Jake describes and the JSON file defines the collections of parts (in this case amino acids) referenced in the GOLDBAR spec. If you enter these at https://constellationcad.org/, it produces the SBOL (version 2) for the equivalent CombinatorialDerivation (see the attached XML file).

There are a few things that would likely need to be tweaked (currently Constellation gives everything the DNA type, supports a subset of commonly used roles, and does not use SBOL 3), but it is close in terms of backend functionality. The other main difference that I can see between Jake's description and the GOLDBAR specification is that the GOLDBAR specification does not explicitly describe changes with respect to a parent/template sequence, just combinatorial structure. The CombinatorialDerivation generated from the GOLDBAR does include a template ComponentDefinition, but it is abstract (its sub-ComponentDefinitions lack sequences). In this case, I chose to specify the changes in Jake's description as options alongside the original design, but I did not have to do so. Depending on how desirable it is to represent or compute a combinatorial library in terms of changes to a parent/template design or another library, additional changes would need to be made, but GOLDBAR provides a decent formal basis that could be built upon.

Here are the full demo steps:

1. Go to https://constellationcad.org/, then click "Playground" and "Type in using GOLDBAR Syntax".
2. Copy the contents of protein_20mer.txt into the "Specification" box and the contents of amino_acids.json into the "Categories" box.
3. Enter a design name (I chose "protein_20mer").
4. Click "Submit", then click the download icon ("Export design as SBOL") above and to the right of the "Graph" box.


protein_20mer.txt
amino_acids.json
constellation_protein_20mer_sbol.xml

Gonzalo Vidal

May 4, 2022, 3:38:31 PM
to sbol...@googlegroups.com
Hi all,

Is this a problem for which we need to provide/develop a tool or provide/develop a service?

I wonder which is the best format to interface with industry.

Kind regards,

--
Gonzalo Vidal
Synthetic Biologist

Jake Beal

May 6, 2022, 6:56:09 AM
to SBOL Developers
Hi, Nic:

I tried out your directions on Constellation and it told me "undefined".  Not quite sure where to take that...

Thanks,
-Jake

Jake Beal

May 6, 2022, 7:27:25 AM
to SBOL Developers
Hi, Gonzalo:

In general, you shouldn't expect an industrial organization to simply adopt an academic software tool into its workflows, much less an online service (a company can't give proprietary information about itself or its customers to a third party). The challenge is that if you make a business process dependent on a piece of software, you need to make sure that software can be maintained and improved in a predictable manner over time. Academic software developers usually cannot meet those requirements, particularly when it comes to the more complex and volatile aspects of software like user interfaces (students are not "butts in seats" programmers and shouldn't be asked to be).

The more likely patterns for transfer to industry are either:
1) A group develops a very focused algorithm or library that can be used as an isolated "function" wrapped by internal process software, or
2) a lightweight tool demonstrates a capability that the organization can readily fork or rebuild for adaptation to internal usage.

In the case of this particular problem, the current practice is typically something along the lines of ad-hoc spreadsheets. Presenting a spiffy new user interface is unlikely to be a winner, because Excel is already a pretty good user interface. The question is more how to make the contents of those spreadsheets (or something equivalently simple to use) less ad-hoc and more expressive.
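
To make "less ad-hoc" concrete, here is one possible shape for such a sheet, offered only as an illustration. The column names (`position`, `operation`, `residues`), the `*` wildcard, and the `SiteVariant` record are assumptions made for this sketch; they are not an existing Excel-to-SBOL template or anything agreed in this thread. The sheet is assumed to be exported as CSV and is read with the Python standard library.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteVariant:
    position: int     # 1-based residue position in the template
    operation: str    # "substitute", "insert", or "delete"
    residues: tuple   # allowed residues; ("*",) stands for "any other residue"


ALLOWED_OPERATIONS = {"substitute", "insert", "delete"}


def parse_site_variants(csv_text):
    """Parse rows of position, operation, residues into SiteVariant records."""
    variants = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        operation = row["operation"].strip().lower()
        if operation not in ALLOWED_OPERATIONS:
            raise ValueError(f"unknown operation: {operation!r}")
        # Alternatives within one cell are separated by semicolons
        residues = tuple(r for r in row["residues"].split(";") if r)
        variants.append(SiteVariant(int(row["position"]), operation, residues))
    return variants


# Hypothetical sheet contents, exported as CSV
sheet = """position,operation,residues
12,substitute,A;G;S
14,substitute,*
17,insert,GG
20,delete,
"""

site_variants = parse_site_variants(sheet)
```

One row per variable site keeps the sheet easy to fill in by hand, while the fixed vocabulary in the `operation` column is what makes it machine-checkable.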

Thanks,
-Jake

Nicholas Roehner

May 6, 2022, 10:59:03 AM
to <sbol-dev@googlegroups.com>, Densmore, Douglas Michael
Hi Jake,

I'm getting the same message when I try to run any design, so I think some part of the tool's deployment may have gone down between when I tried it and when you tried it. The source is also available on GitHub: https://github.com/hicsail/constellation-js

Thanks,

Nic

Todd Slaby

May 6, 2022, 11:57:01 AM
to sbol...@googlegroups.com
I agree that industrial adoption of SBOL has been hindered by reliability and disagreement about conventions.

IMO SBOL tried too quickly to advance into the realm of functional annotation and ontology and should have focused purely on manufacturability, a much more tractable problem with universal appeal.

I’ve largely not contributed to this community because it’s far too academic in its interests.

If anyone wants to focus on just SBOL for DNA manipulation and not data analytics, please reach out to me.

Representing libraries should have been a goal of SBOL a decade ago IMO. We were making libraries back then just like we are now. Nothing has changed about what the core business need for SBOL or an equivalent is. There is just a lack of organized will to address that business need IMO.

Take care and stay safe.
T

Jake Beal

May 6, 2022, 12:43:05 PM
to SBOL Developers
Todd: have you had a chance to look at the pending SEP 055 practices document on build planning?
You may be interested in that and the collaborative design processes we've been using in our work with iGEM on its new distribution. If so, please feel free to get in contact directly or start a new thread (I don't want to divert this thread too badly).

Thanks,
-Jake

Todd Slaby

May 6, 2022, 1:10:41 PM
to sbol...@googlegroups.com
I haven't yet. Let me just add that to my to-do list :)



--

Todd B. Slaby

Todd Slaby

May 6, 2022, 1:11:14 PM
to sbol...@googlegroups.com
FWIW, I think iGEM is way too functionally defined for Build, but I'll take a look.

Gonzalo Vidal

May 6, 2022, 3:39:07 PM
to sbol...@googlegroups.com
Hi all,

My perspective after talking with people in industry at SEED was that industry wants to use SBOL and related tools but doesn’t know how, and is willing to pay for an ad-hoc solution as a service. That is: a company has a need and a price, and it takes them to the SBOL Industrial Consortium; the consortium matches the need with a developer and takes a percentage; the developer solves the problem; the consortium verifies the solution with the company; and everyone wins. The solution can be code, an Excel template, or something else. The implementation can vary. Maybe this is already done or I’m dreaming, but I feel that something like this is needed. If it is not done, I have a clearer and more extensive implementation in mind that I’m willing to discuss.

Kind regards,
