Selection projects for DNA assembly and one pot assembly reactions

Bryan Bishop

Nov 18, 2016, 4:24:50 PM

to enzymaticsynthesis, Bryan Bishop

Suppose we were stuck with modern DNA synthesis technology: giant arrays of millions of elements, each built up into a 100 to 200 bp oligonucleotide. The trouble with this approach is validation (DNA sequencing, among other techniques) and DNA assembly from multiple fragments.

To my knowledge, we still do not have an all-in-one-pot DNA assembly reaction for whole genomes and super long DNA molecules. There are a number of one-pot DNA assembly reactions out there, but nothing that seems to work for "assembly of 1 billion bp from 10 million or more small 100 bp fragments".
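To put rough numbers on that scale, here is a minimal Python sketch. It assumes fragments of equal length that tile the target with a fixed overlap, and it assumes every junction succeeds independently with the same probability; both are simplifications, and the function names are made up for illustration.

```python
def fragments_needed(target_bp, fragment_bp, overlap_bp):
    """Fragments needed to tile target_bp when neighbors share overlap_bp."""
    stride = fragment_bp - overlap_bp  # new bases contributed by each extra fragment
    if stride <= 0:
        raise ValueError("overlap must be shorter than the fragment")
    if target_bp <= fragment_bp:
        return 1
    remaining = target_bp - fragment_bp
    return 1 + -(-remaining // stride)  # integer ceiling division


def required_junction_fidelity(n_fragments, overall_success):
    """Per-junction success probability p such that p**(n-1) == overall_success.

    Assumes junctions are independent and equally reliable (an idealization).
    """
    junctions = n_fragments - 1
    if junctions < 1:
        raise ValueError("need at least two fragments to have a junction")
    return overall_success ** (1.0 / junctions)


n = fragments_needed(10**9, 100, 20)
p = required_junction_fidelity(n, 0.5)
print(n, 1.0 - p)  # fragment count and tolerable per-junction failure rate
```

Even with no overlap at all, 1 billion bp at 100 bp per fragment is 10 million fragments, and under the independence assumption each junction would have to fail far less often than one time in a million for the full assembly to come out right half the time.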

There are a number of ways that selection (like the various *-display techniques) could be used to evolve, brute-force, and rationally design molecular machinery for better DNA assembly.

One interesting idea: if you had 10^11 emulsion droplets, each holding a different genotype-phenotype element and its linkage, it might be possible to design a selection project that uses random DNA and uses DNA sequencing to look for correct DNA ligation. This is a much more statistical approach. Ahead of selection, you must know the random DNA contents of each emulsion droplet. If you don't already know them, you could probably infer them from the DNA sequencing results, because short unligated DNA should still be floating around, assuming the ligation yield was not too close to 100%. In each round of selection, you would look for results where (1) DNA was ligated, and (2) only DNA with matching overlapping fragments was ligated together. The initial selection conditions would be much weaker, looking only for a small correlation between DNA sequence and which molecules get ligated together. Over time, the selection setup should be designed to amplify that correlation signal.

Of course, this is not the only way to use selection for better DNA assembly tools. A statistical approach like this one is interesting because you can throw lots of brute force at the problem: lots of mutagens and mutagenesis, and lots of different proteins and protein systems tested at the same time, even giant libraries of randomly chimeric ligases and DNA-handling proteins. Any slight increase in signal (found by looking for longer DNA molecules) can be selected for using gel electrophoresis and DNA sequencing. Basically this is like betting the farm on DNA sequencing resolving all the bottlenecks for us.
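One way the "correlation between DNA sequence and which molecules get ligated together" might be scored from sequencing data is sketched below in Python. This is only an illustration under assumptions that are not in the proposal itself: each observed junction is already resolved into a (left fragment, right fragment) pair, a "matching overlap" means the end of the left fragment equals the start of the right fragment for at least `min_len` bases, and the background rate is taken over all ordered pairs in a small known fragment pool. All function names are hypothetical.

```python
from itertools import permutations


def overlap_len(left, right, min_len=4):
    """Longest suffix of `left` that equals a prefix of `right` (0 if < min_len)."""
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def overlap_fraction(pairs, min_len=4):
    """Fraction of (left, right) pairs that share a matching overlap."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(1 for left, right in pairs if overlap_len(left, right, min_len) > 0)
    return hits / len(pairs)


def junction_enrichment(junctions, pool, min_len=4):
    """Compare observed junctions against all ordered pairs drawn from the pool.

    Returns (observed, background, enrichment). Enrichment is None when no
    pair in the pool overlaps, since the ratio is undefined in that case.
    """
    observed = overlap_fraction(junctions, min_len)
    background = overlap_fraction(permutations(pool, 2), min_len)
    enrichment = observed / background if background > 0 else None
    return observed, background, enrichment


# Toy data: a four-fragment pool and two sequenced ligation junctions.
pool = ["ACGTTGCA", "TGCAGGTT", "AAAAAAAA", "CCCCCCCC"]
junctions = [("ACGTTGCA", "TGCAGGTT"), ("AAAAAAAA", "CCCCCCCC")]
print(junction_enrichment(junctions, pool))
```

An enrichment above 1 would mean the candidate machinery joins overlapping fragments more often than chance pairing would. Early rounds could select on any value slightly above 1 and later rounds could raise the threshold. The exhaustive background here only works for small pools; a real library would need sampling.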

There are a number of starting enzymes that I would recommend looking at: ligases, polymerases, recombinases, nucleases (restriction enzymes, TALENs, etc.), proofreading mechanisms, high-affinity DNA-binding protein parts, CRISPR/Cas9, etc.

Another variation that I am presently unsure about is using alternating molecules of DNA and XNA, and then engineering an enzyme that prefers one side to be DNA and the other side to be XNA. However, since XNA and related molecules were developed as substitutes that proteins tend not to discriminate against, this might be self-defeating.

Also, keep in mind that the solution for DNA assembly might be a system of multiple proteins rather than a single protein.

Another weird method would be one that initially requires primers, and then over 10,000 iterations of selection perhaps the length of the primers can go down significantly until they are irrelevant and the proteins assemble the correct DNA sequences anyway.

For larger DNA molecules, a different set of enzymes might be required for "one pot" DNA assembly. Big DNA behaves differently. The assembly reactions might require different conditions, too. Overall I would guess that requiring different reactions or different proteins is not too damaging and not too impractical, but who knows at this point.

Yeast assembly is a particularly interesting place to start, because not only are yeast themselves tiny compartments that can be sorted and selected individually, but yeast assembly is also an existing molecular biology technique. I am not sure how much selection has been done on yeast cultures for DNA assembly efficiency, whether in emulsion compartments or otherwise. The same statistical selection approach would have to apply here. 100,000 unique sequences is probably beyond previous investigations into yeast assembly fidelity, but it could probably be selected for by ramping up from <100 fragments to 1000s of fragments over a few thousand (or million?) generations. Other organisms (probably E. coli) might be better in the long run for homologous recombination.

"Recent advances in DNA assembly technologies"

http://www.scs.illinois.edu/~zhaogrp/publications/HZ182.pdf

"""
In the in vivo homologous recombination-based assembly methods, NHEJ of the linearized vectors contributes to most of the false positives. One solution could be to introduce a counter-selective marker at the cloning site (Anderson and Haj-Ahmad, 2003). In order to survive, the host must have the counter-selective marker replaced by the designated inserts. To further minimize this problem, separating the essential elements on the vector, selection marker and episome has been proposed (Kuijpers et al., 2013). As the host organism needs at least 2 NHEJs to incorporate both elements, the probability of false positives was greatly reduced. With this new scheme, nine fragments were assembled by 60-bp overlapping regions with a correct yield of 95%.

"""

"High molecular weight DNA assembly in vivo for synthetic biology applications"

http://www.tandfonline.com/doi/full/10.3109/07388551.2016.1141394

"Improving ancient DNA assembly in yeast"

https://peerj.com/preprints/2383.pdf

- Bryan
http://heybryan.org/
1 512 203 0507
