Copying over e-mail correspondence:
On Wed, Sep 6, 2017 at 2:04 PM, Sakul Ratanalert wrote:
Hi Ryan,
Thanks for your e-mail! I saw and replied to your Google forum post last week, thanks for contributing! I will reply here as well, and we can continue our conversation whichever way is most convenient for you.
1. Can you please send me your CAD file, in any format that works without errors? I will see if I can convert it to PLY using MeshLab (http://meshlab.sourceforge.net/) and identify the possible issues that may have arisen. "Index is too large" errors tend to occur when the PLY file has duplicated vertices (e.g. a cube that should have 8 vertices instead has 16), which seems to be an issue with other to-PLY converters.
2. Yes, these scaffolded DNA origami wireframe structures can have any number of base and backbone modifications applied to them; the folding and hybridization are robust enough.
3. Just to clarify, do you want to design a structure such that a desired sequence serves as a staple in a specified location on the DNA structure? If so, DAEDALUS does not currently have backward-calculation of staple sequence to scaffold sequence, i.e. given some or all of the staple sequences, what is the scaffold sequence required to hybridize correctly. However, if you have control over customizing scaffold sequence, I can show you how to use the routing *.txt file to help with manual modification. We are also currently working on an interface to help edit the outputs of DAEDALUS for more facile customization.
4. I have been working on the automated design of using a single ssDNA molecule to fold a staple-less DNA origami. This work is not yet published, but hopefully will be soon!
Please let me know if you have any more questions or comments!
Thanks,
Sakul
On Wed, Sep 6, 2017 at 3:23 PM, Ryan Dikdan wrote:
Thanks so much for the prompt replies.
Here are some of the files I was testing out (I can't remember if they were
what I was aiming for, but I was planning on making a square torus if that
makes sense).
And I am interested in having staples in specific locations. Their sequence doesn't really matter that much. What is that *.txt file? I'm definitely interested in learning more about this program and your techniques for these great predictions.
On Sep 11, 2017 5:18 PM, Sakul Ratanalert wrote:
Hi Ryan,
Sorry for the delay in response! I was able to open the square torus (Learning the Moves.stl) and convert it into a PLY file using MeshLab (LearningtheMoves-SR.ply, attached). The structure you designed has 12 vertices, but if you open the PLY file you had generated using another program, you'll see it has 48 vertices (compare Line 4 of the PLY file). After the "end_header" line in a PLY file, the first block of numbers represents the (X,Y,Z) coordinates of each vertex in order. The next block specifies the details of each face, one face per line: the first number is how many vertices comprise the face, and the next numbers in the line specify the vertex ID #s that are part of that face. Hopefully by comparing our different PLY files you can see the differences!
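The PLY layout described above can be checked with a short script. This is a minimal Python sketch, not part of DAEDALUS or MeshLab: the function names and the tiny example file are hypothetical, and it assumes an ASCII (not binary) PLY whose vertex lines start with x, y, z.

```python
from collections import Counter

# Hypothetical example: a single triangle whose first vertex is stored twice.
EXAMPLE_PLY = """ply
format ascii 1.0
comment hypothetical example
element vertex 4
property float x
property float y
property float z
element face 1
property list uchar int vertex_indices
end_header
0 0 0
1 0 0
0 1 0
0 0 0
3 0 1 2
"""

def read_ascii_ply(text):
    """Return (vertices, faces) from the text of an ASCII PLY file."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    n_vertices = n_faces = 0
    header_end = 0
    for i, line in enumerate(lines):
        parts = line.split()
        if parts[:2] == ["element", "vertex"]:
            n_vertices = int(parts[2])
        elif parts[:2] == ["element", "face"]:
            n_faces = int(parts[2])
        elif line == "end_header":
            header_end = i + 1
            break
    body = lines[header_end:]
    # First block: one (x, y, z) coordinate triple per vertex.
    vertices = [tuple(float(v) for v in ln.split()[:3]) for ln in body[:n_vertices]]
    # Second block: a vertex count followed by that many vertex IDs.
    faces = []
    for ln in body[n_vertices:n_vertices + n_faces]:
        nums = [int(v) for v in ln.split()]
        faces.append(nums[1:1 + nums[0]])
    return vertices, faces

def duplicated_vertices(vertices, ndigits=6):
    """Map each coordinate that occurs more than once to its count."""
    counts = Counter(tuple(round(c, ndigits) for c in v) for v in vertices)
    return {xyz: n for xyz, n in counts.items() if n > 1}

vertices, faces = read_ascii_ply(EXAMPLE_PLY)
print(len(vertices), faces, duplicated_vertices(vertices))
```

A non-empty result from `duplicated_vertices` is the kind of redundancy (48 stored vertices for a 12-vertex object) that the converter comparison above is meant to reveal.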
(As a side note, MeshLab by default triangulates all non-triangular faces, so all of your trapezoidal and square faces have been turned into two triangles each. If this isn't what you wanted -- say, you wanted to keep co-planar faces as one -- you can try to manually edit the PLY file. The vertices will remain the same, but you will need to edit the block of face information. In your case, you will have 12 lines that start with "4" to indicate that the faces have four corners, and then figure out the vertex IDs that correspond to each of the 12 faces, listing them counterclockwise relative to an outward normal. Let me know if you need/want help with this step, it's a challenging but useful way to learn about how your input files work!)
I was able to run LearningtheMoves-SR.ply through DAEDALUS, using a minimum edge length of 42 bp, and I've attached the output zip file below. See if you can get the same thing! The three key files/folders to note are:
- a subfolder that contains the atomic file (.pdb) and renderings (.tif) for your structure. You'll notice that one of the corners looks shorter than the others; this is because the triangulation chosen by MeshLab has made that corner high in degree, i.e. many edges are coming out of it. Choosing a different triangulation may be better, depending on what you want the structure to look like. You can visualize the .pdb in a program like UCSF Chimera, Discovery Studio Visualizer, or PyMol.
- a text file named seq_..._.txt. This is the aforementioned *.txt file that shows you how the structure is being routed. An exercise I've made others in my lab do is to cut out the edges as rectangular slips of paper and tape them together and see if you can reconstruct a paper version of your origami design. The numbers listed are the scaffold IDs, i.e. what number nucleotide on the scaffold that position is. If you construct it right, you should be able to start from #1 and go in sequential order all the way back to the start, forming the Eulerian circuit! Then, you should be able to identify where you want to put your custom sequences in relation to the rest of the structure by identifying the closest scaffold ID to your position.
- a csv file named staples_..._.csv. This is a list of all the staple and scaffold sequences required to fold your target structure. Each staple ends with three hyphenated values: the first tells you what edge # the 5' end of the staple begins at (see the .txt file for the edge IDs), the second tells you the scaffold ID that the 5' end is paired with, and the third has "E" for "edge staple" and "V" for "vertex staple". Together with the .txt file you should be able to identify which staple(s) you need to modify for your custom purposes.
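The three hyphenated values at the end of each staple name can be split off programmatically. The sketch below is only an illustration: the example names are hypothetical (the real naming prefix is not shown in this thread), and the "closest scaffold ID" search ignores the fact that the scaffold is circular.

```python
def parse_staple_label(name):
    """Split a staple name into its prefix and the three trailing fields."""
    prefix, edge, scaffold_id, kind = name.rsplit("-", 3)
    return {
        "prefix": prefix,
        "edge": int(edge),                # edge # where the 5' end begins
        "scaffold_id": int(scaffold_id),  # scaffold nucleotide paired with the 5' end
        "kind": {"E": "edge", "V": "vertex"}[kind],
    }

def closest_staple(names, target_scaffold_id):
    """Pick the staple whose 5' scaffold ID is nearest to the target position."""
    return min(
        names,
        key=lambda n: abs(parse_staple_label(n)["scaffold_id"] - target_scaffold_id),
    )

# Hypothetical staple names, for illustration only.
names = ["example-staple-7-1432-V", "example-staple-2-310-E"]
print(parse_staple_label(names[0]))
print(closest_staple(names, 300))
```

Reading the scaffold ID of interest off the routing .txt file and passing it as `target_scaffold_id` would then point to a candidate staple to modify.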
Let me know if I made any sense and if you have any questions!
Sakul
On Mon, Sep 11, 2017 at 5:30 PM, Ryan Dikdan wrote:
This is great, thank you so much! I'll look over it and let you know if I have any more questions =]
On Tue, Sep 12, 2017 at 6:49 PM, Ryan Dikdan wrote:
Wow, this is awesome, I do have some more questions though. On the pdb file I can't see all of the DNA structures, it looks like a side is missing. Also, what's the smallest that the hole in the middle of the ring can be? Could it be a few angstroms, like 10-20, if we plan the ply the right way?
On Sep 12, 2017 7:16 PM, Sakul Ratanalert wrote:
Hi Ryan,
Ah, I assume you can see only a single DNA strand where there should be a double helix? On some viewers the PDB parsing step has trouble with such a large molecule (the scaffold), so it becomes invisible, leaving behind only the staple strands. Which program are you using to visualize the PDB?
The minimum edge length is 31 nt, which is ~10 nm. (The width of a B-form DNA double helix is 2 nm, or 20 Angstroms, for reference). The way the scaling works is that you choose the minimum edge length (in nt), and the coordinates of the PLY file are scaled such that the shortest edge is that specified minimum, and the rest are rounded to the nearest 10.5 nt (keeping with B-form DNA winding). The exact coordinates of the PLY are therefore irrelevant; only the relative spacing matters. But unfortunately, due to the minimum edge length constraints, we won't be able to get the hole down to 10-20 A.
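The scaling rule above can be sketched in a few lines of Python. This is one interpretation, not the DAEDALUS source: the function names are hypothetical, rounding to a whole number of 10.5-nt turns and then flooring to whole nucleotides is an assumption (chosen because 3 turns then gives the 31 nt quoted above), and 0.34 nm per base pair is the standard B-form rise.

```python
import math

HELICAL_TURN_NT = 10.5   # nucleotides per B-form helical turn
RISE_NM_PER_NT = 0.34    # assumed standard B-form rise per base pair

def scale_edge_lengths(raw_lengths, min_edge_nt):
    """Scale PLY edge lengths so the shortest edge equals min_edge_nt."""
    scale = min_edge_nt / min(raw_lengths)
    scaled = []
    for length in raw_lengths:
        # Round to a whole number of helical turns, then to whole nucleotides.
        turns = max(1, round(length * scale / HELICAL_TURN_NT))
        scaled.append(math.floor(turns * HELICAL_TURN_NT))
    return scaled

def nt_to_nm(n_nt):
    """Approximate duplex length in nanometers."""
    return n_nt * RISE_NM_PER_NT

# Only the ratio of the raw lengths matters, not their absolute values.
print(scale_edge_lengths([1.0, 2.0], 31))
print(scale_edge_lengths([10.0, 20.0], 31))
print(nt_to_nm(31))
```

Because the input is rescaled by its shortest edge, multiplying every coordinate in the PLY file by the same factor gives the same output.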
Also, do you mind if I post these questions and answers to the forum under the topic you started? Hopefully others will be able to learn from our discussion.
Let me know, thanks!
Sakul
On Tue, Sep 12, 2017 at 7:20 PM, Ryan Dikdan wrote:
Yea, of course, feel free to post them. I'm using education pymol. And what if we did a little trick and designed a macro structure where there were internal corners that came together at a narrow point?
On Thu, Sep 14, 2017 at 8:36 AM, Ryan Dikdan wrote:
I got help from a friend to make these, but I still get an error. There's probably multiple corners, but I don't know how to tell or to fix it. I'm sorry to keep bugging you about this.
On Thu, Sep 14, 2017 at 1:16 PM, Sakul Ratanalert wrote:
Hi Ryan,
No problem, I should have included these steps before!
1) In MeshLab (v1.3.3 is what I currently use), you want to Import the STL file you want to convert: File > Import Mesh.
2) It may ask you "Post-Open Processing: Unify Duplicated Vertices?" Select OK/Yes to do so.
3) If it didn't, you'll want to go to Filters > Cleaning and Repairing > ... and apply the following filters:
- Remove Duplicate Faces
- Remove Duplicated Vertex
- Remove Unreferenced Vertex
Feel free to play around with the other filters if those three still don't solve your issue.
4) Once you have your corrected file, you want to export it. File > Export Mesh As
5) Choose "Stanford Polygon File Format (*.ply)", hit Next, then Unselect Binary Encoding so the file remains human-readable
6) You're done! gap.ply is below.
Just a note about this object: I think you were trying to design an object such that the points of those triangular panels created a small gap? Unfortunately, the smallest edge length is actually the thickness of those panels, so if that is set to be 31 nt / 10 nm, the overall object will be huge.
An alternative is to use the fact that there are natural holes formed at the vertices. Given that the width of a DX tile (the railroad tracks formed by the pair of double helices) is 40 A, the inscribed circle of the equilateral triangle formed by a 3-way vertex (like the one on a tetrahedron) will have a diameter of 40/sqrt(3) ~ 23 A. Maybe you can engineer something using this for your purposes?
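The 23 A figure follows from the geometry of an equilateral triangle with side a: the inradius is a / (2 * sqrt(3)), so the inscribed circle has diameter a / sqrt(3). A small Python check, assuming (as the estimate above does) that the triangle side equals the 40 A DX-tile width; the function name is hypothetical:

```python
import math

def vertex_hole_diameter(side_angstrom):
    """Diameter of the circle inscribed in an equilateral triangle of the given side."""
    inradius = side_angstrom / (2 * math.sqrt(3))
    return 2 * inradius

print(round(vertex_hole_diameter(40), 1))
```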
Hope this helps!
Sakul
On Thu, Sep 14, 2017 at 9:46 PM, Ryan Dikdan wrote:
Thanks for the directions! I redid the files and fixed the attached one, and I didn't get any errors, but it's been an hour and a half and I haven't gotten the results page yet. Does it normally take this long? Sorry to bug you so much, but would you mind looking over the attached *.ply file? The natural holes are a good idea, but I think having this variability of spacing like the attached file would be more optimizable for my purposes.
On Thu, Sep 14, 2017 at 10:05 PM, Ryan Dikdan wrote:
Oh, I started messing with MeshLab and discovered where the extra vertices are. I'm in the process of fixing it.
On Fri, Sep 15, 2017 at 9:48 PM, Ryan Dikdan wrote:
I got it to work! Attached are the images and the pdb. Now it is rather large, and keeps crashing my pymol. Is there any other way to get it to work? I tried Chimera, and it is still quite laggy, and I am not sure how the focusing option works, but I'll keep trying. Thanks for all your help!
On Mon, Sep 18, 2017 at 6:07 PM, Sakul Ratanalert wrote:
Hi Ryan,
Glad to see it worked! Yeah, it looks like your object is very large, so rendering issues are bound to happen. If you open the staples_..._.csv file, the last line is the scaffold strand. How long is that sequence? (The length should be indicated in the name, in the first column.)
I'm currently working on other visualization methods to handle large macromolecules like this, I'll hopefully have something in the next few months.
Cheers,
Sakul
On Mon, Sep 18, 2017 at 6:50 PM, Ryan Dikdan wrote:
Yea it's got 249 primers and I think it's 9.6 kb (which I guess isn't that bad). Probably impractical right now, but are there any ways to make it smaller?
On Tue, Sep 19, 2017 at 5:36 PM, Sakul Ratanalert wrote:
Hi Ryan,
The size is dependent on the smallest and largest edge lengths you have in the structure, which are currently your plate thickness and your box width, respectively. You could try to redesign it such that the plates taper to a point in the middle, which could hopefully make your edge lengths more alike and lead to a smaller overall size. If you can get the structure under 7249 bp, then you can use M13 as your scaffold strand and potentially synthesize your design.
Cheers,
Sakul
On Tue, Sep 19, 2017 at 6:44 PM, Ryan Dikdan wrote:
That's a great idea! I tried it and got an error even after I cleaned it up in meshlab. Would you mind looking at it?
On Sep 25, 2017 8:42 PM, Sakul Ratanalert wrote:
Hey Ryan,
I managed to run your PLY file locally in Matlab (using the available downloadable source code). The error occurs after calculating the spanning tree, which is a good sign. Here is a figure that is generated right before the error occurs:
As you can see from the first two images, there are a few edges missing on two sides of your object, which means that some of your faces may not be properly defined. (Hopefully you can see from the second version which vertices seem to be the issue. When locating the problem in your PLY file, note that Matlab uses 1-indexing, whereas the vertex numbers used in the PLY file are 0-indexed.) If you look at the Schlegel diagram (the 2D projection shown in the 3rd image), Vertex 7 is only joined by 2 edges, which should never be the case.
Opening up your PLY file in Notepad++, I first look to lines 4 and 8, which count how many vertices and faces the PLY is counting, and see if that matches what I expect for the object.
For you, I see:
element vertex 14 [line 4]
property float x
property float y
property float z
element face 20 [line 8]
which actually does match expectation. The number of edges that the parser found is 30, whereas the number of edges should satisfy the Euler characteristic equation V - E + F = 2 if the object is topologically equivalent to a sphere (i.e. has no holes, which yours satisfies), so E = 32. I count 3 edges missing from the object by visual inspection, and then I just noticed that the edge between 9 and 12 is doubled up for some reason:
At this point I would then go through each of the faces (lines 25+) and check to make sure they are defined correctly. The good thing is that only vertices and faces are defined in the PLY file, with edges derived from the faces, so unless the source of error is the vertex coordinates, the issue must be in the face definitions. The face definitions are formatted like so: N a b c ..., where N is the number of vertices on that face (in your case they are all 3, which is good), and a, b, c, etc. are the vertices belonging to that face. We know that vertices 9, 12, 8, 7, and 1 are the problem areas (using 1-indexing; using 0-indexing they would be 8, 11, 7, 6, and 0).
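Since edges are derived from the faces, the edge count can be compared against the Euler characteristic directly from the face list. This is a minimal Python sketch, not the DAEDALUS parser: the function names are hypothetical, the tetrahedron is a stand-in for the real object, and the check assumes a closed surface that is topologically a sphere, where every edge is shared by exactly two faces.

```python
from collections import Counter

def edge_usage(faces):
    """Count how many faces use each undirected edge."""
    usage = Counter()
    for face in faces:
        for a, b in zip(face, face[1:] + face[:1]):
            usage[(min(a, b), max(a, b))] += 1
    return usage

def euler_report(n_vertices, faces):
    """Compare derived edges with V - E + F = 2 and list suspicious edges."""
    usage = edge_usage(faces)
    return {
        "edges_found": len(usage),
        "edges_expected": n_vertices + len(faces) - 2,
        # On a closed surface every edge belongs to exactly two faces.
        "bad_edges": sorted(e for e, n in usage.items() if n != 2),
    }

# Stand-in example: a tetrahedron (0-indexed vertex IDs), then the same
# object with its last face deleted.
tetrahedron = [[0, 1, 2], [0, 3, 1], [1, 3, 2], [0, 2, 3]]
print(euler_report(4, tetrahedron))
print(euler_report(4, tetrahedron[:3]))
```

For the file discussed here, V = 14 and F = 20 give an expected E = 14 + 20 - 2 = 32. Edges listed under `bad_edges` are used by one face, or by more than two, and point to the faces worth inspecting.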
What I would do next if I had the time would be to check the following:
1) Check each line.
a) Do each of the triangles specified in the faces match what I expect?
b) Is the order of vertex IDs counterclockwise relative to an outward normal of that face?
2) Which faces involve vertices (0-indexed) 8, 11, 7, 6, and 0? Pay close attention to those.
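Both checks can be partly automated. The Python sketch below uses hypothetical function names and a stand-in tetrahedron. It tests only that the winding is consistent between neighboring faces (each directed edge used once, with its reverse used by the adjacent face); confirming that the shared winding is counterclockwise about the outward normal would also need the vertex coordinates.

```python
from collections import Counter

def to_zero_indexed(matlab_ids):
    """Convert 1-indexed (Matlab) vertex numbers to 0-indexed PLY vertex IDs."""
    return [v - 1 for v in matlab_ids]

def faces_touching(faces, suspects):
    """Return (face index, face) pairs that use any suspect vertex."""
    suspects = set(suspects)
    return [(i, face) for i, face in enumerate(faces) if suspects & set(face)]

def inconsistent_directed_edges(faces):
    """List directed edges that break a consistent winding order."""
    directed = Counter()
    for face in faces:
        for a, b in zip(face, face[1:] + face[:1]):
            directed[(a, b)] += 1
    return sorted(
        e for e, n in directed.items()
        if n > 1 or (e[1], e[0]) not in directed
    )

# Stand-in example: a consistently wound tetrahedron, then one flipped face.
tetrahedron = [[0, 1, 2], [0, 3, 1], [1, 3, 2], [0, 2, 3]]
flipped = tetrahedron[:3] + [[0, 3, 2]]
print(inconsistent_directed_edges(tetrahedron))
print(inconsistent_directed_edges(flipped))
print(faces_touching(tetrahedron, to_zero_indexed([4])))
```

Passing the 0-indexed suspect vertices to `faces_touching` narrows the manual inspection to the handful of face lines that can actually contain the error.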
Let me know if this does or doesn't work, and I'll take a closer look at it.
Hope this helps!
Sakul
On Mon, Sep 25, 2017 at 8:45 PM, Ryan Dikdan wrote:
Wow this is so much help, thanks so much! I'll have to grab you a beer or something next time I'm in Boston =D I'll take a look at it now.
On Wed, Sep 27, 2017 at 1:40 AM, Ryan Dikdan wrote:
I got it to work! After a bunch of debugging I realized that I made triangular faces with 4 vertices, without connecting the fourth vertex. I remade the faces and it's looking good now, at <100 primers and only ~3.6kb.
On Sep 27, 2017 10:42 AM, Sakul Ratanalert wrote:
Hey Ryan,
Awesome! Glad you were able to figure it out and get it to work! What do you plan to do with the design now?
Cheers,
Sakul
On Wed, Sep 27, 2017 at 10:51 AM, Ryan Dikdan wrote:
Well I just had an idea about using preconfigured modified ribozymes. Since proteins have functionality, but we don't understand their spatial arrangements and since we understand DNA folding but it doesn't have functionality I was curious if I could design an "active site" via DNA structure then add primers with reactive groups (modified nucleotides) which I've been looking into their syntheses. Then there might be a way to directedly evolve, via flow cytometry, reactive combinations of primers (which each one would have different reactive groups or binding placements). It's a pretty radical idea, and probably wouldn't work, but I'm surprised we got this far! Thanks again for all your help, I definitely understand how your program works a bit better now. Now I'm looking into where exactly the primers of this design bind so that I could find out which primers I would need to modify.
On Wed, Sep 27, 2017 at 1:45 PM, Sakul Ratanalert wrote:
Hi Ryan,
That sounds interesting, glad I could help! The routing *.txt file will hopefully help you identify which primers you need to modify.
Let me know if you have any more questions, and update me when you can with your progress!
Best,
Sakul
On Wed, Sep 27, 2017 at 11:15 PM, Ryan Dikdan wrote:
Hmmm, does it normally require so many primers? It seems pretty cost prohibitive at this amount.
On Thu, Sep 28, 2017 at 2:20 PM, Sakul Ratanalert wrote:
Hey Ryan,
The structures we published require many staple strands as well, e.g. the icosahedron requires 84 staples, and the nested cube requires 94 staples. Since you are already using the minimum edge length of 31 bp, you can't easily go smaller without changing your PLY file. I'm not sure what purpose the bottom square pyramid is providing, but you could try to move that bottom vertex up closer to the center of the square base (the origin, I believe) to shrink the lengths of those 8 bottom-most edges, which would therefore require fewer staples.
When you've reached a design you're satisfied with, if you want to know a bit more about what your structure would look like in solution (as opposed to the idealized model that DAEDALUS provides), you can use our finite element modeling framework CanDo (https://cando-dna-origami.org/) to obtain an equilibrium structure. Once you make an account, you can go to Submission > CanDo and input the *.cndo file that is one of the outputs from DAEDALUS. This *.cndo file contains all the connection, position, and orientation information for each nucleotide, and CanDo uses this to calculate the position information for an equilibrated average particle. If you have more questions about this tool (which is a bit outside my realm of expertise), I can forward you to more knowledgeable people.
Cheers,
Sakul