HW3 Questions


Lev Naiman

Nov 30, 2010, 8:11:41 PM
to csc24...@googlegroups.com
2. 

a) I am assuming that since the sequence contains T's, it is in fact representing DNA rather than RNA.

b)
i) If we find a start codon, and the file ends before we hit an end codon, do we discard the sequence? Or is there any guarantee that this will not happen?
ii) Can ORFs overlap? Or do we consider an ATG after we started an ORF to just be the codon for an amino acid instead of another start?



--
The statement below is true.
The statement above is false.

Michael Brudno

Nov 30, 2010, 8:42:24 PM
to csc24...@googlegroups.com
On Tue, Nov 30, 2010 at 8:11 PM, Lev Naiman <naim...@gmail.com> wrote:
> 2.
> a) I am assuming that since the sequence contains T's it is in-fact
> representing DNA rather than RNA
yes

> b)
> i) If we find a start codon, and the file ends before we hit an end codon,
> do we discard the sequence? or is there any guarantee that this will not
> happen?

Yes, discard it.

> ii) Can ORF's overlap? Or do we consider an ATG after we started an ORF to
> just be the codon for an amino acid instead of another start?

As discussed in class, an ATG can appear in the middle of an ORF -- it
just codes for a Methionine.

-M
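
A minimal Python sketch of the two answers above, for a single forward reading frame. The function name `orfs_in_frame` and the (start, end) return format are hypothetical choices for illustration, not part of the assignment, and the stop codons assume the standard genetic code (TAA, TAG, TGA).

```python
# Assumption: standard genetic code stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def orfs_in_frame(seq, frame):
    """Return (start, end) index pairs of ORFs in one forward reading frame.

    `end` is exclusive and includes the stop codon.
    """
    orfs = []
    start = None  # index of the ATG that opened the current ORF, if any
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None:
            if codon == "ATG":
                start = i  # first ATG opens the ORF
        elif codon in STOP_CODONS:
            orfs.append((start, i + 3))
            start = None
        # Any other codon inside an open ORF, including ATG (Met), is skipped.
    # An ORF still open when the sequence ends is never appended: discarded.
    return orfs
```

With this sketch, `orfs_in_frame("ATGCCCATGCCCTGA", 0)` gives one ORF, and `orfs_in_frame("ATGCCCCCC", 0)` gives none because no stop codon is reached.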

Yeleiny Bonilla

Nov 30, 2010, 9:53:02 PM
to csc24...@googlegroups.com
When are the office hours for this assignment?
It would be great to have at least one this week. Is that possible?


Yele.

Michael Brudno

Nov 30, 2010, 9:57:51 PM
to csc24...@googlegroups.com
Ah good idea, but this week is pretty insane for me. Let's do OH
Monday 2-4pm. If you (or anyone else) cannot make this time please
e-mail me off the thread and we will set up a one-on-one meeting.

-M

Yeleiny Bonilla

Nov 30, 2010, 10:18:32 PM
to csc24...@googlegroups.com
No, that time is impossible for me; I have HCI classes :(

Brian

Dec 1, 2010, 6:24:39 PM
to csc2417-f10
Sorry, but I'd like a little clarification on this point. Yes, an ATG
can appear in the middle of an ORF and codes for a methionine, but
that doesn't really answer the question of whether ORFs can overlap.
It would seem to me that ORFs, as found by our ORF-finder, should be
allowed to overlap, since every ORF we find is only a potential gene,
even though realistically (I think) genes don't overlap with each
other on the same strand, since this would wreak havoc with the gene
regulation and transcription machinery. (Any transcriptional
repression of the second gene would also block transcription of the
first gene.)

But for the purposes of this assignment, for example:

ATG CCC CCC CCC CCC CCC ATG CCC CCC CCC CCC CCC TGA.

I assume that's two ORFs of length 39 and 21. And similarly:

ATG CCC CCA TGC CCC TGA. CCT GA...

is two ORFs of length 18 and 15 respectively.
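
The counting above can be checked with a short Python sketch that treats every ATG as a possible start and reads in frame to the first stop codon. The name `all_atg_orfs` is hypothetical, the stop codons assume the standard genetic code, and only the forward strand is scanned.

```python
# Assumption: standard genetic code stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def all_atg_orfs(seq):
    """Return (start, length) for every ATG that reaches an in-frame stop.

    Length counts nucleotides and includes the stop codon.
    """
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from this ATG until the first stop codon.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                orfs.append((start, i + 3 - start))
                break
        # No stop before the end of the sequence: nothing is recorded.
    return orfs

first = "ATG" + "CCC" * 5 + "ATG" + "CCC" * 5 + "TGA"
second = "ATGCCCCCATGCCCCTGACCTGA"
print(all_atg_orfs(first))   # [(0, 39), (18, 21)]
print(all_atg_orfs(second))  # [(0, 18), (8, 15)]
```

In the second example the ORF of length 15 starts at index 8, which is in a different reading frame from the ORF starting at index 0.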


Yeleiny Bonilla

Dec 1, 2010, 7:21:28 PM
to csc24...@googlegroups.com
Brian, in your second example, which one is the ORF with length 15?

Yele.

Orion Buske

Dec 1, 2010, 7:25:14 PM
to csc24...@googlegroups.com
I would call your second example two separate ORFs, but I would call the first only one, the longest.

Yele: it's the ORF that is out of frame with the obvious one. It starts halfway through the first ORF and ends after it.

We'll see what Mike says.

Yeleiny Bonilla

Dec 1, 2010, 7:34:42 PM
to csc24...@googlegroups.com
Right, right, I'm seeing it now.
Regarding the first one, I would count two as Brian did, but I also
understand your approach, Orion. I think we need to reach an
agreement on this, because it changes the implementation a
little and also the number of ORFs that we are going to obtain.

In the implementation I have right now, I get 2 ORFs in both
examples, same as Brian. Now I wonder, is that correct?

Yele.

Michael Brudno

Dec 1, 2010, 8:41:08 PM
to csc24...@googlegroups.com
My opinion is that the first case is one ORF, the second two. I think
that would also be the easiest case to implement. :)

Brian

Dec 2, 2010, 3:32:26 PM
to csc2417-f10
Well then, my question is: if the first case is one ORF... which
one is it? The one of length 39 or the one of length 21?

It would seem to me that, practically speaking, an ORF-finder would
want to return both. If it only returned one, it would "miss" the
other, and it might turn out that the other was the "correct" choice
in that it was the gene, and the other was just a random ATG.


Michael Brudno

Dec 2, 2010, 3:36:34 PM
to csc24...@googlegroups.com
The 39.
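
One way to read the rule settled on in this thread: ORFs that end at the same stop codon collapse into the longest one, while ORFs in different frames that end at different stops are kept separately. A minimal Python sketch of that reading follows. The function name `longest_per_stop` and the (start, end) input format are hypothetical, with `end` being the exclusive index just past the stop codon.

```python
def longest_per_stop(orfs):
    """Keep one ORF per stop codon: the one with the earliest start.

    `orfs` is an iterable of (start, end) pairs; ORFs sharing the same
    `end` share a stop codon and are therefore in the same frame.
    """
    best = {}  # end index -> earliest start seen for that end
    for start, end in orfs:
        if end not in best or start < best[end]:
            best[end] = start
    return sorted((start, end) for end, start in best.items())

# First example: both candidates end at index 39, so only the 39-long one stays.
print(longest_per_stop([(0, 39), (18, 39)]))  # [(0, 39)]
# Second example: different stops, so both ORFs stay.
print(longest_per_stop([(0, 18), (8, 23)]))   # [(0, 18), (8, 23)]
```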