Re 2417 hw3

Orion Buske

Dec 14, 2010, 1:40:33 AM
to csc24...@googlegroups.com
* If you received a bunch of copies of this message, I'm terribly sorry. My mail client has been having fun with this one. *

Hey Marc (and Misko, for when you get around to it) and others who I thought might appreciate this (okay, I'm just CCing the google group),

At the bottom is a test genome that I think pretty clearly highlights a number of boundary cases. If you think I have an error in my code (entirely possible), please let everyone know. :)


DEBUG GUIDE:
- If you have 8 CCC's and 7 GGG's, you're not counting ORF-codons on the - strand correctly.
- If you get any CCG's, CGG's, GCC's, etc., you're not properly handling the frame inside of ORFs.
- If you don't have 8 AAA's, or have any ORF AAA's, you're not handling the non-ORF regions or boundaries correctly.
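
A minimal Python sketch of the first check, assuming the ORF coordinates listed in the output further down and the usual convention that a - strand ORF is read as the reverse complement of its + strand span. The names `orf_codon_counts` and `strand_aware` are made up for illustration; this is not the homework solution itself.

```python
from collections import Counter

# Test genome and ORF coordinates (1-indexed, inclusive) copied from this post.
GENOME = "AAAAAAAAATGCCCGGGCCCGGGTTAGGGCCCTAACCCGGGCCCGGGCCCCATAAATGCCCGGGTAAAAAA"
ORFS = [(9, 35, "+"), (56, 67, "+"), (24, 53, "-")]

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orf_codon_counts(genome, orfs, strand_aware=True):
    """Count in-frame codons inside each ORF.

    With strand_aware=False the - strand ORF is (wrongly) read left to
    right on the + strand, which reproduces the first symptom above.
    """
    counts = Counter()
    for start, end, strand in orfs:
        region = genome[start - 1:end]  # convert to a 0-based slice
        if strand == "-" and strand_aware:
            region = reverse_complement(region)
        for i in range(0, len(region), 3):
            counts[region[i:i + 3]] += 1
    return counts


correct = orf_codon_counts(GENOME, ORFS)
naive = orf_codon_counts(GENOME, ORFS, strand_aware=False)
print(correct["CCC"], correct["GGG"])  # 7 8
print(naive["CCC"], naive["GGG"])      # 8 7
```

Stepping through each ORF three bases at a time from its own start is also what keeps out-of-frame codons such as CCG or CGG from showing up.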


MY RELEVANT OUTPUT:

ORFs (1-indexed, inclusive):
9 35 +
56 67 +
24 53 -
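
For comparison, here is a rough ORF-finder sketch in Python. It assumes an ORF runs from an ATG to the first in-frame stop codon (TAA, TAG or TGA), is searched for in all three frames of both strands, has no minimum length, and ignores ATGs nested inside an already open ORF. Those rules are assumptions, not the assignment spec, and `find_orfs` is a hypothetical name.

```python
GENOME = "AAAAAAAAATGCCCGGGCCCGGGTTAGGGCCCTAACCCGGGCCCGGGCCCCATAAATGCCCGGGTAAAAAA"
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_orfs(genome):
    """Return (start, end, strand) tuples, 1-indexed inclusive, in + strand coordinates."""
    n = len(genome)
    strands = (("+", genome), ("-", genome.translate(COMPLEMENT)[::-1]))
    orfs = []
    for strand, seq in strands:
        for frame in range(3):
            start = None  # 0-based index of the currently open ATG, if any
            for i in range(frame, n - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    end = i + 3  # exclusive end; the stop codon is included
                    if strand == "+":
                        orfs.append((start + 1, end, "+"))
                    else:
                        # Map reverse-complement indices back to + strand positions.
                        orfs.append((n - end + 1, n - start, "-"))
                    start = None
    return orfs


for orf in find_orfs(GENOME):
    print(*orf)
```

Under these assumptions the sketch reports the same three ORFs as the list above, though in frame order rather than the order shown here.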

non-zero codon counts (ORF, not-ORF):
AAA: 0 8
ATG: 3 0 # the starts of the three ORFs
CCC: 7 0
GGG: 8 0
TAA: 3 0 # the stops of the three ORFs
TTA: 2 0
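
One reading that reproduces the not-ORF column, sketched in Python: count every 3-base window on the + strand that lies entirely outside all ORFs. The post does not state the counting rule, so treat it as an assumption; `non_orf_codon_counts` is a hypothetical helper.

```python
from collections import Counter

# Test genome and ORF coordinates (1-indexed, inclusive) copied from this post.
GENOME = "AAAAAAAAATGCCCGGGCCCGGGTTAGGGCCCTAACCCGGGCCCGGGCCCCATAAATGCCCGGGTAAAAAA"
ORFS = [(9, 35, "+"), (56, 67, "+"), (24, 53, "-")]


def non_orf_codon_counts(genome, orfs):
    """Count 3-base windows (+ strand, all frames) that touch no ORF base."""
    covered = [False] * len(genome)
    for start, end, _strand in orfs:
        for i in range(start - 1, end):
            covered[i] = True
    counts = Counter()
    for i in range(len(genome) - 2):
        if not any(covered[i:i + 3]):  # skip windows that straddle a boundary
            counts[genome[i:i + 3]] += 1
    return counts


print(non_orf_codon_counts(GENOME, ORFS))  # Counter({'AAA': 8})
```

Windows that straddle an ORF boundary, such as positions 7-9, also read AAA, so failing to exclude them inflates the count past 8.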

non-zero codon frequencies (ORF, not-ORF):
AAA: 0.00% 100.00%
ATG: 13.04% 0.00%
CCC: 30.43% 0.00%
GGG: 34.78% 0.00%
TAA: 13.04% 0.00%
TTA: 8.70% 0.00%

non-zero AA frequencies:
Gly: 34.78%
Leu: 8.70%
Met: 13.04%
Pro: 30.43%
Stop: 13.04%
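
The percentages follow directly from the ORF codon counts, which sum to 23. A small Python sketch, using only the five codons that occur in the ORFs and their standard genetic code assignments (`aa_frequencies` is a made-up name):

```python
# ORF codon counts copied from the output above.
ORF_CODON_COUNTS = {"ATG": 3, "CCC": 7, "GGG": 8, "TAA": 3, "TTA": 2}

# Standard genetic code, restricted to the codons that appear here.
CODON_TO_AA = {"ATG": "Met", "CCC": "Pro", "GGG": "Gly", "TAA": "Stop", "TTA": "Leu"}


def aa_frequencies(codon_counts):
    """Return amino-acid frequencies as percentages of all counted codons."""
    total = sum(codon_counts.values())
    freqs = {}
    for codon, n in codon_counts.items():
        aa = CODON_TO_AA[codon]
        freqs[aa] = freqs.get(aa, 0.0) + 100.0 * n / total
    return freqs


for aa, pct in sorted(aa_frequencies(ORF_CODON_COUNTS).items()):
    print(f"{aa}: {pct:.2f}%")
```

Because each amino acid here is encoded by a single observed codon, the amino-acid percentages equal the ORF codon percentages.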

Longest ORF (bp): 30
Shortest ORF (bp): 12
Mean ORF length (bp): 23.0
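
Since the coordinates are inclusive, each length is end - start + 1. A quick Python check against the ORF list above:

```python
# ORF coordinates (1-indexed, inclusive) copied from the output above.
ORFS = [(9, 35, "+"), (56, 67, "+"), (24, 53, "-")]

lengths = [end - start + 1 for start, end, _strand in ORFS]
print(max(lengths), min(lengths), sum(lengths) / len(lengths))  # 30 12 23.0
```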


TEST GENOME:
>test
AAAAAAAAATGCCCGGGCCCGGGTTAGGGCCCTAACCCGGGCCCGGGCCCCATAAATGCCCGGGTAAAAAA
