Programming exercises

Steven Ahrendt

Oct 19, 2011, 4:44:49 PM

to UCR Perl Group

Hi all,

Attached is one solution to the regular expression problem set sent out last week ("regex_problems.pl") and the sequence file that it needs ("regex_seq.txt"). Note the use of file I/O, "chomp", and "join". Compare it to yours (if you have one) or to Zhigang's if you want to see how certain things can be implemented differently.

Below are some problems for subroutines. Email your script either to me (sahr...@ucr.edu) or to the group and we can discuss the solution during next week's workshop.

Feel free to email me or Sofia with any questions you might have.

-- Steven

Subroutine problem set:

1. Write a subroutine that joins two DNA strings.

2. Write a second subroutine that reports the GC content of this (or any) DNA sequence.

3. Write a third subroutine that counts the instances of any restriction site in any DNA sequence. (Don't worry about inserting the 'cut' character: ^)

4. Using some of these subroutines and anything else we've learned so far, create a script that reads the attached fasta file ("multi_seq.fna") and reports the following information for each sequence:

- sequence name

- sequence length

- GC content

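- counts of restriction sites: EcoRI (GAATTC), SduI (GDGCHC), and HindII (GTYRAC). (Again, don't worry about the cut character; just report the counts. The IUPAC nucleotide code table is a useful reference for the ambiguous letters, e.g. R = A or G, Y = C or T, D = A, G, or T, and H = A, C, or T.)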
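
For anyone who wants a starting point, here is one rough sketch of how the four problems could fit together. It is not the attached solution, and the names (join_dna, gc_content, count_sites, read_fasta, %IUPAC, %SITES) are made up for illustration. It assumes sequences contain only standard IUPAC nucleotide letters, reports GC content as a fraction of the sequence length, and uses only the first word of each FASTA header as the sequence name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# IUPAC nucleotide codes mapped to regex character classes.
my %IUPAC = (
    A => 'A', C => 'C', G => 'G', T => 'T',
    R => '[AG]',  Y => '[CT]',  S => '[CG]',  W => '[AT]',
    K => '[GT]',  M => '[AC]',
    B => '[CGT]', D => '[AGT]', H => '[ACT]', V => '[ACG]',
    N => '[ACGT]',
);

# Restriction sites from problem 4.
my %SITES = (EcoRI => 'GAATTC', SduI => 'GDGCHC', HindII => 'GTYRAC');

# Problem 1: join two DNA strings.
sub join_dna {
    my ($first, $second) = @_;
    return $first . $second;
}

# Problem 2: GC content as a fraction between 0 and 1.
sub gc_content {
    my ($seq) = @_;
    return 0 unless length($seq);
    my $gc = ($seq =~ tr/GCgc//);    # tr with an empty replacement only counts
    return $gc / length($seq);
}

# Problem 3: count occurrences of a site, which may contain IUPAC codes.
# The lookahead means overlapping occurrences would each be counted.
sub count_sites {
    my ($seq, $site) = @_;
    my $pattern = join '',
        map { exists $IUPAC{$_} ? $IUPAC{$_} : quotemeta($_) }
        split //, uc($site);
    my $count = () = uc($seq) =~ /(?=$pattern)/g;
    return $count;
}

# Problem 4 helper: read FASTA records from an open filehandle.
# Returns a list of [name, sequence] pairs.
sub read_fasta {
    my ($fh) = @_;
    my (@records, $name, $seq);
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/\r$//;            # tolerate Windows line endings
        next if $line eq '';
        if ($line =~ /^>\s*(\S*)/) {
            push @records, [$name, $seq] if defined $name;
            ($name, $seq) = ($1, '');
        }
        elsif (defined $name) {
            $seq .= $line;
        }
    }
    push @records, [$name, $seq] if defined $name;
    return @records;
}

# Report only when a FASTA file is given on the command line.
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    for my $rec (read_fasta($fh)) {
        my ($name, $seq) = @$rec;
        printf "%s\tlength=%d\tGC=%.3f", $name, length($seq), gc_content($seq);
        printf "\t%s=%d", $_, count_sites($seq, $SITES{$_}) for sort keys %SITES;
        print "\n";
    }
    close $fh;
}
```

Saved under a name of your choosing (say, subs.pl), this would be run as "perl subs.pl multi_seq.fna" and should print one tab-separated line per sequence. Translating the IUPAC letters into character classes lets a single subroutine handle both exact sites like EcoRI and ambiguous ones like SduI and HindII.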

====

Steven Ahrendt

Graduate Student Researcher

Genetics, Genomics, and Bioinformatics

University of California, Riverside

http://sahrendt.wordpress.com

http://lab.stajich.org/

regex_seq.txt

multi_seq.fna

regex_problems.pl

Sofia Robb

Oct 19, 2011, 5:18:51 PM

to ucr-perl-bi...@googlegroups.com

hi steven,

what fungal species are you working on that you said was really close to the split of plants and fungi?

thanks

sofia

Steven Ahrendt

Oct 19, 2011, 5:31:51 PM

to ucr-perl-bi...@googlegroups.com

Batrachochytrium dendrobatidis, which is from a lineage that is very close to the split of animals and fungi.

The genes in the multi_seq.fna file are from this organism.

-- Steven

Sofia Robb

Oct 19, 2011, 7:53:03 PM

to ucr-perl-bi...@googlegroups.com

thank you.
