Does this mean 508 genes were predicted, in 141 subfamilies?


Liangliang Gao

Jun 28, 2012, 11:34:42 AM
to sp...@googlegroups.com
gaoll@VM:~/spada_soft/spada$ perl spada.pl --cfg conf.txt --org stuberosum --fas /home/gaoll/spada_data/stuberosum/01_refseq.fa --gff /home/gaoll/spada_data/stuberosum/51_gene.gff
##########  working on stuberosum  ##########
  copying FASTA file to data directory...
  breaking genome sequence & translating in 6 frames ...
  copying GFF file to data directory...
  converting Gff to Gtb... ( 51472 RNA | 35119 gene ) done
  converting Gtb to Gff... ( 51472 RNA | 35119 gene ) done
  extracting sequence from Gtb... 51472 out of 51472 done
  running hmmsearch (might take long time) ...
  parsing HMM output...
  recovering to global coordinate for hmmsearchX...  5475 out of 5475 done
  tiling hits...
  collapsing neighboring hits...
  running hmmsearch (might take long time) ...
  parsing HMM output...
  recovering to global coordinate for hmmsearchP...  9625 out of 9625 done
  tiling hits...
  collapsing neighboring hits...
  collecting hits from multiple sources... 1830 in total
  E-vlaue filer... 1830 out of 1830 passed filter
  sort hits into groups... 1375 groups in total
  re-formatting hit info...
  removing multi hits... 238 out of 1375 removed
  removing nearby hits... 52 out of 1137 removed
  removing within-group hits...
  re-ordering htis...
  removing pseudogenes...?    711    SCS**RVTSLVSACASFVNYGTPDTIPGAPCCIAMTTLSTVASSTGIQTRQSVCRCMMDLITTCNPNATAIATLPGFCGVSLGFTIDPNTDCE
 30 out of 1085 removed
  computing MSA scores:  1055/1055 done...
  writing hit Gff file...
  preparing hit sequence...
  running Augustus:  1055/1055 done...
  collecting prediction results from /home/gaoll/spada_data/stuberosum/31_model_SPADA/12_augustus/02_raw
  changing to relative position...
  building 1/2 exon models...  1055 out of 1055 done
  writing output in Gtb format...
  collecting all prediction results...
  refining incomplete models...
  merging redundant models...  1583 in total
  converting genomic positions to global coordinates...
  removing incompatible models...  118 deleted
  converting Gtb to Gff... (  1465 RNA |   997 gene ) done
  computing MSA scores:  1465/1465 done...
  assessing SignalP scores:  1465 /  1465 done...
  assessing peptide scores:  1465/1465 done...
  merging statistics...
  picking best models...  508 picked
  converting Gtb to Gff... (   508 RNA |   508 gene ) done
  making sub-family alignments:   141 /   141 done...

Liangliang Gao

Jun 28, 2012, 11:57:16 AM
to sp...@googlegroups.com
It would be great to have the cDNA sequences of the predicted secreted peptides. It would also help to know which of them are already included in the original GFF annotation, along with their corresponding gene/transcript IDs.


Peng Zhou

Jun 28, 2012, 12:03:36 PM
to sp...@googlegroups.com
Right, it seems the pipeline made it through: 508 gene models were picked, and they were aligned in 141 sub-families.
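
As a quick sanity check, the filtering counts reported in the log can be chained together. This is only an illustrative sketch: the `check_funnel` helper is hypothetical and not part of SPADA, and the numbers are copied by hand from the log output in the first message.

```python
def check_funnel(start, removed_counts):
    """Subtract each removed count in turn and return the running totals."""
    totals = []
    current = start
    for removed in removed_counts:
        if removed > current:
            raise ValueError("cannot remove more items than remain")
        current -= removed
        totals.append(current)
    return totals


# Hit filtering: 1375 groups, then 238, 52 and 30 removed (values from the log).
hit_totals = check_funnel(1375, [238, 52, 30])
# Model filtering: 1583 merged models, 118 incompatible ones deleted.
model_totals = check_funnel(1583, [118])
print(hit_totals, model_totals)
```

The running totals (1137, 1085 and 1055 hits; 1465 models) match the "out of" figures the log prints at each following step. The best-model step then keeps 508 of the 1465 candidate models.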

Peng Zhou

Jun 28, 2012, 12:17:48 PM
to sp...@googlegroups.com
Good idea. I will probably add that to the software TODO list.
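
Until something like that exists in the pipeline, the overlap with the original annotation could be approximated outside of SPADA. The following is a minimal sketch, not SPADA functionality. It assumes both files are GFF3-style with tab-separated columns, that gene-level features have type `gene`, and that each carries an `ID=` attribute. The function names `read_genes` and `match_predictions` are made up for illustration.

```python
from collections import defaultdict


def read_genes(gff_lines, feature="gene"):
    """Parse GFF lines into {seqid: [(start, end, strand, gene_id), ...]}."""
    genes = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature:
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        genes[cols[0]].append(
            (int(cols[3]), int(cols[4]), cols[6], attrs.get("ID", ""))
        )
    return genes


def match_predictions(pred, ref):
    """Map each predicted gene ID to the reference IDs it overlaps (same strand)."""
    matches = {}
    for seqid, items in pred.items():
        for start, end, strand, pred_id in items:
            matches[pred_id] = [
                ref_id
                for ref_start, ref_end, ref_strand, ref_id in ref.get(seqid, [])
                if ref_strand == strand and ref_start <= end and start <= ref_end
            ]
    return matches


if __name__ == "__main__":
    # Tiny made-up records; real input would be read with open(path).
    ref = read_genes(["chr1\tref\tgene\t100\t500\t.\t+\t.\tID=geneA"])
    pred = read_genes([
        "chr1\tpred\tgene\t450\t700\t.\t+\t.\tID=p1",
        "chr1\tpred\tgene\t900\t990\t.\t+\t.\tID=p2",
    ])
    print(match_predictions(pred, ref))
```

A prediction that maps to an empty list has no same-strand overlap with any annotated gene and is a candidate novel model. Any overlap counts as a match here, which is a deliberately loose criterion; requiring a minimum overlap fraction or matching CDS boundaries would be stricter.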