How can I convert GeneWise GFF output into the alignment GFF3 format that EVM can use?

杜康

Nov 7, 2016, 9:53:02 AM
to EVidenceModeler-users

Dear all,

This might be a stupid question, but I'm really stuck on it.
The GeneWise GFF output doesn't meet EVM's requirements. Here is my GeneWise output:

flattened_line_15351:0:2165     GeneWise        match   783     1652    206.67  +       .       flattened_line_15351:0:2165-genewise-prediction-1
flattened_line_15351:0:2165     GeneWise        cds     783     924     0.00    +       0       flattened_line_15351:0:2165-genewise-prediction-1
flattened_line_15351:0:2165     GeneWise        intron  925     1050    0.00    +       .       flattened_line_15351:0:2165-genewise-prediction-1
flattened_line_15351:0:2165     GeneWise        cds     1051    1206    0.00    +       2       flattened_line_15351:0:2165-genewise-prediction-1
flattened_line_15351:0:2165     GeneWise        intron  1207    1311    0.00    +       .       flattened_line_15351:0:2165-genewise-prediction-1
flattened_line_15351:0:2165     GeneWise        cds     1312    1399    0.00    +       2       flattened_line_15351:0:2165-genewise-prediction-1
flattened_line_15351:0:2165     GeneWise        intron  1400    1507    0.00    +       .       flattened_line_15351:0:2165-genewise-prediction-1
flattened_line_15351:0:2165     GeneWise        cds     1508    1652    0.00    +       1       flattened_line_15351:0:2165-genewise-prediction-1

whereas the format EVM needs looks like this:

Contig1 nap-nr_minus_rice.fasta nucleotide_to_protein_match     8392    8470    50.00   -       .       ID=match.nap.nr_minus_rice.fasta.37;Target=RF|YP_440341.1|83716234|NC_007650 196 222
Contig1 nap-nr_minus_rice.fasta nucleotide_to_protein_match     7650    7786    26.09   -       .       ID=match.nap.nr_minus_rice.fasta.37;Target=RF|YP_440341.1|83716234|NC_007650 222 268

Compared with the required format, my GeneWise output obviously lacks the "Target" protein information, and it isn't even in the proper format.

Could you help me with this? Thanks very much.
 
      Sincerely,
            Du Kang
Message has been deleted

杜康

May 11, 2017, 1:03:19 PM
to EVidenceModeler-users
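Sorry, the earlier script may have been wrong. Maybe try this instead. It keeps only the cds rows, renames the feature type to match, swaps start and end on the minus strand, and prefixes column 9 with an ID that is shared by all rows of the same prediction:

$ perl -lane 'next unless $F[2] =~ /cds/; s/$F[2]/match/; s/$F[3]\t$F[4]/$F[4]\t$F[3]/ if $F[6] =~ /-/; if ($F[8] ne $flag) { $i++; $flag = $F[8] } s/\Q$F[8]\E$/ID=GeneWise.$i;$F[8]/; print' genewise.gff.best > genewise.gff.best.gff3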
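
For anyone who prefers something easier to read than a one-liner, here is a minimal Python sketch of the same idea. It is only a sketch under some assumptions: the function name genewise_to_evm_gff3 and its parameters are made up for illustration, the feature type nucleotide_to_protein_match is copied from the EVM example in the question, and the phase column is set to "." as in that example. It does not write a Target attribute, because the GeneWise GFF shown above carries neither the protein name nor the protein coordinates.

```python
def genewise_to_evm_gff3(lines, id_prefix="GeneWise",
                         feature="nucleotide_to_protein_match"):
    """Turn GeneWise GFF 'cds' rows into alignment-style GFF3 rows.

    All rows of one prediction (same column 9 value) share one ID.
    The names and defaults here are illustrative assumptions, not
    part of EVM or GeneWise.
    """
    ids = {}   # prediction name (column 9) -> running match number
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines, comments, and GeneWise's "//" separators.
        if not line or line.startswith("#") or line.startswith("//"):
            continue
        f = line.split(None, 8)  # tolerate tabs or runs of spaces
        if len(f) < 9 or f[2].lower() != "cds":
            continue  # drop the match and intron rows
        # GeneWise reports start > end on the minus strand; GFF3 needs start <= end.
        start, end = sorted((int(f[3]), int(f[4])))
        number = ids.setdefault(f[8], len(ids) + 1)
        attributes = f"ID={id_prefix}.{number}"
        rows.append("\t".join([f[0], f[1], feature, str(start), str(end),
                               f[5], f[6], ".", attributes]))
    return rows
```

It could be used like this: print("\n".join(genewise_to_evm_gff3(open("genewise.gff.best")))). Whether EVM accepts rows without a Target attribute is something I have not verified, so please check the result against the EVM documentation before relying on it.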


On Monday, November 7, 2016 at 3:53:02 PM UTC+1, 杜康 wrote: