Filtering Transcripts Based on Expression Values


Adriana Fróes

Nov 17, 2016, 1:30:25 PM
to trinityrnaseq-users

Dear Trinity developers,

I'm trying to use the script "filter_low_expr_transcripts.pl" from the util directory, with the following command line:

trinityrnaseq-2.3.0_PRERELEASE/util/filter_low_expr_transcripts.pl --matrix ../MBH_WP.TMM.fpkm.matrix --transcripts ../Trinity.fasta --min_expr_any 10 --trinity_mode


But I keep getting an error like this:

Error, no expression record stored for acc: [TR86_c0_g12] at util/filter_low_expr_transcripts.pl line 119, <$filehandle> line 93.

Do you know what's going on?

Thank you very much.

B
Adriana

Mark Chapman

Nov 18, 2016, 3:39:38 AM
to Adriana Fróes, trinityrnaseq-users
Hi Adriana,
Did you assemble the data, map the reads and run this script all with the same version of Trinity?
Also, the missing transcript "TR86_c0_g12" seems to be a gene, not a transcript. Is your expression matrix from the genes results or the transcripts results? I presume that even though the scripts may have changed, they still require the transcripts matrix. Have you actually looked for this accession in your matrix to check it's there?
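One quick way to run that check is to compare the accessions in the FASTA file against the row names of the matrix. The following is only a minimal Python sketch and not part of Trinity: the function names are hypothetical, and it assumes that the first whitespace-delimited token of each FASTA header is the accession and that the first non-blank line of the matrix holds the sample names.

```python
def fasta_accessions(lines):
    """Yield the accession from each FASTA header (text after '>' up to the first whitespace)."""
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                yield fields[0]


def matrix_row_ids(lines):
    """Collect the first column of every data row, skipping blank lines and the header row."""
    ids = set()
    header_seen = False
    for line in lines:
        if not line.strip():
            continue
        if not header_seen:
            header_seen = True  # assumption: first non-blank line holds the sample names
            continue
        ids.add(line.split()[0])
    return ids


def missing_from_matrix(fasta_lines, matrix_lines):
    """Return FASTA accessions that have no row in the expression matrix, in file order."""
    known = matrix_row_ids(matrix_lines)
    return [acc for acc in fasta_accessions(fasta_lines) if acc not in known]


# Toy input; real use would pass open file handles instead of lists.
fasta = [">tx1 len=750", "ACGT", ">tx2 len=300", "GGCC"]
matrix = ["\tsampleA\tsampleB", "tx1\t0.000\t7.070"]
print(missing_from_matrix(fasta, matrix))
```

Any accession this reports is present in the FASTA file but has no row in the matrix, which is the kind of mismatch worth ruling out first.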
Not sure if this will help; these are just some ideas.
Cheers, Mark

--
You received this message because you are subscribed to the Google Groups "trinityrnaseq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to trinityrnaseq-users+unsub...@googlegroups.com.
To post to this group, send email to trinityrnaseq-users@googlegroups.com.
Visit this group at https://groups.google.com/group/trinityrnaseq-users.
For more options, visit https://groups.google.com/d/optout.



--
Dr. Mark A. Chapman
+44 (0)2380 594396
------------------------------------
Centre for Biological Sciences
University of Southampton
Life Sciences Building 85
Highfield Campus
Southampton
SO17 1BJ

Adriana Fróes

Nov 18, 2016, 7:47:16 AM
to Mark Chapman, trinityrnaseq-users
Hi Mark,

Thanks a lot for the suggestions. I think you are right. The initial version of Trinity I used (2.0.6) didn't have the script to filter lowly expressed transcripts. Also, I used the expression matrix from the genes results. I didn't see an expression matrix from the transcripts results, just genes and isoforms. Which matrix should it be?

Thanks again for the support.
Best
Adriana


Adriana M. Froes
Laboratório de Microbiologia, Instituto de Biologia, Depto de Biologia Marinha
Universidade Federal do Rio de Janeiro   
Av. Carlos Chagas Filho 373, Sala A3-202, Bloco A (Anexo) do CCS
21941-599, Ilha do Fundão, Rio de Janeiro, RJ


Adriana Fróes

Dec 20, 2016, 9:12:03 AM
to trinityrnaseq-users
Dear Trinity developers,

I'm trying again to use the script "filter_low_expr_transcripts.pl" from the util directory, with the following command line:

trinityrnaseq-2.2.0/util/filter_low_expr_transcripts.pl --matrix matrix.TMM.EXPR.matrix --transcripts Trinity.bac_arc.fasta --trinity_mode

This time I reassembled the reads with the same Trinity version (2.2.0), but I got the following errors:

ISO: TRINITY_DN21772_c0_g1_i2    $VAR1 = {};
Use of uninitialized value $expr in numeric gt (>) at ../../trinityrnaseq-2.2.0/util/filter_low_expr_transcripts.pl line 187.
Use of uninitialized value $expr in addition (+) at ../../trinityrnaseq-2.2.0/util/filter_low_expr_transcripts.pl line 192.
ISO: TRINITY_DN21772_c0_g1_i1    $VAR1 = {};
Use of uninitialized value $expr in numeric gt (>) at ../../trinityrnaseq-2.2.0/util/filter_low_expr_transcripts.pl line 187.
Use of uninitialized value $expr in addition (+) at ../../trinityrnaseq-2.2.0/util/filter_low_expr_transcripts.pl line 192.
Use of uninitialized value $expr in division (/) at ../../trinityrnaseq-2.2.0/util/filter_low_expr_transcripts.pl line 197.

My Trinity.fasta file looks like this:

>TRINITY_DN94_c0_g1_i1 len=750 path=[728:0-179 234:180-464 234:465-749] [-1, 728, 234, 234, -2]
TTGTTGTTTCAAGTGTACTGACATATAGCCAACTGCAGGCGATGCTGACGCTGGGCGACC
AGCATATTTCAGATCAGCTCCCGCTGGAATCGCAGCACGGAAGTTATGTTGGCTAGAGTA

And the file matrix.TMM.EXPR.matrix looks like this:

        0h-ctrl_bac_arc 0h-vivo_bac_arc 1h-7.5A_bac.arc 1h-8.1A_bac_arc 6h-7.5A.bac_arc 6h-8.1A.bac_arc 96h-7.5A.bac_arc        96h-8.1A_bac.arc        96h-ctrl_7.5A.bac_arc   96h-ctrl_8.1A.bac_arc

TRINITY_DN168672_c0_g1  0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
TRINITY_DN86497_c0_g1   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
TRINITY_DN154254_c0_g1  0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
TRINITY_DN12228_c0_g1   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   7.070   0.000
TRINITY_DN8781_c0_g2    0.000   0.000   0.000   0.000   13.468  0.000   0.000   0.000   0.000   0.000
TRINITY_DN112423_c0_g1  0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
TRINITY_DN125582_c0_g1  0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
TRINITY_DN59858_c0_g1   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
TRINITY_DN88946_c0_g1   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
TRINITY_DN68385_c0_g1   8.725   0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000   10.465

Adriana Fróes

Dec 20, 2016, 9:20:03 AM
to trinityrnaseq-users
Sorry for the mistake...I was using another version of Trinity.

Actually, I have now used the new version (also for assembling), and nothing happens. It just prints the usage message.

I used the following command line:
trinityrnaseq-Trinity-v2.3.1_PRERELEASE/util/filter_low_expr_transcripts.pl --matrix matrix.TMM.EXPR.matrix --transcripts Trinity.bac_arc.fasta --trinity_mode

After running the command, only the usage message appeared, like this:

##########################################################################################
#
#  --matrix|m <string>            expression matrix (TPM or FPKM, *not* raw counts)
#
#  --transcripts|t <string>       transcripts fasta file (eg. Trinity.fasta)
#
#
#  # expression level filter:
#
#     --min_expr_any <float>      minimum expression level required across any sample (default: 0)
#
#  # Isoform-level filtering
#
#     --min_pct_dom_iso <int>         minimum percent of dominant isoform expression (default: 0)
#          or
#     --highest_iso_only          only retain the most highly expressed isoform per gene (default: off)
#                                 (mutually exclusive with --min_pct_iso param)
#
#     # requires gene-to-transcript mappings
#
#     --trinity_mode              targets are Trinity-assembled transcripts
#         or
#     --gene_to_trans_map <string>   file containing gene-to-transcript mappings
#                                    (format is:   gene(tab)transcript )
#
#########################################################################################

What is missing?

Thanks!

B

Adriana



Mark Chapman

Dec 20, 2016, 10:02:24 AM
to Adriana Fróes, trinityrnaseq-users
Hi Adriana,
It doesn't look like you've told it what to filter, i.e. you need to set either or both of
--min_expr_any
and
--min_pct_dom_iso or --highest_iso_only
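To make the expression filter concrete, here is a small Python sketch of what a "minimum expression in any sample" rule means. It is not the code of filter_low_expr_transcripts.pl: the function name is hypothetical, the rows are shortened to three samples from the matrix posted earlier in the thread, and whether the real script compares with "at least" or "strictly greater than" is an assumption.

```python
def filter_min_expr_any(matrix, min_expr_any=0.0):
    """Keep rows whose expression reaches the threshold in at least one sample.

    Assumption: 'reaches' means >=; the real script may use a strict comparison.
    """
    return {acc: values for acc, values in matrix.items()
            if any(v >= min_expr_any for v in values)}


# Rows abbreviated (three samples only) from the matrix shown in this thread.
rows = {
    "TRINITY_DN168672_c0_g1": [0.0, 0.0, 0.0],
    "TRINITY_DN12228_c0_g1": [0.0, 7.070, 0.0],
    "TRINITY_DN8781_c0_g2": [13.468, 0.0, 0.0],
}

print(sorted(filter_min_expr_any(rows, 10)))  # rows kept at a threshold of 10
print(sorted(filter_min_expr_any(rows, 1)))   # rows kept at a threshold of 1
```

With the default threshold of 0, every row passes in this sketch, so nothing would be removed unless one of the filter options is set.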
Best, Mark


Adriana Fróes

Dec 20, 2016, 10:48:44 AM
to Mark Chapman, trinityrnaseq-users
Hi Mark,

You were right. I thought these options were optional.
Now I used this command line:

trinityrnaseq-Trinity-v2.3.1_PRERELEASE/util/filter_low_expr_transcripts.pl --matrix matrix.TMM.EXPR.matrix --transcripts Trinity.bac_arc.fasta --min_expr_any 1 --trinity_mode

Now another error appears:
Error, no expression record stored for acc: [TRINITY_DN328_c0_g12] at ../trinityrnaseq-Trinity-v2.3.1_PRERELEASE/util/filter_low_expr_transcripts.pl line 119, <$filehandle> line 513.

It seems this sequence is not in Trinity.fasta...is that correct?

Thanks
Adriana



Brian Haas

Dec 20, 2016, 10:54:20 AM
to Adriana Fróes, Mark Chapman, trinityrnaseq-users
Hi Adriana,

You need to give it the transcript expression matrix, not the gene expression matrix.
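A quick way to tell the two matrices apart is to look at the row names. In the examples posted in this thread, transcript (isoform) accessions end in "_i" plus a number (e.g. TRINITY_DN94_c0_g1_i1), while gene accessions stop at "_g" plus a number (e.g. TRINITY_DN168672_c0_g1). The Python sketch below relies on that naming pattern as an assumption; the function name is hypothetical and it is not part of Trinity.

```python
import re

# Assumption: isoform accessions end in "_i<number>", as in the IDs shown in this thread.
ISOFORM_SUFFIX = re.compile(r"_i\d+$")


def matrix_level(row_ids):
    """Classify matrix row names as 'isoform', 'gene', or 'mixed'."""
    ids = list(row_ids)
    if not ids:
        raise ValueError("no row identifiers given")
    n_isoform = sum(1 for rid in ids if ISOFORM_SUFFIX.search(rid))
    if n_isoform == len(ids):
        return "isoform"
    if n_isoform == 0:
        return "gene"
    return "mixed"


print(matrix_level(["TRINITY_DN168672_c0_g1", "TRINITY_DN8781_c0_g2"]))
print(matrix_level(["TRINITY_DN94_c0_g1_i1", "TRINITY_DN21772_c0_g1_i2"]))
```

If the matrix passed to --matrix is classified as "gene" here while the FASTA headers are isoform accessions, the two files do not describe the same features.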

best,

~b

--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

 