Mouse Fusion detection


ninn...@gmail.com

Jun 7, 2016, 7:45:54 AM
to STAR-Fusion
Dear all,

I would like to use the nice STAR-Fusion tool on mouse data.
For human, I have used this as the resource library:
https://data.broadinstitute.org/Trinity/CTAT_RESOURCE_LIB/
A mouse version does not seem to be available there. Have you built one but not published it, or should I build it myself?
If I should build it myself, how? :)

Thanks a lot for your help!



Brian Haas

Jun 7, 2016, 8:00:44 AM
to ninn...@gmail.com, STAR-Fusion
Hi,

You can build a resource library for mouse by following the protocol here:


See the section under 'building a custom Fusionfilter Dataset'

best,

~brian


--
You received this message because you are subscribed to the Google Groups "STAR-Fusion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to star-fusion...@googlegroups.com.
To post to this group, send email to star-...@googlegroups.com.
Visit this group at https://groups.google.com/group/star-fusion.
To view this discussion on the web visit https://groups.google.com/d/msgid/star-fusion/8c1b4af8-8c3a-4c7f-b604-b052c221c873%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

 

hideldi...@gmail.com

Jun 13, 2016, 3:15:13 AM
to STAR-Fusion
Hello,

I would also like to use STAR-Fusion on mouse data, but I am having trouble getting RepeatMasker to run so that I can build my own mouse database.
Could anybody upload the repeat-masked cDNA file for mm10?

Kind Regards,

Christoph

ninn...@gmail.com

Jun 16, 2016, 10:10:59 AM
to STAR-Fusion
Hi Christoph,

You can find this on the UCSC webpage:

Availability of repeat-masked data

Question:
"Are the repeat annotation files available for every chromosome?"

Response:
Yes, you can obtain the repeat-masked files via the Table Browser or from the organism's annotation database downloads directory. The RepeatMasker annotation tables are named chrN_rmsk (where N represents the chromosome number) and the Tandem Repeat Finder (TRF) tables are named simpleRepeat.




RepeatMasker version differences - UCSC vs. RepeatMasker website

Question:
"When I run RepeatMasker independently from the RepeatMasker web server, my results vary from those of UCSC. What's the cause?"

Response:
UCSC occasionally uses updated versions of the RepeatMasker software and repeat libraries that are not yet available on the RepeatMasker website (see Repeat-masking data for more information).




Maybe this can help you further.

Best Nina


On Tuesday, June 7, 2016 at 13:45:54 UTC+2, ninn...@gmail.com wrote:

Brian Haas

Jun 16, 2016, 10:34:21 AM
to ninn...@gmail.com, STAR-Fusion
I can try to build a mouse version.  Which mouse gencode release would be best for everyone?

Note, I don't have a test-bed for mouse, so this would all be for exploratory purposes.

~b


hideldi...@gmail.com

Jun 16, 2016, 10:53:07 AM
to STAR-Fusion
Thank you both!

Brian, building a mouse version would be much appreciated.
I'd prefer the current Gencode release for mouse (M9).
Thank you again!

Regards

Christoph


On Tuesday, June 7, 2016 at 13:45:54 UTC+2, ninn...@gmail.com wrote:

Brian Haas

Jun 16, 2016, 11:25:29 AM
to hideldi...@gmail.com, STAR-Fusion
You got it. I'll kick off the build shortly and see how it goes.


~b


Brian Haas

Jun 18, 2016, 5:49:36 PM
to hideldi...@gmail.com, STAR-Fusion
Here's a build for mouse using M9/gencode


All I did was build it, though; I haven't done any testing. I don't yet have a set of mouse fusion transcripts to verify it against.

Please let me know how it goes.

best,

~b

Mahesh Vangala

Nov 2, 2016, 1:33:44 PM
to STAR-Fusion
Hello Brian -

I am trying to build the STAR-Fusion reference files from the UCSC mm9 GTF and FASTA files. When generating cDNA_seqs.fa with the command below, I get the following error:

/zfs/cores/mbcf/mbcf-storage/devel/umv/software/miniconda3/lib/STAR-Fusion/FusionFilter/util/gtf_file_to_cDNA_seqs.pl /zfs/cores/mbcf/mbcf-storage/devel/umv/ref_files/mouse/Mus_musculus/UCSC/mm9/Annotation/Genes/genes.gtf /zfs/cores/mbcf/mbcf-storage/devel/umv/ref_files/mouse/Mus_musculus/UCSC/mm9/Sequence/WholeGenomeFasta/genome.fa 1> cDNA_seqs.fa
Error, no seek pos for acc: chr13_random at /zfs/cores/mbcf/mbcf-storage/devel/umv/software/miniconda3/lib/STAR-Fusion/FusionFilter/util/../lib/Fasta_retriever.pm line 71, <$fh> line 66909760.
        Fasta_retriever::get_seq(Fasta_retriever=HASH(0x26434c0), "chr13_random") called at /zfs/cores/mbcf/mbcf-storage/devel/umv/software/miniconda3/lib/STAR-Fusion/FusionFilter/util/gtf_file_to_cDNA_seqs.pl line 54

Let me know your thoughts.

Thanks,
Mahesh

Mahesh Vangala

Nov 2, 2016, 1:41:38 PM
to STAR-Fusion
Brian -

Ignore my question above.
It seems my GTF file contains annotations on chr13_random, but my FASTA file does not contain the chr13_random sequence.
That did it.
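
A mismatch like this can be spotted before running the build by comparing the contig names in the two files. The following is only a minimal sketch, assuming plain-text GTF and FASTA input; the function names and the toy records are hypothetical and are not part of STAR-Fusion or FusionFilter.

```python
def fasta_contig_names(fasta_lines):
    """Return the set of sequence names found on FASTA header lines.

    The name is taken as the text after '>' up to the first whitespace.
    """
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gtf_contig_names(gtf_lines):
    """Return the set of contig names used in column 1 of a GTF."""
    names = set()
    for line in gtf_lines:
        line = line.rstrip("\n")
        # Skip blank lines and '#' comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def contigs_missing_from_fasta(gtf_lines, fasta_lines):
    """Contigs annotated in the GTF that have no sequence in the FASTA."""
    return sorted(gtf_contig_names(gtf_lines) - fasta_contig_names(fasta_lines))


# Toy input standing in for the real GTF and FASTA files (made-up records).
toy_fasta = [">chr1\n", "ACGTACGT\n", ">chr13\n", "GGCCGGCC\n"]
toy_gtf = [
    'chr1\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'chr13_random\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "g2"; transcript_id "t2";\n',
]
missing = contigs_missing_from_fasta(toy_gtf, toy_fasta)
print(missing)  # ['chr13_random']
```

Any name printed here is a contig that the GTF refers to but the FASTA lacks, which is the situation behind the "no seek pos for acc" error above.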

Thanks,
Mahesh

David Zhang

Jun 27, 2017, 1:46:12 PM
to STAR-Fusion

Brian & Mahesh,

I ran into the same issue:
Error, no seek pos for acc: chr1_GL456211_random at /opt/Anaconda2-4.3.0/lib/STAR-Fusion/FusionFilter/util/../lib/Fasta_retriever.pm line 71, <$fh> line 79772306.
        Fasta_retriever::get_seq(Fasta_retriever=HASH(0x1309e28), "chr1_GL456211_random") called at /opt/Anaconda2-4.3.0/lib/STAR-Fusion/FusionFilter/util/gtf_file_to_cDNA_seqs.pl line 54

How did you solve the problem?

Thanks,

David

Brian Haas

Jun 27, 2017, 2:41:25 PM
to David Zhang, STAR-Fusion
Hi

Just make sure the reference annotation (gtf) file that you're using is restricted to those features that map to contigs in your genome.fasta file.
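
One way to apply this restriction is to drop every GTF record whose first column names a contig that is absent from the genome FASTA. This is a rough sketch under that assumption, not a utility shipped with STAR-Fusion; the function names and the toy records are hypothetical.

```python
def read_contig_names(fasta_lines):
    """Collect sequence names from FASTA headers (text after '>' up to whitespace)."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def restrict_gtf_to_contigs(gtf_lines, contigs):
    """Yield GTF lines whose contig (column 1) is present in `contigs`.

    Comment lines starting with '#' are passed through unchanged;
    blank lines are dropped.
    """
    for line in gtf_lines:
        if line.startswith("#"):
            yield line
            continue
        if not line.strip():
            continue
        if line.split("\t", 1)[0] in contigs:
            yield line


# Toy input standing in for the genome FASTA and annotation GTF (made-up records).
toy_genome = [">chr1 example description\n", "ACGTACGT\n", ">chr2\n", "GGCCGGCC\n"]
toy_gtf = [
    "#example header comment\n",
    'chr1\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "g1"; transcript_id "t1";\n',
    'chr1_GL456211_random\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "g2"; transcript_id "t2";\n',
]

kept = list(restrict_gtf_to_contigs(toy_gtf, read_contig_names(toy_genome)))
for line in kept:
    print(line, end="")
```

With real data, open file handles can be passed in place of the lists, since both functions only iterate over lines; the kept lines would then be written to a new GTF file that is used for the build.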

best,

~brian
