Full-Length LINE1 elements for hg38


r.pyt...@gmail.com

Aug 22, 2018, 5:22:06 AM
to MELT Help
Hi,

I'm using MELT and, as part of my project, I need to find 3' transduction events caused by MEIs. Since my BAM files were generated using hg38 as the reference genome, I need a BED file of full-length L1s for hg38 to be able to use the transduction tool. I'm not sure how to obtain such a file.

So, any help would be appreciated.

Thanks,
Jakob


egar...@umaryland.edu

Aug 22, 2018, 10:01:46 AM
to MELT Help
Hello Jakob,

MELT includes a tool to create such a file, 'Source'.

As input it requires:
  1. A BED file (-bed) of reference L1 elements. You can get this by downloading the RepeatMasker track from UCSC and running:
    grep L1HS repeatmasker.bed
    to get a list of all L1HS elements in the genome
  2. The VCF file (-vcf) from your own MELT analysis
  3. A length (-length) -- for the MELT paper we used 5900
The 'Source' tool will take care of the rest (i.e., selecting only L1 elements >5900 bp in length from both the BED and VCF files) and output a file with the suffix *.source that you can use as input to 'TransductionFind'. Let me know if I can be of any other assistance.
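
For anyone who wants to check how many reference elements would pass the length cutoff before running 'Source', here is a minimal Python sketch. It is not part of MELT and is not needed for 'Source' to work. It assumes a tab-separated BED file with the repeat name in column 4 (as in a typical UCSC RepeatMasker export); the function name `filter_full_length_l1hs` and the example coordinates are made up for illustration.

```python
import io


def filter_full_length_l1hs(bed_lines, name="L1HS", min_length=5900):
    """Return BED rows named `name` whose span (end - start) exceeds min_length.

    Assumes tab-separated BED with the repeat name in column 4.
    """
    kept = []
    for line in bed_lines:
        row = line.rstrip("\n").split("\t")
        # Skip blank lines, comments, and track/browser header lines.
        if len(row) < 4 or row[0].startswith(("#", "track", "browser")):
            continue
        start, end, rep_name = int(row[1]), int(row[2]), row[3]
        if rep_name == name and end - start > min_length:
            kept.append(row)
    return kept


# Made-up example records, not real hg38 coordinates.
example = io.StringIO(
    "chr1\t1000\t7050\tL1HS\t0\t+\n"     # 6050 bp: kept
    "chr1\t20000\t21200\tL1HS\t0\t-\n"   # 1200 bp: truncated copy, dropped
    "chr2\t5000\t11100\tL1PA2\t0\t+\n"   # different subfamily, dropped
)
full_length = filter_full_length_l1hs(example)
for row in full_length:
    print("\t".join(row))
```

To run it on a real file, pass an open file handle instead of the example, e.g. `filter_full_length_l1hs(open("repeatmasker.bed"))`. The strict `>` comparison mirrors the ">5900 bp" wording above; whether 'Source' treats the cutoff the same way is an assumption.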

Best,

Eugene

Robin J

Feb 4, 2021, 3:48:02 AM
to MELT Help

Hi Eugene,

Thank you for explaining where to find the RepeatMasker BED file for the transduction step.
Is "RepeatMasker annotations (bed files for human genome assemblies)" at https://repeatbrowser.ucsc.edu/data/ the one you suggest? It seems to be unavailable at the moment.

I'm also thinking about including more source elements, such as those in Table S9 of your published paper and some other sites reported by the pan-cancer paper.

Will that affect the false-positive rate?

Bests,
Jue

egar...@umaryland.edu

Feb 8, 2021, 4:13:02 AM
to MELT Help
Hello,

No, I get all coordinates via the UCSC genome browser.

Best,

Eugene

Robin J

Feb 8, 2021, 7:40:58 PM
to MELT Help
Hi Eugene,

Just to double-check, is this the resource you are suggesting?
I find 1556 sites for L1HS.
[Attachment: Screen Shot 2021-02-09 at 11.38.29 am.png]
Another question: the pan-cancer paper on L1 reports its source element sites, and those do not seem to be included in this BED file. Do you think I should also include them when doing the transduction step?

Bests,
Jue