Data of restriction enzyme DpnII


Paula Soler

Jan 20, 2016, 11:40:01 AM

to gen...@soe.ucsc.edu

To whom it may concern,

I am Paula Soler, a PhD student from Spain, and I would like to know how I can download the restriction enzyme data from your tracks in the "Mapping and Sequencing" group.

Thank you for your help.

Paula Soler Vila

PhD Student

Centre Nacional d'Análisi Genòmica(CNAG-CRG)

Centre de Regulació Genòmica

Parc Científic de Barcelona – Torre I

Baldiri Reixac, 4

08028 Barcelona

E-mail: paula...@cnag.crg.eu

Cath Tyner

Jan 21, 2016, 7:44:35 PM

to Paula Soler, UCSC Genome Browser Public Help Forum

Hello Paula,

Thank you for using the UCSC Genome Browser and for submitting your question regarding the possibility of downloading the "Restriction Enzymes" Mapping and Sequencing track (from REBASE). The actual mappings of the sequences to the genome are generated "on the fly," and therefore these data are not downloadable from the Table Browser. You can generate these coordinates for whole-genome or whole-chromosome yourself by following Option 2 below.

Option 1. View in the Genome Browser & export a coordinate range within a chromosome

Note: This option provides output limited to the genomic range you have selected in the browser.

Using the Genome Browser, you can select a coordinate range to view enzymes in the "Restr Enzymes" track under the "Mapping and Sequencing" track group. Be aware that zooming out shows fewer enzymes (see the track description page for details). Once your region is selected, click on any enzyme in the browser graphic. The page that opens offers a download of a BED-formatted file of genomic coordinates, either for all enzymes in the range or filtered to one or more enzymes of your choice.

This video explains how to get BED file output from the browser for restriction enzymes:

http://blog.openhelix.eu/?p=15185

Example rows from BED file:

#chrom chromStart chromEnd name score strand
chr21 33034807 33034811 DpnII 1000 +
chr21 33035103 33035107 DpnII 1000 +
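
Once you have a BED file like this, it can be checked with a few lines of Python. The following is a minimal sketch, not part of the Genome Browser tooling: the names `BED_TEXT` and `parse_bed` are hypothetical, and the two rows are copied from the example above rather than read from a real download. It relies on BED coordinates being 0-based and half-open, so `chromEnd - chromStart` equals the length of the cut site.

```python
from collections import Counter

# Sample rows copied from the BED example above (assumption: a real
# export would be read from disk instead of from this string).
BED_TEXT = """#chrom chromStart chromEnd name score strand
chr21 33034807 33034811 DpnII 1000 +
chr21 33035103 33035107 DpnII 1000 +
"""


def parse_bed(text):
    """Yield one dict per BED row, skipping comment and blank lines."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom, start, end, name, score, strand = line.split()[:6]
        yield {
            "chrom": chrom,
            "start": int(start),
            "end": int(end),
            "name": name,
            "score": int(score),
            "strand": strand,
        }


# Keep only DpnII rows, then summarise them.
sites = [row for row in parse_bed(BED_TEXT) if row["name"] == "DpnII"]
per_chrom = Counter(row["chrom"] for row in sites)
site_lengths = {row["end"] - row["start"] for row in sites}

print(per_chrom)      # number of sites per chromosome
print(site_lengths)   # distinct site lengths in bases
```

Both rows span 4 bases, which matches the 4-base DpnII recognition sequence GATC.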

Option 2. Use the command line utility "oligoMatch" to find perfect sequence matches

oligoMatch is a command-line utility that finds every perfect match of a short sequence (e.g., a restriction enzyme (RE) cutting site) in a reference sequence. For example, if your RE cutting site sequence is "GATC", then oligoMatch will find that exact sequence, searching both strands by default. To do this, you will need the following:

The oligoMatch utility
Your RE sequence (e.g., GATC)
Your reference sequence

Both sequence files must be in one of these file formats: FASTA (.fa) or .2bit.
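
To make "perfect match on both strands" concrete, here is a small Python sketch of the idea. It is an illustration only and not how oligoMatch itself is implemented; `reverse_complement`, `find_sites`, and `toy_reference` are hypothetical names. As a choice made in this sketch, a palindromic site such as GATC (which equals its own reverse complement) is reported once, on the + strand.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def find_sites(reference, site):
    """Return (start, end, strand) for every exact match.

    Coordinates are 0-based and half-open, as in BED files.
    """
    reference, site = reference.upper(), site.upper()
    rc = reverse_complement(site)
    hits = []
    for start in range(len(reference) - len(site) + 1):
        window = reference[start:start + len(site)]
        if window == site:
            hits.append((start, start + len(site), "+"))
        elif window == rc:
            # Only reachable for non-palindromic sites, where rc != site.
            hits.append((start, start + len(site), "-"))
    return hits


# Toy reference sequence (made up for this example).
toy_reference = "AAGATCTTGATCGG"
print(find_sites(toy_reference, "GATC"))
```

On a whole chromosome this pure-Python scan would be slow, so it is best treated as a way to understand or spot-check the output rather than as a replacement for the utility.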

Step A. Obtain your RE sequence

1. In the Table Browser, set the following fields:

clade: Mammal

genome: Human

assembly: Select whichever human assembly you would like. I will use GRCh37/hg19 for this example.

group: All Tables

database: hgFixed

table: hgFixed.cutters Note: You can click on "describe table schema" to see the table description (enzyme cut site information).

filter: click "edit", then in the first row set name "does" match "DpnII" and click "submit".

output format: selected fields from primary and related tables

output file: leave blank

2. Click "get output" to choose fields from the hgFixed.cutters table

Select the checkboxes for "name" and "seq".

3. Click "get output".

4. Example output:

#filter: cutters.name = 'DpnII'
#name seq
DpnII GATC

5. Convert this to a FASTA file:

>DpnII
GATC

For this example, we can call this file "RE.fa"
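
If you select more than one enzyme, converting by hand gets tedious. The conversion can be sketched in Python as below; `TABLE_OUTPUT` and `table_to_fasta` are hypothetical names, and the sample text is the example output shown above, assumed to be whitespace- or tab-separated with "#" marking header lines.

```python
# Example Table Browser output from above (assumption: columns are
# separated by tabs or spaces, and header lines start with "#").
TABLE_OUTPUT = """#filter: cutters.name = 'DpnII'
#name\tseq
DpnII\tGATC
"""


def table_to_fasta(text):
    """Turn "name seq" rows into FASTA records, skipping '#' lines."""
    records = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        name, seq = line.split()[:2]
        records.append(f">{name}\n{seq}\n")
    return "".join(records)


fasta = table_to_fasta(TABLE_OUTPUT)
print(fasta, end="")

# To save the result under the file name used in this example:
#     with open("RE.fa", "w") as handle:
#         handle.write(fasta)
```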

Step B: Obtain your reference sequence as the appropriate file type (.fa or .2bit). For this example, we will name this file "chr1.2bit".

Download full data sets or data by chromosome here: http://hgdownload.soe.ucsc.edu/downloads.html

Step C: Install the oligoMatch utility

We provide pre-compiled binaries of the oligoMatch utility in http://hgdownload.soe.ucsc.edu/admin/exe/. If the pre-compiled versions aren't compatible with your system, you can compile from the source code in "userApps.src.tgz" in that directory. You may want to review the online README for building userApps.

Step D: Run the utility

For example:

$ oligoMatch RE.fa chr1.2bit oligoOutput.bed
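
If you prefer to drive this step from a script, the same call can be wrapped in Python. This is only a sketch under the assumption that the oligoMatch binary from Step C is on your PATH; `build_command` and `run_oligo_match` are hypothetical helper names, and the three arguments are exactly those of the command above.

```python
import shutil
import subprocess


def build_command(oligos, reference, output):
    """Return the argument list for the oligoMatch call shown above."""
    return ["oligoMatch", oligos, reference, output]


def run_oligo_match(oligos="RE.fa", reference="chr1.2bit",
                    output="oligoOutput.bed"):
    """Run oligoMatch if it is on PATH; return True if it ran."""
    if shutil.which("oligoMatch") is None:
        print("oligoMatch not found on PATH; see Step C")
        return False
    # check=True raises CalledProcessError if oligoMatch exits non-zero.
    subprocess.run(build_command(oligos, reference, output), check=True)
    return True


print(" ".join(build_command("RE.fa", "chr1.2bit", "oligoOutput.bed")))
```

Calling `run_oligo_match()` with no arguments uses the file names from this example and writes the matches to "oligoOutput.bed".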

Thank you again for your inquiry and for using the UCSC Genome Browser. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Enjoy,

Cath

Cath Tyner

UC Santa Cruz Genomics Institute

UCSC Genome Browser: Public Help Forum, Suggestions, Contact
