Re: [genome] RepeatMasker BED track

Hiram Clawson

Oct 3, 2022, 1:03:32 PM
to Babak Alipanahi, gen...@soe.ucsc.edu, genome...@soe.ucsc.edu
Good Morning Babak:

Please note, the repeat masker track data MySQL table is
not a bed format track.

$ hgsql -e 'desc rmsk;' hg38
+-----------+----------------------+------+-----+---------+-------+
| Field     | Type                 | Null | Key | Default | Extra |
+-----------+----------------------+------+-----+---------+-------+
| bin       | smallint(5) unsigned | NO   |     | NULL    |       |
| swScore   | int(10) unsigned     | NO   |     | NULL    |       |
| milliDiv  | int(10) unsigned     | NO   |     | NULL    |       |
| milliDel  | int(10) unsigned     | NO   |     | NULL    |       |
| milliIns  | int(10) unsigned     | NO   |     | NULL    |       |
| genoName  | varchar(255)         | NO   | MUL | NULL    |       |
| genoStart | int(10) unsigned     | NO   |     | NULL    |       |
| genoEnd   | int(10) unsigned     | NO   |     | NULL    |       |
| genoLeft  | int(11)              | NO   |     | NULL    |       |
| strand    | char(1)              | NO   |     | NULL    |       |
| repName   | varchar(255)         | NO   |     | NULL    |       |
| repClass  | varchar(255)         | NO   |     | NULL    |       |
| repFamily | varchar(255)         | NO   |     | NULL    |       |
| repStart  | int(11)              | NO   |     | NULL    |       |
| repEnd    | int(11)              | NO   |     | NULL    |       |
| repLeft   | int(11)              | NO   |     | NULL    |       |
| id        | char(1)              | NO   |     | NULL    |       |
+-----------+----------------------+------+-----+---------+-------+

The shading of the items in the repeat masker track is calculated
from the percent ID:

percId = 1000 - ro.milliDiv - ro.milliDel - ro.milliIns;
grayLevel = grayInRange(percId, 500, 1000);
col = shadesOfGray[grayLevel];

It is not shaded from the swScore:

$ hgsql -e 'SELECT MIN(swScore),MAX(swScore) FROM rmsk;' hg38
+--------------+--------------+
| MIN(swScore) | MAX(swScore) |
+--------------+--------------+
|           11 |        75233 |
+--------------+--------------+

The swScore is the value that appears as the score in the downloaded
bed file, which actually violates the rules of a bed file, where the
score should only be in the range 0-1000.
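As a rough illustration of the point above, the sketch below derives a
score in the range 0-1000 from the milliDiv, milliDel and milliIns
columns instead of from swScore, using the same arithmetic as the percId
line quoted earlier. The helper names clampInt() and bedScoreFromRmsk()
are hypothetical and are not part of the kent source; this is only one
possible way to produce a bed-compliant score, not what the download
tools actually do.

```c
#include <assert.h>

/* Clamp v into the closed interval [lo, hi]. */
int clampInt(int v, int lo, int hi)
{
    assert(lo <= hi);
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}

/* Hypothetical helper (not in the kent source): compute a 0-1000 score
 * from the rmsk milli* columns, using the percId arithmetic from the
 * thread and clamping the result to the bed score range. */
int bedScoreFromRmsk(int milliDiv, int milliDel, int milliIns)
{
    int percId = 1000 - milliDiv - milliDel - milliIns;
    return clampInt(percId, 0, 1000);
}
```

For example, a row with milliDiv=180, milliDel=20 and milliIns=10 would
get a score of 790 under this scheme.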

You can obtain the repeat masker output file, which is the source
for the repeat masker track data, from the download server URL:

https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.out.gz

--Hiram

On 10/2/22 11:48 AM, Babak Alipanahi wrote:
> Hello,
>
> I have two questions:
>
> 1. When downloading RepeatMasker, the BED file and the TABLE file
> (selecting all fields) are different, see screenshot below, in which I have
> uploaded the downloaded BED file as a track:
>
> [image: Screen Shot 2022-10-01 at 4.53.54 PM.png]
>
> 2. How does UCSC Genome Browser assign SCORE to the repeats? As you can see
> above, what UCSC displays repeats with different shades, while in the
> supplied BED format, SW Scores are generally large (> 945), so all tracks
> are displayed in black. There must be a way for the Browser to normalize
> the SW score.
>
> Thanks,
> Babak

Babak Alipanahi

Oct 3, 2022, 6:53:04 PM
to Hiram Clawson, gen...@soe.ucsc.edu, genome...@soe.ucsc.edu
Thanks so much, Hiram! This is very helpful.

Where can I find the definitions for grayInRange() and shadesOfGray[]?

Babak

Hiram Clawson

Oct 3, 2022, 7:18:58 PM
to Babak Alipanahi, gen...@soe.ucsc.edu, genome...@soe.ucsc.edu
Good Afternoon Babak:


See also source code files for these definitions and functions:

https://genome-source.gi.ucsc.edu/gitlist/kent.git/raw/master/src/hg/lib/hgColors.c
https://genome-source.gi.ucsc.edu/gitlist/kent.git/raw/master/src/hg/hgTracks/sampleTracks.c

Color shadesOfGray[10+1]; /* 10 shades of gray from white to black
* Red is put at end to alert overflow. */

void makeGrayShades(struct hvGfx *hvg)
/* Make eight shades of gray in display. */
{
hMakeGrayShades(hvg, shadesOfGray, maxShade);
shadesOfGray[maxShade+1] = MG_RED;
}

void hMakeGrayShades(struct hvGfx *hvg, Color *shades, int maxShade)
/* Make up gray scale with 0 = white, and maxShade = black.
 * Shades needs to have maxShade+1 colors. */
{
int i;
for (i=0; i<=maxShade; ++i)
    {
    struct rgbColor rgb;
    int level = 255 - (255*i/maxShade);
    if (level < 0) level = 0;
    rgb.r = rgb.g = rgb.b = level;
    shades[i] = hvGfxFindRgb(hvg, &rgb);
    }
}

int grayInRange(int val, int minVal, int maxVal)
/* Return gray shade corresponding to a number from minVal - maxVal */
{
return hGrayInRange(val, minVal, maxVal, maxShade);
}


int hGrayInRange(int oldVal, int oldMin, int oldMax, int newMax)
/* Return oldVal, which lies between oldMin and oldMax, to
* equivalent number between 1 and newMax. The way this does it
* is perhaps a little odd, forcing 0 go to 1, but visually it works
* out nicely when 0 is white. */
{
int range = oldMax - oldMin;
int newVal = ((oldVal-oldMin)*newMax + (range>>1))/range;
if (newVal <= 0) newVal = 1;
if (newVal > newMax) newVal = newMax;
return newVal;
}
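
To see how these pieces fit together for a single repeat masker row,
here is a small self-contained sketch. hGrayInRange() is copied from the
source quoted above; rmskGrayLevel() and grayValue() are hypothetical
helpers written for this example, and MAX_SHADE = 9 is an assumption
inferred from the "Color shadesOfGray[10+1]" declaration (ten shades at
indices 0-9, with red in the extra slot), so check the kent source for
the actual value of maxShade.

```c
#include <assert.h>

/* Assumed value of maxShade, inferred from shadesOfGray[10+1]. */
enum { MAX_SHADE = 9 };

/* Copied from the kent source quoted in the thread. */
int hGrayInRange(int oldVal, int oldMin, int oldMax, int newMax)
{
    int range = oldMax - oldMin;
    int newVal = ((oldVal-oldMin)*newMax + (range>>1))/range;
    if (newVal <= 0) newVal = 1;
    if (newVal > newMax) newVal = newMax;
    return newVal;
}

/* Hypothetical helper: shade index for one rmsk row, following the
 * percId and grayInRange(percId, 500, 1000) lines from the thread. */
int rmskGrayLevel(int milliDiv, int milliDel, int milliIns)
{
    int percId = 1000 - milliDiv - milliDel - milliIns;
    return hGrayInRange(percId, 500, 1000, MAX_SHADE);
}

/* Hypothetical helper: 8-bit gray value for a shade index, using the
 * same formula as the loop in hMakeGrayShades (0 = black, 255 = white). */
int grayValue(int level, int maxShade)
{
    assert(maxShade > 0);
    int value = 255 - (255*level/maxShade);
    return value < 0 ? 0 : value;
}
```

Under these assumptions, a row with milliDiv=180, milliDel=20 and
milliIns=10 has percId 790, which maps to shade 5 and a gray value of
114. A perfect match (percId 1000) maps to shade 9, which is black, and
anything at or below percId 500 is forced to shade 1.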