Wrong genomic coordinates in axt Alignment Format


Akihiro KUNO

Feb 26, 2018, 1:27:47 PM
to gen...@soe.ucsc.edu
Dear UCSC Genome Informatics Group,

I am using axt alignment files from the UCSC data repository:
http://hgdownload.soe.ucsc.edu/goldenPath/mm9/vsHg19/axtNet/

However, the genomic coordinates seem to be wrong.
For example, the attached axt_file.txt includes one alignment comparing mm9 to hg19.
The mm9 genomic coordinate (chr1 3050202 3050698) is fine, but the hg19 coordinate
(chr8 89824042 89824491) may be wrong. chr8:56539781-56539981 is
correct.

In addition, the web page explaining the axt format contains the same problem.
https://genome.ucsc.edu/goldenPath/help/axt.html

It contains the following example:
0 chr19 3001012 3001075 chr11 70568380 70568443 - 3500
TCAGCTCATAAATCACCTCCTGCCACAAGCCTGGCCTGGTCCCAGGAGAGTGTCCAGGCTCAGA
TCTGTTCATAAACCACCTGCCATGACAAGCCTGGCCTGTTCCCAAGACAATGTCCAGGCTCAGA

I do not know what the reference genomes are, but if they are mm9 and
hg19, the genomic coordinates are not "chr19 3001012 3001075" and
"chr11 70568380 70568443", but "chr19:6889117-6889180" and
"chr11:64159468-64159531", respectively.

If my point is correct, I hope the problem will be solved soon.

Thank you in advance.

Sincerely,

Akihiro Kuno
Department of Anatomy and Embryology
The University of Tsukuba
axt_file.txt

Jairo Navarro Gonzalez

Feb 28, 2018, 3:01:15 PM
to Akihiro KUNO, UCSC Genome Browser Mailing List

Dear Akihiro,

Thank you for using the UCSC Genome Browser and for your inquiry.

The discrepancy you noticed is due to how we store the alignment coordinates. If you are unfamiliar with our coordinate system, please refer to the blog post "The UCSC Genome Browser Coordinate Counting Systems". The minus-strand coordinates in axt files are stored differently from how they are displayed in the Genome Browser. Specifically,

  1. Coordinate positions are stored with respect to the positive strand.
  2. Start position, chromStart, is 0-based (add 1 to get the actual base position) and the chromEnd is 1-based.
  3. If the strand is negative, the start position is the chromEnd, and the end position is the chromStart.

According to the axt format page, http://genome.ucsc.edu/goldenPath/help/axt.html:

If the strand value is "-", the values of the aligning organism's start and end fields are relative to the reverse-complemented coordinates of its chromosome.

You can use the following formula to convert minus-strand axt coordinates to Genome Browser display (position format) coordinates:

start = (chromSize - axtEnd) + 1 
end   = chromSize - (axtStart - 1) 
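
As a minimal sketch, the display-coordinate formula above could be written in Python as follows. The function name is hypothetical, and the code assumes that axt start and end values are 1-based and inclusive, as the axt format page describes.

```python
from typing import Tuple


def axt_minus_to_display(chrom_size: int, axt_start: int, axt_end: int) -> Tuple[int, int]:
    """Convert minus-strand axt coordinates to plus-strand display coordinates.

    Assumption: axt_start and axt_end are 1-based, inclusive positions counted
    on the reverse-complemented chromosome of length chrom_size.
    """
    if not 1 <= axt_start <= axt_end <= chrom_size:
        raise ValueError("expected 1 <= axt_start <= axt_end <= chrom_size")
    # The last aligned base on the reverse strand is the first on the plus strand.
    start = (chrom_size - axt_end) + 1
    end = chrom_size - (axt_start - 1)
    return start, end


# Toy check on a 10 bp chromosome: the first three bases of the reverse
# complement are the last three bases of the plus strand.
toy = axt_minus_to_display(10, 1, 3)
```

The conversion preserves the alignment length, which is a quick sanity check when applying it to real data.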

For BED format coordinates, use the following formula:

start = chromSize - axtEnd
end   = chromSize - (axtStart - 1)

In this case, the size of chr8 in hg19 is 146,364,022 bp, so for the axt coordinates chr8 89824042 89824491, the conversion to display coordinates is:

Start = (146,364,022 - 89,824,491) + 1 = 56,539,532
End   = 146,364,022 - (89,824,042 - 1) = 56,539,981

which gives the position chr8:56,539,532-56,539,981.
The end coordinate matches the position you shared, chr8:56,539,781-56,539,981; only the start differs.
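
For checking other alignments the same way, here is a rough Python sketch that parses an axt summary line and reports the aligning sequence in display coordinates. The class and function names are hypothetical, and the alignment number and score in the sample line are placeholders because the attached file is not reproduced in this thread. It assumes 1-based, inclusive axt coordinates and uses the display-coordinate formula given earlier.

```python
from dataclasses import dataclass

HG19_CHR8_SIZE = 146_364_022  # chr8 size in hg19, as quoted in this reply


@dataclass
class AxtHeader:
    number: int
    t_chrom: str   # primary (target) organism, always plus strand
    t_start: int
    t_end: int
    q_chrom: str   # aligning (query) organism
    q_start: int
    q_end: int
    strand: str    # strand of the aligning organism
    score: int


def parse_axt_header(line: str) -> AxtHeader:
    """Split the nine whitespace-separated fields of an axt summary line."""
    f = line.split()
    if len(f) != 9:
        raise ValueError("an axt summary line should have 9 fields")
    return AxtHeader(int(f[0]), f[1], int(f[2]), int(f[3]),
                     f[4], int(f[5]), int(f[6]), f[7], int(f[8]))


def aligning_display_position(h: AxtHeader, q_chrom_size: int) -> str:
    """Return the aligning sequence as a chrom:start-end display position."""
    if h.strand == "-":
        # Minus strand: flip from reverse-complemented to plus-strand coordinates.
        start = (q_chrom_size - h.q_end) + 1
        end = q_chrom_size - (h.q_start - 1)
    else:
        start, end = h.q_start, h.q_end
    return f"{h.q_chrom}:{start}-{end}"


# Placeholder alignment number (0) and score (0); coordinates are from the question.
header = parse_axt_header("0 chr1 3050202 3050698 chr8 89824042 89824491 - 0")
position = aligning_display_position(header, HG19_CHR8_SIZE)
print(position)
```

Chromosome sizes for other assemblies would need to be supplied by the caller, for example from the assembly's chrom.sizes file.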

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu.
All messages sent to that address are archived on a publicly-accessible Google Groups forum.
If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Jairo Navarro 
UCSC Genomics Institute
