Question about the syntenic alignment

Duo Xie

Aug 6, 2021, 12:07:42 PM
to gen...@soe.ucsc.edu

Dear Sir/Madam,

Thank you for reading. I have several questions about the syntenic alignments at UCSC.

1. How does the command netFilter -syn filter a net based on synteny? What is the difference between netSyntenic and netFilter -syn?

2. What is the difference between hg38.panTro6.net.axt.gz and hg38.panTro6.synNet.maf.gz, apart from the format? I found that these two files contain different numbers of alignment blocks.

zcat /hwfssz1/pub/database/hgdownload.cse.ucsc.edu/apache/htdocs/goldenPath/hg38/vsPanTro6/hg38.panTro6.net.axt.gz | grep -E "^[0-9]" | wc -l
157734

zcat hg38.panTro6.synNet.maf.gz | grep -E "^a" | wc -l
126953

Best,
Duo

Matthew Speir

Aug 11, 2021, 9:25:09 AM
to Duo Xie, UCSC Genome Browser Discussion List
Hello, Duo. 

Thank you for your question about syntenic alignments in the UCSC Genome Browser. 

For your first question, netSyntenic is the tool that adds synteny information to the net file itself. netFilter -syn then uses this synteny information to extract only those alignments with "syn" in the "type" column of the net format: http://genome.ucsc.edu/goldenPath/help/net.html
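
To see which types a net carries before any filtering, one can tally the "type" attribute on its fill lines. The following is a minimal Python sketch and not the netFilter implementation. It assumes the text net layout in which each fill line has seven positional fields followed by name/value pairs; the function name and the sample lines are made up for illustration.

```python
from collections import Counter

def count_fill_types(lines):
    """Tally the 'type' attribute of fill lines in net-format text."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields or fields[0] != "fill":
            continue  # skip 'net', 'gap', and blank lines
        # Assumption: 7 positional fields, then name/value pairs.
        attrs = dict(zip(fields[7::2], fields[8::2]))
        counts[attrs.get("type", "unknown")] += 1
    return counts

# Hypothetical sample lines, not taken from a real net file.
sample = """\
net chr1 1000000
 fill 100 5000 chr1 + 200 5100 id 1 score 90000 ali 4800 type top
  gap 600 50 chr1 + 700 40
   fill 610 30 chr7 - 900 30 id 2 score 1500 ali 30 type nonSyn
 fill 9000 2000 chr1 + 9100 2000 id 3 score 40000 ali 1900 qOver 0 qFar 10 qDup 0 type syn
""".splitlines()

print(count_fill_types(sample))
```

Running the same tally on a net before and after netFilter -syn would show which types the filter kept. A fill that reports "unknown" has no type attribute, which suggests the net has not been through netSyntenic.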

For your second question, the hg38.panTro6.net.axt.gz file is just the human/chimp net alignment in axt format, while the hg38.panTro6.synNet.maf.gz file is the net alignment filtered for synteny. The filtered synNet.maf file will have fewer alignment blocks, since some of those in the net.axt file may not be syntenic and are therefore filtered out of the synNet.maf file. The README in the hg38/vsPanTro6 directory has this to say about the two files:

  - hg38.panTro6.net.axt.gz: chained and netted alignments,
    i.e. the best chains in the Human genome, with gaps in the best
    chains filled in by next-best chains where possible.  The axt format is
    described in http://genome.ucsc.edu/goldenPath/help/axt.html .

  - hg38.panTro6.synNet.maf.gz - filtered net file for syntenic alignments
               only, in MAF format, see also, description of MAF format:
               http://genome.ucsc.edu/FAQ/FAQformat.html#format5
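
The block counts from the question can also be reproduced without the shell. This is a minimal Python sketch that mirrors the two grep patterns: an axt block is counted by its summary line, which starts with a digit, and a MAF block by its "a" line. The function names and the toy files are hypothetical and serve only to make the sketch runnable.

```python
import gzip
import os
import tempfile

def count_axt_blocks(path):
    """Count axt blocks: each block's summary line starts with a digit."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for line in handle if line[:1].isdigit())

def count_maf_blocks(path):
    """Count MAF blocks: each block starts with a line beginning with 'a'."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for line in handle if line.startswith("a"))

# Toy files for illustration only; the content is invented.
tmp = tempfile.mkdtemp()
axt_path = os.path.join(tmp, "toy.axt.gz")
maf_path = os.path.join(tmp, "toy.maf.gz")

with gzip.open(axt_path, "wt") as out:
    out.write("0 chr1 100 104 chr1 200 204 + 500\nACGTA\nACGTA\n\n")
    out.write("1 chr1 300 304 chr2 50 54 - 400\nGGCTA\nGGCTA\n\n")

with gzip.open(maf_path, "wt") as out:
    out.write("##maf version=1\n")
    out.write("a score=500.0\n")
    out.write("s hg38.chr1 100 5 + 1000 ACGTA\n")
    out.write("s panTro6.chr1 200 5 + 1000 ACGTA\n\n")

print(count_axt_blocks(axt_path), count_maf_blocks(maf_path))
```

For the real comparison, the two paths would point at the downloaded hg38.panTro6.net.axt.gz and hg38.panTro6.synNet.maf.gz files.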

I hope this is helpful. If you have any further questions, please reply to gen...@soe.ucsc.edu. All messages sent to that address are archived on a publicly-accessible Google Groups forum. If your question includes sensitive data, you may send it instead to genom...@soe.ucsc.edu.

Training videos & resources: http://genome.ucsc.edu/training/index.html

Want to share the Browser with colleagues? Host a workshop: http://bit.ly/ucscTraining

---

Matthew Speir

UCSC Cell Browser, Quality Assurance and Data Wrangler

Human Cell Atlas, User Experience Researcher

UCSC Genome Browser, User Support

UC Santa Cruz Genomics Institute

Revealing life’s code.


