Horizontal gap across the Manhattan Plot

Kiran Kaur

Feb 8, 2021, 6:38:53 AM
to locuszoom
Hi, 

My input is a gzip-compressed file of PLINK results covering chr1-23. It is worth mentioning that my endpoint of interest is a quantitative trait. The results show a horizontal gap across my Manhattan plot, with only a few SNPs inside the gap, at around -log10(p) = 4. The QQ plot doesn't look great, and the log file reports errors such as:
Excluded row 2680932 from output due to parse error
This message was repeated 22 times, each for a different row.

The whole process of producing my Manhattan plot on LocusZoom took a long time (1 hr 30 mins), and I am wondering whether there is something wrong with my data or with the way I am plotting my GWAS results. I have never seen a horizontal gap across a Manhattan plot before.

Any help would be extremely useful!

Regards,
Keran

Andrew Boughton

Feb 8, 2021, 6:49:41 AM
to locu...@googlegroups.com
Thanks for the question. It's hard to say what's going on here without seeing an example. If you'd like to attach a screenshot, I'd be happy to take a look. For privacy, you can reply to this email address rather than the public email list.

We're working on some improvements to make the upload faster, but the largest files (1GB) do tend to be a bit slow at present, particularly due to the large number of annotations. I don't think that 22 missing rows in and of themselves would lead to a gap in the Manhattan plot. Is there any data filtering being done on your results prior to upload?

(As for why those rows are skipped - PLINK is a program with many options, and sometimes we can use these error reports to find edge cases that need better parsing support)

-Andy Boughton

--
You received this message because you are subscribed to the Google Groups "locuszoom" group.
To unsubscribe from this group and stop receiving emails from it, send an email to locuszoom+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/locuszoom/1557450d-3dbd-412a-8284-5b9d1ba60397n%40googlegroups.com.

Avni Vassilis

Feb 8, 2021, 7:14:46 AM
to locuszoom
Thank you for getting back to me. Please find the Manhattan plot and the QQ plot attached.

{F9B7B2A4-D1C4-4D27-8F60-F94BA2911C95}.png.jpg

{8F9E8C8D-DB2E-4130-9E7B-21000FD6A674}.png.jpg

Andy Boughton

Feb 8, 2021, 1:58:00 PM
to locu...@googlegroups.com
*blinks*

Thanks for sharing these screenshots. I've run them by our colleagues, and none of us recall ever seeing an analysis do this before. There are no known bugs that would lead to this behavior internally, and the sections of code that generate the QQ and Manhattan plots are reasonably independent. Since both plots show oddities, this suggests it is a real artifact in your data.

Since the behavior appears systematic across the entire genome, my colleague suggests running some QC on the results, such as calculating the genomic control (GC) value. If it is high, the problem is likely to be in your original data. Similarly, a simple histogram of the -log10(p) values in your data could help shed light on whether the Manhattan plot behavior reflects your data.
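As a rough illustration of those two checks, here is a minimal Python sketch using only the standard library. The function names are hypothetical and are not part of LocusZoom or PLINK. It assumes the p-values are two-sided and come from 1-degree-of-freedom tests, so the GC value is the median observed chi-square statistic divided by its expected median under the null (about 0.455).

```python
import math
from collections import Counter
from statistics import NormalDist, median


def genomic_control_lambda(pvalues):
    """Median observed 1-df chi-square divided by the expected null median."""
    norm = NormalDist()
    # A two-sided p-value maps to z = Phi^-1(p / 2); the 1-df chi-square is z^2.
    chi2 = [norm.inv_cdf(p / 2.0) ** 2 for p in pvalues if 0.0 < p <= 1.0]
    expected_median = norm.inv_cdf(0.25) ** 2  # about 0.455
    return median(chi2) / expected_median


def neglog10_histogram(pvalues, bin_width=0.5):
    """Count p-values per -log10(p) bin; keys are the lower edge of each bin."""
    counts = Counter()
    for p in pvalues:
        if 0.0 < p <= 1.0:  # skip missing or invalid values
            x = -math.log10(p)
            counts[math.floor(x / bin_width) * bin_width] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Made-up, evenly spaced p-values standing in for a well-behaved null scan.
    demo = [(i + 0.5) / 1000 for i in range(1000)]
    print("lambda_GC:", round(genomic_control_lambda(demo), 3))
    for lower_edge, n in neglog10_histogram(demo).items():
        print(f"{lower_edge:4.1f}  {n}")
```

A value well above 1, or an empty band in the histogram, would point at the input data rather than at the plotting code.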

I'd be curious to hear what you find. Even if it doesn't indicate a bug, it sounds like a chance to learn something about the odd corners of GWAS, and perhaps to improve the summary info that we provide to users in the future.

-Andy Boughton
abo...@umich.edu

Senior Applications Programmer/Analyst
Center for Statistical Genetics
University of Michigan

Avni Vassilis

Feb 16, 2021, 6:32:03 AM
to locuszoom
Hi Andy

Thanks for your message. I realised I was inputting the wrong p-values: I was including the p-values for each of the covariates rather than just the p-value from the ADD model. As a result, the p-values looked stratified on the histogram. The Manhattan plot looks much better now. How would I go about getting the dots on LocusZoom coloured by LD?

Kind Regards,
Keran
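
For anyone hitting the same problem, the fix described above can be sketched in Python. This is only an illustration. It assumes a whitespace-delimited PLINK association file whose header contains a `TEST` column and whose additive-model rows are labelled `ADD`. The function names and the sample rows are made up.

```python
import gzip


def keep_additive_rows(lines, test_column="TEST", keep_value="ADD"):
    """Yield the header, then only the rows whose TEST field equals keep_value."""
    header = None
    test_idx = None
    for line in lines:
        fields = line.split()  # PLINK output is padded with runs of spaces
        if not fields:
            continue
        if header is None:
            header = fields
            test_idx = header.index(test_column)
            yield header
            continue
        # Drop malformed rows and per-covariate rows (e.g. AGE, SEX).
        if len(fields) == len(header) and fields[test_idx] == keep_value:
            yield fields


def filter_file(in_path, out_path):
    """Write a tab-delimited, gzip-compressed copy containing only ADD rows."""
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        for fields in keep_additive_rows(src):
            dst.write("\t".join(fields) + "\n")


if __name__ == "__main__":
    sample = [
        " CHR SNP BP A1 TEST NMISS BETA STAT P",
        "   1 rs1 1000 A ADD 500 0.12 2.1 0.036",
        "   1 rs1 1000 A AGE 500 0.01 5.0 0.0001",
        "   1 rs2 2000 G ADD 500 -0.05 -0.8 0.42",
    ]
    for row in keep_additive_rows(sample):
        print("\t".join(row))
```

PLINK's `--linear` command also has a `hide-covar` modifier that leaves the covariate rows out of the output in the first place; it is worth checking the documentation for the PLINK version in use.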

Andy Boughton

Feb 16, 2021, 12:17:22 PM
to locu...@googlegroups.com
Thanks for solving this mystery; that's good to know!

There are many GWAS file formats, so there is no one answer to the question about missing LD. Here are some of the most common reasons why dots might appear grey:

As always, if your answer is missing, or new features would be helpful, let us know. We're continuing to make improvements based on feedback.

-Andy Boughton
abo...@umich.edu

Senior Applications Programmer/Analyst
Center for Statistical Genetics
University of Michigan

