error in munge_sumstats.py

studen...@hotmail.com

Jul 16, 2018, 1:52:35 PM
to ldsc_users
When I run this pipeline on the example (schizophrenia and bipolar disorder), I always get the error shown at the end of the log.

Here is my log file:

> --sumstats pgc.cross.SCZ17.2013-05.txt \
> --N 17115 \
> --out scz \
> --merge-alleles w_hm3.snplist
*********************************************************************
* LD Score Regression (LDSC)
* Version 1.0.0
* (C) 2014-2015 Brendan Bulik-Sullivan and Hilary Finucane
* Broad Institute of MIT and Harvard / MIT Department of Mathematics
* GNU General Public License v3
*********************************************************************
Call: 
./munge_sumstats.py \
--out scz \
--merge-alleles w_hm3.snplist \
--N 17115.0 \
--sumstats pgc.cross.SCZ17.2013-05.txt 

Interpreting column names as follows:
info: INFO score (imputation quality; higher --> better imputation)
snpid: Variant ID (e.g., rs number)
a1: Allele 1, interpreted as ref allele for signed sumstat.
pval: p-Value
a2: Allele 2, interpreted as non-ref allele for signed sumstat.
or: Odds ratio (1 --> no effect; above 1 --> A1 is risk increasing)

Reading list of SNPs for allele merge from w_hm3.snplist
Read 1217311 SNPs for allele merge.
Reading sumstats from pgc.cross.SCZ17.2013-05.txt into memory 5000000 SNPs at a time.
. done
Read 1237958 SNPs from --sumstats file.
Removed 137131 SNPs not in --merge-alleles.
Removed 0 SNPs with missing values.
Removed 256286 SNPs with INFO <= 0.9.
Removed 0 SNPs with MAF <= 0.01.
Removed 0 SNPs with out-of-bounds p-values.
Removed 2 variants that were not SNPs or were strand-ambiguous.
844539 SNPs remain.
Removed 0 SNPs with duplicated rs numbers (844539 SNPs remain).
Using N = 17115.0
Median value of or was 1.0, which seems sensible.
Removed 39 SNPs whose alleles did not match --merge-alleles (844500 SNPs remain).

ERROR converting summary statistics:

Traceback (most recent call last):
  File "../munge_sumstats.py", line 707, in munge_sumstats
    dat = allele_merge(dat, merge_alleles, log)
  File "../munge_sumstats.py", line 445, in allele_merge
    dat.loc[~jj, [i for i in dat.columns if i != 'SNP']] = float('nan')
  File "/home/ys/software/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 193, in __setitem__
    indexer = self._get_setitem_indexer(key)
  File "/home/ys/software/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 171, in _get_setitem_indexer
    return self._convert_tuple(key, is_setter=True)
  File "/home/ys/software/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 242, in _convert_tuple
    idx = self._convert_to_indexer(k, axis=i, is_setter=is_setter)
  File "/home/ys/software/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 1269, in _convert_to_indexer
    .format(mask=objarr[mask]))
KeyError: '[-1 -1 -2 ... -1 -1 -1] not in index'


Conversion finished at Sat Jul 14 19:40:33 2018
Total time elapsed: 1.0m:58.13s
Traceback (most recent call last):
  File "../munge_sumstats.py", line 746, in <module>
    munge_sumstats(parser.parse_args(), p=True)
  File "../munge_sumstats.py", line 707, in munge_sumstats
    dat = allele_merge(dat, merge_alleles, log)
  File "../munge_sumstats.py", line 445, in allele_merge
    dat.loc[~jj, [i for i in dat.columns if i != 'SNP']] = float('nan')
  File "/home/ys/software/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 193, in __setitem__
    indexer = self._get_setitem_indexer(key)
  File "/home/ys/software/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 171, in _get_setitem_indexer
    return self._convert_tuple(key, is_setter=True)
  File "/home/ys/software/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 242, in _convert_tuple
    idx = self._convert_to_indexer(k, axis=i, is_setter=is_setter)
  File "/home/ys/software/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 1269, in _convert_to_indexer
    .format(mask=objarr[mask]))
KeyError: '[-1 -1 -2 ... -1 -1 -1] not in index'




If I remove --merge-alleles, it runs without error, but that is not right when I use the sumstats to calculate the genetic correlation.

Raymond Walters

Jul 16, 2018, 2:05:29 PM
to studen...@hotmail.com, ldsc_users
Hi,
This error is caused by an incompatible version of pandas. We currently recommend using the provided conda environment, as described in the GitHub README. Otherwise, you'll need to modify your Python environment to use a pandas version earlier than 0.21.
Cheers,
Raymond
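
To check which pandas version the interpreter actually picks up before rerunning munge_sumstats.py, something like the following minimal sketch may help. The helper name pandas_version_ok and its limit parameter are hypothetical, made up for this illustration, and are not part of ldsc; the only thing it relies on from pandas is the standard pandas.__version__ string. The 0.21 cutoff is the one given in the reply above.

```python
import re


def pandas_version_ok(version, limit=(0, 21)):
    """Return True if `version` (e.g. '0.20.3') is older than `limit`.

    Hypothetical helper for illustration only; not part of ldsc.
    """
    parts = []
    # Compare only the major and minor components, e.g. (0, 20) for '0.20.3'.
    for piece in version.split('.')[:2]:
        match = re.match(r'\d+', piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts) < limit


if __name__ == '__main__':
    try:
        import pandas
    except ImportError:
        print('pandas is not installed in this environment')
    else:
        if pandas_version_ok(pandas.__version__):
            print('pandas %s is older than 0.21' % pandas.__version__)
        else:
            print('pandas %s is 0.21 or newer; munge_sumstats.py may fail'
                  % pandas.__version__)
```

Run it with the same interpreter that runs munge_sumstats.py (in the traceback above, the anaconda2 Python 2.7), since a different interpreter on the PATH may see a different pandas.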


