*********************************************************************
* LD Score Regression (LDSC)
* Version 1.0.0
* (C) 2014-2015 Brendan Bulik-Sullivan and Hilary Finucane
* Broad Institute of MIT and Harvard / MIT Department of Mathematics
* GNU General Public License v3
*********************************************************************
Call:
./munge_sumstats.py \
--daner \
--out munged/test \
--merge-alleles ../w_hm3.snplist \
--sumstats daner_PGC_SCZ49.sh2_mds10_1000G-frq_2.gz
Inferred that N_cas = 33640.0, N_con = 43456.0 from the FRQ_[A/U] columns.
Interpreting column names as follows:
INFO: INFO score (imputation quality; higher --> better imputation)
A1: Allele 1, interpreted as ref allele for signed sumstat.
P: p-Value
A2: Allele 2, interpreted as non-ref allele for signed sumstat.
SNP: Variant ID (e.g., rs number)
FRQ_U_43456: Allele frequency
OR: Odds ratio (1 --> no effect; above 1 --> A1 is risk increasing)
Reading list of SNPs for allele merge from ../w_hm3.snplist
Read 1217311 SNPs for allele merge.
Reading sumstats from daner_PGC_SCZ49.sh2_mds10_1000G-frq_2.gz into memory 5000000 SNPs at a time.
ERROR converting summary statistics:
Traceback (most recent call last):
File "/home-3/pho...@jhu.edu/my-python-modules/ldsc/munge_sumstats.py", line 686, in munge_sumstats
dat = parse_dat(dat_gen, cname_translation, merge_alleles, log, args)
File "/home-3/pho...@jhu.edu/my-python-modules/ldsc/munge_sumstats.py", line 238, in parse_dat
for block_num, dat in enumerate(dat_gen):
File "/cm/shared/apps/anaconda2/4.4.0/lib/python2.7/site-packages/pandas/io/common.py", line 93, in <lambda>
BaseIterator.next = lambda self: self.__next__()
File "/cm/shared/apps/anaconda2/4.4.0/lib/python2.7/site-packages/pandas/io/parsers.py", line 959, in __next__
return self.get_chunk()
File "/cm/shared/apps/anaconda2/4.4.0/lib/python2.7/site-packages/pandas/io/parsers.py", line 1019, in get_chunk
return self.read(nrows=size)
File "/cm/shared/apps/anaconda2/4.4.0/lib/python2.7/site-packages/pandas/io/parsers.py", line 982, in read
ret = self._engine.read(nrows)
File "/cm/shared/apps/anaconda2/4.4.0/lib/python2.7/site-packages/pandas/io/parsers.py", line 1719, in read
data = self._reader.read(nrows)
File "pandas/_libs/parsers.pyx", line 890, in pandas._libs.parsers.TextReader.read (pandas/_libs/parsers.c:10862)
File "pandas/_libs/parsers.pyx", line 924, in pandas._libs.parsers.TextReader._read_low_memory (pandas/_libs/parsers.c:11343)
File "pandas/_libs/parsers.pyx", line 989, in pandas._libs.parsers.TextReader._read_rows (pandas/_libs/parsers.c:12175)
File "pandas/_libs/parsers.pyx", line 1117, in pandas._libs.parsers.TextReader._convert_column_data (pandas/_libs/parsers.c:14136)
File "pandas/_libs/parsers.pyx", line 1192, in pandas._libs.parsers.TextReader._convert_tokens (pandas/_libs/parsers.c:15475)
ValueError: cannot safely convert passed user dtype of float64 for object dtyped data in column 8
I haven't seen this error mentioned before, but "column 8" in this file is the "INFO" column. I have run this exact setup on files from the PGC before with no issues.
Any idea what this error is pointing to? Do I need to use an earlier version of pandas? Is there something wrong with the file?
Thanks
Paul
...as any version of pandas after that seems to die.
sudo pip uninstall pandas
sudo pip install pandas==0.18.1
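
For anyone who wants to check the file itself before changing pandas versions: the ValueError above usually means pandas was told to read column 8 (INFO) as float64 but found at least one entry it could only keep as text. Below is a minimal sketch of one way to look for such entries. It is not part of LDSC; the sample rows are invented for illustration, and `find_non_numeric` is a hypothetical helper name. For the real file you would presumably pass the path of the .gz file to `pd.read_csv` instead of the in-memory sample.

```python
import io

import pandas as pd

# Invented stand-in for a few whitespace-delimited rows of a sumstats file.
# These values are NOT taken from the real daner file.
sample = io.StringIO(
    "SNP A1 A2 INFO OR P\n"
    "rs1 A G 0.98 1.02 0.5\n"
    "rs2 C T - 0.97 0.1\n"
    "rs3 G A 1.01 1.00 0.9\n"
    "rs4 T C 0.9a 1.10 0.03\n"
)


def find_non_numeric(df, column):
    """Return the rows whose entry in `column` cannot be parsed as a number."""
    # errors="coerce" turns unparseable strings into NaN instead of raising.
    parsed = pd.to_numeric(df[column], errors="coerce")
    # Flag entries that were present as text but failed to parse.
    bad = parsed.isna() & df[column].notna()
    return df.loc[bad, ["SNP", column]]


# Read every column as plain text so nothing is converted (or rejected) early.
df = pd.read_csv(sample, sep=r"\s+", dtype=str, keep_default_na=False)
offending = find_non_numeric(df, "INFO")
print(offending)
```

If this turns up unexpected strings in INFO on the real file, that would point to the file rather than to pandas; if it turns up nothing, the version difference becomes the more likely explanation.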