Hi all,
I am trying to run STRUCTURE (command-line version, on Ubuntu under WSL) on a large dataset (822 individuals, 38987 loci). I converted the original VCF to a STRUCTURE input file using PGDSpider and then made sure the resulting file is in UNIX format rather than Windows format. The data include population information (41 different populations). However, I get the following error messages, and I have tried to no avail to find out why the problem keeps occurring:
WARNING! Probable error in the input file.
Individual 822, locus 18260; encountered the following data
"NA19239" when expecting an integer
readlociEOF
WARNING: Unexpected end of input file. The details of the
input file are set in mainparams. I ran out of data while reading
the data for individual 822.
----------------------------------
There were errors in the input file (listed above). According to
"mainparams" the input file should contain one row of markernames with 38987 entries,
822 rows with 77977 entries. .
There are 1645 rows of data in the input file, with an average of 38987.00
entries per line. The following shows the number of entries in each
line of the input file:
# Entries: Line numbers
38985: 1
38987: 2--1645
----------------------------------
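The per-line counts reported above can be cross-checked independently of STRUCTURE. Below is a minimal Python sketch. It assumes entries are separated by whitespace; the function name `summarize` and the file path passed on the command line are placeholders, not part of STRUCTURE or PGDSpider. It also counts lines that still end in Windows-style CRLF, as a check on the UNIX conversion.

```python
import sys
from collections import defaultdict


def summarize(lines):
    """Group 1-based line numbers by their number of whitespace-separated entries.

    `lines` is an iterable of bytes objects (e.g. a file opened in binary mode).
    Returns (counts, crlf_lines): counts maps entry count -> list of line numbers,
    crlf_lines is how many lines end in a Windows-style CRLF.
    """
    counts = defaultdict(list)
    crlf_lines = 0
    for lineno, line in enumerate(lines, start=1):
        if line.endswith(b"\r\n"):
            crlf_lines += 1
        if not line.strip():
            continue  # ignore blank lines
        counts[len(line.split())].append(lineno)
    return dict(counts), crlf_lines


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python check_entries.py my_structure_file
    with open(sys.argv[1], "rb") as fh:
        counts, crlf_lines = summarize(fh)
    for n_entries in sorted(counts):
        line_numbers = counts[n_entries]
        print(f"{n_entries} entries: {len(line_numbers)} line(s), "
              f"first at line {line_numbers[0]}")
    print(f"lines ending in CRLF: {crlf_lines}")
```

Reading the file in binary mode keeps any carriage returns visible, and iterating line by line avoids loading the whole file into memory.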
The mainparams file looks like this:
#define NUMINDS 822
#define NUMLOCI 38987
#define LABEL 1
#define POPDATA 1
#define POPFLAG 0
#define LOCDATA 0
#define PHENOTYPE 0
#define MARKERNAMES 1
#define MAPDISTANCES 0
#define ONEROWPERIND 1
#define PHASEINFO 0
#define PHASED 0
#define RECESSIVEALLELES 0
#define EXTRACOLS 1
#define MISSING -9
#define PLOIDY 2
#define MAXPOPS 2
#define BURNIN 100
#define NUMREPS 100
#define NOADMIX 0
#define LINKAGE 0
#define USEPOPINFO 0
#define LOCPRIOR 0
#define INFERALPHA 1
#define ALPHA 1.0
#define POPALPHAS 0
#define UNIFPRIORALPHA 1
#define ALPHAMAX 10.0
#define ALPHAPROPSD 0.025
#define FREQSCORR 1
#define ONEFST 0
#define FPRIORMEAN 0.01
#define FPRIORSD 0.05
#define INFERLAMBDA 0
#define LAMBDA 1.0
#define COMPUTEPROB 1
#define PFROMPOPFLAGONLY 0
#define ANCESTDIST 0
#define STARTATPOPINFO 0
#define METROFREQ 10
#define UPDATEFREQ 1
Any ideas on why these errors keep cropping up? And why is STRUCTURE saying that the input file should contain 822 rows with 77977 entries, when there are only 38987 loci?
Thanks in advance!
John