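STITCH stops with an error saying that there are repeat sample names, but the entries in my input bamlist are all unique, so I do not understand why I am getting this error. The command I ran and its full output are below, followed by the bamlist and the uniqueness check I did.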
./STITCH/STITCH.R --chr="ChrX" --bamlist="newer.bamlist.txt" --posfile="/share/adl/tdlong/mouse_GWAS/data/vcf/P.leu_X_uniq.pos" --outputdir="STI8/ChrX" --K=8 --nGen=60 --nCores=16 --output_haplotype_dosages=TRUE
[2019-08-28 10:06:47] Running STITCH(chr = ChrX, nGen = 60, posfile = /share/adl/tdlong/mouse_GWAS/data/vcf/P.leu_X_uniq.pos, K = 8, S = 1, outputdir = STI8/ChrX, nStarts = , tempdir = NA, bamlist = newer.bamlist.txt, cramlist = , sampleNames_file = , reference = , genfile = , method = diploid, output_format = bgvcf, B_bit_prob = 16, outputInputInVCFFormat = FALSE, downsampleToCov = 50, downsampleFraction = 1, readAware = TRUE, chrStart = NA, chrEnd = NA, regionStart = NA, regionEnd = NA, buffer = NA, maxDifferenceBetweenReads = 1000, maxEmissionMatrixDifference = 1e+10, alphaMatThreshold = 1e-04, emissionThreshold = 1e-04, iSizeUpperLimit = 600, bqFilter = 17, niterations = 40, shuffleHaplotypeIterations = c(4, 8, 12, 16), splitReadIterations = 25, nCores = 16, expRate = 0.5, maxRate = 100, minRate = 0.1, Jmax = 1000, regenerateInput = TRUE, originalRegionName = NA, keepInterimFiles = FALSE, keepTempDir = FALSE, outputHaplotypeProbabilities = FALSE, switchModelIteration = NA, generateInputOnly = FALSE, restartIterations = NA, refillIterations = c(6, 10, 14, 18), downsampleSamples = 1, downsampleSamplesKeepList = NA, subsetSNPsfile = NA, useSoftClippedBases = FALSE, outputBlockSize = 1000, outputSNPBlockSize = 10000, inputBundleBlockSize = NA, genetic_map_file = , reference_haplotype_file = , reference_legend_file = , reference_sample_file = , reference_populations = NA, reference_phred = 20, reference_iterations = 40, reference_shuffleHaplotypeIterations = c(4, 8, 12, 16), output_filename = NULL, initial_min_hapProb = 0.2, initial_max_hapProb = 0.8, regenerateInputWithDefaultValues = FALSE, plotHapSumDuringIterations = FALSE, plot_shuffle_haplotype_attempts = FALSE, plotAfterImputation = TRUE, save_sampleReadsInfo = FALSE, gridWindowSize = NA, shuffle_bin_nSNPs = NULL, shuffle_bin_radius = 5000, keepSampleReadsInRAM = FALSE, useTempdirWhileWriting = FALSE, output_haplotype_dosages = TRUE)
[2019-08-28 10:06:47] Program start
[2019-08-28 10:06:47] Get and validate pos and gen
[2019-08-28 10:06:48] Done get and validate pos and gen
[2019-08-28 10:06:48] Get BAM sample names
[2019-08-28 10:06:48] Done getting BAM sample names
Error in get_sample_names(bamlist = bamlist, cramlist = cramlist, nCores = nCores, :
There are repeat sample names
Calls: STITCH -> get_sample_names
In addition: Warning message:
In mclapply(files, mc.cores = nCores, get_sample_name_from_bam_file_using_SeqLib) :
all scheduled cores encountered errors in user code
Execution halted
cat newer.bamlist.txt
bam/merge/11001.rmdup.bam
bam/merge/11002.rmdup.bam
bam/merge/11003.rmdup.bam
bam/merge/18923.rmdup.bam
bam/merge/18929.rmdup.bam
bam/merge/18950.rmdup.bam
bam/merge/18953.rmdup.bam
bam/merge/18957.rmdup.bam
bam/merge/19037.rmdup.bam
bam/merge/19046.rmdup.bam
...
bam/merge/reference.RG.q30.bam.rmdup.bam
bam/merge/s37.rmdup.bam
bam/merge/s38.rmdup.bam
bam/merge/s39.rmdup.bam
...
cat newer.bamlist.txt | sort | uniq -c
All names are unique.
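One thing I have not ruled out: the check above only compares the file paths in the bamlist. Judging by the function name get_sample_name_from_bam_file_using_SeqLib in the warning, the names may instead come from inside the BAM files. I am assuming that means the SM field of the @RG header lines, which I have not confirmed. The warning also says all cores encountered errors, so it may be that no name was read at all. Below is a minimal sketch of how the header names could be checked. The helper names sm_from_header, repeated and check_bamlist are my own, not part of STITCH, and it assumes samtools is installed.

```shell
# Print the SM (sample) value of every @RG line in a SAM header read from stdin.
sm_from_header() {
    awk -F'\t' '/^@RG/ { for (i = 2; i <= NF; i++) if ($i ~ /^SM:/) print substr($i, 4) }'
}

# Print only the values that occur more than once on stdin.
repeated() {
    sort | uniq -d
}

# For each BAM in the list (first argument), collect its distinct SM values,
# then report any SM value shared by more than one BAM.
check_bamlist() {
    while read -r bam; do
        samtools view -H "$bam" | sm_from_header | sort -u
    done < "$1" | repeated
}
```

Running check_bamlist newer.bamlist.txt would print any sample name that appears in more than one BAM header. A BAM with no @RG line or no SM field contributes nothing to the output, so it may also be worth looking at a single header directly with samtools view -H.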