These 332 samples come from an 11-generation pedigree founded by 4 individuals (K = 8). Looking at the attached hapSum plot, would you expect the expectation quoted below to hold?
"The most important thing to see in this plot is that there is not drastic, but rather gradual, movement between the relative proportions of the ancestral haplotypes. Large swings in ancestral haplotype sums indicate the heuristics are working sub-optimally."
> STITCH(tempdir = tempdir(), chrStart = 1, chrEnd = 23506264, chr = "chr2L", sampleNames_file = "names_GuysOldBam.txt" ,bamlist = "listOfFiles_GuysOldBam.txt", posfile = "posChr2Lnoindel.txt", outputdir = paste0(getwd(), "/"), K = 8, nGen = 11, nCores = 20)
[2021-03-05 07:43:25] Running STITCH(chr = chr2L, nGen = 11, posfile = posChr2Lnoindel.txt, K = 8, S = 1, outputdir = /home/reeves/oldbamtest/, nStarts = , tempdir = /tmp/Rtmpo531A2, bamlist = listOfFiles_GuysOldBam.txt, cramlist = , sampleNames_file = names_GuysOldBam.txt, reference = , genfile = , method = diploid, output_format = bgvcf, B_bit_prob = 16, outputInputInVCFFormat = FALSE, downsampleToCov = 50, downsampleFraction = 1, readAware = TRUE, chrStart = 1, chrEnd = 23506264, regionStart = NA, regionEnd = NA, buffer = NA, maxDifferenceBetweenReads = 1000, maxEmissionMatrixDifference = 10000000000, alphaMatThreshold = 0.0001, emissionThreshold = 0.0001, iSizeUpperLimit = 600, bqFilter = 17, niterations = 40, shuffleHaplotypeIterations = c(4, 8, 12, 16), splitReadIterations = 25, nCores = 20, expRate = 0.5, maxRate = 100, minRate = 0.1, Jmax = 1000, regenerateInput = TRUE, originalRegionName = NA, keepInterimFiles = FALSE, keepTempDir = FALSE, outputHaplotypeProbabilities = FALSE, switchModelIteration = NA, generateInputOnly = FALSE, restartIterations = NA, refillIterations = c(6, 10, 14, 18), downsampleSamples = 1, downsampleSamplesKeepList = NA, subsetSNPsfile = NA, useSoftClippedBases = FALSE, outputBlockSize = 1000, outputSNPBlockSize = 10000, inputBundleBlockSize = NA, genetic_map_file = , reference_haplotype_file = , reference_legend_file = , reference_sample_file = , reference_populations = NA, reference_phred = 20, reference_iterations = 40, reference_shuffleHaplotypeIterations = c(4, 8, 12, 16), output_filename = NULL, initial_min_hapProb = 0.2, initial_max_hapProb = 0.8, regenerateInputWithDefaultValues = FALSE, plotHapSumDuringIterations = FALSE, plot_shuffle_haplotype_attempts = FALSE, plotAfterImputation = TRUE, save_sampleReadsInfo = FALSE, gridWindowSize = NA, shuffle_bin_nSNPs = NULL, shuffle_bin_radius = 5000, keepSampleReadsInRAM = FALSE, useTempdirWhileWriting = FALSE, output_haplotype_dosages = FALSE, use_bx_tag = TRUE, bxTagUpperLimit = 
50000)
[2021-03-05 07:43:25] Program start
[2021-03-05 07:43:25] Get and validate pos and gen
[2021-03-05 07:43:26] Done get and validate pos and gen
[2021-03-05 07:43:26] Generate inputs
[2021-03-05 08:29:18] Done generating inputs
[2021-03-05 08:29:18] Copying files onto tempdir
[2021-03-05 08:33:33] Done copying files onto tempdir
[2021-03-05 08:33:33] Generate allele count
[2021-03-05 08:41:52] Quantiles across SNPs of per-sample depth of coverage
[2021-03-05 08:41:52] 5% 25% 50% 75% 95%
[2021-03-05 08:41:52] 3.308 4.038 4.462 4.859 5.486
[2021-03-05 08:41:52] Done generating allele count
[2021-03-05 08:41:52] Outputting will be done in 24 blocks with on average 9651.7 SNPs in them
[2021-03-05 08:41:52] Begin parameter initialization
[2021-03-05 08:41:52] Done parameter initialization
[2021-03-05 08:41:52] Start EM
[2021-03-05 08:41:52] Number of samples: 332
[2021-03-05 08:41:52] Number of SNPs: 231640
[2021-03-05 08:41:52] Start of iteration 1
[2021-03-05 08:49:05] Start of iteration 2
[2021-03-05 08:56:28] Start of iteration 3
[2021-03-05 09:03:35] Start of iteration 4
[2021-03-05 09:10:58] Shuffle haplotypes - Iteration 4 - change on average 115 intervals out of 115 considered
[2021-03-05 09:10:59] Start of iteration 5
[2021-03-05 09:17:39] Start of iteration 6
[2021-03-05 09:24:26] Iteration - 6 - refill infrequently used haplotypes
[2021-03-05 09:24:29] Refill infrequently used haplotypes - on average, 1.9% of regions replaced
[2021-03-05 09:24:31] Start of iteration 7
[2021-03-05 09:32:22] Start of iteration 8
[2021-03-05 09:39:51] Shuffle haplotypes - Iteration 8 - change on average 267 intervals out of 271 considered
[2021-03-05 09:39:53] Start of iteration 9
[2021-03-05 09:47:07] Start of iteration 10
[2021-03-05 09:54:18] Iteration - 10 - refill infrequently used haplotypes
[2021-03-05 09:54:23] Refill infrequently used haplotypes - on average, 3.5% of regions replaced
[2021-03-05 09:54:24] Start of iteration 11
[2021-03-05 10:01:40] Start of iteration 12
[2021-03-05 10:09:05] Shuffle haplotypes - Iteration 12 - change on average 315 intervals out of 343 considered
[2021-03-05 10:09:07] Start of iteration 13
[2021-03-05 10:16:36] Start of iteration 14
[2021-03-05 10:24:01] Iteration - 14 - refill infrequently used haplotypes
[2021-03-05 10:24:07] Refill infrequently used haplotypes - on average, 4.6% of regions replaced
[2021-03-05 10:24:09] Start of iteration 15
[2021-03-05 10:31:53] Start of iteration 16
[2021-03-05 10:39:54] Shuffle haplotypes - Iteration 16 - change on average 299 intervals out of 377 considered
[2021-03-05 10:39:56] Start of iteration 17
[2021-03-05 10:47:25] Start of iteration 18
[2021-03-05 10:55:08] Iteration - 18 - refill infrequently used haplotypes
[2021-03-05 10:55:15] Refill infrequently used haplotypes - on average, 5.8% of regions replaced
[2021-03-05 10:55:17] Start of iteration 19
[2021-03-05 11:02:25] Start of iteration 20
[2021-03-05 11:09:51] Start of iteration 21
[2021-03-05 11:17:31] Start of iteration 22
[2021-03-05 11:24:21] Start of iteration 23
[2021-03-05 11:31:56] Start of iteration 24
[2021-03-05 11:39:36] Start of iteration 25
[2021-03-05 13:13:00] Split reads, average N=88 (0.019 %)
[2021-03-05 13:13:01] Start of iteration 26
[2021-03-05 13:19:49] Start of iteration 27
[2021-03-05 13:27:17] Start of iteration 28
[2021-03-05 13:34:10] Start of iteration 29
[2021-03-05 13:40:57] Start of iteration 30
[2021-03-05 13:48:27] Start of iteration 31
[2021-03-05 13:55:27] Start of iteration 32
[2021-03-05 14:02:24] Start of iteration 33
[2021-03-05 14:10:00] Start of iteration 34
[2021-03-05 14:17:39] Start of iteration 35
[2021-03-05 14:25:11] Start of iteration 36
[2021-03-05 14:32:56] Start of iteration 37
[2021-03-05 14:40:16] Start of iteration 38
[2021-03-05 14:47:59] Start of iteration 39
[2021-03-05 14:55:11] Start of iteration 40
[2021-03-05 15:01:16] End EM
[2021-03-05 15:01:16] Begin making and writing output file
[2021-03-05 15:01:16] Determine reads in output blocks
[2021-03-05 15:07:01] Done determining reads in output blocks
[2021-03-05 15:07:01] Initialize output file
[2021-03-05 15:07:01] Done initializing output file
[2021-03-05 15:07:02] Loop over and write output file
[2021-03-05 15:07:02] Making output piece 1 / 24
[2021-03-05 15:19:24] Making output piece 4 / 24
[2021-03-05 15:26:34] Making output piece 6 / 24
[2021-03-05 15:38:02] Making output piece 9 / 24
[2021-03-05 15:41:38] [error] handle_read_frame error: websocketpp.transport:7 (End of File)
[2021-03-05 15:46:02] Making output piece 11 / 24
[2021-03-05 15:57:20] Making output piece 14 / 24
[2021-03-05 16:04:38] Making output piece 16 / 24
[2021-03-05 16:15:42] Making output piece 19 / 24
[2021-03-05 16:23:18] Making output piece 21 / 24
[2021-03-05 16:35:28] Making output piece 24 / 24
[2021-03-05 16:39:20] Done looping over and writing output file
[2021-03-05 16:39:20] bgzip output file and move to final location
[2021-03-05 16:39:44] Done making and writing output file
[2021-03-05 16:39:44] Save RData objects to disk
[2021-03-05 16:39:45] Make metrics plot
[2021-03-05 16:41:49] Make estimated against real
[2021-03-05 16:42:15] Make other plots
[2021-03-05 16:43:23] Clean up and end
[2021-03-05 16:43:29] Program done
NULL
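To make the "drastic versus gradual" question concrete, one option is to measure how much each ancestral haplotype's share changes between adjacent SNPs. The sketch below is only an illustration: it assumes the haplotype sums behind the hapSum plot are available as a numeric matrix with one row per SNP and one column per ancestral haplotype (K columns). The names `hap_sum` and `hap_sum_swings` and the 0.2 threshold are hypothetical choices, not part of STITCH.

```r
# Hypothetical helper: flag large jumps in ancestral haplotype proportions.
# `hap_sum` is assumed to be a numeric matrix, nSNPs x K, of haplotype sums.
hap_sum_swings <- function(hap_sum, threshold = 0.2) {
  stopifnot(is.matrix(hap_sum), nrow(hap_sum) >= 2)
  # Convert sums to proportions so each row adds up to 1
  prop <- hap_sum / rowSums(hap_sum)
  # Absolute change in each haplotype's proportion between adjacent SNPs
  delta <- abs(diff(prop))
  # Largest change across the K haplotypes at each SNP boundary
  max_delta <- apply(delta, 1, max)
  list(
    max_delta = max_delta,
    swing_after_snp = which(max_delta > threshold)
  )
}

# Toy example with K = 2: smooth everywhere except one jump after SNP 3
toy <- cbind(c(50, 52, 51, 90, 89), c(50, 48, 49, 10, 11))
res <- hap_sum_swings(toy)
res$swing_after_snp
```

In the toy data the only flagged boundary is the one after SNP 3, where the first haplotype's share jumps from 0.51 to 0.90. On real output, many flagged boundaries spread along the chromosome would correspond to the "large swings" the quoted text warns about, while few or none would be consistent with gradual movement.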