When I run these FASTQ files through the TASSEL 5 GBS v2 pipeline, every plugin completes except -ProductionSNPCallerPluginV2, which produces the following output (the errors are the ERROR and StringIndexOutOfBoundsException lines):
Memory Settings: -Xms512m -Xmx16G
Tassel Pipeline Arguments: -fork1 -ProductionSNPCallerPluginV2 -db /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/2016_2018_Kaki.db -e ApeKI -i /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018 -k /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/keyfile.barcoded.txt -kmerLength 64 -o /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/GBS_2016_2018_9May2018.vcf -endPlugin -runfork1
[main] INFO net.maizegenetics.tassel.TasselLogging - Tassel Version: 5.2.43 Date: February 22, 2018
[main] INFO net.maizegenetics.tassel.TasselLogging - Max Available Memory Reported by JVM: 14564 MB
[main] INFO net.maizegenetics.tassel.TasselLogging - Java Version: 1.8.0_162
[main] INFO net.maizegenetics.tassel.TasselLogging - OS: Linux
[main] INFO net.maizegenetics.tassel.TasselLogging - Number of Processors: 8
[main] INFO net.maizegenetics.pipeline.TasselPipeline - Tassel Pipeline Arguments: [-fork1, -ProductionSNPCallerPluginV2, -db, /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/2016_2018_Kaki.db, -e, ApeKI, -i, /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018, -k, /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/keyfile.barcoded.txt, -kmerLength, 64, -o, /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/GBS_2016_2018_9May2018.vcf, -endPlugin, -runfork1]
net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2
[pool-1-thread-1] INFO net.maizegenetics.plugindef.AbstractPlugin - Starting net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2: time: May 10, 2018 0:11:21
Enzyme: ApeKI
[pool-1-thread-1] INFO net.maizegenetics.plugindef.AbstractPlugin -
ProductionSNPCallerPluginV2 Parameters
i: /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018
k: /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/keyfile.barcoded.txt
e: ApeKI
db: /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/2016_2018_Kaki.db
o: /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/GBS_2016_2018_9May2018.vcf
eR: 0.01
d: 0
ko: false
do: true
kmerLength: 64
minPosQS: 0.0
batchSize: 8
mnQS: 0
size of all tags in tag table=2370966
size of all tissues in tissue table=0
size of all tags in mappingApproach table=2
size of all taxa in taxa table=141
ProductionSNPCallerPluginV2: Total batches to process: 1
size of all positions in snpPosition table=96277
[pool-1-thread-1] INFO net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2 -
Output VCF file:
/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/GBS_2016_2018_9May2018.vcf
created for genotypes from this run.
size of all positions in snpPosition table=96277
size of all alleles in allele table=194649
Start processing batch 1
Enzyme: ApeKI
Enzyme: ApeKI
[pool-1-thread-1] INFO net.maizegenetics.analysis.gbs.v2.GBSUtils - /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/HKFWCCXY_8_fastq.fq: Quality score base:33
[ForkJoinPool.commonPool-worker-1] INFO net.maizegenetics.analysis.gbs.v2.GBSUtils - /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/H2J2MBCXY_1_fastq.fq: Quality score base:33
[pool-1-thread-1] ERROR net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2 - Good Barcodes Read: 10
[ForkJoinPool.commonPool-worker-1] ERROR net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2 - Good Barcodes Read: 16
java.lang.StringIndexOutOfBoundsException:
ERROR processing /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/HKFWCCXY_8_fastq.fq
Reading entry number 11 fails the length test.
Sequence length 63 minus barcode length 10 is less than kmerLength 64.
Re-run your files with either a shorter kmerLength value or a higher minimum quality score.
at net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2.processFastQ(ProductionSNPCallerPluginV2.java:362)
at net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2.processFastQFile(ProductionSNPCallerPluginV2.java:327)
at net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2.lambda$processData$2(ProductionSNPCallerPluginV2.java:252)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291)
at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:401)
at java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:734)
at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:160)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:583)
at net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2.processData(ProductionSNPCallerPluginV2.java:250)
at net.maizegenetics.plugindef.AbstractPlugin.performFunction(AbstractPlugin.java:112)
at net.maizegenetics.plugindef.AbstractPlugin.dataSetReturned(AbstractPlugin.java:1837)
at net.maizegenetics.plugindef.ThreadedPluginListener.run(ThreadedPluginListener.java:29)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
java.lang.StringIndexOutOfBoundsException:
ERROR processing /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/H2J2MBCXY_1_fastq.fq
Reading entry number 17 fails the length test.
Sequence length 65 minus barcode length 9 is less than kmerLength 64.
Re-run your files with either a shorter kmerLength value or a higher minimum quality score.
at net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2.processFastQ(ProductionSNPCallerPluginV2.java:362)
at net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2.processFastQFile(ProductionSNPCallerPluginV2.java:327)
at net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2.lambda$processData$2(ProductionSNPCallerPluginV2.java:252)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291)
at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Finished processing batch 1
[pool-1-thread-1] INFO net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2 -
Writing ReadsPerSample log file...
[pool-1-thread-1] INFO net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2 - ReadsPerSample log file: /media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/keyfile.barcoded_ReadsPerSample.log
[pool-1-thread-1] INFO net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2 -
Total number of SNPs processed with minimum quality score 0 was 96277.
[pool-1-thread-1] INFO net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2 - ...done
[pool-1-thread-1] INFO net.maizegenetics.plugindef.AbstractPlugin - Finished net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2: time: May 10, 2018 0:11:39
[pool-1-thread-1] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2: time: May 10, 2018 0:11:39: progress: 100%
[pool-1-thread-1] INFO net.maizegenetics.plugindef.AbstractPlugin - net.maizegenetics.analysis.gbs.v2.ProductionSNPCallerPluginV2 Citation: Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES. (2007) TASSEL: Software for association mapping of complex traits in diverse samples. Bioinformatics 23:2633-2635.
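To make the two failures easier to compare, here is a minimal Python sketch that pulls the numbers out of the error lines above and computes how many bases remain once the barcode is removed. The regular expression is based only on the message text in this log; the names `LengthFailure` and `parse_length_failures` are my own and are not part of TASSEL.

```python
import re
from typing import List, NamedTuple


class LengthFailure(NamedTuple):
    """Numbers reported by one 'fails the length test' message."""
    sequence_length: int
    barcode_length: int
    kmer_length: int

    @property
    def usable_length(self) -> int:
        # Bases left in the read after the barcode is stripped.
        return self.sequence_length - self.barcode_length


# Pattern copied from the wording of the error message in the log above.
PATTERN = re.compile(
    r"Sequence length (\d+) minus barcode length (\d+) "
    r"is less than kmerLength (\d+)"
)


def parse_length_failures(log_text: str) -> List[LengthFailure]:
    """Return one LengthFailure per matching message found in log_text."""
    return [
        LengthFailure(*(int(value) for value in match.groups()))
        for match in PATTERN.finditer(log_text)
    ]
```

For the two messages above this gives 53 and 56 usable bases, both below the kmerLength of 64.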
What is the length test? When creating the original tag DB, I specified -minKmerL 20 and -kmerLength 64, but maybe my dataset still contains kmers shorter than 20 bases because of over-trimming? Any thoughts, ideas, or suggestions would be greatly appreciated.
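To test the over-trimming idea, a rough check like the following could list the reads that would fail. This is only a sketch: it assumes the length test is exactly the comparison printed in the error message (sequence length minus barcode length is less than kmerLength), that the FASTQ files are uncompressed with four lines per record, and that a single barcode length applies to every read, which is not true for my keyfile (the log shows barcodes of 9 and 10 bases). The function names are hypothetical, and the 1-based entry numbers may not match how TASSEL counts entries.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fastq_sequences(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (entry_number, sequence) for each 4-line FASTQ record.

    Entry numbers are 1-based; the sequence is the second line of a record.
    """
    for index, line in enumerate(lines):
        if index % 4 == 1:
            yield index // 4 + 1, line.strip()


def short_reads(
    lines: Iterable[str],
    kmer_length: int = 64,
    barcode_length: int = 10,
) -> List[Tuple[int, int]]:
    """Return (entry_number, sequence_length) for reads where
    sequence_length - barcode_length < kmer_length.

    barcode_length is an assumed fixed value; real keyfiles can mix lengths.
    """
    return [
        (entry, len(sequence))
        for entry, sequence in read_fastq_sequences(lines)
        if len(sequence) - barcode_length < kmer_length
    ]
```

Calling `short_reads(open(path), kmer_length=64, barcode_length=10)` on one of the `.fq` files would then show how many reads are shorter than 74 bases, which is the minimum this comparison would accept for a 10-base barcode.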
Cheers,
Stephanie
PS - I have posted all the commands from this run below:
'/home/stephanie/TASSEL5/run_pipeline.pl' -Xmx16G -fork1 -GBSSeqToTagDBPlugin -e ApeKI -i '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018' -db '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/2016_2018_Kaki.db' -k '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/keyfile.barcoded.txt' -kmerLength 64 -minKmerL 20 -mnQS 20 -mxKmerNum 100000000 -endPlugin -runfork1
'/home/stephanie/TASSEL5/run_pipeline.pl' -Xmx16G -fork1 -TagExportToFastqPlugin -db '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/2016_2018_Kaki.db' -o '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/tagsForAlign.fa.gz' -c 1 -endPlugin -runfork1
bwa aln -t4 '/media/stephanie/Olivia/Genomes/SuperScaffolds/superscaffolds_chromosome1.fasta' '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/tagsForAlign.fa.gz' > '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/tagsForAlign.sai'
bwa samse '/media/stephanie/Olivia/Genomes/SuperScaffolds/superscaffolds_chromosome1.fasta' '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/tagsForAlign.sai' '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/tagsForAlign.fa.gz' > '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/tagsForAlign.sam'
'/home/stephanie/TASSEL5/run_pipeline.pl' -Xmx16G -fork1 -SAMToGBSdbPlugin -i '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/tagsForAlign.sam' -db '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/2016_2018_Kaki.db' -aProp 0.0 -aLen 0 -endPlugin -runfork1
'/home/stephanie/TASSEL5/run_pipeline.pl' -Xmx16G -fork1 -DiscoverySNPCallerPluginV2 -db '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/2016_2018_Kaki.db' -sC 1 -eC 1 -mnLCov 0.1 -mnMAF 0.05 -deleteOldData true -endPlugin -runfork1
'/home/stephanie/TASSEL5/run_pipeline.pl' -Xmx16G -fork1 -ProductionSNPCallerPluginV2 -db '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/2016_2018_Kaki.db' -e ApeKI -i '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018' -k '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/keyfile.barcoded.txt' -kmerLength 64 -o '/media/stephanie/External4/GBS_2016_2018_Combined/9May2018/9May2018/GBS_2016_2018_9May2018.vcf' -endPlugin -runfork1