Hi,
I am very new to bash scripting and running pipelines. I recently took over the GBS pipeline of the CSU wheat breeding program, and my predecessor left me a standard operating procedure that uses TASSEL. I am currently doing a trial discovery run with a small number of lines (n < 400). I have successfully gotten through the following pipeline steps: GBSSeqToTagDBPlugin, TagExportToFastqPlugin, bwa, and SAMToGBSdbPlugin. However, when I get to DiscoverySNPCallerPluginV2, the process slows to a crawl. The program seems to be running properly, but it is taking days to get through a couple of chromosomes, and my predecessor told me this step usually takes a couple of hours at most. Attached are my logs for each step. Hopefully I am just doing something wrong, because if it takes this long to run discovery on 400 lines, I would hate to think what would happen if we tried 40,000. I have also attached a picture of the lscpu output of the server I am working on.
Here is an example of the code I have been running to do this.
############################################
###Here is an example of discovery written in plain code###
############################################
/mnt/wheatdrive/smallgrainslab/gbs_pipeline/dependencies/tassel-5-standalone/run_pipeline.pl -Xmx450G -fork1 -GBSSeqToTagDBPlugin -e PstI-MspI -i /mnt/wheatdrive/smallgrainslab/gbs_pipeline/test_area -db /mnt/wheatdrive/smallgrainslab/gbs_pipeline/results/discovery_files/disco_2022/avery_x_CO11D1397_discovery.db -k /home/zjwinn/key_files/2022_avery_x_CO11D1397_keyfile.tsv -kmerLength 65 -mnQS 0 -mxKmerNum 5000000 -deleteOldData true -endPlugin
/mnt/wheatdrive/smallgrainslab/gbs_pipeline/dependencies/tassel-5-standalone/run_pipeline.pl -Xmx450G -fork1 -TagExportToFastqPlugin -db /mnt/wheatdrive/smallgrainslab/gbs_pipeline/results/discovery_files/disco_2022/avery_x_CO11D1397_discovery.db -o /mnt/wheatdrive/smallgrainslab/gbs_pipeline/results/discovery_files/disco_2022/avery_x_CO11D1397_discovery_gbs_tags.fa.gz -endPlugin -runfork1
bwa mem -t 63 /mnt/wheatdrive/smallgrainslab/gbs_pipeline/ref_genos/wheat/refseqv2.0/iwgsc_refseqv2.0_all_chromosomes.fa /mnt/wheatdrive/smallgrainslab/gbs_pipeline/results/discovery_files/disco_2022/avery_x_CO11D1397_discovery_gbs_tags.fa.gz > /mnt/wheatdrive/smallgrainslab/gbs_pipeline/results/discovery_files/disco_2022/avery_x_CO11D1397_discovery_gbs_tags.sam
/mnt/wheatdrive/smallgrainslab/gbs_pipeline/dependencies/tassel-5-standalone/run_pipeline.pl -Xmx450G -fork1 -SAMToGBSdbPlugin -i /mnt/wheatdrive/smallgrainslab/gbs_pipeline/results/discovery_files/disco_2022/avery_x_CO11D1397_discovery_gbs_tags.sam -db /mnt/wheatdrive/smallgrainslab/gbs_pipeline/results/discovery_files/disco_2022/avery_x_CO11D1397_discovery.db -aLen 0 -aProp 0.0 -endPlugin -runfork1
/mnt/wheatdrive/smallgrainslab/gbs_pipeline/dependencies/tassel-5-standalone/run_pipeline.pl -Xmx450G -fork1 -DiscoverySNPCallerPluginV2 -db /mnt/wheatdrive/smallgrainslab/gbs_pipeline/results/discovery_files/disco_2022/avery_x_CO11D1397_discovery.db -mnMAF 0.01 -mnLCov 0.1 -deleteOldData true -ref /mnt/wheatdrive/smallgrainslab/gbs_pipeline/ref_genos/wheat/refseqv2.0/iwgsc_refseqv2.0_all_chromosomes.fa -endPlugin -runfork1
Tell me if you need anything else.
Thanks,
Zach
Zachary Winn (he/his)
PhD Crop Science
Postdoctoral Fellow
Colorado State University
Fort Collins, CO
Phone: [redacted]
Hi Zach –
We updated the sqlite jar recently and that could be the problem. We’ve had issues with all the newer versions of this jar causing large gbs queries to hang.
I have a jar that should work, but I’m unable to send it to you as the server blocks the file. I’ll try sending this to you in a direct message.
Lynn
--
You received this message because you are subscribed to the Google Groups "TASSEL - Trait Analysis by Association, Evolution and Linkage" group.
To unsubscribe from this group and stop receiving emails from it, send an email to
tassel+un...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/tassel/d6551b01-cbf7-4979-aa6b-12a35cb39576n%40googlegroups.com.
The sqlite jar that fixes this problem can be found at:
https://repo1.maven.org/maven2/org/xerial/sqlite-jdbc/3.8.5-pre1/
This is an old jar, but it is the only sqlite jar that works for this issue. We have seen this issue in the past, so we previously kept the old sqlite jar. The repository was recently updated to the new jar to accommodate people who are using Apple machines with the M1 chip.
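For anyone following along, here is a rough sketch of how the jar swap could be done from the shell. It assumes that TASSEL picks up its jars from the lib folder inside tassel-5-standalone and that the existing jar is named sqlite-jdbc-*.jar; neither is confirmed in this thread, so check your own install first. The function name swap_sqlite_jar and the backup folder name are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: replace the sqlite-jdbc jar in a TASSEL standalone install.
# Assumptions (not confirmed in this thread):
#   - the jars live in <tassel-5-standalone>/lib
#   - the existing jar matches sqlite-jdbc-*.jar
# Keep the new jar somewhere OUTSIDE the lib folder before calling this.

swap_sqlite_jar() {
    local lib_dir="${1%/}"                      # e.g. /path/to/tassel-5-standalone/lib
    local new_jar="$2"                          # the downloaded replacement jar
    local backup_dir="${lib_dir}.sqlite-backup" # sibling folder, hypothetical name

    [ -d "$lib_dir" ] || { echo "no such directory: $lib_dir" >&2; return 1; }
    [ -f "$new_jar" ] || { echo "no such file: $new_jar" >&2; return 1; }

    mkdir -p "$backup_dir" || return 1

    # Move every existing sqlite-jdbc jar out of lib so only one version remains.
    local old
    for old in "$lib_dir"/sqlite-jdbc-*.jar; do
        [ -e "$old" ] || continue               # the glob matched nothing
        mv "$old" "$backup_dir"/ || return 1
    done

    # Drop the replacement jar into lib.
    cp "$new_jar" "$lib_dir"/
}

# Example usage (the jar file name under the Maven directory is an assumption):
#   curl -LO https://repo1.maven.org/maven2/org/xerial/sqlite-jdbc/3.8.5-pre1/sqlite-jdbc-3.8.5-pre1.jar
#   swap_sqlite_jar /mnt/wheatdrive/smallgrainslab/gbs_pipeline/dependencies/tassel-5-standalone/lib ./sqlite-jdbc-3.8.5-pre1.jar
```

The old jar is moved aside rather than deleted, so it can be restored by copying it back from the backup folder.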
From: tas...@googlegroups.com <tas...@googlegroups.com> on behalf of Zachary Winn <zw...@outlook.com>
Date: Thursday, October 27, 2022 at 6:12 PM
To: TASSEL - Trait Analysis by Association, Evolution and Linkage <tas...@googlegroups.com>
Mao –
What version of TASSEL are you running? We put in a fix that we hoped would take care of this issue.
Lynn
Hi Mao –
As Terry suggested, do a “git pull” in your tassel-5-standalone folder to get the latest version. Note that this is not a change to the sqlite jar: the change was to a slow SQL query in TASSEL. Let us know if you see better results once you’ve updated TASSEL. If you send us the log file, we can verify which TASSEL version you are running.
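A rough sketch of the update and the version check, in case it helps. The helper names update_tassel and tassel_version_from_log are made up for illustration, and the "Tassel Version" text searched for in the log is an assumption about what the log contains, so adjust the pattern if your log words it differently.

```shell
#!/usr/bin/env bash
# Sketch only: update a git checkout of tassel-5-standalone and look up
# the TASSEL version recorded in a pipeline log.

update_tassel() {
    local tassel_dir="$1"    # path to your tassel-5-standalone folder

    [ -d "$tassel_dir" ] || { echo "no such directory: $tassel_dir" >&2; return 1; }

    # Same as running "git pull" from inside the folder.
    git -C "$tassel_dir" pull || return 1

    # Show the commit now checked out, as a quick sanity check.
    git -C "$tassel_dir" log -1 --format='%h %cd %s'
}

tassel_version_from_log() {
    local log_file="$1"      # log captured from a run_pipeline.pl run

    [ -f "$log_file" ] || { echo "no such file: $log_file" >&2; return 1; }

    # Print the first line that mentions the version (assumed log wording).
    # Returns non-zero if no such line is found.
    grep -i -m 1 'tassel version' "$log_file"
}

# Example usage:
#   update_tassel /mnt/wheatdrive/smallgrainslab/gbs_pipeline/dependencies/tassel-5-standalone
#   tassel_version_from_log discovery_step.log   # hypothetical log file name
```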
Lynn
Hi Mao –
I didn’t see this message before responding to the previous one. I’m glad to hear the discovery step is working better now.
Thanks - Lynn