Problem with -combine# in ./run_pipeline.p

ningz...@gmail.com

Feb 22, 2018, 11:13:50 PM
to TASSEL - Trait Analysis by Association, Evolution and Linkage
Hi, I'm new to TASSEL.

I'm having a problem with "-combine":

I changed to the TASSEL5 directory and ran this command:
"./run_pipeline.pl -fork1 -h TutorialData/mdp_genotype.hmp.txt -fork2 -r TutorialData/mdp_phenotype.txt -combine3 -input1 -input2 -intersect -FixedEffectLMPlugin -endPlugin -export glm_output"

But I don't get any GLM results.

The output of the command is:

"./lib/jhdf5-14.12.5.jar:./lib/jfreesvg-3.2.jar:./lib/biojava-phylo-4.0.0.jar:./lib/slf4j-simple-1.7.10.jar:./lib/biojava-core-4.0.0.jar:./lib/guava-22.0.jar:./lib/htsjdk-2.14.0.jar:./lib/sqlite-jdbc-3.8.5-pre1.jar:./lib/commons-math3-3.4.1.jar:./lib/snappy-java-1.1.1.6.jar:./lib/itextpdf-5.1.0.jar:./lib/ejml-0.23.jar:./lib/commons-codec-1.10.jar:./lib/log4j-1.2.13.jar:./lib/mail-1.4.jar:./lib/sTASSEL.jar:./lib/junit-4.10.jar:./lib/trove-3.0.3.jar:./lib/jfreechart-1.0.19.jar:./lib/ahocorasick-0.2.4.jar:./lib/forester-1.038.jar:./lib/postgresql-9.4-1201.jdbc41.jar:./lib/biojava-alignment-4.0.0.jar:./lib/colt-1.2.0.jar:./lib/jcommon-1.0.23.jar:./lib/slf4j-api-1.7.10.jar:./lib/json-simple-1.1.1.jar:./lib/javax.json-1.0.4.jar:./sTASSEL.jar
Memory Settings: -Xms512m -Xmx1536m
Tassel Pipeline Arguments: -fork1 -h TutorialData/mdp_genotype.hmp.txt -fork2 -r TutorialData/mdp_phenotype.txt -combine3 -input1 -input2 -intersect -FixedEffectLMPlugin -endPlugin -export glm_output
[main] INFO net.maizegenetics.tassel.TasselLogging - Tassel Version: 5.2.42  Date: February 1, 2018
[main] INFO net.maizegenetics.tassel.TasselLogging - Max Available Memory Reported by JVM: 1365 MB
[main] INFO net.maizegenetics.tassel.TasselLogging - Java Version: 1.8.0_161
[main] INFO net.maizegenetics.tassel.TasselLogging - OS: Linux
[main] INFO net.maizegenetics.tassel.TasselLogging - Number of Processors: 4
[main] INFO net.maizegenetics.pipeline.TasselPipeline - Tassel Pipeline Arguments: [-fork1, -h, TutorialData/mdp_genotype.hmp.txt, -fork2, -r, TutorialData/mdp_phenotype.txt, -combine3, -input1, -input2, -intersect, -FixedEffectLMPlugin, -endPlugin, -export, glm_output]
[ForkJoinPool.commonPool-worker-2] INFO net.maizegenetics.plugindef.AbstractPlugin - Starting net.maizegenetics.analysis.data.FileLoadPlugin: time: Feb 22, 2018 23:45:45
[ForkJoinPool.commonPool-worker-1] INFO net.maizegenetics.plugindef.AbstractPlugin - Starting net.maizegenetics.analysis.data.FileLoadPlugin: time: Feb 22, 2018 23:45:45
[ForkJoinPool.commonPool-worker-1] INFO net.maizegenetics.plugindef.AbstractPlugin - 
FileLoadPlugin Parameters
format: Hapmap
sortPositions: false

[ForkJoinPool.commonPool-worker-2] INFO net.maizegenetics.plugindef.AbstractPlugin - 
FileLoadPlugin Parameters
format: Phenotype
sortPositions: false

[ForkJoinPool.commonPool-worker-1] INFO net.maizegenetics.analysis.data.FileLoadPlugin - Start Loading File: TutorialData/mdp_genotype.hmp.txt time: Feb 22, 2018 23:45:45
[ForkJoinPool.commonPool-worker-2] INFO net.maizegenetics.analysis.data.FileLoadPlugin - Start Loading File: TutorialData/mdp_phenotype.txt time: Feb 22, 2018 23:45:45
[ForkJoinPool.commonPool-worker-2] INFO net.maizegenetics.analysis.data.FileLoadPlugin - Finished Loading File: TutorialData/mdp_phenotype.txt time: Feb 22, 2018 23:45:46
[ForkJoinPool.commonPool-worker-2] INFO net.maizegenetics.plugindef.AbstractPlugin - Finished net.maizegenetics.analysis.data.FileLoadPlugin: time: Feb 22, 2018 23:45:46
[ForkJoinPool.commonPool-worker-2] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.analysis.data.FileLoadPlugin: time: Feb 22, 2018 23:45:46: progress: 100%
[ForkJoinPool.commonPool-worker-2] INFO net.maizegenetics.plugindef.AbstractPlugin - net.maizegenetics.analysis.data.FileLoadPlugin  Citation: Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES. (2007) TASSEL: Software for association mapping of complex traits in diverse samples. Bioinformatics 23:2633-2635.
[ForkJoinPool.commonPool-worker-1] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.analysis.data.FileLoadPlugin: time: Feb 22, 2018 23:45:46: progress: 100%
[ForkJoinPool.commonPool-worker-1] INFO net.maizegenetics.analysis.data.FileLoadPlugin - Finished Loading File: TutorialData/mdp_genotype.hmp.txt time: Feb 22, 2018 23:45:46
Genotype Table Name: mdp_genotype
Number of Taxa: 281
Number of Sites: 3093
Sites x Taxa: 869133
Chromosomes...
1: start site: 0 (157104) last site: 539 (299170077) total: 540
2: start site: 540 (736367) last site: 932 (234574991) total: 393
3: start site: 933 (1240310) last site: 1287 (229544509) total: 355
4: start site: 1288 (139753) last site: 1606 (245131801) total: 319
5: start site: 1607 (656148) last site: 1963 (216431558) total: 357
6: start site: 1964 (2379148) last site: 2176 (167883450) total: 213
7: start site: 2177 (729478) last site: 2422 (170346253) total: 246
8: start site: 2423 (169137) last site: 2678 (172323795) total: 256
9: start site: 2679 (3873116) last site: 2891 (151289948) total: 213
10: start site: 2892 (838970) last site: 3092 (148907116) total: 201

[ForkJoinPool.commonPool-worker-1] INFO net.maizegenetics.plugindef.AbstractPlugin - Finished net.maizegenetics.analysis.data.FileLoadPlugin: time: Feb 22, 2018 23:45:46
[ForkJoinPool.commonPool-worker-1] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.analysis.data.FileLoadPlugin: time: Feb 22, 2018 23:45:46: progress: 100%
"

I also tried "./run_pipeline.pl -fork1 -h TutorialData/mdp_genotype.hmp.txt -fork2 -r TutorialData/mdp_phenotype.txt -combine3 -input1 -input2 -intersect -export combined.txt -exportType Table -runfork1 -runfork2", which also produced no output file.

Can you help resolve this problem?

Thanks!!




Terry Casstevens

Feb 23, 2018, 12:30:19 AM
to Tassel User Group
The build that I posted today fixes this problem.

ningz...@gmail.com

Feb 23, 2018, 10:19:51 AM
to TASSEL - Trait Analysis by Association, Evolution and Linkage
Thanks Terry,

I downloaded the latest build from https://tassel.bitbucket.io/installer/TASSEL_5_unix.sh

The command "./run_pipeline.pl -fork1 -h TutorialData/mdp_genotype.hmp.txt -fork2 -r TutorialData/mdp_phenotype.txt -combine3 -input1 -input2 -intersect -FixedEffectLMPlugin -endPlugin -export glm_output"

now returns a new error message:

[Thread-6] INFO net.maizegenetics.plugindef.AbstractPlugin - 
Usage:
FixedEffectLMPlugin <options>
-phenoOnly <true | false> : Should the phenotype be analyzed with no markers and BLUEs generated? (BLUE = best linear unbiased estimate) (Default: false)
-saveToFile <true | false> : Should the results be saved to a file rather than stored in memory? It true, the results will be written to a file as each SNP is analyzed in order to reduce memory requirementsand the results will NOT be saved to the data tree. Default = false. (Default: false)
-siteFile <Statistics File> : The name of the file to which these results will be saved.
-alleleFile <Genotype Effect File> : The name of the file to which these results will be saved.
-maxP <max P value> : Only results with p <= maxPvalue will be reported. Default = 1.0. [0.0..1.0] (Default: 1.0)
-permute <true | false> : Should a permutation analysis be run? The permutation analysis controls the experiment-wise error rate for individual phenotypes. (Default: false)
-nperm <Number of Permutations> : The number of permutations to be run for the permutation analysis. (Default: 0)
-genotypeComponent <Genotype Component> : If the genotype table contains more than one type of genotype data, choose the type to use for the analysis. [Genotype, ReferenceProbability, AlleleProbability] (Default: Genotype)
-minClassSize <Minimum Class Size> : The minimum acceptable genotype class size. Genotypes in a class with a smaller size will be set to missing. (Default: 0)
-biallelicOnly <true | false> : Only test sites that are bi-allelic. The alternative is to test sites with two or more alleles. (Default: false)
-siteStatsOut <true | false> : Generate an output dataset with only p-val, F statistic, and number of obs per site for all sites. (Default: false)
-siteStatFile <Site Stat File> : Site Stat File
-appendAddDom <true | false> : If true, additive and dominance effect estimates will be added to the stats report for bi-allelic sites only. The effect will only be estimated when the data source is genotype (not a probability). The additive effect will always be non-negative. (Default: false)

[Thread-6] ERROR net.maizegenetics.plugindef.AbstractPlugin - There are more phenotype observations than taxa with genotypes. Either some taxa have multiple phenotypes or some taxa do not have genotypes. Tassel version 5 will not run GLM when that is the case. Be sure to use an intersect join to merge genotypes and phenotypes.

Can you help look at this?
Thanks!!

Peter Bradbury

Feb 25, 2018, 2:19:32 PM
to TASSEL - Trait Analysis by Association, Evolution and Linkage
The problem is that mdp_phenotype.txt contains replicated taxa, which causes the error you saw. If you use mdp_traits.txt instead, GLM will run. Alternatively, run GLM on mdp_phenotype.txt alone to generate BLUEs, then combine those with the genotypes.
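
Before rerunning GLM, it can help to confirm which taxa are replicated in a phenotype file. What follows is a minimal Python sketch and is not part of TASSEL. `replicated_taxa` is a hypothetical helper. It assumes that the taxon name is the first tab-separated field of each data row and that lines beginning with `<` are format headers. Neither assumption has been checked against the TASSEL format documentation. The sample rows are made up for illustration and are not the contents of mdp_phenotype.txt.

```python
from collections import Counter
import io


def replicated_taxa(lines):
    """Return {taxon: count} for taxa that appear on more than one row.

    Assumption (hypothetical, not verified against TASSEL docs): the taxon
    name is the first tab-separated field of every data row, and lines
    starting with '<' are format headers to be skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and '<...>' header lines.
        if not line.strip() or line.startswith("<"):
            continue
        counts[line.split("\t")[0]] += 1
    # Keep only names seen more than once.
    return {taxon: n for taxon, n in counts.items() if n > 1}


# Made-up rows for illustration only.
sample = io.StringIO(
    "<Phenotype>\n"
    "taxa\tfactor\tdata\n"
    "Taxa\tlocation\tEarHT\n"
    "lineA\tloc1\t59.5\n"
    "lineA\tloc2\t61.0\n"
    "lineB\tloc1\t70.2\n"
)
duplicates = replicated_taxa(sample)
print(duplicates)
```

To check a real file, pass an open file handle, for example `replicated_taxa(open("TutorialData/mdp_phenotype.txt"))`. Under the assumptions above, an empty result would suggest one row per taxon, while a non-empty result points to the replicated taxa that the GLM error message complains about.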

Peter