Tassel 4 Pipeline MLM error – java.lang.IndexOutOfBoundsException

Fernando

Apr 19, 2013, 12:44:58 PM
to tas...@googlegroups.com


Hello everyone,


I am trying to run the Tassel 4 pipeline to perform an MLM analysis (for GWAS) on a Linux server, but I am having trouble getting it to run. I think the error is related to the format of one of the files I'm using, but I don't know which one. The error message is “java.lang.IndexOutOfBoundsException: Index: 0, Size: 0”.

 

I am using the following files (I also show the header of each one):

 

1) Chr10.txt – Genotyping


<Annotated>

<Transposed>               Yes

<Taxa_Number>            279

<Locus_Number>          33664

<Poly_Type>                 Catagorical

<Delimited_Values>       Yes

<Taxon_Name>             4226      4722      33-16     38-11 …               

<Chromosome_Number>        <Genetic_Position>       <Locus_Name>         <Value>

                              

 

2) Permt_data_Borer_kernel.txt – this file contains 1000 permutations of BLUEs for 279 maize inbred lines.

 

<Trait>                 Permt1                Permt2             Permt1000                                        

4226                    7.163                     9.017               8.935

A554                    8.708                    7.887                7.491

CML52                 6.979                    7.887                8.365

.

.

 

3) Population_structure_5k_SNP.txt

 

279        3             1            

Q1          Q2          Q3         

4226      0.216     0.107     0.677 …

4722      0.004     0.024     0.971 …

33-16    0.137     0.088     0.775 …

.

.

 

4) Kinship_matrix_5k_SNP.txt

 

The script I'm using is:

 

run_pipeline.pl -fork1 -a Chr10.txt -fork2 -r Permt_data_Borer_kernel.txt -fork3 -p Population_structure_5k_SNP.txt -excludeLastTrait -fork4 -k Kinship_matrix_5k_SNP.txt -combine5 -input1 -input2 -input3 -union -combine6 -input5 -input4 -mlm -mlmVarCompEst P3D -mlmCompressionLevel Optimum -mlmMaxP 1e-3 -export mlm_BK_Chr10_output -runfork1 -runfork2 -runfork3 -runfork4

 

And the following is the error message:

 

Tassel Pipeline Arguments: -fork1 -a Chr10.txt -fork2 -r Permt_data_Borer_kernel.txt -fork3 -p Population_structure_5k_SNP.txt -excludeLastTrait -fork4 -k Kinship_matrix_5k_SNP.txt -combine5 -input1 -input2 -input3 -union -combine6 -input5 -input4 -mlm -mlmVarCompEst P3D -mlmCompressionLevel Optimum -mlmMaxP 1e-3 -export mlm_BK_Chr10_output -runfork1 -runfork2 -runfork3 -runfork4

Picked up _JAVA_OPTIONS: -Xmx2097152K

[main] INFO net.maizegenetics.pipeline.TasselPipeline - Tassel Version: 4.1.27  Date: April 11, 2013

[main] INFO net.maizegenetics.pipeline.TasselPipeline - Max Available Memory Reported by JVM: 1820 MB

[main] INFO net.maizegenetics.pipeline.TasselPipeline - loadFile: Chr10.txt

[main] INFO net.maizegenetics.pipeline.TasselPipeline - loadFile: Permt_data_Borer_kernel.txt

[main] INFO net.maizegenetics.pipeline.TasselPipeline - loadFile: Population_structure_5k_SNP.txt

[main] INFO net.maizegenetics.pipeline.TasselPipeline - loadFile: Kinship_matrix_5k_SNP.txt

net.maizegenetics.baseplugins.FileLoadPlugin

   net.maizegenetics.baseplugins.CombineDataSetsPlugin

      net.maizegenetics.baseplugins.UnionAlignmentPlugin

         net.maizegenetics.baseplugins.CombineDataSetsPlugin

            net.maizegenetics.baseplugins.MLMPlugin

               net.maizegenetics.baseplugins.ExportMultiplePlugin

net.maizegenetics.baseplugins.FileLoadPlugin

net.maizegenetics.baseplugins.FileLoadPlugin

   net.maizegenetics.baseplugins.FilterTraitsPlugin

net.maizegenetics.baseplugins.FileLoadPlugin

java.lang.IndexOutOfBoundsException: Index: 0, Size: 0

        at java.util.ArrayList.RangeCheck(ArrayList.java:547)

        at java.util.ArrayList.get(ArrayList.java:322)

        at net.maizegenetics.baseplugins.FilterTraitsPlugin.performFunction(FilterTraitsPlugin.java:74)

        at net.maizegenetics.plugindef.AbstractPlugin.dataSetReturned(AbstractPlugin.java:201)

        at net.maizegenetics.plugindef.AbstractPlugin.fireDataSetReturned(AbstractPlugin.java:137)

        at net.maizegenetics.baseplugins.FileLoadPlugin.performFunction(FileLoadPlugin.java:195)

        at net.maizegenetics.plugindef.AbstractPlugin.dataSetReturned(AbstractPlugin.java:201)

        at net.maizegenetics.plugindef.ThreadedPluginListener.run(ThreadedPluginListener.java:29)

[Thread-2] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.baseplugins.FileLoadPlugin: progress: 100%

[Thread-3] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.baseplugins.FileLoadPlugin: progress: 100%

[Thread-1] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.baseplugins.FileLoadPlugin: progress: 100%

[Thread-0] INFO net.maizegenetics.pipeline.TasselPipeline - net.maizegenetics.baseplugins.FileLoadPlugin: progress: 100%

 

Could someone help me?

 

Thanks


Fernando
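
Reading the trace above: the two top frames show FilterTraitsPlugin.performFunction calling ArrayList.get with index 0 on a list whose size is 0. FilterTraitsPlugin appears in the plugin tree under the third FileLoadPlugin, so this is apparently the step added by -excludeLastTrait, and it received nothing to filter from the file loaded in fork 3. A minimal sketch of that failure mode follows. The class and method names are hypothetical, this is not TASSEL's code, and the trace alone does not say what the empty list was meant to hold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; it only mimics the failure mode seen in the trace.
class EmptyListDemo {

    // Returns true when reading element 0 throws IndexOutOfBoundsException,
    // which is what happens for any list of size 0.
    static boolean firstElementAccessFails(List<String> items) {
        try {
            items.get(0);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    // Defensive variant of "drop the last trait": fail with a readable
    // message instead of an index error when nothing was parsed.
    static List<String> excludeLast(List<String> traits) {
        if (traits.isEmpty()) {
            throw new IllegalArgumentException(
                "no traits found; check the header of the input file");
        }
        return new ArrayList<>(traits.subList(0, traits.size() - 1));
    }

    public static void main(String[] args) {
        // An empty list stands in for a file whose header was not recognized.
        System.out.println(firstElementAccessFails(new ArrayList<String>()));
        System.out.println(excludeLast(Arrays.asList("Q1", "Q2", "Q3")));
    }
}
```

The exact wording of the exception message depends on the Java version; the "Index: 0, Size: 0" form in the log comes from the JVM used on that server.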

Terry Casstevens

Apr 19, 2013, 5:12:47 PM
to Tassel User Group
Hi Fernando,

I recommend reformatting your population structure file to match the
one in the tutorial data set. Yours is only slightly different. Also,
HapMap is the preferred format for genotype data.

Cheers,

Terry
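
For readers who hit the same error: the population structure file in the first post opens with a counts line (279 3 1) followed by a bare row of column names, while the phenotype file in the same post uses a <Trait> header row. The thread does not show the tutorial file, so the target layout below is an assumption (a <Trait> header row like the phenotype file's); compare it against the actual tutorial data before relying on it. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of TASSEL.
class PopStructureReformat {

    // ASSUMPTION: the tutorial layout starts with a "<Trait>" header row.
    static final String TRAIT_TAG = "<Trait>";

    // Converts lines of the form
    //   279 3 1            (counts line, dropped)
    //   Q1 Q2 Q3           (column names)
    //   4226 0.216 ...     (taxon followed by values)
    // into tab-delimited lines with a "<Trait>" header row.
    // Taxon names containing whitespace are not handled.
    static List<String> toTraitLayout(List<String> lines) {
        if (lines.size() < 2) {
            throw new IllegalArgumentException("expected a counts line and a names line");
        }
        List<String> out = new ArrayList<>();
        String[] names = lines.get(1).trim().split("\\s+");
        out.add(TRAIT_TAG + "\t" + String.join("\t", names));
        for (int i = 2; i < lines.size(); i++) {
            String row = lines.get(i).trim();
            if (!row.isEmpty()) {
                out.add(String.join("\t", row.split("\\s+")));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "279   3   1",
            "Q1  Q2  Q3",
            "4226   0.216  0.107  0.677");
        for (String line : toTraitLayout(input)) {
            System.out.println(line);
        }
    }
}
```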

Fernando

Apr 24, 2013, 11:48:28 AM
to tas...@googlegroups.com
Hi Terry,

I changed the format of the population structure file and now my script is working.

One more thing: how many threads does Tassel use? I am working on a large server and want to request as many cores as the program will open threads.

Thanks

Fernando

Terry Casstevens

Apr 24, 2013, 11:51:23 AM
to Tassel User Group
Hi Fernando,

Tassel detects how many cores you have and creates a thread pool of that size.

Cheers,

Terry
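
The general pattern described here, sizing a thread pool to the detected core count, can be sketched with the standard Java concurrency classes. This is an illustration of the pattern only, not TASSEL's actual code; the class name and the toy workload are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class, not part of TASSEL.
class CoreSizedPool {

    // Number of processors the JVM reports as available to it.
    static int detectedCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Toy workload: sums i*i for i = 1..n, one task per term,
    // on a fixed pool with one thread per detected core.
    static long sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(detectedCores());
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final long v = i;
                Callable<Long> task = () -> v * v;
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get(); // blocks until that task has finished
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("cores reported by the JVM: " + detectedCores());
        System.out.println("sum of squares 1..10: " + sumOfSquares(10));
    }
}
```

A pool of this size only sets an upper bound on parallelism; how many cores are actually busy depends on how much work is submitted to the pool at once.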

John Hart

Oct 5, 2015, 9:50:17 AM
to TASSEL - Trait Analysis by Association, Evolution and Linkage

We have noticed that when running the TASSEL 3 pipeline on our Linux server, it does not fully utilize the server's 32 cores. The system monitor shows only 32% usage. Is this normal for TASSEL, or is there a way to increase the CPU usage? Below are the specs of the server. Any information on this matter would be greatly appreciated.
 
2x 16 Core E5-2698 Intel Xeon CPUs
198 GB RAM
20 TB RAID
Ubuntu Server 14.04.




Peter Bradbury

Oct 5, 2015, 10:10:43 AM
to TASSEL - Trait Analysis by Association, Evolution and Linkage
Some specific TASSEL functions (such as MLM and GLM) are single threaded while others are multithreaded. So how many threads get utilized depends on what methods or plugins are used. Also, it is possible that usage varies over time as TASSEL progresses through the pipeline.

Peter
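
As a rough way to read a system monitor figure against this explanation, the arithmetic below relates busy threads to an overall CPU percentage. It assumes the monitor reports usage averaged over all cores (some tools report per-core percentages instead) and that each busy thread saturates one core; the class and method names are hypothetical.

```java
// Hypothetical helper for back-of-the-envelope estimates only.
class CpuUsageEstimate {

    // Overall CPU percentage shown when `busyThreads` threads each keep
    // one core fully busy on a machine with `cores` cores.
    static double percentForBusyThreads(int busyThreads, int cores) {
        int effective = Math.min(busyThreads, cores);
        return 100.0 * effective / cores;
    }

    // Inverse: how many cores' worth of work an overall percentage implies.
    static double busyCoresForPercent(double percent, int cores) {
        return percent / 100.0 * cores;
    }

    public static void main(String[] args) {
        System.out.println(percentForBusyThreads(1, 32));
        System.out.println(busyCoresForPercent(32.0, 32));
    }
}
```

Under those assumptions, a single-threaded step on a 32-core machine shows up as about 3% overall, and a 32% reading corresponds to roughly 10 cores' worth of work.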

John Hart

Oct 5, 2015, 10:16:34 AM
to tas...@googlegroups.com
Thanks Peter,
  In this case I am running FastqToTBTPlugin and the system monitor never indicates that more than 32% CPU is utilized...
   


Edward S. Buckler

Oct 5, 2015, 10:40:31 AM
to tas...@googlegroups.com
In the world of GBS, only the GBSv2 pipeline is fully multithreaded. However, when reading FASTQ files, disk I/O bandwidth may still prevent full core usage.

Cheers-
Ed
