Problem loading data into Tassel_First time user


Mose

Aug 30, 2012, 3:41:19 AM
to tas...@googlegroups.com

I am a new TASSEL user, but I have already hit a problem on my first day: I am unable to load my data set into the software. I am trying to do association mapping using 42,495 (42k) SNP markers and 200 maize inbred lines. The SNPs were assembled using AGPv2.

A brief description of my data structure: I have 3 tab-delimited text files. The first file contains the 42k SNP markers as nucleotides (A, C, G and T), with heterozygotes separated by a slash (e.g., A/T) and missing data shown as a dash (-). In this file, the first row contains the SNP IDs and the subsequent rows contain the SNP calls; the genotype IDs are in the first column. The second file contains the SNP map: the first row holds the SNP ID, the second row the chromosome number, and the third row the SNP position. The third file contains the phenotypic data set: the first column contains the genotype IDs and the subsequent columns hold numeric phenotypic data; the first row contains the phenotypic data IDs.

 

I am trying to load the data through the TASSEL v4 web interface, but I get the message "make sure that import options are properly set". I suspect my data are not in the right format for TASSEL. I need assistance on how to reformat my data files and/or load my data into TASSEL.
 

Edward S Buckler

Aug 30, 2012, 7:00:30 AM
to <tassel@googlegroups.com>
Hello-
The first file should combine the genotypes and positions in HapMap format.  It is the most reliably maintained format.  Going forward we will maintain the HapMap and VCF formats (and perhaps PLINK), as these are the most common formats in human genetics.

HapMap does not use a "/".  Missing data is N, while "-" implies gap.
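
As a rough illustration of that recoding, the sketch below converts calls written as A, A/T or - into single-letter codes, using the standard IUPAC ambiguity letters for heterozygotes. The function name `to_hapmap_call` and the `missing_symbol` parameter are hypothetical, and the sketch assumes that "-" in the source file always means missing data rather than a true gap.

```python
# IUPAC ambiguity codes for the six possible heterozygous SNP pairs.
IUPAC_HET = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}

BASES = {"A", "C", "G", "T"}


def to_hapmap_call(call, missing_symbol="-"):
    """Convert one call such as 'A', 'A/T' or '-' to a single-letter code.

    Hypothetical helper: it assumes `missing_symbol` marks missing data in
    the source file, so it is written out as 'N' (HapMap keeps '-' for gaps).
    """
    call = call.strip().upper()
    if call == "" or call == missing_symbol:
        return "N"
    alleles = call.split("/")
    if len(alleles) > 2 or any(a not in BASES for a in alleles):
        raise ValueError(f"unrecognised genotype call: {call!r}")
    if len(set(alleles)) == 1:
        # Homozygous call, written either as 'A' or as 'A/A'.
        return alleles[0]
    # Heterozygous call: allele order does not matter.
    return IUPAC_HET[frozenset(alleles)]


if __name__ == "__main__":
    for raw in ["A", "A/T", "T/A", "-"]:
        print(raw, "->", to_hapmap_call(raw))
```

Raising an error on anything unexpected, instead of silently writing N, makes stray symbols in a 42k-marker file easier to spot before loading.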

Cheers-
Ed



--
You received this message because you are subscribed to the Google Groups "TASSEL - Trait Analysis by Association, Evolution and Linkage" group.
To post to this group, send email to tas...@googlegroups.com.
To unsubscribe from this group, send email to tassel+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msg/tassel/-/RGkCeCFrERUJ.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

--
Ed Buckler
USDA-ARS Research Geneticist
Adj. Prof of Plant Breeding and Genetics
Institute for Genomic Diversity
Cornell University
159 Biotechnology Bldg
Ithaca, NY 14853-2703

Email:  es...@cornell.edu
Voice: (607) 255-4520
Fax: (607) 255-6249

Homepage: http://www.maizegenetics.net/




Mose

Aug 30, 2012, 11:08:32 AM
to tas...@googlegroups.com
Hello Ed,

Thanks. I have reformatted my data to HapMap format, but I still have a problem loading my file. I attach a screenshot of the error message. It is worth noting that I have 346 SNPs that were not mapped to any of the 10 chromosomes but were assigned to the "UN" chromosome. I guess this may be the source of the problem, since I have labelled them as NA. What should I enter as the chromosome for such SNPs, or do I have to remove them from my data set? I have labelled missing SNPs as NN, since my data are now coded as AA, CC, GG, ... for homozygotes and AT, ... for heterozygotes.
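
One way to check whether rows like these are what the loader trips over is to list every marker whose chromosome or position field is not usable. The sketch below is only an assumption about how such a check could look: it relies on the usual HapMap column order (rs#, alleles, chrom, pos, ...), and the function name `find_unplaced_rows` and the default labels are made up for illustration.

```python
def find_unplaced_rows(lines, unplaced_labels=("NA", "UN", "chrUN")):
    """Return (line_number, marker_id, chrom, pos) for suspicious rows.

    Hypothetical helper. A row is flagged when its chromosome is one of
    `unplaced_labels` or its position is not a whole number. Assumes a
    tab-delimited HapMap file whose first line is the header.
    """
    flagged = []
    for number, line in enumerate(lines, start=1):
        if number == 1 or not line.strip():
            continue  # skip the header row and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            # Too few columns to hold chrom and pos at all.
            flagged.append((number, fields[0], "", ""))
            continue
        marker, chrom, pos = fields[0], fields[2], fields[3]
        if chrom in unplaced_labels or not pos.isdigit():
            flagged.append((number, marker, chrom, pos))
    return flagged


if __name__ == "__main__":
    import sys

    # Usage: python check_hapmap.py my_genotypes.hmp.txt
    with open(sys.argv[1]) as handle:
        for row in find_unplaced_rows(handle):
            print(*row, sep="\t")
```

If the list that comes back has 346 entries, the unmapped SNPs are the only rows with this kind of problem.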

Cheers

Mose


Loading error message.jpg

Terry Casstevens

Aug 30, 2012, 11:30:01 AM
to tas...@googlegroups.com
Dear Mose,

Please have a look at our User's Guide for details
about Hapmap and look at the sample Hapmap in
our tutorial dataset. If you still have problems, please
send me a sample file that isn't working.

Cheers,

Terry

Mose

Aug 30, 2012, 1:19:00 PM
to tas...@googlegroups.com
Dear Terry,
Thanks. I was able to load the data. The problem was with the 346 SNPs that were mapped to "chrUN". After removing all SNPs mapping to "chrUN", I was able to load the file. Is there a way around this problem without deleting the chrUN SNPs? Since SNPs located on chrUN have no position, I had initially filled this column with "NA".

Cheers

Moses

Terry Casstevens

Aug 30, 2012, 3:54:14 PM
to tas...@googlegroups.com
Dear Moses,

I'm not sure without a more detailed error
message. Try using "0" for chromosome. Otherwise,
I could diagnose with a sample file.
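
A minimal sketch of the "0" suggestion, assuming a tab-delimited HapMap file with chrom in the third column and pos in the fourth. The function name `relabel_unplaced` and its parameters are hypothetical. Writing "0" into a non-numeric position as well is an extra assumption that goes beyond the advice above and has not been confirmed to load in TASSEL.

```python
def relabel_unplaced(line, unplaced_labels=("NA", "UN", "chrUN"),
                     chrom_placeholder="0", pos_placeholder="0"):
    """Rewrite one HapMap row so unplaced SNPs use a placeholder chromosome.

    Hypothetical helper. Only the chrom (3rd) and pos (4th) columns are
    touched; every other field, including any literal 'NA', is left alone.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) >= 4 and fields[2] in unplaced_labels:
        fields[2] = chrom_placeholder
        if not fields[3].isdigit():
            # Assumption: a numeric placeholder is needed here too.
            fields[3] = pos_placeholder
    return "\t".join(fields)


if __name__ == "__main__":
    import sys

    # Usage: python relabel_hapmap.py input.hmp.txt output.hmp.txt
    with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
        for row in src:
            dst.write(relabel_unplaced(row) + "\n")
```

The header row passes through unchanged because its third field is the column name rather than one of the unplaced labels.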

Cheers,

Terry