Hi. Sorry for a noob question, but I am stuck and looking for help. I am trying to create a Structure input file. My data are in PLINK ped/map format. For testing purposes I have subsetted a short list of SNPs, and I created the Structure input file using PLINK's "--recode-structure" option.
Plink:
"$: plink --file data --recode-structure --out structure_input"
However, when I try to analyze the data using Structure I get an error message:
"WARNING! Probable error in the input file.
Individual 538, locus 1: encountered the following data "GRC110543" when expecting an integer"
I suspect that this is because PLINK generates a table with the wrong formatting, but I am not sure.
The Structure input table generated by PLINK looks like this:
"rs3094315 rs12184325 rs3131969 rs12562034 rs2518996 rs12132517 rs11240777 rs11579015 rs12134754 rs11260595 rs6671356 rs1320571
-1 1539 77 14266 24086 6267 158 238000 88 2051 928 80405
GRC10041151 1 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
GRC10041187 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 1 2 2 2 2 2 1 2 1 2
GRC10041198 3 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2
GRC10041203 4 2 2 2 2 2 2 1 1 2 2 2 2 1 1 2 2 2 2 2 2 2 2 2 2
GRC10041306 5 2 2 2 2 2 2 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2
GRC10041153 6 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2
GRC10041158 7 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
..."
I am not sure whether the web interface will keep the formatting, but the first row is SNP tags.
The second row is numbers. SNP data start from the 3rd row, and the sample tag is in the first column.
Has anyone encountered a similar problem?
1. SNP tags start in the first column, which causes an error because the Structure parser expects an integer instead of a sample tag in the first column.
2. A strange second row with unintelligible numbers.
3. Wrong recoding of nucleotides (1/2??) instead of 1234.
4. Assigns 0s to missing values instead of -9s.
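For point 4, here is a minimal Python sketch that rewrites the missing-data code without opening the file in an editor. It assumes the layout shown above: two header rows, then one row per sample with two label columns (sample tag and an integer) followed by the allele columns, with 0 marking missing data. The function names and file paths are hypothetical. Setting the MISSING value in Structure's mainparams file to match the file may be a simpler alternative.

```python
def recode_missing(line, n_label_cols=2, old="0", new="-9"):
    """Replace the missing-data code in the allele columns of one sample row.

    Assumption: the first `n_label_cols` fields are labels (sample tag and an
    integer column) and must not be touched.
    """
    fields = line.split()
    labels, alleles = fields[:n_label_cols], fields[n_label_cols:]
    alleles = [new if a == old else a for a in alleles]
    return " ".join(labels + alleles)


def convert(in_path, out_path, n_header_rows=2):
    """Stream the file line by line, so even a very large file fits in memory.

    Header rows (assumed: marker names, then the numeric row) are copied as-is.
    """
    with open(in_path) as src, open(out_path, "w") as dst:
        for i, line in enumerate(src):
            if i < n_header_rows or not line.strip():
                dst.write(line)
            else:
                dst.write(recode_missing(line) + "\n")
```

A call such as convert("structure_input.txt", "structure_input_recoded.txt") would then write a recoded copy; both file names are placeholders.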
To view this discussion on the web visit https://groups.google.com/d/msg/structure-software/-/Yii1uGiuTQoJ.
--
You received this message because you are subscribed to the Google Groups "structure-software" group.
To post to this group, send email to structure...@googlegroups.com.
To unsubscribe from this group, send email to structure-softw...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/structure-software?hl=en.
My dataset is too large to edit in a text editor. Any tips on how to reduce it?
A powerful text editor such as 'vim' can easily handle a data set that large. Give it a shot.
Have you solved your plink conversion problem?
V
On Wed, Jan 16, 2013 at 6:26 AM, Vinod Kumar <kumar....@gmail.com> wrote:
Hi Aydar,
I have not used such a huge dataset, but we have run two datasets: the complete dataset (6000 SNPs) and a smaller dataset (1000 SNPs, with SNPs removed at equal distances throughout the genome), and the results were exactly the same in both runs. If it is a self-pollinated plant, then you can reduce the number of SNPs to some extent, but in cross-pollinated plant species it is tough to reduce a high number of SNPs, as LD decays faster in such species. Some other methods have also been discussed in this group on this topic; you can follow them.
Thanks,
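The thinning described in this reply could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: it spaces SNPs evenly by their order in a PLINK .map file (second column taken as the SNP ID) rather than by physical distance, and the function names are hypothetical.

```python
def thin_evenly(snp_ids, keep):
    """Return about `keep` SNP IDs spaced evenly by index along the input order.

    Simplification: spacing is by position in the list, not by base-pair or
    genetic distance.
    """
    if keep <= 0:
        return []
    if keep >= len(snp_ids):
        return list(snp_ids)
    step = len(snp_ids) / keep
    return [snp_ids[int(i * step)] for i in range(keep)]


def read_map_ids(map_path):
    """Read SNP IDs from a PLINK .map file (assumed to be the second column)."""
    with open(map_path) as fh:
        return [line.split()[1] for line in fh if line.strip()]
```

The selected IDs can be written one per line to a text file (say keep_snps.txt, a placeholder name) and passed to PLINK with --extract, for example "plink --file data --extract keep_snps.txt --recode-structure --out structure_thinned".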
Hi! Not yet. Now I am trying to convert the data from PLINK format to Structure format using PGDSpider. It takes quite a long time, though. Would you mind a few questions about vim? How many resources will it require?
On Thursday, January 17, 2013 8:42:23 AM UTC+3, Vikram Chhatre wrote:
A powerful text editor such as 'vim' can easily handle a data set that large. Give it a shot.
Have you solved your plink conversion problem?
V
If the plink conversion isn't working, use your raw SNP data. Vim doesn't need many resources.
---------------------------------------
Vikram Chhatre
Graduate Program in Genetics
Texas A&M University
This message was sent from a cellular device. It may contain typos and other errors.
You can use the PLINK Structure output directly by changing some parameters. In the mainparams file for Structure, set #define ONEROWPERIND 1 and #define MAPDISTANCES 1. It worked for me.
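To illustrate, here is a small Python sketch that counts individuals and loci in such a file and prints candidate mainparams lines. Only ONEROWPERIND and MAPDISTANCES come from this reply. The other settings (MARKERNAMES, LABEL, POPDATA, PLOIDY, MISSING) are assumptions inferred from the file layout posted earlier in the thread, and the function names are hypothetical, so check the values against your own file and the Structure documentation.

```python
def describe_structure_file(lines):
    """Count individuals and loci in a one-row-per-individual file.

    Assumed layout: row 1 marker names, row 2 map distances, then one row per
    individual with a sample tag, a population column, and two allele columns
    per locus.
    """
    rows = [ln.split() for ln in lines if ln.strip()]
    marker_names, distances, samples = rows[0], rows[1], rows[2:]
    num_loci = len(marker_names)
    if len(distances) != num_loci:
        raise ValueError("map-distance row does not match marker-name row")
    expected = 2 + 2 * num_loci
    for row in samples:
        if len(row) != expected:
            raise ValueError(
                f"sample {row[0]} has {len(row)} columns, expected {expected}"
            )
    return len(samples), num_loci


def mainparams_lines(num_inds, num_loci, missing=0):
    """Build '#define' lines for the data-format part of mainparams."""
    settings = {
        "NUMINDS": num_inds,
        "NUMLOCI": num_loci,
        "PLOIDY": 2,           # assumption: diploid data
        "MISSING": missing,    # assumption: 0 marks missing data; use -9 if recoded
        "ONEROWPERIND": 1,     # from the reply above
        "LABEL": 1,            # assumption: first column is the sample tag
        "POPDATA": 1,          # assumption: second column is a population integer
        "MARKERNAMES": 1,      # assumption: first row holds SNP tags
        "MAPDISTANCES": 1,     # from the reply above
    }
    return [f"#define {name} {value}" for name, value in settings.items()]
```

For example, reading the file with open(path) and passing the file object to describe_structure_file gives the two counts, which can then be fed to mainparams_lines and pasted into mainparams. The column check also flags rows that do not match the expected layout.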