read.genalex and aboot functions, how to change labels?

Jose Freixas

Nov 8, 2017, 11:34:44 AM
to poppr
Hello all,

I am trying to show the species names in the final dendrogram I get from the aboot function. I have the data in GenAlEx format, and I understand that I cannot label my samples with alphabetic characters and should use alphanumeric labels in GenAlEx instead. That's OK, but how can I replace them with the actual species/cultivar names in the final dendrogram?

Any help is much appreciated.

Zhian Kamvar

Nov 8, 2017, 11:39:56 AM
to Jose Freixas, poppr
Hi Jose,

There's no reason why you can't have your samples labeled with the species/cultivar names in your GenAlEx file. You can name them there. Otherwise, if you already named them with numbers, you can set the sample names in your data set with:

indNames(dat) <- vector_of_names # assuming you have a vector of names

You can plot the output of aboot using the plot.phylo function from ape for further customization.
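
To make those two steps concrete, here is a minimal sketch. It uses the nancycats example data that ships with adegenet rather than the hazelnut file, and the labels in new_names are made up for illustration. The distance function and number of bootstrap replicates are arbitrary choices for the example, not recommendations.

```r
library(poppr)  # attaches adegenet, which provides indNames() and nancycats
library(ape)    # provides plot.phylo()

data(nancycats)
dat <- popsub(nancycats, 9)  # one small population keeps the example fast

# Hypothetical replacement labels: one per individual, in the same order as the data
new_names <- paste0("cultivar_", seq_len(nInd(dat)))
indNames(dat) <- new_names

# Bootstrapped UPGMA tree; showtree = FALSE so we can plot it ourselves below
set.seed(1)
tree <- aboot(dat, tree = "upgma", distance = rogers.dist, sample = 100,
              showtree = FALSE, quiet = TRUE)

# Tip labels are the individual names; node labels are bootstrap support values
plot.phylo(tree, show.node.label = TRUE, cex = 0.8)
```

Because the names are set on the data before the tree is built, the tips of the tree carry the new labels without any further editing.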

Best,
Zhian

-----
Zhian N. Kamvar, Ph. D.
Postdoctoral Researcher (Everhart Lab)
Department of Plant Pathology
University of Nebraska-Lincoln
ORCID: 0000-0003-1458-7108




--
You received this message because you are subscribed to the Google Groups "poppr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to poppr+un...@googlegroups.com.
To post to this group, send email to po...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/poppr/d5016b37-7487-43a2-8a3c-5918289b7c5f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Zhian Kamvar

Nov 8, 2017, 11:56:01 AM
to Jose Freixas, poppr
Hi Jose,

indNames() comes from the adegenet package (poppr uses much of the functionality from adegenet in the underlying code). It allows you to replace the sample names in your data set with anything you want. You can find the details in section 3.3 of the "basics" tutorial for adegenet: (https://github.com/thibautjombart/adegenet/wiki/Tutorials). 

So if you had a spreadsheet of sample names, you could read it in and replace your numeric names with those. 
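
As a rough sketch of that idea in base R: the lookup table below is hypothetical (its column names id and name, and the cultivar names in it, are invented for the example), and current_ids stands in for whatever indNames(dat) returns on your data. Matching on the ID rather than relying on row order guards against the spreadsheet being sorted differently from the data set.

```r
# Stand-in for indNames(dat): the numeric labels currently in the data set
current_ids <- c("1", "2", "3")

# Stand-in for read.csv("names.csv"): a hypothetical two-column lookup table.
# Note that its rows are not in the same order as the samples.
lookup <- read.csv(text = "id,name
3,Barcelona
1,Tonda Gentile
2,Jefferson", stringsAsFactors = FALSE)

# Find the lookup row for each current label, then pull the names in that order
new_names <- lookup$name[match(current_ids, as.character(lookup$id))]

# Stop early if any sample has no entry in the lookup table
stopifnot(!anyNA(new_names))

print(new_names)
# With a real genind/genclone object you would then run:
# indNames(dat) <- new_names
```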

That being said, your problem may be how your spreadsheet program converts data to CSV or there may be a hidden newline character in your data. Either way, you can also try to use the read.genalexcel() function from the popprxl package on your XLS data. 
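
One way to check for this kind of problem before importing is to compare the number of samples declared in the first row of the file with the number of data rows actually present. The sketch below does that in base R on a toy file written to a temporary location; the file contents and sample labels are invented, and it does not reproduce how read.genalex itself parses the file. It only assumes the usual GenAlEx layout, where the second value in row 1 is the number of samples and the data start on row 4.

```r
# Write a tiny GenAlEx-style CSV to a temporary file (toy data, not a real data set).
# Row 1: number of loci, samples, populations, then each population's size.
# Row 2: title and population names. Row 3: column headers. Row 4+: one sample per row.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "2,3,1,3",
  "Toy data,,,pop1",
  "Sample,Pop,loc1,,loc2,",
  "Barcelona,pop1,101,103,200,200",
  "O'Brien,pop1,101,101,200,204",
  "Jefferson,pop1,103,105,204,204",
  ",,,,,"  # blank trailing row, as spreadsheets sometimes export
), path)

lines <- readLines(path)

# Number of samples declared in the header (second field of row 1)
n_declared <- as.integer(strsplit(lines[1], ",")[[1]][2])

# Data rows start at line 4; drop rows that contain nothing but separators
data_rows <- lines[-(1:3)]
data_rows <- data_rows[nzchar(gsub("[,[:space:]]", "", data_rows))]
n_found <- length(data_rows)

# Rows containing quote characters, which some parsers treat specially
has_quote <- grep("['\"]", data_rows, value = TRUE)

cat("declared:", n_declared, "found:", n_found, "\n")
print(has_quote)
```

If the two counts disagree, or if rows show up in has_quote, those rows are a reasonable place to start looking.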

Hope that helps,
Zhian


On Nov 8, 2017, at 10:47 , Jose Freixas <joe...@gmail.com> wrote:

Dear Zhian,

As far as I know, GenAlEx doesn't allow me to enter the species names when dealing with SSR data. It only allows me to enter alphanumeric characters. I tried to enter the species names in column 1, where some but not all contain alphanumeric characters, and I got the following error:

Error in read.genalex("C:/Users/Jose Freixas/Documents/GRIPP/Hazelnut/ABI fragment analysis/Using R for fragment analysis/Database 44 Hazelnut accessions 20171018.csv",  : 
  
 The number of rows in your data do not match the number of individuals specified.
	47 individuals specified
	19 rows in data
 Please inspect C:/Users/Jose Freixas/Documents/GRIPP/Hazelnut/ABI fragment analysis/Using R for fragment analysis/Database 44 Hazelnut accessions 20171018.csv to ensure it's a properly formatted GenAlEx file.


This error doesn't happen if I label the samples with consecutive numbers.

Finally, can you please provide more details about the indNames(dat) you indicated in your email?

Thanks in advance.

Jose



Zhian Kamvar

Nov 8, 2017, 3:12:08 PM
to Jose Freixas, poppr
Hi Jose,

indNames() expects a character vector, but you gave it a data frame. You need to tell R which column you want to use as the individual names. When you read in a CSV file, it gets read into R as a data frame. So, if your individual names are located in column 2 of your file, you should use:

indNames(mydata) <- list[[2]] # column 2 of the data frame you called "list"
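
Here is a small base R illustration of why the length check fails. The labels are made up, and the object is called labels_df here (my naming, not from the code in this thread) so that it does not mask R's built-in list() function. The length of a data frame is its number of columns, not its number of rows, so a data frame with one row per sample will generally not have the length that indNames() expects.

```r
# Stand-in for read.csv("location2.csv", header = FALSE) with made-up labels
labels_df <- read.csv(text = "1,Barcelona
2,Jefferson
3,Yamhill", header = FALSE, stringsAsFactors = FALSE)

class(labels_df)   # a data frame, not a character vector
length(labels_df)  # counts columns (V1 and V2), not the three samples
nrow(labels_df)    # counts rows, one per sample

# Extracting a single column with [[ ]] gives a plain vector of the right length
labels_vec <- labels_df[[2]]
is.character(labels_vec)
length(labels_vec)
```

The extracted vector, rather than the whole data frame, is what goes on the right-hand side of the indNames() assignment.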

If you need a refresher in R, I would recommend following the examples here: https://everhartlab.github.io/IntroR/Part1-Introduction.html or using the interactive tutorial package, swirl (http://swirlstats.com/).

Hope that helps,
Zhian


On Nov 8, 2017, at 14:03 , Jose Freixas <joe...@gmail.com> wrote:

Many thanks for your answer. I am trying the indNames function, as the read.genalexcel() function didn't work at all.

For indNames, I have prepared an Excel sheet with my new labels, so my code would be as follows:

mydata<-read.genalex("location.csv", ploidy=2, geo=FALSE, region=FALSE, genclone=TRUE, sep=",", recode=FALSE)
list<-read.csv("location2.csv", header=FALSE)
indNames(mydata)<-list

then I get the following error:
Error in `indNames<-`(`*tmp*`, value = list(V1 = c(15L, 35L, 21L, 12L,  : 
  Vector length does not match number of individuals

I double-checked and I am not missing any samples; both the GenAlEx file and the sample list have the same number of individuals. Any help, please?

Zhian Kamvar

Nov 9, 2017, 11:26:00 AM
to poppr
For those on the list that may have run into a similar issue importing data, I have found a small bug when importing data with apostrophes in the names (see: https://github.com/grunwaldlab/poppr/issues/156). This will be fixed in the next version of poppr.

Best,
Zhian