Questions - calculating Euclidean distance


Shan Wong

Sep 13, 2023, 5:47:54 PM
to poppr
Hello,

I have a dataset of 39 individuals with 11,671 codominant loci, and I want to calculate a genetic distance between individuals, specifically Euclidean distance.

I used the following code:

observed <- dist(genclone_m, method = "euclidean", diag = FALSE, upper=FALSE, p =2)

I then looked at the calculated distances using summary(observed). I don't understand why the results report "Euclidean matrix (Gower 1966): FALSE".

Class: dist
Distance matrix by lower triangle : d21, d22, ..., d2n, d32, ...
Size: 39
Labels: Vatr_M_1.2_fixed Vatr_M_12B Vatr_M_12D Vatr_M_12E Vatr_M_14 Vatr_M_15.2 Vatr_M_17 Vatr_M_18.2 Vatr_M_19 Vatr_M_20 Vatr_M_21 Vatr_M_22D Vatr_M_26 Vatr_M_28 Vatr_M_29 Vatr_M_32 Vatr_M_33 Vatr_M_34 Vatr_M_35 Vatr_M_36D Vatr_M_38 Vatr_M_40 Vatr_M_41A_fixed Vatr_M_41B Vatr_M_41D Vatr_M_41E Vatr_M_41F Vatr_M_41H Vatr_M_42 Vatr_M_44 Vatr_M_46 Vatr_M_51A Vatr_M_51B Vatr_M_52 Vatr_M_53 Vatr_M_54 Vatr_M_57A Vatr_M_57C Vatr_M_58
call: dist(x = genclone_m, method = "euclidean", diag = FALSE, upper = FALSE,
    p = 2)
method: euclidean
Euclidean matrix (Gower 1966): FALSE 

I also tried calculating the Euclidean distance using bitwise.dist(). 
observed <- bitwise.dist(genclone_m, missing_match = TRUE, scale_missing = TRUE, euclidean = TRUE, threads = 1L)
summary(observed)

In this result, the method is reported as Provesti even though I set euclidean = TRUE, and the Euclidean matrix test is again FALSE.

Class: dist
Distance matrix by lower triangle : d21, d22, ..., d2n, d32, ...
Size: 39
Labels: Vatr_M_1.2_fixed Vatr_M_12B Vatr_M_12D Vatr_M_12E Vatr_M_14 Vatr_M_15.2 Vatr_M_17 Vatr_M_18.2 Vatr_M_19 Vatr_M_20 Vatr_M_21 Vatr_M_22D Vatr_M_26 Vatr_M_28 Vatr_M_29 Vatr_M_32 Vatr_M_33 Vatr_M_34 Vatr_M_35 Vatr_M_36D Vatr_M_38 Vatr_M_40 Vatr_M_41A_fixed Vatr_M_41B Vatr_M_41D Vatr_M_41E Vatr_M_41F Vatr_M_41H Vatr_M_42 Vatr_M_44 Vatr_M_46 Vatr_M_51A Vatr_M_51B Vatr_M_52 Vatr_M_53 Vatr_M_54 Vatr_M_57A Vatr_M_57C Vatr_M_58
call: prevosti.dist(x = x)
method: Provesti
Euclidean matrix (Gower 1966): FALSE 

I would appreciate any suggestions on how to interpret these results. Thank you very much.

Regards,
Shan

Zhian Kamvar

Nov 11, 2023, 3:18:39 PM
to Shan Wong, poppr
If you look at the help for ?summary.dist, you'll find that it tests for the Euclidean nature of a distance matrix using Gower's theorem. 
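
To make that test concrete, here is a minimal base-R sketch of the kind of check involved: double-center the squared distances and see whether any eigenvalue is meaningfully negative. The helper name is_euclid_sketch and its tol default are assumptions made up for illustration; this is not the ade4 source, so use is.euclid() for real work.

```r
# Hypothetical helper (not part of ade4 or poppr): checks whether a distance
# matrix can be embedded in Euclidean space, following Gower's criterion.
is_euclid_sketch <- function(d, tol = 1e-07) {
  D2 <- as.matrix(d)^2                  # squared pairwise distances
  n  <- nrow(D2)
  J  <- diag(n) - matrix(1 / n, n, n)   # centering matrix
  B  <- -0.5 * J %*% D2 %*% J           # double-centered matrix
  ev <- eigen(B, symmetric = TRUE, only.values = TRUE)$values
  # Euclidean if no eigenvalue is negative beyond a relative tolerance
  min(ev) >= -tol * max(abs(ev))
}

# Points in the plane: Euclidean by construction
pts <- matrix(c(0, 0, 1, 0, 0, 1, 2, 2), ncol = 2, byrow = TRUE)
is_euclid_sketch(dist(pts))
# [1] TRUE

# Three "distances" that violate the triangle inequality (1 + 1 < 3)
bad <- as.dist(matrix(c(0, 1, 1,
                        1, 0, 3,
                        1, 3, 0), nrow = 3))
is_euclid_sketch(bad)
# [1] FALSE
```

So a FALSE in the summary only says that the distance matrix as a whole fails this embedding test; it does not by itself mean the pairwise distances were computed incorrectly.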

A lot of non-panmictic data sets will return non-Euclidean distances. Take, for example, the nancycats dataset, which contains 17 populations of heavily inbred cats from Nancy, France:

library(adegenet)
data(nancycats)
# whole data set with 17 populations -----
dist(nancycats) |> is.euclid()
# [1] FALSE
# single population ----------------------
dist(nancycats[pop = 9]) |> is.euclid() 
# [1] TRUE


bitwise.dist is meant for genlight and snpclone objects. If it is given a genclone or genind object, it returns Prevosti distance instead: https://github.com/grunwaldlab/poppr/blob/ee9c9087d5efc4e3c1c58347e828243ea827461c/R/bitwise.r#L168-L170

Hope that helps, 
Zhian


