Dear friends,
Dear Sir/Madam,
I am new to statistics; I do research in ‘Marketing Management’ and
allied areas. For the past month I have been spending my time reading
about Rcmdr. I came across your website while studying ‘factor
analysis’. The ‘FactoMineR’ package is very interesting.
As I am new to this, I find the concepts a little difficult to understand.
I have gone through the example on your website
(http://factominer.free.fr/classical-methods/hierarchical-clustering-on-principal-components.html)
and understood almost everything except the points below. It is very
difficult to find somebody here to resolve my doubts (in India, and I
suppose in Asia generally, people seem to prefer SPSS even though it
is proprietary; I don’t know why), so I decided to write to you, as
the right people to clarify them.
My doubts are the following (the most common problem is how to
interpret the values and figures).
Most of the graphs have two dimensions (dim1 and dim2). What are they,
and what do the values in parentheses explain? For example, the
‘Factor Map’ shows dim1 (11.37%) and dim2 (9.32%). I only know that
this is something related to ‘variance’; my problem is how to
interpret it with respect to the variables or factors in the study.
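As a sanity check on my own reading, here is a minimal base R sketch. It assumes (this is my assumption, not something the website states) that the variance of the coordinates on a dimension is that dimension's eigenvalue, so the percentages in parentheses should be proportional to those variances. The two standard deviations are the ‘Overall sd’ values for Dim.1 and Dim.2 from the ‘Description by principal component’ output quoted further down in this message.

```r
# 'Overall sd' of the coordinates on Dim.1 and Dim.2, copied from the
# 'Description by principal component' output quoted later in this message.
overall_sd <- c(Dim.1 = 0.3850642, Dim.2 = 0.3486355)

# Assumption: variance of the coordinates on an axis = eigenvalue of that axis.
eig <- overall_sd^2

# Percentages printed on the axes of the factor map.
pct <- c(Dim.1 = 11.37, Dim.2 = 9.32)

# If the percentages are shares of the total variance, the two ratios
# should agree up to the rounding of the printed percentages.
ratio_eig <- unname(eig["Dim.1"] / eig["Dim.2"])
ratio_pct <- unname(pct["Dim.1"] / pct["Dim.2"])
print(c(eigenvalue_ratio = ratio_eig, percentage_ratio = ratio_pct))
```

If that reading is right, dim1 carries 11.37% of the total variance of the data and dim2 carries 9.32%. What I still cannot see is how to connect this to the individual variables.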
Regarding ‘Description by variables and/or categories’
(res.hcpc$desc.var$test.chi2, res.hcpc$desc.var$category), I gather
that it has something to do with the variables in the study, but how
should I interpret the values in the following output?
                   p.value df
where         8.465616e-79  4
how           3.144675e-47  4
price         1.862462e-28 10
tearoom       9.624188e-19  2
pub           8.539893e-10  2
friends       6.137618e-08  2
resto         3.537876e-07  2
How           3.616532e-06  6
Tea           1.778330e-03  4
sex           1.789593e-03  2
frequency     1.973274e-03  6
work          3.052988e-03  2
tea.time      3.679599e-03  2
lunch         1.052478e-02  2
dinner        2.234313e-02  2
always        3.600913e-02  2
sugar         3.685785e-02  2
sophisticated 4.077297e-02  2
Here, why are only the first two variables said to characterize most
of the three clusters? Why not ‘price’ and the others? Indeed, how
should I interpret the p-value of each variable? Why are only
categories with a p-value less than 0.02 used? Does it mean that the
19th variable has a p-value of more than 0.02? (The illustration lists
only 18 variables, whereas the analysis was done on 19.)
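To make the question concrete, here is how I understand a single row of that table could arise. This is a sketch in base R on made-up counts (the table `counts` and its labels are hypothetical, not taken from the tea data), assuming each row is a chi-squared test of independence between cluster membership and one categorical variable:

```r
# Hypothetical cross-tabulation: 3 clusters (rows) by a 3-level variable (columns).
counts <- matrix(c(60, 10,  5,
                   10, 40, 10,
                    5, 10, 50),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(cluster = c("1", "2", "3"),
                                 level   = c("a", "b", "c")))

test <- chisq.test(counts)

# Degrees of freedom = (number of clusters - 1) * (number of levels - 1).
df <- unname(test$parameter)
p_value <- test$p.value
print(c(df = df, p.value = p_value))
```

With 3 clusters this gives df = 4 for a 3-level variable, which would match the df column for ‘where’ and ‘how’ above, and a df of 2 would then mean a 2-level variable. If I read it correctly, a small p-value says that the categories of the variable are not spread evenly across the clusters, but I would like this confirmed.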
Regarding ‘Description by principal component’:
$quanti
$quanti$`1`
          v.test Mean in category  Overall mean sd in category Overall sd      p.value
Dim.6   2.647552       0.03433626  3.088219e-17      0.2655618  0.2671712 8.107689e-03
Dim.2  -7.796641      -0.13194656 -3.496615e-17      0.1813156  0.3486355 6.357699e-15
Dim.1 -12.409741      -0.23196088  4.927627e-17      0.2143767  0.3850642 2.314001e-35

$quanti$`2`
          v.test Mean in category  Overall mean sd in category Overall sd      p.value
Dim.2  13.918285       0.81210870 -3.496615e-17      0.2340345  0.3486355 4.905356e-44
Dim.4   4.350620       0.20342610  1.116042e-17      0.3700048  0.2793822 1.357531e-05
Dim.14  2.909073       0.10749165 -3.471868e-17      0.2161509  0.2207818 3.625025e-03
Dim.13  2.341566       0.08930402  2.182264e-17      0.1606616  0.2278809 1.920305e-02
Dim.3   2.208179       0.11087544  1.099358e-18      0.2449710  0.3000159 2.723180e-02
Dim.11 -2.234447      -0.08934293  6.981106e-17      0.2066708  0.2389094 2.545367e-02

$quanti$`3`
         v.test Mean in category Overall mean sd in category Overall sd      p.value
Dim.1 13.485906       0.45155993 4.927627e-17      0.2516544  0.3850642 1.893256e-41
Dim.6 -2.221728      -0.05161581 3.088219e-17      0.2488566  0.2671712 2.630166e-02
Dim.4 -4.725270      -0.11479621 1.116042e-17      0.2924881  0.2793822 2.298093e-06

attr(,"class")
[1] "catdes" "list "
On what basis do we say the following? “Individuals in cluster 1 have
low coordinates on axes 1 and 2. Individuals in cluster 2 have high
coordinates on the second axis and individuals who belong to the third
cluster have high coordinates on the first axis. Here, a dimension is
kept only when the v-test is higher than 3.”
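One thing I tried to check myself is how the p.value column relates to the v.test column. A minimal base R sketch, assuming (my assumption, not something the website states) that the v.test is compared against a standard normal distribution with a two-sided test:

```r
# v.test values for cluster 1, copied from the output above.
v_test <- c(Dim.6 = 2.647552, Dim.2 = -7.796641, Dim.1 = -12.409741)

# Assumption: the p-value is the two-sided tail probability of a
# standard normal distribution evaluated at the v.test.
p_value <- 2 * pnorm(-abs(v_test))
print(p_value)

# Under the same assumption, these are the p-values that correspond to
# the usual cutoff |v.test| = 1.96 and to the stricter cutoff |v.test| = 3.
cutoff_p <- 2 * pnorm(-c(1.96, 3))
print(cutoff_p)
```

The values this prints should come out close to the p.value column for cluster 1, and the sign of the v.test follows the sign of ‘Mean in category’ minus ‘Overall mean’. So I gather that a negative v.test means below-average coordinates for that cluster, but I do not understand why a cutoff of 3 was chosen rather than the usual 1.96.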
Could you also please tell me how to interpret the values in the
following output?
cluster: 1
      285       152       166       143        71
0.5884476 0.6242123 0.6242123 0.6244176 0.6478185
------------------------------------------------------------
cluster: 2
       31        95        53       182       202
0.6620553 0.7442013 0.7610437 0.7948663 0.8154826
------------------------------------------------------------
cluster: 3
      172        33       233        18        67
0.7380497 0.7407711 0.7503006 0.7572188 0.7701598
$dist
cluster: 1
      82      156      292      197      193
2.009519 1.921977 1.919324 1.908373 1.888461
------------------------------------------------------------
cluster: 2
      94      190      212      168      229
1.775459 1.679182 1.674403 1.663392 1.640513
------------------------------------------------------------
cluster: 3
      66      273      204       22       44
2.072134 1.924819 1.830174 1.820065 1.794969
In the output above, I understood the clusters and their respective
individuals (the numbers 285, 152, etc.), but what about 0.5884476 and
the other values? How should I interpret them in the analysis?
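My guess, which I would like confirmed, is that these numbers are distances between an individual and a cluster centre in the space of the principal components. A minimal base R sketch of that idea on made-up coordinates (the matrix `coords`, the vector `cluster`, and the helper `dist_to_centre` are all hypothetical, not taken from the tea analysis):

```r
# Hypothetical coordinates of 6 individuals on 2 dimensions, in 2 clusters.
coords <- matrix(c(0.0, 0.0,
                   1.0, 0.0,
                   0.0, 1.0,
                   5.0, 5.0,
                   6.0, 5.0,
                   5.0, 7.0),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(as.character(1:6), c("Dim.1", "Dim.2")))
cluster <- c(1, 1, 1, 2, 2, 2)

# Centre (mean coordinates) of each cluster: one row per cluster.
centres <- apply(coords, 2, function(x) tapply(x, cluster, mean))

# Euclidean distance from every individual to the centre of cluster k.
dist_to_centre <- function(k) {
  sqrt(rowSums(sweep(coords, 2, centres[as.character(k), ])^2))
}

# Individuals of cluster 1 sorted by distance to their own centre:
# the first one is the most "central" individual of that cluster.
own <- sort(dist_to_centre(1)[cluster == 1])
print(own)
```

If that is right, the first table would list, for each cluster, the individuals closest to their own cluster centre, and $dist would list the individuals farthest from the centres of the other clusters. I would be grateful if someone could confirm or correct this.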
Indeed, how should each part of the output be interpreted? The last
output, for example, has clusters, dims, and values, but the
explanation of the values is missing.
I may be quite ignorant of statistics, but I truly did not understand
how to interpret the values. I understood how to run the whole
technique and obtain the values, but when it comes to interpretation,
it is very difficult for me.
Could someone please clarify my doubts?