Dear Friends
I've taken far too long to reply to Dogra ji's query. There were many assignments, but that certainly can't be an excuse. I've attached a ppt on the same topic.
Anyway, let's start with the conditions for factor analysis:
1. Factor analysis (FA) is used when the var are interdependent, i.e. we can't be sure which var are dependent & which are independent.
2. Data MUST be at least ordinal, e.g. on a Likert scale. Nominal data can't be used for FA. Further, all Likert-scale questions must use the same response scale, e.g. in a questionnaire where a few statements run from strongly agree to strongly disagree & the rest from highly satisfied to highly dissatisfied, FA can't be applied.
3. FA is generally applied to a very large dataset, e.g. 80 statements in a questionnaire on a 5-point scale answered by 400 respondents.
4. FA has 2 purposes:
a) Data reduction
b) Generating a new set of var called factors, which are mutually independent
5. Basically, FA looks for the inherent undercurrents in the dataset and brings them together in a factor. It starts by finding the coefficient of correlation between every pair of var, then clubs together all the var that the respondents answered in a similar way. It works only on the pattern of responses given by the respondents & is not concerned with whether the statements resemble each other, e.g. in the spiritual field there can be various heads under which we can ask questions: meditation, ethical & moral values, religious practices, social norms etc. FA will find the undercurrent (how much they're alike) in the responses and put the similar ones into one factor.
6. FA is of 2 types, exploratory and confirmatory. I'm discussing only exploratory FA here.
7. There are many parameters for analysing the output of FA:
a) Check whether FA can be applied or not
A rule of thumb is: no. of respondents should be at least no. of statements in the questionnaire x 5
The KMO value is a better estimate. A KMO value > 0.6 signifies FA can be used with the present dataset.
b) Method of Extraction
There are various methods, but the PCA (principal component analysis) method is best to begin with.
c) No. of factors to be extracted
There are 3 options: (i) When we know nothing about how many factors to expect (this is the default method), factors are extracted on the basis of the eigenvalue method, i.e. we retain those factors whose eigenvalue is >= 1
(ii) Scree plot: this is the graphical version of the first method, but it is to be used cautiously
(iii) If the total variance explained is > 60% (min. criterion for using FA), we have the option of deciding how many factors to retain. Generally we retain the min. no. of factors whose total var explained is > 60%. This can be done by asking FA to give only 4 or 6 or however many factors we feel fit (there's an option which asks for the no. of factors).
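For anyone who wants to try the checks in point 7 outside the menu-driven software, here is a minimal Python sketch using NumPy. It is only an illustration under my own assumptions: the data is made up (400 simulated respondents answering 6 Likert statements driven by two hidden themes), and the helper names kmo and count_factors are mine, not part of any package. It checks the 5x rule of thumb, computes the overall KMO value from the correlation matrix, and then applies the eigenvalue >= 1 and 60% variance criteria.

```python
import numpy as np

def kmo(R):
    """Overall KMO measure computed from a correlation matrix R."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)          # partial correlations between pairs of var
    off = ~np.eye(R.shape[0], dtype=bool)    # mask that picks the off-diagonal cells
    r2 = np.sum(R[off] ** 2)                 # squared ordinary correlations
    p2 = np.sum(partial[off] ** 2)           # squared partial correlations
    return r2 / (r2 + p2)

def count_factors(R, min_eigen=1.0):
    """Eigenvalues (largest first), each one's share of variance, and how many are >= min_eigen."""
    eig = np.sort(np.linalg.eigvalsh(R))[::-1]
    share = eig / eig.sum()
    return eig, share, int(np.sum(eig >= min_eigen))

# Made-up data (an assumption for illustration): two hidden themes, three statements each.
rng = np.random.default_rng(0)
n_resp, n_items = 400, 6
themes = rng.normal(size=(n_resp, 2))
pattern = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
raw = themes @ pattern.T + 0.6 * rng.normal(size=(n_resp, n_items))
likert = np.clip(np.round(3 + raw), 1, 5)    # squeeze the answers onto a 5-point scale

enough_respondents = n_resp >= 5 * n_items   # rule of thumb from (a)
R = np.corrcoef(likert, rowvar=False)        # correlation between every pair of statements
kmo_value = kmo(R)
eig, share, k = count_factors(R)
cum_share = np.cumsum(share)

print("Enough respondents:", enough_respondents)
print("KMO:", round(float(kmo_value), 3))
print("Eigenvalues:", np.round(eig, 2))
print("Factors with eigenvalue >= 1:", k)
print("Total var explained by them:", round(float(cum_share[k - 1]), 3))
```

The scree plot in (ii) is simply these eigenvalues plotted in decreasing order, so printing them gives the same information in numbers.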
FA output:
1. The first table gives the KMO value. It should be > 0.6. The sig. value (of Bartlett's test of sphericity, in the same table) should be < 0.05
2. The 2nd table, of communalities, shows how much of each var is explained by the factors. The value for every statement should be > 0.4
3. The 3rd table gives the total var explained and the no. of factors taken by FA. Total var explained should be > 60%, i.e. 0.6
4. The 4th table gives the coefficient of correlation (called factor loading here) between every factor & every statement. Retain those statements in a factor where the factor loading is > 0.4
5. The last table gives a matrix which is required in intermediate steps of the calculation & is not of much use to us (as of now)
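To see where the numbers in tables 2 to 4 come from, here is a rough Python sketch (NumPy only) of PCA extraction done by hand. Again the data is simulated and the variable names are my own; real software may differ in details, so treat it as a sketch, not as what any package does internally. Loadings are the eigenvectors scaled by the square root of the eigenvalues, and a communality is the sum of a statement's squared loadings over the retained factors.

```python
import numpy as np

# Made-up data (an assumption for illustration): 400 respondents, 6 statements, 2 hidden themes.
rng = np.random.default_rng(0)
themes = rng.normal(size=(400, 2))
pattern = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
answers = themes @ pattern.T + 0.6 * rng.normal(size=(400, 6))

R = np.corrcoef(answers, rowvar=False)           # correlation matrix of the statements

eigval, eigvec = np.linalg.eigh(R)               # eigh returns eigenvalues in ascending order
order = np.argsort(eigval)[::-1]                 # re-sort so the largest comes first
eigval, eigvec = eigval[order], eigvec[:, order]

k = int(np.sum(eigval >= 1.0))                   # eigenvalue >= 1 rule
loadings = eigvec[:, :k] * np.sqrt(eigval[:k])   # table 4: correlation of each statement with each factor
communalities = np.sum(loadings ** 2, axis=1)    # table 2: part of each statement explained by the factors
total_var = eigval[:k].sum() / eigval.sum()      # table 3: total var explained by the retained factors
retained = np.abs(loadings) > 0.4                # which statements pass the 0.4 loading cut-off

print("Loadings:\n", np.round(loadings, 2))
print("Communalities:", np.round(communalities, 2))
print("Total var explained:", round(float(total_var), 3))
```

Note that the signs of a whole column of loadings are arbitrary, which is why the 0.4 cut-off is applied to the absolute value.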
Observations (1):
A quick look at Table 3 tells us that factor 1 almost always explains the max variance & hence in Table 4 almost all statements will have a high factor loading on factor 1 & very low loadings on the others. So, in order to improve the solution, we rotate it.
Factor Rotation (FR):
1. FR is not always compulsory. If we're getting a good solution, FR is not required
2. There are many methods for FR. We'll take the Varimax method.
3. After rotation, the output will have another table, of rotated factor loadings. We'll now use it in place of the previous table of factor loadings.
4. In the new table the statements will be more evenly distributed among the various factors.
5. FR never improves the total variance explained; it only redistributes it among the factors.
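The points on rotation can be checked with a small Python sketch. The varimax function below is a common textbook-style formulation (raw varimax, without the Kaiser normalisation that some software applies by default), written with NumPy; it is an assumption on my part that this is close enough to illustrate the idea, and the data, the choice of 2 factors and the helper simplicity are made up for the example.

```python
import numpy as np

def varimax(L, max_iter=100, tol=1e-6):
    """Rotate a loading matrix L (statements x factors) by raw varimax."""
    p, k = L.shape
    T = np.eye(k)                    # rotation matrix, starts as "no rotation"
    d = 0.0
    for _ in range(max_iter):
        LR = L @ T
        grad = L.T @ (LR ** 3 - LR @ np.diag(np.sum(LR ** 2, axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        T = u @ vt                   # orthogonal matrix built from the SVD
        d_new = s.sum()
        if d_new < d * (1 + tol):    # stop once the criterion no longer grows
            break
        d = d_new
    return L @ T

def simplicity(L):
    """Varimax criterion: variance of the squared loadings, summed over factors."""
    return np.sum(np.var(L ** 2, axis=0))

# Made-up data (an assumption): two related themes, so unrotated factor 1 dominates.
rng = np.random.default_rng(1)
f1 = rng.normal(size=400)
f2 = 0.5 * f1 + rng.normal(size=400)
pattern = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
answers = np.column_stack([f1, f2]) @ pattern.T + 0.6 * rng.normal(size=(400, 6))

R = np.corrcoef(answers, rowvar=False)
eigval, eigvec = np.linalg.eigh(R)
order = np.argsort(eigval)[::-1]
eigval, eigvec = eigval[order], eigvec[:, order]

k = 2                                             # we ask for 2 factors, as in option (iii)
loadings = eigvec[:, :k] * np.sqrt(eigval[:k])    # unrotated factor loadings
rotated = varimax(loadings)                       # rotated factor loadings

var_before = np.sum(loadings ** 2, axis=0) / 6    # share of variance per factor, unrotated
var_after = np.sum(rotated ** 2, axis=0) / 6      # share of variance per factor, rotated

print("Unrotated loadings:\n", np.round(loadings, 2))
print("Rotated loadings:\n", np.round(rotated, 2))
print("Var explained per factor before:", np.round(var_before, 3))
print("Var explained per factor after: ", np.round(var_after, 3))
```

The per-factor shares change after rotation, but their sum (the total variance explained) and every statement's communality stay the same.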
Observations (2):
In its output, FA never gives the names of the factors. We have to assign these names on the basis of what's common to the various statements of a given factor.
To save the factor scores, we can use the Save menu in FA, which employs the regression method of saving. The factor scores obtained can further be used as var in techniques like multiple regression, logit regression, hypothesis testing etc.
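As a rough idea of what the regression method of saving does, here is a Python sketch with NumPy. As I understand it, the scores are the standardised answers multiplied by a weight matrix obtained from the correlation matrix and the loadings (R^-1 times L); I can't promise that this matches any particular software digit for digit. The data, the outcome variable and all names are hypothetical, and I've used unrotated loadings to keep it short; the same formula can be applied to rotated loadings.

```python
import numpy as np

# Made-up data (an assumption for illustration): 400 respondents, 6 statements, 2 hidden themes.
rng = np.random.default_rng(2)
themes = rng.normal(size=(400, 2))
pattern = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
answers = themes @ pattern.T + 0.6 * rng.normal(size=(400, 6))

# Standardise every statement (mean 0, standard deviation 1).
Z = (answers - answers.mean(axis=0)) / answers.std(axis=0, ddof=1)

R = np.corrcoef(answers, rowvar=False)
eigval, eigvec = np.linalg.eigh(R)
order = np.argsort(eigval)[::-1]
eigval, eigvec = eigval[order], eigvec[:, order]
k = 2
loadings = eigvec[:, :k] * np.sqrt(eigval[:k])

weights = np.linalg.solve(R, loadings)   # score coefficient matrix, i.e. R^-1 times the loadings
scores = Z @ weights                     # one row of factor scores per respondent

# Hypothetical follow-up: use the saved scores as var in a multiple regression.
outcome = 2.0 + themes[:, 0] + rng.normal(size=400)       # invented dependent var
X = np.column_stack([np.ones(len(scores)), scores])       # intercept + factor scores
coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)

print("First 3 rows of factor scores:\n", np.round(scores[:3], 2))
print("Regression coefficients (intercept, factor 1, factor 2):", np.round(coef, 2))
```

With PCA extraction the saved scores come out uncorrelated with each other, which is what makes them convenient as independent var in a regression.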
======================================================
Here I've tried to explain the basics of FA. This technique is very subjective, as there are many decision areas where no standard exists and we only follow conventions.
In case of any doubt, please feel free to call me (after 8 pm only)
Best wishes
Neeraj
+91-9996259725