Hi, and thank you all for your timely replies.
Let me first clarify the reason behind the analysis. As Peter and Jeff
pointed out, two measurements do not really make a repeated measures
study and are not enough to model growth properly. This data is from the
first follow-up in a cohort study and will in the future be extended with
more measurements. For the moment, however, we would like to describe
the material so far as well as possible.
The reason for trying to estimate patterns of change is not to test
any hypothesis about whether subgroups exist, or to determine what
factors affect recovery. The questions of interest are more like:
How many of the subjects were affected at the first time point but
seem to have recovered?
How many had a medium or high score in the beginning but show no or
small signs of recovery?
How many had a low score, indicating less impact, but seem to have a
delayed reaction?
How many had a low score at both time points, indicating resilience
against PTSD?
Plotting the data T1 vs T2 as Ted Harding suggested (I am really
impressed by your ASCII graphing skills, by the way) shows no clear
clusters and I am therefore reluctant to set any cutoffs. I would
prefer the data to determine this.
I am now looking into clustering as suggested. I am pretty new to this
area so I would appreciate it if anyone could give me some help on the
following questions.
As Jeremy pointed out, the data has a count distribution with most
observations at small values. One of the reasons I ended up with PROC
TRAJ was that it conveniently supported the zero-inflated Poisson as a
distributional model. Does this matter when using cluster analysis?
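For reference, a minimal Python sketch of the zero-inflated Poisson probability mass function mentioned above. The parameter names `lam` and `pi` and the values used are illustrative assumptions only; nothing here is estimated from this data or taken from PROC TRAJ.

```python
import math

def zip_pmf(k, lam, pi):
    """Probability of observing count k under a zero-inflated Poisson.

    lam: mean of the Poisson component (hypothetical value below)
    pi:  probability of a structural zero (hypothetical value below)
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero can come from the structural-zero part or the Poisson part.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

# Illustrative parameters only, not fitted to the study data.
probs = [zip_pmf(k, lam=3.0, pi=0.4) for k in range(18)]
for k, p in enumerate(probs[:5]):
    print(k, round(p, 3))
```

The extra mass at zero is what separates this from an ordinary Poisson; a distance-based clustering method does not model the counts this way.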
What clustering techniques do you think are useful? From the little I
know, K-means clustering should be the proper one?
What variables should I cluster on? Should I use the measurements at T1
and T2 as analysis variables, or are the measurement at T1 and the
difference (T2-T1) better? Or will it not matter?
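To make this question concrete, here is a small Python sketch comparing the two representations for three made-up subjects. The subject labels and scores are hypothetical, chosen only for illustration. Going from (T1, T2) to (T1, T2-T1) is a shear of the plane, so Euclidean distances between subjects change, and a distance-based method such as K-means could plausibly group them differently.

```python
import math
from itertools import combinations

# Hypothetical symptom counts (T1, T2) for three made-up subjects.
subjects = {
    "stable_high": (10, 10),
    "recovered": (10, 2),
    "stable_low": (2, 2),
}

def to_change_scores(scores):
    """Re-express each (T1, T2) pair as (T1, T2 - T1)."""
    return {name: (t1, t2 - t1) for name, (t1, t2) in scores.items()}

def pairwise_distances(scores):
    """Euclidean distance between every pair of subjects."""
    return {
        (a, b): math.hypot(scores[a][0] - scores[b][0],
                           scores[a][1] - scores[b][1])
        for a, b in combinations(scores, 2)
    }

raw = pairwise_distances(subjects)
change = pairwise_distances(to_change_scores(subjects))

# Print each pair with its distance in the raw and change-score spaces.
for pair in raw:
    print(pair, round(raw[pair], 2), round(change[pair], 2))
```

With the raw scores, the recovered subject is equally far (8) from both stable subjects. With change scores, it is the stable_high subject that is equidistant from the other two, and recovered vs stable_low become the most distant pair (about 11.3).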
Any tips on how to decide the number of clusters? I am thinking that
since the variables are correlated this might induce some problems
with some techniques.
Again, thanks for all the input. Have a nice day everyone.
/Thomas
On 4 Nov, 18:17, Jeremy Miles <jeremy.mi...@gmail.com> wrote:
> Hi Thomas,
> Three thoughts.
> 1) I have never seen trajectory models on two time points, but that's
> not to say that it can't be done. However, each individual's
> trajectory will be the difference between their time 1 score and time
> 2 score. If you find that difference, then plot it, see if it looks
> to be bi/multimodal. Then do a scatterplot of time 1 score against
> difference - does it have obvious clusters?
> 2) There is debate in the literature about the Nagin approach (as with
> proc traj) and the approach of Bengt Muthén, implemented in Mplus.
> Very, very briefly, Mplus allows random variation (if you ask for it)
> of slopes within groups. Proc traj doesn't.
> 3) PTSD scores tend to be non-normally distributed. Trajectory
> approaches make the assumption that non-normality is because of
> mixtures of distributions. If your data are just plain non-normal,
> you'll find groups which try to account for the non-normality by
> creating mixtures of normal distributions. There's nothing wrong with
> that - that means that you are modeling your distribution better, but
> don't go interpreting them as qualitatively different groups without
> more evidence. (If you've got more than three time periods, you can
> also use mixtures to handle non-linearity).
> Here are a couple of potentially useful papers:
> BAUER, D. J. and CURRAN, P. J. Distributional assumptions of growth
> mixture models: Implications for overextraction of latent trajectory
> classes. Psychological Methods 8: 338-363, 2003.
> BAUER, D. J. A semiparametric approach to modeling nonlinear relations
> among latent variables. Structural Equation Modeling 12: 513-535,
> 2005.
> Jeremy
> 2009/11/4 Thomas Fröjd <tfr...@gmail.com>:
> > Hi.
> > I am working with a researcher analysing data following a number of
> > individuals on two timepoints. The measurements we are interested in
> > are the number of PTSD symptoms they have on each timepoint.
> > The researcher I work with suspects that there are a number of
> > naturally occurring subpopulations among the individuals with different
> > patterns of recovery. Basically she hypothesizes that one group
> > will have a low number of symptoms on each occasion, another group
> > will have a high number of symptoms on both occasions and the last will
> > have a high number of symptoms on the first occasion but few on the last
> > occasion.
> > She has asked me if I could estimate the proportions in the sample of
> > the subpopulations.
> > My idea is to use trajectory modelling with PROC TRAJ in SAS. I
> > completely lack real world experience with this type of analysis so I
> > would like to ask if you think it is the proper way to go. Also is it
> > still useful when there are only two points in time? It seems
> > to be most useful when there are more than two time points
> > observed.
> > Maybe there is a simpler alternative I am overlooking? Some other type
> > of clustering?
> > Best regards
> > Thomas
> --
> Jeremy Miles
> Psychology Research Methods Wiki: www.researchmethodsinpsychology.com