Can MFA be used for time series data?


Kohkichi Hosoda

Aug 18, 2013, 6:53:09 AM
to factomin...@googlegroups.com
Hi FactoMineR users, 

I am analyzing time series data consisting of a control group (5 subjects) and a treatment group (5 subjects). Each subject was observed at several time points (0h, 0.5h, 1h, 2h, 24h), and the values of 80 metabolites were obtained at each. So this is a kind of repeated measures study. My data look like this:

        Condition  time    metabolite A   metabolite B     metabolite C      
C1_30     control  0.5h    -0.04142747    -0.37265363      -0.11276739
C2_30     control  0.5h    -0.36124165    -1.17384736      -0.84574746  
C3_30     control  0.5h     0.04387140     0.73808805       0.35003297  
C4_30     control  0.5h     0.05652616    -0.80994402       0.03163576  
C5_30     control  0.5h    -0.30130528    -0.06219186      -0.07025853  
C2_60     control    1h    -0.08966251     0.21990856      -0.94293543  
C3_60     control    1h     0.12005663     0.45336003      -0.56922725  
C4_60     control    1h    -0.21664809    -0.68921863       1.01670941  
C5_60     control    1h     0.44345835    -0.23815902      -0.33989998  
C1_120    control    2h    -0.54637261    -0.82786901       0.37363236  
C2_120    control    2h    -0.01997253     0.25259060      -0.74666215  

For example, C2_30, C2_60 and C2_120 are measurements of the same subject (C2) at different time points.

My questions are as follows:

1. Can MFA be used for this kind of repeated measure analysis?

2. If so, how should my data table be reorganized for MFA, and how should I run MFA?
   My guess is:
                               time0.5h                                           time1h                      …..
       Condition   metabolite A   metabolite B     metabolite C …  metabolite A   metabolite B     metabolite C …  
C1     control    -0.04142747    -0.37265363      -0.11276739   …  -0.08966251     0.21990856      -0.94293543 ...
C2     control    -0.36124165    -1.17384736      -0.84574746   ...
…….

and 

MFA(mydata, group=c(80, 80, 80, 80), type=(rep("c", 4)), name.group=c("0.5h", "1h", "2h", "24h"), num.group.sup=1)

Each time point has the same 80 metabolites as variables. num.group.sup is meant to mark the Condition column (control or treatment) as supplementary.

Is this right?

3. I have already applied Pareto scaling to my data and I do not want FactoMineR to do any scaling. I guess the data are not scaled by FactoMineR if type is "c" in the MFA call. Is this right?

4. I would like to focus on the change in metabolic profile and am not interested in between-subject variability. Therefore, each value of each metabolite was transformed into a change from the 0h time point, for example value at 0.5h / value at 0h, so that the analysis focuses on within-subject change. Is this idea right? Or should I use the raw values, including the 0h time point?
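To make questions 2 and 4 concrete, here is a minimal base R sketch of the two data manipulations described above: dividing each value by the same subject's 0h value, then spreading the long table into one block of columns per time point. The toy data, the column names (subject, time, condition, metA, metB) and the object names are made up for illustration; the real table has 80 metabolites and 10 subjects.

```r
# Toy long-format table: 2 subjects x 3 time points x 2 metabolites (made-up values)
set.seed(1)
long <- expand.grid(subject = c("C1", "C2"), time = c("0h", "0.5h", "1h"),
                    stringsAsFactors = FALSE)
long$condition <- "control"
long$metA <- runif(nrow(long), 1, 2)
long$metB <- runif(nrow(long), 1, 2)
mets <- c("metA", "metB")

# Express every value as a ratio to the same subject's 0h value
base <- long[long$time == "0h", ]
idx <- match(long$subject, base$subject)
ratio <- long
ratio[, mets] <- as.matrix(long[, mets]) / as.matrix(base[idx, mets])
ratio <- ratio[ratio$time != "0h", ]   # the 0h ratios are all 1, so drop them

# One row per subject, one block of metabolite columns per time point
wide <- reshape(ratio, idvar = "subject", timevar = "time",
                v.names = mets, direction = "wide")
rownames(wide) <- wide$subject
print(wide)
```

The resulting table has the Condition column followed by one group of metabolite columns per time point, which is the layout guessed in question 2.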

Kohkichi

François Husson

Aug 19, 2013, 3:37:26 AM
to factomin...@googlegroups.com
Hi,

1) You are right, MFA can be used for time series data (when the same set of individuals is described at several time points).
2) Yes, you have to organise your data set as you describe. But the line of code should rather be:
MFA(mydata, group=c(1, 80, 80, 80, 80), type=c("n", rep("c", 4)), name.group=c("condition", "0.5h", "1h", "2h", "24h"), num.group.sup=1)
Indeed, you have to declare the 1st variable as a group of its own, containing one categorical variable named "condition", and this group is a supplementary one.
3) Yes, if you use "c", MFA does not scale the variables in each group to unit variance.
4) If you had only one time point and wanted to perform a PCA, what would you choose? You have to make the same choice in MFA as in PCA. Perhaps you can try the two choices and see which is best.
When you have time series, you can use the argument chrono=TRUE in the plot.MFA function when you draw the partial points (the partial points will then be connected from one time to the next rather than connected to the mean point).
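For readers who want to try the call end to end, here is a small self-contained sketch on simulated data. It assumes FactoMineR is installed; the data are random numbers, and 5 metabolites per time point are used instead of 80 to keep it quick. The names n_subj, n_met and times are only for this illustration.

```r
library(FactoMineR)

set.seed(42)
n_subj <- 10
n_met <- 5                               # 80 in the real data
times <- c("0.5h", "1h", "2h", "24h")

# Simulated wide table: one block of n_met columns per time point
X <- as.data.frame(matrix(rnorm(n_subj * n_met * length(times)), nrow = n_subj))
names(X) <- paste0("met", seq_len(n_met), "_", rep(times, each = n_met))
mydata <- cbind(condition = factor(rep(c("control", "treatment"), each = 5)), X)
rownames(mydata) <- c(paste0("C", 1:5), paste0("T", 1:5))

# Group 1 is the categorical "condition" variable, kept supplementary;
# the four time-point groups are continuous and not scaled to unit variance ("c")
res <- MFA(mydata, group = c(1, rep(n_met, length(times))),
           type = c("n", rep("c", length(times))),
           name.group = c("condition", times),
           num.group.sup = 1, graph = FALSE)

# Individuals map with partial points joined in time order
plot(res, choix = "ind", partial = "all", chrono = TRUE)
```

With random data the map itself is meaningless; the point is only the shape of the group, type and name.group arguments, which must all have one entry per group, including the supplementary one.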

FH

Kohkichi Hosoda

Aug 19, 2013, 4:11:02 AM
to factomin...@googlegroups.com
Hi François, 

Thank you very much for your quick response.  Now I can be much more confident. 
I will try to analyze my data according to your advice.

Kohkichi. 

On Monday, August 19, 2013 at 16:37:26 UTC+9, François Husson wrote:

Kohkichi Hosoda

Aug 20, 2013, 9:53:56 AM
to factomin...@googlegroups.com
Hi, 

I have two more basic questions. 

1. Data for MFA should be scaled column-wise within each table. That is, the variables of all time points should not be scaled together; each time point's table should be scaled separately. Is this right?

2. What about missing time points? For example, suppose C1, C2, C3 and C4 have the 0h, 1h, 2h and 3h time points, C5 has 0h, 1h and 3h (2h missing), C6 has 0h, 2h and 3h (1h missing), and each time point has the same 80 variables. How should the data be dealt with? Can I run MFA on them as they are, or do I need some special manipulation?
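To illustrate what I mean in both questions, here is a small base R sketch with made-up data (6 subjects, 4 time points, 3 metabolites). The pareto() helper is my own and implements the usual definition of Pareto scaling (centre each column, divide by the square root of its standard deviation). It is applied to each time-point table separately, and the missing samples show up as whole blocks of NA in the wide table. Whether such blocks should then be imputed, for instance with imputeMFA from the missMDA package, is exactly what I am unsure about.

```r
# Pareto scaling: centre each column, divide by the square root of its sd
pareto <- function(x) {
  x <- as.matrix(x)
  centred <- sweep(x, 2, colMeans(x, na.rm = TRUE))
  sweep(centred, 2, sqrt(apply(x, 2, sd, na.rm = TRUE)), "/")
}

set.seed(7)
subjects <- paste0("C", 1:6)
times <- c("0h", "1h", "2h", "3h")
n_met <- 3                                # 80 in the real data

# One table per time point, same subjects (rows) and same metabolites (columns)
blocks <- lapply(times, function(tp) {
  matrix(rnorm(length(subjects) * n_met, mean = 10), nrow = length(subjects),
         dimnames = list(subjects, paste0("met", seq_len(n_met), "_", tp)))
})
names(blocks) <- times
blocks[["2h"]]["C5", ] <- NA              # C5 has no 2h sample
blocks[["1h"]]["C6", ] <- NA              # C6 has no 1h sample

# Scale each time-point table on its own, then bind the blocks side by side
scaled <- lapply(blocks, pareto)
wide <- do.call(cbind, scaled)

# Which subject/time blocks are entirely missing?
missing_block <- sapply(scaled, function(b) rowSums(is.na(b)) == ncol(b))
print(missing_block)
```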

I appreciate your help in advance. 

Kohkichi

On Monday, August 19, 2013 at 17:11:02 UTC+9, Kohkichi Hosoda wrote: