Tabular Lavaan output .scaled | .robust


R McLaren

Nov 29, 2016, 12:30:44 AM
to lavaan
Hi,

After receiving Beaujean's very good (2014) book, Latent Variable Modeling Using R, I ran fitMeasures() on a sem model (output below; book link at the end of this post). Please note in the table below that the .robust values for numerous indices don't exist (NA), but the .scaled values for the same indices do. Beaujean notes, "Most fit measures in lavaan that were derived from a robust estimator have a .scaled suffix in the name."

sem() provides chisq.scaled values and a robust chisq for models fit1 and fit2.

anova() allows for a parent model / nested model fit comparison (e.g., anova(fit1, fit2)), which computes the difference in fit between the models given 1 df.
Yet when I run anova(), lavaan does not output the anova results. Why is this so? The error message is fairly cryptic: "lavaan ERROR".

Further, is it the case that these NA .robust estimates in the table are NA because the table already has .scaled estimates? And if lavaan doesn't output a computation of the difference between the scaled/robust chisqs for fit1 and fit2, does this mean a manual computation of that difference would be improper, invalid, or suspect?

Thanks, RM


(http://blogs.baylor.edu/rlatentvariable/)

fitMeasures(fit2)




npar fmin chisq df pvalue chisq.scaled
113 1.939 655.299 550 0.001 803.516
df.scaled pvalue.scaled chisq.scaling.factor baseline.chisq baseline.df baseline.pvalue
550 0 1.394 5393.949 595 0
baseline.chisq.scaled baseline.df.scaled baseline.pvalue.scaled baseline.chisq.scaling.factor cfi tli
1729.473 595 0 4.233 0.978 0.976
nnfi rfi nfi pnfi ifi rni
0.976 0.869 0.879 0.812 0.978 0.978
cfi.scaled tli.scaled cfi.robust tli.robust nnfi.scaled nnfi.robust
0.777 0.758 NA NA 0.758 NA
rfi.scaled nfi.scaled ifi.scaled rni.scaled rni.robust rmsea
0.497 0.535 0.535 0.947 NA 0.034
rmsea.ci.lower rmsea.ci.upper rmsea.pvalue rmsea.scaled rmsea.ci.lower.scaled rmsea.ci.upper.scaled
0.022 0.043 0.998 0.052 0.044 0.06
rmsea.pvalue.scaled rmsea.robust rmsea.ci.lower.robust rmsea.ci.upper.robust rmsea.pvalue.robust rmr
0.304 NA NA NA NA 0.101
rmr_nomean srmr srmr_bentler srmr_bentler_nomean srmr_bollen srmr_bollen_nomean
0.104 0.081 0.079 0.081 0.079 0.081
srmr_mplus srmr_mplus_nomean cn_05 cn_01 gfi agfi
0.079 0.081 156.276 162.536 0.992 0.991
pgfi mfi
0.823 0.731




Terrence Jorgensen

Nov 30, 2016, 4:00:19 AM
to lavaan
Yet when I run anova, Lavaan does not output the anova results. Why is this so? The error message is fairly cryptic: "lavaan ERROR"

I can't see the image you attached.  You should be able to copy/paste your model syntax, anova() call, and console output with the (full) error into your post, just like you did for the fitMeasures() output.

Further is it the case that these 'NA' .robust estimates in the table are 'NA' because the table already has .scaled estimates?

The *.scaled values are calculated naïvely, by plugging chisq.scaled into the original formulas, which does not provide consistent estimates of those quantities in the population.  The *.robust values are calculated using population-consistent formulas, although those formulas have not been developed for the case of a "mean.var.adjusted" or "scaled.shifted" chi-squared, which is used when estimator = "WLSMV" (the default for categorical outcomes).  You can find details, discussion, and citations in this thread:
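As a quick way to see this in your own output, you can request the .scaled and .robust versions of the same indices side by side (a sketch assuming fit2 is the fitted model from this thread):

```r
library(lavaan)

## Naive (.scaled) vs. population-consistent (.robust) versions of the
## same indices; with WLSMV's scaled-and-shifted test statistic, the
## .robust versions come back as NA because no formula exists for them.
fitMeasures(fit2, c("cfi.scaled",   "cfi.robust",
                    "tli.scaled",   "tli.robust",
                    "rmsea.scaled", "rmsea.robust"))
```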


And if lavaan doesn't output a computation of the difference between the scaled/robust chisqs for fit1 and fit2, does this mean a manual computation of that difference would be improper, invalid, or suspect?

You can find the formulas for calculating the test statistic in the articles cited on the ?lavTestLRT help page, or here:
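As a sketch of what those articles describe, the Satorra-Bentler (2001) scaled difference test can be computed by hand from quantities fitMeasures() already reports. Note this applies to a plain "scaled" statistic; with WLSMV's scaled-and-shifted statistic, lavTestLRT() performs additional adjustments, so treat this as an illustration rather than a replacement:

```r
## Satorra-Bentler (2001) scaled chi-square difference test, by hand.
## Model 0 is the more restricted (parent) model fit1 (larger df);
## model 1 is the less restricted (nested-within) model fit2.
T0 <- fitMeasures(fit1, "chisq");  d0 <- fitMeasures(fit1, "df")
T1 <- fitMeasures(fit2, "chisq");  d1 <- fitMeasures(fit2, "df")
c0 <- fitMeasures(fit1, "chisq.scaling.factor")
c1 <- fitMeasures(fit2, "chisq.scaling.factor")

cd     <- (d0 * c0 - d1 * c1) / (d0 - d1)  # scaling factor of the difference
T.diff <- (T0 - T1) / cd                   # scaled difference statistic
pchisq(T.diff, df = d0 - d1, lower.tail = FALSE)
```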


Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

R MacLaren

Nov 30, 2016, 11:34:17 AM
to lav...@googlegroups.com
Yes thank you! Here's the syntax for Model 2:

sem.model2 <- '
    Intrinsic =~ INTRINSIC
    Performance =~ GPA + HON + ACAH
    Satisfaction =~ SCHOOL+LIFE+HEALTH
    SSA =~ PHONE+STUDY+DISTRACT+FUN+MISTAKE+OFFICE+PART+REVIEW+DUE+CREATE+CLASS+MENTAL+INVOLVE+OPPORT+BOOKS     
    Flow =~ CONT+STAND+NEW+ATTEND+SKILLS+SUCCESS+ELSE+RELATION+ASSIGN+ESCAPE+TRAVEL
    WordFlow =~ binwf + binDT2 + binCOH
   
    # regressions
    SSA ~ Intrinsic
    Performance ~ SSA + Flow
    Flow ~ SSA + Intrinsic
    Satisfaction ~ Flow
    WordFlow ~ Flow + Satisfaction + Performance
   
    # residual correlations
    MENTAL ~~ STUDY
    CONT ~~ STUDY
'
fit2 <- sem(sem.model2, data = FlowDataset, ordered=c("binwf","binDT2","binCOH"))
summary(fit2, fit.measures = TRUE, standardized=TRUE)

Console output:

[123] WARNING: Warning in muthen1984(Data = X[[g]], ov.names = ov.names[[g]], ov.types = ov.types, :
lavaan WARNING: trouble inverting W matrix; used generalized inverse
Warning in lav_object_post_check(lavobject) :
lavaan WARNING: some estimated ov variances are negative

Anova call:

> anova(fit1,fit2, fitMeasures=TRUE)

Console output:

[124] ERROR:
lavaan ERROR

I am reading the links you suggested presently. The output for the model is given below. Sorry the columns are wonky.

lavaan (0.5-22) converged normally after 103 iterations

  Number of observations                           169

  Estimator                                       DWLS      Robust
  Minimum Function Test Statistic              687.392     827.071
  Degrees of freedom                               584         584
  P-value (Chi-square)                           0.002       0.000
  Scaling correction factor                                  1.460
  Shift parameter                                          356.383
    for simple second-order correction (Mplus variant)

Model test baseline model:

  Minimum Function Test Statistic             5434.585    1778.635
  Degrees of freedom                               630         630
  P-value                                        0.000       0.000

User model versus baseline model:

  Comparative Fit Index (CFI)                    0.978       0.788
  Tucker-Lewis Index (TLI)                       0.977       0.772

  Robust Comparative Fit Index (CFI)                            NA
  Robust Tucker-Lewis Index (TLI)                               NA

Root Mean Square Error of Approximation:

  RMSEA                                          0.032       0.050
  90 Percent Confidence Interval          0.021  0.042       0.042  0.057
  P-value RMSEA <= 0.05                          0.999       0.512

  Robust RMSEA                                                  NA
  90 Percent Confidence Interval                                NA     NA

Standardized Root Mean Square Residual:

  SRMR                                           0.082       0.082

Weighted Root Mean Square Residual:

  WRMR                                           0.992       0.992

Parameter Estimates:

  Information                                 Expected
  Standard Errors                           Robust.sem

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  Intrinsic =~                                                         
    INTRINSIC         1.000                               1.657    1.000
  Performance =~                                                       
    GPA               1.000                               0.331    0.850
    HON               3.427    0.517    6.629    0.000    1.135    0.874
    ACAH              0.869    0.162    5.353    0.000    0.288    0.623
  Satisfaction =~                                                      
    SCHOOL            1.000                               1.184    1.033
    LIFE              0.742    0.094    7.922    0.000    0.879    0.704
    HEALTH            0.468    0.103    4.557    0.000    0.554    0.417
  SSA =~                                                               
    PHONE             1.000                               0.422    0.365
    STUDY             0.989    0.249    3.978    0.000    0.417    0.402
    DISTRACT          0.944    0.236    4.006    0.000    0.399    0.432
    FUN               1.116    0.314    3.550    0.000    0.471    0.409
    MISTAKE           1.368    0.336    4.078    0.000    0.578    0.590
    OFFICE            0.887    0.286    3.099    0.002    0.375    0.366
    PART              1.163    0.296    3.931    0.000    0.491    0.510
    REVIEW            1.034    0.247    4.193    0.000    0.437    0.474
    DUE               1.476    0.443    3.332    0.001    0.623    0.396
    CREATE            1.117    0.287    3.896    0.000    0.471    0.531
    CLASS             0.622    0.176    3.538    0.000    0.263    0.495
    MENTAL            0.981    0.232    4.234    0.000    0.414    0.629
    INVOLVE           0.607    0.332    1.825    0.068    0.256    0.204
    OPPORT            1.068    0.379    2.816    0.005    0.451    0.387
    BOOKS             1.072    0.348    3.079    0.002    0.452    0.435
  Flow =~                                                              
    CONT              1.000                               0.433    0.526
    STAND             1.029    0.156    6.585    0.000    0.446    0.527
    NEW               1.146    0.182    6.308    0.000    0.496    0.545
    ATTEND            1.791    0.345    5.193    0.000    0.776    0.500
    SKILLS            1.139    0.173    6.583    0.000    0.493    0.477
    SUCCESS           1.415    0.267    5.304    0.000    0.613    0.569
    ELSE              1.685    0.302    5.586    0.000    0.730    0.529
    RELATION          0.922    0.248    3.714    0.000    0.399    0.359
    ASSIGN            1.598    0.267    5.992    0.000    0.692    0.605
    ESCAPE            1.364    0.354    3.847    0.000    0.591    0.334
    TRAVEL            1.989    0.297    6.701    0.000    0.862    0.606
  WordFlow =~                                                          
    binwf             1.000                               1.512    1.512
    binDT2            0.223    0.219    1.018    0.309    0.337    0.337
    binCOH            0.158    0.170    0.927    0.354    0.238    0.238

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  SSA ~                                                                
    Intrinsic         0.135    0.035    3.905    0.000    0.531    0.531
  Performance ~                                                        
    SSA               0.254    0.188    1.352    0.176    0.324    0.324
    Flow              0.182    0.170    1.074    0.283    0.238    0.238
  Flow ~                                                               
    SSA               0.828    0.203    4.073    0.000    0.807    0.807
    Intrinsic         0.027    0.016    1.650    0.099    0.103    0.103
  Satisfaction ~                                                       
    Flow              1.849    0.247    7.479    0.000    0.676    0.676
  WordFlow ~                                                           
    Flow              1.366    0.459    2.975    0.003    0.391    0.391
    Satisfaction     -0.401    0.140   -2.859    0.004   -0.314   -0.314
    Performance       0.530    0.405    1.308    0.191    0.116    0.116

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
 .STUDY ~~                                                             
   .MENTAL            0.089    0.035    2.513    0.012    0.089    0.183
   .CONT              0.114    0.060    1.894    0.058    0.114    0.171

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .INTRINSIC         7.515    0.129   58.304    0.000    7.515    4.534
   .GPA               3.251    0.030  106.724    0.000    3.251    8.347
   .HON               2.959    0.100   29.466    0.000    2.959    2.279
   .ACAH              1.692    0.057   29.828    0.000    1.692    3.667
   .SCHOOL            5.343    0.116   45.905    0.000    5.343    4.661
   .LIFE              5.367    0.128   41.861    0.000    5.367    4.300
   .HEALTH            5.201    0.118   43.972    0.000    5.201    3.909
   .PHONE             3.118    0.090   34.603    0.000    3.118    2.699
   .STUDY             3.396    0.084   40.524    0.000    3.396    3.269
   .DISTRACT          3.077    0.071   43.196    0.000    3.077    3.333
   .FUN               3.740    0.106   35.311    0.000    3.740    3.244
   .MISTAKE           3.746    0.079   47.305    0.000    3.746    3.825
   .OFFICE            2.331    0.085   27.568    0.000    2.331    2.275
   .PART              2.959    0.074   39.744    0.000    2.959    3.072
   .REVIEW            2.775    0.072   38.811    0.000    2.775    3.012
   .DUE               3.456    0.123   27.993    0.000    3.456    2.197
   .CREATE            3.367    0.069   48.758    0.000    3.367    3.792
   .CLASS             4.698    0.062   75.290    0.000    4.698    8.852
   .MENTAL            4.071    0.051   80.028    0.000    4.071    6.188
   .INVOLVE           2.373    0.112   21.256    0.000    2.373    1.893
   .OPPORT            2.166    0.113   19.137    0.000    2.166    1.859
   .BOOKS             2.320    0.082   28.450    0.000    2.320    2.231
   .CONT              3.290    0.064   51.224    0.000    3.290    3.992
   .STAND             3.822    0.068   56.523    0.000    3.822    4.524
   .NEW               6.000    0.080   74.909    0.000    6.000    6.592
   .ATTEND            5.615    0.178   31.466    0.000    5.615    3.623
   .SKILLS            5.888    0.102   57.591    0.000    5.888    5.691
   .SUCCESS           5.497    0.100   55.214    0.000    5.497    5.101
   .ELSE              3.373    0.111   30.419    0.000    3.373    2.445
   .RELATION          5.840    0.107   54.412    0.000    5.840    5.256
   .ASSIGN            5.225    0.102   51.453    0.000    5.225    4.564
   .ESCAPE            4.556    0.152   29.934    0.000    4.556    2.574
   .TRAVEL            4.444    0.111   40.166    0.000    4.444    3.125
   .binwf             0.000                               0.000    0.000
   .binDT2            0.000                               0.000    0.000
   .binCOH            0.000                               0.000    0.000
    Intrinsic         0.000                               0.000    0.000
   .Performance       0.000                               0.000    0.000
   .Satisfaction      0.000                               0.000    0.000
   .SSA               0.000                               0.000    0.000
   .Flow              0.000                               0.000    0.000
   .WordFlow          0.000                               0.000    0.000

Thresholds:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    binwf|t1          0.232    0.098    2.376    0.018    0.232    0.232
    binDT2|t1         0.202    0.097    2.069    0.039    0.202    0.202
    binCOH|t1         0.796    0.109    7.323    0.000    0.796    0.796

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .INTRINSIC         0.000                               0.000    0.000
   .GPA               0.042    0.014    3.084    0.002    0.042    0.278
   .HON               0.397    0.140    2.840    0.005    0.397    0.236
   .ACAH              0.130    0.033    3.894    0.000    0.130    0.612
   .SCHOOL           -0.089    0.145   -0.611    0.541   -0.089   -0.068
   .LIFE              0.785    0.114    6.906    0.000    0.785    0.504
   .HEALTH            1.463    0.191    7.659    0.000    1.463    0.826
   .PHONE             1.157    0.153    7.573    0.000    1.157    0.867
   .STUDY             0.905    0.109    8.297    0.000    0.905    0.839
   .DISTRACT          0.693    0.077    8.952    0.000    0.693    0.813
   .FUN               1.107    0.154    7.173    0.000    1.107    0.833
   .MISTAKE           0.625    0.070    8.976    0.000    0.625    0.652
   .OFFICE            0.910    0.120    7.608    0.000    0.910    0.866
   .PART              0.686    0.086    7.975    0.000    0.686    0.740
   .REVIEW            0.658    0.077    8.579    0.000    0.658    0.775
   .DUE               2.085    0.291    7.165    0.000    2.085    0.843
   .CREATE            0.566    0.069    8.247    0.000    0.566    0.718
   .CLASS             0.213    0.019   10.996    0.000    0.213    0.755
   .MENTAL            0.261    0.033    7.851    0.000    0.261    0.604
   .INVOLVE           1.505    0.234    6.443    0.000    1.505    0.958
   .OPPORT            1.154    0.157    7.327    0.000    1.154    0.850
   .BOOKS             0.877    0.125    6.997    0.000    0.877    0.811
   .CONT              0.492    0.056    8.830    0.000    0.492    0.724
   .STAND             0.516    0.058    8.816    0.000    0.516    0.722
   .NEW               0.582    0.054   10.690    0.000    0.582    0.703
   .ATTEND            1.801    0.222    8.128    0.000    1.801    0.750
   .SKILLS            0.827    0.074   11.241    0.000    0.827    0.773
   .SUCCESS           0.786    0.102    7.703    0.000    0.786    0.677
   .ELSE              1.370    0.173    7.897    0.000    1.370    0.720
   .RELATION          1.075    0.107   10.062    0.000    1.075    0.871
   .ASSIGN            0.831    0.093    8.894    0.000    0.831    0.634
   .ESCAPE            2.786    0.418    6.670    0.000    2.786    0.889
   .TRAVEL            1.280    0.144    8.891    0.000    1.280    0.633
   .binwf            -1.287                              -1.287   -1.287
   .binDT2            0.886                               0.886    0.886
   .binCOH            0.943                               0.943    0.943
    Intrinsic         2.747    0.297    9.247    0.000    1.000    1.000
   .Performance       0.077    0.016    4.699    0.000    0.705    0.705
   .Satisfaction      0.762    0.158    4.824    0.000    0.543    0.543
   .SSA               0.128    0.056    2.274    0.023    0.718    0.718
   .Flow              0.047    0.015    3.017    0.003    0.249    0.249
   .WordFlow          2.011    2.148    0.936    0.349    0.879    0.879

Scales y*:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
    binwf             1.000                               1.000    1.000
    binDT2            1.000                               1.000    1.000
    binCOH            1.000                               1.000    1.000

Following up on the links Dr. Jorgensen sent me, I get the following output:

> lavTestLRT(fit2, fit1, method="satorra.bentler.2001")
Scaled Chi Square Difference Test (method = "satorra.bentler.2001")

      Df AIC BIC  Chisq Chisq diff Df diff Pr(>Chisq)
fit2 584         687.39                             
fit1 585         691.10    0.87684       1     0.3491

> lavTestLRT(fit2, fit1, method="satorra.bentler.2010")
Scaled Chi Square Difference Test (method = "satorra.bentler.2010")

      Df AIC BIC  Chisq Chisq diff Df diff Pr(>Chisq)
fit2 584         687.39                             
fit1 585         691.10 -0.0083377       1          1

Thank you for your help to date; it has been valuable to my research efforts!


R McLaren

Nov 30, 2016, 12:38:25 PM
to lavaan
Attached is a semPaths figure showing the configuration of variables and latent factors.

Best, RM


Terrence Jorgensen

Dec 1, 2016, 5:56:21 AM
to lavaan
[123] WARNING: Warning in muthen1984(Data = X[[g]], ov.names = ov.names[[g]], ov.types = ov.types, :
lavaan WARNING: trouble inverting W matrix; used generalized inverse

This warning explains itself.  I don't think there's much guidance to offer about this.  I don't know of any studies that have investigated whether the generalized inverse (?MASS::ginv) provides results that are as valid (e.g., unbiased point and SE estimates) as the usual inverse (?solve). The problem may actually resolve itself if you can address the other problems in your model (see next message).
 
Warning in lav_object_post_check(lavobject) :
lavaan WARNING: some estimated ov variances are negative

Your summary() output shows you that the estimated residual variances of SCHOOL and binwf are negative.  This may indicate model misspecification, sampling error (very likely a problem since this is a huge model with categorical data and you only have 169 observations), or both.  You can rule out sampling error by checking the 95% CI for the negative variance estimates to see whether the CIs include positive values.

summary(fit1, ci = TRUE)
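To pull out just the relevant rows rather than scanning the full summary, something like this should work (a sketch using lavaan's parameterEstimates() accessor; fit2 is the model whose output appears above, and SCHOOL and binwf are the variables flagged):

```r
## Inspect the 95% CIs of the flagged residual variances; if the CI for a
## negative estimate includes plausible positive values, sampling error
## alone could explain the Heywood case.
pe <- parameterEstimates(fit2, ci = TRUE, level = 0.95)
subset(pe, op == "~~" & lhs == rhs & lhs %in% c("SCHOOL", "binwf"),
       select = c(lhs, est, se, ci.lower, ci.upper))
```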

> anova(fit1,fit2, fitMeasures=TRUE)

I don't think there is a "fitMeasures" argument.  Are you thinking of the fitMeasures() function?
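If the goal was a side-by-side comparison of fit measures (separate from the likelihood-ratio machinery in anova()), a sketch using the fitMeasures() function directly:

```r
## Tabulate selected fit measures for both models in one matrix;
## sapply() passes fit.measures through to each fitMeasures() call.
sapply(list(fit1 = fit1, fit2 = fit2), fitMeasures,
       fit.measures = c("chisq.scaled", "df.scaled", "pvalue.scaled",
                        "cfi.scaled", "rmsea.scaled", "srmr"))
```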

lavaan ERROR

Strange that it is blank after that.  I wonder if Yves already corrected this in the development version.  Install and try again to check:

install.packages("lavaan", repos = "http://www.da.ugent.be", type = "source")

Sorry the columns are wonky. 

You can change the font to Courier New to make it fixed-width and readable.

> lavTestLRT(fit2, fit1, method="satorra.bentler.2001")
Scaled Chi Square Difference Test (method = "satorra.bentler.2001")

      Df AIC BIC  Chisq Chisq diff Df diff Pr(>Chisq)
fit2 584         687.39                             
fit1 585         691.10    0.87684       1     0.3491

> lavTestLRT(fit2, fit1, method="satorra.bentler.2010")
Scaled Chi Square Difference Test (method = "satorra.bentler.2010")

      Df AIC BIC  Chisq Chisq diff Df diff Pr(>Chisq)
fit2 584         687.39                             
fit1 585         691.10 -0.0083377       1          1

Strange that lavTestLRT() provides results without error, whereas the anova() method (which I think is just a wrapper around lavTestLRT()) returns an error.  Also strange that the 2001 method returns a positive chi-squared difference, whereas the 2010 method does not.  The 2010 method is supposed to always provide a positive statistic.  You only showed us 1 of the models.  Are you sure the 2 models are nested?  If so, it might be necessary to provide your data to track down the issue.  Again, use the latest development version of lavaan to make sure you aren't running across bugs that have already been exterminated.

R McLaren

Dec 3, 2016, 10:06:14 PM
to lavaan
Hi Dr. Jorgensen, and thank you for the clear reply. You raise a good point about whether the models are indeed nested. I assumed I had constructed model syntax that makes the nesting clear to the program, but on reflection I can see that I may be mistaken, so I've provided the Model 1 and Model 2 specifications below. In a somewhat naive attempt to ensure the software 'understood' the nesting, I loaded SSA and Flow together onto a single second-order factor in the first model, and treated them as separate causal constructs in the second model.

sem.model1 <- '

    Intrinsic =~ INTRINSIC
    Performance =~ GPA + HON + ACAH
   
    Satisfaction =~ SCHOOL+LIFE+HEALTH
    SSAFlow =~ SSA + Flow

    SSA =~ PHONE+STUDY+DISTRACT+FUN+MISTAKE+OFFICE+PART+REVIEW+DUE+CREATE+CLASS+MENTAL+INVOLVE+OPPORT+BOOKS     
    Flow =~ CONT+STAND+NEW+ATTEND+SKILLS+SUCCESS+ELSE+RELATION+ASSIGN+ESCAPE+TRAVEL

    WordFlow =~ binwf+binDT2+binCOH

    # regressions
   
    Performance ~ SSAFlow
    SSAFlow ~ Intrinsic
    Satisfaction ~ SSAFlow
    WordFlow ~ SSAFlow + Satisfaction + Performance


    # residual correlations
    MENTAL ~~ STUDY
    CONT ~~ STUDY
'
fit1 <- sem(sem.model1, data = FlowDataset, ordered=c("binwf","binDT2","binCOH"))
summary(fit1, fit.measures = TRUE, standardized=TRUE)


sem.model2 <- '
    Intrinsic =~ INTRINSIC
    Performance =~ GPA + HON + ACAH
   
    Satisfaction =~ SCHOOL+LIFE+HEALTH
    SSA =~ PHONE+STUDY+DISTRACT+FUN+MISTAKE+OFFICE+PART+REVIEW+DUE+CREATE+CLASS+MENTAL+INVOLVE+OPPORT+BOOKS     
    Flow =~ CONT+STAND+NEW+ATTEND+SKILLS+SUCCESS+ELSE+RELATION+ASSIGN+ESCAPE+TRAVEL
    WordFlow =~ binwf + binDT2 + binCOH
   
    # regressions
    SSA ~ Intrinsic
    Performance ~ SSA + Flow
    Flow ~ SSA + Intrinsic
    Satisfaction ~ Flow
    WordFlow ~ Flow + Satisfaction + Performance
   
    # residual correlations
    MENTAL ~~ STUDY
    CONT ~~ STUDY
'
fit2 <- sem(sem.model2, data = FlowDataset, ordered=c("binwf","binDT2","binCOH"))
summary(fit2, fit.measures = TRUE, standardized=TRUE)

Is this the proper way to specify a parent:nested model comparison?

Thanks!

RM

R MacLaren

Dec 4, 2016, 1:42:46 PM
to lav...@googlegroups.com
Regarding model specification, is it correct to assume that model specification is good to the extent that the variables specified in the model accurately represent the phenomenon?

The reason I ask is that my variables for WordFlow / school satisfaction are premised on sentiment-analysis and pattern-matching data, and I am working to develop better sentiment-analysis procedures. I'm thinking that if the model specification is wonky, it might be because the tools I'm using to measure WordFlow are poor ones. If that is the case, it might be worth spending a day or two developing better measures for word flow and satisfaction.

R MacLaren

Dec 5, 2016, 12:37:11 AM
to lav...@googlegroups.com
My mistake: I probably should be asking the previous question on the SEM group. Sorry for the bandwidth. I have some new measures from the verbal data that I'm exploring presently, however, and I'll experiment to see if I can buttress the model with better measures.


Terrence Jorgensen

Dec 5, 2016, 6:40:05 AM
to lavaan
Is this the proper way to specify a parent:nested model comparison?

lavaan will automatically sort the models by their df before calculating the difference in fit, but it is agnostic about whether the models are nested.  It is hard to tell whether your models are nested just by looking at the syntax because one includes the effects of 2 first-order factors (SSA and Flow), whereas the other model includes the effects of a second-order factor (SSAFlow), and it is not a simple matter of replacing 2 paths with 1 path because in the first model SSA affects Flow.

Because the measurement models are identical, these models only differ in their latent structure, so you could use an empirical check to see whether they are nested.  There is one implemented in the semTools package called net(), and the help page provides a Reference where you can read about the method.  It only works with continuous data, but your models only differ in the latent structure, and the latent variables are continuous.  So I used those parts of your syntax to build a toy example of each model as if you had observed the latent variables (using arbitrary population parameters for the regression paths -- the values don't affect the test of nesting/equivalence).  Then I fit the structural model to the implied covariance matrices so that the lavaan objects could be passed to net().

mod1 <- 'SSAFlow =~ .6?SSA + .6?Flow
  Performance ~ .3?SSAFlow 
  SSAFlow ~ .5?Intrinsic
  Satisfaction ~ .3?SSAFlow 
  WordFlow ~ .3?SSAFlow + .3?Satisfaction + .3?Performance
'
out1 <- sem(mod1, sample.cov = fitted(sem(mod1))$cov,
            sample.nobs = 100, fixed.x = FALSE)

mod2 <- '
  SSA ~ .5?Intrinsic
  Performance ~ .3?SSA + .3?Flow 
  Flow ~ .3?SSA + .3?Intrinsic
  Satisfaction ~ .3?Flow 
  WordFlow ~ .3?Flow + .3?Satisfaction + .3?Performance
'
out2 <- sem(mod2, sample.cov = fitted(sem(mod2))$cov,
            sample.nobs = 100, fixed.x = FALSE)

Then you can provide both lavaan objects to the net() function to see whether the models are nested.  They are not nested.

net(out1, out2)

R McLaren

Dec 5, 2016, 4:18:05 PM
to lavaan
Thanks, Terrence. I executed the code you provided, and it gave the output below. I'm looking over the help page presently and rereading your helpful note a few times so I can get a grasp of it. My first reading suggests that if one conceives of the latent Flow as comprising, say, 7 or 8 dimensions (e.g., autotelic experience, loss of self-consciousness, merging of action and awareness, balancing challenge with skill, etc.; see https://en.wikipedia.org/wiki/Flow_(psychology)), then I can empirically test whether alternate Flow models are nested by mimicking the work you've done here.

I performed the check of the 95% confidence interval as you suggested, and it spans both positive and negative values, so I'll dig out the results and paste them in this thread. Thanks again, and I'll be back with the output soon... RM


>     net(out1, out2)

     If cell [R, C] is TRUE, the model in row R is nested within column C.

     If cell [R, C] is TRUE and the models have the same degrees of freedom,
     they are equivalent models.  See Bentler & Satorra (2010) for details.

     If cell [R, C] is NA, then the model in column C did not converge when
     fit to the implied means and covariance matrix from the model in row R.

     The hidden diagonal is TRUE because any model is equivalent to itself.
     The upper triangle is hidden because for models with the same degrees
     of freedom, cell [C, R] == cell [R, C].  For all models with different
     degrees of freedom, the upper diagonal is all FALSE because models with
     fewer degrees of freedom (i.e., more parameters) cannot be nested
     within models with more degrees of freedom (i.e., fewer parameters).
    
              out2  out1
out2 (df = 6)          
out1 (df = 7) FALSE

R McLaren

Dec 5, 2016, 4:52:58 PM
to lavaan
Hi, below are the 95% CIs, including those for SCHOOL and binwf:

> summary(fit2, ci = TRUE)

lavaan (0.5-22) converged normally after 103 iterations

  Number of observations                           169

  Estimator                                       DWLS      Robust
  Minimum Function Test Statistic              687.392     827.071
  Degrees of freedom                               584         584
  P-value (Chi-square)                           0.002       0.000
  Scaling correction factor                                  1.460
  Shift parameter                                          356.383
    for simple second-order correction (Mplus variant)

Parameter Estimates:

  Information                                 Expected
  Standard Errors                           Robust.sem

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
  Intrinsic =~                                                         
    INTRINSIC         1.000                               1.000    1.000
  Performance =~                                                       
    GPA               1.000                               1.000    1.000
    HON               3.427    0.517    6.629    0.000    2.414    4.441
    ACAH              0.869    0.162    5.353    0.000    0.551    1.187
  Satisfaction =~                                                      
    SCHOOL            1.000                               1.000    1.000
    LIFE              0.742    0.094    7.922    0.000    0.558    0.926
    HEALTH            0.468    0.103    4.557    0.000    0.267    0.669
  SSA =~                                                               
    PHONE             1.000                               1.000    1.000
    STUDY             0.989    0.249    3.978    0.000    0.501    1.476
    DISTRACT          0.944    0.236    4.006    0.000    0.482    1.407
    FUN               1.116    0.314    3.550    0.000    0.500    1.732
    MISTAKE           1.368    0.336    4.078    0.000    0.711    2.026
    OFFICE            0.887    0.286    3.099    0.002    0.326    1.449
    PART              1.163    0.296    3.931    0.000    0.583    1.742
    REVIEW            1.034    0.247    4.193    0.000    0.551    1.518
    DUE               1.476    0.443    3.332    0.001    0.608    2.345
    CREATE            1.117    0.287    3.896    0.000    0.555    1.678
    CLASS             0.622    0.176    3.538    0.000    0.278    0.967
    MENTAL            0.981    0.232    4.234    0.000    0.527    1.435
    INVOLVE           0.607    0.332    1.825    0.068   -0.045    1.258
    OPPORT            1.068    0.379    2.816    0.005    0.325    1.812
    BOOKS             1.072    0.348    3.079    0.002    0.389    1.754
  Flow =~                                                              
    CONT              1.000                               1.000    1.000
    STAND             1.029    0.156    6.585    0.000    0.723    1.335
    NEW               1.146    0.182    6.308    0.000    0.790    1.502
    ATTEND            1.791    0.345    5.193    0.000    1.115    2.467
    SKILLS            1.139    0.173    6.583    0.000    0.800    1.478
    SUCCESS           1.415    0.267    5.304    0.000    0.892    1.938
    ELSE              1.685    0.302    5.586    0.000    1.094    2.277
    RELATION          0.922    0.248    3.714    0.000    0.435    1.409
    ASSIGN            1.598    0.267    5.992    0.000    1.075    2.121
    ESCAPE            1.364    0.354    3.847    0.000    0.669    2.058
    TRAVEL            1.989    0.297    6.701    0.000    1.407    2.571
  WordFlow =~                                                          
    binwf             1.000                               1.000    1.000
    binDT2            0.223    0.219    1.018    0.309   -0.206    0.652
    binCOH            0.158    0.170    0.927    0.354   -0.176    0.491

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
  SSA ~                                                                
    Intrinsic         0.135    0.035    3.905    0.000    0.067    0.203
  Performance ~                                                        
    SSA               0.254    0.188    1.352    0.176   -0.114    0.622
    Flow              0.182    0.170    1.074    0.283   -0.150    0.515
  Flow ~                                                               
    SSA               0.828    0.203    4.073    0.000    0.430    1.227
    Intrinsic         0.027    0.016    1.650    0.099   -0.005    0.059
  Satisfaction ~                                                       
    Flow              1.849    0.247    7.479    0.000    1.364    2.333
  WordFlow ~                                                           
    Flow              1.366    0.459    2.975    0.003    0.466    2.267
    Satisfaction     -0.401    0.140   -2.859    0.004   -0.676   -0.126
    Performance       0.530    0.405    1.308    0.191   -0.264    1.323

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
 .STUDY ~~                                                             
   .MENTAL            0.089    0.035    2.513    0.012    0.020    0.158
   .CONT              0.114    0.060    1.894    0.058   -0.004    0.232

Intercepts:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
   .INTRINSIC         7.515    0.129   58.304    0.000    7.262    7.767
   .GPA               3.251    0.030  106.724    0.000    3.192    3.311
   .HON               2.959    0.100   29.466    0.000    2.762    3.155
   .ACAH              1.692    0.057   29.828    0.000    1.581    1.804
   .SCHOOL            5.343    0.116   45.905    0.000    5.115    5.571
   .LIFE              5.367    0.128   41.861    0.000    5.116    5.618
   .HEALTH            5.201    0.118   43.972    0.000    4.969    5.433
   .PHONE             3.118    0.090   34.603    0.000    2.942    3.295
   .STUDY             3.396    0.084   40.524    0.000    3.232    3.561
   .DISTRACT          3.077    0.071   43.196    0.000    2.937    3.217
   .FUN               3.740    0.106   35.311    0.000    3.532    3.947
   .MISTAKE           3.746    0.079   47.305    0.000    3.590    3.901
   .OFFICE            2.331    0.085   27.568    0.000    2.166    2.497
   .PART              2.959    0.074   39.744    0.000    2.813    3.104
   .REVIEW            2.775    0.072   38.811    0.000    2.635    2.915
   .DUE               3.456    0.123   27.993    0.000    3.214    3.698
   .CREATE            3.367    0.069   48.758    0.000    3.232    3.502
   .CLASS             4.698    0.062   75.290    0.000    4.576    4.821
   .MENTAL            4.071    0.051   80.028    0.000    3.971    4.171
   .INVOLVE           2.373    0.112   21.256    0.000    2.154    2.592
   .OPPORT            2.166    0.113   19.137    0.000    1.944    2.387
   .BOOKS             2.320    0.082   28.450    0.000    2.160    2.479
   .CONT              3.290    0.064   51.224    0.000    3.164    3.416
   .STAND             3.822    0.068   56.523    0.000    3.690    3.955
   .NEW               6.000    0.080   74.909    0.000    5.843    6.157
   .ATTEND            5.615    0.178   31.466    0.000    5.266    5.965
   .SKILLS            5.888    0.102   57.591    0.000    5.687    6.088
   .SUCCESS           5.497    0.100   55.214    0.000    5.302    5.692
   .ELSE              3.373    0.111   30.419    0.000    3.155    3.590
   .RELATION          5.840    0.107   54.412    0.000    5.630    6.051
   .ASSIGN            5.225    0.102   51.453    0.000    5.026    5.424
   .ESCAPE            4.556    0.152   29.934    0.000    4.258    4.855
   .TRAVEL            4.444    0.111   40.166    0.000    4.227    4.661

   .binwf             0.000                               0.000    0.000
   .binDT2            0.000                               0.000    0.000
   .binCOH            0.000                               0.000    0.000
    Intrinsic         0.000                               0.000    0.000
   .Performance       0.000                               0.000    0.000
   .Satisfaction      0.000                               0.000    0.000
   .SSA               0.000                               0.000    0.000
   .Flow              0.000                               0.000    0.000
   .WordFlow          0.000                               0.000    0.000

Thresholds:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
    binwf|t1          0.232    0.098    2.376    0.018    0.041    0.423
    binDT2|t1         0.202    0.097    2.069    0.039    0.011    0.393
    binCOH|t1         0.796    0.109    7.323    0.000    0.583    1.009

Variances:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper
   .INTRINSIC         0.000                               0.000    0.000
   .GPA               0.042    0.014    3.084    0.002    0.015    0.069
   .HON               0.397    0.140    2.840    0.005    0.123    0.671
   .ACAH              0.130    0.033    3.894    0.000    0.065    0.196
   .SCHOOL           -0.089    0.145   -0.611    0.541   -0.374    0.196
   .LIFE              0.785    0.114    6.906    0.000    0.562    1.008
   .HEALTH            1.463    0.191    7.659    0.000    1.089    1.837
   .PHONE             1.157    0.153    7.573    0.000    0.857    1.456
   .STUDY             0.905    0.109    8.297    0.000    0.691    1.119
   .DISTRACT          0.693    0.077    8.952    0.000    0.541    0.845
   .FUN               1.107    0.154    7.173    0.000    0.804    1.409
   .MISTAKE           0.625    0.070    8.976    0.000    0.489    0.762
   .OFFICE            0.910    0.120    7.608    0.000    0.675    1.144
   .PART              0.686    0.086    7.975    0.000    0.518    0.855
   .REVIEW            0.658    0.077    8.579    0.000    0.508    0.809
   .DUE               2.085    0.291    7.165    0.000    1.514    2.655
   .CREATE            0.566    0.069    8.247    0.000    0.432    0.701
   .CLASS             0.213    0.019   10.996    0.000    0.175    0.251
   .MENTAL            0.261    0.033    7.851    0.000    0.196    0.327
   .INVOLVE           1.505    0.234    6.443    0.000    1.048    1.963
   .OPPORT            1.154    0.157    7.327    0.000    0.845    1.462
   .BOOKS             0.877    0.125    6.997    0.000    0.631    1.122
   .CONT              0.492    0.056    8.830    0.000    0.383    0.601
   .STAND             0.516    0.058    8.816    0.000    0.401    0.630
   .NEW               0.582    0.054   10.690    0.000    0.475    0.689
   .ATTEND            1.801    0.222    8.128    0.000    1.366    2.235
   .SKILLS            0.827    0.074   11.241    0.000    0.683    0.971
   .SUCCESS           0.786    0.102    7.703    0.000    0.586    0.986
   .ELSE              1.370    0.173    7.897    0.000    1.030    1.709
   .RELATION          1.075    0.107   10.062    0.000    0.866    1.285
   .ASSIGN            0.831    0.093    8.894    0.000    0.648    1.015
   .ESCAPE            2.786    0.418    6.670    0.000    1.967    3.604
   .TRAVEL            1.280    0.144    8.891    0.000    0.998    1.562

   .binwf            -1.287                              -1.287   -1.287
   .binDT2            0.886                               0.886    0.886
   .binCOH            0.943                               0.943    0.943
    Intrinsic         2.747    0.297    9.247    0.000    2.165    3.329
   .Performance       0.077    0.016    4.699    0.000    0.045    0.110
   .Satisfaction      0.762    0.158    4.824    0.000    0.452    1.072
   .SSA               0.128    0.056    2.274    0.023    0.018    0.238
   .Flow              0.047    0.015    3.017    0.003    0.016    0.077
   .WordFlow          2.011    2.148    0.936    0.349   -2.199    6.222

Scales y*:
                   Estimate  Std.Err  z-value  P(>|z|) ci.lower ci.upper

    binwf             1.000                               1.000    1.000
    binDT2            1.000                               1.000    1.000
    binCOH            1.000                               1.000    1.000


Thanks, I'll return to reading these notes over again and refer to supplemental sources so I can get a handle on this...Best regards, RM

R McLaren

unread,
Dec 5, 2016, 6:33:44 PM12/5/16
to lavaan
I should say that the 7 or 8 dimensions will likely lead to negative values again, so partitioning the SSA/Flow items into 4 sensible dimensions might be worth a try; ideally, those 4 dimensions would line up meaningfully with the 8.

On Monday, December 5, 2016 at 4:18:05 PM UTC-5, R McLaren wrote:

Terrence Jorgensen

unread,
Dec 6, 2016, 4:01:05 AM12/6/16
to lavaan
I executed the code you provided, and it gave the following output

Yes, that is the same output I got, indicating your models are not nested.

I'm looking over the Help page presently and rereading your helpful note a few times so I can get a grasp of it.

The idea behind covariance-matrix nesting is that if a more-restricted model B is nested within a less-restricted model A, then model A can fit any covariance matrix that B can, and more. So if you fit B (which has more df, i.e., fewer parameters) to data and save the model-implied covariance matrix, then fit A (which has fewer df) to that model-implied covariance matrix, it will fit perfectly (within reasonable machine precision) because B is nested within A.  If A and B are nested and have the same df, they are statistically equivalent (they make identical predictions about any covariance matrix to which they are fit).  
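That logic can be sketched directly in lavaan. This is a hedged illustration, not code from the thread: `fitB` (the more-restricted model, already fit to raw data) and `modelA` (the less-restricted syntax) are placeholders, and the mean-structure line only applies if the models include one.

```r
library(lavaan)

## Model-implied moments under the more-restricted model B
implied <- fitted(fitB)

## Refit the less-restricted model A to B's implied moments.
## If B is truly nested in A, A should fit (near) perfectly.
fitA <- sem(modelA,
            sample.cov  = implied$cov,
            sample.mean = implied$mean,   # only if meanstructure = TRUE
            sample.nobs = lavInspect(fitB, "nobs"))

fitMeasures(fitA, "chisq")  # expect ~0, up to machine precision
```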

Because it only applies to analyses of (mean and) covariance structure (i.e., no threshold model linking latent item responses to categorical observed item responses), this method of testing nesting/equivalence requires continuous data.  Whereas mean and covariance structure parameters are independently identified in mean and covariance structure analyses, threshold specification affects not only the implied means but also the implied variances of the latent item responses. 

From a first reading, I take away that if one conceives of the latent Flow as comprising, say, 7 or 8 dimensions (e.g., autotelic experience, loss of self-consciousness, merging of action and awareness, balancing challenge with skill, etc.; see https://en.wikipedia.org/wiki/Flow_(psychology)), then I can empirically test whether the models are nested by mimicking the work you've done here with alternate Flow models.

If you are talking about different structural models for the latent common factors, yes.  But if you are talking about specifying different measurement models for the observed categorical responses, then you should not be running latent regressions.  Just leave the latent structure saturated (a CFA) until you find a measurement model that fits well and makes sense.  Latent regressions among uninterpretable factors are not themselves interpretable.  And don't confuse "interpretable factor" with "interpretable label" -- that's the Naming Fallacy.

You can only apply the net() function if you treat the item responses as continuous for the purpose of testing whether competing models are nested (i.e., do not trust the estimates, SEs, or fit measures when you treat the indicators as continuous; just pass those models to net() to see if they are nested, then treat the indicators as categorical to actually test and interpret the results of each model). Note that the net() results would only be applicable if the thresholds are saturated (no constraints on them, so no invariance testing).
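A sketch of that two-step workflow (the model strings `model1`/`model2` and data frame `dat` are placeholders; the `ordered` variable names are the binary indicators from the output earlier in the thread):

```r
library(lavaan)
library(semTools)

## Step 1: treat all indicators as continuous ONLY to test nesting.
## Do not interpret these estimates, SEs, or fit measures.
fitA_cont <- sem(model1, data = dat)
fitB_cont <- sem(model2, data = dat)
net(fitA_cont, fitB_cont)

## Step 2: refit with categorical indicators for actual estimation
## and interpretation of each model.
fitA <- sem(model1, data = dat, ordered = c("binwf", "binDT2", "binCOH"))
fitB <- sem(model2, data = dat, ordered = c("binwf", "binDT2", "binCOH"))
```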

I performed a test of the 95% confidence interval as you suggested, and it included both + and - values.

Only the residual variance for SCHOOL had a 95% CI that included plausible values.  The one for binwf was entirely negative, so this indicates sampling error is not a reasonable explanation for the negative estimate.  It is probably due to misspecification. 

R MacLaren

unread,
Dec 6, 2016, 5:15:13 PM12/6/16
to lav...@googlegroups.com
Hi Terrence and thanks very much - I'll take a day or two to re-read these Lavaan group threads where you've responded, as well as a recent sem discussion group thread located here:

https://mail.google.com/mail/u/0/#inbox/158cf10b40fa9e67

...and get a fix on some learning goals via the Beaujean book and the CFA book by Brown (2006) I have on hand - the solution path in the second-to-last paragraph will prove useful to me.

Regarding misspecification, I have to read more about model misspecification, identification, and modification indices. Jöreskog (1993, cited in Brown) outlines a protocol where I can proceed from the item with the highest MI and, if reasonable, 'unconstrain' or 'free' the parameter, using techniques I think I can garner from Beaujean.

Again thanks Terrence!

Terrence Jorgensen

unread,
Dec 7, 2016, 4:15:05 AM12/7/16
to lavaan
Regarding misspecification, I have to read more about model misspecification, identification, and modification indices. Jöreskog (1993, cited in Brown) outlines a protocol where I can proceed from the item with the highest MI and, if reasonable, 'unconstrain' or 'free' the parameter, using techniques I think I can garner from Beaujean. 

Modification indices can be useful, but they assume the structure of your model is already basically correct, and that the only changes you need to make are to free additional parameters rather than restructure the model to represent a different theoretical explanation.  Also, modification indices can provide some confusing answers, because the ones that make the most sense might not be the largest.  Before consulting the modification indices, I would recommend deciding which ones you want to look at (so you start with your theoretical justification first), then only inspect those.  The problem with going straight to modification indices is that the human mind can rationalize anything, so we can usually (unintentionally) invent a reason why freeing a parameter might make sense.
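One hedged way to put that advice into practice: decide a priori which parameters are theoretically defensible, then filter the modification-index table to just those. Here `fit2` is the model from earlier in the thread, and `flow_items` is an illustrative subset, not a recommendation.

```r
library(lavaan)

mi <- modindices(fit2)   # data frame with lhs, op, rhs, mi, epc, ...

## e.g., only residual covariances among a pre-chosen set of Flow items
flow_items <- c("CONT", "STAND", "NEW", "ATTEND")   # illustrative choice
subset(mi, op == "~~" & lhs %in% flow_items & rhs %in% flow_items)
```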

I think a more useful data-driven approach to finding why your model doesn't fit well is to look at correlation residuals.

resid(out1, type = "cor")

Those show you the difference between observed Pearson correlations (or estimated polychoric correlations) and model-implied correlations.  In other words, how do your model's predictions differ from the data?  A rule of thumb is that differences > .10 are substantial, but that is just a rule of thumb.  The point is that the matrix of correlation residuals will force you to notice relationships in the data that are not being captured by the model.  And knowing your model as you do, you will begin trying to figure out what the model should look like (which may include restructuring) before trying to sequentially free parameters (which tends to result in models that do not generalize).
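A small sketch of flagging the large residuals programmatically (the element name of the returned list has varied across lavaan versions, so this checks both; `fit2` is the model fit earlier in the thread):

```r
res <- resid(fit2, type = "cor")
## element name varies by lavaan version ($cov in some, $cor in others)
cr  <- if (!is.null(res$cov)) res$cov else res$cor

## flag |residual| > .10 in the lower triangle
big <- which(abs(cr) > .10 & lower.tri(cr), arr.ind = TRUE)
data.frame(var1     = rownames(cr)[big[, 1]],
           var2     = colnames(cr)[big[, 2]],
           residual = round(cr[big], 2))
```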

Good luck,

R MacLaren

unread,
Dec 7, 2016, 12:05:52 PM12/7/16
to lav...@googlegroups.com
Hi Terrence,

Apologies to you because I had sent this same message, but tried to retract/delete it because I accidentally misspelled your name - I hope it was deleted in time.

I think one possible signal that a person grasps science is when they predict the future. Students may learn from my mistake.

For example, Terrence you said "The problem with going straight to modification indices is that the human mind can rationalize anything, so we can usually (unintentionally) invent a reason why freeing a parameter might make sense."

Yes, I was awake quite late last evening, exercising my creativity at rationalizing why particular cherries would be good to pick. But the thought of selecting out items didn't sit right with me - as an experimentalist I've always been averse to mauling the dataset, and nowadays, with more open research, I feel it's best to have good theoretical cause to deprecate data points. So my intuitions were good, though not so good that I didn't try cherry-picking a few items before I remembered what a reviewer once reminded me about overly specific models not generalizing.

So I opted to try to aggregate the 8 dimensions of flow into four chunkier dimensions that make sense: FlowAction, FlowExecutive, FlowIntInv, and FlowOutcome. Satisfaction I treat as a merely 'renamed' outcome - as a concept, satisfaction shares with the flow experience a positive emotional valence, or tag, and so is not conceptually discordant with happiness or feelings of flow, which often share positive valence. Since I envision Flow and Satisfaction influencing WordFlow, it seemed justifiable to fold Satisfaction into Flow as a historical, remembered outcome of having had numerous flow experiences doing school work.

    FlowAction =~ OFFICE + PART + REVIEW + DUE + CREATE + CLASS + MENTAL +
                  INVOLVE + OPPORT + BOOKS + SKILLS + ELSE + RELATION
    FlowExecutive =~ PHONE + STUDY + DISTRACT + CONT + STAND + NEW + ATTEND + SUCCESS + TRAVEL
    FlowIntInv =~ FUN + MISTAKE + ASSIGN + ESCAPE
    FlowOutcome =~ SCHOOL + LIFE + HEALTH  ## Formerly 'Satisfaction'
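Following the earlier advice in this thread to leave the latent structure saturated until the measurement model fits, one hedged way to try this syntax on its own is a plain CFA. Here `flow4` stands for the four-factor syntax above pasted as a string, and `dat` is a placeholder for the data frame used in the earlier fits:

```r
library(lavaan)

## Saturated latent structure: factors covary freely by default in cfa()
fit4 <- cfa(flow4, data = dat)
summary(fit4, fit.measures = TRUE)
```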


And after reading the MIs, the items with the severest misfit empirically clustered into three flow dimensions consistent with Csikszentmihalyi's 8 flow dimensions, so from a theoretical vantage point I thought I had justification for the decision.

In terms of the biplot of z-scores, FlowAction is CS+CGF+AA, FlowExecutive is TT+PrdxCntrl+Con, and FlowIntInv is autotelic AU + loss of self-consciousness LC.

So today, after reading your reply Terrence, I'm pleased to see there's a better avenue. I ran

resid(fit2, type = "cor")

...and I'm reviewing the output presently. I'm also going to look at other authors who write about flow after reviewing the 'resid' output. You and readers should know that my outcome latent variable, WordFlow, is premised on machine sentiment analysis - I'm experimenting with classifiers that use naive Bayes and Python NLTK to output sentiment as neg (1), neutral (2), or pos (3), for example. I've treated these as ordered and identified them as such. The challenge is that when I look at the methodology for classification, in each instance I have to ask myself "To what extent do these tools measure what I believe is flow?" - and I ask it often, when individual classifications of flow for particular sentences are baffling. The models have shown me, in a couple of instances, that the notions I had about the concept needed to change remarkably.

So Terrence I appreciate your point about these statistical protocols being learning milestones about the data. I've marked the residuals > 0.10 and I'll work now to see if they make sense using an alternate conception of flow dimensionality.

Thanks for the guidance, and talk soon!   Rick

RGraph-KMEANSCLUSTER-FLOW1.png