PLS-SEM Sample Size Adequacy


Madison Ngafeeson

May 21, 2013, 9:24:17 PM
to pls...@googlegroups.com

Hello All,

I was reading a recent book on PLS, “A Primer on PLS-SEM” by Hair et al. (2014), which is a great read by the way, and I have a question about sample size recommendations for PLS-SEM analyses.

 What is the minimum sample size for PLS-SEM analysis for obtaining a statistical power of 80% with 15-16 constructs at a .05 significance level?

 Looking forward to hearing from you. Thank you.

 Madison

Geoffrey Hubona

May 21, 2013, 9:39:50 PM
to pls...@googlegroups.com
You know, I hear this question often. And I like to provide direct, concise answers. But the best answer is "more is always better". There is no magic threshold for obtaining the 80% power benchmark with 15 constructs at p < 0.05, although many would tell you there is. Power is determined by many things besides the sample size and the complexity of the model. It is a function of how strong the effect you are trying to capture is in the actual population. It is a function of the adequacy (reliability and validity of measurement) of your instrument. It is a function of the "behavior" of the error terms in your specific sample, which creates bias. It is a function of the adequacy of the model that you specify.
 
I have published studies with as few as 107 observations and a simple PLS path model (5 constructs), but even that was a close call. My R-squared values were terribly low even though almost all paths were significant. Somehow, no one noticed and they published them (two of them) anyway. There are rules of thumb for that magic minimum number, but I believe them to be misleading. It depends on your model, your data, the strength of the effect (in the population at large from which you sampled your data), and other things that cannot be quantified so easily.
 
You can use Cohen's power tables for a citable source of "the magic number", but I would not try it with fewer than, say, 150 observations. And, as I said to begin with, more is always better.
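To make the dependence on effect strength concrete, here is a minimal sketch. It does not reproduce Cohen's tables and it is not a PLS-SEM power analysis; it uses the standard Fisher z approximation for detecting a single bivariate correlation, which is an assumption chosen only because it is simple. The function name `n_for_correlation` is hypothetical.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate sample size needed to detect a population correlation r.

    Uses the Fisher z approximation for a two-sided test. This is an
    illustrative stand-in, not a power analysis for a full path model.
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be nonzero and strictly between -1 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    return math.ceil(((z_alpha + z_power) / math.atanh(abs(r))) ** 2 + 3)


# Same alpha and power, different effect strengths in the population.
for r in (0.1, 0.3, 0.5):
    print(r, n_for_correlation(r))
```

Under these assumptions, a correlation of 0.3 needs on the order of 85 observations, while a correlation of 0.1 needs several hundred. The same alpha and power targets give very different sample sizes, which is why a single threshold can mislead.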
 
Go ahead, beat me up if you want to.
 
Geoff Hubona


--
--
You received this message because you are subscribed to the Google
Groups "PLS-SEM" group.
To post to this group, send email to pls...@googlegroups.com
To unsubscribe from this group, send email to
pls-sem+u...@googlegroups.com
 
 
 

Ned Kock

May 22, 2013, 10:43:27 AM
to pls...@googlegroups.com

I get this question all the time from WarpPLS users.

 

If your goal is to avoid type II errors, my answer would be: 138 or 10 times the maximum number of variables (manifest or latent) influencing the estimation of any coefficient in the model, whichever is greater.
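Stated as arithmetic, the recommendation above is the larger of two numbers. A minimal sketch follows; the function and parameter names are hypothetical, and the caller is assumed to have already counted the maximum number of variables influencing any one coefficient.

```python
def kock_minimum_sample_size(max_influencing_variables, floor=138, multiplier=10):
    """Return the larger of the "138 rule" and the "10 times rule".

    max_influencing_variables: the largest number of variables (manifest or
    latent) influencing the estimation of any single coefficient in the model.
    """
    if max_influencing_variables < 1:
        raise ValueError("expected at least one influencing variable")
    return max(floor, multiplier * max_influencing_variables)


print(kock_minimum_sample_size(9))   # 10 x 9 = 90 is below the floor of 138
print(kock_minimum_sample_size(20))  # 10 x 20 = 200 exceeds the floor
```

With this rule the "10 times" term only matters once more than 13 variables bear on a single coefficient; below that, the floor of 138 decides.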

 

The "138 rule" comes from simulations based on regression assumptions and some of Cohen's assumptions regarding power. This rule probably needs a more solid mathematical foundation to be widely accepted.

 

Nevertheless, this "138 rule" will be fairly consistent with the results of a WarpPLS analysis where the user employs P values and effect sizes in combination, sticking with thresholds of .05 and .02, respectively.

 

The "10 times rule" is based on ideas underlying the theory of degrees of freedom, and thus depends on whether the inner model influences the outer model. If the inner model influences the outer model, the number of dependencies among variables will increase.

 

To illustrate what the "10 times rule" means, let us consider a model with three latent variables: A, B and C. Let us assume that A and B point at C; and that A has 5 indicators, B has 3 indicators, and C has 7 indicators.

 

If the inner model is NOT allowed to influence the outer model, the minimum required sample size based on the "10 times rule" is 10x7=70. This is the case with the algorithm "PLS regression" in WarpPLS; the inner model is NOT allowed to influence the outer model.

 

If the inner model is allowed to influence the outer model, the minimum required sample size is 10x(7+2)=90 based on the "10 times rule". This is the case with software tools that conduct PLS-based SEM through one of Lohmöller’s “good neighbor” modes, namely modes A, B and MIMIC.

 

The 7+2 term refers to C, which has 7 indicators and 2 latent variables pointing at it. With “good neighbor” modes, C is calculated based on its 7 indicators AND the scores of the 2 latent variables that point at it.

 

What influences the minimum sample size under the "10 times rule" is not the overall number of latent variables in the model, but the complexity of the “mini-models” revolving around key latent variables in the model.
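The A, B, C example can be sketched as a small calculation over each latent variable's "mini-model". This is one reading of the example, not WarpPLS code: the function name and data layout are hypothetical, and taking the larger of indicators and incoming paths when the inner model does not influence the outer model is an assumption that happens to agree with the 10x7=70 figure.

```python
def ten_times_rule(indicators, paths, inner_influences_outer):
    """Minimum sample size under the "10 times rule" (illustrative reading).

    indicators: dict mapping each latent variable to its number of indicators.
    paths: list of (source, target) pairs for the inner model.
    inner_influences_outer: True for "good neighbor" style algorithms, where a
        latent variable's score depends on its indicators AND on the scores of
        the latent variables pointing at it.
    """
    worst = 0
    for latent, n_indicators in indicators.items():
        n_incoming = sum(1 for _, target in paths if target == latent)
        if inner_influences_outer:
            load = n_indicators + n_incoming      # e.g. 7 + 2 for C
        else:
            load = max(n_indicators, n_incoming)  # assumption: blocks counted separately
        worst = max(worst, load)
    return 10 * worst


indicators = {"A": 5, "B": 3, "C": 7}
paths = [("A", "C"), ("B", "C")]

print(ten_times_rule(indicators, paths, inner_influences_outer=False))  # 10 x 7
print(ten_times_rule(indicators, paths, inner_influences_outer=True))   # 10 x (7 + 2)
```

Adding more latent variables to the dictionary changes the result only if one of them has a heavier mini-model than C, which matches the point that the overall number of latent variables is not what drives the rule.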

 

Ned

 

-----------------------------------------------------------

Ned Kock

http://nedkock.com

 

 
