Composite Variable

Janne Desir

Aug 4, 2024, 4:00:13 PM

to downparumo

Composite variables are another way, besides latent variables, to represent complex multivariate concepts in structural equation modeling. The most important distinction between the two is that, while latent variables give rise to measurable manifestations of an unobservable concept, composite variables arise from the total combined influence of measured variables.

Here, the arrows are leading into, not out of, $\eta$, indicating that the composite variable is made up of the influences of the three observed variables. Note: in this and other presentations, the composite is denoted by a hexagon, but can sometimes be an oval as it can technically be a form of a latent variable.

In other cases, the property might arise from the collective influence of variables but is not without error. For example, the idea of soil condition arises from different aspects of the soil: its pH, moisture, grain size, and so on. However, one might measure only some of these, and thus there remain other factors (nutrient content, etc.) that might contribute to the notion of soil condition. In this case, the composite would have error and is therefore known as a latent composite.
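The contrast between the three cases can be written out as equations. This is only a notational sketch: apart from $\eta$, the symbols are introduced here for illustration ($x_1, x_2, x_3$ for observed variables, $w_i$ for their weights, $\lambda_i$ for loadings, $\varepsilon_i$ and $\zeta$ for error terms).

$$
\begin{aligned}
\text{latent variable:} \quad & x_i = \lambda_i \eta + \varepsilon_i, \qquad i = 1, 2, 3 \\
\text{composite (no error):} \quad & \eta = w_1 x_1 + w_2 x_2 + w_3 x_3 \\
\text{latent composite:} \quad & \eta = w_1 x_1 + w_2 x_2 + w_3 x_3 + \zeta
\end{aligned}
$$

In the first line the arrows run from $\eta$ to the observed variables; in the other two they run from the observed variables into $\eta$.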

The benefit of such an approach is that complicated constructs can be distilled into discrete blocks that are easier to present and discuss. For example, it is easier to talk about the effects of the experimental treatment on soil condition, rather than the effect of treatment 1 on soil moisture, the effect of treatment 1 on soil pH, the effect of treatment 1 on soil grain size, and so on.

In this way, the composite harkens back to the early meta-model, or broad conceptual relationships that inform the parameterization of the structural path model. In fact, in populating the meta-model, you may wish to consider those broad concepts as composites (or latents) when fitting the model, rather than modeling all relationships among all observed variables.

For the soil example, consider: is it that there is a common difference among soils driving variation in pH, moisture, etc.? Or is it that pH, moisture, etc. are all independent properties that combine to inform soil condition? If the goal is to measure plant growth in potting soils from different manufacturers, then manufacturer might be the common source of variation and a latent variable would be more appropriate. If the observer is visiting different sites and measuring conditions that describe the soil in each place, then perhaps a composite variable is warranted.

Another way of thinking about this is whether the indicators are interchangeable. In other words, does soil pH tell us the same information as soil moisture? If so, then they might be indicators of the same latent phenomenon. If not, and they contain unique information, then they likely combine to form a composite variable.

The way in which the indicators are summed depends on whether they are expected to have the same weight (a fixed composite) or different weights (a statistical composite). The former might be something like species relative abundances. The latter is what we will focus on here because it has the most practical applications in ecology.

Note how the unstandardized coefficient is 1. This is because the composite is in units of the predicted values of the response. Thus, the coefficient is really only interpretable in standardized units.

We see from the output that the estimated loadings for our two indicators are the same values we provided, and consequently the unstandardized coefficient is 1. However, the standardized coefficient is 0.177, and it is this value that we would present (although it is non-significant, given that these are fake data).

It seems both the unsquared and squared values of cover significantly predict richness, so we are justified in including both as indicators of our composite variable. Now we extract the coefficients, use them to generate the factor scores, and finally use those scores to predict richness.
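The three steps just described (fit, build the scores from the coefficients, refit on the scores) can be sketched in plain Python. This is an illustration under stated assumptions, not the original analysis: the data are simulated, and the helper `ols` and all variable names are made up for the example. It shows why the unstandardized coefficient of the composite comes out as 1 and how the standardized coefficient is obtained.

```python
import random
import statistics


def ols(columns, y):
    """Least-squares fit of y on an intercept plus the given predictor columns.

    Returns [intercept, slope_1, slope_2, ...]. Hypothetical helper that
    solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         for i in range(k)]
    b = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    for c in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * ac for a, ac in zip(A[r], A[c])]
                b[r] -= f * b[c]
    return [b[i] / A[i][i] for i in range(k)]


# Simulated (fake) data: richness is a hump-shaped function of cover.
random.seed(1)
cover = [random.random() for _ in range(90)]
cover_sq = [c ** 2 for c in cover]
richness = [10 + 5 * c - 8 * c * c + random.gauss(0, 1) for c in cover]

# Step 1: regress richness on both indicators and extract the coefficients.
b0, b1, b2 = ols([cover, cover_sq], richness)

# Step 2: use the coefficients as weights to build the composite scores.
composite = [b1 * c + b2 * c2 for c, c2 in zip(cover, cover_sq)]

# Step 3: use the composite as a single predictor of richness.
intercept, slope = ols([composite], richness)

# The composite is already in units of predicted richness, so the
# unstandardized slope is 1 (up to rounding); standardize it to interpret it.
std_slope = slope * statistics.stdev(composite) / statistics.stdev(richness)

print(f"unstandardized: {slope:.6f}")
print(f"standardized:   {std_slope:.3f}")
```

Because the weights come from a regression of the response on the indicators, the second fit can only reproduce the first, which is why the standardized value is the one worth reporting.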

For the moment, composites are not directly implemented in piecewiseSEM with special syntax like in lavaan, but we hope to introduce that functionality soon. In the interim, they are easy to compute by hand, as we have shown above: extract the predicted scores and use them as you would any other predictor.

Note that we get the same standardized coefficients as in lavaan! There is, however, some deviation in the goodness-of-fit, which is accounted for by differences in how the composite is constructed and by the correlated errors between the indicators and $firesev$, which cannot yet be modeled in piecewiseSEM.

I am creating a measurement scale and want to do SEM. However, I cannot find how to create a composite variable (factor), that is to say, one that groups the items of a dimension together. Can someone please help me?

The first step for creating a measurement scale, after you've collected your data, is to do an exploratory factor analysis. You can do this in JMP by going to Analyze > Multivariate Methods > Factor Analysis, selecting all your measured variables, and launching the platform. The platform allows you to explore the adequate number of factors (i.e., latent variables) to extract. A good factor solution is one that accounts for a good amount of variance in the data, has factors with a combination of strong and weak standardized factor loadings (known as simple structure), shows acceptable fit, and so on. Once you've found one, you should use an independent dataset to fit a confirmatory factor analysis in SEM, which helps you validate the factorial structure (the pattern of standardized factor loadings). You can find more info on exploratory factor analysis in our documentation.

In sum, the standardized factor loadings of an exploratory factor analysis help you identify which variables group to define the latent variable in SEM. Theory and domain expertise should also guide this process, as you probably have an expectation of which variables are hypothesized to be caused by the same unobserved (latent) variable.

The data frame includes annual employee survey responses from 156 employees to three Job Satisfaction items (JobSat1, JobSat2, JobSat3), three Turnover Intentions items (TurnInt1, TurnInt2, TurnInt3), and four Engagement items (Engage1, Engage2, Engage3, Engage4). Employees responded to each item using a 5-point response format, ranging from Strongly Disagree (1) to Strongly Agree (5). Assume that higher scores on an item indicate higher levels of that variable; for example, a higher score on TurnInt1 would indicate that the respondent has higher intentions of quitting the organization.
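Once the items belonging to each dimension are known, one simple way to build a composite score is to average them per respondent. The sketch below assumes equal weights for every item (a fixed composite) and that each response is a dictionary keyed by the item names listed above; `SCALES`, `composite_scores`, and the example responses are made up for illustration.

```python
from statistics import mean

# Item names are taken from the description above; the grouping assumes
# each item belongs to exactly one scale.
SCALES = {
    "JobSat": ["JobSat1", "JobSat2", "JobSat3"],
    "TurnInt": ["TurnInt1", "TurnInt2", "TurnInt3"],
    "Engage": ["Engage1", "Engage2", "Engage3", "Engage4"],
}


def composite_scores(response):
    """Average the 1-5 item responses within each scale (equal weights)."""
    return {
        scale: mean(response[item] for item in items)
        for scale, items in SCALES.items()
    }


# One made-up employee response.
employee = {
    "JobSat1": 4, "JobSat2": 5, "JobSat3": 3,
    "TurnInt1": 2, "TurnInt2": 1, "TurnInt3": 2,
    "Engage1": 4, "Engage2": 4, "Engage3": 5, "Engage4": 3,
}

scores = composite_scores(employee)
print(scores)
```

Averaging rather than summing keeps every scale on the original 1 to 5 response format, even though Engagement has one more item than the other two.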

I'm defining a variable as a composition of other variables and some text, and I'm trying to keep it from expanding the variables it contains at assignment time. I want it to expand only when it is used later. That way I could reuse the same template to print different results as the inner variables keep changing. I'm trying to avoid eval as much as possible, as I will be receiving some of the inner variables from third parties and do not know what to expect.

My use case, as below, is to have some "calling stack" so I can log all messages with the same format and keep a record of the script, function, and line of the logged message in some format like this: script.sh:this_function:42.

If there is some built-in or other clever way to log this 'call stack', by all means, let me know! I'll gladly wave my code good-bye, but I'd still like to know how to prevent the variables from expanding right away when CURR_STACK is assigned, while still letting them expand later on.

Shell variables store plain inert text(*), not executable code; there isn't really any concept of delayed evaluation here. To make something that does something when used, create a function instead of a variable, and have it print ${BASH_SOURCE[1]}, ${FUNCNAME[1]}, and ${BASH_LINENO[0]} each time it is called.

Note: using BASH_SOURCE[1] and FUNCNAME[1] gets info about the context the function was called from, rather than about the function itself. BASH_LINENO is indexed differently: BASH_LINENO[i] holds the line from which FUNCNAME[i] was called, so BASH_LINENO[0], not BASH_LINENO[1], is what you want.

(* There's an exception to what I said about variables containing only inert text: some variables, like $LINENO and $RANDOM, are handled specially by the shell itself. But you can't create new ones like this except by modifying the shell itself.)

The thing is, the designer of the font above would like the variable font to work well in as many environments as possible. But since he designed it in Glyphs and uses a lot of its automation, it would be safest to produce these fonts in Glyphs as well.

What about adding a custom value to add the MVAR table (like you offered with Write Kern Table)?

I could name a handful of people off the top of my head who are still using High Sierra; some of them are even designers.

The following unwanted font tables were found:
Table: MVAR
Reason: Produces a bug in DirectWrite which causes the following issues: 1492477 (Variable Font linespacing differs when compared to their static equivalents on Windows version) and Issue #2085 in google/fonts on GitHub (Quicksand font: letters cropped).

It is patient data, and the goal is to determine whether patients who identify as transgender/non-binary have a rate of surgical complications lower than, equal to, or higher than patients who identify as cisgender when it comes to hysterectomy.

I want to create a composite variable. Its name will be surgical complications. It will be a column that shows either Yes or No: Yes if a patient had any complication (any single positive value in the intraop, postop, reoperation, or readmission categories), and No otherwise (no positive value in any of the aforementioned categories).
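The rule described above amounts to an "any of these columns is positive" check. The following is a minimal Python sketch of that logic, not a fix for the original program: the column names, the `any_complication` helper, and the sample records are all hypothetical, and it assumes complications are coded as positive numbers. It also makes one explicit choice that should be checked against the study protocol: a missing value (`None`) is treated as "no complication recorded", so the result is always Yes or No and never missing.

```python
# Hypothetical column names standing in for the four complication categories.
COMPLICATION_COLUMNS = ["intraop", "postop", "reoperation", "readmission"]


def any_complication(record, columns=COMPLICATION_COLUMNS):
    """Return "Yes" if any listed column holds a positive value, else "No".

    Assumption: a missing value (None or an absent key) counts as 0.
    """
    has_any = any((record.get(col) or 0) > 0 for col in columns)
    return "Yes" if has_any else "No"


# Made-up patient records for illustration only.
patients = [
    {"id": 1, "intraop": 0, "postop": 0, "reoperation": 0, "readmission": 0},
    {"id": 2, "intraop": 0, "postop": 1, "reoperation": None, "readmission": 0},
    {"id": 3, "intraop": None, "postop": None, "reoperation": None,
     "readmission": None},
]

for patient in patients:
    patient["surgical_complications"] = any_complication(patient)
    print(patient["id"], patient["surgical_complications"])
```

Writing the result as text ("Yes"/"No") means the new column has to be a character column; storing text in a numeric column is one common way to end up with a missing value in its place.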

I have tried if-then statements with or/and, if-then-else statements, and array statements, and I cannot get this to work. I got it close to working, but then it stopped and started just putting a "." in the column, and I could not figure out why. I am at my wits' end and willing to pay someone to help me at this point (if that is allowed).