I think it also comes down to interpretability. The log-transformed variables are less skewed, but a model based on them has to be interpreted in the log metric, which is hard to make sense of. If the metric of your data is not that relevant to you, you could use the log variables.
Also, the normality assumption applies to the residuals, not to the variables themselves. That is why most of these models are "robust" to its violation: it doesn't ruin the model, but you might get larger posterior standard deviations.
Since SEM models seek to reproduce the correlations among the variables, it's good to note that skewness will attenuate these correlations. But I think the skewness would need to be higher to make a "substantial" difference
(it's hard to say what the effect is right now).
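To get a feel for how much skewness can attenuate a correlation, here is a rough simulation sketch. It is not based on your data: the function name `simulate_attenuation`, the true correlation of 0.5, and the two `sigma` values are all assumptions picked for illustration. It draws two correlated normal variables, exponentiates them to make them right-skewed (larger `sigma` means more skew), and compares the Pearson correlation before and after.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_attenuation(rho, sigma, n=100_000, seed=1):
    """Return (r_symmetric, r_skewed) for simulated data.

    rho and sigma are illustrative assumptions, not estimates from real data.
    """
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Two normal variables with correlation rho and standard deviation sigma.
        x.append(sigma * z1)
        y.append(sigma * (rho * z1 + math.sqrt(1 - rho ** 2) * z2))
    r_symmetric = pearson(x, y)
    # Exponentiating makes both variables right-skewed (log-normal).
    r_skewed = pearson([math.exp(v) for v in x], [math.exp(v) for v in y])
    return r_symmetric, r_skewed


for label, sigma in [("mild skew", 0.5), ("strong skew", 1.0)]:
    r_sym, r_skew = simulate_attenuation(rho=0.5, sigma=sigma)
    print(f"{label}: symmetric r = {r_sym:.3f}, skewed r = {r_skew:.3f}")
```

In this toy setup the drop in the correlation is small when the skew is mild and clearly larger when the skew is strong, which matches the intuition above. How it plays out for your variables depends on how skewed they actually are.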
Some recommendations for different scenarios:
- (a) If you keep treating it as categorical data, consider combining some of the response options in the tails, so that each category has more subjects and there are fewer categories.
- (b) If you run them as continuous and the metric of the variables is important, use the raw data, and be aware that some correlations may be attenuated.
- (c) If you run them as continuous and the metric of the data is not important, you can use the log variables, but be aware that interpretation will be harder and your results will generalize only to the log-scale variables.
- You could also run both (a) and (b); if the factor correlations etc. are not very different, that could be an indication that treating them as continuous does not make a "big" difference.
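For option (a), combining response options in the tails could be done along these lines. This is only a sketch: `collapse_tails`, the `min_count` threshold of 10, and the example response counts are made up for illustration, and the right cutoff is a judgment call for your data.

```python
from collections import Counter


def collapse_tails(responses, min_count):
    """Merge sparse extreme categories into their inner neighbour.

    Returns the recoded responses and the old-code -> new-code mapping.
    Hypothetical helper; min_count is an arbitrary threshold.
    """
    counts = Counter(responses)
    # Each group holds the original codes that will share one new code.
    groups = [[code] for code in sorted(counts)]

    def size(group):
        return sum(counts[code] for code in group)

    # Merge the lowest category upward while it has too few subjects.
    while len(groups) > 2 and size(groups[0]) < min_count:
        groups[1] = groups[0] + groups[1]
        del groups[0]
    # Merge the highest category downward while it has too few subjects.
    while len(groups) > 2 and size(groups[-1]) < min_count:
        groups[-2] = groups[-2] + groups[-1]
        del groups[-1]

    # Recode the surviving groups as consecutive integers starting at 1.
    mapping = {code: i + 1 for i, group in enumerate(groups) for code in group}
    return [mapping[r] for r in responses], mapping


# Made-up right-skewed item with 7 response options.
example = [1] * 40 + [2] * 30 + [3] * 15 + [4] * 8 + [5] * 4 + [6] * 2 + [7] * 1
recoded, mapping = collapse_tails(example, min_count=10)
print(mapping)
print(Counter(recoded))
```

The recoded item can then be declared as ordered categorical in the model in the usual way. Note that this only touches the tails; a sparse category in the middle of the scale would be left alone.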
Hope this helps you make a decision.