By the end of this lab, you can:
~, ~~)We will use the built-in dataset mtcars (motor trends cars; a classic toy dataset).
dat <- mtcars
# Keep only variables we use (and make names explicit)
dat <- dat[, c("mpg", "qsec", "wt", "hp")]
# Quick look
summary(dat) mpg qsec wt hp
Min. :10.40 Min. :14.50 Min. :1.513 Min. : 52.0
1st Qu.:15.43 1st Qu.:16.89 1st Qu.:2.581 1st Qu.: 96.5
Median :19.20 Median :17.71 Median :3.325 Median :123.0
Mean :20.09 Mean :17.85 Mean :3.217 Mean :146.7
3rd Qu.:22.80 3rd Qu.:18.90 3rd Qu.:3.610 3rd Qu.:180.0
Max. :33.90 Max. :22.90 Max. :5.424 Max. :335.0
Why this dataset? It’s small, clean, and lets us practice the SEM workflow with manifest variables only.
lm()Fit a standard multiple regression predicting miles per gallon (mpg) from:
wt)hp)
Call:
lm(formula = mpg ~ wt + hp, data = dat)
Residuals:
Min 1Q Median 3Q Max
-3.941 -1.600 -0.182 1.050 5.854
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 37.22727 1.59879 23.285 < 2e-16 ***
wt -3.87783 0.63273 -6.129 1.12e-06 ***
hp -0.03177 0.00903 -3.519 0.00145 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.593 on 29 degrees of freedom
Multiple R-squared: 0.8268, Adjusted R-squared: 0.8148
F-statistic: 69.21 on 2 and 29 DF, p-value: 9.109e-12
Questions
wt on mpg?hp on mpg?Keep this as your “reference”: we’ll reproduce the same model in lavaan.
In lavaan, you should write the model as a string, preferably in an object and not within the formula.
mod_sem <- '
mpg ~ wt + hp
'
fit_sem <- sem(mod_sem, data = dat, meanstructure = TRUE)
summary(fit_sem, standardized = TRUE)lavaan 0.6-19 ended normally after 1 iteration
Estimator ML
Optimization method NLMINB
Number of model parameters 4
Number of observations 32
Model Test User Model:
Test statistic 0.000
Degrees of freedom 0
Parameter Estimates:
Standard errors Standard
Information Expected
Information saturated (h1) model Structured
Regressions:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
mpg ~
wt -3.878 0.602 -6.438 0.000 -3.878 -0.630
hp -0.032 0.009 -3.696 0.000 -0.032 -0.361
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.mpg 37.227 1.522 24.459 0.000 37.227 6.276
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.mpg 6.095 1.524 4.000 0.000 6.095 0.173
Tasks
wt and hp. Do they match lm()?mpg (lavaan reports it explicitly when meanstructure = TRUE).mpg.Use the function parameterEstimates() to obtain the estimates. How do you extract standardized estimate?
lhs op rhs est se z pvalue ci.lower ci.upper std.lv std.all
1 mpg ~ wt -3.878 0.602 -6.438 0 -5.058 -2.697 -3.878 -0.630
2 mpg ~ hp -0.032 0.009 -3.696 0 -0.049 -0.015 -0.032 -0.361
3 mpg ~~ mpg 6.095 1.524 4.000 0 3.109 9.082 6.095 0.173
4 wt ~~ wt 0.927 0.000 NA NA 0.927 0.927 0.927 1.000
5 wt ~~ hp 42.812 0.000 NA NA 42.812 42.812 42.812 0.659
6 hp ~~ hp 4553.965 0.000 NA NA 4553.965 4553.965 4553.965 1.000
7 mpg ~1 37.227 1.522 24.459 0 34.244 40.210 37.227 6.276
8 wt ~1 3.217 0.000 NA NA 3.217 3.217 3.217 3.341
9 hp ~1 146.688 0.000 NA NA 146.688 146.688 146.688 2.174
std.nox
1 -0.654
2 -0.005
3 0.173
4 0.927
5 42.812
6 4553.965
7 6.276
8 3.217
9 146.688
op tells you the operator (~ regression, ~~ variance/covariance, ~1 intercept)est is the estimatese is the standard errorz and pvalue are the usual Wald test outputsstd.all is the fully standardized estimateAt this stage, focus on estimates only. We’ll talk about modeling strategy and diagnostics later.
Check-in: You just used lavaan as a “regression engine”. SEM becomes interesting when we connect multiple equations and allow covariances explicitly.
Now predict two outcomes from the same predictors:
mpg ~ wt + hpqsec ~ wt + hpAnd allow:
wt ~~ hp)mpg ~~ qsec)Adding covariances requires additional degrees of freedom!
mod_mv <- '
# regressions (two outcomes)
mpg ~ wt + hp
qsec ~ wt + hp
# covariance among predictors (exogenous variables)
wt ~~ hp
# residual covariance among outcomes (endogenous disturbances)
mpg ~~ qsec
'
fit_mv <- sem(mod_mv, data = dat, meanstructure = TRUE)
summary(fit_mv, standardized = TRUE)lavaan 0.6-19 ended normally after 43 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 14
Number of observations 32
Model Test User Model:
Test statistic 0.000
Degrees of freedom 0
Parameter Estimates:
Standard errors Standard
Information Expected
Information saturated (h1) model Structured
Regressions:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
mpg ~
wt -3.878 0.602 -6.438 0.000 -3.878 -0.630
hp -0.032 0.009 -3.696 0.000 -0.032 -0.361
qsec ~
wt 0.942 0.253 3.720 0.000 0.942 0.516
hp -0.027 0.004 -7.560 0.000 -0.027 -1.048
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
wt ~~
hp 42.812 13.757 3.112 0.002 42.812 0.659
.mpg ~~
.qsec 0.550 0.463 1.187 0.235 0.550 0.215
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.mpg 37.227 1.522 24.459 0.000 37.227 6.276
.qsec 18.826 0.640 29.433 0.000 18.826 10.704
wt 3.217 0.170 18.898 0.000 3.217 3.341
hp 146.688 11.929 12.296 0.000 146.688 2.174
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.mpg 6.095 1.524 4.000 0.000 6.095 0.173
.qsec 1.076 0.269 4.000 0.000 1.076 0.348
wt 0.927 0.232 4.000 0.000 0.927 1.000
hp 4553.965 1138.491 4.000 0.000 4553.965 1.000
Questions
wt ~~ hp a reasonable parameter here?mpg ~~ qsec represent in this model (hint: both have predictors)?mpg with Exercise 2. Do they change? Why/why not?fit_lm_mpg <- lm(mpg ~ wt + hp, data = dat)
fit_lm_qsec <- lm(qsec ~ wt + hp, data = dat)
summary(fit_lm_mpg)$coef Estimate Std. Error t value Pr(>|t|)
(Intercept) 37.22727012 1.59878754 23.284689 2.565459e-20
wt -3.87783074 0.63273349 -6.128695 1.119647e-06
hp -0.03177295 0.00902971 -3.518712 1.451229e-03
Estimate Std. Error t value Pr(>|t|)
(Intercept) 18.82558525 0.671867025 28.019808 1.493575e-22
wt 0.94153237 0.265896975 3.540967 1.368578e-03
hp -0.02730962 0.003794603 -7.196964 6.362107e-08
Separate lm() fits ignore the joint structure (e.g., residual covariance). lavaan makes it explicit.
Test whether the effect of wt is the same for both outcomes (mpg and qsec).
(You already fit it as fit_mv.)
Label the two wt slopes with the same name (e.g., b_wt):
mod_mv_equal <- '
mpg ~ b_wt*wt + hp
qsec ~ b_wt*wt + hp
wt ~~ hp
mpg ~~ qsec
'
fit_mv_equal <- sem(mod_mv_equal, data = dat, meanstructure = TRUE)
summary(fit_mv_equal, standardized = TRUE)lavaan 0.6-19 ended normally after 64 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 14
Number of equality constraints 1
Number of observations 32
Model Test User Model:
Test statistic 35.243
Degrees of freedom 1
P-value (Chi-square) 0.000
Parameter Estimates:
Standard errors Standard
Information Expected
Information saturated (h1) model Structured
Regressions:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
mpg ~
wt (b_wt) 0.524 0.248 2.114 0.035 0.524 0.082
hp -0.073 0.011 -6.762 0.000 -0.073 -0.805
qsec ~
wt (b_wt) 0.524 0.248 2.114 0.035 0.524 0.298
hp -0.023 0.004 -6.378 0.000 -0.023 -0.932
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
wt ~~
hp 42.812 8.643 4.953 0.000 42.812 0.659
.mpg ~~
.qsec -0.416 0.774 -0.537 0.591 -0.416 -0.095
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.mpg 29.136 1.766 16.502 0.000 29.136 4.751
.qsec 19.594 0.645 30.368 0.000 19.594 11.579
wt 3.217 0.170 18.898 0.000 3.217 3.341
hp 146.687 11.929 12.296 0.000 146.687 2.174
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.mpg 16.266 4.066 4.000 0.000 16.266 0.432
.qsec 1.168 0.292 4.000 0.000 1.168 0.408
wt 0.927 0.209 4.440 0.000 0.927 1.000
hp 4553.971 0.081 56034.919 0.000 4553.971 1.000
Chi-Squared Difference Test
Df AIC BIC Chisq Chisq diff RMSEA Df diff Pr(>Chisq)
fit_mv 0 698.87 719.40 0.000
fit_mv_equal 1 732.12 751.17 35.243 35.243 1.0345 1 2.911e-09 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Interpretation questions
We’ll later discuss when χ² difference tests are appropriate, and what changes under robust estimators.
Use lavInspect() to view the sample covariances and the model-implied covariances.
mpg qsec wt hp
mpg 35.189
qsec 4.368 3.093
wt -4.957 -0.296 0.927
hp -310.709 -84.059 42.812 4553.965
mpg qsec wt hp
mpg 35.189
qsec 4.368 3.093
wt -4.957 -0.296 0.927
hp -310.709 -84.059 42.812 4553.965
Questions
Sigma_hat?~, ~~, ~1 (and later =~)