library(lavaan)

# Two correlated factors with three indicators each
m <- '
F1 =~ y1 + y2 + y3
F2 =~ y4 + y5 + y6
F1 ~~ F2 # factor covariance (oblique)
'
# dat: a data frame with columns y1 to y6
fit <- cfa(m, data = dat) # or sem(m, data = dat)
summary(fit, standardized = TRUE, fit.measures = TRUE)
From items to constructs (measurement-first)
Specify → Identify → Estimate → Evaluate → Revise/Report
Today: the measurement part — CFA (and reliability from CFA).
Two-step mindset: measure first, then relate constructs (SEM deck 05).
By the end of this session you should be able to:
… std.lv)
… lavaan) + interpretation
Important
Factor analysis models assume that a small number of latent dimensions explains the systematic covariance among many observed variables.
Warning
PCA is not a factor analysis. It does not assume the ‘existence’ of any latent trait.
EFA: discover a plausible loading pattern.
CFA: test a specific measurement hypothesis.
CFA is not a fancy EFA: it is a confirmatory claim with theoretical assumptions and meanings.
The general CFA model can be written as:
\[ \begin{aligned} \mathbf{x} &= \mathbf{\Lambda}_x\,\mathbf{\xi} + \mathbf{\delta} \\ \mathbf{y} &= \mathbf{\Lambda}_y\,\mathbf{\eta} + \mathbf{\epsilon} \end{aligned} \]
where \((\mathbf{x})\) and \((\mathbf{y})\) are observed variables, \((\mathbf{\xi})\) and \((\mathbf{\eta})\) are latent factors, and \((\mathbf{\delta})\) and \((\mathbf{\epsilon})\) are errors of measurement.
\[ \begin{aligned} y &= b_0 + b_1 x + \epsilon \\ y_1 &= \tau_1 + \lambda_1\eta + \epsilon_1 \end{aligned} \]
\[ \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} \tau_1 \\ \tau_2 \\ \tau_3 \end{bmatrix} + \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{bmatrix} (\eta_1) + \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \end{bmatrix} \]
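The one-factor measurement equation above can be made concrete with a small simulation. This is a minimal sketch in base R: the intercepts, loadings, and residual variances are made-up illustrative values, and the factor variance is assumed to be 1.

```r
# Simulate three indicators from y_j = tau_j + lambda_j * eta + epsilon_j
# (all parameter values below are hypothetical, chosen for illustration)
set.seed(1)
n      <- 1e5
tau    <- c(1, 2, 3)          # intercepts
lambda <- c(0.9, 0.7, 0.5)    # loadings
theta  <- c(0.4, 0.5, 0.6)    # residual variances

eta <- rnorm(n)                                            # latent factor, variance 1
eps <- sapply(sqrt(theta), function(s) rnorm(n, sd = s))   # n x 3 residuals
Y   <- sweep(outer(eta, lambda) + eps, 2, tau, "+")        # n x 3 indicators
colnames(Y) <- c("y1", "y2", "y3")

# Covariance implied by the model (factor variance = 1)
implied <- tcrossprod(lambda) + diag(theta)

round(cov(Y), 2)     # sample covariance of the simulated indicators
round(implied, 2)    # model-implied covariance
```

The indicators correlate only because they share \(\eta\); with a large sample the two matrices should agree closely.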
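When we draw the loadings \(\mathbf{\Lambda}\) as arrows (\(\Rightarrow\)) from the latent variable to the observed indicators, we assume reflective latent variables: the latent variable is taken to cause variation in its indicators, not the other way around.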
Pragmatic “just a summary” interpretations are not neutral: they imply different measurement models (PCA/EGA/…).
Important
Asserting that a latent variable ‘exists’ does not imply that one and only one entity explains the observations. A multitude, even millions, of underlying factors (e.g., genes, SES, environment, health) can form the latent variable.
For a standard CFA with latent covariance \((\Phi)\) and residual covariance \((\Theta)\):
\[ \Sigma = \Lambda \Phi \Lambda' + \Theta \]
This is why:
Lambda: matrix of loadings

Phi: latent variance-covariance matrix

Theta: residual variance-covariance matrix
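To see how the three matrices combine, here is a minimal base R sketch of \(\Sigma = \Lambda \Phi \Lambda' + \Theta\) for a two-factor model with six indicators. All numeric values are hypothetical and serve only to illustrate the algebra.

```r
# Loadings: simple structure, each indicator loads on exactly one factor
Lambda <- matrix(0, nrow = 6, ncol = 2,
                 dimnames = list(paste0("y", 1:6), c("F1", "F2")))
Lambda[1:3, "F1"] <- c(1.0, 0.8, 0.6)
Lambda[4:6, "F2"] <- c(1.0, 0.9, 0.7)

# Latent variance-covariance matrix (factor variances 1, covariance 0.3)
Phi <- matrix(c(1.0, 0.3,
                0.3, 1.0), nrow = 2, byrow = TRUE)

# Residual variance-covariance matrix (diagonal: uncorrelated residuals)
Theta <- diag(c(0.5, 0.6, 0.7, 0.4, 0.5, 0.6))

# Model-implied covariance matrix of the observed variables
Sigma <- Lambda %*% Phi %*% t(Lambda) + Theta
round(Sigma, 3)
```

Covariances between indicators of the same factor come from the product of their loadings and the factor variance; covariances across factors are carried entirely by the off-diagonal element of \(\Phi\).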

Key design choices:
All have statistical and theoretical consequences.

Core operator:
=~ defines a factor from its indicators
To estimate latent-variable models, you must scale each factor. Two common strategies:
Fix one loading per factor to 1 (marker method).
Fix each factor variance to 1 (std.lv = TRUE).
\[ \Sigma = \Lambda\Phi\Lambda' + \Theta \]
Marker method (fix one loading to 1)

Standardization (fix factor variance to 1)

Note
Scaling changes the metric of unstandardized loadings, not the implied covariance model.
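The note can be checked numerically. The sketch below, in base R with hypothetical values, writes the same one-factor model under both scalings and confirms that the implied covariance matrix is identical.

```r
theta <- diag(c(0.5, 0.6, 0.7))        # residual variances (hypothetical)

# Marker method: first loading fixed to 1, factor variance free
lambda_marker <- c(1.0, 0.8, 0.6)
phi_marker    <- 2.0
Sigma_marker  <- phi_marker * tcrossprod(lambda_marker) + theta

# Standardization: factor variance fixed to 1, all loadings free
# (loadings are rescaled by the factor standard deviation)
lambda_std <- lambda_marker * sqrt(phi_marker)
phi_std    <- 1.0
Sigma_std  <- phi_std * tcrossprod(lambda_std) + theta

all.equal(Sigma_marker, Sigma_std)     # same implied covariance matrix
```

Only the unstandardized loadings and the factor variance change; the product \(\Lambda\Phi\Lambda'\) does not.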
For CFA, common identification rules include:
Necessary but not sufficient:
\[ t \leq \frac{q(q+1)}{2} \]
where \((t)\) is the number of free parameters and \((q)\) the number of observed variables.
Intuition: the number of nonredundant elements in \((S)\) is the maximum number of “equations”; if unknowns exceed equations, identification is impossible.
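As a worked count, consider a three-factor CFA with nine indicators, simple structure, diagonal residual covariance, and factor variances fixed to 1. That configuration is an assumption made here for illustration.

```r
q <- 9                                  # observed variables
n_moments <- q * (q + 1) / 2            # nonredundant elements of S

# Free parameters under the assumed specification
n_loadings      <- 9                    # one loading per indicator
n_resid_vars    <- 9                    # one residual variance per indicator
n_factor_covs   <- 3                    # covariances among three factors
t_free <- n_loadings + n_resid_vars + n_factor_covs

df <- n_moments - t_free
c(moments = n_moments, free = t_free, df = df)
```

With marker scaling the count is the same: six free loadings plus three factor variances replace the nine free loadings. A non-negative result satisfies the counting rule but, as stated above, does not by itself guarantee identification.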
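A sufficient (not necessary) condition (with diagonal \((\Theta)\)), the three-indicator rule: each factor has at least three indicators, and each indicator loads on exactly one factor.
A sufficient (not necessary) condition for models with >1 factor, the two-indicator rule: each factor has at least two indicators, each indicator loads on exactly one factor, \((\Theta)\) is diagonal, and every factor is correlated with at least one other factor.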
… std.lv=TRUE)
Global indices are the same as in deck 03 (χ², CFI/TLI, RMSEA+CI, SRMR).
What becomes CFA-specific is local misfit interpretation:
Warning
Fit ≠ validity. “Improving fit” can destroy the meaning of a factor model if you add substantively implausible parameters.
pe <- parameterEstimates(fit_hs, standardized = TRUE)
pe[pe$op %in% c("=~","~~") & pe$lhs %in% c("visual","textual","speed"),
   c("lhs","op","rhs","est","se","pvalue","std.all")]
lhs op rhs est se pvalue std.all
1 visual =~ x1 0.900 0.081 0 0.772
2 visual =~ x2 0.498 0.077 0 0.424
3 visual =~ x3 0.656 0.074 0 0.581
4 textual =~ x4 0.990 0.057 0 0.852
5 textual =~ x5 1.102 0.063 0 0.855
6 textual =~ x6 0.917 0.054 0 0.838
7 speed =~ x7 0.619 0.070 0 0.570
8 speed =~ x8 0.731 0.066 0 0.723
9 speed =~ x9 0.670 0.065 0 0.665
19 visual ~~ visual 1.000 0.000 NA 1.000
20 textual ~~ textual 1.000 0.000 NA 1.000
21 speed ~~ speed 1.000 0.000 NA 1.000
22 visual ~~ textual 0.459 0.064 0 0.459
23 visual ~~ speed 0.471 0.073 0 0.471
24 textual ~~ speed 0.283 0.069 0 0.283
visual textual speed
0.612 0.885 0.686
visual textual speed
0.371 0.721 0.424
We emphasize ω-family (omega) reliability because it aligns with the CFA model; α is a special (often unrealistic) case.
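To show how omega follows from the CFA parameters, here is a minimal base R sketch for a single factor with uncorrelated residuals, using standardized loadings: \(\omega = (\sum \lambda)^2 / [(\sum \lambda)^2 + \sum (1 - \lambda^2)]\). The helper name `omega_from_std` is hypothetical; the loadings are the standardized `textual` loadings printed in the table above.

```r
# Omega for a unidimensional, congeneric factor from standardized loadings
# (assumes uncorrelated residuals and no cross-loadings)
omega_from_std <- function(lambda) {
  common_var <- sum(lambda)^2          # variance of the sum score due to the factor
  error_var  <- sum(1 - lambda^2)      # summed residual variances
  common_var / (common_var + error_var)
}

lambda_textual <- c(0.852, 0.855, 0.838)   # std.all loadings for x4, x5, x6
round(omega_from_std(lambda_textual), 3)
```

Software may report slightly different values, for example when the denominator uses the observed rather than the model-implied total variance.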
Theory: test scores are affected by specific abilities (e.g., processing speed) that are influenced by an overarching factor (\(g\)).
In practice, you should validate first-order abilities before jumping to a second-order model.
Note
Note that we use sample.cov and sample.nobs arguments instead of data. In SEM you can fit the models from the covariance matrix and ‘ignore’ the raw data.
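Why the covariance matrix is enough can be seen from the maximum likelihood discrepancy function, which depends on the data only through \(S\): \(F_{ML} = \log|\Sigma| + \mathrm{tr}(S\Sigma^{-1}) - \log|S| - q\). Below is a minimal base R sketch; the function name `ml_discrepancy` and the matrices are hypothetical.

```r
# ML discrepancy between a sample covariance matrix S and an implied matrix Sigma
ml_discrepancy <- function(S, Sigma) {
  q <- nrow(S)
  log_det <- function(M) as.numeric(determinant(M, logarithm = TRUE)$modulus)
  log_det(Sigma) + sum(diag(S %*% solve(Sigma))) - log_det(S) - q
}

S <- matrix(c(1.0, 0.5,
              0.5, 1.0), nrow = 2, byrow = TRUE)   # hypothetical sample covariance
Sigma_indep <- diag(2)                              # model implying zero covariance

ml_discrepancy(S, S)            # zero: perfect reproduction of S
ml_discrepancy(S, Sigma_indep)  # positive: the model misses the covariance
```

The test statistic is this value multiplied by a function of the sample size (N or N − 1, depending on the convention), which is why the sample size must be supplied alongside the covariance matrix.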
Parallel theories: test scores are affected by a general factor (\(g\)) and by specific abilities that explain remaining variance.
All factors are set to be orthogonal.
Note
Note that loadings in bifactor models are often constrained to be equal within the same specific factor. This is done to ease convergence, but it should be justified and transparently reported.
Bifactor often improves fit by absorbing residual covariance.
Before interpreting:
A bifactor model that “fits” can still be a bad measurement story.
| Model | ECV.g | PUC | Omega.g | OmegaH.g |
|---|---|---|---|---|
| 1 | 0.351 | 0.822 | 0.768 | 0.513 |
ECV.g = proportion of common variance explained by the general factor.
PUC = proportion of item correlations that involve items from different specific factors and are therefore uncontaminated by specific-factor overlap.
Omega.g = reliability of the observed total score.
OmegaH.g = proportion of total-score variance attributable uniquely to g.
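The definitions above translate into a few lines of base R. In this sketch the group sizes (3, 3, 2, 2) are an assumption that matches the ten items listed in the IECV table and reproduces the reported PUC; the loadings are hypothetical and do not come from the fitted model.

```r
# PUC depends only on how items are distributed over specific factors
group_sizes  <- c(3, 3, 2, 2)                 # assumed items per specific factor
n_items      <- sum(group_sizes)
within_pairs <- sum(choose(group_sizes, 2))   # pairs sharing a specific factor
total_pairs  <- choose(n_items, 2)
puc <- 1 - within_pairs / total_pairs
round(puc, 3)

# Hypothetical standardized bifactor loadings (general and specific)
lam_g <- c(0.45, 0.45, 0.40, 0.45, 0.40, 0.30, 0.40, 0.40, 0.40, 0.40)
lam_s <- c(0.55, 0.60, 0.60, 0.55, 0.60, 0.65, 0.50, 0.50, 0.45, 0.45)
group <- rep(seq_along(group_sizes), times = group_sizes)

# ECV.g: share of common variance due to the general factor
ecv_g <- sum(lam_g^2) / (sum(lam_g^2) + sum(lam_s^2))

# Variance of the total score, split by source (orthogonal factors)
var_g     <- sum(lam_g)^2
var_s     <- sum(tapply(lam_s, group, sum)^2)
var_resid <- sum(1 - lam_g^2 - lam_s^2)
total_var <- var_g + var_s + var_resid

omega_total <- (var_g + var_s) / total_var    # all modeled common variance
omega_h     <- var_g / total_var              # general factor only
round(c(ECV.g = ecv_g, Omega.g = omega_total, OmegaH.g = omega_h), 3)
```

OmegaH.g can never exceed Omega.g; the gap between them is the share of reliable total-score variance carried by the specific factors.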
Warning
Interpretation
These results do not support a strong general factor.
ECV.g ≈ 0.35: g explains only about 35% of common variance; OmegaH.g ≈ 0.51: only about half of total-score variance is uniquely due to g. PUC ≈ 0.82 is fairly high, but high PUC does not rescue weak ECV.g and OmegaH.g.
| | ECV_SS | ECV_GS | Omega | OmegaH | H | FD |
|---|---|---|---|---|---|---|
| VCI | 0.656 | 0.344 | 0.666 | 0.438 | 0.516 | 0.699 |
| PRI | 0.705 | 0.295 | 0.664 | 0.472 | 0.541 | 0.720 |
| WMI | 0.614 | 0.386 | 0.575 | 0.353 | 0.398 | 0.627 |
| PSI | 0.580 | 0.420 | 0.511 | 0.296 | 0.332 | 0.573 |
| g | 0.351 | 0.351 | 0.768 | 0.513 | 0.614 | 0.727 |
ECV_SS = proportion of common variance in a factor’s own items due to that specific factor.
ECV_GS = proportion of common variance in those same items due to the general factor.
Omega = reliability of the observed factor score, counting all modeled common variance.
OmegaH = reliable variance attributable to the factor itself, net of the other factors.
H = construct replicability.
FD = factor determinacy, that is, how well factor scores are defined.
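For reference, construct replicability can be computed from standardized loadings as \(H = \left[1 + \left(\sum_i \lambda_i^2 / (1 - \lambda_i^2)\right)^{-1}\right]^{-1}\). The base R sketch below uses a hypothetical helper name and hypothetical loadings.

```r
# Construct replicability (H) from the standardized loadings of one factor
construct_h <- function(lambda) {
  ratio <- sum(lambda^2 / (1 - lambda^2))   # summed signal-to-noise ratios
  1 / (1 + 1 / ratio)
}

weak_factor   <- c(0.40, 0.35, 0.30)        # hypothetical loadings
strong_factor <- c(0.80, 0.75, 0.70)        # hypothetical loadings

round(c(weak = construct_h(weak_factor),
        strong = construct_h(strong_factor)), 3)
```

Because every indicator adds a non-negative term to the sum, H never decreases when an indicator is added, and it is driven mostly by the strongest loadings.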
Note
Interpretation. For all four group factors, ECV_SS > ECV_GS: within each item cluster, the specific factor is stronger than g. But OmegaH, H, and FD are modest for both the group factors and g: none appears especially strong or highly replicable.
Overall, the structure looks multidimensional, not essentially unidimensional.
| | IECV |
|---|---|
| SI | 0.367 |
| VC | 0.352 |
| CO | 0.309 |
| BD | 0.390 |
| PCn | 0.290 |
| MR | 0.191 |
| DS | 0.383 |
| LN | 0.388 |
| CD | 0.408 |
| SS | 0.432 |
IECV = proportion of an item’s common variance accounted for by the general factor. High IECV means the item behaves mostly like an indicator of g. Low IECV means the item retains substantial non-general variance.
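For an item that loads on the general factor and on one specific factor, this definition is \(\mathrm{IECV}_i = \lambda_{ig}^2 / (\lambda_{ig}^2 + \lambda_{is}^2)\). A minimal base R sketch with hypothetical loadings:

```r
# Item-level explained common variance for a bifactor model in which
# each item loads on g and on exactly one specific factor
iecv <- function(lam_g, lam_s) lam_g^2 / (lam_g^2 + lam_s^2)

# Hypothetical standardized loadings for three items
lam_g <- c(item1 = 0.45, item2 = 0.40, item3 = 0.60)
lam_s <- c(item1 = 0.60, item2 = 0.55, item3 = 0.20)

round(iecv(lam_g, lam_s), 3)
```

In this made-up example the first two items are dominated by their specific factor, while the third behaves mostly as an indicator of g.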
Important
Interpretation
Here, IECV values range from about 0.19 to 0.43.
No item behaves mainly as an indicator of the general factor. The general factor is therefore not dominant at the item level. This reinforces the earlier conclusion: the bifactor model does not justify treating the scale as essentially unidimensional.
Go to:
labs/lab04_cfa_reliability_omegas.qmd
You will practice:
Classic SEM measurement chapters (CFA fundamentals; see course website reading list)
Maybe one day
extras/ex10_bifactor_esem_method-factors.qmd: bifactor/ESEM/method factors (advanced)
extras/ex05_miivs_factor-score-regression.qmd: factor scores & regression (later)
Thanks to Massimiliano Pastore for his slides!
