The implementation of Exploratory Factor Analysis in R is very similar to that of Principal Component Analysis. To highlight these similarities, we will use the same libraries (most importantly psych) and the same dataset scope_sem_sub as in the unit on PCA (see Section 27.2 for further details).
Exploratory Factor Analysis (EFA) resembles PCA in that it compresses a high-dimensional feature space. Its core idea, however, is not to capture as much variance as possible with as few variables as possible, but rather to reveal latent (= invisible) variables, i.e., factors.
The computation bears some resemblance to that of PCA, with the main difference being that an observation \(x_m\) is assumed to be generated by combining the factor loadings \(\lambda_{m1}, \lambda_{m2}, \dots, \lambda_{mp}\) with the underlying factors \(\xi_{1}, \xi_{2}, \dots, \xi_{p}\) (see Equation 28.1). Everything on the right-hand side of the equation is unobserved and must be estimated from the data, using procedures such as Principal Axis Factoring or Maximum Likelihood Estimation.
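Written out for a single \(x_m\), the common factor model has the following general shape. This is a sketch in the notation used above; the unique error term \(\varepsilon_m\) is an assumption added here, since the text does not name it.

\[
x_m = \lambda_{m1}\,\xi_{1} + \lambda_{m2}\,\xi_{2} + \dots + \lambda_{mp}\,\xi_{p} + \varepsilon_m
\]

Each loading \(\lambda_{mj}\) weights the contribution of factor \(\xi_j\) to \(x_m\), while \(\varepsilon_m\) collects whatever the \(p\) common factors leave unexplained.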
When retrieving PCA and EFA loadings, several interpretive differences must be kept in mind:
Key differences between EFA and PCA
PCA: PCA weights can be conceptualised as “directions in feature space along which the data vary the most” (James et al. 2021: 503) and are analogous to regression slopes. Features with similar loadings on a given PC will be very close to each other in a biplot and could be understood as correlated with each other.
EFA: The factor loadings in an EFA, on the other hand, directly indicate how strongly a factor is correlated with an observed variable in the dataset. As such, they help identify and interpret the underlying constructs that have given rise to the data. We can think of EFA loadings as regression coefficients and correlation coefficients at the same time (strictly speaking, this double reading holds for standardised variables and uncorrelated factors).
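The double reading of EFA loadings can be illustrated with a small simulation in base R. This is a minimal sketch, not part of the original analysis: the loadings in lambda are hypothetical values, and the data are generated from a single latent factor with standardised variables.

```r
set.seed(42)                      # reproducible simulation
n <- 10000                        # number of simulated observations
xi <- rnorm(n)                    # one latent factor with unit variance
lambda <- c(0.8, 0.6, 0.3)        # hypothetical "true" loadings

# Each observed variable = loading * factor + unique noise,
# scaled so that every variable has (approximately) unit variance
x <- sapply(lambda, function(l) l * xi + sqrt(1 - l^2) * rnorm(n))
colnames(x) <- c("x1", "x2", "x3")

# Reading 1: loadings as correlations between each variable and the factor
r <- drop(cor(x, xi))

# Reading 2: loadings as regression slopes of each variable on the factor
b <- unname(coef(lm(x ~ xi))["xi", ])

round(cbind(true = lambda, correlation = r, slope = b), 2)
```

Up to sampling error, the correlation and slope columns should both recover the values in lambda. With real data the factor is of course not observed, which is why the loadings have to be estimated.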
28.4 Application in R
We use our insights from the PCA, according to which three latent variables are enough to capture the bulk of the variance in the dataset. We fit the EFA model with principal axis factoring (fm = "pa"); setting fm = "ml" performs Maximum Likelihood Estimation instead. Note that fa() itself defaults to the minimum residual method (fm = "minres"), so the factoring method has to be requested explicitly.
efa1 <- fa(scope_sem_sub[, -1], nfactors = 3, rotate = "none", fm = "pa")
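How much the choice of fm matters can be checked by fitting the same model twice. The sketch below assumes that psych is installed and uses the built-in correlation matrix Harman74.cor (24 psychological tests) as a stand-in for scope_sem_sub, which is not reproduced here. The object names fit_pa, fit_ml and comm are illustrative.

```r
library(psych)  # assumed to be installed, as in the rest of this unit

# Defaults of fa() for the rotation and the factoring method
formals(fa)[c("rotate", "fm")]

# Stand-in data: correlation matrix shipped with base R
R <- Harman74.cor$cov
n <- Harman74.cor$n.obs

# Same unrotated three-factor model, two estimation methods
fit_pa <- fa(R, nfactors = 3, n.obs = n, rotate = "none", fm = "pa")
fit_ml <- fa(R, nfactors = 3, n.obs = n, rotate = "none", fm = "ml")

# Communalities: share of each variable's variance explained by the factors
comm <- cbind(pa = fit_pa$communality, ml = fit_ml$communality)
round(head(comm), 2)
```

If the two columns are close, the substantive conclusions are unlikely to hinge on the estimation method.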
The remaining printing and plotting methods are identical to PCA.
plot(efa1, labels = colnames(scope_sem_sub[, -1]), main = NA)
Plot PA scores and loadings:
biplot(efa1, choose = c(1, 2), main = NA, pch = 20, col = c("darkgrey", "blue"))
biplot(efa1, choose = c(2, 3), main = NA, pch = 20, col = c("darkgrey", "blue"))
28.4.1 Rotation
Factors are typically rotated to aid their interpretation, which results in much clearer loading patterns. Varimax rotation is a widely used technique that does not affect the model fit (i.e., there is no loss in explained variance; for details see Mair 2018: 26-29).[1] Because fa() defaults to the oblique rotate = "oblimin", Varimax has to be requested explicitly.
[1] Varimax is a so-called orthogonal rotation technique and, therefore, does not introduce correlations between the factors. If correlated factors are explicitly desired, oblique rotations such as oblimin and promax provide apt alternatives (Mair 2018: 27).
efa2 <- fa(scope_sem_sub[, -1], nfactors = 3, rotate = "Varimax", fm = "pa")
loadings(efa2)
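The claim that an orthogonal rotation leaves the explained variance untouched, whereas an oblique one introduces factor correlations, can be checked numerically. The following sketch deliberately swaps in base R's factanal() (maximum likelihood), varimax() and promax() for psych's fa(), and uses the built-in ability.cov data rather than scope_sem_sub. It illustrates the principle and does not reproduce the model above.

```r
# Unrotated two-factor ML solution on a built-in covariance matrix
fit <- factanal(factors = 2, covmat = ability.cov, rotation = "none")
L <- unclass(loadings(fit))

# Orthogonal (varimax) and oblique (promax) rotations of the same loadings
vm <- varimax(L)
pm <- promax(L)
L_vm <- unclass(vm$loadings)

# Communalities (row sums of squared loadings) are unchanged by varimax
all.equal(rowSums(L^2), rowSums(L_vm^2))

# So is the total variance accounted for by the two factors
all.equal(sum(L^2), sum(L_vm^2))

# Varimax keeps the factors uncorrelated: its rotation matrix is orthogonal
round(crossprod(vm$rotmat), 3)

# Promax does not: the implied factor correlation matrix Phi
# has non-zero off-diagonal entries
T_inv <- solve(pm$rotmat)
Phi <- T_inv %*% t(T_inv)
round(Phi, 3)
```

Rotation only redistributes the explained variance across the factors, which is why individual loadings change while the communalities stay the same.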
The rotated EFA object paints a picture that is very similar to the PCA result from the previous unit.
diagram(efa2, main = NA)
biplot(efa2, choose = c(1, 2), main = NA, pch = 20, col = c("darkgrey", "blue"))
biplot(efa2, choose = c(2, 3), main = NA, pch = 20, col = c("darkgrey", "blue"))
Interpreting the EFA output
Perception: The first principal axis once more shows heavy positive loadings for concreteness scores as well as for visual and haptic ratings. Moreover, these variables display strong linear relationships. The negative association with interoceptive ratings suggests that referents that tend to be perceived directly through the senses (high concreteness) tend not to be perceived inside the body.
Senses: In PA2 we find the inverse pattern of PC2 – very strong positive correlations with sense-related features and a weaker, yet notable negative correlation with selectional preference strength. If a verb has more senses, it tends to carry less information about its context.
Ingestion: Interoceptive ratings are no longer part of the picture, thus giving way to the gustatory and olfactory perception of referents.
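When reading a loading matrix in this way, it can help to list only those variables whose loadings exceed some cut-off on each factor. The helper below is hypothetical (it is not part of psych), the cut-off of 0.4 is a common but arbitrary convention, and the built-in ability.cov data fitted with base R's factanal() serve as a stand-in for the model discussed above.

```r
# Hypothetical helper: for each factor, return the variables whose
# absolute loading reaches the cut-off, sorted from highest to lowest
salient_loadings <- function(L, cutoff = 0.4) {
  L <- unclass(L)  # strip the "loadings" class, keep the plain matrix
  lapply(
    setNames(seq_len(ncol(L)), colnames(L)),
    function(j) {
      v <- L[, j]                  # named vector of loadings on factor j
      v <- v[abs(v) >= cutoff]     # keep salient loadings only
      sort(v, decreasing = TRUE)
    }
  )
}

# Stand-in model: two varimax-rotated factors on a built-in dataset
fit <- factanal(factors = 2, covmat = ability.cov, rotation = "varimax")
salient_loadings(loadings(fit))
```

Variables that appear under the same factor with the same sign are candidates for a shared label, while negative entries point to features that run counter to the construct.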
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2021. An Introduction to Statistical Learning: With Applications in R. New York: Springer. https://doi.org/10.1007/978-1-0716-1418-1.
Levshina, Natalia. 2015. How to Do Linguistics with R: Data Exploration and Statistical Analysis. Amsterdam; Philadelphia: John Benjamins Publishing Company.