6.5 Ordinal regression

Author

Affiliation

Vladimir Buskin

Catholic University of Eichstätt-Ingolstadt

Abstract

Modelling categorical (ordinal) response variables.

Introduction

In her recent contribution, Glass (2021) examines possible reasons why certain transitive verbs have a stronger affinity towards object omission compared to others, placing special emphasis on the routinisation of the actions denoted by the verbs. Specifically, she assesses how high/low-routine contexts affect the acceptability of object omission for transitive verbs from different frequency bins.

We will replicate her findings using her survey data Glass_2021_survey_processed.csv¹:

¹ The original dataset can be retrieved from Lelia Glass’s OSF repository: https://osf.io/t6zw5 [Last accessed: 27th September, 2024].

# Load libraries
library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(ordinal)


Attaching package: 'ordinal'

The following object is masked from 'package:dplyr':

    slice

library(MASS)


Attaching package: 'MASS'

The following object is masked from 'package:dplyr':

    select

library(sjPlot)
library(effects)

Loading required package: carData
lattice theme set by effectsTheme()
See ?effectsTheme for details.

library(ggeffects)
library(ggpubr)

# For additional tests
library(DescTools)
library(generalhoslem)

Loading required package: reshape

Attaching package: 'reshape'

The following object is masked from 'package:lubridate':

    stamp

The following object is masked from 'package:dplyr':

    rename

The following objects are masked from 'package:tidyr':

    expand, smiths

library(brant)

# Load data
survey <- read.csv("../datasets/Glass_2021_survey_processed.csv")

# Inspect dataset
str(survey)

'data.frame':   784 obs. of  5 variables:
 $ rating       : int  3 4 3 3 2 4 3 2 3 2 ...
 $ freq         : chr  "hi" "lo" "lo" "hi" ...
 $ verb         : chr  "pick" "catch" "throw" "break" ...
 $ ParticipantID: chr  "Participant1" "Participant1" "Participant1" "Participant1" ...
 $ routine      : chr  "hi" "lo" "lo" "lo" ...

head(survey)

  rating freq   verb ParticipantID routine
1      3   hi   pick  Participant1      hi
2      4   lo  catch  Participant1      lo
3      3   lo  throw  Participant1      lo
4      3   hi  break  Participant1      lo
5      2   hi  taste  Participant1      hi
6      4   hi bottle  Participant1      hi

Short breakdown of the variables

routine: In Glass’s study, transitive verbs were randomly assigned to one of the following conditions:

(High routine condition:) I worked at my poultry farm. Just like I always do, I butchered some chickens. Then I gathered some eggs.

(Low-routine condition:) I visited a friend’s job. Just because people wanted me to try it, I butchered some chickens. Then I went for a walk.

Cf. Glass (2021: 66)

unique(survey$routine)

[1] "hi" "lo"

rating records the responses of participants to a follow-up question regarding the acceptability of object omission. The answers are recorded on a 1-5 Likert scale.

The next time Caroline talks about butchering chickens the day before, how likely do you think she is to say the following?

‘I butchered yesterday’

Cf. Glass (2021: 66)

unique(survey$rating)

[1] 3 4 2 5 1

verb contains the items to be rated for the conditions in routine

unique(survey$verb)

[1] "pick"   "catch"  "throw"  "break"  "taste"  "bottle" "sell"   "chop"

frequency relates to the frequency bins of the verbs:

unique(survey$freq)

[1] "hi" "lo"

ParticipantID identifies each of the 98 subjects who provided ratings

Descriptive overview

Show the code

ggplot(survey, aes(x = rating, fill = freq)) +
  geom_density(alpha = 0.7) +
  facet_wrap(~freq) +
  theme_minimal() +
  labs(title = "Density of Ratings by Frequency",
       x = "Rating", y = "Density", fill = "Frequency")

Show the code

ggplot(survey, aes(x = rating, fill = routine)) +
  geom_density(alpha = 0.7) +
  facet_wrap(~routine) +
  theme_minimal() +
  labs(title = "Density of Ratings by Routine",
       x = "Rating", y = "Density", fill = "Routine")

Show the code

library(ggridges)
ggplot(survey, aes(x = rating, y = verb, fill = verb)) +
  geom_density_ridges(scale = 5, rel_min_height = 0.01, alpha = 0.6) +
  theme_ridges() +
  #theme(legend.position = "none") +
  labs(title = "Distribution of Ratings by Verb",
       x = "Rating", y = "Verb")

Picking joint bandwidth of 0.381

Modelling ordinal data

Our task is clear: We need to measure how routine and freq affect the variability in the acceptability ratings, while controlling for repeated measurements for verb and ParticipantID, which impose a hierarchical structure on the dataset.

Formally speaking, we have \(p\) explanatory variables \(X_1, X_2, ..., X_p\) for \(1, ..., p\). The target variable, i.e. our \(Y\), is rating with the ordered, discrete outcomes \(y \in \{1, 2, 3, 4, 5\}\).

The goal is to find a model \(f\) that describes the relationship between \(Y\) and \(X_p\) as accurately as possible and minimises the error term \(\epsilon\):

\[ Y = f(X_1, X_2, ..., X_p) + \epsilon \tag{1}\]

Ordered logistic regression

One family of models that respects the ordered, yet categorical nature of \(Y\) is ordered (or ordinal) logistic regression. Other terms include proportional odds models and cumulative logit/link models.

Recap: Logistic regression

Logistic regression is used to model categorical response variables with two or more levels. For instance, let’s assume our \(Y\) is dichotomous with the following two outcomes:

\[ Y = \begin{cases} \text{yes} \\ \text{no} \end{cases} \tag{2}\]

Using the logistic function, we can estimate the probability of one outcome versus the other given the predictors \(X_p\). Their log-transformed odds ratio (log odds) is equivalent of the all-too-familiar linear model:

\[ \log\left(\frac{P(Y = yes \mid X_1, X_2, ..., X_p)}{1 - P(Y = yes \mid X_1, X_2, ..., X_p)}\right) = \beta_0 + \sum_{i=1}^p \beta_iX_i \tag{3}\]

Core to this approach is the notion of cumulative probabilities. Let \(J\) denote the number of ordered categories in \(Y\). In Glass’s case study, the estimated cumulative probabilities for each ordered outcome (= acceptability rating) would have the forms in Equation 4.

\[ \begin{array}{rcl} P(Y \leq 1) & = & P(Y = 1) \\ P(Y \leq 2) & = & P(Y = 1) + P(Y = 2) \\ P(Y \leq 3) & = & P(Y = 1) + P(Y = 2) + P(Y = 3) \\ & \vdots & \\ P(Y \leq j) & = & P_1 + ... + P_j \end{array} \tag{4}\]

We can now update our logistic regression model to take into account cumulative probabilities for \(j = 1, ..., J-1\).

\[ \log\left(\frac{P(Y \leq j \mid X_1, X_2, ..., X_p)}{1 - P(Y \leq j \mid X_1, X_2, ..., X_p)}\right) = \beta_0 + \sum_{i=1}^p \beta_iX_i \tag{5}\]

The intercepts \(\beta_{0_j}\) serve as cutpoints between the adjacent ordinal categories. For \(J = 5\) categories, there are \(J - 1 = 4\) cutpoints, i.e.,

1|2 for \(P(Y \leq 1)\)
2|3 for \(P(Y \leq 2)\)
3|4 for \(P(Y \leq 3)\)
4|5 for \(P(Y \leq 4)\).

Given a change in predictor values, the slope coefficients \(\beta_pX_p\) indicate how the probability of being in a higher rating category changes (Baguley 2012: 691–2).

We can obtain “regular” probabilities from the cumulative ones by drawing on the equivalence in Equation 6.

\[ P(Y = j) = P(Y \leq j) - P(Y \leq j - 1) \tag{6}\]

For instance, the probability \(P(Y = 3)\) is equivalent to

\[ P(Y = 3) = P(Y \leq 3) - P(Y \leq 2). \tag{7}\]

Assumptions of proportional odds models

The proportional odds assumption stipulates a stable effect of the predictors on the (log) odds of the ordinal outcomes across all possible cutpoints (O’Connell 2006: 29). In case of violation, it is better to rely on partial proportional odds models or multinomial logistic regression instead.

Application in R

There are several R packages that support ordinal logistic regression models. This section provides an overview of some of the more common (as well as well-documented) implementations.

Using `polr()` from the `MASS` library

# Convert to factor and determine ordering
survey$rating <- factor(survey$rating, ordered = TRUE, levels = c("1", "2", "3", "4", "5"))

# Fit polr model
survey.polr <- polr(rating ~ 
                      freq +
                      routine,
                      data = survey)

# Model summary
summary(survey.polr)


Re-fitting to get Hessian

Call:
polr(formula = rating ~ freq + routine, data = survey)

Coefficients:
             Value Std. Error  t value
freqlo    -0.01095     0.1291 -0.08483
routinelo -0.55521     0.1302 -4.26449

Intercepts:
    Value   Std. Error t value
1|2 -0.8704  0.1228    -7.0859
2|3  0.1342  0.1188     1.1290
3|4  1.2528  0.1293     9.6856
4|5  2.8915  0.2003    14.4345

Residual Deviance: 2246.662 
AIC: 2258.662

# R-squared and AIC
PseudoR2(survey.polr, c("Nagelkerke", "AIC"))

  Nagelkerke          AIC 
   0.0244793 2258.6617419

tab_model(survey.polr, show.se = TRUE, show.aic = TRUE, show.dev = TRUE, transform = NULL)


Re-fitting to get Hessian

	rating
Predictors	Log-Odds	std. Error	CI	p
1\|2	-0.87	0.12	-1.11 – -0.63	<0.001
2\|3	0.13	0.12	-0.10 – 0.37	0.259
3\|4	1.25	0.13	1.00 – 1.51	<0.001
4\|5	2.89	0.20	2.50 – 3.28	<0.001
freq [lo]	-0.01	0.13	-0.26 – 0.24	0.932
routine [lo]	-0.56	0.13	-0.81 – -0.30	<0.001
Observations	784
R² Nagelkerke	0.024
Deviance	2246.662
AIC	2258.662

Interpreting the model parameters

Coefficients: The conditions freqlo (low frequency) and routinelo (low-routine context) both have negative values, which means that both of them decrease the probability of obtaining a higher acceptability rating (compared to freqhi and routinehi).
Intercepts: These represent the cutpoints between the ordinal categories, which are necessary for calculating the probabilities of each ordinal category.

Testing assumptions and goodness of fit

Test proportional odds assumption:

brant(survey.polr) # p < 0.05 is a violation of the assumption

-------------------------------------------- 
Test for    X2  df  probability 
-------------------------------------------- 
Omnibus     14.45   6   0.02
freqlo      6.5 3   0.09
routinelo   8.14    3   0.04
-------------------------------------------- 

H0: Parallel Regression Assumption holds

Hosmer-Lemeshow test, which is essentially a \(\chi^2\)-test:

logitgof(survey$rating, # observed
         fitted(survey.polr), # expected
         ord = TRUE) # respect ordering


    Hosmer and Lemeshow test (ordinal model)

data:  survey$rating, fitted(survey.polr)
X-squared = 51.173, df = 35, p-value = 0.03808

The Lipsitz test is an extension of the Hosmer-Lemeshow test. Note that it requires the response to be a factor.

lipsitz.test(survey.polr)


    Lipsitz goodness of fit test for ordinal response models

data:  formula:  rating ~ freq + routine
LR statistic = 20.261, df = 9, p-value = 0.01637

Part of the same family of tests is the Pulkstenis-Robinson test, which also relies on the \(\chi^2\)-distribution:

pulkrob.chisq(survey.polr, catvars = c("freq", "routine"))


    Pulkstenis-Robinson chi-squared test

data:  formula:  rating ~ freq + routine
X-squared = 21.476, df = 9, p-value = 0.0107

Visualisation

With `effects`

# Routine effect plot
plot(Effect(focal.predictors = c("routine"), mod = survey.polr), rug = FALSE, style="stacked")

With `ggeffects` and `ggplot2`

Show the code

# Get the ggeffects data
eff <- ggeffects::ggeffect(survey.polr, "freq")

# Convert to a data frame
plot_data <- as.data.frame(eff)

# Ensure the response.level has the desired levels
plot_data$response.level <- factor(plot_data$response.level, 
                                   levels = c("X1", "X2", "X3", "X4", "X5"),
                                   labels = c("1", "2", "3", "4", "5"))

# Create the plot with confidence intervals
p1 <- ggplot(plot_data, aes(x = x, y = predicted, color = response.level, group = response.level)) +
  geom_line(linewidth = 1) +
  geom_point(size = 3) +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high), width = 0.1, linewidth = 0.5) +
  scale_color_viridis_d() +
  labs(
    x = "Frequency",
    y = "Predicted Probability",
    color = "Rating"
  ) +
  facet_grid(~ response.level) +
  theme_minimal() +
  theme(
    legend.position = "right",
    panel.grid.minor = element_blank(),
    axis.title = element_text(face = "bold"),
    plot.title = element_text(face = "bold"),
    plot.subtitle = element_text(margin = margin(b = 10))
  ) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 1))

p1

Show the code

# Get the ggeffects data for "routine"
eff_routine <- ggeffects::ggeffect(survey.polr, "routine")

# Convert to a data frame
plot_data_routine <- as.data.frame(eff_routine)

# Ensure the response.level has the desired levels for "routine"
plot_data_routine$response.level <- factor(plot_data_routine$response.level, 
                                           levels = c("X1", "X2", "X3", "X4", "X5"),
                                           labels = c("1", "2", "3", "4", "5"))

# Create the second plot for "routine"
p2 <- ggplot(plot_data_routine, aes(x = x, y = predicted, color = response.level, group = response.level)) +
  geom_line(linewidth = 1) +
  geom_point(size = 3) +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high), width = 0.1, linewidth = 0.5) +
  scale_color_viridis_d() +
  labs(
    x = "Routine",
    y = "Predicted Probability",
    color = "Rating"
  ) +
  facet_grid(~ response.level) +
  theme_minimal() +
  theme(
    legend.position = "right",
    panel.grid.minor = element_blank(),
    axis.title = element_text(face = "bold"),
    plot.title = element_text(face = "bold"),
    plot.subtitle = element_text(margin = margin(b = 10))
  ) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 1))

p2

Show the code

# Generate the interaction effects for "freq" and "routine"
eff_interaction <- ggeffect(survey.polr, terms = c("freq", "routine"))

# Convert to a data frame
plot_data_interaction <- as.data.frame(eff_interaction)

# Ensure the response.level has the desired levels
plot_data_interaction$response.level <- factor(plot_data_interaction$response.level, 
                                               levels = c("X1", "X2", "X3", "X4", "X5"),
                                               labels = c("1", "2", "3", "4", "5"))


# Create the interaction plot with facet by 'x' and color by 'response.level'
p_interaction <- ggplot(plot_data_interaction, aes(x = group, y = predicted, color = response.level, group = response.level)) +
  geom_line(linewidth = 1) +
  geom_point(size = 3) +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high), width = 0.1, linewidth = 0.5) +
  scale_color_viridis_d() +
  labs(
    title = "Predicted Probabilities for Interaction of Frequency and Routine",
    x = "Routine",
    y = "Predicted Probability",
    color = "Rating"
  ) +
  facet_wrap(~ x, labeller = labeller(x = c("hi" = "High Frequency", "lo" = "Low Frequency"))) +  # Facet by "freq"
  theme_minimal() +
  theme(
    legend.position = "right",
    panel.grid.minor = element_blank(),
    axis.title = element_text(face = "bold"),
    plot.title = element_text(face = "bold"),
    plot.subtitle = element_text(margin = margin(b = 10))
  ) +
  scale_y_continuous(labels = scales::percent_format(accuracy = 1))

# Display the interaction plot
p_interaction

Using `clm()` from the `ordinal` library

# Convert to factor and determine ordering
survey$rating <- factor(survey$rating, ordered = TRUE, levels = c("1", "2", "3", "4", "5"))

# Fit cumulative link model
clm.1 <- ordinal::clm(rating ~ 
                    freq +
                    routine,
                    data = survey, Hess=TRUE)

# Model summary
summary(clm.1)

tab_model(clm.1, show.se = TRUE, show.aic = TRUE, show.dev = TRUE, transform = NULL)

	rating
Predictors	Log-Odds	std. Error	CI	p
1\|2	-0.87	0.12	-1.11 – -0.63	<0.001
2\|3	0.13	0.12	-0.10 – 0.37	0.259
3\|4	1.25	0.13	1.00 – 1.51	<0.001
4\|5	2.89	0.20	2.50 – 3.28	<0.001
freq [lo]	-0.01	0.13	-0.26 – 0.24	0.932
routine [lo]	-0.56	0.13	-0.81 – -0.30	<0.001
Observations	784
R² Nagelkerke	0.024
AIC	2258.662

Mixed-effects ordinal regression

Recap: Mixed-effects models

If the data is nested according to some grouping factor with \(1, ..., k\) groups, we can let the intercept and/or slopes vary by group. For instance, recall the varying-intercept model:

\[ Y = \alpha_{k} + \beta_1X_{1} + \beta_2X_{2} + ... + \epsilon \qquad \alpha_{k} \sim N(\mu_{\alpha}, \sigma_{\alpha}^2). \] In this case we also speak of random effects.

# Convert to factor and determine ordering
survey$rating <- factor(survey$rating, ordered = TRUE, levels = c("1", "2", "3", "4", "5"))

# Fit mixed model with random intercepts for "verb" and "ParticipantID"
clm.2 <- ordinal::clmm(rating ~ 
                    freq * routine +
                    (1 | verb) +
                    (1 | ParticipantID),
                    data = survey, Hess=TRUE)

# Model summary
summary(clm.2)

Cumulative Link Mixed Model fitted with the Laplace approximation

formula: rating ~ freq * routine + (1 | verb) + (1 | ParticipantID)
data:    survey

 link  threshold nobs logLik  AIC     niter     max.grad cond.H 
 logit flexible  784  -991.65 2001.30 598(2386) 3.81e-04 1.5e+02

Random effects:
 Groups        Name        Variance Std.Dev.
 ParticipantID (Intercept) 2.0628   1.4363  
 verb          (Intercept) 0.9059   0.9518  
Number of groups:  ParticipantID 98,  verb 8 

Coefficients:
                 Estimate Std. Error z value Pr(>|z|)    
freqlo            -0.1567     0.2138  -0.733 0.463437    
routinelo         -0.7473     0.2103  -3.553 0.000381 ***
freqlo:routinelo  -0.1406     0.2981  -0.472 0.637046    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Threshold coefficients:
    Estimate Std. Error z value
1|2  -1.3562     0.4020  -3.374
2|3   0.2062     0.3988   0.517
3|4   1.8329     0.4053   4.523
4|5   3.8639     0.4401   8.780

tab_model(clm.2, show.se = TRUE, show.aic = TRUE, show.dev = TRUE, transform = NULL)

	rating
Predictors	Log-Odds	std. Error	CI	p
1\|2	-1.36	0.40	-2.14 – -0.57	0.001
2\|3	0.21	0.40	-0.58 – 0.99	0.605
3\|4	1.83	0.41	1.04 – 2.63	<0.001
4\|5	3.86	0.44	3.00 – 4.73	<0.001
freq [lo]	-0.16	0.21	-0.58 – 0.26	0.463
routine [lo]	-0.75	0.21	-1.16 – -0.34	<0.001
freq [lo] × routine [lo]	-0.14	0.30	-0.72 – 0.44	0.637
Random Effects
σ²	3.29
τ₀₀ _{ParticipantID}	2.06
τ₀₀ _verb	0.91
ICC	0.47
N _verb	8
N _{ParticipantID}	98
Observations	784
Marginal R² / Conditional R²	0.028 / 0.489
AIC	2001.305

Show the code

# Extract random effects
re_verb <- ranef(clm.2)$verb
re_participant <- ranef(clm.2)$ParticipantID

# Create dataframes for random effects
df_verb <- data.frame(verb = rownames(re_verb), re = re_verb[,1])
df_participant <- data.frame(ParticipantID = rownames(re_participant), re = re_participant[,1])

# Get predictions for an average case
pred_avg <- ggpredict(clm.2, terms = c("freq", "routine"))

# Add random effects to predictions
pred_verb <- crossing(pred_avg, df_verb) %>%
  mutate(predicted = predicted + re)

pred_participant <- crossing(pred_avg, df_participant) %>%
  mutate(predicted = predicted + re)

# Create a horizontal dot plot for random effects of participants
p_caterpillar <- ggplot(df_participant, aes(x = re, y = reorder(ParticipantID, re))) +
  geom_point(size = 3, color = "steelblue3") +  # Dots representing the random effects
  labs(title = "Random Effects for Participants", 
       x = "Random Effect Estimate (log odds)", 
       y = "Participant ID") +
   geom_vline(xintercept = 0, linetype = "dashed", color = "grey20") +
  theme_minimal() +
  theme(panel.grid.major.y = element_blank())  # Removes the gridlines for y-axis

p_caterpillar2 <- ggplot(df_verb, aes(x = re, y = reorder(verb, re))) +
  geom_point(size = 3, color = "steelblue3") +  # Dots representing the random effects
  labs(title = "Random Effects for Verbs", 
       x = "Random Effect Estimate (log odds)", 
       y = "Verb") +
   geom_vline(xintercept = 0, linetype = "dashed", color = "grey20") +
  theme_minimal() +
  theme(panel.grid.major.y = element_blank())  # Removes the gridlines for y-axis

ggarrange(p_caterpillar, p_caterpillar2, ncol = 2, common.legend = TRUE, legend = "right")

Generalised Additive Mixed-effects Models (GAMMs)

Rationale

A core assumption of Generalised Linear Models (GLMs) is a linear relationship between predictor(s) and response. If, however, one is interested in exploring potential non-linear trends without the risk of extreme overfitting, GAMs offer an elegant solution: Instead of relying on the linear sum of model coefficients, GAMs estimate more flexible smooth terms \(f_k\) for \(k = 1, ..., p\). For illustration, Equation 8 shows a linear additive model for a continuous target variable with \(p\) predictors.

\[ Y = \beta_0 + \sum\limits_{k = 1}^p f_k(X_k) \tag{8}\]

Application in R

# Load libraries
library(mgcv)

Loading required package: nlme


Attaching package: 'nlme'

The following object is masked from 'package:dplyr':

    collapse

This is mgcv 1.9-1. For overview type 'help("mgcv-package")'.

library(itsadug)

Loading required package: plotfunctions


Attaching package: 'plotfunctions'

The following object is masked from 'package:ggpubr':

    get_palette

The following object is masked from 'package:ggplot2':

    alpha

Loaded package itsadug 2.4 (see 'help("itsadug")' ).

library(gratia)

# Convert predictors to factors
survey$ParticipantID <- as.factor(survey$ParticipantID)
survey$verb <- as.factor(survey$verb)

# Fit GAMM
gam1 <- bam(as.numeric(rating) ~ # treated as numeric term
              freq + # linear term
              routine + # linear term
              s(ParticipantID, bs = "re") + # smooth term
              s(verb, bs = "re"), # smooth term
              data = survey, 
              family = ocat(R = 5) # number of ordinal categories
            )

# Model summary
summary(gam1)


Family: Ordered Categorical(-1,0.51,2.09,4.06) 
Link function: identity 

Formula:
as.numeric(rating) ~ freq + routine + s(ParticipantID, bs = "re") + 
    s(verb, bs = "re")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   0.3613     0.3880   0.931    0.352    
freqlo       -0.2195     0.1431  -1.534    0.125    
routinelo    -0.7752     0.1428  -5.430 7.79e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                    edf Ref.df      F p-value    
s(ParticipantID) 77.654     97  4.677  <2e-16 ***
s(verb)           6.728      7 25.860  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Deviance explained = 27.3%
fREML =   1631  Scale est. = 1         n = 784

# Extract the intercepts for plotting
thresh <- gratia::theta(gam1) %>% 
  tibble::as_tibble() %>% 
  setNames(c("threshold"))

# Extract predictions for "routine"
routine_pred <- ggpredict(gam1, terms = "routine")

# Plot predictions
routine_pred %>% 
  ggplot(aes(x = x, y = predicted)) +
  geom_point(col = "steelblue") +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high, width = 0), col = "steelblue") +
  geom_hline(data = thresh, aes(yintercept = threshold), linetype = "dashed") +
  theme_minimal()

# Extract random effects for "verb"
verb_pred <- ggpredict(gam1, terms = "verb")

# Plot random effect
verb_pred %>% 
  ggplot(aes(x = x, y = predicted)) +
  geom_point(col = "steelblue") +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high, width = 0), col = "steelblue") +
  geom_hline(data = thresh, aes(yintercept = threshold), linetype = "dashed") +
  theme_minimal()

# Extract random effects for "ParticipantID"
subj_pred <- ggpredict(gam1, terms = "ParticipantID")

# Plot random effect
subj_pred |>
  ggplot(aes(x = x, y = predicted)) +
  geom_point(col = "steelblue") +
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high, width = 0), col = "steelblue") +
  #geom_line() +
  geom_hline(data = thresh, aes(yintercept = threshold), linetype = "dashed") +
  theme_minimal()

References

Baayen, R. Harald, and Maja Linke. 2020. “Generalized Additive Mixed Models.” In A Practical Handbook of Corpus Linguistics, edited by Magali Paquot and Stefan Thomas Gries, 563–91. Cham: Springer.

Baguley, Thomas. 2012. Serious Stats: A Guide to Advanced Statistics for the Behavioral Sciences. Houndmills, Basingstoke: Palgrave Macmillan.

Glass, Lelia. 2021. “English Verbs Can Omit Their Objects When They Describe Routines.” English Language and Linguistics 26 (1): 49–73. https://doi.org/10.1017/S1360674321000022.

Hastie, Trevor, and Robert Tibshirani. 1991. Generalized Additive Models. London: Chapman & Hall.

O’Connell, Ann A. 2006. Logistic Regression Models for Ordinal Response Variables. Vol. 146. Thousand Oaks, Calif.: Sage.

Powers, Daniel A., and Yu Xie. 2008. Statistical Methods for Categorical Data Analysis. 2. ed. Bingley: Emerald.

Wood, Simon N. 2006. Generalized Additive Models: An Introduction with R. Boca Raton: Chapman & Hall/CRC.

6.5 Ordinal regression

Suggested reading

Introduction

Descriptive overview

Modelling ordinal data

Ordered logistic regression

Application in R

Using `polr()` from the `MASS` library

Testing assumptions and goodness of fit

Visualisation

With `effects`

With `ggeffects` and `ggplot2`

Using `clm()` from the `ordinal` library

Mixed-effects ordinal regression

Generalised Additive Mixed-effects Models (GAMMs)

Suggested reading

Rationale

Application in R

References

Suggested reading

Introduction

Descriptive overview

Modelling ordinal data

Ordered logistic regression

Application in R

Using polr() from the MASS library

Testing assumptions and goodness of fit

Visualisation

With effects

With ggeffects and ggplot2

Using clm() from the ordinal library

Mixed-effects ordinal regression

Generalised Additive Mixed-effects Models (GAMMs)

Suggested reading

Rationale

Application in R

References

Using `polr()` from the `MASS` library

With `effects`

With `ggeffects` and `ggplot2`

Using `clm()` from the `ordinal` library