Statistics for Corpus Linguists

7.5 Clustering

Author: Vladimir Buskin

Affiliation: Catholic University of Eichstätt-Ingolstadt

Recommended reading

James et al. (2021): Chapter 12

Hastie, Tibshirani, and Friedman (2017): Chapters 14.3.6, 14.3.10 & 14.3.12

Preparation

Clustering algorithms

Warning

This page is still under construction. More content will be added soon!

\(k\)-means
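
As a placeholder illustration, the following is a minimal sketch of \(k\)-means using base R's `kmeans()`. The toy matrix, its row and column names, and the choice of two clusters are assumptions made for demonstration only and are not taken from the course materials.

```r
# Hypothetical toy data: six items described by two numeric features
toy <- matrix(
  c(1.0, 1.2, 0.8, 5.0, 5.2, 4.8,   # feature1
    1.1, 0.9, 1.0, 5.1, 4.9, 5.0),  # feature2
  ncol = 2,
  dimnames = list(paste0("item", 1:6), c("feature1", "feature2"))
)

# k-means starts from randomly chosen centroids, so fix the seed
set.seed(123)

# Partition the items into k = 2 clusters (assumed value);
# nstart = 10 reruns the algorithm from ten random starts
# and keeps the solution with the lowest within-cluster sum of squares
km <- kmeans(toy, centers = 2, nstart = 10)

km$cluster       # cluster membership of each item
km$centers       # centroid coordinates
km$tot.withinss  # total within-cluster sum of squares
```

Because the result depends on the starting centroids, it is usually worth setting `nstart` above its default of 1.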

Partitioning around medoids (PAM)
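
A minimal sketch of PAM, assuming the `cluster` package (one of the recommended packages shipped with R) and its `pam()` function. The toy data and the value of `k` are again hypothetical.

```r
library(cluster)  # provides pam()

# Hypothetical toy data: six items described by two numeric features
toy <- matrix(
  c(1.0, 1.2, 0.8, 5.0, 5.2, 4.8,   # feature1
    1.1, 0.9, 1.0, 5.1, 4.9, 5.0),  # feature2
  ncol = 2,
  dimnames = list(paste0("item", 1:6), c("feature1", "feature2"))
)

# Partition the items around k = 2 medoids (assumed value)
pam_fit <- pam(toy, k = 2)

pam_fit$medoids     # the observations that serve as cluster centres
pam_fit$clustering  # cluster membership of each item
```

Unlike \(k\)-means centroids, medoids are actual observations from the data, which is why `pam_fit$medoids` returns rows of `toy`.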

Hierarchical agglomerative clustering
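
A minimal sketch of hierarchical agglomerative clustering with base R's `dist()`, `hclust()` and `cutree()`. The toy data, the Euclidean distance, the average linkage and the cut into two groups are all assumptions chosen for illustration.

```r
# Hypothetical toy data: six items described by two numeric features
toy <- matrix(
  c(1.0, 1.2, 0.8, 5.0, 5.2, 4.8,   # feature1
    1.1, 0.9, 1.0, 5.1, 4.9, 5.0),  # feature2
  ncol = 2,
  dimnames = list(paste0("item", 1:6), c("feature1", "feature2"))
)

# Pairwise distances between items (Euclidean by default)
d <- dist(toy)

# Merge the closest clusters step by step; average linkage is an assumed choice
hc <- hclust(d, method = "average")

# Cut the resulting tree into k = 2 groups (assumed value)
groups <- cutree(hc, k = 2)
groups
```

Calling `plot(hc)` draws the dendrogram, which can help in choosing where to cut the tree.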

Hastie, Trevor, Robert Tibshirani, and Jerome H. Friedman. 2017. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York, NY: Springer.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2021. An Introduction to Statistical Learning: With Applications in R. New York: Springer. https://doi.org/10.1007/978-1-0716-1418-1.