Information-Theoretic Approaches to Linguistics
1 Course Information
Lecture times | TF 9:35-11am |
Lecture location | Olson 118 |
Syllabus | http://socsci.uci.edu/~rfutrell/teaching/itl-davis |
2 Instructor Information
Instructor | Richard Futrell (rfutrell@uci.edu) |
Instructor's office hours | T 1:00pm or by appointment |
Office hours location | Olson 105 |
3 Course Description
Information theory is a mathematical framework for analyzing communication systems. This course examines its applications in linguistics, especially corpus linguistics, psycholinguistics, quantitative syntax, and typology. We study natural language as an efficient code for communication. We introduce the information-theoretic model of communication and the concepts of entropy, mutual information, efficiency, and robustness. We cover information-theoretic explanations for language universals in terms of efficient coding, including word length, word frequency distributions, and trade-offs between morphological complexity and word order fixedness. We also discuss information-theoretic models of language production and comprehension, including the principle of Uniform Information Density, expectation-based models, and noisy-channel models.
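To give a concrete taste of the two central quantities, here is a small illustrative sketch (not course material; the word probabilities are made up) computing entropy and mutual information in bits:

```python
import math

# Toy unigram distribution over four "words" (made-up probabilities).
p = {"the": 0.5, "of": 0.25, "cat": 0.125, "mat": 0.125}

# Entropy in bits: H(X) = -sum_x p(x) log2 p(x)
entropy = -sum(px * math.log2(px) for px in p.values())
print(f"H(X) = {entropy} bits")  # 1.75, less than the uniform case log2(4) = 2

# Mutual information for a toy joint distribution in which two binary
# variables are perfectly correlated:
#   I(X;Y) = sum_{x,y} p(x,y) log2 [ p(x,y) / (p(x) p(y)) ]
joint = {(0, 0): 0.5, (1, 1): 0.5}
px = {0: 0.5, 1: 0.5}
py = {0: 0.5, 1: 0.5}
mi = sum(pxy * math.log2(pxy / (px[x] * py[y]))
         for (x, y), pxy in joint.items())
print(f"I(X;Y) = {mi} bits")  # 1.0: observing X fully determines Y
```

The entropy here is lower than the uniform 2 bits because the distribution is skewed toward frequent words, which is the intuition behind efficient-coding accounts of word length.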
4 Course Format
Course time will be spent on lectures, discussions, exercises, and demos. Evaluation will consist of a single 3-page paper presenting a proposed application of information theory to a linguistic problem and a proposed experiment or set of experiments to test the theory. There will be readings before each class, labeled "Discussion Material" in the schedule below. Lectures and in-class discussions will focus on the content from the discussion material. There are also recommended readings. Reading these will greatly increase the value of the course for you.
5 Intended Audience
This course is designed for linguists of all backgrounds. A background in probability theory and computational linguistics will be very helpful but is not required.
Please fill out the introductory survey here so that I know something about your background and experience.
6 Schedule (subject to modification)
Day | Topic | Discussion Material | Recommended Readings |
---|---|---|---|
6/24 | Introduction to information theory | Gleick (2011: Ch. 7) | Pereira (2000), Goldsmith (2007) highly recommended if you do not have a background in probability theory |
6/27 | Efficient Coding and the Lexicon | Piantadosi et al. (2011) | Dye et al. (2018), Liu et al. (2019) |
7/2 | Complexity of Languages | Bentz et al. (2017) | Cotterell et al. (2018), Shannon (1951) |
7/5 | Online Processing | Smith & Levy (2013) | Jaeger (2010), Levy et al. (2009) |
7/9 | Efficiency in Syntax | Futrell et al. (2015) | Jaeger & Tily (2010), Futrell & Levy (2017), Futrell et al. (2019) |
7/12 | Lexical Semantics and the Information Bottleneck | Zaslavsky et al. (2018) | Zaslavsky et al. (2019), Gibson et al. (2017), Sims (2018) |
7/16 | Morphological Complexity | Cotterell et al. (2019) | Koplenig et al. (2017), Ackerman & Malouf (2013) |
7/19 | Learning and Algorithmic Information Theory | Hsu et al. (2013) | Piantadosi & Fedorenko (2017) |
7 Resources
- On information theory
There is a Khan Academy video course on information theory, which is highly recommended.
James Gleick wrote a popular book about information theory, The Information: A History, a Theory, a Flood.
The comprehensive textbook on information theory is Cover & Thomas (2006). Prof. Cover's lectures based on the book are online. If you have a strong math background, this is the book to work through.
A more accessible introduction is given in MacKay (2003).
A short accessible introduction is given in Cherry (1957).
- On probability
If you would like to brush up on probability theory, I recommend watching John Tsitsiklis's lectures.
- On information-theoretic linguistics
There has been a lot of fascinating work, beyond the papers listed in the schedule, applying information theory to the study of human language. Here is a sampler of work that I had to leave out of the main course.
Work on information-theoretic phonology started with Cherry, Halle & Jakobson (1953). More recent work includes Goldsmith & Riggle (2011) and Hall et al. (2016).
There has been a lot of work on information-theoretic models of morphological processing; a good place to start is Milin et al. (2009).
8 Requirements & Grading
- Grade breakdown
Your grade will be determined entirely by your final paper. In the final paper (3 pages, double-spaced), you will elaborate a proposed application of information theory to a linguistic problem and propose an experiment or set of experiments to test the theory.
The final paper is due at the beginning of the last class meeting.