Language Science 159/259: Language Processing
1 Course Information
Lecture times | Tuesdays & Thursdays 1-2:20pm |
Lecture Location | https://us02web.zoom.us/j/8297737939 |
Syllabus | https://www.socsci.uci.edu/~rfutrell/teaching/lsci159-f2020 |
Canvas site | https://canvas.eee.uci.edu/courses/31401 (159) https://canvas.eee.uci.edu/courses/31402 (259) |
2 Instructor Information
Instructor | Richard Futrell (rfutrell@uci.edu) |
Instructor's office | SSPB 2215 |
Instructor's office hours | T 4pm |
3 Course Description
This course is about human language processing: the processes in the human mind that convert language to meaning and meaning to language. We will cover quantitative computational theories of language understanding, based primarily on information theory, in addition to experimental literature. Students will learn how to formulate and test precise theories of how language processing works by discussing and evaluating state-of-the-art research papers.
Detailed topics: Bayesian inference and information theory as underlying principles of language processing, speech perception, noisy channel models of sentence understanding, human languages as efficient codes for meaning, (probabilistic) context-free grammars for modeling syntactic structure, working memory effects on sentence processing, language evolution and how constraints on language processing shape languages.
4 Course Format
Class time will be spent on a mixture of lectures and seminar-style discussions about research papers. Homework will consist of (1) short problem sets and (2) short responses to research articles that we will be reading.
The class has two sections: Lsci 159 and Lsci 259. Lsci 159 is the undergraduate class. Lsci 259 is the graduate class, which has different assignments.
Instead of problem sets, graduate students taking Lsci 259 will be required to write two review papers, which review primary-literature readings that go beyond the required readings for the class.
5 Intended audience
This course is intended for advanced undergraduates and graduate students studying language science, cognitive science, computer science, psychology, languages, and related fields. Some background in linguistics, such as Lsci 3, will make the class easier, but we will be reviewing the necessary concepts from linguistics as we go.
We will be developing models using some probability theory: some background in probability will make the class and readings easier, but we will introduce/review the necessary concepts early on in the class. We will not do any math beyond high school algebra.
6 Readings
We will have two kinds of readings: background readings and primary-literature readings.
There is no course textbook, and you don't need to buy anything for this course. All readings are provided as PDF documents either here or on the Canvas site. Some of the PDF documents are password-protected; you can find the password in the announcements on the Canvas site.
- Background readings. (Optional) These are readings taken from textbooks which provide context and orientation to a problem we are studying. You will not be directly assessed on your knowledge of these background readings, but you will find that reading them makes lectures and the primary-literature readings dramatically more comprehensible.
- Primary-literature readings. (Mandatory) These are research articles taken from scholarly journals, written by scientists for other scientists rather than for a general audience, so you might find them hard to understand at first. One of the skills we will be learning in this class is how to properly read, understand, and evaluate these articles. For each primary-literature reading, you will be completing short discussion questions, as described below.
The background readings are drawn from these books:
- J&M — Dan Jurafsky & James Martin (2018). Speech and Language Processing, 3rd edition.
- Sedivy — Julie Sedivy (2018). Language in Mind: An Introduction to Psycholinguistics. Oxford University Press.
- Gleick — James Gleick (2011). The Information: A History, a Theory, a Flood. Pantheon Books.
7 Syllabus (subject to modification)
Day | Topic | Background Reading | Primary-literature reading | Deadlines |
---|---|---|---|---|
10/1 | Introduction | | | |
10/6 | Probability and inference | Intro to Bayes' Rule | | |
10/8 | Probability and inference | Sedivy 4.3 | | |
10/13 | Probability and inference | | Gibson et al. (2013) | |
10/15 | Information theory | Gleick Ch. 7 | | |
10/20 | Information theory | | | 159: Problem Set 1 due |
10/22 | Information theory | | Mahowald et al. (2013) | |
10/27 | Information theory / Ambiguity | Sedivy 8-8.2 | | |
10/29 | Ambiguity | | | |
11/3 | Election Day (no class) | | | |
11/5 | Ambiguity | | Tanenhaus et al. (1995) | 259: Review Paper 1 due; 159: Problem Set 2 due |
11/10 | Parsing | Sedivy 8.3; J&M 11-11.3 | | |
11/12 | Parsing | | | |
11/17 | Parsing | | | |
11/19 | Prediction | Sedivy 8.4 | | |
11/24 | Prediction | | Altmann & Kamide (1999) | 159: Problem Set 3 due |
11/26 | Thanksgiving Break (no class) | | | |
12/1 | Memory | Sedivy 8.5; J&M 14-14.4 | | |
12/3 | Memory | | Futrell et al. (2015) | |
12/8 | Memory | | | |
12/10 | Language Evolution | | | 159: Problem Set 4 due; 259: Review Paper 2 due |
8 Requirements & Grading
Grade breakdown
For students taking Lsci 159:
Work | Grade percentage |
---|---|
Discussion Questions | 40% |
Participation | 20% |
Problem sets | 40% |
For graduate students taking Lsci 259:
Work | Grade percentage |
---|---|
Discussion Questions | 20% |
Participation | 20% |
Review papers | 60% |
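
As a rough illustration of how the weights in the grade breakdown combine, here is a minimal Python sketch. It assumes each component is first scored as a percentage from 0 to 100 and that the class score is the weighted sum of those percentages; the syllabus does not spell out that calculation, and the names `WEIGHTS` and `course_score` are hypothetical.

```python
# Component weights (in percent), copied from the grade breakdown.
WEIGHTS = {
    "159": {"discussion_questions": 40, "participation": 20, "problem_sets": 40},
    "259": {"discussion_questions": 20, "participation": 20, "review_papers": 60},
}


def course_score(section, component_percentages):
    """Return the weighted class score (0-100) for one student.

    Assumption: every component is given as a percentage between 0 and 100,
    and the class score is the weighted sum of those percentages.
    """
    weights = WEIGHTS[section]
    if set(component_percentages) != set(weights):
        raise ValueError(f"expected exactly these components: {sorted(weights)}")
    total = sum(weights[name] * component_percentages[name] for name in weights)
    return total / 100


# Hypothetical Lsci 159 student: 90% on discussion questions,
# 100% on participation, 80% on problem sets.
print(course_score("159", {
    "discussion_questions": 90,
    "participation": 100,
    "problem_sets": 80,
}))  # 88.0
```

The sketch covers only the weighting; it does not model the curve or the late policy.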
Description of requirements
- Participation. Synchronous attendance is required on days that we are discussing a Primary-Literature Reading. A full participation score requires that you attend all of these sessions, and participate actively in class.
- Discussion Questions. For each primary-literature reading, you will be required to answer three discussion questions that I will provide, giving your reactions and thoughts about the article. Responses to these questions should be completed 1 hour before class, so that I can review your responses ahead of the classroom discussion. Discussion questions about a paper will be made available 4 days before the class in which we discuss that paper.
- Problem sets. For students taking 159, there will be a short problem set for each course unit. These will involve some math, some short answers, and some interpretation of figures. The goal of the problem sets is to give you a deeper understanding of (1) how the theories studied in this class work, and (2) how to interpret and evaluate experiments that test those theories. The problem sets will not require any programming.
- Review papers. Graduate students taking 259 (and only those students) will be required to write two review papers that critically evaluate primary-literature readings. More information on review papers, including the list of papers you can choose from, is provided separately.
Assignment late policy
Assignments (discussion questions, problem sets, and review papers) are due at 5pm on the deadline indicated in the schedule above. Assignments can be turned in up to 7 days late; 10% of your score will be deducted for each 24 hours of lateness (rounded up). For example, if an assignment is worth 80 points, you turn it in 3 days late, and earn a 70 before lateness is taken into account, your score will be (1-0.3)*70=49.
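
The late-penalty arithmetic above can be sketched in Python as follows. This is only an illustration, not an official grade calculator: the function name `late_adjusted_score` is hypothetical, and raising an error for work submitted more than 7 days late is an assumption, since the policy does not say how such submissions are handled.

```python
import math
from datetime import datetime, timedelta

MAX_DAYS_LATE = 7          # assignments can be turned in up to 7 days late
PENALTY_PERCENT_PER_DAY = 10  # 10% deducted per 24 hours of lateness


def late_adjusted_score(raw_score, deadline, submitted):
    """Apply the late policy to a score earned before lateness is considered."""
    if submitted <= deadline:
        return raw_score
    # Each started 24-hour period counts as a full day (rounded up).
    days_late = math.ceil((submitted - deadline) / timedelta(days=1))
    if days_late > MAX_DAYS_LATE:
        # Assumption: submissions later than this are not accepted.
        raise ValueError("submitted more than 7 days after the deadline")
    return raw_score * (100 - PENALTY_PERCENT_PER_DAY * days_late) / 100


# The worked example from the policy: 70 points earned, 3 days late.
deadline = datetime(2020, 10, 20, 17, 0)  # 5pm on the due date
print(late_adjusted_score(70, deadline, deadline + timedelta(days=3)))  # 49.0
```

Because lateness is rounded up, a submission that is 2 days and 1 hour late is treated the same as one that is exactly 3 days late.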
Working together
You may work together on problem sets, but the final writeups that you turn in must be written by you alone.
Mapping of class score to letter grade
I grade the course on a curve, but I guarantee minimum grades based on these thresholds:
Threshold | Guaranteed minimum grade |
---|---|
>= 90% | A |
>= 80% | B |
>= 70% | C |
>= 60% | D |
So, for example, a score of 90.0001% guarantees you an A-, but you could end up with a higher grade due to the curve.
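
For illustration, the threshold lookup can be written as a short Python sketch. The function name `guaranteed_minimum_band` is hypothetical. The sketch returns only the letter band listed for each threshold; it does not model plus or minus modifiers or the curve, and it returns `None` below 60% because no guarantee is stated for that range.

```python
# (cutoff in percent, guaranteed letter band), highest cutoff first.
THRESHOLDS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def guaranteed_minimum_band(score_percent):
    """Return the guaranteed minimum letter band for a class score, or None."""
    for cutoff, band in THRESHOLDS:
        if score_percent >= cutoff:
            return band
    return None  # no guaranteed minimum is listed below 60%


print(guaranteed_minimum_band(90.0001))  # A
print(guaranteed_minimum_band(79.9))     # C
```

The final grade may be higher than the band returned here, since the course is curved.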
9 Academic Integrity
We will be adhering fully to the standards and practices set out in UCI's policy on academic integrity.