Language Science 159: Language Processing
1 Course Information
Lecture times | Tuesdays & Thursdays 12:30-1:50pm |
Lecture Location | SSL 155 |
Syllabus | https://www.socsci.uci.edu/~rfutrell/teaching/lsci159-f2019 |
Canvas site | https://canvas.eee.uci.edu/courses/20975 |
2 Instructor Information
Instructor | Richard Futrell (rfutrell@uci.edu) |
Instructor's office | SSPB 2215 |
Instructor's office hours | By appointment |
3 Course Description
This course is about human language processing: the mental processes that convert language to meaning and meaning to language. We will cover experimental studies on human language understanding, as well as models and approaches from computational linguistics. Students will learn how to formulate and test precise theories of how language processing works by discussing and evaluating state-of-the-art research papers.
Detailed topics: Bayesian inference and information theory as underlying principles of language processing, speech perception, noisy channel models of sentence understanding, human languages as efficient codes for meaning, (probabilistic) context-free grammars for modeling syntactic structure, working memory effects on sentence processing, language evolution and how constraints on language processing shape languages, distributional vector-space methods for modeling word meanings.
4 Course Format
Class time will be spent on a mixture of lectures and seminar-style discussions about research papers. Homework will consist of (1) short problem sets and (2) short responses to research articles that we will be reading.
Students may bring laptops to class, but laptops must stay closed during lectures and discussions unless we are using them as part of exercises.
5 Intended audience
This course is intended for advanced undergraduates studying language science, cognitive science, computer science, psychology, languages, and related fields. Some background in linguistics, such as Lsci 3, will make the class easier, but we will be reviewing the necessary concepts from linguistics as we go. We will be developing models using some probability theory: some background in probability will make the class and readings easier, but we will introduce/review the necessary concepts early on in the class. We will not do any math beyond high school algebra.
Here is a survey for students beginning the class.
6 Readings
We will have two kinds of readings: background readings and primary-literature readings.
There is no course textbook. You don't need to buy anything for this course. All readings are provided as pdf documents either here or on the Canvas site. Some of the pdf documents are password-protected. You can find the password in the announcements on the Canvas site.
- Background readings. (Optional) These are readings taken from textbooks which provide context and orientation to a problem we are studying. You will not be directly assessed on your knowledge of these background readings, but you will find that reading them makes lectures and the primary-literature readings dramatically more comprehensible.
- Primary-literature readings. (Mandatory) These are research articles taken from scholarly journals, written by scientists for other scientists rather than for a general audience, so you might find them hard to understand at first. For each primary-literature reading, you will complete a short paper response, as described below. One of the skills we will be learning in this class is how to properly read, understand, and evaluate these articles.
The background readings are drawn from these books:
- J&M — Dan Jurafsky & James Martin (2018). Speech and Language Processing, 3rd edition.
- Sedivy — Julie Sedivy (2018). Language in Mind: An Introduction to Psycholinguistics. Oxford University Press.
- Gleick — James Gleick (2011). The Information: A History, a Theory, a Flood. Pantheon Books.
7 Syllabus (subject to modification)
Day | Topic | Background Reading | Primary-literature reading | Deadlines |
---|---|---|---|---|
9/26 | Introduction | | | |
10/1 | Probability and inference I | Intro to Bayes' Rule | | |
10/3 | Probability and inference II | Sedivy 4.3 | | |
10/8 | Probability and inference III | | Gibson et al. (2013) | |
10/10 | Information theory I | Gleick Ch. 7 | | |
10/15 | Information theory II | | | Problem Set 1 due |
10/17 | Information theory III | | Mahowald et al. (2013) | |
10/22 | Information theory IV / Ambiguity I | Sedivy 8-8.2 | | |
10/24 | Ambiguity II | | | |
10/29 | Ambiguity III | | Tanenhaus et al. (1995) | |
10/31 | Parsing I | Sedivy 8.3; J&M 11-11.3 | | |
11/5 | Parsing II | | | Problem Set 2 due |
11/7 | Parsing III | Sedivy 8.4 | | |
11/12 | Prediction I | | | |
11/14 | Prediction II | | Altmann & Kamide (1999) | |
11/19 | Memory I | Sedivy 8.5; J&M 14-14.4 | | |
11/21 | Memory II | | | Problem Set 3 due |
11/26 | Memory III | | Futrell et al. (2015) | |
11/28 | Thanksgiving break | | | |
12/3 | Word meaning I | Sedivy 7-7.1; J&M 6-6.3 | | |
12/5 | Word meaning II | | Caliskan et al. (2017) | Problem Set 4 due |
8 Requirements & Grading
Grade breakdown
Work | Grade percentage |
---|---|
Paper responses | 40% |
Problem sets | 40% |
Participation | 20% |
Description of requirements
- Paper responses. For each primary-literature reading, you will be required to produce a paper response with your reactions and thoughts about the article. The paper response consists of answers to three discussion questions that I will provide for each article. Paper responses should be completed 1 hour before class, so that I can review your responses ahead of the classroom discussion. Discussion questions about a paper will be made available 4 days before the class where we discuss that paper.
- Problem sets. There will be short problem sets for each course unit. These will involve some math, some short answers, and some interpretation of figures. The goal of the problem sets is to give you a deeper understanding of (1) how the theories studied in this class work, and (2) how to interpret and evaluate experiments that test those theories. The problem sets will not require any programming.
Assignment late policy
Assignments (paper responses and problem sets) are due at 5pm on the deadline indicated in the schedule above. Assignments can be turned in up to 7 days late; 10% of your score will be deducted for each 24 hours of lateness (rounded up). For example, if an assignment is worth 80 points, you turn it in 3 days late, and earn a 70 before lateness is taken into account, your score will be (1-0.3)*70=49.
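For anyone who wants to check the arithmetic, here is a minimal sketch of the late policy in Python. The function name `late_adjusted_score` is hypothetical, and returning `None` for work that is more than 7 days late is an assumption: the policy only says that assignments can be turned in up to 7 days late.

```python
import math
from typing import Optional


def late_adjusted_score(raw_score: float, hours_late: float) -> Optional[float]:
    """Apply the 10%-per-24-hours late deduction to a raw score.

    Lateness is rounded up to whole days, as stated in the policy.
    Returns None when the work is more than 7 days late (assumption:
    such work is not accepted).
    """
    if hours_late <= 0:
        return float(raw_score)  # on time: no deduction
    days_late = math.ceil(hours_late / 24)  # each started 24 hours counts as a full day
    if days_late > 7:
        return None  # assumption: outside the 7-day late window
    # Integer percentages avoid floating-point noise in the multiplier.
    return raw_score * (100 - 10 * days_late) / 100


# The example from the policy: a 70 turned in 3 days late.
print(late_adjusted_score(70, 72))  # 49.0
```

Note that the deduction is applied to the score you earned (70), not to the number of points the assignment is worth (80).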
Working together
You may work together on homework, but the final writeups that you turn in must be written by you alone.
Mapping of class score to letter grade
I grade the course on a curve, but I guarantee minimum grades based on these thresholds:
Threshold | Guaranteed minimum grade |
---|---|
>= 90% | A |
>= 80% | B |
>= 70% | C |
>= 60% | D |

So for example a score of 90.0001% guarantees you an A-, but you could end up with a higher grade due to the curve.
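As a rough illustration of how the grade breakdown and the thresholds combine, the following Python sketch computes a weighted class score and looks up the guaranteed minimum letter. The function names are hypothetical. The sketch assumes each component is expressed as a percentage from 0 to 100, and it does not model the curve or plus/minus modifiers. No guaranteed grade is listed for scores below 60%, so the lookup returns `None` in that case.

```python
from typing import Optional


def course_score(paper_responses: float, problem_sets: float, participation: float) -> float:
    """Weighted class score (0-100) using the 40/40/20 grade breakdown."""
    return (40 * paper_responses + 40 * problem_sets + 20 * participation) / 100


def guaranteed_minimum_grade(score_percent: float) -> Optional[str]:
    """Letter guaranteed by the thresholds, before any curve is applied."""
    thresholds = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]
    for cutoff, letter in thresholds:
        if score_percent >= cutoff:
            return letter
    return None  # below 60%: no guaranteed grade is listed


# Hypothetical component scores, for illustration only.
score = course_score(paper_responses=95, problem_sets=85, participation=100)
print(score, guaranteed_minimum_grade(score))  # 92.0 A
```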
9 Academic Integrity
We will be adhering fully to the standards and practices set out in UCI's policy on academic integrity.