The goal of this course is to give a broad but detailed introduction
to the key algorithms and modeling techniques used for Natural
Language Processing (NLP) today. With a few exceptions, NLP involves
taking a sequence of words as input (e.g. a sentence) and returning
some annotation(s) for that sequence. Well-known examples of this
include part-of-speech tagging and syntactic parsing. Many other
common tasks, e.g. shallow parsing or named-entity recognition, can be
easily recast as tagging tasks; hence certain basic techniques can be
widely applied within NLP. Applications such as automatic speech
recognition, machine translation, information extraction, and question
answering all make use of NLP techniques. By the end of this course,
you should understand how to approach common natural language problems
arising in these and other applications.
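
To illustrate how a task such as named-entity recognition can be recast
as tagging, here is a minimal sketch using the common BIO scheme, in
which each token is tagged as Beginning an entity, Inside one, or
Outside any. The function name spans_to_bio, the example sentence, and
the PER/LOC labels are assumptions made for illustration; they are not
part of the course materials.

```python
from typing import List, Tuple


def spans_to_bio(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert labeled spans (start, end_exclusive, label) into one BIO tag per token.

    Hypothetical helper for illustration; assumes spans do not overlap.
    """
    tags = ["O"] * len(tokens)           # default: token is outside any entity
    for start, end, label in spans:
        tags[start] = f"B-{label}"       # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"       # remaining tokens of the entity
    return tags


# Made-up example: two entities expressed as token-index spans.
tokens = ["Mary", "Smith", "visited", "New", "York", "yesterday"]
spans = [(0, 2, "PER"), (3, 5, "LOC")]
tags = spans_to_bio(tokens, spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Once entities are encoded this way, any sequence tagger that assigns
one label per word (as in part-of-speech tagging) can in principle be
applied to the task.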
Prerequisites
There is no official programming language for this course, but a fair
amount of programming is required to complete the assignments, so
facility with at least one programming language is assumed.
Grading
10% of your grade will depend on in-class discussion, 50% on the
homework assignments, and 20% each on the midterm and the final.
What we'll cover and an approximate schedule
Roughly speaking, half of the course will be devoted to finite-state
methods, and half to context-free methods (or beyond). Algorithms for
annotating linguistic structure will always be presented with
statistical variants, which provide the basis for disambiguation.