Dalhousie University    [  http://web.cs.dal.ca/~vlado/csci6509/coursecalendar.html  ]
Winter 2012 (Jan4-Apr5)
Faculty of Computer Science
Dalhousie University

CSCI 4152/6509 - Course Calendar

[ Home | Calendar | Project | A0 | A1 | A2 | A3 | A4 | A5 ]
#DateTitle 
1 We Jan 4Course Introduction
Course information: logistics and administrivia, textbook and other main references, evaluation scheme, academic integrity policy, tentative course schedule; Introduction to NLP: what is a natural language and other kinds of languages; Handouts: course home page, tentative schedule, academic integrity policy, A0, coursenotes.
Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] Ch.1
A0 out
2 Fr Jan 6Introduction to NLP
NLP Applications, NLP as a Research Area, short history of NLP, levels of NLP, some reasons why NLP is hard, ambiguities at different levels of NLP, examples of lexical and syntactic ambiguities.
Files: Slides (PDF), Lecture notes (PDF).
 
  Part I: Background
3 Mon Jan 9Words and Morphology
Project handout, project deliverables; ambiguities at all levels of NLP, examples of ambiguities; Part I: Linguistic background, Words and morphology, Elements of morphology: morphemes, stems, affixes, stemming; morphological processes: inflection, derivation, compounding, clitics; Parts-of-speech (POS), POS tagging, open and closed categories, POS tag sets; Closed word categories, determiners (DT), interrogative determiners (WDT), predeterminers (PDT), personal pronouns (PRP), possessive pronouns (PRP$), wh-pronouns (WP) and wh-possessive (WP$), prepositions (IN), particles (RP), possessive ending (POS), modal verbs (MD).
Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] Sec 3.1; 5.1-5.3
 
4 Wed Jan 11Open Word Categories
Infinitive word 'to' (TO), qualifiers (RB), wh-adverbs (WRB), conjunctions (CC), interjections (UH); Open word categories: nouns (NN, NNS, NNP, NNPS), adverbial nouns, adjectives (JJ, JJR, JJS), numbers (CD), verbs (VB, VBP, VBZ, VBG, VBD, VBN); adverbs (RB, RBR, RBS); Remaining POS classes: existential there (EX), foreign words (FW), list items (LS), punctuation; POS tagging examples; Syntax: phrase structure, phrases, clauses, sentences; parsing, parse tree examples.
Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] Ch 12
 
L1 Wed Jan 11Perl Tutorial 1 Files: Slides (PDF). 
5 Fri Jan 13Syntax
Aside: Festival speech generation demo: http://festvox.org/voicedemos.html, Context-Free Grammars (CFG) review, examples. induced grammar, parse trees, derivations, language generated by a CFG, left-most and right-most derivations, ambiguous sentences, bracketed representation of parse trees; Typical phrase structure rules in English: Sentence (S), Noun Phrase (NP), Verb Phrase (VP), Prepositional Phrase (PP), Adjective Phrase (ADJP), Adverbial Phrase (ADVP). Are NLs context-free?
Files: Slides (PDF), Lecture notes (PDF).
A0 due
6 Mon Jan 16Semantics and Pragmatics
Aside: WordNet http://wordnet.princeton.edu, Wiktionary http://en.wiktionary.org. Natural Language Phenomena: agreement, movement, subcategorization; Heads and dependency; head-feature principle, dependency trees, arguments and adjuncts; Elements of semantics: semantic analysis.
Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] Ch 17-17.2 (Representation of meaning).
A1 out
  Part II: Statistical Approach to NLP
7 Wed Jan 18Introduction to Stochastic NLP
Aside: NL Principles in Perl. Lexical semantics: word senses and basic concepts; metonymy, WordNet resource, semantic relations between words, semantic compositionality, semantic roles; Part II: Statistical approach to NLP, logical and plausible reasoning, two paradigms of NLP: logical and plausible, counting words and n-grams, Zipf's law, Elements of information retrieval, basic task definition of ad-hoc retrieval, typical IR system architecture.
Files: Slides (PDF), Lecture notes (PDF). Reading: 18.6 (Idioms and Compositionality), 19-19.3 (Lexical Semantics and WordNet)
 
L2 Wed Jan 11Perl Tutorial 2 Files: Slides (PDF). 
8 Fri Jan 20Vector Space Model
Aside: Lucene. Steps in document and query processing in IR, vector space model, tfidf term weighting formula, cosine similarity measure, term-by-document matrix, reducing the number of dimensions, Latent Semantic Analysis, IR evaluation: precision, recall, and other measures.
Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] 23.1 (Information Retrieval), [MS] Ch.15 (Topics in Information Retrieval).
A2 out
9 Mon Jan 23Text Mining
Aside: Introduction to IR on-line book http://nlp.stanford.edu/IR-book/html/htmledition/irbook.html. Text mining: text mining tasks, Text classification, problem definition, types of text classification, evaluation measures in text classification, micro- and macro-averaging, Evaluation methods for classification: general issues - overfitting and underfitting, methods: 1. training error, 2. train and test, 3. n-fold cross-validation; Parser evaluation: PARSEVAL measures.
Files: Slides (PDF), Lecture notes (PDF). Reading: [MS] Ch. 16 (Text Categorization)
A1 due
10 Wed Jan 25Text Clustering and CNG Classification
Aside: SpamAssassin http://spamassassin.apache.org/. Parser evaluation: PARSEVAL measures, labeled and unlabeled precision and recall, F-measure; Text clustering: task definition, the simple k-means method, hierarchical clustering, divisive and agglomerative clustering; evaluation of clustering: inter-cluster similarity, cluster purity, use of entropy or information gain; CNG -- Common N-Grams classification method. Elements of probability theory.
Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] 14.7 (page 479, Evaluating parsers)
 
L3 Wed Jan 25Perl Tutorial 3 Files: Slides (PDF). 
11 Fri Jan 27Probabilistic Modeling
Aside: Roulette Wheel applet by David Little. Handout: cng-paper.pdf. Generative models, Bayesian inference, Probabilistic modeling: random variables, random configurations, computational tasks in probabilistic modeling, spam detection example, joint distribution model; fully independent model: example, computational tasks.
Files: Slides (PDF), Lecture notes (PDF).
 
12 Mon Jan 30Naive Bayes Model
Sum-product formula; pros and cons of fully independent model; Naive Bayes model: basic idea, assumption, computational tasks, graphical representation, example, number of paramenters, pros and cons.
Files: Slides (PDF), Lecture notes (PDF).
A2 due
13 Wed Feb 1N-gram Model
N-gram model, N-gram model, language modeling in speech recognition, n-gram model assumption, graphical representation, use of log probabilities; Markov chain: stochastic process, Markov process, Markov chain; Perplexity and evaluation of N-gram models, Text classification using language models; Add-one (Laplace) smoothing.
Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] Ch. 4
 
  Fri Feb 3Munro Day, University closed, no class  
14 Mon Feb 6Hidden Markov Model
Witten-Bell smoothing (Witten-Bell discounting); Hidden Markov Model (HMM): DFA-based representation, definition, graphical representation, HMM assumption, POS tagging example, learning parameters.
Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] Sec. 5.5 (HMM POS Tagging)
A3 out
15 Wed Feb 8Viterbi Algorithm
HMM Tagging example: brute-force approach, Viterbi algorithm. Bayesian Networks: graphical representation, Bayesian networks assumption, conditional probability tables.
Files: Slides (PDF), Lecture notes (PDF).
P0 due
L4 Wed Feb 8Guest Lecture
Yannick Marchand: From Speech Synthesis by Analogy to Computational Simulations of human reading.
 
16 Fri Feb 10Inference in Bayesian Networks
Bayesian Networks defininition (review); Bayesian Networks: number of parameters, representational power, computational tasks: evaluation and simulation; Inference in Bayesian Networks, NP-hard if max. number of parents is 2.
Files: Slides (PDF), Lecture notes (PDF).
 
17 Mon Feb 13Sum-Product Algorithms
BN inference example using brute force, Sum-product algorithm: main steps, factor graph and message passing, marginalization in one variable, "hard-wiring" variables, marginalization with several variables, conditioning and competion; burglar-earthquake example.
Files: Slides (PDF), Lecture notes (PDF).
 
18 Wed Feb 15Project Discussion (P0 discussion), Sum-Product Algorithm Example 2
Individual discussion with students about P0 submissions. Sum-Product completion HMM example.
Files: Slides (PDF), Lecture notes (PDF).
 
L5 Wed Feb 15Discussion about Assignment 2  
19 Fri Feb 17Probabilistic Context-Free Grammars
HMM as a Bayesian Network, example; Probabilistic Contest-Free Grammar (PCFG): motivation, PCFG as a probabilistic model, computational tasks for PCFG model: evaluation, learning, simulation, proper PCFG.
Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] Chapters 13 and 14.
A3 due
  Mon Feb 20Study break Mon-Fri, Feb 20-24  
20 Mon Feb 27CYK Parsing
Discussion about P1. Chomsky normal form, CYK algorithm by example.
Files: Slides (PDF), Lecture notes (PDF).
 
21 Wed Feb 29Probabilistic CYK Parsing
CFG parsing using CYK, PCFG Marginalization using CYK-style algorithm, example; PCFG conditioning; PCFG completion using CYK-style algorithm, example; Topics related to PCFGs: PCFG as a BN; Issues with PCFGs: structural dependencies, lexical dependencies; Probabilistic lexicalized CFGs.
Files: Slides (PDF), Lecture notes (PDF).
P1 due
  Part III: Unification-based approach to NLP
22 Fri Mar 2First-order Predicate Logic and Resolution
Unification-based approach to NLP: bits of history; First-order predicate logic: constants, variables, functions, terms, predicates, formulae, sentences, axioms, theorems, inference rules; examples, Resolution-based inference system by Robinson.
Files: Slides (PDF), Lecture notes (PDF).
A4 out
23 Mon Mar 5Unification
Resolution inference example, substitution, unifiers and unifiability, composition of substitutions, most general unifier, Robinson's unification algorithm
Files: Slides (PDF), Lecture notes (PDF).
 
24 Wed Mar 7Efficient Unification, Prolog
Exponential running time of the Robinson's algorithm, Unification using graph representatino, Huet's unification algorithm, example. Prolog overview: Horn clauses, rules and facts, other Prolog elements.
Files: Slides (PDF), Lecture notes (PDF).
 
L6 Wed Mar 7Prolog Tutorial 1 Files: Prolog - Quick overview (PDF), Prolog slides (PDF). 
25 Fri Mar 9DCG, Feature Structures
Using Prolog to parse natural language, Definite Clause Grammars (DCG), DCG example with parse tree, handling agreement, embedded code, expressing PCFG; Unification-based grammars using feature structures, feature structures or attribute-value matrices, DCG expressed using AVMs, re-entrancy and cyclic AVMs.
Files: Slides (PDF), Lecture notes (PDF). Reading: [JM] chapter 15
 
  Mon Mar 12No class. Make-up class to be scheduled in a Wednesday lab time  
26 Wed Mar 14Feature Structure Unification and Chart Parsing
Lists in AVMs, PATR-II notation style; Feature structure unification: graph representation, example, Huet's unification algorithm; Parsing unification-based grammars: basic notions in chart parsing, top-down chart parsing (Earley's algorithm)
Files: Slides (PDF), Lecture notes (PDF).
A4 due
L7 Wed Mar 14Prolog Tutorial 2  
27 Fri Mar 16Unification-based Grammar Example
Bottom-up chart parsing; Example of a unification-based grammar.
Files: Slides (PDF), Lecture notes (PDF).
 
28 Mon Mar 19 Course Evaluation, Type Hierarchies.
Type hierarchies: trivial hierarchy, tree hierarchy, bounded complete partial order, type unification in BCPO; Implementing type hierarchies: unpacked-bit representation, packed bit representation.
Files: Slides (PDF), Lecture notes (PDF).
 
  Part IV: Course review
29 Wed Mar 21Course Review
Review of lectures 1 to 21.
Files: Slides (PDF), Lecture notes (PDF).
A5 out
  Part V: Student Presentations
30 Fri Mar 23Student presentations (19-jewkes), Course Review 2 Files: Slides (PDF), Lecture notes (PDF). 
31 Mon Mar 26Student presentations (13-shali,14-mosquera,15-jyoti)  
32 Wed Mar 28Student presentations (10-jankowsk,11-atwater,12-story)  
33 Wed Mar 28(1:30pm) Student presentations (16-vsingh,17-atkinson,18-rouleau)  
34 Fri Mar 30Student presentations (7-patra,8-raza,9-zai) A5 due
35 Mon Apr 2 Student presentations (4-sajadi,5-niri,6-mukanova)  
36 Wed Apr 4 Student presentations (1-mehta,2-ashah,3-soni)  
  Thu Apr 5        Project Reports due Report due
  Th Apr 12Final Exam (15:30-17:30, Dalplex) (Exam schedule) Exam

© 2002-2012 Vlado Keselj, last update: 12-Apr-2012