Course home for LING 1340/2340
*Class schedule is subject to revision throughout the semester.
W | Date | Due (before class @ 10:45am): #To-do / Homework / Project | Topics | Tools |
1 | 1/8 | [slides] Course introduction, setup | ||
1/10 | #1 | [slides] Data management and version control | ||
1/12 | #2 | [slides] Linguistic datasets | ||
2 | 1/17 (W) | Homework 1: Explore linguistic data | [slides] Processing linguistic data | |
1/19 (F) | #3 | Data processing fundamentals | [slides, JNB] Python's numpy library | |
3 | 1/22 | #4 | [slides, JNB] Data frames with pandas | |
1/24 | #5 | [slides, JNB] More pandas | ||
1/26 | [JNB] Pandas wrap | |||
4 | 1/29 | #6 | Statistics | [JNB] Statistics crash course |
1/31 | - | [JNB] Stats (ctd), visualization | ||
2/2 | Homework 2: Process the ETS corpus (part 1) | [JNB] Stats wrap, visualization | ||
5 | 2/5 | Homework 2 (part 2) | [JNB] HW2 review | |
2/7 | #7 | HW2 review continued | ||
2/9 | #8 (due @9:30am!!) | Open access & data publishing, Data mining | Guest speaker: Dominic Bordelon (Pitt Library) | |
6 | 2/12 | [slides] Corpus data formats, conversion | ||
2/14 | - | Corpora, Annotation | [slides, JNB] Formats, social media and web mining | |
2/16 | #9 | [slides] Web mining, linguistic annotation | ||
7 | 2/19 | #10 | [slides] Annotation continued | |
2/21 | - | [slides] Annotation wrap | ||
2/23 | Machine learning | [JNB1] Regression | ||
8 | 2/26 | #11 | [JNB1, JNB2] Classifiers: count vectors, TF-IDF | |
2/28 | - | [JNB2, JNB3] Naive Bayes, pipelines, categorical data | ||
3/1 | - | [JNB3] SVC, categorical data, cross-validation | ||
9 | 3/4 | Homework 3: Machine Learning with ETS data | ML (ctd) | [slides] GitHub collaboration, cross-validation, feature weights |
3/6 | - | [JNB2, JNB1] HW 3 review | ||
3/8 | #12 | [JNB1, JNB3] HW 3 review | ||
No class: Spring break | ||||
10 | 3/18 | - | ML (ctd) | [JNB3] HW3 wrap: dimensionality reduction, ensemble model |
3/20 | - | Big data at CRC, machine learning (ctd), advanced NLP | [slides] Shell, command line | |
3/22 | - | [slides] Command line tools | ||
11 | 3/25 | #13 | [slides] Supercomputing, running jobs on CRC | |
3/27 | #14 | [slides] Big data wrangling, OnDemand on CRC | ||
3/29 | - | [slides, JNB1, JNB2] Computational efficiency, big data wrangling on CRC, advanced NLP | ||
12 | 4/1 | Homework 4 | [JNB3, JNB4] Clustering & topic modeling; grid search & parallel processing | |
4/3 | #15 | [slides] Text generation with TensorFlow by Ashley Feiler | ||
4/5 | - | Speech & multimedia | [slides] Speech data and corpora | |
13 | 4/8 | [slides] Speech data tools, forced aligner | ||
4/10 | - | [slides, JNB] Montreal Forced Aligner, ASR | ||
4/12 | - | [slides] ELAN for APLS by Maya Asher | ||
14 | 4/15 | RH | ||
4/17 | MA, DA | |||
4/19 | TD, MP | |||
15 | 4/28 (11pm) | Finals week |