Course home for LING 1340/2340
*Class schedule is subject to revision throughout the semester.
| W | Date | Due (before class @ 10:45am): #To-do/Homework, Project | Topics | Tools |
| 1 | 1/8 | [slides] Course introduction, setup | ||
| 1/10 | #1 | [slides] Data management and version control | ||
| 1/12 | #2 | [slides] Linguistic datasets | ||
| 2 | 1/17 (W) | Homework 1: Explore linguistic data | [slides] Processing linguistic data | |
| 1/19 (F) | #3 | Data processing fundamentals | [slides, JNB] Python's numpy library | |
| 3 | 1/22 | #4 | [slides, JNB] Data frames with pandas | |
| 1/24 | #5 | [slides, JNB] More pandas | ||
| 1/26 | [JNB] Pandas wrap | |||
| 4 | 1/29 | #6 | Statistics | [JNB] Statistics crash course |
| 1/31 | - | [JNB] Stats (ctd), visualization | ||
| 2/2 | Homework 2: Process the ETS corpus (part 1) | [JNB] Stats wrap, visualization | ||
| 5 | 2/5 | Homework 2 (part 2) | [JNB] HW2 review | |
| 2/7 | #7 | HW2 review continued | ||
| 2/9 | #8 (due @9:30am!!) | Open access & data publishing, Data mining | Guest speaker: Dominic Bordelon (Pitt Library) | |
| 6 | 2/12 | [slides] Corpus data formats, conversion | ||
| 2/14 | - | Corpora, Annotation | [slides, JNB] Formats, social media and web mining | |
| 2/16 | #9 | [slides] Web mining, linguistic annotation | ||
| 7 | 2/19 | #10 | [slides] Annotation continued | |
| 2/21 | - | [slides] Annotation wrap | ||
| 2/23 | Machine learning | [JNB1] Regression | ||
| 8 | 2/26 | #11 | [JNB1, JNB2] Classifiers: count vectors, TF-IDF | |
| 2/28 | - | [JNB2, JNB3] Naive Bayes, pipelines, categorical data | ||
| 3/1 | - | [JNB3] SVC, categorical data, cross-validation | ||
| 9 | 3/4 | Homework 3: Machine Learning with ETS data | ML (ctd) | [slides] GitHub collaboration, cross-validation, feature weights |
| 3/6 | - | [JNB2, JNB1] HW3 review | ||
| 3/8 | #12 | [JNB1, JNB3] HW3 review | ||
| No class: Spring break | ||||
| 10 | 3/18 | - | ML (ctd) | [JNB3] HW3 wrap: dimensionality reduction, ensemble model |
| 3/20 | - | Big data at CRC, machine learning (ctd), advanced NLP | [slides] Shell, command line | |
| 3/22 | - | [slides] Command line tools | ||
| 11 | 3/25 | #13 | [slides] Supercomputing, running jobs on CRC | |
| 3/27 | #14 | [slides] Big data wrangling, OnDemand on CRC | ||
| 3/29 | - | [slides, JNB1, JNB2] Computational efficiency, big data wrangling on CRC, advanced NLP | ||
| 12 | 4/1 | Homework 4 | [JNB3, JNB4] Clustering & topic modeling; grid search & parallel processing | |
| 4/3 | #15 | [slides] Text generation with TensorFlow by Ashley Feiler | ||
| 4/5 | - | Speech & multimedia | [slides] Speech data and corpora | |
| 13 | 4/8 | [slides] Speech data tools, forced aligner | ||
| 4/10 | - | [slides, JNB] Montreal Forced Aligner, ASR | ||
| 4/12 | - | [slides] ELAN for APLS by Maya Asher | ||
| 14 | 4/15 | RH | ||
| 4/17 | MA, DA | |||
| 4/19 | TD, MP | |||
| 15 | 4/28 (11pm) | Finals week | ||