Data Science for Linguists 2019

Course home for LING 1340/2340

Learning Resources by Topic

This is not just a link dump. These resources are carefully curated textbook stand-ins, and you are fully expected to learn from them! There are multiple types:

  1. Online tutorials. Watch, practice and learn. I have pre-screened them and narrowed them down to only the most essential and relevant content, so you can stop wondering whether you should learn the whole thing!
  2. Articles. Read them -- they will be referenced in lectures and used in classroom discussions.
  3. Books and book chapters. Python Data Science Handbook neatly aligns with our data science focus and doubles as a reference book. Parts of the NLTK Book will also be referenced.
  4. Software installation links. Download and install on your machine.
  5. Bookmark pages. These are lists of useful links compiled by someone else, which often contain pointers to data sets or resources. Explore them and use them as needed; you should become familiar with what's on them.
  6. References -- for looking things up.

Linguistic Data, Open Access, Data Publishing

Corpus Linguistics

Linguistic Annotation, Ontology, and Knowledge Engineering

Speech and Multimedia Data

Statistics References

Data Processing Fundamentals: Python’s numpy, pandas, and visualization libraries

Data Mining & Machine Learning

Big Data Essentials

Tools

The resources below focus more on software tools.

Git and GitHub

Markdown

Anaconda and Jupyter Notebook

Command-line, Bash and Unix Tools

Text Editor

The topics below are not among the focus areas of this course, but parts of them will be relevant. They are provided for reference.

Natural Language Processing, NLTK, Computational Linguistics

Python References