Deep learning for natural language processing: Schedule

Lecture 1

  • Deep learning

  • What makes NLP different from other ML problems?

  • Word embeddings

  • Language modeling (download notes)

  • N-gram models (a short counting sketch follows this list)
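
For the language modeling and n-gram bullets above, here is a minimal count-based bigram model with add-one (Laplace) smoothing; the toy corpus and function names are my own illustration, not code from the slides:

    from collections import Counter

    # Toy corpus; in practice this would be a large tokenized text.
    corpus = [["the", "cat", "sat"], ["the", "cat", "ran"], ["the", "dog", "sat"]]

    # Count unigrams and bigrams, padding each sentence with boundary symbols.
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    vocab_size = len(unigrams)

    def bigram_prob(prev, word):
        """P(word | prev) with add-one smoothing."""
        return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

    def sentence_prob(sent):
        """Probability of a sentence as a product of bigram probabilities."""
        tokens = ["<s>"] + sent + ["</s>"]
        p = 1.0
        for prev, word in zip(tokens, tokens[1:]):
            p *= bigram_prob(prev, word)
        return p

    print(sentence_prob(["the", "dog", "ran"]))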

Download slides

Readings to go further:

Lecture 2

  • Recurrent neural networks

  • Application to language modeling, machine translation, sentence classification, word tagging

  • (Soft) memory write and erase (see the LSTM cell sketch after this list)

  • Long short-term memory (LSTM)

  • Bidirectional RNNs, multi-layer and multi-stack architectures
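
To make the "(soft) memory write and erase" idea concrete, here is a small NumPy sketch of a single LSTM cell step; the stacked-gate parameter layout and names are an illustrative convention of mine, not the lecture's notation:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x, h_prev, c_prev, W, U, b):
        """One LSTM time step.
        x: input (d_in,); h_prev, c_prev: previous states (d_h,);
        W, U, b: parameters of the 4 gates stacked along the first axis."""
        d_h = h_prev.shape[0]
        z = W @ x + U @ h_prev + b
        i = sigmoid(z[0*d_h:1*d_h])   # input gate: how much to write
        f = sigmoid(z[1*d_h:2*d_h])   # forget gate: how much to erase
        o = sigmoid(z[2*d_h:3*d_h])   # output gate
        g = np.tanh(z[3*d_h:4*d_h])   # candidate memory content
        c = f * c_prev + i * g        # soft erase (f) and soft write (i)
        h = o * np.tanh(c)            # exposed hidden state
        return h, c

    # Run a random 7-step sequence through the cell.
    rng = np.random.default_rng(0)
    d_in, d_h = 5, 3
    W = rng.normal(size=(4 * d_h, d_in))
    U = rng.normal(size=(4 * d_h, d_h))
    b = np.zeros(4 * d_h)
    h, c = np.zeros(d_h), np.zeros(d_h)
    for x in rng.normal(size=(7, d_in)):
        h, c = lstm_step(x, h, c, W, U, b)
    print(h)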

Download slides

Readings to go further:

Blog posts:

Lecture 3

  • Sequence to sequence with attention

  • Interpretability of attention

  • Formal definition of attention and extensions (see the sketch after this list)

  • Transformers
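
The attention bullets can be summarized by the scaled dot-product formulation used in Transformers, softmax(Q K^T / sqrt(d_k)) V; the NumPy sketch below is a generic illustration of that formula rather than code from the slides:

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)   # for numerical stability
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def scaled_dot_product_attention(Q, K, V):
        """Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v)."""
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)     # similarity of each query to each key
        weights = softmax(scores, axis=-1)  # attention distribution over the keys
        return weights @ V, weights         # weighted sum of values, plus weights

    # Toy example: 2 queries attending over 4 key/value pairs.
    rng = np.random.default_rng(0)
    Q, K, V = rng.normal(size=(2, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 16))
    out, weights = scaled_dot_product_attention(Q, K, V)
    print(out.shape, weights.shape)  # (2, 16) (2, 4)

The returned weights are what the "interpretability of attention" bullet refers to: they can be read as a soft alignment between queries and keys.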

Download slides

Readings to go further:

The transformer architecture is based on “tricks” that were proposed in these papers:

Advanced topics:

Blog post:

Lecture 4

  • Word representations, transfer learning

  • Distributional semantics

  • word2vec: skip-grams, CBOW

  • ELMo, BERT

  • Segmentation, byte-pair encoding (BPE); a merge-loop sketch follows this list
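
For the BPE bullet, here is a small sketch of the core learning loop (repeatedly count adjacent symbol pairs and merge the most frequent one), using the classic low/lower/newest/widest toy vocabulary:

    from collections import Counter

    # Word frequencies, with words split into symbols; "</w>" marks the word end.
    vocab = {("l", "o", "w", "</w>"): 5,
             ("l", "o", "w", "e", "r", "</w>"): 2,
             ("n", "e", "w", "e", "s", "t", "</w>"): 6,
             ("w", "i", "d", "e", "s", "t", "</w>"): 3}

    def most_frequent_pair(vocab):
        pairs = Counter()
        for word, freq in vocab.items():
            for pair in zip(word, word[1:]):
                pairs[pair] += freq
        return pairs.most_common(1)[0][0]

    def merge_pair(pair, vocab):
        merged = {}
        for word, freq in vocab.items():
            new_word, i = [], 0
            while i < len(word):
                if i < len(word) - 1 and (word[i], word[i + 1]) == pair:
                    new_word.append(word[i] + word[i + 1])
                    i += 2
                else:
                    new_word.append(word[i])
                    i += 1
            merged[tuple(new_word)] = freq
        return merged

    # Each learned merge becomes a new subword unit in the segmentation.
    merges = []
    for _ in range(5):
        pair = most_frequent_pair(vocab)
        merges.append(pair)
        vocab = merge_pair(pair, vocab)
    print(merges)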

Download slides

Prompting methods

Recently, the development of large pre-trained language models has led to a new type of prediction method called "prompting", where the language model itself is used to make the prediction. I don't have time to cover this topic in the course, but if you are interested in NLP I strongly advise you to read the paper listed in the references below:
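
As a very small illustration, the sketch below casts sentiment classification as a cloze-style prompt for a masked language model. It assumes the Hugging Face transformers library and the bert-base-uncased checkpoint, which are my choices for the example rather than anything prescribed by the course:

    # Cloze-style prompting: let a masked LM choose between two "verbalizer" words.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model="bert-base-uncased")

    review = "The plot was predictable and the acting was flat."
    prompt = f"{review} Overall, the movie was [MASK]."

    # Score only the two candidate completions and compare them.
    for p in fill_mask(prompt, targets=["great", "terrible"]):
        print(p["token_str"], round(p["score"], 4))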

References:

Useful notes:

To go further: