Research Center for Digital Sustainability

NLP Seminar Spring 2022

This seminar provides a conceptual and practical introduction to modern Natural Language Processing (NLP) methods and technologies. Each lecture introduces a new NLP approach based on a seminal publication and includes a presentation by an academic guest speaker. The NLP methods include bag-of-words (BoW), term frequency–inverse document frequency (TF-IDF), word2vec, long short-term memory (LSTM), latent Dirichlet allocation (LDA), transformers, BERT, and GPT-3. The lectures were recorded in the previous semester and are (in most cases) available on BigBlueButton. Please find the schedule of the seminar here, listing relevant research articles or blog posts.

Each student will conduct and present an individual project related to NLP. The topic descriptions can be found here. To maximize learning from other students' presentations, active participation is mandatory.

This seminar is mandatory for all students writing a bachelor's or master's thesis at the Research Center for Digital Sustainability. They will present the progress of their theses in the seminar.

The Natural Language Processing Seminar will end with this edition due to a lack of funding.

Time, Location, and Links

Schedule 2022

Date | Topic | Mandatory Paper or BlogPost and Recording | Speaker

25 February 2022 Overview and introduction, NRP77 project on reidentification of Swiss judgments, presentation of topics for a thesis project Deep Learning Quiz: http://onlinemlquiz.com/deep_learning_quiz.php   
4 March 2022

Bag-of-words (BoW) and term frequency-inverse document frequency (TF-IDF)

Starting Presentations: David Bucher, Ronja Stern

A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques: https://arxiv.org/abs/1707.02919

01/10/2021

Dominic Schweizer, University of Bern
11 March 2022

word2vec

Intro to FDN-Sandbox by Adrian
Intermediate Presentation: Marco Buchholz

Word2Vec: https://jalammar.github.io/illustrated-word2vec/
https://colah.github.io/posts/2014-07-NLP-RNNs-Representations/

08/10/2021

Prof. Dr. Tobias Hodel, Digital Humanities, University of Bern
18 March 2022 Presentation of student project proposals

 

Students
25 March 2022

Recurrent Neural Networks

Presentation of student project proposals

LSTM: https://colah.github.io/posts/2015-08-Understanding-LSTMs/

Lecture Materials: https://drive.google.com/drive/folders/1ldet--Yjo6xos_cNnpGiqXmnV5OW-a3z 
Exercise Solutions: https://colab.research.google.com/drive/16dQdAhYfOZbPEAe-nsHOK8NuntLt0x2T?usp=sharing

22/10/2021

Dr. Mathias Müller, Postdoc and Lecturer at University of Zürich
1 April 2022

ML and NLP in industry

Starting Presentation: Tobias Brugger

Technical Debt: https://papers.nips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf 
MLOps: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
(Compressible Subspace: https://arxiv.org/pdf/2110.04252.pdf)

29/10/2021

Siddhartha Singh
8 April 2022

Building Knowledge Graphs using NLP

Starting Presentation: Roman Martinez

Knowledge Graphs: https://towardsdatascience.com/the-building-a-large-scale-accurate-and-fresh-knowledge-graph-71ebd912210e

05/11/2021

Prof. Dr. Patrizio Collovà, Bern University of Applied Sciences
15 April 2022 Good Friday!

22 April 2022 Spring Break!

29 April 2022 Text Analysis with Contextualized Topic Models

Nice introduction about topic modeling: https://cacm.acm.org/magazines/2012/4/147361-probabilistic-topic-models/fulltext
Blog post about contextualized topic models for zero-shot cross-lingual prediction: https://fede-bianchi.medium.com/contextualized-topic-modeling-with-python-eacl2021-eacf6dfa576
Colab Notebook: https://colab.research.google.com/drive/1FLyZwR1Bg3ZOTLZcxCkmKUIeUFwwhnTF?usp=sharing

12/11/2021

Silvia Terragni, PhD student at University of Milano-Bicocca
6 May 2022 Transformers

Paper: https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
BlogPost: https://jalammar.github.io/illustrated-transformer/

22/11/2021

Joel Niklaus, University of Bern

13 May 2022 GPT-3: Lessons from Generative Pre-Training and AI Marketing

Paper: https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf
BlogPost: https://medium.com/walmartglobaltech/the-journey-of-open-ai-gpt-models-32d95b7b7fb2

26/11/2021

Dr. Simon Clematide, Academic Associate at University of Zurich
20 May 2022 BERT

Paper: https://aclanthology.org/N19-1423/
BlogPost: https://jalammar.github.io/illustrated-bert/

03/12/2021

Dr. Ilias Chalkidis, NLP Postdoctoral Researcher at University of Copenhagen 
27 May 2022 Feedback for Posters: come with a first draft of the poster prepared!
3 June 2022 Poster Session for final presentation of projects

 

Students