NLP Seminar Spring 2022

This seminar provides a conceptual and practical introduction to modern Natural Language Processing (NLP) methods and technologies. Each lecture introduces a new NLP approach based on a seminal publication and includes a presentation by an academic guest speaker. The NLP methods include bag-of-words (BoW), term frequency–inverse document frequency (TF-IDF), word2vec, long short-term memory (LSTM), latent Dirichlet allocation (LDA), transformers, BERT, and GPT-3. The lectures were recorded in the previous semester and are (in most cases) available on BigBlueButton. The seminar schedule below lists the relevant research articles and blog posts for each session.

Each student will conduct and present an individual project related to NLP. The topic descriptions can be found here. So that everyone learns as much as possible from the other students' presentations, active participation is mandatory.

This seminar is mandatory for all students writing a bachelor's or master's thesis at the Research Center for Digital Sustainability. They will present the progress of their theses in the seminar.

The Natural Language Processing Seminar will end with this edition because of a lack of funding.

Time, Location, and Links

Every Friday from 10:15h to 12:00h
Seminar Room 105, Main Building H4, 3012 Bern
Template Slide for Progress Report: https://docs.google.com/presentation/d/1NdaxR1CH34tKsD1ZVo3ZHp2wyg91X4oBzSlab_SIuyA/edit?usp=sharing
Hybrid on BigBlueButton: https://bbb.ch-open.ch/b/joe-2kn-jun-aqw
Telegram Chat: https://t.me/+cf5b8mOgWE4wOTI0
ILIAS: https://ilias.unibe.ch/goto_ilias3_unibe_crs_2281872.html
KSL: https://www.ksl.unibe.ch/KSL/kurzansicht?18&stammNr=471397&semester=FS2022&lfdNr=0

Schedule 2022

Date	Topic	Mandatory Paper or BlogPost and Recording	Speaker
25 February 2022	Overview and introduction, NRP77 project on reidentification of Swiss judgments, presentation of topics for a thesis project	Deep Learning Quiz: http://onlinemlquiz.com/deep_learning_quiz.php
4 March 2022	Bag-of-words (BoW) and term frequency-inverse document frequency (TF-IDF) Starting Presentations: David Bucher, Ronja Stern	A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques: https://arxiv.org/abs/1707.02919 01/10/2021	Dominic Schweizer, University of Bern
11 March 2022	word2vec Intro to FDN-Sandbox by Adrian Intermediate Presentation: Marco Buchholz	Word2Vec: https://jalammar.github.io/illustrated-word2vec/, https://colah.github.io/posts/2014-07-NLP-RNNs-Representations/ 08/10/2021	Prof. Dr. Tobias Hodel, Digital Humanities University of Bern
18 March 2022	Presentation of student project proposals		Students
25 March 2022	Recurrent Neural Networks Presentation of student project proposals	LSTM: https://colah.github.io/posts/2015-08-Understanding-LSTMs/ Lecture Materials: https://drive.google.com/drive/folders/1ldet--Yjo6xos_cNnpGiqXmnV5OW-a3z Exercise Solutions: https://colab.research.google.com/drive/16dQdAhYfOZbPEAe-nsHOK8NuntLt0x2T?usp=sharing 22/10/2021	Dr. Mathias Müller, Postdoc and Lecturer at University of Zürich
1 April 2022	ML and NLP in industry Starting Presentation: Tobias Brugger	Technical Debt: https://papers.nips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf MLOps: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf (Compressible Subspace: https://arxiv.org/pdf/2110.04252.pdf) 29/10/2021	Siddhartha Singh
8 April 2022	Building Knowledge Graphs using NLP Starting Presentation: Roman Martinez	Knowledge Graphs: https://towardsdatascience.com/the-building-a-large-scale-accurate-and-fresh-knowledge-graph-71ebd912210e 05/11/2021	Prof. Dr. Patrizio Collovà, Bern University of Applied Sciences
15 April 2022	Good Friday!	–	–
22 April 2022	Spring Break!	–	–
29 April 2022	Text Analysis with Contextualized Topic Models	Nice introduction about topic modeling: https://cacm.acm.org/magazines/2012/4/147361-probabilistic-topic-models/fulltext Blog post about contextualized topic models for zero-shot cross-lingual prediction: https://fede-bianchi.medium.com/contextualized-topic-modeling-with-python-eacl2021-eacf6dfa576 Colab Notebook: https://colab.research.google.com/drive/1FLyZwR1Bg3ZOTLZcxCkmKUIeUFwwhnTF?usp=sharing 12/11/2021	Silvia Terragni, PhD student at University of Milano-Bicocca
6 May 2022	Transformers	Paper: https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf BlogPost: https://jalammar.github.io/illustrated-transformer/ 22/11/2021	Joel Niklaus, University of Bern
13 May 2022	GPT-3: Lessons from Generative Pre-Training and AI Marketing	Paper: https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf BlogPost: https://medium.com/walmartglobaltech/the-journey-of-open-ai-gpt-models-32d95b7b7fb2 26/11/2021	Dr. Simon Clematide, Academic Associate at University of Zurich
20 May 2022	BERT	Paper: https://aclanthology.org/N19-1423/ BlogPost: https://jalammar.github.io/illustrated-bert/ 03/12/2021	Dr. Ilias Chalkidis, NLP Postdoctoral Researcher at University of Copenhagen
27 May 2022	Feedback for Posters	Come with a first draft of the poster prepared!
3 June 2022	Poster Session for final presentation of projects		Students
