Seminar on Natural Language Processing (NLP)
This seminar provides a conceptual and practical introduction to modern Natural Language Processing (NLP) methods and technologies. Each lecture introduces a new NLP approach based on a seminal publication and includes a presentation by an academic guest speaker. The NLP methods include bag-of-words (BoW), term frequency–inverse document frequency (TF-IDF), word2vec, long short-term memory (LSTM), latent Dirichlet allocation (LDA), transformers, BERT, and GPT-3.
Before each lecture, students must read the indicated research article and submit a key question for the discussion. In addition, each student must carry out and present a personal project related to NLP. This seminar is mandatory for all students writing a bachelor's or master's thesis at the Research Center for Digital Sustainability.
Time, Location, and Links
- Every Friday from 10:15h to 12:00h
- Room 107, Campus Engehalde, Schützenmattstrasse 14, 3012 Bern
- Hybrid on BigBlueButton: https://bbb.ch-open.ch/b/joe-2kn-jun-aqw
- ILIAS: https://ilias.unibe.ch/ilias.php?ref_id=2139552&cmdClass=ilobjcoursegui&cmd=view&cmdNode=11n:pq&baseClass=ilRepositoryGUI
|Date||Topic||Mandatory Paper or Blog Post||Speakers|
|24 September 2021||Overview and introduction, NRP77 project on reidentification of Swiss judgments, presentation of topics for a thesis project|| ||Joel Niklaus and Matthias Stürmer, University of Bern|
|1 October 2021||Bag-of-words (BoW) and term frequency–inverse document frequency (TF-IDF)||Mehdi Allahyari, Seyedamin Pouriyeh, Mehdi Assefi, Saied Safaei, Elizabeth D. Trippe, Juan B. Gutierrez, Krys Kochut: A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques, https://arxiv.org/abs/1707.02919||Dominic Schweizer, University of Bern|
|8 October 2021||word2vec||https://colah.github.io/posts/2014-07-NLP-RNNs-Representations/||Prof. Dr. Tobias Hodel, Digital Humanities, University of Bern|
|15 October 2021||Presentation of student project proposals||Maximum 5 min per student!|| |
|22 October 2021||Recurrent Neural Networks||Lecture Materials: https://drive.google.com/drive/folders/1ldet--Yjo6xos_cNnpGiqXmnV5OW-a3z||Dr. Mathias Müller, Postdoc and Lecturer at University of Zürich|
|29 October 2021||ML and NLP in industry||Technical Debt: https://papers.nips.cc/paper/2015/file/86df7dcfd896fcaf2674f757a2463eba-Paper.pdf (Compressible Subspace: https://arxiv.org/pdf/2110.04252.pdf)|| |
|5 November 2021||Building Knowledge Graphs using NLP||https://towardsdatascience.com/the-building-a-large-scale-accurate-and-fresh-knowledge-graph-71ebd912210e||Prof. Dr. Patrizio Collovà, Bern University of Applied Sciences|
|12 November 2021||Latent Dirichlet allocation (LDA)|| ||Silvia Terragni, PhD student at University of Milano-Bicocca|
|19 November 2021||Transformers|| ||Joel Niklaus and Matthias Stürmer, University of Bern|
|26 November 2021||GPT-3|| ||Dr. Simon Clematide, Academic Associate at University of Zurich|
|3 December 2021||BERT|| ||Dr. Ilias Chalkidis, NLP Postdoctoral Researcher at University of Copenhagen|
|10 December 2021||Student final presentations||Maximum 10 min per student!|| |
|17 December 2021||Student final presentations||Maximum 10 min per student! Possible talking points: What are your results (e.g. coverage)? What difficulties did you face and how did you deal with them? Which methods worked best? What did you learn?|| |
|24 December 2021||No lecture|| || |