JSALT2023

Programme

All lectures will be held in the IC2 auditorium. All labs will be held in the IC2 classroom.

Contact us if interested: jsalt2023 @ univ-lemans.fr.


Monday, June 12
Deep Learning: Introduction & Acceleration (Hosts: Kamel Guerda & Nathan Cassereau, IDRIS)

08:00 – 09:00  Continental Breakfast
09:00 – 10:20  Deep Learning Introduction (Kamel Guerda, Nathan Cassereau)
10:20 – 10:40  Stretch Break
10:40 – 12:00  Laboratory
12:00 – 13:00  Lunch Break (RU)
13:00 – 15:00  Deep Learning Optimization & Acceleration (Kamel Guerda, Nathan Cassereau) 
15:00 – 15:20  Stretch Break
15:20 – 15:40  Computer Lab Setup 
15:40 – 17:00  Laboratory


Tuesday, June 13
Machine Translation, Host: Ondrej Bojar (UFAL, Charles University)

Machine translation (MT) can be seen as the basis of today's amazing, and amazingly popular, large language models: MT has always been data-hungry, the use of monolingual texts was critical for neural MT to take off, and Transformers were created for MT. This lecture provides background from translation that will help you form a realistic view of LLMs and be wary of common evaluation issues and misconceptions.
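
As a taste of the evaluation issues mentioned above, here is a minimal sketch of corpus-level BLEU scoring with the sacrebleu library; the library choice and the toy sentences are our own illustration, not part of the course material. Note how a single reference can make an adequate translation look poor:

# Minimal sketch of MT evaluation with sacrebleu (pip install sacrebleu).
# The toy sentences are illustrative only.
import sacrebleu

hypotheses = ["The cat sits on the mat."]
references = [["The cat is sitting on the mat."]]  # one reference stream

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.1f}")  # low score despite an adequate translation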

08:00 – 09:00  Continental Breakfast
09:00 – 10:20  Machine translation 1 (Ondrej Bojar) 
10:20 – 10:40  Stretch Break
10:40 – 12:00  Machine translation 2 (Ondrej Bojar)
12:00 – 13:00  Lunch Break (RU)
13:00 – 13:30  Computer Lab Setup
13:30 – 17:00  Laboratory


Wednesday, June 14
Gigamodels, Hosts: Benoît Crabbé (LLF) & François Yvon (LISN)

The Large Language Models introduced in recent years have proven extremely helpful for advancing the state of the art in many natural language applications, notably thanks to their ability to compute numerical, high-dimensional representations of linguistic units such as words or sentences. Multilingual language models go one step further and add the ability to handle multiple languages, sometimes even multiple scripts, with a single model. In this presentation, I will discuss multilingual language models at length, how they are typically learned and used, with a focus on measuring their multilingual abilities. The main question I will thus try to answer is: "what does it mean for a multilingual model X to cover language Y?"
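
To make "numerical, high-dimensional representations" concrete, here is a minimal sketch that embeds two sentences in different languages with one multilingual model, using the HuggingFace transformers library; the xlm-roberta-base checkpoint and the mean-pooling step are illustrative assumptions, not choices prescribed by the lecture:

# Minimal sketch: multilingual sentence embeddings via mean-pooled hidden states.
# Assumes: pip install torch transformers; the checkpoint choice is illustrative.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModel.from_pretrained("xlm-roberta-base")

sentences = ["A single model handles many languages.",
             "Un seul modèle traite de nombreuses langues."]
batch = tokenizer(sentences, padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state       # (batch, tokens, 768)
mask = batch["attention_mask"].unsqueeze(-1)        # ignore padding tokens
embeddings = (hidden * mask).sum(1) / mask.sum(1)   # (batch, 768)
print(torch.cosine_similarity(embeddings[0], embeddings[1], dim=0))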

08:00 – 09:00  Continental Breakfast
09:00 – 10:20  Giga Models 1 (Benoît Crabbé) 
10:20 – 10:40  Stretch Break
10:40 – 12:00  Evaluating Multilinguality in Large Language Models (François Yvon)
12:00 – 13:00  Lunch Break (RU)

Wednesday, June 14, afternoon
Fairness and equity in the data, Host: Denise DiPersio (University of Pennsylvania)

This session will cover concepts around the notion of data fairness. We begin with a discussion of general ethical principles, then apply those principles to research tasks in speech and natural language processing. These concerns manifest throughout the development, testing, deployment and post-deployment life cycle and include data diversity, bias and transparency. We also examine relevant laws and regulations impacting this research, discuss strategies and resources for managing these concerns, and explore potential use cases.


13:30 – 17:00  Fairness & Equity in Data (Denise DiPersio)


Thursday, June 15
NLP, Host: Nils Holzenberger
08:00 – 09:00  Continental Breakfast
09:00 – 10:20  Lecture 1 (Nils Holzenberger)
10:20 – 10:40  Stretch Break
10:40 – 12:00  Lecture 2 (Ryan Cotterell)
12:00 – 13:00  Lunch Break (RU)
13:00 – 13:30  Computer Lab Setup
13:30 – 17:00  Laboratory


Friday, June 16
Combining Finite State Methods with Neural Networks, Host: Lucas Ondel Yang (LISN)

This lecture will cover how to use finite state methods to train and conduct inference with neural networks. We will explore how finite state methods can be used to define standard loss functions for sequence-to-sequence training, how to back-propagate the gradient through finite state automata, and how one can leverage GPUs to accelerate inference in large automata.
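
A standard example of such an FSA-defined sequence-to-sequence loss is Connectionist Temporal Classification (CTC), whose value can be computed by the forward algorithm over a composed automaton. The minimal sketch below uses PyTorch's built-in torch.nn.CTCLoss as a stand-in; the lab itself may rely on dedicated GPU-accelerated automata toolkits, which we do not reproduce here:

# Sketch: CTC, a sequence loss whose computation can be viewed as the forward
# algorithm over a composed finite state automaton.
# Shapes follow torch.nn.CTCLoss: log_probs (T, N, C), targets (N, S).
import torch

T, N, C, S = 50, 4, 20, 10   # frames, batch, classes (0 = blank), target length
log_probs = torch.randn(T, N, C).log_softmax(-1).requires_grad_()
targets = torch.randint(1, C, (N, S))
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), S, dtype=torch.long)

ctc = torch.nn.CTCLoss(blank=0)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()              # gradient flows "through the automaton"
print(loss.item())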

08:00 – 09:00  Continental Breakfast
09:00 – 10:20  ASR 1 (Lucas Ondel Yang)
10:20 – 10:40  Stretch Break
10:40 – 12:00  ASR 2 (Martin Kocour)
12:00 – 13:00  Lunch Break (RU)
13:00 – 13:30  Computer Lab Setup
13:30 – 17:00  Laboratory


Monday, June 19
Speech segmentation and speaker diarization, Hosts: Marie Tahon (LIUM) & Hervé Bredin (IRIT)

This course introduces basic knowledge of speech segmentation. Processing a full recording, obtained for instance from a TV or radio show, requires identifying specific segments of the audio signal. In order to obtain clean speech with a single speaker, the presence of noise, speech and overlapping speech must be precisely determined by a segmentation task. Speaker diarization is then the task of partitioning an audio stream into homogeneous temporal segments according to the identity of the speaker (i.e. answering the question "who speaks when?"). During the day, we will present the segmentation-by-classification approach and the speaker diarization process.
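
As a preview of the kind of tooling used in the laboratory, here is a minimal diarization sketch with the pyannote.audio library; the exact pretrained pipeline name, the access token, and the audio file are assumptions and placeholders based on the library's public releases:

# Minimal sketch: "who speaks when?" with pyannote.audio.
# Assumes: pip install pyannote.audio, a HuggingFace access token, and that
# the "pyannote/speaker-diarization" pretrained pipeline is available.
from pyannote.audio import Pipeline

pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization",
                                    use_auth_token="YOUR_HF_TOKEN")
diarization = pipeline("meeting.wav")   # placeholder audio file

# Each track is a (start, end) segment labelled with an anonymous speaker id.
for segment, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{segment.start:.1f}s - {segment.end:.1f}s: {speaker}")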

08:00 – 09:00  Continental Breakfast
09:00 – 10:20  Lecture 1 (Marie Tahon) 
10:20 – 10:40  Stretch Break
10:40 – 12:00  Lecture 2 (Hervé Bredin)
12:00 – 13:00  Lunch Break (RU)
13:00 – 13:30  Computer Lab Setup
13:30 – 17:00  Laboratory


Tuesday, June 20
Neural Conversational AI, Host: Petr Schwarz (BUT)

This lecture will give you a basic overview of dialog systems. It starts with Transformers and pre-trained language models, and continues with neural models for dialogue system components such as language understanding, state tracking, and dialogue policy. End-to-end neural models and evaluation metrics are presented next, and the lecture finishes with current state-of-the-art approaches. During the practical part, we will train and evaluate our own end-to-end dialog model.
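
As a toy illustration of the component pipeline named above (understanding, state tracking, policy, generation), here is a self-contained rule-based sketch; every rule in it is invented for illustration, and in the systems covered by the lecture each stage is a neural model:

# Toy, rule-based sketch of the classic dialogue-system pipeline.
# Each stage would be a neural model in the systems covered by the lecture.

def understand(utterance: str) -> dict:
    """Language understanding: map text to a (toy) intent and slots."""
    slots = {}
    if "pizza" in utterance.lower():
        slots["food"] = "pizza"
    return {"intent": "order" if slots else "unknown", "slots": slots}

def track(state: dict, nlu: dict) -> dict:
    """State tracking: accumulate slot values across turns."""
    state = dict(state)
    state.update(nlu["slots"])
    return state

def policy(state: dict) -> str:
    """Dialogue policy: pick the next system action from the state."""
    return "confirm_order" if "food" in state else "ask_food"

def generate(action: str) -> str:
    """Generation: verbalize the chosen action."""
    return {"confirm_order": "So, one pizza. Anything else?",
            "ask_food": "What would you like to eat?"}[action]

state: dict = {}
for user_turn in ["Hi there!", "I'd like a pizza please."]:
    state = track(state, understand(user_turn))
    print("User:  ", user_turn)
    print("System:", generate(policy(state)))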

08:00 – 09:00  Continental Breakfast
09:00 – 10:20  Lecture 1 (Petr Schwarz)
10:20 – 10:40  Stretch Break
10:40 – 12:00  Lecture 2 (Santosh Kesiraju, Ondrej Platek)
12:00 – 13:00  Lunch Break (RU)
13:00 – 13:30  Computer Lab Setup
13:30 – 17:00  Laboratory 


Wednesday, June 21
Natural Language Processing, Deep Nets, Linear Algebra and Information Retrieval, Host: Kenneth Church (Northeastern University)

Firth's famous quote, "you shall know a word by the company it keeps", has had considerable influence on natural language methods for processing words and phrases (PMI, Word2vec, BERT). Firth's approach has been generalized from words and phrases to topics and documents, using methods such as node2vec and graph neural networks (GNNs).
However, these approaches often view documents, at least at inference time, as short sequences of subwords (typically 512). Recommender systems and systems for assigning papers to reviewers tend to focus on titles and abstracts, but in our collection of 200M documents from Semantic Scholar, there are more papers in the citation graph without abstracts (46M) than the reverse (34M). Links have been extremely important in web search. There is an opportunity to take more advantage of citations (and citing sentences) in topic modeling and information retrieval.
Methods such as GNNs and Specter use links at training time to improve models of 512-subword inputs, but they do not use links at inference time. In addition to BERT-like methods for encoding documents as vectors, we will also use ProNE, a node2vec method for encoding nodes in a citation graph as vectors. Cosines of vectors based on text (e.g., bags of words, BERT) denote word similarity, whereas cosines based on node2vec (spectral clustering of the citation graph G) can be interpreted in terms of distance in G.
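
For reference, the cosine mentioned above is simply the normalized dot product, identical whether the vectors come from text (BERT) or from the citation graph (ProNE); the tiny sketch below computes it with numpy on made-up vectors:

# Cosine similarity between two embedding vectors (text- or graph-based alike).
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

doc_a = np.array([0.2, 0.7, 0.1])   # made-up document embeddings
doc_b = np.array([0.3, 0.6, 0.2])
print(cosine(doc_a, doc_b))          # near 1.0 => similar documents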

08:00 – 09:00  Continental Breakfast
09:00 – 10:20  Lecture 1 (Kenneth Church)
10:20 – 10:40  Stretch Break
10:40 – 12:00  Lecture 2 (Kenneth Church)
12:00 – 13:00  Lunch Break (RU)
13:00 – 13:30  Computer Lab Setup
13:30 – 17:00  Laboratory


Thursday, June 22
Explainability in Speech Processing: From Wishes to Practice, Host: Jean-François Bonastre (LIA)

In the field of artificial intelligence, explainability is rapidly moving from an optional aspect, considered the candied fruit on the cake, to a mandatory feature required in all situations. This is due both to regulatory changes and to citizens' opinions on the possibilities and dangers of AI.
This talk is composed of three parts:
- An introduction to explainability and interpretability in AI, with a focus on the specifics of speech processing. It includes a brief presentation of the main approaches and tools available.
- The presentation of two practical applications of explainability, in the fields of voice characterization and pathological voice assessment.
- A live session, where participants will be able to interact with the presenters and dive into the source code and results.

08:00 – 09:00  Continental Breakfast
09:00 – 10:20  Lecture 1 (Jean-François Bonastre)
10:20 – 10:40  Stretch Break
10:40 – 12:00  Lecture 2 (Imen Ben Amor, Sondes Abderrazek)
12:00 – 13:00  Lunch Break (RU)
13:00 – 13:30  Computer Lab Setup
13:30 – 17:00  Laboratory


Friday, June 23
Evaluation in speech and NLP, Hosts: Craig Greenberg (NIST) & Olivier Galibert (LNE)
08:00 – 09:00  Continental Breakfast 
09:00 – 10:20  Lecture 1 (Craig Greenberg)
10:20 – 10:40  Stretch Break
10:40 – 12:00  Lecture 2 (Olivier Galibert)
12:00 – 13:00  Lunch Break (RU)
13:00 – 13:30  Computer Lab Setup
13:30 – 17:00  Laboratory
