PhD Defense
Modeling Symbolic Music with Natural Language Processing Approaches
Slides PDF Reveal.js (WIP)
Manuscript PDF
Abstract
Music is often described as a language because of its similarities to natural language, including the way both can be written down: symbolic music notation on one side, text on the other. The field of Music Information Retrieval (MIR) has therefore frequently borrowed tools from the Natural Language Processing (NLP) field and adapted them to process symbolic music data. This practice has become increasingly popular since the breakthrough of Transformer models in NLP.
This thesis first provides a structured overview of adaptations of NLP methods developed in the MIR field for symbolic music processing. They are presented along three axes, each addressing the use of diverse representations of symbolic music at different levels. Symbolic music represented as sequential data has led to the development of several tokenization strategies, which we propose to organize within a unified taxonomy. These representations are subsequently processed through models, such as recurrent or attention-based architectures initially developed for text data, giving rise to multiple adaptations for symbolic music processing. Finally, these abstract representations are used to perform tasks, where both parallels and distinctive characteristics emerge between MIR and NLP.
These aspects then structure the three technical contributions of this thesis. First, we study the expressiveness of sequential representations of music through the development of interval-based tokenization strategies and the analysis of a subword tokenization strategy, Byte-Pair Encoding, applied to symbolic music tokens. We then propose a framework for model explainability, which leads to the analysis of the attention mechanism of a Transformer-based model trained for functional harmony analysis. Finally, we develop a model adapted from NLP tools for a re-orchestration task, framed as a case of multi-track music generation.
Ultimately, this thesis argues that NLP methods remain first and foremost a toolbox from which MIR studies can draw. Beyond the analogies between music and natural language, the main motivation guiding an MIR study should be musical questions.
Jury
| Reviewers | | |
| Mr. Xavier HINAUT | Inria Bordeaux | Reviewer |
| Ms. Cheng-Zhi Anna HUANG | Massachusetts Institute of Technology | Reviewer |
| Examiners | | |
| Ms. Chloé BRAUD | Institut de Recherche en Informatique de Toulouse | Examiner |
| Mr. Emmanouil BENETOS | Queen Mary University of London | Examiner |
| Mr. Marius BILASCO | Université de Lille | Examiner |
| Mr. Patrick BAS | Université de Lille | Examiner; Jury President |
| Thesis advisors | | |
| Mr. Marc TOMMASI | Université de Lille | Co-director |
| Mr. Louis BIGO | Université de Bordeaux | Co-director |
| Ms. Mikaela KELLER | Université de Lille | Co-advisor |