publications
2025
- Symbolic Music Analysis
Evaluating Interval-based Tokenization for Pitch Representation in Symbolic Music Analysis
Dinh-Viet-Toan Le, Louis Bigo, and Mikaela Keller
In Workshop Artificial Intelligence for Music at AAAI, Philadelphia, United States, Mar 2025
Symbolic music analysis tasks are often performed by models originally developed for Natural Language Processing, such as Transformers. Such models require the input data to be represented as sequences, which is achieved through a process of tokenization. Tokenization strategies for symbolic music often rely on absolute MIDI values to represent pitch information. However, music research largely promotes the benefit of higher-level representations such as melodic contour and harmonic relations, for which pitch intervals turn out to be more expressive than absolute pitches. In this work, we introduce a general framework for building interval-based tokenizations. By evaluating these tokenizations on three music analysis tasks, we show that such interval-based tokenizations improve model performance and facilitate their explainability.
@inproceedings{le2025evaluating,
  title = {Evaluating Interval-based Tokenization for Pitch Representation in Symbolic Music Analysis},
  author = {Le, Dinh-Viet-Toan and Bigo, Louis and Keller, Mikaela},
  year = {2025},
  month = mar,
  booktitle = {Workshop Artificial Intelligence for Music at AAAI},
  location = {Philadelphia, United States},
  conf_web = {https://ai4musicians.org/2025aaai.html}
}
- Survey
Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: A Survey
Dinh-Viet-Toan Le, Louis Bigo, Dorien Herremans, and Mikaela Keller
ACM Computing Surveys, Feb 2025
Music is frequently associated with the notion of language, as both domains share several similarities, including the ability for their content to be represented as sequences of symbols. In computer science, the fields of Natural Language Processing (NLP) and Music Information Retrieval (MIR) reflect this analogy through a variety of similar tasks, such as author detection or content generation. This similarity has long encouraged the adaptation of NLP methods to process musical data, particularly symbolic music data, and the rise of Transformer neural networks has considerably strengthened this practice. This survey reviews NLP methods applied to symbolic music generation and information retrieval following two axes. We first propose an overview of representations of symbolic music inspired by text sequential representations. We then review a large set of computational models, particularly deep learning models, which have been adapted from NLP to process these musical representations for various MIR tasks. These models are described and categorized through different prisms with a highlight on their music-specialized mechanisms. We finally present a discussion surrounding the adequate use of NLP tools to process symbolic music data. This includes technical issues regarding NLP methods which may open several doors for further research into more effectively adapting NLP tools to symbolic MIR.
@article{le2025natural,
  author = {Le, Dinh-Viet-Toan and Bigo, Louis and Herremans, Dorien and Keller, Mikaela},
  title = {Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: A Survey},
  year = {2025},
  issue_date = {July 2025},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  volume = {57},
  number = {7},
  issn = {0360-0300},
  url = {https://doi.org/10.1145/3714457},
  doi = {10.1145/3714457},
  journal = {ACM Computing Surveys},
  month = feb,
  articleno = {175},
  numpages = {40},
  keywords = {Music information retrieval, natural language processing, symbolic music, music generation, music analysis, deep learning},
}
2024
- Symbolic Music Analysis
Analyzing Byte-Pair Encoding on Monophonic and Polyphonic Symbolic Music: A Focus on Musical Phrase Segmentation
Dinh-Viet-Toan Le, Louis Bigo, and Mikaela Keller
In Proceedings of the 3rd Workshop on NLP for Music and Audio (NLP4MusA), Oakland, United States, Nov 2024
Byte-Pair Encoding (BPE) is an algorithm commonly used in Natural Language Processing to build a vocabulary of subwords, which has been recently applied to symbolic music. Given that symbolic music can differ significantly from text, particularly with polyphony, we investigate how BPE behaves with different types of musical content. This study provides a qualitative analysis of BPE's behavior across various instrumentations and evaluates its impact on a musical phrase segmentation task for both monophonic and polyphonic music. Our findings show that the BPE training process is highly dependent on the instrumentation and that BPE "supertokens" succeed in capturing abstract musical content. In a musical phrase segmentation task, BPE notably improves performance in a polyphonic setting, but enhances performance in monophonic tunes only within a specific range of BPE merges.
@inproceedings{le2024analyzing,
  title = {Analyzing Byte-Pair Encoding on Monophonic and Polyphonic Symbolic Music: A Focus on Musical Phrase Segmentation},
  author = {Le, Dinh-Viet-Toan and Bigo, Louis and Keller, Mikaela},
  editor = {Kruspe, Anna and Oramas, Sergio and Epure, Elena V. and Sordo, Mohamed and Weck, Benno and Doh, SeungHeon and Won, Minz and Manco, Ilaria and Meseguer-Brocal, Gabriel},
  booktitle = {Proceedings of the 3rd Workshop on NLP for Music and Audio (NLP4MusA)},
  month = nov,
  year = {2024},
  location = {Oakland, United States},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2024.nlp4musa-1.12/},
  pages = {69--74},
  conf_web = {https://sites.google.com/view/nlp4musa-2024},
}
- Symbolic Music Generation
METEOR: Melody-aware Texture-controllable Symbolic Orchestral Music Generation
Dinh-Viet-Toan Le and Yi-Hsuan Yang
Sep 2024
Western music is often characterized by a homophonic texture, in which the musical content can be organized into a melody and an accompaniment. In orchestral music, in particular, the composer can select specific characteristics for each instrument's part within the accompaniment, while also needing to adapt the melody to suit the capabilities of the instruments performing it. In this work, we propose METEOR, a model for Melody-aware Texture-controllable Orchestral music generation. This model performs symbolic multi-track music style transfer with a focus on melodic fidelity. We allow bar- and track-level controllability of the accompaniment with various textural attributes while keeping a homophonic texture. We show that the model can achieve controllability performance similar to strong baselines while greatly improving melodic fidelity.
@misc{le2024meteor,
  title = {METEOR: Melody-aware Texture-controllable Symbolic Orchestral Music Generation},
  author = {Le, Dinh-Viet-Toan and Yang, Yi-Hsuan},
  year = {2024},
  month = sep,
  eprint = {2409.11753},
  archiveprefix = {arXiv},
  primaryclass = {cs.SD},
}
2022
- Musicology
A Corpus Describing Orchestral Texture in First Movements of Classical and Early-Romantic Symphonies
Dinh-Viet-Toan Le, Mathieu Giraud, Florence Levé, and Francesco Maccarini
In Proceedings of the 9th International Conference on Digital Libraries for Musicology, Prague, Czech Republic, Jul 2022
Orchestration is the art of writing music for a possibly large ensemble of instruments, by blending or opposing their sounds and grouping them into an orchestral texture. We aim here at providing a deeper understanding of orchestration in classical and early-romantic symphonies by analyzing, at the bar level, how the instruments of the orchestra organize into melodic, rhythmic, harmonic, and mixed layers. We formalize the description of such layers and release an open corpus with more than 7900 annotations in 24 first movements of Haydn, Mozart, and Beethoven symphonies. Initial analyses of this corpus confirm specific roles of the instruments and their families (woodwinds, brass, and strings), some evolution between composers, as well as the contribution of orchestral texture to form. The model and the corpus offer perspectives for empirical and computational studies on orchestral music.
@inproceedings{le2022orchestral,
  title = {A Corpus Describing Orchestral Texture in First Movements of Classical and Early-Romantic Symphonies},
  author = {Le, Dinh-Viet-Toan and Giraud, Mathieu and Lev\'{e}, Florence and Maccarini, Francesco},
  year = {2022},
  month = jul,
  booktitle = {Proceedings of the 9th International Conference on Digital Libraries for Musicology},
  pages = {27--35},
  location = {Prague, Czech Republic},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  isbn = {9781450396684},
  url = {https://doi.org/10.1145/3543882.3543884},
  doi = {10.1145/3543882.3543884},
  numpages = {9},
  keywords = {corpus, layers, music texture, orchestration, symbolic data},
  series = {DLfM '22},
  conf_web = {https://dlfm.web.ox.ac.uk/9th-international-conference-on-digital-libraries-for-musicology},
}