Gesture-Timbre Space: Multidimensional Feature Mapping Using Machine Learning and Concatenative Synthesis
Zbyszyński, Michael; Di Donato, Balandino; Visi, Federico and Tanaka, Atau. 2021. “Gesture-Timbre Space: Multidimensional Feature Mapping Using Machine Learning and Concatenative Synthesis.”
In: Richard Kronland-Martinet; Sølvi Ystad and Mitsuko Aramaki, eds. Perception, Representations, Image, Sound, Music. Springer International Publishing. ISBN 9783030702090
ABSTRACT
This chapter explores three systems for mapping embodied gesture, acquired with electromyography and motion sensing, to sound synthesis. A pilot study using granular synthesis is presented, followed by studies employing corpus-based concatenative synthesis, in which small sound units are organized by derived timbral features. We use interactive machine learning in a mapping-by-demonstration paradigm to create regression models that map high-dimensional gestural data to timbral data, without dimensionality reduction, in three distinct workflows. In the first, static regression, individual sound units are associated directly with static poses (anchor points). In the second, whole regression, a sound tracing method leverages our intuitive associations between time-varying sound and embodied movement. In the third, assisted interactive machine learning, we extend interactive machine learning with artificial agents and reinforcement learning.
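As a concrete illustration of the regression-based mapping described above, the sketch below trains a multi-output regressor from gestural feature vectors to timbral descriptors, as in a mapping-by-demonstration workflow. It is a minimal sketch under stated assumptions, not the chapter's implementation: it assumes scikit-learn and NumPy, and the data shapes, feature dimensions, and variable names (X_gesture, y_timbre) are hypothetical.

```python
# Minimal sketch of mapping-by-demonstration regression (illustrative only;
# assumes scikit-learn, not the tooling used in the chapter's systems).
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical demonstration data: each row pairs a gestural feature
# vector (e.g. EMG amplitudes plus motion features) with the timbral
# descriptors of the sound heard while that gesture was performed.
rng = np.random.default_rng(0)
X_gesture = rng.normal(size=(500, 16))  # 16-D gesture features (assumed)
y_timbre = rng.normal(size=(500, 6))    # 6-D timbral targets (assumed)

# Multi-output regression: the full gesture space maps to the full
# timbre space with no intermediate dimensionality reduction.
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0),
)
model.fit(X_gesture, y_timbre)

# At performance time, each incoming gesture frame is mapped to a target
# point in timbre space, which a concatenative synthesizer can use to
# select the nearest sound unit in the corpus.
live_frame = rng.normal(size=(1, 16))
target_timbre = model.predict(live_frame)
```

In a static regression workflow, the training rows would come from a few held poses paired with chosen sound units; in whole regression, from a continuous sound tracing recording.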
We discuss the benefits of organizing the sound corpus using self-organizing maps to address corpus sparseness, and the potential of regression-based mapping at different points in a musical workflow: gesture design, sound design, and mapping design. These systems support expressive performance by creating gesture-timbre spaces that maximize sonic diversity while maintaining coherence, enabling reliable reproduction of target sounds as well as improvisatory exploration of a sonic corpus. They have been made available to the research community and have been used by the authors in concert performance.
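The self-organizing map stage could be sketched as follows. A SOM assigns every sound unit to a cell on a regular 2-D grid so that neighboring cells hold timbrally similar units, yielding a dense, navigable layout even where the original descriptor space is sparsely populated. This is a minimal sketch assuming the third-party MiniSom library and synthetic timbral descriptors; it is not the chapter's implementation, and the grid size and training parameters are illustrative choices.

```python
# Minimal sketch: organizing a sound corpus on a self-organizing map
# (illustrative; assumes the MiniSom library, not the chapter's code).
import numpy as np
from minisom import MiniSom

# Hypothetical corpus: one row of timbral descriptors per sound unit.
rng = np.random.default_rng(1)
corpus_features = rng.normal(size=(1000, 6))  # 1000 units, 6 descriptors

# Train a 12x12 map; each unit is then assigned to its best-matching cell.
som = MiniSom(12, 12, input_len=6, sigma=1.5, learning_rate=0.5, random_seed=1)
som.train_random(corpus_features, 5000)

# Index the corpus by grid cell: nearby cells contain timbrally similar
# units, mitigating sparse regions of the original feature space.
grid = {}
for unit_id, features in enumerate(corpus_features):
    cell = som.winner(features)  # (row, col) of the best-matching cell
    grid.setdefault(cell, []).append(unit_id)
```

A regression model's timbral output can then be resolved to a grid cell rather than a raw descriptor vector, so that every reachable point in the gesture-timbre space lands on an actual sound unit.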