MIMOSE: multimodal interaction for music orchestration sheet editors

The class diagram of the system’s architecture

Abstract

The increasing number and accuracy of sensors devoted to human-computer input are supporting the emergence of novel multimodal interaction paradigms. These, in turn, can unlock additional strategies for designing innovative, user-friendly systems. The underlying approaches to user-computer interaction leverage natural channels of communication (e.g. gestures and voice), and are therefore often less cumbersome than traditional interface modalities. This paper proposes a wrapper-based strategy to easily map keyboard shortcuts onto multimodal actions. The presented case study is a music editor. Such applications are often overwhelming for novice users, which discourages them from interacting with the software. MIMOSE (Multimodal Interaction for Music Orchestration Sheet Editors) addresses these limitations. Instead of relying on buttons and mixture pads for the composition of a musical work, it provides a gesture- and voice-based multimodal wrapper for music editor applications. The user assumes the role of an orchestra conductor, and the wrapper translates user gestures and music jargon keywords into mouse clicks or key presses, substituting keyboard shortcuts with multimodal actions. This gives the user an ecologically tuned and immersive interaction environment. Note that the wrapped application does not have to be open source: the events that the application already captures are simply delivered through channels other than the keyboard and mouse, and are triggered by multimodal actions instead of key presses. After presenting the features of the wrapper, we describe its application to an open source music editing tool and present a twofold evaluation. First, we separately evaluated the performance of each interaction modality in terms of accuracy and F1 score. Second, we asked real users to evaluate the usability of the application when extended by the wrapper. The user evaluation relies on ad hoc tailored QUIS and SUXES questionnaires to assess the user-friendliness of the resulting application. The results are encouraging from both the technical quality and the usability points of view. The wrapper at the core of MIMOSE can be adapted to other kinds of applications with minimal coding effort.
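
The wrapper idea described in the abstract, where recognized gestures and spoken keywords stand in for keyboard shortcuts, can be illustrated with a minimal Python sketch. This is not the MIMOSE implementation: the class name `ShortcutWrapper`, the `(modality, label)` event shape, the example bindings, and the key combinations are all assumptions made for illustration. Key delivery is abstracted behind a `send_keys` callable so the sketch runs without any input-automation library.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical event shape: (modality, label), e.g. ("voice", "crescendo").
Action = Tuple[str, str]
# A shortcut is a tuple of key names pressed together, e.g. ("ctrl", "z").
Shortcut = Tuple[str, ...]


@dataclass
class ShortcutWrapper:
    """Maps recognized multimodal actions onto keyboard shortcuts.

    `send_keys` is whatever delivers the key combination to the wrapped
    application; it is injected so the mapping logic stays independent
    of any particular recognizer or automation backend.
    """
    send_keys: Callable[[Shortcut], None]
    bindings: Dict[Action, Shortcut] = field(default_factory=dict)

    def bind(self, modality: str, label: str, *keys: str) -> None:
        # Labels are stored in lower case so lookups are case-insensitive.
        self.bindings[(modality, label.lower())] = keys

    def dispatch(self, modality: str, label: str) -> bool:
        """Send the bound shortcut; return False if the action is unbound."""
        keys = self.bindings.get((modality, label.lower()))
        if keys is None:
            return False
        self.send_keys(keys)
        return True


# Record the shortcuts instead of sending them to a real application.
sent: List[Shortcut] = []
wrapper = ShortcutWrapper(send_keys=sent.append)

# Illustrative bindings only; these are not the shortcuts used in the paper.
wrapper.bind("voice", "crescendo", "ctrl", "shift", "c")
wrapper.bind("gesture", "swipe_right", "right")

wrapper.dispatch("voice", "Crescendo")   # bound: the shortcut is recorded
wrapper.dispatch("gesture", "wave")      # unbound: nothing is sent
```

Because the wrapped application only ever receives key combinations it already handles, a mapping of this kind requires no access to the application's source code, which is consistent with the claim in the abstract.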

Publication
In Multimedia Tools and Applications