Dynamic Visualization of Music Classification Systems
Source:
SIGIR, ACM, Singapore (2008)
Abstract:
INTRODUCTION
The traditional approach to developing automated classification systems is to optimize an algorithm over a series of simple train/test splits of a database in which each example carries a single class label. For audio-based Music Information Retrieval (MIR) systems (such as mood, genre or artist classifiers), this approach has a serious shortcoming. Because only a single class label is applied to each track, the performance of automated MIR classification systems can be neither evaluated nor used effectively on tracks whose moods or genres vary across time. Furthermore, the classifications computed by MIR systems are monic (i.e., they assign only one class label per track). Hence, the simple accuracy scores computed provide no information on the posterior probability profiles produced for each track. These probability profiles can be used to index a music collection effectively for search-by-example retrieval, but only if the posterior probabilities used are well conditioned [1].
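As a rough illustration of why the posterior profiles matter for search-by-example retrieval, the sketch below ranks a small collection by how close each track's profile lies to a query profile. The class name, the track identifiers, the probability values and the choice of Euclidean distance are all assumptions made for illustration; they are not taken from [1] or from the system described here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: search-by-example over posterior probability profiles. */
class ProfileSearchSketch {

    /** Euclidean distance between two profiles of equal length (an assumed metric). */
    static double distance(double[] p, double[] q) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            double d = p[i] - q[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Returns track ids ordered from most to least similar to the query profile. */
    static List<String> rank(double[] query, Map<String, double[]> collection) {
        List<String> ids = new ArrayList<>(collection.keySet());
        ids.sort(Comparator.comparingDouble((String id) -> distance(query, collection.get(id))));
        return ids;
    }

    public static void main(String[] args) {
        // Invented three-class profiles, one per track.
        Map<String, double[]> collection = new LinkedHashMap<>();
        collection.put("trackB", new double[] {0.10, 0.10, 0.80});
        collection.put("trackC", new double[] {0.50, 0.40, 0.10});
        collection.put("trackA", new double[] {0.80, 0.15, 0.05});

        double[] query = {0.70, 0.20, 0.10};
        System.out.println(rank(query, collection));
    }
}
```

A ranking of this kind is only as meaningful as the probabilities behind it: if a model emits profiles that are nearly identical for every track, the distances carry little information, which is one way to read the requirement that the posteriors be well conditioned.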
This demonstration introduces a highly configurable music classification, visualization and audition system designed to overcome these shortcomings. It enables researchers to address real-time music classification and to evaluate the posterior probabilities produced whilst interacting with real-time music streams. The system and its underlying Networked Environment for Music Analysis (NEMA) [2] infrastructure are written in Java and will be deployed for public use under the Software Environment for the Advancement of Scholarly Research (SEASR) [3] web service framework being developed at UIUC.
USE SCENARIOS
The system supports the real-time extraction of a configurable stream of features from audio tracks and the application of a variety of classification models to short windows (e.g., 10 sec.) of those features. The models generate posterior probability distributions in real time, and the display of these distributions is synchronized with audio playback. This lets both researchers and general users dynamically explore the effects and interactions of the complex parameters involved in automatic music classification. The user can select an arbitrary number of classification models from the system's model library, which currently comprises classifier sets for two distinct classification tasks (genre and mood classification). Each task collection contains a number of models built using a variety of machine learning techniques (e.g., naïve Bayes, decision trees, SVMs, etc.). Advanced users can manipulate feature extraction options to refine their explorations further or to prototype new feature extraction algorithms. Figure 1 depicts a user simultaneously exploring the different real-time behaviour of two genre classifiers (CART and J48 decision trees) and a linear discriminant mood classifier. By examining the posterior probability profiles, the user can better understand the behaviour of each model and can relate this information to the audio currently playing.
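The windowed classification described above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the NEMA implementation: the `WindowClassifier` interface, the non-overlapping windows, the use of a per-window mean as the feature summary, and the toy logistic model are all hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from the NEMA code base. */
class WindowedPosteriorSketch {

    /** Assumed model interface: maps one window's feature vector to class posteriors. */
    interface WindowClassifier {
        double[] posterior(double[] windowFeatures);
    }

    /**
     * Averages per-frame feature vectors over consecutive, non-overlapping windows.
     * Trailing frames that do not fill a whole window are dropped.
     */
    static List<double[]> windowMeans(double[][] frames, int framesPerWindow) {
        List<double[]> means = new ArrayList<>();
        for (int start = 0; start + framesPerWindow <= frames.length; start += framesPerWindow) {
            double[] mean = new double[frames[start].length];
            for (int f = start; f < start + framesPerWindow; f++) {
                for (int d = 0; d < mean.length; d++) {
                    mean[d] += frames[f][d] / framesPerWindow;
                }
            }
            means.add(mean);
        }
        return means;
    }

    /** Applies the model to each window, yielding one posterior distribution per window. */
    static List<double[]> posteriorProfile(double[][] frames, int framesPerWindow,
                                           WindowClassifier model) {
        List<double[]> profile = new ArrayList<>();
        for (double[] window : windowMeans(frames, framesPerWindow)) {
            profile.add(model.posterior(window));
        }
        return profile;
    }

    /** Toy two-class model: a logistic function of the first feature. */
    static final WindowClassifier TOY_MODEL = x -> {
        double p = 1.0 / (1.0 + Math.exp(-x[0]));
        return new double[] {p, 1.0 - p};
    };

    public static void main(String[] args) {
        // Four one-dimensional frames; the track changes character halfway through.
        double[][] frames = { {2.0}, {2.0}, {-2.0}, {-2.0} };
        List<double[]> profile = posteriorProfile(frames, 2, TOY_MODEL);
        for (int i = 0; i < profile.size(); i++) {
            System.out.printf("window %d: P(class 0)=%.3f P(class 1)=%.3f%n",
                    i, profile.get(i)[0], profile.get(i)[1]);
        }
    }
}
```

In the toy data the favoured class flips between the first and second window, which is the kind of within-track variation that a single track-level label cannot express. A display synchronized with playback would show the distribution for whichever window covers the current playback position.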
REFERENCES
[1] West, K. and Lamere, P. A Model-Based Approach to Constructing Music Similarity Functions. EURASIP Journal on Advances in Signal Processing (2007), 1-10.
[2] NEMA Website. Available: http://nema.lis.uiuc.edu/.
[3] SEASR Website. Available: http://www.seasr.org/.