Archives of Acoustics, 39, 4, pp. 629-638, 2014

Classification of Music Genres Based on Music Separation into Harmonic and Drum Components

Institute of Informatics, Silesian University of Technology

Technische Universität München

Audio Acoustics Laboratory, Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology

This article presents a study on music genre classification based on music separation into harmonic and drum components. To this end, the audio signal is separated so that the overall feature vector can be extended with new descriptors extracted from the harmonic and/or drum content. The study uses the ISMIS database, in which music files are represented by vectors of parameters describing music features. Genre classification is carried out with a Support Vector Machine (SVM) classifier and a co-training method adapted for the standard SVM. Additional experiments with reduced feature vectors improved the overall result. Finally, results and conclusions drawn from the study are presented, and suggestions for further work are outlined.
Keywords: Music Information Retrieval, musical sound separation, drum separation, music genre classification, Support Vector Machine, co-training, Non-Negative Matrix Factorization.
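
The keywords name Non-Negative Matrix Factorization, but the abstract does not say how it is configured for the separation step. As an illustration only, the following minimal sketch factorizes a non-negative matrix V (standing in for a magnitude spectrogram) into spectral bases W and activations H using the standard multiplicative updates for the Euclidean cost. The function name `nmf`, the rank, the iteration count, and the toy data are assumptions made for this example and are not taken from the study; deciding which components are harmonic and which are drum-like is a further step that is not shown.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize non-negative V (bins x frames) as W @ H.

    Uses multiplicative updates for the Euclidean (Frobenius) cost.
    All parameter names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n_bins, n_frames = V.shape
    # Random strictly positive start; updates keep entries non-negative.
    W = rng.random((n_bins, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Update activations, then spectral bases.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy stand-in for a magnitude spectrogram: 16 frequency bins, 40 frames,
# built from 3 non-negative components (synthetic data, not from ISMIS).
rng = np.random.default_rng(1)
V = rng.random((16, 3)) @ rng.random((3, 40))

W, H = nmf(V, rank=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In a pipeline like the one the abstract outlines, descriptors computed from the separated harmonic and drum signals would then be appended to the original feature vector before SVM training; that part depends on details the abstract does not give.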



DOI: 10.2478/aoa-2014-0068

Copyright © Polish Academy of Sciences & Institute of Fundamental Technological Research (IPPT PAN)