Teaching Machines on Snoring: A Benchmark on Computer Audition for Snore Sound Excitation Localisation

Kun QIAN; Christoph JANOTT; Zixing ZHANG; Jun DENG; Alice BAIRD; Clemens HEISER; Winfried HOHENHORST; Michael HERZOG; Werner HEMMERT; Björn SCHULLER

doi:10.24425/123918

Authors

Kun QIAN Technical University of Munich, University of Passau, Germany
Christoph JANOTT Technical University of Munich, Germany
Zixing ZHANG University of Passau, Germany
Jun DENG audEERING GmbH, Germany
Alice BAIRD University of Passau, Germany
Clemens HEISER Technical University of Munich, Germany
Winfried HOHENHORST Clinic for ENT Medicine, Head and Neck Surgery, Alfried Krupp Krankenhaus, Essen, Germany
Michael HERZOG Clinic for ENT Medicine, Head and Neck Surgery, Cottbus, Germany
Werner HEMMERT Technical University of Munich, Germany
Björn SCHULLER University of Passau, Imperial College London, audEERING GmbH, Germany

Abstract

This paper presents a comprehensive study on machine listening for the localisation of snore sound excitation. We investigate the effects of varied frame sizes and overlaps of the analysed audio chunks when extracting low-level descriptors. In addition, we explore the performance of each kind of feature when it is fed into varied classifier models, including support vector machines, $k$-nearest neighbours, linear discriminant analysis, random forests, extreme learning machines, kernel-based extreme learning machines, multilayer perceptrons, and deep neural networks. Experimental results demonstrate that wavelet packet transform energy can outperform most other features. A deep neural network trained with subband energy ratios reaches the highest performance, achieving an unweighted average recall of 72.8% across four types of snoring.
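The headline metric, unweighted average recall (UAR), is the mean of the per-class recalls, so each snore type contributes equally however many samples it has. The following is a minimal sketch of that computation in Python, not the authors' evaluation code; the function name, the class labels `"V"`, `"O"`, `"T"`, `"E"`, and the example predictions are hypothetical placeholders chosen only for illustration.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, deliberately imbalanced labels for four snore classes.
y_true = ["V", "V", "V", "V", "O", "O", "T", "E"]
y_pred = ["V", "V", "V", "O", "O", "V", "T", "T"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.4f}, accuracy = {accuracy:.4f}")
```

In this toy case the per-class recalls are 0.75, 0.5, 1.0 and 0.0, giving a UAR of 0.5625 against a plain accuracy of 0.625. The gap shows why UAR is the more conservative measure when class sizes are unbalanced.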

Keywords:

snore sound, obstructive sleep apnea, acoustic features, machine learning

