Influence of Facial, Head, and Neck Dimensions on Vocal Acoustic Parameters in Polish Speakers

Authors

  • Łukasz Pawelec, Wroclaw University of Environmental and Life Sciences, Poland (ORCID: 0000-0001-9406-9997)
  • Kamila Słowik, Wroclaw University of Environmental and Life Sciences, Poland
  • Anna Lipowicz, Wroclaw University of Environmental and Life Sciences, Poland (ORCID: 0000-0002-9182-6953)

Abstract

The relationships between human voice parameters and body dimensions have been described previously, but the connections between voice and face geometry remain poorly researched. This study examines 111 adult participants (30 males) to determine the relationships between face dimensions and acoustic parameters in both sexes. Each participant undergoes voice recording, which includes five sustained vowels, along with anthropometric measurements of the neck, head, and face regions. Voice parameters are compared with head, face, and neck dimensions using Pearson’s correlation coefficients (r) and a multiple linear regression model. The results reveal significant relationships between head, neck, and face dimensions and acoustic parameters in both sexes. Males with higher noses, greater head circumferences, and wider faces tend to have lower formants and more stable voices. Females with larger head circumferences tend to have lower formant values, and those with greater neck circumferences tend to have more stable voices. In addition, females with greater nose height tend to have a lower fourth formant (F4), and females with wider faces, noses, and jaws tend to have less rough voices (lower jitter) and a longer maximum phonation time (MPT). These findings may be useful for scientists and law enforcement authorities in creating algorithms that build face models based on voice signals.
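The two statistical tools named in the abstract, Pearson’s r and multiple linear regression, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `pearson_r` and `fit_linear_model`, the variable names, and the synthetic numbers are hypothetical and are not the study’s data, software, or model specification.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length samples."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()  # centre both variables
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def fit_linear_model(X, y):
    """Ordinary least-squares fit of y on the columns of X plus an intercept.

    Returns (coefficients, r_squared); coefficients[0] is the intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = residuals @ residuals
    ss_tot = ((y - y.mean()) ** 2).sum()
    return coef, float(1.0 - ss_res / ss_tot)


# Synthetic, purely illustrative data (NOT measurements from the study).
rng = np.random.default_rng(0)
n = 30
head_circumference_cm = rng.normal(57.0, 1.5, n)
face_width_cm = rng.normal(14.0, 0.6, n)
formant_hz = (
    1500.0
    - 12.0 * head_circumference_cm
    - 8.0 * face_width_cm
    + rng.normal(0.0, 5.0, n)
)

r = pearson_r(head_circumference_cm, formant_hz)
coef, r2 = fit_linear_model(
    np.column_stack([head_circumference_cm, face_width_cm]), formant_hz
)
print(f"r = {r:.2f}, R^2 = {r2:.2f}, coefficients = {np.round(coef, 2)}")
```

In a design like the one described, such an analysis would presumably be run separately for each sex and for each acoustic parameter; the sketch fits a single outcome only.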

Keywords:

biometry, formants, fundamental frequency, pitch, personal identification

