Influence of Facial, Head, and Neck Dimensions on Vocal Acoustic Parameters in Polish Speakers
Abstract
The relationships between human voice parameters and body dimensions have been described previously, but the connections between voice and face geometry remain poorly researched. This study aims to determine the relationships between face dimensions and acoustic parameters in both sexes. It examines 111 adult participants (30 males). Each participant undergoes voice recording, which includes five sustained vowels, along with anthropometric measurements of the neck, head, and face regions. Voice parameters are compared with the dimensions of the head, face, and neck regions using Pearson’s correlation coefficients (r) and a multiple linear regression model. The results reveal significant relationships between head, neck, and face dimensions and acoustic parameters in both sexes. Males with higher noses, greater head circumferences, and wider faces tend to have lower formants and more stable voices. Females with larger head circumferences tend to have lower formant values, and those with greater neck circumferences tend to have more stable voices. In addition, females with greater nose height have a lower fourth formant (F4). Moreover, females with wider faces, noses, and jaws tend to have less rough voices (lower jitter) and longer maximum phonation time (MPT). These findings may be useful for scientists and law enforcement authorities in creating algorithms that build face models based on voice signals.
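The two statistical tools named in the abstract, Pearson’s r and multiple linear regression, can be illustrated with a minimal sketch. This is not the authors’ analysis pipeline. The variable names (`head_circ_cm`, `nose_height_mm`, `f4_hz`) and all numeric values are hypothetical, constructed without noise so that the fit is easy to check, and the regression is solved through the normal equations only to keep the example dependency-free.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def solve(a, b):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(predictors, y):
    """Ordinary least squares; returns [intercept, b1, b2, ...]."""
    rows = [[1.0] + list(p) for p in predictors]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical, noise-free illustration data (NOT measurements from the study).
head_circ_cm = [54, 55, 56, 57, 58, 59]
nose_height_mm = [50, 53, 49, 55, 52, 56]
# Constructed so that F4 = 5000 - 20 * head - 10 * nose exactly.
f4_hz = [5000 - 20 * h - 10 * n for h, n in zip(head_circ_cm, nose_height_mm)]

# Bivariate association between one dimension and one acoustic parameter.
r = pearson_r(head_circ_cm, f4_hz)

# Joint model: F4 explained by both dimensions at once.
coefs = fit_ols(list(zip(head_circ_cm, nose_height_mm)), f4_hz)

print("Pearson r (head circumference vs F4):", round(r, 3))
print("OLS coefficients [intercept, head, nose]:", [round(c, 3) for c in coefs])
```

A negative r here mirrors the direction reported in the abstract (larger dimensions, lower formants). The regression coefficients show how each dimension contributes when the other is held constant. With real data, a statistics package that also reports p-values would be the more usual choice.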
Keywords:
biometry, formants, fundamental frequency, pitch, personal identification

