Archives of Acoustics, 44, 1, pp. 13–26, 2019
10.24425/aoa.2019.126348

Perceptual Identification of Polish Vowels Due to F0 Changes

Mariusz Stanisław OWSIANNY
1) Adam Mickiewicz University in Poznań 2) Poznań Supercomputing and Networking Center
Poland

The paper investigates how the perceptual identification of the vocalic quality of six isolated Polish vowels, traditionally defined by the spectral envelope, depends on the fundamental frequency F0. The stimuli in the listening experiments were natural female and male voices whose F0 values were shifted within a range of ±1 octave. The results were then compared with those of experiments on fully synthetic voices. Despite differences in how the stimuli were generated and in their technical quality, the results were consistent. They confirm earlier findings that the perceptual identification of vowels depends not only on the position of the formants on the F1 × F2 plane but also on their relationship to F0, on the relation between the formants and the harmonics, and on other factors. The paper presents, in quantitative terms, all possible kinds of perceptual shifts of Polish vowels from one phonetic category to another as a function of voice pitch. An additional perceptual experiment examined a broader range of F0 changes and their impact on the identification of vowels in CVC (consonant-vowel-consonant) structures. A mismatch between the formants and the value of the glottal tone can lead to a change in phonetic category.
Keywords: F0; formants; speech perception; vowel shifts; voice quality
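
The relation between formants and harmonics mentioned in the abstract can be illustrated with a minimal sketch. It relies only on two general facts: an octave shift multiplies F0 by 2, and harmonics lie at integer multiples of F0. The function names `shift_f0` and `nearest_harmonic`, as well as the example values of 200 Hz for F0 and 700 Hz for F1, are assumptions chosen for illustration; they are not taken from the paper and do not reproduce its stimulus-generation method.

```python
def shift_f0(f0_hz, octaves):
    """Return F0 shifted by a (possibly fractional) number of octaves."""
    return f0_hz * 2.0 ** octaves


def nearest_harmonic(f0_hz, formant_hz):
    """Return (harmonic number, harmonic frequency, offset from the formant in Hz)."""
    # Harmonics lie at integer multiples of F0; the lowest one is F0 itself.
    n = max(1, round(formant_hz / f0_hz))
    harmonic_hz = n * f0_hz
    return n, harmonic_hz, harmonic_hz - formant_hz


if __name__ == "__main__":
    base_f0 = 200.0  # hypothetical voice pitch in Hz (assumption)
    f1 = 700.0       # hypothetical first-formant frequency in Hz (assumption)
    for octaves in (-1.0, -0.5, 0.0, 0.5, 1.0):
        f0 = shift_f0(base_f0, octaves)
        n, harmonic, offset = nearest_harmonic(f0, f1)
        print(f"{octaves:+.1f} oct: F0 = {f0:6.1f} Hz, "
              f"harmonic {n} at {harmonic:6.1f} Hz, offset {offset:+7.1f} Hz")
```

With the formant held fixed, raising F0 spaces the harmonics further apart, so the harmonic closest to the formant can end up a considerable distance from it. This is one simple way to picture the kind of formant-to-F0 mismatch the abstract refers to.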


Copyright © Polish Academy of Sciences & Institute of Fundamental Technological Research (IPPT PAN)