Archives of Acoustics, 36, 4, pp. 695–712, 2011
A New Approach to Parametric Modeling of Glottal Flow
Glottal waveform models have long been employed in improving the quality of
speech synthesis. This paper presents a new approach for modeling the glottal flow.
The model is based on three control volumes that sequentially strike a one-mass, two-spring
system and generate a glottal pulse. The first, second and third control
volumes represent the opening, closing and closed phases of the vocal folds, respectively.
The masses of the three control volumes and the size of the first one are the
four parameters that define the shape, pitch and amplitude of the glottal pulse. The
model may be viewed as a parametric approach governed by second-order differential
equations rather than analytical functions, and it is very flexible for designing a glottal
pulse. Comparison of the glottal pulse generated by the present model with those
generated by the Rosenberg, LF and mucosal wave propagation models shows
that it appropriately represents the opening, closing and closed phases of the vocal
fold oscillation, which supports the validity of our model. The numerical solution of
the present model has been found to be very efficient compared with its analytical
solution and with two other well-known parametric models, Rosenberg++ and LF. The
accuracy of the numerical solution has been illustrated by comparison with the analytical
solution. It has been observed that the accuracy improves as the size of
the first control volume increases and may decrease insignificantly as the mass
of any of the control volumes increases. Two experiments with the present model support
its successful implementation as a voice source in speech synthesis. Thus, our model
presents itself as an efficient, accurate and realistic choice of voice source for
real-time speech production.
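
The abstract does not state the governing equations, so the following is only a generic sketch of the kind of comparison it describes: a numerical solution of a second-order spring-mass equation checked against its closed-form solution. It assumes an undamped mass held between two linear springs and set in motion by an initial velocity; the function names, the velocity Verlet integrator, and all parameter values are illustrative assumptions and are not taken from the paper.

```python
import math


def analytic_displacement(t, mass, k1, k2, v0):
    """Closed-form displacement of an undamped mass between two springs.

    Assumed setup (not from the paper): the mass starts at its rest
    position with initial velocity v0, so x(t) = (v0 / omega) * sin(omega * t).
    """
    omega = math.sqrt((k1 + k2) / mass)
    return (v0 / omega) * math.sin(omega * t)


def numeric_displacement(t_end, steps, mass, k1, k2, v0):
    """Integrate m * x'' = -(k1 + k2) * x with the velocity Verlet scheme."""
    dt = t_end / steps
    k = k1 + k2
    x, v = 0.0, v0
    a = -k * x / mass
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt * dt      # position update
        a_new = -k * x / mass                # acceleration at new position
        v += 0.5 * (a + a_new) * dt          # velocity update
        a = a_new
    return x


def end_point_error(steps, t_end=0.3, mass=1.0, k1=50.0, k2=50.0, v0=1.0):
    """Absolute difference between numerical and analytical x(t_end)."""
    exact = analytic_displacement(t_end, mass, k1, k2, v0)
    approx = numeric_displacement(t_end, steps, mass, k1, k2, v0)
    return abs(approx - exact)


if __name__ == "__main__":
    for steps in (50, 500):
        print(steps, end_point_error(steps))
```

Running the script prints the end-point error for a coarse and a fine time step, which gives a simple way to see how a numerical solution can be validated against an analytical one in a toy setting.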
Keywords:
control volumes; spring-mass system; vocal folds; Rosenberg glottal model; LF glottal model
Copyright © Polish Academy of Sciences & Institute of Fundamental Technological Research (IPPT PAN).