10.1515/aoa-2017-0014
Speaker Model Clustering to Construct Background Models for Speaker Verification
References
Apsingekar V.R., De Leon P.L. (2009), Speaker Model Clustering for Efficient Speaker Identification in Large Population Applications, IEEE Trans. Audio. Speech. Lang. Processing, 17, 848–853.
Auckenthaler R., Mason J.S. (2001), Gaussian selection applied to text-independent speaker verification, Proc. Speaker Odyssey: The Speaker Recognition Workshop, 83–88, Greece.
Beigi H.S.M., Maes S.H., Chaudhari U.V., Sorensen S. (1999), A hierarchical approach to large-scale speaker recognition, European Conference on Speech Communication and Technology, 2203–2206, Hungary.
Bimbot F., Bonastre J.-F., Fredouille C., Gravier G., Magrin-Chagnolleau I., Meignier S., Merlin T., Ortega-Garcia J., Petrovska-Delacretaz D., Reynolds D.A. (2004), A Tutorial on Text-Independent Speaker Verification, EURASIP J. Adv. Signal Process., 2004, 430–451.
Brew A., Cunningham P. (2009), Combining Cohort and UBM Models in Open Set Speaker Identification, Seventh International Workshop on Content-Based Multimedia Indexing, 62–67, Crete.
Brew A., Cunningham P. (2010), Combining cohort and UBM models in open set speaker detection, Multimed. Tools Appl., 48, 141–159.
Campbell J.P. (1997), Speaker recognition: a tutorial, Proc. IEEE, 85, 1437–1462.
Campbell W.M., Sturim D.E., Reynolds D.A., Solomonoff A. (2006), SVM Based Speaker Verification using a GMM Supervector Kernel and NAP Variability Compensation, IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings, I-97–I-100, France.
De Leon P.L., Apsingekar V. (2007), Reducing Speaker Model Search Space in Speaker Identification, Biometrics Symposium, 1–6, USA.
Dehak N., Kenny P.J., Dehak R., Dumouchel P., Ouellet P. (2011), Front-End Factor Analysis for Speaker Verification, IEEE Trans. Audio. Speech. Lang. Processing, 19, 788–798.
Doddington G., Przybocki M., Martin A., Reynolds D. (2000), The NIST speaker recognition evaluation – Overview, methodology, systems, results, perspective, Speech Communication, 31, 225–254.
Gillick L., Cox S. (1989), Some statistical issues in the comparison of speech recognition algorithms, International Conference on Acoustics, Speech, and Signal Processing, 532–535.
Hossa R., Makowski R. (2016), An Effective Speaker Clustering Method using UBM and Ultra-Short Training Utterances, Archives of Acoustics, 41, 107–118.
Kenny P. (2005), Joint factor analysis of speaker and session variability: Theory and algorithms, CRIM, Montr. CRIM-06/08-13, 1–17.
Kenny P., Boulianne G., Ouellet P., Dumouchel P. (2007), Joint Factor Analysis Versus Eigenchannels in Speaker Recognition, IEEE Trans. Audio. Speech. Lang. Processing, 15, 1435–1447.
Kinnunen T., Li H. (2010), An overview of text-independent speaker recognition: From features to supervectors, Speech Communication, 52, 12–40.
McClanahan R.D., De Leon P.L. (2012), Mixture Component Clustering for Efficient Speaker Verification, Interspeech, 1086–1090, USA.
McClanahan R.D., De Leon P.L. (2015), Reducing computation in an i-vector speaker recognition system using a tree-structured universal background model, Speech Communication, 66, 36–46.
McLaren M., Vogt R., Baker B., Sridharan S. (2010), Data-Driven Background Dataset Selection for SVM-Based Speaker Verification, IEEE Trans. Audio. Speech. Lang. Processing, 18, 1496–1506.
Pallet D., Fisher W., Fiscus J. (1990), Tools for the analysis of benchmark speech recognition, International Conference on Acoustics, Speech, and Signal Processing, 97–100.
Reynolds D.A. (1995), Speaker Identification and Verification using Gaussian mixture speaker models, Speech Communication, 17, 91–108.
Reynolds D.A. (1997), Comparison of Background Normalization Methods for Text-Independent Speaker Verification, European Conference on Speech Communication and Technology, Greece.
Reynolds D.A., Quatieri T.F., Dunn R.B. (2000), Speaker Verification Using Adapted Gaussian Mixture Models, Digital Signal Processing, 10, 19–41.
Reynolds D.A., Rose R.C. (1995), Robust text-independent speaker identification using Gaussian mixture speaker models, IEEE Trans. Speech Audio Process., 3, 72–83.
Richardson F., Reynolds D., Dehak N. (2015), Deep Neural Network Approaches to Speaker and Language Recognition, IEEE Signal Processing Letters, 22, 1671–1675.
Sadjadi S.O., Slaney M., Heck L. (2013), MSR Identity Toolbox v1.0: A MATLAB Toolbox for Speaker Recognition Research, Speech and Language Processing Technical Committee Newsletter, IEEE, 1–4.
Saeidi R., Kinnunen T., Mohammadi H.R.S., Rodman R., Franti P. (2010), Joint frame and Gaussian selection for text independent speaker verification, IEEE International Conference on Acoustics, Speech and Signal Processing, 4530–4533, USA.
Xiang B., Berger T. (2003), Efficient text-independent speaker verification with structural Gaussian mixture models and neural network, IEEE Trans. Speech Audio Process., 11, 447–456.
Xiong Z., Zheng T.F., Song Z., Soong F., Wu W. (2006), A tree-based kernel selection approach to efficient Gaussian mixture model–universal background model based speaker identification, Speech Communication, 48, 1273–1282.
Zhu D., Ma B., Li H. (2011), Speaker Verification With Feature-Space MAPLR Parameters, IEEE Trans. Audio. Speech. Lang. Processing, 19, 505–515.