Development of Yoruba Dialects Classification Model for Automatic Speech Recognition Systems Using KNN
This research presents the development of a Yoruba dialect classification model for automatic speech recognition systems (ASRs) using K-Nearest Neighbor (K-NN). Research has shown that ASRs perform better with correct dialect classification. Therefore, a non-parametric (i.e., K-NN) model was developed and implemented on the Matlab 2021 platform to classify three (3) dialects (Ijebu, Ibadan and Ondo) from Ogun, Oyo and Ondo states of Nigeria, respectively. The dialects were recorded in different environments, with different data sizes, in the “opus” file format. They were later converted to “.wav” using the EZ CD Audio Converter software. The Program4Pc Video Converter Pro was used to trim the converted audio waveforms to the same size and convert them to image signals suitable for model training, validation and testing. The results showed that the developed K-NN classifier achieved an average accuracy of 91.11% and a recall (sensitivity) of 86.67%. These results indicate that the model can be used to classify dialects of the same language and hence can help to improve the performance of robust ASR systems. However, for further improvement, classifiers that can better handle large volumes of data should be employed.
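To make the K-NN decision rule and the two reported metrics concrete, the following is a minimal Python sketch, not the authors' Matlab implementation. The 2-D feature vectors, the choice of k = 3, the Euclidean distance, and the 45-sample confusion pattern are all hypothetical assumptions made for illustration; the abstract does not report the features, the value of k, the test-set size, or the confusion matrix. The second part shows one possible reading of the metrics: with three balanced classes, averaging one-vs-rest accuracy and recall over the classes yields an accuracy figure higher than the recall figure, because each misclassification counts as an error for only two of the three classes.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points (Euclidean)."""
    dists = sorted((math.dist(p, x), label) for p, label in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def one_vs_rest_metrics(y_true, y_pred, classes):
    """Return (mean per-class accuracy, mean per-class recall)."""
    n = len(y_true)
    accs, recalls = [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        tn = n - tp - fn - fp
        accs.append((tp + tn) / n)      # one-vs-rest accuracy for class c
        recalls.append(tp / (tp + fn))  # recall (sensitivity) for class c
    return sum(accs) / len(accs), sum(recalls) / len(recalls)


classes = ["Ijebu", "Ibadan", "Ondo"]

# Hypothetical 2-D features standing in for whatever the real pipeline extracts.
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
           (1.0, 1.0), (1.1, 0.9), (0.9, 1.1),
           (2.0, 0.0), (2.1, 0.1), (1.9, 0.2)]
train_y = ["Ijebu"] * 3 + ["Ibadan"] * 3 + ["Ondo"] * 3
predicted = knn_predict(train_X, train_y, (2.0, 0.15), k=3)

# Hypothetical test outcome: 15 samples per dialect, 13 of each labeled
# correctly and 2 confused with the next dialect in the list.
y_true, y_pred = [], []
for i, c in enumerate(classes):
    wrong = classes[(i + 1) % len(classes)]
    y_true += [c] * 15
    y_pred += [c] * 13 + [wrong] * 2

avg_acc, avg_rec = one_vs_rest_metrics(y_true, y_pred, classes)
print(predicted, f"{avg_acc:.2%}", f"{avg_rec:.2%}")
```

Under these assumed counts the averaged one-vs-rest accuracy and recall come out at the same values as the figures quoted in the abstract, which suggests, but does not confirm, that the reported numbers were computed per class and then averaged.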