Abstract

Speech recognition applications have multiplied and matured as a result of the rapid growth of artificial intelligence, particularly machine learning, and they now play a crucial role in many aspects of daily life, such as human-computer interaction and natural language processing. The complexity and diversity of speech signals pose challenges to maximizing the accuracy and efficiency of speech recognition systems. Hyperparameter tuning is a crucial step in machine learning that optimizes performance and generalization by determining the best values for a model's hyperparameters. This paper employs the recently developed WAR Strategy optimization algorithm to optimize the features extracted from the speech signal and to tune the hyperparameters of typical machine learning models for accurate and rapid speech recognition. Two types of features are extracted from the speech signal: spectral features, obtained with the Mel-Frequency Cepstral Coefficients (MFCCs) technique, and statistical features. These features are then optimized using the WAR Strategy optimization algorithm to obtain the optimal feature set describing the important information in the speech signal. Finally, the hyperparameters of six classical machine learning models are tuned so that the models serve as newly designed classifiers in the final classification phase of the proposed system. Speech datasets in three different languages (English, Arabic, and Malaysian) are used to evaluate the proposed system and to demonstrate its high generalization ability. The obtained recognition accuracy, ranging from 98.38% to 100% with training times between 0.001 and 19.8 seconds, demonstrates the effectiveness of the proposed speech recognition system in overcoming the many obstacles to speech recognition while achieving high accuracy, low resource requirements, and minimal training time.
