Ontology Features-Based Arabic Text Augmentation Using Word2Vec

Enas Tariq Khudair, REGIM-Lab.: REsearch Groups in Intelligent Machines, University of Sfax, National Engineering School of Sfax (ENETCOM), Sfax, 3038, Tunisia
Onsa Lazzez, REGIM-Lab.: REsearch Groups in Intelligent Machines, University of Sfax, National Engineering School of Sfax (ENETCOM), Sfax, 3038, Tunisia
Mourad Zaied, RTM-LAB, University of Gabes
Tarek M. Hamdani, REGIM-Lab.: REsearch Groups in Intelligent Machines, University of Sfax, National Engineering School of Sfax (ENETCOM), Sfax, 3038, Tunisia AND Higher Institute of Computer Science Mahdia (ISIMa), University of Monastir, Tunisia
Ahmed T. Sadiq, Department of Computer Science, College of Science, University of Technology-Iraq, Baghdad, Iraq
Habib Chabchoub, College of Bu
Adel M. Alimi, REGIM-Lab.: REsearch Groups in Intelligent Machines, University of Sfax, National Engineering School of Sfax (ENETCOM), Sfax, 3038, Tunisia AND Department of Electrical and Electronic Engineering Science, Faculty of Engineering and the Built Environment, University of Johannesburg, Johannesburg 2006, South Africa

Abstract

Text augmentation plays a major role when data is scarce. Few Arabic news texts are available for specific purposes, so there is a pressing need to generate Arabic text, especially news. This paper presents an enhanced approach to Arabic text augmentation based on Arabic ontology features. The system has multiple stages. The first stage uses Arabic parts of speech, particularly adjectives, verbs, and prepositions, together with the ontology properties of those parts, to create new texts. Word2Vector (Word2Vec) plays a pivotal role in assigning Arabic ontology features to a specific Arabic Part of Speech (PoS). Two word relations, synonym and antonym, are used to generate the expanded texts. The results show good Arabic text whose meaning is close to, or opposite to, that of the original text. Real-world applications need sentences that are both similar and opposite to the originals, and our approach meets these requirements to a large extent. We used both the ASTD and AraSarcasm-v2 datasets; ASTD performed better than AraSarcasm-v2, giving the highest percentages when our approach replaced verbs (84%), adjectives (79%), and nouns (73%). With three datasets (LIAR, ASTD, and AraSarcasm-v2), LIAR performed better than ASTD and AraSarcasm-v2, giving the highest percentages when our approach replaced verbs (86%), adjectives (82%), and nouns (77%).

Reason for Expression of Concern: The Editors wish to alert readers to potential concerns regarding the reliability of the findings reported in "Ontology Features-Based Arabic Text Augmentation Using Word2Vec (Manuscript 1302)".

The journal has initiated an additional editorial assessment of the article's methodology, data provenance, and reported outcomes to confirm their reliability and reproducibility.

This notice is issued to ensure transparency while the review is ongoing. The Expression of Concern does not constitute a final determination regarding the validity of the work. The journal will update readers once the assessment is completed and will take any necessary editorial action in accordance with the journal's policies and COPE guidance.

See expression of concern available at:
DOI: https://doi.org/10.52866/2788-7421.1386.
Available at: http://ijcsm.researchcommons.org/ijcsm/vol7/iss1/40

We have found no evidence of misconduct or invalidity in the research. The findings presented in the original article are considered robust and accurate. This notice formally resolves the previously published Expression of Concern. The original article stands as published, and we affirm the integrity of the research. See resolution of expression of concern available at:
DOI: https://doi.org/10.52866/2788-7421.1411
Available at: http://ijcsm.researchcommons.org/ijcsm/vol7/iss2/11
The publication timeline is as follows: Original Article → Expression of Concern → Resolution.

Recommended Citation

Khudair, Enas Tariq; Lazzez, Onsa; Zaied, Mourad; Hamdani, Tarek M.; Sadiq, Ahmed T.; Chabchoub, Habib; and Alimi, Adel M. (2025) "Ontology Features-Based Arabic Text Augmentation Using Word2Vec," Iraqi Journal for Computer Science and Mathematics: Vol. 6: Iss. 3, Article 40.
DOI: https://doi.org/10.52866/2788-7421.1302
Available at: https://ijcsm.researchcommons.org/ijcsm/vol6/iss3/40

Included in

Computer Engineering Commons
