Abstract

Text augmentation plays a major role when data is scarce. Few Arabic news texts are available for specific purposes, so there is a pressing need to generate Arabic text, especially news. This paper presents an enhanced approach to Arabic text augmentation based on Arabic ontology features. The system has multiple stages. The first stage uses Arabic parts of speech (PoS), particularly adjectives, verbs, and prepositions, together with the ontology properties of those parts, to create new texts. Word2Vec plays a pivotal role in assigning Arabic ontology features to specific Arabic parts of speech. Two lexical relations, synonymy and antonymy, are used to generate the expanded texts. The results show good-quality Arabic text whose meaning is close to, or the opposite of, that of the original text. Real-world applications need sentences that are both similar and opposite in meaning to the originals, and our approach meets these requirements to a large extent. We used three datasets: LIAR, ASTD, and AraSarcasm-v2. Between ASTD and AraSarcasm-v2, ASTD gave the higher percentages, especially when our approach replaced verbs (84%), adjectives (79%), and nouns (73%). Across all three datasets, LIAR performed better than both ASTD and AraSarcasm-v2, giving the highest percentages, especially when our approach replaced verbs (86%), adjectives (82%), and nouns (77%).
