Karim Bouzoubaa | Researchain

Publication

Featured research published by Karim Bouzoubaa.

conference of the european chapter of the association for computational linguistics | 2009

Structure-Based Evaluation of an Arabic Semantic Query Expansion Using the JIRS Passage Retrieval System

Lahsen Abouenour; Karim Bouzoubaa; Paolo Rosso

The adoption of semantic Query Expansion (QE) could be useful in the context of Question/Answering (Q/A) systems. For the Arabic language this is a challenging task since it has many particularities (short vowels, absence of capital letters, complex morphology, etc.). This paper presents an evaluation of a proposed semantic QE based on Arabic WordNet (AWN). Two types of experiments are conducted: the keyword-based evaluation which uses a classical search engine as passage retrieval system, and the structure-based evaluation that uses the Java Information Retrieval System (JIRS) which takes into account the structure of the question. Results show that the best performances in terms of accuracy and Mean Reciprocal Rank are reached when the proposed semantic QE together with JIRS are used.
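As an illustration of the Mean Reciprocal Rank measure mentioned in the abstract above, the following is a minimal sketch, not the authors' evaluation code. It assumes each question has a ranked list of candidate passages and a set of acceptable answers; the function name and the data layout are hypothetical.

```python
def mean_reciprocal_rank(ranked_results, gold_answers):
    """Compute MRR over a set of questions.

    ranked_results: one list of candidates per question, best first (assumed layout).
    gold_answers:   one set of acceptable candidates per question (assumed layout).
    """
    if not ranked_results:
        return 0.0
    total = 0.0
    for candidates, gold in zip(ranked_results, gold_answers):
        # The reciprocal rank is 1/rank of the first correct candidate,
        # or 0 when no correct candidate is returned for the question.
        for rank, candidate in enumerate(candidates, start=1):
            if candidate in gold:
                total += 1.0 / rank
                break
    return total / len(ranked_results)


# Three toy questions: correct at rank 1, correct at rank 3, never correct.
ranked = [["a", "b"], ["x", "y", "z"], ["p"]]
gold = [{"a"}, {"z"}, {"q"}]
mrr = mean_reciprocal_rank(ranked, gold)  # (1 + 1/3 + 0) / 3
```

A question whose correct passage never appears contributes zero, so MRR rewards systems that place a correct passage near the top of the ranking.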

2013 ACS International Conference on Computer Systems and Applications (AICCSA) | 2013

Bootstrapping a WordNet for an Arabic dialect from other WordNets and dictionary resources

Violetta Cavalli-Sforza; Hind Saddiki; Karim Bouzoubaa; Lahsen Abouenour; Mohamed Maamouri; Emily Goshey

We describe an experiment in developing a first version of WordNet for Iraqi Arabic starting from Arabic WordNet (for Modern Standard Arabic), Princeton WordNet (for English) and a bidirectional English-Iraqi Arabic dictionary. The resulting initial version of the target WordNet so-constructed was made available to human experts in Iraqi Arabic for correction and evaluation.

acs/ieee international conference on computer systems and applications | 2015

Text readability for Arabic as a foreign language

Hind Saddiki; Karim Bouzoubaa; Violetta Cavalli-Sforza

In this study, we evaluate the informativeness of lexical, morphological and semantic features in determining the readability of texts geared towards learners of Arabic as a foreign language. We have gathered low-complexity features with the purpose of establishing a baseline for future research in readability assessment, using freely available natural language processing (NLP) and machine learning (ML) tools on a publicly accessible corpus. We tested common classification algorithms, as well as random forests (an ensemble learning method), and report on their results using several evaluation measures for comparability with similar work. Our results suggest that a small set of easily computed features can be indicative of the reading level of a text. Moreover, our findings will serve as a common ground, for ourselves and others, to evaluate and compare the performance of more elaborate techniques and feature sets.

Proceedings of the International Conference on Learning and Optimization Algorithms: Theory and Applications | 2018

The Development of a Standard Morpho-Syntactic Lexicon for Arabic NLP

Abdelhamid El Jihad; Driss Namly; Fettah Hamdani; Karim Bouzoubaa

In this paper, we present a linguistic resource developed at our institute which will soon be available as open source. ALIF (Arabic Lexicon Inflected Forms) is a morpho-syntactic lexicon of the inflected forms of the Arabic language in which each inflected form is associated with morpho-syntactic information (lemma, root, POS, type, prefix, and suffix). This lexicon is intended for use in Arabic natural language processing applications, in particular analyzers, spell-checkers, and POS taggers.

Archive | 2018

A Survey and Comparative Study of Arabic NLP Architectures

Younes Jaafar; Karim Bouzoubaa

Arabic Natural Language Processing (ANLP) has made significant progress in recent years. As a result, several ANLP tools and applications have been developed, such as tokenizers, part-of-speech taggers, morphological analyzers, syntactic parsers, etc. However, most of these tools are heterogeneous and can hardly be reused in other projects without modifying their source code. This problem is common to all languages, which is why some advanced language-independent NLP architectures have emerged, such as GATE (Cunningham et al. ACL, 2002) [1] and UIMA (Apache UIMA Manuals and Guides, 2015) [2]. These architectures have significantly changed the way NLP applications are designed and developed. They provide homogeneous structures for applications, better reusability and faster deployment. In this article, we present a comparative study of NLP architectures in order to determine which ones can suitably deal with the Arabic language and its specificities.

Procedia Computer Science | 2017

Enhancing Visualization in Readability Reports for Arabic Texts

Hind Saddiki; Violetta Cavalli-Sforza; Karim Bouzoubaa

Readability assessment for Arabic is still largely underserved in both research and software development. We believe that improved usability of the few tools currently released will motivate a greater user-base, and in doing so garner more interest in this topic from the research community. With that in mind, we examine recently developed readability tools with a graphical component, formulate recommendations, and propose visual enhancements to the way readability scores are reported to improve usability and informativeness.

International Conference on Arabic Language Processing | 2017

A New Tool for Benchmarking and Assessing Arabic Syntactic Parsers

Younes Jaafar; Karim Bouzoubaa

This work aims to develop a Natural Language Processing (NLP) tool for benchmarking and assessing Arabic syntactic parsers. This tool is integrated within the Software Architecture For Arabic language pRocessing (SAFAR), which contains several ANLP tools ranging from simple preprocessing up to the semantic level. The benchmarking tool takes advantage of the available basic tools as well as the flexibility and reusability of SAFAR. The benchmark process takes as input an evaluation corpus and one or more syntactic parser implementations. It outputs the most common evaluation metrics, namely precision, recall, accuracy and F-measure. We also introduce a new metric called Gp-score, which takes into account the execution time in addition to the accuracy. Execution time is crucial for some tasks, such as real-time automatic translation or the processing of huge amounts of data. This benchmarking solution will help researchers compare their parsers against each other; it will also help other researchers select the appropriate parser to use within their high-level projects. Two Arabic syntactic parsers are evaluated to give a concrete example of this tool: the Stanford parser and the ATKS parser.
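To make the standard metrics named in the abstract above concrete, here is a minimal sketch of precision, recall and F-measure. It assumes that a parse is represented as a set of labeled constituent spans of the form `(label, start, end)`; that representation and the function name are illustrative assumptions, not the SAFAR API. The Gp-score is not defined in the abstract, so it is not reproduced here.

```python
def precision_recall_f1(predicted, gold):
    """Score predicted items against gold items (both treated as sets)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of predicted items that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold items that were found.
    recall = true_positives / len(gold) if gold else 0.0
    # F-measure (F1): harmonic mean of precision and recall.
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical labeled spans (label, start, end) for one sentence.
gold_spans = {("NP", 0, 2), ("VP", 2, 5), ("S", 0, 5)}
predicted_spans = {("NP", 0, 2), ("VP", 3, 5), ("S", 0, 5), ("PP", 3, 5)}
p, r, f1 = precision_recall_f1(predicted_spans, gold_spans)
# Two of four predicted spans are correct, and two of three gold spans are found.
```

Treating the spans as sets keeps the sketch short; a real benchmark would aggregate the counts over the whole evaluation corpus before computing the ratios.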

acs/ieee international conference on computer systems and applications | 2016

Interoperable Arabic language resources building and exploitation in SAFAR platform

Driss Namly; Yasser Regragui; Karim Bouzoubaa

This article presents the logic followed by our team in building Arabic language resources. Our approach consists of delivering useful resources to the community that respect interoperability rules based on the Lexical Markup Framework standard, include linguistic features gathered by a team of linguists, and are easily usable through three methods: direct exploitation of the resource files, use of the SAFAR resources API, or online browsing provided by the SAFAR web.

international conference on conceptual structures | 2011

Integration of the controlled language ACE to the Amine platform

Mohammed Nasri; Adil Kabbaj; Karim Bouzoubaa

This paper presents the integration of the controlled language ACE (Attempto Controlled English) into the Amine platform. Since the parser engine of ACE (ACE Parser Engine or APE) generates a DRS (Discourse Representation Structure), we have developed a mapping from DRS to CG (DRS2CG) which produces a CG equivalent to the DRS produced by APE. Through this mapping and this integration of ACE into Amine, Amine users can use controlled language to express their knowledge or specifications, instead of having to express them in CG directly.

language resources and evaluation | 2013