Rani Qumsiyeh
Brigham Young University
Publication
Featured research published by Rani Qumsiyeh.
international conference on tools with artificial intelligence | 2011
Rani Qumsiyeh; Yiu-Kai Ng
Reading is an integral part of educational development; however, it is frustrating for people who struggle to understand (or are not motivated to read) text documents that are beyond (or below) their readability levels. Finding appropriate reading materials, with or without first scanning through their contents, is a challenge, since a tremendous number of documents are available these days and a clear majority of them are not tagged with their readability levels. Even though existing readability assessment tools determine readability levels of text documents, they analyze solely the lexical, syntactic, and/or semantic properties of a document, which are neither fully automated, generalized, nor well-defined and are mostly based on observations. To advance the current readability analysis technique, we propose a robust, fully automated readability analyzer, denoted ReadAid, which employs support vector machines to combine features from the US Curriculum and College Board, traditional readability measures, and the author(s) and subject area(s) of a text document d to assess the readability level of d. ReadAid can be applied for (i) filtering documents (retrieved in response to a web query) of a particular readability level, (ii) determining the readability levels of digitized text documents, such as book chapters, magazine articles, and news stories, or (iii) dynamically analyzing, in real time, the grade level of a text document being created. The novelty of ReadAid lies in using authorship, subject areas, and academic concepts and grammatical constructions extracted from the US Curriculum to determine the readability level of a text document. Experimental results show that ReadAid is highly effective and outperforms existing state-of-the-art readability assessment tools.
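The abstract lists "traditional readability measures" among the features ReadAid combines, without naming them. As an illustration only, the sketch below computes one well-known measure of that kind, the Flesch-Kincaid grade level. The function names, the vowel-group syllable heuristic, and the choice of this particular measure are assumptions for the example, not details taken from the paper.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels (heuristic)."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing silent 'e' as non-syllabic, except in '-le' endings.
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade: 0.39 * words/sentence + 11.8 * syllables/word - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    simple = "The cat sat on the mat."
    dense = "Comprehensive readability assessment necessitates sophisticated computational methodologies."
    print(round(flesch_kincaid_grade(simple), 2))
    print(round(flesch_kincaid_grade(dense), 2))
```

A score like this would be just one numeric input among many; the abstract describes ReadAid as combining such measures with curriculum, authorship, and subject-area features through support vector machines.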
Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) | 2013
Rani Qumsiyeh; Yiu-Kai Ng
Current web search engines, such as Google, Bing, and Yahoo!, rank the set of documents S retrieved in response to a user query and display the URL of each document D in S with a title and a snippet, which serves as an abstract of D. Snippets, however, are not as useful as intended: they are supposed to help users quickly identify results of interest, if they exist. These snippets fail to (i) provide distinct information and (ii) capture the main contents of the corresponding documents. Moreover, when the intended information need specified in a search query is ambiguous, it is very difficult, if not impossible, for a search engine to identify precisely the set of documents that satisfy the user's intended request without requiring additional inputs. Furthermore, a document title is not always a good indicator of the content of the corresponding document. All of these design problems can be solved by our proposed query-based cluster and labeler, called QCL. QCL generates concise clusters of documents covering various subject areas retrieved in response to a user query, which saves the user time and effort in searching for specific information of interest without having to browse through the documents one by one. Experimental results show that QCL is effective and efficient in generating high-quality clusters of documents on specific topics with informative labels.
Information Systems | 2013
Maria Soledad Pera; Rani Qumsiyeh; Yiu-Kai Ng
Taking advantage of the popularity of the web, online marketplaces such as Ebay(.com), advertisement (ads for short) websites such as Craigslist(.org), and commercial websites such as Carmax(.com) allow users to post ads on a variety of products and services. Instead of browsing through numerous websites to locate ads of interest, web users would benefit from the existence of a single, fully integrated database (DB) with ads in multiple domains, such as Cars-for-Sale and Job-Postings, populated from various online sources so that ads of interest could be retrieved at a centralized site. Since existing ads websites impose their own structures and formats for storing and accessing ads, generating a uniform, integrated ads repository is not a trivial task. The challenges include (i) identifying ads domains, (ii) dealing with the diversity in structures of ads in various ads domains, and (iii) analyzing data with different meanings in each ads domain. To handle these problems, we introduce ADEx, a tool that relies on various machine learning approaches to automate the process of extracting (un-/semi-/fully-structured) data from online ads to create ads records archived in an underlying DB through domain classification, keyword tagging, and identification of valid attribute values. Experimental results generated using a dataset of 18,000 online ads originating from Craigslist, Ebay, and KSL(.com) show that ADEx is superior in performance compared with existing text classification, keyword labeling, and data extraction approaches. Further evaluations verify that ADEx either outperforms or performs at least as well as current state-of-the-art information extractors in mapping data from unstructured or (semi-)structured sources into DB records.
web information systems engineering | 2010
Maria Soledad Pera; Rani Qumsiyeh; Yiu-Kai Ng
These days web users searching for opinions expressed by others on a particular product or service PS can turn to review repositories, such as Epinions.com or Imdb.com. While these repositories often provide a high quantity of reviews on PS, browsing through archived reviews to locate different opinions expressed on PS is a time-consuming and tedious task, and in most cases, a very labor-intensive process. To simplify the task of identifying reviews expressing positive, negative, and neutral opinions on PS, we introduce a simple, yet effective sentiment classifier, denoted SentiClass, which categorizes reviews on PS using the semantic, syntactic, and sentiment content of the reviews. To speed up the classification process, SentiClass summarizes each review to be classified using eSummar, a single-document, extractive, sentiment summarizer proposed in this paper, based on various sentence scores and anaphora resolution. SentiClass (eSummar, respectively) is domain and structure independent and does not require any training for performing the classification (summarization, respectively) task. Empirical studies conducted on two widely-used datasets, Movie Reviews and Game Reviews, in addition to a collection of Epinions.com reviews, show that SentiClass (i) is highly accurate in classifying summarized or full reviews and (ii) outperforms well-known classifiers in categorizing reviews.
International Journal of Web Information Systems | 2016
Rani Qumsiyeh; Yiu-Kai Ng
Purpose: The purpose of this paper is to introduce a summarization method to enhance the current web-search approaches by offering a summary of each clustered set of web-search results with contents addressing the same topic, which should allow the user to quickly identify the information covered in the clustered search results. Web search engines, such as Google, Bing and Yahoo!, rank the set of documents S retrieved in response to a user query and represent each document D in S using a title and a snippet, which serves as an abstract of D. Snippets, however, are not as useful as intended, i.e. in helping users quickly identify results of interest. These snippets are inadequate in providing distinct information and capturing the main contents of the corresponding documents. Moreover, when the intended information need specified in a search query is ambiguous, it is very difficult, if not impossible, for a search engine to identify precisely the set of documents that satisfy the user’s intended request without requiring additional information. Furthermore, a document title is not always a good indicator of the content of the corresponding document either.
Design/methodology/approach: The authors propose to develop a query-based summarizer, called QSum, to solve the existing problems of Web search engines, which use titles and abstracts to capture the contents of retrieved documents. QSum generates a concise/comprehensive summary for each cluster of documents retrieved in response to a user query, which saves the user’s time and effort in searching for specific information of interest by skipping the step of browsing through the retrieved documents one by one.
Findings: Experimental results show that QSum is effective and efficient in creating a high-quality summary for each cluster to enhance Web search.
Originality/value: The proposed query-based summarizer, QSum, is unique in its searching approach. QSum is also a significant contribution to the Web search community, as it handles the problem of ambiguous search queries by creating summaries in response to different interpretations of the search, which offer a “road map” to help users quickly identify information of interest.
World Wide Web | 2014
Rani Qumsiyeh; Yiu-Kai Ng
One of the useful tools offered by existing web search engines is query suggestion (QS), which assists users in formulating keyword queries by suggesting keywords that are unfamiliar to users, offering alternative queries that deviate from the original ones, and even correcting spelling errors. The design goal of QS is to enrich the web search experience of users and spare them the frustrating process of choosing controlled keywords to specify their special information needs, which relieves them of the burden of creating web queries. Unfortunately, the algorithms or design methodologies of the QS module developed by Google, the most popular web search engine these days, are not publicly available, which means that they cannot be duplicated by software developers to build the tool for specifically designed software systems for enterprise search, desktop search, or vertical search, to name a few. Keywords suggested by Yahoo! and Bing, two other well-known web search engines, however, are mostly popular, currently searched words, which might not meet the specific information needs of the users. These problems can be solved by WebQS, our proposed web QS approach, which provides the same mechanism offered by Google, Yahoo!, and Bing to support users in formulating keyword queries that improve the precision and recall of search results. WebQS relies on frequency of occurrence, keyword similarity measures, and modification patterns of queries in user query logs, which capture information on millions of searches conducted by millions of users, to suggest useful queries/query keywords during the user query construction process and achieve the design goal of QS. Experimental results show that WebQS performs as well as Yahoo! and Bing in terms of effectiveness and efficiency and is comparable to Google in terms of query suggestion time.
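To make the "frequency of occurrence" signal mentioned above concrete, here is a minimal sketch that ranks logged queries extending what the user has typed so far by how often they appear in a query log. It is not WebQS's algorithm: it omits the keyword similarity measures and query modification patterns entirely, and the function name `suggest_queries`, the prefix-matching rule, and the sample log are all hypothetical.

```python
from collections import Counter


def suggest_queries(prefix: str, query_log: list[str], k: int = 3) -> list[str]:
    """Return up to k logged queries that extend `prefix`, most frequent first."""
    # Normalize case and whitespace so equivalent queries are counted together.
    typed = " ".join(prefix.lower().split())
    counts = Counter(" ".join(q.lower().split()) for q in query_log)
    candidates = [
        (query, freq)
        for query, freq in counts.items()
        if query.startswith(typed) and query != typed
    ]
    # Sort by descending frequency, breaking ties alphabetically for stable output.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [query for query, _ in candidates[:k]]


if __name__ == "__main__":
    # Hypothetical query log for illustration only.
    log = [
        "web search",
        "web search",
        "web services",
        "weather today",
        "Web Search engines",
    ]
    print(suggest_queries("web s", log, k=2))
```

A fuller system along the lines the abstract describes would presumably merge this frequency score with similarity and modification-pattern evidence before ranking; how those signals are weighted is not stated in the abstract.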
Knowledge and Information Systems | 2016
Rani Qumsiyeh; Yiu-Kai Ng
Current web search engines, such as Google, Bing, and Yahoo!, rank the set of documents SD retrieved in response to a user query and display each document D in SD with a title and a snippet, which serves as an abstract of D. Snippets, however, are not as useful as intended, i.e., in helping users quickly identify results of interest, if they exist. These snippets are inadequate in providing distinct information and capturing the main contents of the corresponding documents. Moreover, when the intended information need specified in a search query is ambiguous, it is very difficult, if not impossible, for a search engine to identify precisely the set of documents that satisfy the user’s intended request without requiring additional inputs. Furthermore, a document title is not always a good indicator of the content of the corresponding document. All of these design problems can be solved by our proposed query-based cluster and summarizer, called
web intelligence | 2015
Rani Qumsiyeh; Yiu-Kai Ng
international conference on computational science and its applications | 2015
Rani Qumsiyeh; Yiu-Kai Ng
Q_{Sum}
international acm sigir conference on research and development in information retrieval | 2012
Rani Qumsiyeh; Yiu-Kai Ng