Mohammed Korayem | Researchain

Publication

Featured research published by Mohammed Korayem.

International Conference on Advanced Machine Learning Technologies and Applications | 2012

Subjectivity and Sentiment Analysis of Arabic: A Survey

Mohammed Korayem; David J. Crandall; Muhammad Abdul-Mageed

Subjectivity and sentiment analysis (SSA) has recently gained considerable attention, but most of the resources and systems built so far are tailored to English and other Indo-European languages. The need for designing systems for other languages is increasing, especially as blogging and micro-blogging websites become popular throughout the world. This paper surveys different techniques for SSA for Arabic. After a brief synopsis about Arabic, we describe the main existing techniques and test corpora for Arabic SSA that have been introduced in the literature.

Web Search and Data Mining | 2012

Beyond co-occurrence: discovering and visualizing tag relationships from geo-spatial and temporal similarities

Haipeng Zhang; Mohammed Korayem; Erkang You; David J. Crandall

Studying relationships between keyword tags on social sharing websites has become a popular topic of research, both to improve tag suggestion systems and to discover connections between the concepts that the tags represent. Existing approaches have largely relied on tag co-occurrences. In this paper, we show how to find connections between tags by comparing their distributions over time and space, discovering tags with similar geographic and temporal patterns of use. Geo-spatial, temporal and geo-temporal distributions of tags are extracted and represented as vectors which can then be compared and clustered. Using a dataset of tens of millions of geo-tagged Flickr photos, we show that we can cluster Flickr photo tags based on their geographic and temporal patterns, and we evaluate the results both qualitatively and quantitatively using a panel of human judges. We also develop visualizations of temporal and geographic tag distributions, and show that they help humans recognize semantic relationships between tags. This approach to finding and visualizing similar tags is potentially useful for exploring any data having geographic and temporal annotations.
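The idea of representing each tag's temporal distribution as a vector and comparing vectors can be illustrated with a minimal sketch. It is not the authors' implementation: the monthly binning, the choice of cosine similarity, the toy data, and the names `tag_vector` and `cosine_similarity` are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter
from math import sqrt

def tag_vector(month_indices, n_bins=12):
    """Normalized histogram of a tag's uses over n_bins time bins (assumed binning)."""
    counts = Counter(month_indices)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * n_bins
    return [counts.get(b, 0) / total for b in range(n_bins)]

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical toy data: month index (0 = January) of each photo carrying the tag.
usage = {
    "snow": [0, 0, 1, 11, 11, 0],
    "ski": [0, 1, 1, 11],
    "beach": [5, 6, 6, 7, 7, 6],
}

vectors = {tag: tag_vector(months) for tag, months in usage.items()}
sim_snow_ski = cosine_similarity(vectors["snow"], vectors["ski"])
sim_snow_beach = cosine_similarity(vectors["snow"], vectors["beach"])
```

In this toy data, "snow" and "ski" are both used in winter months and so score as more similar than "snow" and "beach", which share no months. The same comparison could in principle be applied to geographic or combined geo-temporal bins.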

International World Wide Web Conferences | 2012

Mining photo-sharing websites to study ecological phenomena

Haipeng Zhang; Mohammed Korayem; David J. Crandall; Gretchen LeBuhn

The popularity of social media websites like Flickr and Twitter has created enormous collections of user-generated content online. Latent in these content collections are observations of the world: each photo is a visual snapshot of what the world looked like at a particular point in time and space, for example, while each tweet is a textual expression of the state of a person and his or her environment. Aggregating these observations across millions of social sharing users could lead to new techniques for large-scale monitoring of the state of the world and how it is changing over time. In this paper we step towards that goal, showing that by analyzing the tags and image features of geo-tagged, time-stamped photos we can measure and quantify the occurrence of ecological phenomena including ground snow cover, snowfall and vegetation density. We compare several techniques for dealing with the large degree of noise in the dataset, and show how machine learning can be used to reduce errors caused by misleading tags and ambiguous visual content. We evaluate the accuracy of these techniques by comparing them to ground truth data collected both by surface stations and by Earth-observing satellites. Besides the immediate application to ecology, our study gives insight into how to accurately crowd-source other types of information from large, noisy social sharing datasets.

International Conference on Big Data | 2014

Crowdsourced query augmentation through semantic discovery of domain-specific jargon

Khalifeh AlJadda; Mohammed Korayem; Trey Grainger; Chris Russell

Most work in semantic search has thus far focused upon either manually building language-specific taxonomies/ontologies or upon automatic techniques such as clustering or dimensionality reduction to discover latent semantic links within the content that is being searched. The former is very labor-intensive and is hard to maintain, while the latter is prone to noise and may be hard for a human to understand or to interact with directly. We believe that the links between similar users' queries represent a largely untapped source for discovering latent semantic relationships between search terms. The proposed system is capable of mining user search logs to discover semantic relationships between key phrases in a manner that is language-agnostic, human-understandable, and virtually noise-free.

International Conference on Big Data | 2014

PGMHD: A scalable probabilistic graphical model for massive hierarchical data problems

Khalifeh AlJadda; Mohammed Korayem; Camilo Ortiz; Trey Grainger; John A. Miller; William S. York

In the big data era, scalability has become a crucial requirement for any useful computational model. Probabilistic graphical models are very useful for mining and discovering data insights, but they are not scalable enough to be suitable for big data problems. Bayesian networks in particular demonstrate this limitation when their data is represented using few random variables with a massive set of outcome values for each of them. With hierarchical data - data that is arranged in a treelike structure with several levels - one would expect to see hundreds of thousands or millions of values distributed over even just a small number of levels. When modeling this kind of hierarchical data across large data sets, Bayesian networks become unsuitable for representing the probability distributions for the following reasons: i) each level represents a single random variable with hundreds of thousands of values, ii) the number of levels is usually small, so there are also few random variables, and iii) the structure of the network is predefined since the dependency is modeled top-down from each parent to each of its child nodes. In this paper we propose a scalable probabilistic graphical model to overcome these limitations for massive hierarchical data. We believe the proposed model will lead to an easily scalable, more readable, and expressive implementation for problems that require probabilistic solutions for massive amounts of hierarchical data. We successfully applied this model to solve two different challenging probabilistic problems on massive hierarchical data sets for different domains, namely, bioinformatics and latent semantic discovery over search logs.

International Conference on Computer Vision | 2013

Observing the Natural World with Flickr

Jingya Wang; Mohammed Korayem; David J. Crandall

The billions of public photos on online social media sites contain a vast amount of latent visual information about the world. In this paper, we study the feasibility of observing the state of the natural world by recognizing specific types of scenes and objects in large-scale social image collections. More specifically, we study whether we can recreate satellite maps of snowfall by automatically recognizing snowy scenes in geo-tagged, time-stamped images from Flickr. Snow recognition turns out to be a surprisingly difficult and under-studied problem, so we test a variety of modern scene recognition techniques on this problem and introduce a large-scale, realistic dataset of images with ground truth annotations. As an additional proof-of-concept, we test the ability of recognition algorithms to detect a particular species of flower, the California Poppy, which could be used to give biologists a new source of data on its geospatial distribution over time.

International Conference on Machine Learning and Applications | 2012

Learning Visual Features for the Avatar Captcha Recognition Challenge

Mohammed Korayem; Abdallah A. Mohamed; David J. Crandall; Roman V. Yampolskiy

Captchas are frequently used on the modern world wide web to differentiate human users from automated bots by giving tests that are easy for humans to answer but difficult or impossible for algorithms. As artificial intelligence algorithms have improved, new types of Captchas have had to be developed. Recent work has proposed a new system called Avatar Captcha, in which a user is asked to distinguish between facial images of real humans and those of avatars generated by computer graphics. This novel system has been proposed on the assumption that this Captcha is very difficult for computers to break. In this paper we test a variety of modern visual features and learning algorithms on this avatar recognition task. We find that relatively simple techniques can perform very well on this task, and in some cases can even surpass human performance.

ACM Multimedia | 2016

Tracking Natural Events through Social Media and Computer Vision

Jingya Wang; Mohammed Korayem; Saúl A. Blanco; David J. Crandall

Accurate, efficient, global observation of natural events is important for ecologists, meteorologists, governments, and the public. Satellites are effective but limited by their perspective and by atmospheric conditions. Public images on photo-sharing websites could provide crowd-sourced ground data to complement satellites, since photos contain evidence of the state of the natural world. In this work, we test the ability of computer vision to observe natural events in millions of geo-tagged Flickr photos, over nine years and an entire continent. We use satellites as (noisy) ground truth to train two types of classifiers, one that estimates if a Flickr photo has evidence of an event, and one that aggregates these estimates to produce an observation for given times and places. We present a web tool for visualizing the satellite and photo observations, allowing scientists to explore this novel combination of data sources.

Social Network Analysis and Mining | 2016

Sentiment/subjectivity analysis survey for languages other than English

Mohammed Korayem; Khalifeh AlJadda; David J. Crandall

Subjectivity and sentiment analysis have gained considerable attention recently. Most of the resources and systems built so far are for English, and the need for systems for other languages is increasing. This paper surveys approaches to building subjectivity and sentiment analysis systems for languages other than English. These approaches fall into three types. The first (and the best) is language-specific systems. The second involves reusing or transferring sentiment resources from English to the target language. The third is based on language-independent methods. The paper devotes a separate section to Arabic sentiment analysis.

International Conference on Big Data | 2015

Query sense disambiguation leveraging large scale user behavioral data

Mohammed Korayem; Camilo Ortiz; Khalifeh AlJadda; Trey Grainger

Term ambiguity - the challenge of having multiple potential meanings for a keyword or phrase - can be a major problem for search engines. Contextual information is essential for word sense disambiguation, but search queries are often limited to very few keywords, making the available textual context needed for disambiguation minimal or non-existent. In this paper we propose a novel system to identify and resolve term ambiguity in search queries using large-scale user behavioral data. The proposed system demonstrates that, despite the lack of context in most keyword queries, multiple potential senses of a keyword or phrase within a search query can be accurately identified, disambiguated, and expressed in order to maximize the likelihood of fulfilling a user's information need. The proposed system overcomes the immediate lack of context by leveraging large-scale user behavioral data from historical query logs. Unlike traditional word sense disambiguation methods that rely on knowledge sources or available textual corpora, our system is language-agnostic, is able to easily handle domain-specific terms and meanings, and is automatically generated so that it does not grow out of date or require manual updating as ambiguous terms emerge or undergo a shift in meaning. The system has been implemented using the Hadoop ecosystem and integrated within CareerBuilder's semantic search engine.
