Sameena Shah
Thomson Reuters
Publication
Featured research published by Sameena Shah.
Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) | 2013
Masoud Makrehchi; Sameena Shah; Wenhui Liao
We propose a novel approach to labeling social media text using significant stock market events (big losses or gains). Since stock events are easily quantifiable using returns from indices or individual stocks, they provide meaningful and automated labels. We extract significant stock movements and collect the corresponding pre-event, post-event and contemporaneous text from social media sources (for example, tweets from Twitter). Subsequently, we assign the respective label (positive or negative) to each tweet. We train a model on this collected set and predict labels for future tweets. We aggregate the net sentiment for each day (amongst other metrics) and show that it holds significant predictive power for subsequent stock market movement. We create successful trading strategies based on this system and find significant returns over other baseline methods.
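The labeling step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it uses only same-day (contemporaneous) tweets, and the function names, the 2% threshold and the sample data are all assumptions made up for the example.

```python
from datetime import date


def label_tweets(tweets, daily_returns, threshold=0.02):
    """Label each tweet by the sign of a significant same-day market move.

    tweets: list of (date, text) pairs.
    daily_returns: dict mapping date -> index return for that day.
    threshold: hypothetical cutoff for a "significant" move (2% here).
    """
    labeled = []
    for day, text in tweets:
        ret = daily_returns.get(day)
        if ret is None or abs(ret) < threshold:
            continue  # no significant event that day, so the tweet gets no label
        labeled.append((day, text, "positive" if ret > 0 else "negative"))
    return labeled


def net_sentiment(labels):
    """Net sentiment in [-1, 1]: (positives - negatives) / number of labels."""
    if not labels:
        return 0.0
    pos = sum(1 for label in labels if label == "positive")
    neg = sum(1 for label in labels if label == "negative")
    return (pos - neg) / len(labels)


# Made-up illustrative data; none of these values come from the paper.
tweets = [
    (date(2013, 1, 2), "markets rally on budget deal"),
    (date(2013, 1, 3), "quiet session today"),
    (date(2013, 1, 4), "sell-off deepens"),
]
daily_returns = {
    date(2013, 1, 2): 0.025,
    date(2013, 1, 3): 0.001,
    date(2013, 1, 4): -0.031,
}
labeled = label_tweets(tweets, daily_returns)
```

In a fuller version, the labeled tweets would train a text classifier, and `net_sentiment` would be applied to that classifier's predictions for each day rather than to the event-derived labels.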
Swarm Intelligence | 2013
Jayadeva; Sameena Shah; Amit Bhaya; Ravi Kothari; Suresh Chandra
In the most basic application of Ant Colony Optimization (ACO), a set of artificial ants find the shortest path between a source and a destination. Ants deposit pheromone on paths they take, preferring paths that have more pheromone on them. Since shorter paths are traversed faster, more pheromone accumulates on them in a given time, attracting more ants and leading to reinforcement of the pheromone trail on shorter paths. This is a positive feedback process that can also cause trails to persist on longer paths, even when a shorter path becomes available. To counteract this persistence on a longer path, ACO algorithms employ remedial measures, such as using negative feedback in the form of uniform evaporation on all paths. Obtaining high performance in ACO algorithms typically requires fine tuning several parameters that govern pheromone deposition and removal. This paper proposes a new ACO algorithm, called EigenAnt, for finding the shortest path between a source and a destination, based on selective pheromone removal that occurs only on the path that is actually chosen for each trip. We prove that the shortest path is the only stable equilibrium for EigenAnt, which means that it is maintained for arbitrary initial pheromone concentrations on paths, and even when path lengths change with time. The EigenAnt algorithm uses only two parameters and does not require them to be finely tuned. Simulations that illustrate these properties are provided.
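To make the idea of selective pheromone removal concrete, here is a small sketch for parallel paths between one source and one destination. The per-trip rule in the docstring is an assumption chosen to match the abstract's description (removal and deposit only on the chosen path, two parameters); it is not quoted from the paper. The code iterates the averaged (expected) update instead of simulating individual random ants, so the run is deterministic. The names `alpha`, `beta` and `eigenant_mean_field` are illustrative.

```python
def eigenant_mean_field(lengths, pheromone, alpha=0.1, beta=1.0, steps=5000):
    """Iterate the expected pheromone update for a set of parallel paths.

    Assumed per-trip rule (not quoted from the paper): an ant picks path i
    with probability q_i = p_i / sum(p), and only that path is updated:
        p_i <- (1 - alpha) * p_i + beta * q_i / length_i
    Averaging over the ant's random choice gives the update used below.
    """
    p = list(pheromone)
    for _ in range(steps):
        total = sum(p)
        q = [x / total for x in p]  # probability of choosing each path
        p = [
            x + q_i * (-alpha * x + beta * q_i / length)
            for x, q_i, length in zip(p, q, lengths)
        ]
    return p


# Two paths of length 1 and 2; the initial pheromone favours the longer one.
final = eigenant_mean_field(lengths=[1.0, 2.0], pheromone=[1.0, 5.0])
share_short = final[0] / sum(final)
```

Under this assumed rule, the shorter path ends up holding most of the pheromone even though the run starts biased toward the longer one, which is the behaviour the abstract attributes to the stable equilibrium.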
international conference on social computing | 2014
Wenhui Liao; Sameena Shah; Masoud Makrehchi
We propose a novel yet simple method for creating a stock market trading strategy by following successful stock market experts on social media. The problem of “how and where to invest” is translated into “who to follow in my investment”. In other words, the search for a stock market investment strategy is converted into a search for stock market experts. Fortunately, many stock market experts are active on social media and openly express their opinions about the market. By analyzing their behavior, mining their opinions and suggested actions on Twitter, and simulating their recommendations, we are able to score each expert based on his/her performance. Using this scoring system, the experts with the most successful trades are recommended. The main objective of this research is to identify traders who have historically outperformed the market, and to aggregate the opinions of such traders to recommend trades.
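The scoring idea in this abstract can be sketched as below. This is a hedged illustration only: the abstract does not give the scoring formula, so the average signed return used here, the function names and the sample history are all assumptions.

```python
def score_expert(calls):
    """Average simulated return of one expert's calls.

    calls: list of (action, subsequent_return) pairs, where action is "buy"
    or "sell" and subsequent_return is the stock's return over a hypothetical
    holding period after the call. A "sell" call profits when the price falls.
    """
    if not calls:
        return 0.0
    signed = [r if action == "buy" else -r for action, r in calls]
    return sum(signed) / len(signed)


def recommend_experts(history, market_return, top_k=3):
    """Return up to top_k experts whose score beats the market, best first."""
    scores = {name: score_expert(calls) for name, calls in history.items()}
    beating = [(name, s) for name, s in scores.items() if s > market_return]
    beating.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in beating[:top_k]]


# Made-up history for illustration; not data from the paper.
history = {
    "expert_a": [("buy", 0.04), ("sell", -0.02)],
    "expert_b": [("buy", -0.03), ("buy", 0.01)],
    "expert_c": [("sell", 0.02)],
}
recommended = recommend_experts(history, market_return=0.01)
```

Only experts whose simulated performance exceeds the market benchmark are kept, mirroring the stated objective of following traders who have historically outperformed the market.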
conference on information and knowledge management | 2016
Xiaomo Liu; Quanzhi Li; Armineh Nourbakhsh; Rui Fang; Merine Thomas; Kajsa Anderson; Russ Kociuba; Mark Vedder; Steven Pomerville; Ramdev Wudali; Robert Martin; John Duprey; Arun Vachher; William M. Keenan; Sameena Shah
News professionals face the challenge of discovering news from increasingly diverse and unreliable information in the age of social media. More and more news events break on social media first and are picked up by news media subsequently. The recent Brussels attack is one such example. At Reuters, a global news agency, we have observed the need for a more effective tool that helps our journalists quickly discover news on social media, verify it and then inform the public. In this paper, we describe Reuters Tracer, a system for sifting through the noise to detect news events on Twitter and assess their veracity. We disclose the architecture of our system and discuss the various design strategies that facilitate the implementation of machine learning models for noise filtering and event detection. These techniques have been implemented at large scale and have successfully discovered breaking news faster than traditional journalism.
Information Sciences | 2013
Sameena Shah; Ravi Kothari
Load balancers distribute workload across multiple nodes based on a variation of the round robin algorithm, or a more complex algorithm that optimizes a specified objective or allows for horizontal scalability and higher availability. In this paper, we investigate whether robust load balancing can be achieved using a local co-operative mechanism between the resources (nodes). The local aspect of the mechanism implies that each node interacts only with the small subset of nodes that defines its neighborhood. The co-operative aspect of the mechanism implies that a node may offload some of its load to neighbor nodes that have a lower load, or accept jobs from neighbor nodes that have a higher load. Each node is thus only aware of the state of its neighboring nodes, and there is no central entity that knows the state of all the nodes. We model the overall mechanism of load balancing based on local interactions as a congestion game and show that convergence to the Nash equilibrium is possible using only local interactions. We derive worst-case bounds on the number of transfers (time) required to achieve global load balancing under this setup. We also include simulation results to demonstrate emergent global load balancing based only on local interactions and local information.
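A toy version of the local co-operative mechanism is sketched below. It assumes a ring topology, unit-size jobs and a simple sweep order, none of which are specified in the abstract; the function name `balance_ring` is likewise hypothetical. A job moves only when the move strictly lowers the load it experiences, which is the improving-move condition of a congestion game.

```python
def balance_ring(loads):
    """Balance unit-size jobs on a ring using only neighbor-to-neighbor moves.

    A node hands one job to a ring neighbor when that neighbor's load is at
    least 2 lower, so the moved job ends up on a strictly less loaded node.
    Each such move lowers the sum of squared loads, so the loop terminates.
    Returns the final loads and the number of transfers performed.
    """
    loads = list(loads)
    n = len(loads)
    transfers = 0
    moved = True
    while moved:
        moved = False
        for i in range(n):
            for j in ((i - 1) % n, (i + 1) % n):  # the two ring neighbors
                if loads[i] - loads[j] >= 2:
                    loads[i] -= 1
                    loads[j] += 1
                    transfers += 1
                    moved = True
    return loads, transfers


# All 12 jobs start on one node of a 4-node ring.
final_loads, transfers = balance_ring([12, 0, 0, 0])
```

When the loop stops, no job can improve by moving to a neighbor, so neighboring loads differ by at most one. That local condition is what this sketch guarantees; the paper's bounds on the number of transfers are not reproduced here.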
conference on information and knowledge management | 2016
Quanzhi Li; Sameena Shah; Xiaomo Liu; Armineh Nourbakhsh; Rui Fang
Classifying tweets into topic categories is necessary and important for many applications, since tweets cover a variety of topics and users are only interested in certain topical areas. Many tweet classification approaches fail to achieve high accuracy due to data sparseness. A tweet, as a special type of short text, has metadata in addition to its text that can be used to enrich its context, such as user name, mention, hashtag and embedded link. In this demonstration, we present TweetSift, an efficient and effective real-time tweet topic classifier. TweetSift exploits external tweet-specific entity knowledge to provide more topical context for a tweet, and integrates it with topic-enhanced word embeddings for topic classification. The demonstration will show how TweetSift works and how it is integrated with our social media event detection system.
web intelligence | 2016
Quanzhi Li; Sameena Shah; Xiaomo Liu; Armineh Nourbakhsh; Rui Fang
Many classification tasks on short text, such as tweets, fail to achieve high accuracy due to data sparseness. One approach to solving this problem is to enrich the context of the data by using external data sources, or distributed language representations trained on huge amounts of data. In this paper, we present several tweet topic classification methods that exploit different types of data: tweet text, tweet text plus an entity knowledge base, word embeddings derived from tweet text, distributed representations of tweets, and topical word embeddings. The word embedding, topical word embedding and sentence representation models are generated from billions of words from tweets without supervision. To the best of our knowledge, this is the first study to apply distributed language representations to the tweet topic classification task.
international conference on data engineering | 2017
Quanzhi Li; Armineh Nourbakhsh; Sameena Shah; Xiaomo Liu
In this paper, we present a new approach for detecting novel events from social media, especially Twitter, in real time. An event is usually defined by who, what, where and when, and an event tweet usually contains terms corresponding to these aspects. To exploit this information, we propose a method that incorporates simple semantics by splitting the tweet term space into groups of terms that carry the same type of meaning. These groups are called semantic categories (classes) and each reflects one or more event aspects. The semantic classes include named entity, mention, location, hashtag, verb, noun and embedded link. To group tweets talking about the same event into the same cluster, similarity is measured by computing class-wise similarities and then aggregating them. Users of a real-time event detection system are usually only interested in novel (new) events, which are happening now or happened a short time ago. To fulfill this requirement, a temporal identification module is used to filter out event clusters that are about old stories. The clustering module also computes a novelty score for each event cluster, which reflects how novel the event is compared to previous events. We evaluated our event detection method using multiple quality metrics and a large-scale event corpus containing millions of tweets. The experimental results show that the proposed online event detection method achieves state-of-the-art performance. Our experiments also show that the temporal identification module can effectively detect old events.
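The class-wise similarity step can be illustrated with a short sketch. The abstract does not say which per-class measure or aggregation the authors use, so the Jaccard measure, the weighted average and the equal weights below are assumptions made for the example.

```python
def jaccard(a, b):
    """Jaccard overlap of two term sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def classwise_similarity(tweet_a, tweet_b, weights):
    """Weighted average of per-class Jaccard similarities.

    tweet_a, tweet_b: dicts mapping a semantic class name to a set of terms.
    weights: dict mapping class name to a non-negative weight (hypothetical).
    """
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    score = 0.0
    for cls, w in weights.items():
        score += w * jaccard(tweet_a.get(cls, set()), tweet_b.get(cls, set()))
    return score / total_weight


# Two made-up tweets, already split into semantic classes.
tweet_a = {"hashtag": {"brussels"}, "location": {"brussels"}, "verb": {"attack"}}
tweet_b = {"hashtag": {"brussels"}, "location": {"brussels", "belgium"}, "verb": {"hit"}}
weights = {"hashtag": 1.0, "location": 1.0, "verb": 1.0}
sim = classwise_similarity(tweet_a, tweet_b, weights)
```

Here the hashtag sets match fully, the location sets overlap by half and the verb sets are disjoint, so the equally weighted average is 0.5. A clustering module could assign a tweet to an existing cluster when this score exceeds some threshold.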
international conference on data mining | 2015
Armineh Nourbakhsh; Xiaomo Liu; Sameena Shah; Rui Fang; Mohammad M. Ghassemi; Quanzhi Li
Rumor events differ in how and where they originate, what topics they address, the emotions they invoke, and how they engage their audience. In this paper, we study various semantic aspects of rumors and analyze the motivational and functional roles they play. Using Twitter as a case study, we develop a framework to characterize rumors. Our characterization covers intrinsic and extrinsic factors, tweet- and event-level, as well as usage analysis. We determine the roles various user types play and analyze rumor propagation from both re-tweeting and burstiness perspectives.
conference on information and knowledge management | 2016
Quanzhi Li; Sameena Shah; Armineh Nourbakhsh; Xiaomo Liu; Rui Fang
In this paper, we present a new approach for recommending hashtags for tweets. It uses a learning-to-rank algorithm to incorporate features built from topic-enhanced word embeddings, tweet entity data, hashtag frequency, hashtag temporal data and tweet URL domain information. Experiments using millions of tweets and hashtags show that the proposed approach outperforms three baseline methods: the LDA topic, the tf-idf-based and the general word embedding approaches.