Ahmed Abbasi | Researchain

Publication

Featured research published by Ahmed Abbasi.

ACM Transactions on Information Systems | 2008

Sentiment analysis in multiple languages: Feature selection for opinion classification in Web forums

Ahmed Abbasi; Hsinchun Chen; Arab Salem

The Internet is frequently used as a medium for exchange of information and opinions, as well as propaganda dissemination. In this study the use of sentiment analysis methodologies is proposed for classification of Web forum opinions in multiple languages. The utility of stylistic and syntactic features is evaluated for sentiment classification of English and Arabic content. Specific feature extraction components are integrated to account for the linguistic characteristics of Arabic. The entropy weighted genetic algorithm (EWGA) is also developed, which is a hybridized genetic algorithm that incorporates the information-gain heuristic for feature selection. EWGA is designed to improve performance and get a better assessment of key features. The proposed features and techniques are evaluated on a benchmark movie review dataset and U.S. and Middle Eastern Web forum postings. The experimental results using EWGA with SVM indicate high performance levels, with accuracies of over 91% on the benchmark dataset as well as the U.S. and Middle Eastern forums. Stylistic features significantly enhanced performance across all testbeds while EWGA also outperformed other feature selection methods, indicating the utility of these features and techniques for document-level classification of sentiments.
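The information-gain heuristic that EWGA incorporates can be illustrated with a short sketch. This is not the EWGA algorithm itself, whose details are not given in the abstract; it only computes information gain for a single binary feature over labeled documents. The function names and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_present, labels):
    """Entropy reduction obtained by splitting the labels on a binary feature."""
    with_feature = [y for x, y in zip(feature_present, labels) if x]
    without_feature = [y for x, y in zip(feature_present, labels) if not x]
    total = len(labels)
    remainder = (
        len(with_feature) / total * entropy(with_feature)
        + len(without_feature) / total * entropy(without_feature)
    )
    return entropy(labels) - remainder


# Toy data (hypothetical): four documents, two per sentiment class.
labels = ["pos", "pos", "neg", "neg"]
perfect_cue = [True, True, False, False]  # appears only in positive documents
useless_cue = [True, False, True, False]  # appears equally in both classes

print(information_gain(perfect_cue, labels))
print(information_gain(useless_cue, labels))
```

A feature that separates the classes perfectly scores one bit on this balanced toy set, while a feature spread evenly across both classes scores zero. A hybrid method could use such scores to seed or bias a genetic search over feature subsets, although how EWGA does so is not specified here.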

IEEE Intelligent Systems | 2005

Applying authorship analysis to extremist-group Web forum messages

Ahmed Abbasi; Hsinchun Chen

The speed, ubiquity, and potential anonymity of Internet media - email, Web sites, and Internet forums - make them ideal communication channels for militant groups and terrorist organizations. Analyzing Web content has therefore become increasingly important to the intelligence and security agencies that monitor these groups. Authorship analysis can assist this activity by automatically extracting linguistic features from online messages and evaluating stylistic details for patterns of terrorist communication. However, authorship analysis techniques are rooted in work with literary texts, which differ significantly from online communication. To explore these problems, we modified an existing framework for analyzing online authorship and applied it to Arabic and English Web forum messages associated with known extremist groups. We developed a special multilingual model - the set of algorithms and related features - to identify Arabic messages, gearing this model toward the language's unique characteristics. Furthermore, we incorporated a complex message extraction component to allow the use of a more comprehensive set of features tailored specifically toward online messages. Evaluating the linguistic features of Web messages and comparing them to known writing styles offers the intelligence community a tool for identifying patterns of terrorist communication.

ACM Transactions on Information Systems | 2008

Writeprints: A stylometric approach to identity-level identification and similarity detection in cyberspace

Ahmed Abbasi; Hsinchun Chen

One of the problems often associated with online anonymity is that it hinders social accountability, as substantiated by the high levels of cybercrime. Although identity cues are scarce in cyberspace, individuals often leave behind textual identity traces. In this study we proposed the use of stylometric analysis techniques to help identify individuals based on writing style. We incorporated a rich set of stylistic features, including lexical, syntactic, structural, content-specific, and idiosyncratic attributes. We also developed the Writeprints technique for identification and similarity detection of anonymous identities. Writeprints is a Karhunen-Loeve transforms-based technique that uses a sliding window and pattern disruption algorithm with individual author-level feature sets. The Writeprints technique and extended feature set were evaluated on a testbed encompassing four online datasets spanning different domains: email, instant messaging, feedback comments, and program code. Writeprints outperformed benchmark techniques, including SVM, Ensemble SVM, PCA, and standard Karhunen-Loeve transforms, on the identification and similarity detection tasks with accuracy as high as 94% when differentiating between 100 authors. The extended feature set also significantly outperformed a baseline set of features commonly used in previous research. Furthermore, individual-author-level feature sets generally outperformed use of a single group of attributes.
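The sliding-window idea mentioned in the abstract can be sketched as follows. This is not the Writeprints technique: the Karhunen-Loeve transform and the pattern disruption step are omitted, and the three lexical features, the window size, and the function names are hypothetical choices made only to show how per-window feature vectors might be produced from an author's text.

```python
def lexical_features(tokens):
    """Return a small, illustrative feature vector for a list of tokens."""
    if not tokens:
        return [0.0, 0.0, 0.0]
    avg_word_length = sum(len(t) for t in tokens) / len(tokens)
    type_token_ratio = len({t.lower() for t in tokens}) / len(tokens)
    uppercase_ratio = sum(1 for t in tokens if t[0].isupper()) / len(tokens)
    return [avg_word_length, type_token_ratio, uppercase_ratio]


def sliding_windows(tokens, size, step):
    """Yield overlapping token windows of a fixed size."""
    for start in range(0, max(len(tokens) - size, 0) + 1, step):
        yield tokens[start:start + size]


# Toy text (hypothetical); a real testbed would use an author's messages.
text = "The quick brown fox jumps over the lazy dog near the quiet river bank"
tokens = text.split()
vectors = [lexical_features(w) for w in sliding_windows(tokens, size=6, step=4)]

for v in vectors:
    print([round(x, 3) for x in v])
```

Each window yields one feature vector, so a single author contributes many points rather than one. In a fuller pipeline, vectors like these would be projected into a lower-dimensional space before authors are compared.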

Management Information Systems Quarterly | 2008

CyberGate: a design framework and system for text analysis of computer-mediated communication

Ahmed Abbasi; Hsinchun Chen

Content analysis of computer-mediated communication (CMC) is important for evaluating the effectiveness of electronic communication in various organizational settings. CMC text analysis relies on systems capable of providing suitable navigation and knowledge discovery functionalities. However, existing CMC systems focus on structural features, with little support for features derived from message text. This deficiency is attributable to the informational richness and representational complexities associated with CMC text. In order to address this shortcoming, we propose a design framework for CMC text analysis systems. Grounded in systemic functional linguistic theory, the proposed framework advocates the development of systems capable of representing the rich array of information types inherent in CMC text. It also provides guidelines regarding the choice of features, feature selection, and visualization techniques that CMC text analysis systems should employ. The CyberGate system was developed as an instantiation of the design framework. CyberGate incorporates a rich feature set and complementary feature selection and visualization methods, including the writeprints and ink blots techniques. An application example was used to illustrate the system's ability to discern important patterns in CMC text. Furthermore, results from numerous experiments conducted in comparison with benchmark methods confirmed the viability of CyberGate's features and techniques. The results revealed that the CyberGate system and its underlying design framework can dramatically improve CMC text analysis capabilities over those provided by existing systems.

IEEE Transactions on Knowledge and Data Engineering | 2011

Selecting Attributes for Sentiment Classification Using Feature Relation Networks

Ahmed Abbasi; Zhu Zhang; Hsinchun Chen

A major concern when incorporating large sets of diverse n-gram features for sentiment classification is the presence of noisy, irrelevant, and redundant attributes. These concerns can often make it difficult to harness the augmented discriminatory potential of extended feature sets. We propose a rule-based multivariate text feature selection method called Feature Relation Network (FRN) that considers semantic information and also leverages the syntactic relationships between n-gram features. FRN is intended to efficiently enable the inclusion of extended sets of heterogeneous n-gram features for enhanced sentiment classification. Experiments were conducted on three online review testbeds in comparison with methods used in prior sentiment classification research. FRN outperformed the comparison univariate, multivariate, and hybrid feature selection methods; it was able to select attributes resulting in significantly better classification accuracy irrespective of the feature subset sizes. Furthermore, by incorporating syntactic information about n-gram relations, FRN is able to select features in a more computationally efficient manner than many multivariate and hybrid techniques.

Management Information Systems Quarterly | 2010

Detecting fake websites: the contribution of statistical learning theory

Ahmed Abbasi; Zhu Zhang; David Zimbra; Hsinchun Chen; Jay F. Nunamaker

Fake websites have become increasingly pervasive, generating billions of dollars in fraudulent revenue at the expense of unsuspecting Internet users. The design and appearance of these websites make it difficult for users to manually identify them as fake. Automated detection systems have emerged as a mechanism for combating fake websites; however, most are fairly simplistic in terms of the fraud cues and detection methods they employ. Consequently, existing systems are susceptible to the myriad of obfuscation tactics used by fraudsters, resulting in highly ineffective fake website detection performance. In light of these deficiencies, we propose the development of a new class of fake website detection systems that are based on statistical learning theory (SLT). Using a design science approach, a prototype system was developed to demonstrate the potential utility of this class of systems. We conducted a series of experiments, comparing the proposed system against several existing fake website detection systems on a test bed encompassing 900 websites. The results indicate that systems grounded in SLT can more accurately detect various categories of fake websites by utilizing richer sets of fraud cues in combination with problem-specific knowledge. Given the hefty cost exacted by fake websites, the results have important implications for e-commerce and online security.

Management Information Systems Quarterly | 2012

Metafraud: a meta-learning framework for detecting financial fraud

Ahmed Abbasi; Conan C. Albrecht; Anthony Vance; James V. Hansen

Financial fraud can have serious ramifications for the long-term sustainability of an organization, as well as adverse effects on its employees and investors, and on the economy as a whole. Several of the largest bankruptcies in U.S. history involved firms that engaged in major fraud. Accordingly, there has been considerable emphasis on the development of automated approaches for detecting financial fraud. However, most methods have yielded performance results that are less than ideal. In consequence, financial fraud detection continues as an important challenge for business intelligence technologies. In light of the need for more robust identification methods, we use a design science approach to develop MetaFraud, a novel meta-learning framework for enhanced financial fraud detection. To evaluate the proposed framework, a series of experiments are conducted on a test bed encompassing thousands of legitimate and fraudulent firms. The results reveal that each component of the framework significantly contributes to its overall effectiveness. Additional experiments demonstrate the effectiveness of the meta-learning framework over state-of-the-art financial fraud detection methods. Moreover, the MetaFraud framework generates confidence scores associated with each prediction that can facilitate unprecedented financial fraud detection performance and serve as a useful decision-making aid. The results have important implications for several stakeholder groups, including compliance officers, investors, audit firms, and regulators.

Intelligence and Security Informatics | 2007

Affect Intensity Analysis of Dark Web Forums

Ahmed Abbasi

Affects play an important role in influencing people's perceptions and decision making. Affect analysis is useful for measuring the presence of hate, violence, and the resulting propaganda dissemination across extremist groups. In this study we performed affect analysis of U.S. and Middle Eastern extremist group forum postings. We constructed an affect lexicon using a probabilistic disambiguation technique to measure the usage of violence and hate affects. These techniques facilitate in-depth analysis of multilingual content. The proposed approach was evaluated by applying it across 16 U.S. supremacist and Middle Eastern extremist group forums. Analysis across regions reveals that the Middle Eastern test bed forums have considerably greater violence intensity than the U.S. groups. There is also a strong linear relationship between the usage of hate and violence across the Middle Eastern messages.

International Conference on Social Computing | 2013

Twitter Sentiment Analysis: A Bootstrap Ensemble Framework

Ammar Hassan; Ahmed Abbasi; Daniel Zeng

Twitter sentiment analysis has become widely popular. However, stable Twitter sentiment classification performance remains elusive due to several issues: heavy class imbalance in a multi-class problem, representational richness issues for sentiment cues, and the use of diverse colloquial linguistic patterns. These issues are problematic since many forms of social media analytics rely on accurate underlying Twitter sentiments. Accordingly, a text analytics framework is proposed for Twitter sentiment analysis. The framework uses an elaborate bootstrapping ensemble to quell class imbalance, sparsity, and representational richness issues. Experiment results reveal that the proposed approach is more accurate and balanced in its predictions across sentiment classes, as compared to various comparison tools and algorithms. Consequently, the bootstrapping ensemble framework is able to build sentiment time series that are better able to reflect events eliciting strong positive and negative sentiments from users. Considering the importance of Twitter as one of the premier social media platforms, the results have important implications for social media analytics and social intelligence.
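One common way a bootstrapping ensemble can counter class imbalance is to draw class-balanced samples with replacement and combine member predictions by vote. The sketch below shows only that general idea; it is not the framework from the paper, and the function names, sample sizes, and toy tweets are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def balanced_bootstrap(examples, per_class, rng):
    """Sample `per_class` examples with replacement from every class."""
    by_class = defaultdict(list)
    for text, label in examples:
        by_class[label].append((text, label))
    sample = []
    for label in sorted(by_class):
        sample.extend(rng.choices(by_class[label], k=per_class))
    return sample


def majority_vote(predictions):
    """Combine the labels predicted by the ensemble members."""
    return Counter(predictions).most_common(1)[0][0]


# Toy, imbalanced data (hypothetical): neutral tweets dominate.
examples = (
    [("neutral tweet %d" % i, "neutral") for i in range(8)]
    + [("positive tweet", "positive"), ("negative tweet", "negative")]
)
rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [balanced_bootstrap(examples, per_class=3, rng=rng) for _ in range(5)]

print(Counter(label for _, label in samples[0]))
print(majority_vote(["positive", "neutral", "positive"]))
```

Each of the five samples contains the same number of examples per class, so a classifier trained on any one of them would not be dominated by the neutral class. Training one classifier per sample and voting over their outputs is left out here, since the choice of base classifier is not stated in the abstract.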

Journal of Management Information Systems | 2008

Stylometric Identification in Electronic Markets: Scalability and Robustness

Ahmed Abbasi; Hsinchun Chen; Jay F. Nunamaker

Online reputation systems are intended to facilitate the propagation of word of mouth as a credibility scoring mechanism for improved trust in electronic marketplaces. However, they experience two problems attributable to anonymity abuse—easy identity changes and reputation manipulation. In this study, we propose the use of stylometric analysis to help identify online traders based on the writing style traces inherent in their posted feedback comments. We incorporated a rich stylistic feature set and developed the Writeprint technique for detection of anonymous trader identities. The technique and extended feature set were evaluated on a test bed encompassing thousands of feedback comments posted by 200 eBay traders. Experiments conducted to assess the scalability (number of traders) and robustness (against intentional obfuscation) of the proposed approach found it to significantly outperform benchmark stylometric techniques. The results indicate that the proposed method may help militate against easy identity changes and reputation manipulation in electronic markets.