David Zimbra
University of Arizona
Publication
Featured research published by David Zimbra.
IEEE Intelligent Systems | 2010
Hsinchun Chen; David Zimbra
The advent of Web 2.0 and social media content has stirred much excitement and created abundant opportunities for understanding the opinions of the general public and consumers toward social events, political movements, company strategies, marketing campaigns, and product preferences. Many new and exciting social, geopolitical, and business-related research questions can be answered by analyzing the thousands, even millions, of comments and responses expressed in various blogs (such as the blogosphere), forums (such as Yahoo Forums), social media and social network sites (including YouTube, Facebook, and Flickr), virtual worlds (such as Second Life), and tweets (Twitter). Opinion mining, a subdiscipline within data mining and computational linguistics, refers to the computational techniques for extracting, classifying, understanding, and assessing the opinions expressed in various online news sources, social media comments, and other user-generated content. Sentiment analysis is often used in opinion mining to identify sentiment, affect, subjectivity, and other emotional states in online text.
Expert Systems With Applications | 2013
Manoochehr Ghiassi; J. Skinner; David Zimbra
Twitter messages are increasingly used to determine consumer sentiment towards a brand. The existing literature on Twitter sentiment analysis uses various feature sets and methods, many of which are adapted from more traditional text classification problems. In this research, we introduce an approach to supervised feature reduction using n-grams and statistical analysis to develop a Twitter-specific lexicon for sentiment analysis. We augment this reduced Twitter-specific lexicon with brand-specific terms for brand-related tweets. We show that the reduced lexicon set, while significantly smaller (only 187 features), reduces modeling complexity, maintains a high degree of coverage over our Twitter corpus, and yields improved sentiment classification accuracy. To demonstrate the effectiveness of the devised Twitter-specific lexicon compared to a traditional sentiment lexicon, we develop comparable sentiment classification models using SVM. We show that the Twitter-specific lexicon is significantly more effective in terms of classification recall and accuracy metrics. We then develop sentiment classification models using the Twitter-specific lexicon and the DAN2 machine learning approach, which has demonstrated success in other text classification problems. We show that DAN2 produces more accurate sentiment classification results than SVM while using the same Twitter-specific lexicon.
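As a rough illustration of the supervised feature reduction described in the abstract above, the sketch below keeps only the n-grams whose presence is statistically associated with a sentiment class. The abstract does not name the statistic used, so the chi-square test, the 3.84 threshold, the whitespace tokenizer, and the function names `ngrams`, `chi_square`, and `reduce_lexicon` are all assumptions made for this example, not the authors' method.

```python
from collections import Counter


def ngrams(tokens, n_max=2):
    """Yield every 1..n_max-gram of a token list as a space-joined string."""
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 table.

    a, b: positive / negative tweets containing the term
    c, d: positive / negative tweets not containing the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def reduce_lexicon(tweets, labels, threshold=3.84):
    """Return the sorted n-grams whose chi-square score meets the threshold.

    labels use 1 for positive and 0 for negative (an assumption of this sketch).
    """
    pos_total = sum(1 for y in labels if y == 1)
    neg_total = len(labels) - pos_total
    pos_df, neg_df = Counter(), Counter()
    for text, y in zip(tweets, labels):
        # Count each term once per tweet (document frequency).
        terms = set(ngrams(text.lower().split()))
        (pos_df if y == 1 else neg_df).update(terms)
    kept = []
    for term in set(pos_df) | set(neg_df):
        a, b = pos_df[term], neg_df[term]
        if chi_square(a, b, pos_total - a, neg_total - b) >= threshold:
            kept.append(term)
    return sorted(kept)


if __name__ == "__main__":
    tweets = ["love this coffee"] * 4 + ["hate this coffee"] * 4
    labels = [1] * 4 + [0] * 4
    print(reduce_lexicon(tweets, labels))
```

Terms that appear equally often in both classes (here "this" and "coffee") score zero and are dropped, which is the sense in which the reduced lexicon is smaller yet still discriminative.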
Management Information Systems Quarterly | 2010
Ahmed Abbasi; Zhu Zhang; David Zimbra; Hsinchun Chen; Jay F. Nunamaker
Fake websites have become increasingly pervasive, generating billions of dollars in fraudulent revenue at the expense of unsuspecting Internet users. The design and appearance of these websites make it difficult for users to manually identify them as fake. Automated detection systems have emerged as a mechanism for combating fake websites; however, most are fairly simplistic in terms of their fraud cues and detection methods employed. Consequently, existing systems are susceptible to the myriad of obfuscation tactics used by fraudsters, resulting in highly ineffective fake website detection performance. In light of these deficiencies, we propose the development of a new class of fake website detection systems that are based on statistical learning theory (SLT). Using a design science approach, a prototype system was developed to demonstrate the potential utility of this class of systems. We conducted a series of experiments, comparing the proposed system against several existing fake website detection systems on a test bed encompassing 900 websites. The results indicate that systems grounded in SLT can more accurately detect various categories of fake websites by utilizing richer sets of fraud cues in combination with problem-specific knowledge. Given the hefty cost exacted by fake websites, the results have important implications for e-commerce and online security.
IEEE Intelligent Systems | 2010
Yong Liu; Yubo Chen; Robert F. Lusch; Hsinchun Chen; David Zimbra; Shuo Zeng
Enabled by Web 2.0 technologies, online social media in the forms of discussion forums, message boards, and blogs has become a prevalent channel of communication for consumers and businesses. Online social media allows consumers to share their product opinions and experience at an unprecedented pace and scale. This user-generated content, or online word of mouth (WOM), has the potential to influence product sales and firm strategy. Consequently, as Web-mining and opinion-mining tools and technology continue to proliferate, it is critical to examine how WOM information can be measured and used to improve managerial decisions. In this article, we explore the predictive validity of various text and sentiment measures of online WOM for the market success of new products. From the firms’ perspective, it is important to effectively predict the sales of new products in the product development process. The earlier such a forecast can be made, the more useful it will be, since marketing strategies can then be adjusted accordingly. We thus examine online WOM that appears at different stages of the new-product life cycle, such as before production, before introduction, and after introduction. New-product development is a highly risky process, and it is useful to examine different aspects of its success. In addition to examining product sales directly, we also study product evaluation by third-party professionals and how the product would receive marketing support from the firm, both of which could influence sales. The context of our study is the Hollywood movie industry. The forecast of movie sales is highly challenging and has started to incorporate online WOM. We collected online WOM information from the message board of Yahoo Movies for a total of 257 movies released from 2005 to 2006. We used SentiWordNet and OpinionFinder, two lexical packages of computational linguistics, to construct the sentiment measures for the WOM data.
We will first examine the evolution patterns of online WOM over time, followed by a correlation analysis of how various sentiment measures relate to the metrics of new product success.
When Adam Smith wrote the Wealth of Nations in 1776, he concluded that individuals, firms, and nations obtain comparative advantage by specialization. Markets worked as the invisible hand to efficiently allocate resources between specialized parties. During the Industrial Revolution, manufacturing organizations helped the nation become wealthy by creating mechanisms for the internal allocation and integration of resources to produce largely tangible output. Today, both markets and organizations are undergoing a phase transition.
to the “skills, technologies, applications, and practices used to support decision making” (http://en.wikipedia.org/wiki/Business_intelligence). On the basis of a survey of 1,400 CEOs, the Gartner Group projected BI revenue to reach
Journal of Computer-Mediated Communication | 2010
David Zimbra; Ahmed Abbasi; Hsinchun Chen
$3 billion in 2009.[1] Through BI initiatives, businesses are gaining insights from the growing volumes of transaction, product, inventory, customer, competitor, and industry data generated by enterprise-wide applications such as enterprise resource planning (ERP), customer relationship management (CRM), supply-chain management (SCM), knowledge management, collaborative computing, Web analytics, and so on. The same Gartner survey also showed that BI surpassed security as the top business IT priority in 2006.[1] BI has been used as an umbrella term to describe concepts and methods for improving business decision making by using fact-based support systems. BI also includes the underlying architectures, tools, databases, applications, and methodologies. BI’s major objectives are to enable interactive and easy access to diverse data, enable manipulation and transformation of these data, and give business managers and analysts the ability to conduct appropriate analyses and then act.[2] BI is now widely adopted in the world of IT practice and has also become popular in information systems curricula.[3] Successful BI initiatives have been reported for major industries, from healthcare and airlines to major IT and telecommunications firms.[2] As a data-centered approach, BI relies heavily on various advanced data collection, extraction, and analysis technologies.[2,3] Data warehousing is often considered the foundation of BI. Design of data marts and tools for extraction, transformation, and load (ETL) are essential for converting and integrating enterprise-specific data. Organizations often next adopt database query, online analytical processing (OLAP), and advanced reporting tools to explore important data characteristics. Business performance management (BPM) using scorecards and dashboards allows analysis and visualization of various employee performance metrics.
In addition to these well-established business analytics functions, organizations can adopt advanced knowledge discovery using data and text mining for association rule mining, database segmentation and clustering, anomaly detection, and predictive modeling in various information systems and human resources, accounting, finance, and marketing applications. Since about 2004, Web intelligence, Web analytics, Web 2.0, and user-generated content have begun to usher in a new and exciting era of business research, which we could call Business Intelligence 2.0. An immense amount of company, industry, product, and customer information can be gathered from the Web and organized and visualized through various knowledge-mapping, Web portal, and multilingual retrieval techniques.[4] By analyzing customer clickstream data logs, Web analytics tools such as Google Analytics provide a trail of the user’s online activities and reveal the user’s browsing and purchasing patterns. Web site design, product placement optimization, customer transaction analysis, and product recommendations can
Business Intelligence (BI), a term coined in 1989, has gained much traction in the IT
Hawaii International Conference on System Sciences | 2016
David Zimbra; Manoochehr Ghiassi; Sean Lee
This paper presents a cyber-archaeology approach to social movement research. The approach overcomes many of the issues of scale and complexity facing social research on the Internet, enabling broad and longitudinal study of the virtual communities supporting social movements. Cultural cyber-artifacts of significance to the social movement are collected and classified using automated techniques, enabling analysis across multiple related virtual communities. Approaches to the analysis of cyber-artifacts are guided by perspectives of social movement theory. A case study on a broad group of related social movement virtual communities is presented to demonstrate the efficacy of the framework and to provide a detailed instantiation of the proposed approach for evaluation.
Journal of Management Information Systems | 2016
Manoochehr Ghiassi; David Zimbra; Sean Lee
We present an approach to brand-related Twitter sentiment analysis using feature engineering and the Dynamic Architecture for Artificial Neural Networks (DAN2). The approach addresses challenges associated with the unique characteristics of the Twitter language, and the recall of mild sentiment expressions that are of interest to brand management practitioners. We demonstrate the effectiveness of the approach on a Starbucks brand-related Twitter data set. The feature engineering produced a final tweet feature representation consisting of only seven dimensions, with greater feature density. Two sets of experiments were conducted in three-class and five-class tweet sentiment classification. We compare the proposed approach to the performances of two state-of-the-art Twitter sentiment analysis systems from the academic and commercial domains. The results indicate that the approach outperforms these state-of-the-art systems in both three-class and five-class tweet sentiment classification by wide margins, with classification accuracies above 80% and excellent recall of mild sentiment tweets.
ACM Transactions on Management Information Systems | 2015
David Zimbra; Hsinchun Chen; Robert F. Lusch
Social media communications offer valuable feedback to firms about their brands. We present a targeted approach to Twitter sentiment analysis for brands using supervised feature engineering and the dynamic architecture for artificial neural networks. The proposed approach addresses challenges associated with the unique characteristics of the Twitter language and brand-related tweet sentiment class distribution. We demonstrate its effectiveness on Twitter data sets related to two distinctive brands. The supervised feature engineering for brands offers final tweet feature representations of only seven dimensions with greater feature density. Reducing the dimensionality of the representations reduces the complexity of the classification problem and feature sparsity. Two sets of experiments are conducted for each brand in three-class and five-class tweet sentiment classification. We examine five-class classification to target the mild sentiment expressions that are of particular interest to firms and brand management practitioners. We compare the proposed approach to the performances of two state-of-the-art Twitter sentiment analysis systems from the academic and commercial domains. The results indicate that it outperforms these state-of-the-art systems by wide margins, with classification F1-measures as high as 88 percent and excellent recall of tweets expressing mild sentiments. Furthermore, they demonstrate that the tweet feature representations, though consisting of only seven dimensions, are highly effective in capturing indicators of Twitter sentiment expression. The proposed approach and the vast majority of features identified through supervised feature engineering are applicable across brands, allowing researchers and brand management practitioners to quickly generate highly effective tweet feature representations for Twitter sentiment analysis on other brands.
Intelligence and Security Informatics | 2012
David Zimbra; Hsinchun Chen
In this study, we present stakeholder analyses of firm-related web forums. Prior analyses of firm-related forums have considered all participants in the aggregate, failing to recognize the potential for diversity within the populations. However, distinctive groups of forum participants may represent various interests and stakes in a firm worthy of consideration. To perform the stakeholder analyses, the Stakeholder Analyzer system for firm-related web forums is developed following the design science paradigm of information systems research. The design of the system and its approach to stakeholder analysis is guided by two kernel theories, the stakeholder theory of the firm and the systemic functional linguistic theory. A stakeholder analysis identifies distinctive groups of forum participants with shared characteristics expressed in discussion and evaluates their specific opinions and interests in the firm. Stakeholder analyses are performed in six major firm-related forums hosted on Yahoo Finance over a 3-month period. The relationships between measures extracted from the forums and subsequent daily firm stock returns are examined using multiple linear regression models, revealing statistically significant indicators of firm stock returns in the discussions of the stakeholder groups of each firm with stakeholder-model-adjusted R² values reaching 0.83. Daily stock return prediction is also performed for 31 trading days, and stakeholder models correctly predicted the direction of return on 67% of trading days and generated an impressive 17% return in simulated trading of the six firm stocks. These evaluations demonstrate that the stakeholder analyses provided more refined assessments of the firm-related forums, yielding measures at the stakeholder group level that better explain and predict daily firm stock returns than aggregate forum-level information.
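The two evaluation measures mentioned in the abstract above, adjusted R² and the share of trading days on which the predicted direction of return was correct, can be computed as in the minimal sketch below. It uses the standard textbook definitions; the function names `adjusted_r2` and `direction_accuracy` and the sample numbers are hypothetical, and nothing here reproduces the study's data or models.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


def direction_accuracy(predicted, actual):
    """Fraction of days on which predicted and actual returns share a sign.

    A return of exactly zero is treated as non-positive (an assumption of this sketch).
    """
    hits = sum(1 for p, a in zip(predicted, actual) if (p > 0) == (a > 0))
    return hits / len(actual)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(adjusted_r2(0.5, n_obs=11, n_predictors=5))
    print(direction_accuracy([0.1, -0.2, 0.3], [0.2, 0.1, 0.4]))
```

Adjusted R² penalizes the raw R² for the number of predictors, which matters when comparing regression models built from different numbers of forum measures.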
Journal of the Association for Information Science and Technology | 2014
Victor A. Benjamin; Hsinchun Chen; David Zimbra
This study examines several approaches to sentiment classification in the Dark Web Forum Portal, and opportunities to transfer classifiers and text features across multiple forums to improve scalability and performance. Although sentiment classifiers typically perform poorly when transferred across domains, experimentation reveals the devised approaches offer performance equivalent to the traditional forum-specific approach in classification in an unknown domain. Furthermore, incorporating the text features identified as significant indicators of sentiment in other forums can greatly improve the classification accuracy of the traditional forum-specific approach.