Publication


Featured research published by Conrad S. Tucker.


Journal of Biomedical Informatics | 2014

An ensemble heterogeneous classification methodology for discovering health-related knowledge in social media messages

Suppawong Tuarob; Conrad S. Tucker; Marcel Salathé; Nilam Ram

OBJECTIVES The role of social media as a source of timely and massive information has become more apparent since the era of Web 2.0. Multiple studies have illustrated the use of information in social media to discover biomedical and health-related knowledge. Most methods proposed in the literature employ traditional document classification techniques that represent a document as a bag of words. These techniques work well when documents are rich in text and conform to standard English; however, they are not optimal for social media data, where sparsity and noise are the norm. This paper aims to address the limitations posed by traditional bag-of-words-based methods and proposes using heterogeneous features in combination with ensemble machine learning techniques to discover health-related information, which could prove useful to multiple biomedical applications, especially those needing to discover health-related knowledge in large-scale social media data. Furthermore, the proposed methodology could be generalized to discover different types of information in various kinds of textual data.

METHODOLOGY Social media data is characterized by an abundance of short, social-oriented messages that do not conform to standard languages, both grammatically and syntactically. The problem of discovering health-related knowledge in social media data streams is then transformed into a text classification problem, where a text is identified as positive if it is health-related and negative otherwise. We first identify the limitations of the traditional methods, which train machines with N-gram word features, then propose to overcome such limitations by utilizing the collaboration of machine-learning-based classifiers, each of which is trained to learn a semantically different aspect of the data. The parameter analysis for tuning each classifier is also reported.

DATA SETS Three data sets are used in this research. The first data set comprises approximately 5000 hand-labeled tweets, and is used for cross validation of the classification models in the small-scale experiment and for training the classifiers in the real-world, large-scale experiment. The second data set is a random sample of real-world Twitter data in the US. The third data set is a random sample of real-world Facebook Timeline posts.

EVALUATIONS Two sets of evaluations are conducted to investigate the proposed model's ability to discover health-related information in the social media domain: small-scale and large-scale evaluations. The small-scale evaluation employs 10-fold cross validation on the labeled data, and aims to tune the parameters of the proposed models and to compare them with the state-of-the-art method. The large-scale evaluation tests the trained classification models on the native, real-world data sets, and is needed to verify the ability of the proposed model to handle the massive heterogeneity in real-world social media.

FINDINGS The small-scale experiment reveals that the proposed method is able to mitigate the limitations of the well-established techniques in the literature, resulting in a performance improvement of 18.61% (F-measure). The large-scale experiment further reveals that the baseline fails to perform well on larger data with higher degrees of heterogeneity, while the proposed method is able to yield reasonably good performance and outperform the baseline by 46.62% (F-measure) on average.
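The idea of combining classifiers that each look at a different aspect of a message, and scoring the result with the F-measure, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three "views" below are hand-written rules standing in for trained classifiers, the word lists are toy assumptions, and simple majority voting is only one possible way to combine the votes.

```python
from collections import Counter


def _tokens(text):
    """Lowercase the message and strip basic punctuation from each word."""
    return {word.strip(".,!?").lower() for word in text.split()}


def keyword_view(text):
    """Hypothetical view 1: the message contains health vocabulary (toy list)."""
    return bool(_tokens(text) & {"flu", "fever", "cough", "sick", "headache"})


def first_person_view(text):
    """Hypothetical view 2: the message reads like a first-person report."""
    return bool(_tokens(text) & {"i", "my", "me"})


def no_link_view(text):
    """Hypothetical view 3: messages with links are treated as news or ads."""
    return "http" not in text.lower()


VIEWS = [keyword_view, first_person_view, no_link_view]


def majority_vote(text, views=VIEWS):
    """Label a message health-related when most views vote positive."""
    votes = Counter(view(text) for view in views)
    return votes[True] > votes[False]


def f_measure(predicted, actual):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if a and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In a 10-fold cross validation, `f_measure` would be computed on each held-out fold and averaged; the sketch leaves the fold splitting out.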


ASME 2013 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference | 2013

Fad or Here to Stay: Predicting Product Market Adoption and Longevity Using Large Scale, Social Media Data

Conrad S. Tucker

The authors of this work propose a Knowledge Discovery in Databases (KDD) model for predicting product market adoption and longevity using large scale, social media data. Social media data, available through sites such as Twitter®


Journal of Mechanical Design | 2015

Automated Discovery of Lead Users and Latent Product Features by Mining Large Scale Social Media Networks

Suppawong Tuarob; Conrad S. Tucker

Lead users play a vital role in next generation product development, as they help designers discover relevant product feature preferences months or even years before they are desired by the general customer base. Existing design methodologies proposed to extract lead user preferences are typically constrained by temporal, geographic, size, and heterogeneity limitations. To mitigate these challenges, the authors of this work propose a set of mathematical models that mine social media networks for lead users and the product features that they express relating to specific products. The authors hypothesize that: (i) lead users are discoverable from large scale social media networks and (ii) product feature preferences, mined from lead user social media data, represent product features that do not currently exist in product offerings but will be desired in future product launches. An automated approach to lead user product feature identification is proposed to identify latent features (product features unknown to the public) from social media data. These latent features then serve as the key to discovering innovative users from the ever increasing pool of social media users. The authors collect 2.1 × 10^9 social media messages in the United States during a period of 31 months (from March 2011 to September 2013) in order to determine whether lead user preferences are discoverable and relevant to next generation cell phone designs. [DOI: 10.1115/1.4030049]


Journal of Mechanical Design | 2008

Optimal Product Portfolio Formulation by Merging Predictive Data Mining With Multilevel Optimization

Conrad S. Tucker; Harrison M. Kim

This paper addresses two important fundamental areas in product family formulation that have recently begun to receive great attention. First is the incorporation of market demand that we address through a data mining approach where realistic customer preference data are translated into performance design targets. Second is product architecture reconfiguration that we model as a dynamic design entity. The dynamic approach to product architecture optimization differs from conventional static approaches in that a product architecture is not fixed at the initial stage of product design, but rather evolves with fluctuations in customer performance preferences. The benefits of direct customer input in product family design will be realized through the cell phone product family example presented in this work. An optimal family of cell phones is created with modularity decisions made analytically at the engineering level that maximize company profit.


Journal of Mechanical Design | 2011

Trend Mining for Predictive Product Design

Conrad S. Tucker; Harrison M. Kim

The Preference Trend Mining (PTM) algorithm that is proposed in this work aims to address some fundamental challenges of current demand modeling techniques being employed in the product design community. The first contribution is a multistage predictive modeling approach that captures changes in consumer preferences (as they relate to product design) over time, thereby enabling design engineers to anticipate next generation product features before they become mainstream/unimportant. Because consumer preferences may exhibit monotonically increasing or decreasing, seasonal, or unobservable trends, we propose employing a statistical trend detection technique to help detect time series attribute patterns. A time series exponential smoothing technique is then used to forecast future attribute trend patterns and generate a demand model that reflects emerging product preferences over time. The second contribution of this work is a novel classification scheme for attributes that have low predictive power and hence may be omitted from a predictive model. We propose classifying such attributes as either standard, nonstandard, or obsolete by assigning the appropriate classification based on the time series entropy values that an attribute exhibits. By modeling attribute irrelevance, design engineers can determine when to retire certain product features (deemed obsolete) or incorporate others into the actual product architecture (standard) while developing modules for those attributes exhibiting inconsistent patterns throughout time (nonstandard). Several time series data sets using publicly available data are used to validate the proposed preference trend mining model and compare it to traditional demand modeling techniques for predictive accuracy and ease of model generation. [DOI: 10.1115/1.4004987]
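The forecasting step mentioned in this abstract can be illustrated with simple exponential smoothing, the most basic member of that family. The abstract does not say which variant the PTM algorithm uses, so the sketch below is an assumption for illustration only; the function name and the smoothing factor `alpha` are hypothetical.

```python
def exponential_smoothing_forecast(series, alpha):
    """One-step-ahead forecast using simple exponential smoothing.

    The smoothed level starts at the first observation and is updated as
    level = alpha * value + (1 - alpha) * level for each later observation.
    The final level is the forecast for the next period.
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level
```

A larger `alpha` weights recent periods more heavily; with `alpha = 1` the forecast is simply the last observed value.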


Journal of Computing and Information Science in Engineering | 2009

Data-Driven Decision Tree Classification for Product Portfolio Design Optimization

Conrad S. Tucker; Harrison M. Kim

The formulation of a product portfolio requires extensive knowledge about the product market space and also the technical limitations of a company’s engineering design and manufacturing processes. A design methodology is presented that significantly enhances the product portfolio design process by eliminating the need for an exhaustive search of all possible product concepts. This is achieved through a decision tree data mining technique that generates a set of product concepts that are subsequently validated in the engineering design using multilevel optimization techniques. The final optimal product portfolio evaluates products based on the following three criteria: (1) it must satisfy customer price and performance expectations (based on the predictive model) defined here as the feasibility criterion; (2) the feasible set of products/variants validated at the engineering level must generate positive profit that we define as the optimality criterion; (3) the optimal set of products/variants should be a manageable size as defined by the enterprise decision makers and should therefore not exceed the product portfolio limit. The strength of our work is to reveal the tremendous savings in time and resources that exist when decision tree data mining techniques are incorporated into the product portfolio design and selection process. Using data mining tree generation techniques, a customer data set of 40,000 responses with 576 unique attribute combinations (entire set of possible product concepts) is narrowed down to 46 product concepts and then validated through the multilevel engineering design response of feasible products. A cell phone example is presented and an optimal product portfolio solution is achieved that maximizes company profit, without violating customer product performance expectations.
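The three selection criteria listed in this abstract can be read as successive filters. The Python sketch below shows that reading under strong simplifying assumptions: the `Concept` fields are hypothetical, feasibility and profit are taken as already computed, and a greedy ranking by profit stands in for the paper's multilevel optimization.

```python
from dataclasses import dataclass


@dataclass
class Concept:
    name: str
    meets_customer_expectations: bool  # assumed output of the predictive model
    profit: float                      # assumed output of engineering validation


def select_portfolio(concepts, portfolio_limit):
    """Apply the feasibility, optimality, and portfolio-limit criteria in turn."""
    # (1) Feasibility: keep concepts that satisfy price and performance expectations.
    feasible = [c for c in concepts if c.meets_customer_expectations]
    # (2) Optimality: keep feasible concepts that generate positive profit.
    profitable = [c for c in feasible if c.profit > 0]
    # (3) Portfolio limit: keep at most `portfolio_limit` concepts, highest profit first.
    ranked = sorted(profitable, key=lambda c: c.profit, reverse=True)
    return ranked[:portfolio_limit]
```

For example, given five candidate concepts and a limit of two, the function returns the two most profitable concepts among those that pass the first two filters.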


Conference on Information and Knowledge Management | 2013

Discovering health-related knowledge in social media using ensembles of heterogeneous features

Suppawong Tuarob; Conrad S. Tucker; Marcel Salathé; Nilam Ram

Social media is emerging as a powerful source of communication, information dissemination and mining. Being colloquial and ubiquitous in nature makes it easier for users to express their opinions and preferences in a seamless, dynamic manner. Epidemic surveillance systems that utilize social media to detect the emergence of diseases have been proposed in the literature. These systems mostly employ traditional document classification techniques that represent a document with a bag of N-grams. However, such techniques are not optimal for social media, where sparsity and noise are the norm. The authors address the limitations posed by the traditional N-gram based methods and propose to use features that represent different semantic aspects of the data in combination with ensemble machine learning techniques to identify health-related messages in a heterogeneous pool of social media data. Furthermore, the results reveal significant improvement in identifying health-related social media content, which can be critical in the emergence of a novel, unknown disease epidemic.


ASME 2014 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference | 2014

Discovering Next Generation Product Innovations by Identifying Lead User Preferences Expressed Through Large Scale Social Media Data

Suppawong Tuarob; Conrad S. Tucker

An innovative consumer (a.k.a. a lead user) is a consumer of a product who faces needs unknown to the public. Innovative consumers play important roles in the product development process, as their ideas tend to be innovatively unique and can be potentially useful for the development of next generation, innovative products that better satisfy market needs. Oftentimes, consumers portray their usage experience and opinions about products and product features through social networks such as Twitter and Facebook, making social media a viable, information-rich, and large-scale source for mining product related information. The authors of this work propose a data mining methodology to automatically identify innovative consumers from a heterogeneous pool of social media users. Specifically, a mathematical model is proposed to identify latent features (product features unknown to the public) from social media data. These latent features then serve as the key to discovering innovative users from the ever increasing pool of social media users. A real-world case study, which identifies smartphone lead users in the pool of Twitter users, illustrates the promising success of the proposed models.


International World Wide Web Conferences | 2014

On the ground validation of online diagnosis with Twitter and medical records

Todd J. Bodnar; Victoria C. Barclay; Nilam Ram; Conrad S. Tucker; Marcel Salathé

Social media has been considered as a data source for tracking disease. However, most analyses are based on models that prioritize strong correlation with population-level disease rates over determining whether or not specific individual users are actually sick. Taking a different approach, we develop a novel system for social-media-based disease detection at the individual level using a sample of professionally diagnosed individuals. Specifically, we develop a system for making an accurate influenza diagnosis based on an individual's publicly available Twitter data. We find that about half (17/35 = 48.57%) of the users in our sample who were sick explicitly discuss their disease on Twitter. By developing a meta classifier that combines text analysis, anomaly detection, and social network analysis, we are able to diagnose an individual with greater than 99% accuracy even if she does not discuss her health.


Engineering Optimization | 2010

A ReliefF attribute weighting and X-means clustering methodology for top-down product family optimization

Conrad S. Tucker; Harrison M. Kim; Douglas E. Barker; Yuanhui Zhang

This article proposes a top-down product family design methodology that enables product design engineers to identify the optimal number of product architectures directly from the customer preference data set by employing data mining attribute weighting and clustering techniques. The methodology also presents an efficient component sharing strategy to aid in product family commonality decisions. Two key data mining models are presented in this work to help guide the product design process: (1) the ReliefF attribute weighting technique that identifies and ranks product attributes, and (2) the X-means clustering approach that autonomously identifies the optimal number of candidate products. Product family commonality decisions are guided by once again employing the X-means clustering technique, this time to identify the components across product families that are most similar. A family of prototype aerodynamic air particle separators is used to evaluate the efficiency and validity of the proposed product family design methodology.
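To give a feel for attribute weighting, the Python sketch below implements basic Relief, the simpler predecessor of ReliefF. It is a named substitution, not the article's method: ReliefF additionally averages over several nearest neighbours and handles multiple classes and missing values. The function name and the choice of a range-normalized Manhattan distance are assumptions made for this illustration.

```python
def relief_weights(rows, labels):
    """Score numeric attributes with basic Relief.

    An attribute gains weight when it differs on the nearest instance of
    another class (nearest miss) and loses weight when it differs on the
    nearest instance of the same class (nearest hit).
    """
    n_features = len(rows[0])

    # Normalize each attribute difference by the attribute's observed range.
    spans = []
    for f in range(n_features):
        column = [row[f] for row in rows]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(a, b, f):
        return abs(a[f] - b[f]) / spans[f]

    def distance(a, b):
        return sum(diff(a, b, f) for f in range(n_features))

    weights = [0.0] * n_features
    for i, row in enumerate(rows):
        hits = [r for j, r in enumerate(rows) if j != i and labels[j] == labels[i]]
        misses = [r for j, r in enumerate(rows) if labels[j] != labels[i]]
        if not hits or not misses:
            continue  # no neighbour of one kind, so this instance gives no update
        near_hit = min(hits, key=lambda r: distance(row, r))
        near_miss = min(misses, key=lambda r: distance(row, r))
        for f in range(n_features):
            weights[f] += (diff(row, near_miss, f) - diff(row, near_hit, f)) / len(rows)
    return weights
```

Attributes can then be ranked by descending weight; an attribute that separates the classes well ends up with a higher score than one that varies mostly within a class.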

Collaboration

Top Co-Authors

Gül E. Okudan Kremer, Pennsylvania State University

Kevin Lesniak, Pennsylvania State University

Nilam Ram, Pennsylvania State University

Sung Woo Kang, Pennsylvania State University

Sunghoon Lim, Pennsylvania State University

Matthew L. Dering, Pennsylvania State University

Sven G. Bilén, Pennsylvania State University

Timothy W. Simpson, Pennsylvania State University