Christopher C. Yang | Researchain

Publication

Featured research published by Christopher C. Yang.

Journal of the Association for Information Science and Technology | 1998

A smart itsy bitsy spider for the web

Hsinchun Chen; Yi-Ming Chung; Marshall Ramsey; Christopher C. Yang

As part of the ongoing Illinois Digital Library Initiative project, this research proposes an intelligent agent approach to Web searching. In this experiment, we developed two Web personal spiders based on best first search and genetic algorithm techniques, respectively. These personal spiders can dynamically take a user's selected starting homepages and search for the most closely related homepages in the Web, based on the links and keyword indexing. A graphical, dynamic, Java-based interface was developed and is available for Web access. A system architecture for implementing such an agent-based spider is presented, followed by detailed discussions of benchmark testing and user evaluation results. In benchmark testing, although the genetic algorithm spider did not outperform the best first search spider, we found both results to be comparable and complementary. In user evaluation, the genetic algorithm spider obtained a significantly higher recall value than the best first search spider. However, their precision values were not statistically different. The mutation process introduced in the genetic algorithm allows users to find other potentially relevant homepages that cannot be explored via a conventional local search process. In addition, we found the Java-based interface to be a necessary component for the design of a truly interactive and dynamic Web agent.
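
The best first search spider described in this abstract can be illustrated with a minimal sketch. It assumes the Web is available as two in-memory dictionaries (`links` for outgoing links, `keywords` for each page's indexed keywords) and uses Jaccard similarity as the relatedness score; the function names, the data layout, and the similarity measure are all illustrative assumptions, not details taken from the paper.

```python
import heapq


def jaccard(a, b):
    """Jaccard similarity between two keyword sets (assumed scoring function)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_first_spider(links, keywords, start_pages, max_results):
    """Return up to max_results (page, score) pairs, most related pages first.

    links:    dict mapping a page to the pages it links to (hypothetical layout)
    keywords: dict mapping a page to its set of indexed keywords
    """
    # The target profile is the union of the starting pages' keywords.
    target = set().union(*(keywords[p] for p in start_pages))
    visited = set(start_pages)
    frontier = []  # min-heap of (-score, page), so the best page pops first

    def push_neighbours(page):
        for nxt in links.get(page, ()):
            if nxt not in visited and nxt in keywords:
                visited.add(nxt)
                score = jaccard(keywords[nxt], target)
                heapq.heappush(frontier, (-score, nxt))

    for page in start_pages:
        push_neighbours(page)

    results = []
    while frontier and len(results) < max_results:
        neg_score, page = heapq.heappop(frontier)
        results.append((page, -neg_score))
        # Expand the best page found so far before any weaker candidate.
        push_neighbours(page)
    return results
```

Because the frontier is a priority queue, the spider always expands the most promising page it has seen. That is also why it behaves as a local search and can miss related pages that are not reachable through high-scoring links.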

Decision Support Systems | 2003

Visualization of large category map for internet browsing

Christopher C. Yang; Hsinchun Chen; Kay Hong

Information overload is a critical problem on the World Wide Web. The category map, developed based on Kohonen's self-organizing map (SOM), has been proven to be a promising browsing tool for the Web. The SOM algorithm automatically categorizes a large Internet information space into manageable sub-spaces. It compresses and transforms a complex information space into a two-dimensional graphical representation. Such a graphical representation provides a user-friendly interface for users to explore the automatically generated mental model. However, as the amount of information increases, the size of the category map is expected to increase accordingly in order to accommodate the important concepts in the information space. This increases the visual load of the category map. A large pool of information is packed closely together in a display window of limited size, where local details are difficult to see clearly. In this paper, we propose fisheye views and fractal views to support the visualization of the category map. Fisheye views are developed based on the distortion approach while fractal views are developed based on the information reduction approach. The purpose of fisheye views is to enlarge the regions of interest and diminish the regions that are further away while maintaining the global structure. On the other hand, fractal views are an approximation mechanism to abstract complex objects and control the amount of information to be displayed. We have developed a prototype system and conducted a user evaluation to investigate the performance of fisheye views and fractal views. The results show that both fisheye views and fractal views significantly increase the effectiveness of visualizing the category map. In addition, fractal views are significantly better than fisheye views, but the combination of fractal views and fisheye views does not increase the performance compared to each individual technique.

Decision Support Systems | 1998

An intelligent personal spider (agent) for dynamic Internet/intranet searching

Hsinchun Chen; Yi-Ming Chung; Marshall Ramsey; Christopher C. Yang

As Internet services based on the World-Wide Web become more popular, information overload has become a pressing research problem. Difficulties with search on the Internet will worsen as the amount of on-line information increases. A scalable approach to Internet search is critical to the success of Internet services and other current and future National Information Infrastructure (NII) applications. As part of the ongoing Illinois Digital Library Initiative project, this research proposes an intelligent personal spider (agent) approach to Internet searching. The approach, which is grounded in automatic textual analysis and general-purpose search algorithms, is expected to be an improvement over the current static and inefficient Internet searches. In this experiment, we implemented Internet personal spiders based on best first search and genetic algorithm techniques. These personal spiders can dynamically take a user's selected starting homepages and search for the most closely related homepages in the Web, based on the links and keyword indexing. A plain, static CGI/HTML-based interface was developed earlier, followed by a recent enhancement of a graphical, dynamic Java-based interface. Preliminary evaluation results and two working prototypes (available for Web access) are presented. Although the examples and evaluations presented are mainly based on Internet applications, the applicability of the proposed techniques to the potentially more rewarding Intranet applications should be obvious. In particular, we believe the proposed agent design can be used to locate organization-wide information, to gather new, time-critical organizational information, and to support team-building and communication in Intranets.

Proceedings of the 2012 international workshop on Smart health and wellbeing | 2012

Social media mining for drug safety signal detection

Christopher C. Yang; Haodong Yang; Ling Jiang; Mi Zhang

Adverse Drug Reactions (ADRs) represent a serious problem all over the world. They may complicate a patient's medical conditions and increase morbidity, even mortality. Drug safety currently depends heavily on post-marketing surveillance, because the pre-marketing review process is limited by scale and time span and cannot identify all possible adverse drug reactions. However, current post-marketing surveillance is conducted through centralized voluntary reporting systems, and the reporting rate is low. Consequently, it is difficult to detect adverse drug reaction signals in a timely manner. To solve this problem, many researchers have explored methods to detect ADRs in electronic health records. Nevertheless, we only have access to electronic health records from particular health units. Aggregating and integrating electronic health records from multiple sources is rather challenging. With the advance of Web 2.0 technologies and the popularity of social media, many health consumers are discussing and exchanging health-related information with their peers. Many of these online discussions involve adverse drug reactions. In this work, we propose to use association mining and Proportional Reporting Ratios to mine the associations between drugs and adverse reactions from the user-contributed content in social media. We have conducted an experiment using ten drugs and five adverse drug reactions. The FDA alerts are used as the gold standard to test the performance of the proposed techniques. The results show that the metrics leverage, lift, and PRR are all promising for detecting the adverse drug reactions reported by the FDA. However, PRR outperformed the other two metrics.
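
The three metrics named in this abstract have standard textbook definitions over a 2x2 contingency table of drug and reaction mentions. The sketch below uses those standard definitions; the `Counts` class, its field names, and the choice of one post as the counting unit are illustrative assumptions, and the paper's own counting procedure may differ. It also assumes no row or column total is zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """2x2 contingency table of post counts (hypothetical layout)."""
    a: int  # posts mentioning the drug and the reaction
    b: int  # posts mentioning the drug without the reaction
    c: int  # posts mentioning other drugs with the reaction
    d: int  # posts mentioning other drugs without the reaction

    @property
    def n(self):
        return self.a + self.b + self.c + self.d


def prr(t):
    """Proportional Reporting Ratio: reaction rate for this drug vs. all others."""
    return (t.a / (t.a + t.b)) / (t.c / (t.c + t.d))


def lift(t):
    """Lift: P(drug, reaction) / (P(drug) * P(reaction))."""
    p_both = t.a / t.n
    p_drug = (t.a + t.b) / t.n
    p_reaction = (t.a + t.c) / t.n
    return p_both / (p_drug * p_reaction)


def leverage(t):
    """Leverage: P(drug, reaction) - P(drug) * P(reaction)."""
    p_both = t.a / t.n
    p_drug = (t.a + t.b) / t.n
    p_reaction = (t.a + t.c) / t.n
    return p_both - p_drug * p_reaction
```

A PRR or lift above 1, or a leverage above 0, indicates that the reaction is mentioned with the drug more often than independence would predict.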

IEEE Transactions on Knowledge and Data Engineering | 2014

Identifying Features in Opinion Mining via Intrinsic and Extrinsic Domain Relevance

Zhen Hai; Kuiyu Chang; Jung-jae Kim; Christopher C. Yang

The vast majority of existing approaches to opinion feature extraction rely on mining patterns only from a single review corpus, ignoring the nontrivial disparities in word distributional characteristics of opinion features across different corpora. In this paper, we propose a novel method to identify opinion features from online reviews by exploiting the difference in opinion feature statistics across two corpora, one domain-specific corpus (i.e., the given review corpus) and one domain-independent corpus (i.e., the contrasting corpus). We capture this disparity via a measure called domain relevance (DR), which characterizes the relevance of a term to a text collection. We first extract a list of candidate opinion features from the domain review corpus by defining a set of syntactic dependence rules. For each extracted candidate feature, we then estimate its intrinsic-domain relevance (IDR) and extrinsic-domain relevance (EDR) scores on the domain-dependent and domain-independent corpora, respectively. Candidate features that are less generic (EDR score less than a threshold) and more domain-specific (IDR score greater than another threshold) are then confirmed as opinion features. We call this interval thresholding approach the intrinsic and extrinsic domain relevance (IEDR) criterion. Experimental results on two real-world review domains show the proposed IEDR approach to outperform several other well-established methods in identifying opinion features.
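
The interval thresholding step of the IEDR criterion can be sketched as a simple filter. This sketch assumes the IDR and EDR scores have already been computed and are passed in as dictionaries; it does not implement the domain relevance measure itself, and the function name, parameter names, and threshold values are hypothetical.

```python
def iedr_filter(candidates, idr_scores, edr_scores, idr_min, edr_max):
    """Keep candidates that are domain-specific and not generic.

    A candidate is confirmed as an opinion feature when its intrinsic-domain
    relevance exceeds idr_min and its extrinsic-domain relevance is below
    edr_max. Both thresholds are assumed to be tuned by the caller.
    """
    return [
        term
        for term in candidates
        if idr_scores[term] > idr_min and edr_scores[term] < edr_max
    ]
```

For example, in a phone review corpus a term such as "battery" would be expected to score high on IDR and low on EDR, while a generic term such as "thing" would score high on both and be rejected by the EDR bound.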

Intelligence and Security Informatics | 2007

Terrorism and Crime Related Weblog Social Network: Link, Content Analysis and Information Visualization

Christopher C. Yang; Tobun Dorbin Ng

A Weblog is a Web site where entries are made in diary style, maintained by its sole author, a blogger, and displayed in reverse chronological order. Due to the freedom and convenience of publishing in Weblogs, this form of media provides an ideal environment as a propaganda platform for terrorist groups to promote their ideologies and as an operation platform for organizing crimes. In this work, we present a framework to analyze and visualize the Weblog social network embedded beneath relevant Weblogs gathered through topic-specific exploration. Link analysis uses the relationships between bloggers to construct the Weblog social network. Content analysis associates similar blog messages to unveil implicit relationships found in the semantics to further improve the Weblog social network analysis. Users can use different interactive information visualization techniques to explore various aspects of the underlying social network at different levels of abstraction. With the capability of analyzing and visualizing Weblog social networks in terrorist and crime related matters, intelligence agencies and law enforcement will have additional tools and means to ensure national security.

Information Systems and E-business Management | 2010

Understanding what concerns consumers: a semantic approach to product feature extraction from consumer reviews

Chih-Ping Wei; Yen-Ming Chen; Chin-Sheng Yang; Christopher C. Yang

The Web has become an excellent source for gathering consumer opinions (more specifically, consumer reviews) about products. Consumer reviews are essential for retailers and product manufacturers to understand the general responses of customers to their products and improve their marketing campaigns or products accordingly. In addition, consumer reviews enable retailers to recognize the specific preferences of each customer, which facilitates effective marketing decisions. As the number of consumer reviews expands, it is essential and desirable to develop an efficient and effective sentiment analysis technique that is capable of extracting product features stated in consumer reviews (i.e., product feature extraction) and determining the sentiments (positive or negative semantic orientations) of consumers for these product features (i.e., opinion orientation identification). Product feature extraction is critical to sentiment analysis, because its effectiveness significantly affects the performance of opinion orientation identification, as well as the ultimate effectiveness of sentiment analysis. Therefore, this study concentrates on product feature extraction from consumer reviews. Specifically, we propose a semantic-based product feature extraction (SPE) technique that exploits a list of positive and negative adjectives defined in the General Inquirer to recognize opinion words semantically and subsequently extract product features expressed in consumer reviews. Using a prevalent product feature extraction technique and the SPE-GI technique (a variant of SPE) as performance benchmarks, our empirical evaluation shows that the proposed SPE technique outperforms both benchmark techniques.

Systems, Man, and Cybernetics | 2009

Discovering Event Evolution Graphs From News Corpora

Christopher C. Yang; Xiaodong Shi; Chih-Ping Wei

Given the advance of Internet technologies, we can now easily extract hundreds or thousands of news stories of any ongoing incident from newswires such as CNN.com, but the volume of information is too large for us to capture the blueprint. Information retrieval techniques such as topic detection and tracking are able to organize news stories as events, in a flat hierarchical structure, within a topic. However, they are incapable of presenting the complex evolution relationships between the events. We are interested in learning not only what the major events are but also how they develop within the topic. It is beneficial to identify the seminal events, the intermediary and ending events, and the evolution of these events. In this paper, we propose to utilize the event timestamp, event content similarity, temporal proximity, and document distributional proximity to model the event evolution relationships between events in an incident. An event evolution graph is constructed to present the underlying structure of events for efficient browsing and extracting of information. A case study and experiments are presented to illustrate and show the performance of our proposed technique. It is found that our proposed technique outperforms the baseline technique and other comparable techniques in previous work.

Decision Support Systems | 2000

Intelligent internet searching agent based on hybrid simulated annealing

Christopher C. Yang; Jerome Yen; Hsinchun Chen

The World-Wide Web (WWW) based Internet services have become a major channel for information delivery. For the same reason, information overload has also become a serious problem for the users of such services. It has been estimated that the amount of information stored on the Internet doubled every 18 months. The number of homepages may have grown even faster; some people estimated that it doubled every 6 months. Therefore, a scalable approach to support Internet searching is critical to the success of Internet services and other current or future National Information Infrastructure (NII) applications. In this paper, we discuss a modified version of the simulated annealing algorithm to develop an intelligent personal spider (agent), which is based on automatic textual analysis of Internet documents and hybrid simulated annealing.
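
For orientation, plain simulated annealing can be sketched as follows. This is the generic textbook algorithm, not the modified hybrid variant the paper develops, whose details are not given in the abstract. The `neighbours` and `score` callables, the geometric cooling schedule, and all default parameter values are assumptions made for illustration.

```python
import math
import random


def simulated_annealing(start, neighbours, score,
                        t0=1.0, cooling=0.95, steps=200, seed=0):
    """Maximize score(state) with plain simulated annealing.

    neighbours(state) returns a list of candidate next states, and
    score(state) returns a number where higher is better. For a search
    spider, a state could be a page and score its relevance (an assumption).
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    current = best = start
    temp = t0
    for _ in range(steps):
        options = neighbours(current)
        if not options:
            break
        candidate = rng.choice(options)
        delta = score(candidate) - score(current)
        # Always accept improvements; accept a worse state with probability
        # exp(delta / temp), which shrinks as the temperature cools.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = candidate
            if score(current) > score(best):
                best = current
        temp *= cooling  # geometric cooling schedule (assumed)
    return best
```

Early on, the high temperature lets the search accept weaker pages and so escape local optima; as the temperature falls, the search behaves increasingly like greedy hill climbing.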

International World Wide Web Conferences | 2003

Fractal summarization for mobile devices to access large documents on the web

Christopher C. Yang; Fu Lee Wang

Wireless access with mobile (or handheld) devices is a promising addition to the WWW and traditional electronic business. Mobile devices provide convenient and portable access to the huge information space on the Internet without requiring users to be stationary with a network connection. However, the limited screen size, narrow network bandwidth, small memory capacity and low computing power are the shortcomings of handheld devices. Loading and visualizing large documents on handheld devices becomes impossible. The limited resolution restricts the amount of information that can be displayed. The download time is intolerably long. In this paper, we introduce the fractal summarization model for document summarization on handheld devices. Fractal summarization is developed based on fractal theory. It generates a brief skeleton of the summary at the first stage, and the details of the summary on different levels of the document are generated on demand from users. Such interactive summarization reduces the computation load compared with generating the entire summary in one batch, as traditional automatic summarization does, which makes it ideal for wireless access. A three-tier architecture with the middle tier conducting the major computation is also discussed. Visualization of the summary on handheld devices is also investigated.
