Han-Shen Huang
Academia Sinica
Publications
Featured research published by Han-Shen Huang.
Intelligent Systems in Molecular Biology | 2008
Chun-Nan Hsu; Yu-Ming Chang; Cheng-Ju Kuo; Yu-Shi Lin; Han-Shen Huang; I-Fang Chung
Motivation: Tagging gene and gene product mentions in scientific text is an important initial step in literature mining. In this article, we describe in detail the gene mention tagger with which we participated in the BioCreative 2 challenge and analyze what contributes to its good performance. Our tagger is based on conditional random fields (CRF), the most widely used method for the gene mention tagging task in BioCreative 2. The tagger is notable because it achieved the highest F-score among CRF-based methods and the second highest overall. Moreover, we obtained our results mostly with open source packages, making them easy to reproduce. Results: We first describe in detail how we developed our CRF-based tagger. We designed a very high dimensional feature set that includes most of the information that may be relevant. We trained bi-directional CRF models with the same set of features, one applying forward parsing and the other backward, and integrated the two models based on their output scores and dictionary filtering. One of the most prominent factors contributing to the tagger's good performance is the integration of the additional backward parsing model. From the definition of a CRF, however, it would appear that the model is symmetric and that bi-directional parsing models should produce identical results. We show that, due to different feature settings, a CRF model can be asymmetric; the feature setting of our BioCreative 2 tagger not only produces different results but also gives backward parsing models a slight but consistent advantage over forward parsing models. To fully explore the potential of integrating bi-directional parsing models, we applied different asymmetric feature settings to generate many bi-directional parsing models and integrated them based on their output scores. Experimental results show that this integrated model achieves an even higher F-score, based solely on the training corpus for gene mention tagging.
Availability: Data sets, programs, and an on-line service of our gene mention tagger can be accessed at http://aiia.iis.sinica.edu.tw/biocreative2.htm. Contact: [email protected]
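The score-based integration of the forward and backward models can be sketched as follows. This is a minimal illustration, assuming each model emits per-mention confidence scores; the function name, the max-score merging rule, and the threshold are hypothetical stand-ins, not details taken from the paper:

```python
def integrate_bidirectional(forward_mentions, backward_mentions, dictionary, threshold=0.5):
    """Merge gene mentions proposed by a forward-parsing and a
    backward-parsing CRF model.  A mention is kept if its best score
    from either model clears a threshold, or if it appears in a curated
    dictionary (a simple form of dictionary filtering).  Hypothetical
    scheme: the paper integrates models 'based on the output scores and
    dictionary filtering' without committing to these exact rules."""
    combined = {}
    for mention, score in list(forward_mentions.items()) + list(backward_mentions.items()):
        # keep the higher of the two models' scores for each mention
        combined[mention] = max(combined.get(mention, 0.0), score)
    return {m for m, score in combined.items()
            if score >= threshold or m in dictionary}
```

For example, a mention scored weakly by both models can still survive if the dictionary vouches for it, while a mention confidently tagged by either direction passes on score alone.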
Machine Learning | 2009
Chun-Nan Hsu; Han-Shen Huang; Yu-Ming Chang; Yuh-Jye Lee
It has been established that second-order stochastic gradient descent (SGD) can potentially achieve generalization performance as good as the empirical optimum in a single pass through the training examples. However, second-order SGD requires computing the inverse of the Hessian matrix of the loss function, which is prohibitively expensive for structured prediction problems, which usually involve a very high dimensional feature space. This paper presents a new second-order SGD method called Periodic Step-size Adaptation (PSA). PSA approximates the Jacobian matrix of the mapping function and exploits a linear relation between the Jacobian and the Hessian to approximate the Hessian, which proves simpler and more effective than directly approximating the Hessian in an on-line setting. We tested PSA on a wide variety of models and tasks, including large-scale sequence labeling with conditional random fields and large-scale classification with linear support vector machines and convolutional neural networks. Experimental results show that the single-pass performance of PSA is always very close to the empirical optimum.
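The periodic, per-coordinate flavor of the method can be sketched as below. This is only an illustration of the control flow, assuming a simple sign-agreement heuristic in place of PSA's actual Jacobian-based Hessian approximation; the function name and all constants are invented for the sketch:

```python
def psa_like_sgd(grad, w, eta0=0.1, period=10, steps=200, shrink=0.5, grow=1.05):
    """Sketch of periodic per-coordinate step-size adaptation for SGD.
    Every `period` steps, a coordinate's rate is grown if its aggregate
    updates over the last two periods agreed in sign (steady progress)
    and shrunk if they disagreed (oscillation).  This heuristic stands
    in for PSA's Jacobian-based approximation, which is not reproduced."""
    n = len(w)
    eta = [eta0] * n
    prev_delta = [0.0] * n   # aggregate update over the previous period
    acc = [0.0] * n          # aggregate update over the current period
    for t in range(1, steps + 1):
        g = grad(w)
        for i in range(n):
            d = -eta[i] * g[i]
            w[i] += d
            acc[i] += d
        if t % period == 0:  # periodic adaptation, once per period
            for i in range(n):
                if prev_delta[i] * acc[i] > 0:
                    eta[i] *= grow       # consistent direction: speed up
                elif prev_delta[i] * acc[i] < 0:
                    eta[i] *= shrink     # oscillation: slow down
                prev_delta[i], acc[i] = acc[i], 0.0
    return w
```

Adapting only once per period, rather than at every step, is what keeps the per-step cost essentially that of plain first-order SGD.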
International Conference on Data Mining | 2005
Han-Shen Huang; Bo-Hou Yang; Chun-Nan Hsu
This paper presents the triple jump framework for accelerating the EM algorithm and other bound optimization methods. The idea is to extrapolate a third search point from the previous two search points found by regular EM. As regular EM converges more slowly, the distance of the triple jump grows, providing higher speedups precisely on data sets where EM converges slowly. Experimental results show that the triple jump framework significantly outperforms EM and other EM acceleration methods for a variety of probabilistic models, especially when the data set is sparse. The results also show that the framework is particularly effective for cluster models.
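The extrapolation step above can be sketched with the standard Aitken form: estimate the contraction rate r from two consecutive EM steps and jump ahead by a factor r/(1-r). This is a minimal sketch of the idea, assuming `step` is the EM mapping; the stopping rule and fallback when r >= 1 are illustrative choices:

```python
def triple_jump(step, theta, tol=1e-10, max_iter=1000):
    """Accelerate a fixed-point mapping `step` (e.g. one EM iteration)
    by Aitken-style extrapolation: two regular steps, then a jump of
    length (c - b) * r / (1 - r), where r estimates the contraction
    rate.  The slower the convergence (r near 1), the longer the jump."""
    iters = 0
    while iters < max_iter:
        t1 = step(theta)
        t2 = step(t1)
        iters += 2
        den = sum((b - a) ** 2 for a, b in zip(theta, t1)) ** 0.5
        num = sum((b - a) ** 2 for a, b in zip(t1, t2)) ** 0.5
        if den < tol or num < tol:          # already at a fixed point
            return t2, iters
        r = num / den                        # estimated contraction rate
        if r < 1.0:
            gamma = r / (1.0 - r)            # jump grows as r -> 1
            theta = [b + gamma * (b - a) for a, b in zip(t1, t2)]
        else:
            theta = t2                       # no valid jump: fall back to EM
    return theta, iters
```

On an exactly linear contraction mapping the very first jump lands on the fixed point, which is why the speedup is largest where plain EM crawls.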
Information Systems Frontiers | 2006
Jane Yung-jen Hsu; Kwei-Jay Lin; Tsung-Hsiang Chang; Chien-Ju Ho; Han-Shen Huang; Wan-rong Jih
Distributed trust management addresses the challenges of eliciting, evaluating, and propagating trust for service providers on a distributed network. By delegating trust management to brokers, individual users can share their feedback on services without the overhead of maintaining their own ratings. This research proposes a two-tier trust hierarchy, in which a user relies on her broker to provide reputation ratings about any service provider, while brokers leverage their connected partners to aggregate the reputation of unfamiliar service providers. Each broker collects feedback from its users on past transactions. To accommodate individual differences, personalized trust is modeled with a Bayesian network. Training strategies such as the expectation maximization (EM) algorithm can be deployed to estimate both server reputation and user bias. This paper presents the design and implementation of a distributed trust simulator, which supports experiments under different configurations. In addition, we have conducted experiments showing that: 1) personal rating error converges to below 5% consistently within 10,000 transactions, regardless of the training strategy or bias distribution; 2) the choice of trust model has a significant impact on the performance of reputation prediction; and 3) the two-tier trust framework scales well to distributed environments. In summary, parameter learning of trust models in the broker-based framework enables both aggregation of feedback and personalized reputation prediction.
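The joint estimation of server reputation and user bias can be illustrated with a deliberately simple additive model fit by alternating updates, an EM-like stand-in for the paper's Bayesian network formulation. The model (rating = reputation + bias + noise), the function name, and the iteration count are assumptions made for this sketch:

```python
def fit_reputation(ratings, n_iter=20):
    """Alternately re-estimate each server's reputation as the mean
    bias-corrected rating it received, and each user's bias as the
    mean residual of her ratings.  `ratings` is a list of
    (user, server, rating) triples.  Note the solution is identifiable
    only up to a constant shift between reputations and biases."""
    users = {u for u, _, _ in ratings}
    servers = {s for _, s, _ in ratings}
    rep = {s: 0.0 for s in servers}
    bias = {u: 0.0 for u in users}
    for _ in range(n_iter):
        for s in servers:
            vals = [r - bias[u] for u, s2, r in ratings if s2 == s]
            rep[s] = sum(vals) / len(vals)
        for u in users:
            vals = [r - rep[s] for u2, s, r in ratings if u2 == u]
            bias[u] = sum(vals) / len(vals)
    return rep, bias
```

Because only differences are identifiable, a broker would compare reputations relative to one another rather than read them as absolute scores.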
Congress on Evolutionary Computation | 2005
Koung-Lung Lin; Jane Yung-jen Hsu; Han-Shen Huang; Chun-Nan Hsu
Recommender systems are a powerful tool for promoting sales in electronic commerce. An effective shopping recommender system can help boost a retailer's sales by reminding customers to purchase additional products not originally on their shopping lists. Existing recommender systems are designed to identify top selling items, also called hot sellers, based on the store's sales data and customer purchase behavior. It turns out that timely reminders for unsought products, the cold sellers that the consumer either does not know about or does not normally think of buying, present great opportunities for significant sales growth. In this paper, we propose the framework and process of a recommender system that identifies potential customers of unsought products using boosting-SVM. The empirical results show that the proposed approach provides a promising solution to targeted advertising for unsought products in an e-commerce environment.
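The boosting half of boosting-SVM can be sketched with a minimal AdaBoost loop. To keep the sketch dependency-free, one-feature threshold stumps stand in for the SVM base learners the paper uses; all names here are illustrative:

```python
import math

def stump_predict(x, feat, thresh, sign):
    """A one-feature threshold classifier returning +/-1."""
    return sign if x[feat] > thresh else -sign

def best_stump(X, y, w):
    """Exhaustively pick the stump with lowest weighted error."""
    best, best_err = None, float("inf")
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for sign in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(xi, f, t, sign) != yi)
                if err < best_err:
                    best_err, best = err, (f, t, sign)
    return best, best_err

def train_adaboost(X, y, n_rounds=3):
    """Minimal AdaBoost: reweight examples toward those the current
    base learner gets wrong.  Labels y must be in {-1, +1}."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        (f, t, sign), err = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, f, t, sign))
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, f, t, sign))
             for xi, yi, wi in zip(X, y, w)]
        z = sum(w)
        w = [wi / z for wi in w]                 # renormalize weights
    return ensemble

def predict(ensemble, x):
    s = sum(a * stump_predict(x, f, t, sign) for a, f, t, sign in ensemble)
    return 1 if s >= 0 else -1
```

In the unsought-product setting, the reweighting step is what lets later base learners focus on the rare customers who actually bought a cold seller.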
International Conference on Data Mining | 2007
Han-Shen Huang; Yu-Ming Chang; Chun-Nan Hsu
For applications with consecutively arriving training examples, on-line learning can potentially achieve a likelihood as high as off-line learning without scanning all available training examples, and it usually has a much smaller memory footprint. To train CRFs on-line, this paper presents the Periodic Step-size Adaptation (PSA) method, which dynamically adjusts the learning rates in stochastic gradient descent. We applied our method to three large-scale text mining tasks. Experimental results show that, in terms of the number of passes required over the training data, PSA outperforms the best off-line algorithm, L-BFGS, by several hundred times, and the best on-line algorithm, SMD, by an order of magnitude.
International Conference on Robotics and Automation | 1990
Han-Shen Huang; Ming-Hung Lin
A modified computed torque controller was developed by N.H. McClamroch and D. Wang (1987). If the mathematical model of the robot is exact, the modified computed torque controller can simultaneously control the robot's motion and contact force accurately. However, the model may contain uncertainties, such as flexibility of joints and links, joint friction, and an inexact surface model. It is shown that the modified computed torque controller may result in an unstable closed-loop system in the presence of such uncertainties. This difficulty can be overcome by using a variable structure controller, which is robust in that it is insensitive to variations in the plant parameters and to external disturbances of the contact force.
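The robustness argument for a variable structure (sliding-mode) controller can be illustrated on a toy double integrator with a bounded unknown disturbance. This is a generic textbook sketch, not the paper's controller; the plant, gains, and disturbance are all assumptions:

```python
import math

def simulate_vsc(x0=1.0, v0=0.0, lam=2.0, k=5.0, dt=1e-3, steps=5000):
    """Simulate a double integrator x'' = u + d under the switching law
    u = -k * sign(s) with sliding surface s = v + lam * x.  As long as
    the gain k exceeds the disturbance bound, the state is driven to the
    surface and then slides to the origin despite never modeling d."""
    x, v = x0, v0
    for i in range(steps):
        d = 0.5 * math.sin(0.01 * i)          # unmodeled bounded disturbance
        s = v + lam * x                        # sliding surface
        u = -k * (1 if s > 0 else -1 if s < 0 else 0)
        a = u + d                              # true (uncertain) dynamics
        v += a * dt                            # semi-implicit Euler step
        x += v * dt
    return x, v
```

The controller never uses d, yet the state ends up in a small neighborhood of the origin; the price is the well-known chattering of the switching control, which boundary-layer variants smooth out.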
International Conference on Data Mining | 2006
Chun-Nan Hsu; Han-Shen Huang; Bo-Hou Yang
The expectation-maximization (EM) algorithm is one of the most popular algorithms for data mining from incomplete data. However, when applied to large data sets with a large proportion of missing data, the EM algorithm may converge slowly. The triple jump extrapolation method can effectively accelerate the EM algorithm by substantially reducing the number of iterations required for EM to converge. There are two options for the triple jump method: global extrapolation (TJEM) and componentwise extrapolation (CTJEM). We tried both on a variety of probabilistic models and found that global extrapolation generally performs better, but there are cases where componentwise extrapolation yields very high speedups. In this paper, we investigate when componentwise extrapolation should be preferred. We conclude that CTJEM should be preferred when the Jacobian of the EM mapping is diagonal or block diagonal. We show how to determine whether a Jacobian is diagonal or block diagonal and experimentally confirm our claim. In particular, we show that CTJEM is especially effective for the semi-supervised Bayesian classifier model on a highly sparse data set.
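The componentwise variant can be sketched by applying the Aitken extrapolation to each coordinate independently, which is exact when the mapping's Jacobian is diagonal (each coordinate then contracts at its own rate). A minimal sketch, with illustrative names and guards:

```python
def componentwise_jump(step, theta, n_jumps=5, eps=1e-12):
    """CTJEM-style acceleration: take two steps of the mapping, then
    Aitken-extrapolate each coordinate with its own estimated
    contraction rate.  Exact in one jump for a mapping with a constant
    diagonal Jacobian, even when the per-coordinate rates differ."""
    for _ in range(n_jumps):
        t1 = step(theta)
        t2 = step(t1)
        new = []
        for a, b, c in zip(theta, t1, t2):
            denom = b - a
            if abs(denom) < eps:
                new.append(c)                 # coordinate already converged
                continue
            r = (c - b) / denom               # per-coordinate contraction rate
            # jump only when the rate indicates contraction
            new.append(c + (c - b) * r / (1 - r) if abs(r) < 1 else c)
        theta = new
    return theta
```

Global extrapolation would instead use a single rate for all coordinates, which is why it loses ground exactly in the (block) diagonal case the paper identifies.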
International Conference on Robotics and Automation | 1997
Han-Shen Huang; Li-Chen Fu; Jane Yung-jen Hsu
In this paper, we first discuss the architecture of a flexible automated production system in which all components are well modularized. We then propose a solution, EMFAK (event-driven multi-tasking flexible automation kernel), to speed up the construction of the system controller. Finally, a flexible assembly system built on EMFAK is described to show how to apply the kernel to a real flexible automated production system.
Data Mining and Knowledge Discovery | 2009
Han-Shen Huang; Bo-Hou Yang; Yu-Ming Chang; Chun-Nan Hsu
The triple jump extrapolation method is an effective approximation of Aitken's acceleration that can speed the convergence of many data mining algorithms, including EM and generalized iterative scaling (GIS). It has two options: global and componentwise extrapolation. Empirical studies showed that neither dominates the other, and it was not known which is better under which conditions. In this paper, we investigate this problem and conclude that componentwise extrapolation is more effective when the Jacobian is (block) diagonal. We derive two hints for determining block diagonality. The first is that, for a highly sparse data set, the Jacobian of the EM mapping for training a Bayesian network will be block diagonal. The second is that the block diagonality of the Jacobian of the GIS mapping for training a CRF is negatively correlated with the strength of feature dependencies. We empirically verify these hints with controlled and real-world data sets and show that they accurately predict which method will be superior. We also show that both global and componentwise extrapolation can provide substantial acceleration. In particular, when applied to training large-scale CRF models, the GIS variant accelerated by componentwise extrapolation not only outperforms its global extrapolation counterpart, as our hint predicts, but can also compete with limited-memory BFGS (L-BFGS), the de facto standard for CRF training, in terms of both computational efficiency and F-scores. Although none of the above methods is as fast as stochastic gradient descent (SGD), SGD requires careful tuning, and the results given in this paper provide a useful foundation for automatic tuning.