Publication


Featured research published by Baichuan Li.


conference on information and knowledge management | 2010

Routing questions to appropriate answerers in community question answering services

Baichuan Li; Irwin King

Community Question Answering (CQA) services provide a platform for an increasing number of users to ask and answer questions for their own needs, but some questions still remain unanswered within a fixed period. To address this, the paper aims to route questions to the right answerers, those ranked highest according to their previous answering performance. To rank the answerers, we propose a framework called Question Routing (QR), which consists of four phases: (1) performance profiling, (2) expertise estimation, (3) availability estimation, and (4) answerer ranking. Applying the framework, we conduct experiments with a Yahoo! Answers dataset, and the results demonstrate that on average each of 1,713 testing questions obtains at least one answer if it is routed to the top 20 ranked answerers.


international world wide web conferences | 2012

Analyzing and predicting question quality in community question answering services

Baichuan Li; Tan Jin; Michael R. Lyu; Irwin King; Barley Mak

Users tend to ask and answer questions in community question answering (CQA) services to seek information and share knowledge. As a result, a myriad of questions and answers appear in CQA services. Accordingly, many studies have explored answer quality so as to provide a preliminary screening for better answers. However, to our knowledge, less attention has so far been paid to question quality in CQA. Knowing question quality helps us find and recommend good questions and identify bad ones that hinder the CQA service. In this paper, we conduct two studies to investigate the question quality issue. The first study analyzes the factors of question quality and finds that the interaction between askers and topics results in differences in question quality. Based on this finding, in the second study we propose a Mutual Reinforcement-based Label Propagation (MRLP) algorithm to predict question quality. We experiment with Yahoo! Answers data and the results demonstrate the effectiveness of our algorithm in distinguishing high-quality questions from low-quality ones.


conference on information and knowledge management | 2011

Question routing in community question answering: putting category in its place

Baichuan Li; Irwin King; Michael R. Lyu

This paper investigates a ground-breaking incorporation of question category into Question Routing (QR) in Community Question Answering (CQA) services. The incorporation of question category was designed to estimate answerer expertise for routing questions to potential answerers. Two category-sensitive Language Models (LMs) were developed and evaluated on large-scale real-world data sets. Results demonstrated that higher question-routing accuracy was achieved at lower computational cost, relative to the traditional Query Likelihood LM (QLLM), the state-of-the-art Cluster-Based LM (CBLM), and the mixture of Latent Dirichlet Allocation and QLLM (LDALM).


spoken language technology workshop | 2010

Collection of user judgments on spoken dialog system with crowdsourcing

Zhaojun Yang; Baichuan Li; Yi Zhu; Irwin King; Gina-Anne Levow; Helen M. Meng

This paper presents an initial attempt at the use of crowd-sourcing for collection of user judgments on spoken dialog systems (SDSs). This is implemented on Amazon Mechanical Turk (MTurk), where a Requester can design a human intelligence task (HIT) to be performed by a large number of Workers efficiently and cost-effectively. We describe a design methodology for two types of HITs: the first targets fast rating of a large number of dialogs regarding some dimensions of the SDSs' performance, and the second aims to assess the reliability of Workers on MTurk through the variability in ratings across different Workers. A set of approval rules is also designed to control the quality of ratings from MTurk. At the end of the collection work, user judgments for about 8,000 dialogs rated by around 700 Workers are collected in 45 days. We observe reasonable consistency between the manual MTurk ratings and an automatic categorization of dialogs in terms of task completion, which partially verifies the reliability of the approved ratings from MTurk. From the second type of HITs, we also observe moderate inter-rater agreement for ratings in task completion, which provides support for the utilization of MTurk as a judgment collection platform. Further research on the exploration of SDS evaluation models could be developed based on the collected corpus.


Knowledge and Information Systems | 2015

A topic-biased user reputation model in rating systems

Baichuan Li; Rong-Hua Li; Irwin King; Michael R. Lyu; Jeffrey Xu Yu

In rating systems like Epinions and Amazon’s product review systems, users rate items on different topics to yield item scores. Traditionally, item scores are estimated by averaging all the ratings with equal weights. To improve the accuracy of estimated item scores, user reputation (UR) is incorporated. The existing algorithms on UR, however, have underplayed the role of topics in rating systems. In this paper, we first reveal that UR is topic-biased from our empirical investigation. However, existing algorithms cannot capture this characteristic in rating systems. To address this issue, we propose a topic-biased model (TBM) to estimate UR in terms of different topics as well as item scores. With TBM, we develop six topic-biased algorithms, which are subsequently evaluated with experiments using both real-world and synthetic data sets. Results of the experiments demonstrate that the topic-biased algorithms effectively estimate UR across different topics and produce more robust item scores than previous reputation-based algorithms, leading to potentially more robust rating systems.


international symposium on neural networks | 2012

Communities of Yahoo! Answers and Baidu Zhidao: Complementing or competing?

Baichuan Li; Michael R. Lyu; Irwin King

Community Question Answering (CQA) attracts an increasing volume of research on question retrieval, high-quality content discovery and expert finding. However, few studies focus on the communities of CQA services per se or provide an in-depth analysis of them. This paper aims to enrich our knowledge of two of these CQA services, namely Yahoo! Answers and Baidu Zhidao, by reviewing their communities, comparing the similarities and differences of the two communities, and analyzing their influence on solving questions. Six data sets are employed for comparative analysis. In this paper: (1) We analyze the social network structures of Yahoo! Answers and Baidu Zhidao; (2) We compare the social community characteristics of top contributors; (3) We reveal the behaviors of users in different categories in these two portals; (4) We reveal temporal trends of these characteristics; (5) We find that the communities of Yahoo! Answers and Baidu Zhidao complement each other in efficiency and effectiveness of answering questions.


spoken language technology workshop | 2010

Using finite state machines for evaluating spoken dialog systems

Yi Zhu; Zhaojun Yang; Helen M. Meng; Baichuan Li; Gina-Anne Levow; Irwin King

Development of spoken dialog systems (SDSs) can be facilitated by better evaluation methods. Previous methods seldom consider the efficiency of the system, which is important to users. We study the problem of evaluating SDSs and propose a new framework by generalizing states from utterances of dialogs to build a finite state machine (FSM). These states can be regarded as an efficiency measurement of SDSs. The FSM framework models dialogs as paths in an FSM to combine efficiency measurement with regression models. The proposed FSM framework can be applied in conjunction with regression models to improve evaluation accuracy. We compare our FSM framework combined with three regression models in several experiments. We obtain promising results on a collection of dialogs from the “Let's Go!” system, with our approach outperforming regression models.


spoken language technology workshop | 2010

Collaborative filtering model for user satisfaction prediction in Spoken Dialog System evaluation

Zhaojun Yang; Baichuan Li; Yi Zhu; Irwin King; Gina-Anne Levow; Helen M. Meng

Developing accurate models to automatically predict user satisfaction about the overall quality of a Spoken Dialog System (SDS) is highly desirable for SDS evaluation. In the original PARADISE framework, a linear regression model is trained using measures drawn from rated dialogs as predictors with user satisfaction as the target. In this paper, we extend PARADISE by introducing a collaborative filtering (CF) model for user satisfaction prediction and its corresponding extension. This prediction model is drawn from the idea of CF in recommendation systems, which uses information from near neighbors of an unrated dialog to predict its user satisfaction. We also present the methodology of collecting user judgments on SDS quality with crowdsourcing through Amazon Mechanical Turk. Experimental results show that the CF approaches could distinctly improve the prediction accuracy of user satisfaction.


spoken language technology workshop | 2010

Predicting user evaluations of spoken dialog systems using semi-supervised learning

Baichuan Li; Zhaojun Yang; Yi Zhu; Helen M. Meng; Gina-Anne Levow; Irwin King

User evaluations of dialogs from a spoken dialog system (SDS) can be directly used to gauge the system's performance. However, it is costly to obtain manual evaluations of a large corpus of dialogs. Semi-supervised learning (SSL) provides a possible solution. This process learns from a small amount of manually labeled data, together with a large amount of unlabeled data, and can later be used to perform automatic labeling. We conduct comparative experiments among SSL approaches, classical regression and supervised learning in evaluation of dialogs from CMU's Let's Go Bus Information System. Two typical SSL methods, namely co-training and semi-supervised support vector machine (S3VM), are found to outperform the other approaches in automatically predicting user evaluations of unseen dialogs in the case of low training rate.


conference on information and knowledge management | 2011

Question identification on twitter

Baichuan Li; Xiance Si; Michael R. Lyu; Irwin King; Edward Y. Chang

Collaboration


Dive into Baichuan Li's collaborations.

Top Co-Authors

Irwin King

The Chinese University of Hong Kong

Michael R. Lyu

The Chinese University of Hong Kong

Helen M. Meng

The Chinese University of Hong Kong

Yi Zhu

The Chinese University of Hong Kong

Zhaojun Yang

University of Southern California

Barley Mak

The Chinese University of Hong Kong

Dan Hong

Hong Kong University of Science and Technology
