Publication


Featured research published by Xiaochuan Ni.


international world wide web conferences | 2010

Cross-domain sentiment classification via spectral feature alignment

Sinno Jialin Pan; Xiaochuan Ni; Jian-Tao Sun; Qiang Yang; Zheng Chen

Sentiment classification aims to automatically predict the sentiment polarity (e.g., positive or negative) of sentiment data published by users (e.g., reviews, blogs). Although traditional classification algorithms can be used to train sentiment classifiers from manually labeled text data, the labeling work can be time-consuming and expensive. Meanwhile, users often use different words when they express sentiment in different domains. If we directly apply a classifier trained in one domain to other domains, performance will be very low because of the differences between these domains. In this work, we develop a general solution to sentiment classification when we do not have any labels in a target domain but have some labeled data in a different domain, regarded as the source domain. In this cross-domain sentiment classification setting, to bridge the gap between the domains, we propose a spectral feature alignment (SFA) algorithm to align domain-specific words from different domains into unified clusters, with the help of domain-independent words as a bridge. In this way, the clusters reduce the gap between the domain-specific words of the two domains, and they can then be used to train accurate sentiment classifiers for the target domain. Compared to previous approaches, SFA can discover a robust representation for cross-domain data by fully exploiting the relationship between the domain-specific and domain-independent words via simultaneously co-clustering them in a common latent space. We perform extensive experiments on two real-world datasets, and demonstrate that SFA significantly outperforms previous approaches to cross-domain sentiment classification.
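The bridging idea in this abstract can be illustrated with a small sketch. This is not the SFA algorithm itself: the spectral co-clustering step is replaced here by a plain cosine-similarity match, and the toy documents, the pivot word set, and all function names are assumptions made for illustration. Each domain-specific word is described by the domain-independent (pivot) words it co-occurs with, and a target-domain word is then aligned to the source-domain word with the most similar profile.

```python
import math
from collections import Counter

def cooccurrence_profiles(docs, pivots):
    """Map each non-pivot word to counts of the pivot words it shares a document with."""
    profiles = {}
    for doc in docs:
        words = set(doc.split())
        doc_pivots = words & pivots
        for word in words - pivots:
            profiles.setdefault(word, Counter()).update(doc_pivots)
    return profiles

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def align(source_docs, target_docs, pivots):
    """Pair every target-domain word with its most similar source-domain word."""
    src = cooccurrence_profiles(source_docs, pivots)
    tgt = cooccurrence_profiles(target_docs, pivots)
    return {
        t_word: max(src, key=lambda s_word: cosine(src[s_word], t_profile))
        for t_word, t_profile in tgt.items()
    }

# Hypothetical toy data: book reviews (source) and camera reviews (target).
pivots = {"great", "good", "terrible", "bad"}
source_docs = ["gripping great", "gripping good", "boring terrible", "boring bad"]
target_docs = ["sharp great", "sharp good", "blurry terrible", "blurry bad"]

alignment = align(source_docs, target_docs, pivots)
```

On this toy data, "sharp" is paired with "gripping" and "blurry" with "boring", because each pair co-occurs with the same pivot words. A classifier trained on the source domain could then treat aligned words as one shared feature.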


web search and data mining | 2011

Cross lingual text classification by mining multilingual topics from wikipedia

Xiaochuan Ni; Jian-Tao Sun; Jian Hu; Zheng Chen

This paper investigates how to perform cross-lingual text classification effectively by leveraging a large-scale, multilingual knowledge base, Wikipedia. Based on the observation that each Wikipedia concept is described by documents in different languages, we adapt existing topic modeling algorithms for mining multilingual topics from this knowledge base. The extracted topics have multiple types of representations, with each type corresponding to one language. In this work, we regard such topics extracted from Wikipedia documents as universal-topics, since each topic carries the same semantic information across languages. Thus new documents in different languages can be represented in a space using a group of universal-topics. We use these universal-topics to do cross-lingual text classification. Given training data labeled for one language, we can train a text classifier to classify documents of another language by mapping all documents of both languages into the universal-topic space. This approach does not require any additional linguistic resources, like bilingual dictionaries, machine translation tools, or labeled data for the target language. The evaluation results indicate that our topic modeling approach is effective for building cross-lingual text classifiers.
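The mapping into a shared topic space can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: the topic-word table is written by hand rather than mined from Wikipedia with a topic model, the classifier is a simple nearest-centroid rule, and every name and weight below is hypothetical.

```python
# Hypothetical universal topics: one word distribution per language for each topic.
TOPICS = {
    "sports": {
        "en": {"football": 0.6, "match": 0.4},
        "de": {"fussball": 0.6, "spiel": 0.4},
    },
    "finance": {
        "en": {"bank": 0.5, "stock": 0.5},
        "de": {"bank": 0.5, "aktie": 0.5},
    },
}
TOPIC_ORDER = sorted(TOPICS)  # fixed coordinate order of the topic space

def to_topic_space(text, lang):
    """Represent a document as normalized topic scores, using its own language's word lists."""
    tokens = text.lower().split()
    scores = [sum(TOPICS[t][lang].get(tok, 0.0) for tok in tokens) for t in TOPIC_ORDER]
    total = sum(scores)
    return [s / total for s in scores] if total else scores

def train_centroids(labeled_docs, lang):
    """Average the topic vectors of the training documents for each label."""
    sums, counts = {}, {}
    for text, label in labeled_docs:
        acc = sums.setdefault(label, [0.0] * len(TOPIC_ORDER))
        for i, value in enumerate(to_topic_space(text, lang)):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def classify(text, lang, centroids):
    """Assign the label whose centroid is closest in the topic space."""
    vec = to_topic_space(text, lang)
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Train on English labels only, then classify German text.
english_training = [("football match tonight", "sports"), ("bank stock report", "finance")]
centroids = train_centroids(english_training, "en")
german_label = classify("fussball spiel heute", "de", centroids)
```

Because both languages are projected onto the same topic coordinates, the German document lands next to the English "sports" centroid even though no German labels or translation resources were used.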


international joint conference on artificial intelligence | 2011

Distance metric learning under covariate shift

Bin Cao; Xiaochuan Ni; Jian-Tao Sun; Gang Wang; Qiang Yang

Learning distance metrics is a fundamental problem in machine learning. Previous distance-metric learning research assumes that the training and test data are drawn from the same distribution, an assumption that may be violated in practical applications. When the distributions differ, a situation referred to as covariate shift, the metric learned from training data may not work well on the test data. In this case the metric is said to be inconsistent. In this paper, we address this problem by proposing a novel metric learning framework known as consistent distance metric learning (CDML), which solves the problem under covariate shift. We theoretically analyze the conditions under which the metrics learned under covariate shift are consistent. Based on the analysis, a convex optimization problem is proposed to deal with the CDML problem. An importance sampling method is proposed for metric learning, and two importance weighting strategies are proposed and compared in this work. Experiments are carried out on synthetic and real-world datasets to show the effectiveness of the proposed method.
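The general principle of importance weighting under covariate shift can be shown with a small sketch. It does not reproduce CDML or either of the paper's weighting strategies, which the abstract does not specify. It assumes both densities are known one-dimensional Gaussians (in practice the ratio would have to be estimated), and all names and parameters are illustrative. Each training sample is weighted by p_test(x) / p_train(x), so that averages over the training data approximate expectations under the test distribution.

```python
import math
import random

def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def importance_weights(xs, train_mean, test_mean, std=1.0):
    """Weight each training point by p_test(x) / p_train(x)."""
    return [
        gaussian_pdf(x, test_mean, std) / gaussian_pdf(x, train_mean, std)
        for x in xs
    ]

def weighted_mean(values, weights):
    """Self-normalized importance-weighted average."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total

# Assumed setup: training data drawn from N(0, 1), test data from N(1, 1).
random.seed(0)
train_xs = [random.gauss(0.0, 1.0) for _ in range(20000)]
weights = importance_weights(train_xs, train_mean=0.0, test_mean=1.0)

plain_estimate = sum(train_xs) / len(train_xs)       # close to the training mean, 0
shifted_estimate = weighted_mean(train_xs, weights)  # close to the test mean, 1
```

In a metric learning setting the same weights would multiply the per-sample (or per-pair) loss terms instead of the raw values, so that the learned metric is fitted to the test distribution rather than the training one.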


conference on information and knowledge management | 2011

Unsupervised transactional query classification based on webpage form understanding

Yuchen Liu; Xiaochuan Ni; Jian-Tao Sun; Zheng Chen

Query type classification aims to classify search queries into categories such as navigational, informational, and transactional, according to the type of information need behind the queries. Although this problem has drawn much research attention, previous methods usually require editors to label queries as training data or need domain knowledge to write rules for predicting query type. Also, existing work has mainly focused on the classification of informational and navigational query types. Transactional query classification has not been well addressed. In this work, we propose an unsupervised approach for transactional query classification. This method is based on the observation that, after transactional queries are issued to a search engine, many users will click through to the search result pages and then interact with Web forms on these pages. The interactions, e.g., typing in a text box, making selections from a dropdown list, or clicking a button to execute an action, are used to specify detailed information about the transaction. By mining toolbar search log data, which records the associations between queries and Web forms clicked by users, we can obtain a set of good-quality transactional queries without manual labeling. By matching these automatically acquired transactional queries against their associated Web form contents, we can generalize the queries into patterns. These patterns can be used to classify queries that are not covered by the search log. Our experiments indicate that the transactional queries produced by this method are of good quality. The pattern-based classifier achieves an F1 score of 83%. This is very effective considering that no labeling effort is used to train the classifier.
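The step of generalizing queries into patterns can be sketched as below. This is a simplified illustration, not the paper's procedure: it assumes single-word form values, replaces any query token that matches a form option with a slot named after the form field, and then matches unseen queries against the resulting patterns with a regular expression. The form fields, queries, and function names are all hypothetical.

```python
import re

def generalize(query, form_fields):
    """Replace query tokens that appear as form option values with a <field> slot."""
    value_to_field = {
        value.lower(): name
        for name, values in form_fields.items()
        for value in values
    }
    tokens = query.lower().split()
    return " ".join(
        f"<{value_to_field[t]}>" if t in value_to_field else t for t in tokens
    )

def pattern_to_regex(pattern):
    """Turn a pattern into a regex in which each slot matches one arbitrary token."""
    parts = [
        r"\S+" if tok.startswith("<") and tok.endswith(">") else re.escape(tok)
        for tok in pattern.split()
    ]
    return re.compile(r"\s+".join(parts))

def is_transactional(query, patterns):
    """A query is flagged as transactional if it fully matches any known pattern."""
    return any(pattern_to_regex(p).fullmatch(query.lower()) for p in patterns)

# Hypothetical log entry: a query and the options of the Web form the user filled in.
form_fields = {"city": ["seattle", "boston"]}
pattern = generalize("flights from Seattle to Boston", form_fields)
patterns = {pattern}

unseen_match = is_transactional("flights from denver to austin", patterns)
unrelated_match = is_transactional("history of aviation", patterns)
```

Here the logged query becomes the pattern "flights from <city> to <city>", which then matches an unseen query about different cities while rejecting an unrelated informational query.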


international world wide web conferences | 2018

Conversational Query Understanding Using Sequence to Sequence Modeling

Gary Ren; Xiaochuan Ni; Manish Malik; Qifa Ke

Understanding conversations is crucial to enabling conversational search in technologies such as chatbots, digital assistants, and smart home devices that are becoming increasingly popular. Conventional search engines are powerful at answering open-domain queries but are mostly capable of stateless search. In this paper, we define a conversational query as a query that depends on the context of the current conversation, and we formulate the conversational query understanding problem as context-aware query reformulation, where the goal is to reformulate the conversational query into a search-engine-friendly query in order to satisfy users' information needs in conversational settings. Such a context-aware query reformulation problem lends itself to sequence to sequence modeling. We present a large-scale open-domain dataset of conversational queries and various sequence to sequence models that are learned from this dataset. The best model correctly reformulates over half of all conversational queries, showing the potential of sequence to sequence modeling for this task.


international world wide web conferences | 2009

Mining multilingual topics from wikipedia

Xiaochuan Ni; Jian-Tao Sun; Jian Hu; Zheng Chen


international world wide web conferences | 2007

Exploring in the weblog space by detecting informative and affective articles

Xiaochuan Ni; Gui-Rong Xue; Xiao Ling; Yong Yu; Qiang Yang


Archive | 2009

Opinion search engine

Jian-Tao Sun; Xiaochuan Ni; Peng Xu; Gang Wang; Ke Tang; Zheng Chen


Archive | 2010

Mining Multilingual Topics

Xiaochuan Ni; Jian-Tao Sun; Zheng Chen; Jian Hu


Archive | 2010

Mobile Query Suggestions With Time-Location Awareness

Xiaochuan Ni; Jian-Tao Sun; Zheng Chen

Collaboration


Dive into Xiaochuan Ni's collaborations.

Top Co-Authors

Qiang Yang

Harbin Institute of Technology

Gui-Rong Xue

Shanghai Jiao Tong University

Yong Yu

Shanghai Jiao Tong University
