Publication


Featured research published by Shiqi Zhao.


international joint conference on natural language processing | 2009

Application-driven Statistical Paraphrase Generation

Shiqi Zhao; Xiang Lan; Ting Liu; Sheng Li

Paraphrase generation (PG) is important in many NLP applications. However, research on PG remains limited. In this paper, we propose a novel method for statistical paraphrase generation (SPG), which can (1) serve various applications based on a uniform statistical model, and (2) naturally combine multiple resources to enhance PG performance. In our experiments, we use the proposed method to generate paraphrases for three different applications. The results show that the method can be easily adapted from one application to another and generates valuable and interesting paraphrases.


empirical methods in natural language processing | 2016

Multi-view Response Selection for Human-Computer Conversation

Xiangyang Zhou; Daxiang Dong; Hua Wu; Shiqi Zhao; Dianhai Yu; Hao Tian; Xuan Liu; Rui Yan

In this paper, we study the task of response selection for multi-turn human-computer conversation. Previous approaches take the word as the unit and view context and response as sequences of words. Such approaches do not explicitly treat each utterance as a unit, so it is difficult for them to capture utterance-level discourse information and dependencies. We propose a multi-view response selection model that integrates information from two different views, i.e., the word sequence view and the utterance sequence view. We jointly model the two views via deep neural networks. Experimental results on a public corpus for context-sensitive response selection demonstrate the effectiveness of the proposed multi-view model, which significantly outperforms other single-view baselines.


Natural Language Engineering | 2009

Extracting paraphrase patterns from bilingual parallel corpora

Shiqi Zhao; Haifeng Wang; Ting Liu; Sheng Li

Paraphrase patterns are semantically equivalent patterns, which are useful in both paraphrase recognition and generation. This paper presents a pivot approach for extracting paraphrase patterns from bilingual parallel corpora, whereby the paraphrase patterns in English are extracted using the patterns in another language as pivots. We make use of log-linear models for computing the paraphrase likelihood between pattern pairs and exploit feature functions based on maximum likelihood estimation (MLE), lexical weighting (LW), and monolingual word alignment (MWA). Using the presented method, we extract more than 1 million pairs of paraphrase patterns from about 2 million pairs of bilingual parallel sentences. The precision of the extracted paraphrase patterns is above 78%. Experimental results show that the presented method significantly outperforms a well-known method called discovery of inference rules from text (DIRT). Additionally, the log-linear model with the proposed feature functions is effective. The extracted paraphrase patterns are fully analyzed. In particular, we find that the extracted paraphrase patterns can be classified into five types, which are useful in multiple natural language processing (NLP) applications.
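The pivot idea in the abstract above can be illustrated with a minimal sketch. It covers only a maximum-likelihood pivot score, p(e2 | e1) = sum over pivots f of p(f | e1) * p(e2 | f), estimated from co-occurrence counts. It is not the paper's full log-linear model, and it omits the LW and MWA features. The function name, the input format, and the toy counts are all assumptions made for illustration.

```python
from collections import defaultdict


def pivot_paraphrase_scores(counts):
    """Score English pattern pairs through shared foreign pivot patterns.

    counts maps (english_pattern, foreign_pattern) -> alignment count.
    This input format is hypothetical; it is not taken from the paper.
    """
    english_totals = defaultdict(float)
    foreign_totals = defaultdict(float)
    for (english, foreign), count in counts.items():
        english_totals[english] += count
        foreign_totals[foreign] += count

    scores = defaultdict(float)
    for (e1, pivot), c1 in counts.items():
        p_pivot_given_e1 = c1 / english_totals[e1]  # MLE estimate of p(f | e1)
        for (e2, other_pivot), c2 in counts.items():
            if other_pivot != pivot or e2 == e1:
                continue
            p_e2_given_pivot = c2 / foreign_totals[pivot]  # MLE estimate of p(e2 | f)
            scores[(e1, e2)] += p_pivot_given_e1 * p_e2_given_pivot
    return dict(scores)


# Toy counts with placeholder pattern names (not real data).
toy_counts = {
    ("e_a", "f1"): 3,
    ("e_b", "f1"): 1,
    ("e_a", "f2"): 1,
}
toy_scores = pivot_paraphrase_scores(toy_counts)
```

With these toy counts, the score is asymmetric: e_a is a more likely paraphrase of e_b than the reverse, because e_b aligns only to the pivot that e_a dominates.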


international conference on pervasive computing | 2010

Identification of Web Query Intent Based on Query Text and Web Knowledge

Dayong Wu; Yu Zhang; Shiqi Zhao; Ting Liu

In this paper, we propose a novel approach to identifying user intents of search engine queries. Specifically, we recast it as a classification problem, in which four types of features are adopted. The classification features are based on deep linguistic analysis of queries as well as search engine feedback. We evaluate the method on real web query data. The results show that about 88% of the test queries can be correctly identified with the classification framework by combining all four types of features.


meeting of the association for computational linguistics | 2007

HIT: Web based Scoring Method for English Lexical Substitution

Shiqi Zhao; Lin Zhao; Yu Zhang; Ting Liu; Sheng Li

This paper describes the HIT system and its participation in SemEval-2007 English Lexical Substitution Task. Two main steps are included in our method: candidate substitute extraction and candidate scoring. In the first step, candidate substitutes for each target word in a given sentence are extracted from WordNet. In the second step, the extracted candidates are scored and ranked using a web-based scoring method. The substitute ranked first is selected as the best substitute. For the multiword subtask, a simple WordNet-based approach is employed.


international joint conference on artificial intelligence | 2017

Learning to Explain Entity Relationships by Pairwise Ranking with Convolutional Neural Networks

Jizhou Huang; Wei Zhang; Shiqi Zhao; Shiqiang Ding; Haifeng Wang

Providing a plausible explanation for the relationship between two related entities is an important task in some applications of knowledge graphs, such as in search engines. However, most existing methods require a large amount of manually labeled training data, and so cannot be applied to large-scale knowledge graphs because data annotation is expensive. In addition, these methods typically rely on costly handcrafted features. In this paper, we propose an effective pairwise ranking model that leverages clickthrough data of a Web search engine to address these two problems. We first construct large-scale training data from the query-title pairs derived from clickthrough data of a Web search engine. Then, we build a pairwise ranking model which employs a convolutional neural network to automatically learn relevant features. The proposed model can be easily trained with backpropagation to perform the ranking task. The experiments show that our method significantly outperforms several strong baselines.
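As a rough illustration of what a pairwise ranking objective looks like, the sketch below computes a margin-based hinge loss over (preferred, non-preferred) score pairs. The abstract does not state which loss the authors use, so the hinge form, the margin value, and the function names are assumptions. The scores here are plain floats standing in for the outputs of a learned scoring model such as the convolutional network the abstract mentions.

```python
def pairwise_hinge_loss(score_pos, score_neg, margin=1.0):
    """Loss is zero once the preferred item outscores the other by at least margin."""
    return max(0.0, margin - (score_pos - score_neg))


def mean_pairwise_loss(score_pairs, margin=1.0):
    """Average the hinge loss over (score_pos, score_neg) pairs."""
    if not score_pairs:
        return 0.0
    total = sum(pairwise_hinge_loss(pos, neg, margin) for pos, neg in score_pairs)
    return total / len(score_pairs)


# Hypothetical scores: the first pair is ranked correctly with enough margin,
# the second pair is ranked in the wrong order.
example_pairs = [(2.0, 0.5), (0.2, 0.5)]
example_loss = mean_pairwise_loss(example_pairs)
```

Only pairs that are misordered, or ordered correctly by less than the margin, contribute to the loss, which is what pushes a trained scorer to separate preferred from non-preferred candidates.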


international world wide web conferences | 2015

Exploiting Collective Hidden Structures in Webpage Titles for Open Domain Entity Extraction

Wei Song; Shiqi Zhao; Chao Zhang; Hua Wu; Haifeng Wang; Lizhen Liu; Hanshi Wang

We present a novel method for open domain named entity extraction by exploiting the collective hidden structures in webpage titles. Our method uncovers the hidden textual structures shared by sets of webpage titles based on generalized URL patterns and a multiple sequence alignment technique. The highlights of our method include: 1) The boundaries of entities can be identified automatically in a collective way without any manually designed pattern, seed or class name. 2) The connections between entities are also discovered naturally based on the hidden structures, which makes it easy to incorporate distant or weak supervision. The experiments show that our method can harvest open domain entities at large scale with high precision. A large proportion of the extracted entities are long-tailed and complex and cover diverse topics. Given the extracted entities and their connections, we further show the effectiveness of our method in a weakly supervised setting. Our method can produce better domain-specific entities in both precision and recall compared with the state-of-the-art approaches.


asia information retrieval symposium | 2006

Web mining for lexical context-specific paraphrasing

Shiqi Zhao; Ting Liu; Xincheng Yuan; Sheng Li; Yu Zhang

In most applications of paraphrasing, contextual information should be considered since a word may have different paraphrases in different contexts. This paper presents a method that automatically acquires lexical context-specific paraphrases from the web. The method includes two main stages, candidate paraphrase extraction and paraphrase validation. Evaluations were conducted on a news title corpus whereby the context-specific paraphrasing method was compared with the Chinese synonymous thesaurus. Results show that the precision of our method is above 60% and the recall is above 55%, which outperforms the thesaurus significantly.


ACM Transactions on Intelligent Systems and Technology | 2013

Introduction to special section on paraphrasing

Haifeng Wang; Bill Dolan; Idan Szpektor; Shiqi Zhao

Paraphrasing, conveying the same meaning in different ways, is an intrinsic part of natural languages. The research field of Automatic Paraphrasing encompasses the tasks of collecting, identifying, and generating paraphrases in an automatic or a computer-aided manner. In addition, researchers have investigated the contribution of automatic paraphrasing techniques to many natural language applications, such as question answering (QA), information extraction (IE), multi-document summarization (MDS), and machine translation (MT). For example, in Machine Translation, paraphrases have been used for rewriting and simplifying input sentences, enlarging translation phrase tables, expanding human references for automatic evaluation, and so forth. This special section of ACM TIST is intended to cover state-of-the-art research in automatic paraphrasing. In particular, we highlight the applications of paraphrasing techniques in real-world systems, such as MT systems and search engines. Seven articles are included in the special section. One of them is about paraphrase extraction from monolingual corpora, while the other six discuss the applications of paraphrases, including paraphrasing for machine translation, sentence compression, word meaning computing, and plagiarism detection. There are three articles that focus on applying paraphrasing techniques for MT. These articles cover the three main research directions mentioned, namely, source sentence rewriting, phrase table enlargement, and human reference expansion. In “Using Targeted Paraphrasing and Monolingual Crowdsourcing to Improve Translation” by Philip Resnik, Olivia Buzek, Yakov Kronrod, Chang Hu, Alexander J. Quinn, and Benjamin B. Bederson, the authors propose enhancing the translation quality of an SMT system based on crowdsourcing.
A remarkable advantage of the proposed method is that it involves only monolingual workers to identify target-side translation errors and supply source-side paraphrase, rather than relying on workers with bilingual expertise. The proposed solution has the potential of providing a more cost-effective approach to translation in scenarios where machine translation would be considered acceptable to use if only it were generally of high enough quality. It also has the potential to vastly reduce the burden of human effort for cases in which bilingual translators post-edit machine translation output. In the article “Distributional Phrasal Paraphrase Generation for Statistical Machine Translation” by Yuval Marton, the author focuses on extracting paraphrases to improve the coverage of the translation model. The proposed method extracts paraphrases from large-scale monolingual corpora based on distributional similarity. The extracted paraphrases are then used to augment a translation phrase table with pairs not covered by the initial table. The novelty of the proposed method lies in it being language-independent, and hence it does not rely on bitexts for generating paraphrases or new phrase pairs. In “Generating Targeted Paraphrases for Improved Translation” by Nitin Madnani and Bonnie Dorr, the authors adopt an approach that uses automatic paraphrase generation to tune parameters for an SMT system. Specifically, given a single reference translation, they build a paraphrase generation system that can produce several different semantically equivalent variants that can then be used as additional reference translations. Experimental results on several language pairs have demonstrated that the proposed approach can improve translation quality. Furthermore, this article presents


meeting of the association for computational linguistics | 2008

Pivot Approach for Extracting Paraphrase Patterns from Bilingual Corpora

Shiqi Zhao; Haifeng Wang; Ting Liu; Sheng Li

Collaboration


Top Co-Authors

Ting Liu

Harbin Institute of Technology


Yu Zhang

Harbin Institute of Technology


Lin Zhao

Harbin Institute of Technology


Jizhou Huang

Harbin Institute of Technology