Ziheng Lin

National University of Singapore

Publication

Featured research published by Ziheng Lin.

Natural Language Engineering | 2014

A PDTB-Styled End-to-End Discourse Parser

Ziheng Lin; Hwee Tou Ng; Min-Yen Kan

Since the release of the large discourse-level annotation of the Penn Discourse Treebank (PDTB), research work has been carried out on certain subtasks of this annotation, such as disambiguating discourse connectives and classifying Explicit or Implicit relations. We see a need to construct a full parser on top of these subtasks and propose a way to evaluate the parser. In this work, we have designed and developed an end-to-end discourse parser to parse free texts in the PDTB style in a fully data-driven approach. The parser consists of multiple components joined in a sequential pipeline architecture, which includes a connective classifier, argument labeler, explicit classifier, non-explicit classifier, and attribution span labeler. Our trained parser first identifies all discourse and non-discourse relations, locates and labels their arguments, and then classifies the sense of the relation between each pair of arguments. For the identified relations, the parser also determines the attribution spans, if any, associated with them. We introduce novel approaches to locate and label arguments, and to identify attribution spans. We also significantly improve on the current state-of-the-art connective classifier. We propose and present a comprehensive evaluation from both component-wise and error-cascading perspectives, in which we illustrate how each component performs in isolation, as well as how the pipeline performs with errors propagated forward. The parser gives an overall system F1 score of 46.80 percent for partial matching utilizing gold standard parses, and 38.18 percent with full automation.

ACM/IEEE Joint Conference on Digital Libraries | 2011

Product review summarization from a deeper perspective

Duy Khang Ly; Kazunari Sugiyama; Ziheng Lin; Min-Yen Kan

With product reviews growing in depth and becoming more numerous, it is a growing challenge for both customers and product manufacturers to acquire a comprehensive understanding of their contents. We built a system that automatically summarizes a large collection of product reviews to generate a concise summary. Importantly, our system extracts not only the review sentiments but also the underlying justification for those opinions. We solve this problem through a novel application of clustering and validate our approach through an empirical study, obtaining good performance as judged by F-measure (the harmonic mean of purity and inverse purity).
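
The F-measure mentioned above can be illustrated with a minimal Python sketch. It assumes the common definitions of the two quantities: purity credits each cluster with its best-matching gold class, and inverse purity credits each gold class with its best-matching cluster. The function names and the toy data are hypothetical and are not taken from the paper, which may define or weight these quantities differently.

```python
from collections import Counter


def purity(clusters, labels):
    """Fraction of items that fall in the majority gold class of their cluster.

    clusters: cluster id assigned to each item
    labels:   gold class id of each item (same length and order)
    """
    n = len(clusters)
    if n == 0 or n != len(labels):
        raise ValueError("inputs must be non-empty and of equal length")
    # Count how many items share each (cluster, class) pair.
    overlap = Counter(zip(clusters, labels))
    # For every cluster, keep the size of its largest overlap with any class.
    best = {}
    for (cluster_id, _class_id), count in overlap.items():
        best[cluster_id] = max(best.get(cluster_id, 0), count)
    return sum(best.values()) / n


def inverse_purity(clusters, labels):
    """Purity with the roles of clusters and gold classes swapped."""
    return purity(labels, clusters)


def purity_f_measure(clusters, labels):
    """Harmonic mean of purity and inverse purity."""
    p = purity(clusters, labels)
    ip = inverse_purity(clusters, labels)
    return 2 * p * ip / (p + ip)


# Toy example (hypothetical data): six review sentences, two clusters, three gold classes.
clusters = [0, 0, 0, 1, 1, 1]
labels = ["a", "a", "b", "b", "b", "c"]
score = purity_f_measure(clusters, labels)
```

On this toy input, purity is 4/6 and inverse purity is 5/6, so the harmonic mean is 20/27. A clustering that puts every item in its own cluster maximizes purity, while a single giant cluster maximizes inverse purity, so combining the two penalizes both degenerate solutions.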

Asia Information Retrieval Symposium | 2010

Tuning Machine-Learning Algorithms for Battery-Operated Portable Devices

Ziheng Lin; Yan Gu; Samarjit Chakraborty

Machine learning algorithms in various forms are now increasingly being used on a variety of portable devices, ranging from cell phones to PDAs. They often form a part of standard applications (e.g. for grammar-checking in email clients) that run on these devices and occupy a significant fraction of processor and memory bandwidth. However, most of the research within the machine learning community has ignored issues like memory usage and power consumption of processors running these algorithms. In this paper we investigate how machine-learned models can be developed in a power-aware manner for deployment on resource-constrained portable devices. We show that by tolerating a small loss in accuracy, it is possible to dramatically improve the energy consumption and data cache behavior of these algorithms. More specifically, we explore a typical sequential labeling problem of part-of-speech tagging in natural language processing and show that a power-aware design can achieve up to a 50% reduction in power consumption at the cost of a 3% decrease in tagging accuracy.

Empirical Methods in Natural Language Processing | 2009