Chu-Ren Huang

Hong Kong Polytechnic University

Publication

Featured research published by Chu-Ren Huang.

International Joint Conference on Natural Language Processing | 2009

A Framework of Feature Selection Methods for Text Categorization

Shoushan Li; Rui Xia; Chengqing Zong; Chu-Ren Huang

In text categorization, feature selection (FS) is a strategy that aims at making text classifiers more efficient and accurate. However, when dealing with a new task, it is still difficult to quickly select a suitable method from the many FS methods proposed in previous studies. In this paper, we propose a theoretical framework of FS methods based on two basic measurements: frequency measurement and ratio measurement. Six popular FS methods are then discussed in detail under this framework. Moreover, guided by our theoretical analysis, we propose a novel method called weighed frequency and odds (WFO) that combines the two measurements with trained weights. The experimental results on data sets from both topic-based and sentiment classification tasks show that this new method is robust across different tasks and numbers of selected features.
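
The idea of combining a frequency measurement with a ratio measurement can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the formula from the paper: the function names, the add-one smoothing, and the particular combination (the frequency term raised to a weight `lam`, multiplied by the log-ratio term raised to `1 - lam`) are hypothetical, and `lam` is a fixed parameter here rather than a trained weight.

```python
import math
from collections import Counter


def term_probabilities(docs, target):
    """Estimate P(t|c) and P(t|not c) from document frequencies.

    docs is a list of (tokens, label) pairs. Add-one smoothing (an
    assumption of this sketch) keeps every estimate strictly positive.
    """
    in_class, out_class = Counter(), Counter()
    n_in = n_out = 0
    for tokens, label in docs:
        if label == target:
            n_in += 1
            in_class.update(set(tokens))
        else:
            n_out += 1
            out_class.update(set(tokens))
    vocab = set(in_class) | set(out_class)
    return {
        t: ((in_class[t] + 1) / (n_in + 2), (out_class[t] + 1) / (n_out + 2))
        for t in vocab
    }


def weighted_score(p_in, p_out, lam=0.5):
    """Hypothetical combination of a frequency term and a ratio term.

    Terms that are not more likely inside the class than outside score 0.
    """
    if p_in <= p_out:
        return 0.0
    frequency = p_in                 # frequency measurement
    ratio = math.log(p_in / p_out)   # ratio measurement
    return (frequency ** lam) * (ratio ** (1.0 - lam))


def select_features(docs, target, k, lam=0.5):
    """Return the k highest-scoring terms for the target class."""
    probs = term_probabilities(docs, target)
    scored = {t: weighted_score(p_in, p_out, lam) for t, (p_in, p_out) in probs.items()}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scored, key=lambda t: (-scored[t], t))[:k]


docs = [
    (["good", "movie"], "pos"),
    (["good", "plot"], "pos"),
    (["bad", "movie"], "neg"),
    (["bad", "plot"], "neg"),
]
top_terms = select_features(docs, "pos", k=1)
```

With `lam` close to 1 the score follows the frequency term, and with `lam` close to 0 it follows the ratio term, which is one simple way to trade off the two measurements for a given task.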

Language Sciences | 2003

Individuals, kinds and events: classifier coercion of nouns

Chu-Ren Huang; Kathleen Ahrens

This paper challenges the traditional view that nominal classifiers classify individuals. Instead, we suggest that classifiers coerce nouns to refer to kinds and events as well as to individuals. This finding argues against the view that nouns refer only to entities, and suggests that classifiers do not simply agree with a noun, but instead coerce a particular meaning from it. Moreover, the Mandarin classifier system creates a taxonomic system involving events, kinds and individuals. Within each classifier type, an independent classification system of the collocating noun type is created. These findings are important, first, because they emphasize that understanding the semantics of nouns involves more than simple reference to an individual entity. Second, it is the first time that the previously abstract semantic distinctions among kinds, individuals and events, as well as within kinds and within events, have been found to be instantiated in a particular system of a natural language grammar, namely, the classifier system.

Meeting of the Association for Computational Linguistics | 2000

Sinica Treebank: Design Criteria, Annotation Guidelines, and On-line Interface

Chu-Ren Huang; Fengyi Chen; Keh-Jiann Chen; Zhao-Ming Gao; Kuang-Yu Chen

This paper describes the design criteria and annotation guidelines of Sinica Treebank. The three design criteria are: Maximal Resource Sharing, Minimal Structural Complexity, and Optimal Semantic Information. One of the important design decisions following these criteria is the encoding of thematic role information. An on-line interface facilitating empirical studies of Chinese phrase structure is also described.

Archive | 2010

Ontology and the lexicon : a natural language processing perspective

Chu-Ren Huang; Nicoletta Calzolari; Aldo Gangemi; Alessandro Lenci; Alessandro Oltramari; Laurent Prévot

Part I. Fundamental Aspects:
1. Ontology and the lexicon: a multi-disciplinary perspective (Laurent Prévot, Chu-Ren Huang, Nicoletta Calzolari, Aldo Gangemi, Alessandro Lenci and Alessandro Oltramari)
2. Formal ontology as interlingua: the SUMO and WordNet linking project and GlobalWordNet (Adam Pease and Christiane Fellbaum)
3. Interfacing WordNet with DOLCE: towards OntoWordNet (Aldo Gangemi, Nicola Guarino, Claudio Masolo and Alessandro Oltramari)
4. Reasoning over natural language text by means of FrameNet and ontologies (Jan Scheffczyk, Collin F. Baker and Srini Narayanan)
5. Synergizing ontologies and the lexicon: a roadmap (Alessandro Oltramari, Aldo Gangemi, Chu-Ren Huang, Nicoletta Calzolari, Alessandro Lenci and Laurent Prévot)

Part II. Discovery and Representation of Conceptual Systems:
6. Experiments of ontology construction with formal concept analysis (SuJian Li, Qin Lu and Wenjie Li)
7. Ontology, lexicon, and fact repository as leveraged to interpret events of change (Marjorie McShane, Sergei Nirenburg and Stephen Beale)
8. Hantology: conceptual system discovery based on orthographic convention (Ya-Min Chou and Chu-Ren Huang)
9. What's in a schema? A formal metamodel for ECG and FrameNet (Aldo Gangemi)

Part III. Interfacing Ontologies and Lexical Resources:
10. Interfacing ontologies and lexical resources (Laurent Prévot, Stefano Borgo and Alessandro Oltramari)
11. Sinica BOW (Bilingual Ontological WordNet): integration of Bilingual WordNet and SUMO (Chu-Ren Huang, Ru-Yng Chang and Hsiang-bin Lee)
12. Ontology-based semantic lexicons: mapping between terms and object descriptions (Paul Buitelaar)
13. Merging global and specialized linguistic ontologies (Manuela Speranza and Bernardo Magnini)

Part IV. Learning and Using Ontological Knowledge:
14. The life cycle of knowledge (Alessandro Lenci)
15. The omega ontology (Andrew Philpot, Eduard Hovy and Patrick Pantel)
16. Automatic acquisition of lexico-semantic knowledge for question answering (Lonneke van der Plas, Gosse Bouma and Jori Mur)
17. Agricultural ontology construction and maintenance in Thai (Asanee Kawtrakul and Aurawan Imsombut)

The relation between ontologies and language is at the forefront of both natural language processing (NLP) and knowledge engineering. Ontologies, as widely used models in semantic technologies, have much in common with the lexicon. A lexicon organizes words as a conventional inventory of concepts, while an ontology formalizes concepts and their logical relations. A shared lexicon is the prerequisite for knowledge-sharing through language, and a shared ontology is the prerequisite for knowledge-sharing through information technology. In building models of language, computational linguists must be able to map accurately the relations between words and the concepts that they can be linked to. This book focuses on the integration of lexical resources and semantic technologies. It will be of interest to researchers and graduate students in NLP, computational linguistics and knowledge engineering, as well as in semantics, psycholinguistics, lexicology and morphology/syntax.

中文計算語言學期刊 (Computational Linguistics and Chinese Language Processing) | 2000

The Module-Attribute Representation of Verbal Semantics: From Semantics to Argument Structure

Chu-Ren Huang; Kathleen Ahrens; Li-Li Chang; Keh-Jiann Chen; Meichun Liu; Mei-Chih Tsai

In this paper, we set forth a theory of lexical knowledge. We propose two types of modules: event structure modules and role modules, as well as two sets of attributes: event-internal attributes and role-internal attributes, which are linked to the event structure module and role module, respectively. These module-attribute semantic representations have associated grammatical consequences. Our data is drawn from a comprehensive corpus-based study of Mandarin Chinese verbal semantics, and four particular case studies are presented.

International Conference on Computational Linguistics | 1996

Segmentation standard for Chinese natural language processing

Chu-Ren Huang; Keh-Jiann Chen; Li-Li Chang

This paper proposes a segmentation standard for Chinese natural language processing. The standard is proposed to achieve linguistic felicity, computational feasibility, and data uniformity. Linguistic felicity is maintained by defining a segmentation unit to be equivalent to the theoretical definition of word, and by providing a set of segmentation principles that are equivalent to a functional definition of a word. Computational feasibility is ensured by the fact that the above functional definitions are procedural in nature and can be converted to segmentation algorithms, as well as by the implementable heuristic guidelines which deal with specific linguistic categories. Data uniformity is achieved by stratification of the standard itself and by defining a standard lexicon as part of the segmentation standard.

Meeting of the Association for Computational Linguistics | 2007

Rethinking Chinese Word Segmentation: Tokenization, Character Classification, or Wordbreak Identification

Chu-Ren Huang; Petr Šimon; Shu-Kai Hsieh; Laurent Prévot

This paper addresses two remaining challenges in Chinese word segmentation. The challenge in HLT is to find a robust segmentation method that requires no prior lexical knowledge and no extensive training to adapt to new types of data. The challenge in modelling human cognition and acquisition is to segment words efficiently without using knowledge of wordhood. We propose a radical method of word segmentation to meet both challenges. The most critical concept that we introduce is that Chinese word segmentation is the classification of a string of character-boundaries (CBs) into either word-boundaries (WBs) or non-word-boundaries. In Chinese, CBs are delimited and distributed in between two characters. Hence we can use the distributional properties of CBs among the background character strings to predict which CBs are WBs.
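
The reframing of segmentation as boundary classification can be made concrete with a small sketch. It only shows the representation, assuming one binary label per character boundary (1 for a word boundary, 0 otherwise); the function names are hypothetical, and the sketch does not implement the distributional prediction model the abstract refers to.

```python
def words_to_boundaries(words):
    """Encode a segmented sentence as one label per character boundary.

    A sentence of n characters has n - 1 character boundaries.
    Label 1 marks a word boundary, label 0 a non-word boundary.
    """
    labels = []
    for i, word in enumerate(words):
        # Boundaries inside a word are non-word boundaries.
        labels.extend([0] * (len(word) - 1))
        # The boundary after every word except the last is a word boundary.
        if i < len(words) - 1:
            labels.append(1)
    return labels


def boundaries_to_words(text, labels):
    """Rebuild the word sequence from the text and its boundary labels."""
    if len(labels) != max(len(text) - 1, 0):
        raise ValueError("expected exactly one label per character boundary")
    words, start = [], 0
    for i, label in enumerate(labels):
        if label == 1:
            words.append(text[start:i + 1])
            start = i + 1
    if text[start:]:
        words.append(text[start:])
    return words


# Toy example (hypothetical segmentation chosen for illustration).
segmented = ["我們", "喜歡", "語言學"]
labels = words_to_boundaries(segmented)  # [0, 1, 0, 1, 0, 0]
restored = boundaries_to_words("".join(segmented), labels)
```

Under this encoding, any classifier that assigns 0 or 1 to each boundary yields a segmentation directly, without consulting a word list.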

Proceedings of the 4th Workshop on Linked Data in Linguistics: Resources and Applications | 2015

EVALution 1.0: an Evolving Semantic Dataset for Training and Evaluation of Distributional Semantic Models

Enrico Santus; Frances Yung; Alessandro Lenci; Chu-Ren Huang

In this paper, we introduce EVALution 1.0, a dataset designed for the training and the evaluation of Distributional Semantic Models (DSMs). This version consists of almost 7.5K tuples, instantiating several semantic relations between word pairs (including hypernymy, synonymy, antonymy, meronymy). The dataset is enriched with a large amount of additional information (e.g. relation domain, word frequency, word POS, word semantic field) that can be used for either filtering the pairs or performing an in-depth analysis of the results. The tuples were extracted from a combination of ConceptNet 5.0 and WordNet 4.0, and subsequently filtered through automatic methods and crowdsourcing in order to ensure their quality. The dataset is freely downloadable. An extension in RDF format, also including scripts for data processing, is under development.

中文計算語言學期刊 (Computational Linguistics and Chinese Language Processing) | 1998

Towards a Representation of Verbal Semantics-An Approach Based on Near-Synonyms

Mei-Chih Tsai; Chu-Ren Huang; Keh-Jiann Chen; Kathleen Ahrens

In this paper we propose using the distributional differences in the syntactic patterns of near-synonyms to deduce the relevant components of verb meaning. Our method involves determining the distributional differences in syntactic patterns, deducing the semantic features from the syntactic phenomena, and testing the semantic features in new syntactic frames. We determine the distributional differences in syntactic patterns through the following five steps: First, we search for all instances of the verb in the corpus. Second, we classify each of these instances into its type of syntactic function. Third, we classify each of these instances into its argument structure type. Fourth, we determine the aspectual type that is associated with each verb. Lastly, we determine each verb's sentential type. Once the distributional differences have been determined, the relevant semantic features are postulated. Our goal is to tease out the lexical semantic features as the explanation of, and the motivation for, the syntactic contrasts.

International Conference on Computational Linguistics | 2002

Translating lexical semantic relations: the first step towards multilingual wordnets

Chu-Ren Huang; I-Ju E. Tseng; Dylan B.S. Tsai

Establishing correspondences between wordnets of different languages is essential both to multilingual knowledge processing and to bootstrapping wordnets of low-density languages. We claim that such correspondences must be based on lexical semantic relations, rather than top ontology or word translations. In particular, we define a translation equivalence relation as a bilingual lexical semantic relation. Such relations can then be part of a logical entailment predicting whether source language semantic relations will hold in a target language or not. Our claim is tested with a study of 210 Chinese lexical lemmas and their possible semantic relation links bootstrapped from the Princeton WordNet. The results show that lexical semantic relation translations are indeed highly precise when they are logically inferable.
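
The entailment idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the procedure from the paper: the data layout, the function name `project_relations`, and the rule that only synonym-type translation equivalences license projection are all hypothetical, and the romanized word forms are toy data.

```python
def project_relations(translations, source_relations):
    """Project source-language relations onto target-language words.

    translations: (target_word, source_word, equivalence_type) triples,
        where equivalence_type names the bilingual lexical semantic relation.
    source_relations: (head, relation, tail) triples in the source language.

    Assumption of this sketch: a relation is projected only when both of
    its arguments have a synonym-type translation equivalent.
    """
    exact = {}
    for target_word, source_word, eq_type in translations:
        if eq_type == "synonym":
            exact.setdefault(source_word, set()).add(target_word)

    projected = set()
    for head, relation, tail in source_relations:
        for target_head in exact.get(head, ()):
            for target_tail in exact.get(tail, ()):
                projected.add((target_head, relation, target_tail))
    return projected


# Toy data: romanized target-language forms paired with English words.
translations = [
    ("gou", "dog", "synonym"),
    ("dongwu", "animal", "synonym"),
    ("chongwu", "dog", "hypernym"),  # not an exact equivalent, so not projected
]
source_relations = {("dog", "hypernym", "animal")}
candidates = project_relations(translations, source_relations)
```

Candidate links produced this way would still need checking against target-language evidence; the sketch only shows how a translation relation and a source relation can be combined into a prediction.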
