Ashwin Ittoo | Researchain

Publication

Featured researches published by Ashwin Ittoo.

Data and Knowledge Engineering | 2013

Minimally-supervised extraction of domain-specific part-whole relations using Wikipedia as knowledge-base

Ashwin Ittoo; Gosse Bouma

We present a minimally-supervised approach for learning part-whole relations from texts. Unlike previous techniques, we focus on sparse, domain-specific texts. The novelty of our approach lies in the use of Wikipedia as a knowledge-base, from which we first acquire a set of reliable patterns that express part-whole relations. This is achieved by a minimally-supervised algorithm. We then use the acquired patterns to extract part-whole relation triples from a collection of sparse, domain-specific texts. Our strategy of learning in one domain and applying the knowledge in another is based upon the notion of domain adaptation. It allows us to overcome the challenges of learning the relations directly from the sparse, domain-specific corpus. Our experimental evaluations reveal that, despite its general-purpose nature, Wikipedia can be exploited as a source of knowledge for improving the performance of domain-specific part-whole relation extraction. As further contributions, we propose a mechanism that mitigates the negative impact of semantic drift on minimally-supervised algorithms. We also represent the patterns in the extracted relations using sophisticated syntactic structures that avoid the limitations of traditional surface string representations. In addition, we show that domain-specific part-whole relations cannot be conclusively classified in existing taxonomies.
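The learn-in-one-domain, apply-in-another strategy described in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names `learn_patterns` and `apply_patterns` are hypothetical, the patterns are whole-sentence surface templates (the paper uses syntactic structures instead), and the `min_support` threshold is only a crude stand-in for a drift-mitigation mechanism, not the one the authors propose.

```python
import re
from collections import Counter

def learn_patterns(sentences, pairs, min_support=2):
    """Turn sentences that mention a known (part, whole) pair into templates."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for part, whole in pairs:
            if part in tokens and whole in tokens:
                template = " ".join(
                    "PART" if t == part else "WHOLE" if t == whole else t
                    for t in tokens
                )
                counts[template] += 1
    # Assumption: dropping low-support templates is a simple guard against
    # noisy patterns; it is not the mechanism described in the paper.
    return {p for p, n in counts.items() if n >= min_support}

def apply_patterns(sentences, patterns):
    """Match learned templates against new sentences to harvest pairs."""
    found = set()
    for pattern in patterns:
        regex = (
            re.escape(pattern)
            .replace("PART", r"(?P<part>\w+)")
            .replace("WHOLE", r"(?P<whole>\w+)")
        )
        for sentence in sentences:
            match = re.fullmatch(regex, sentence)
            if match:
                found.add((match.group("part"), match.group("whole")))
    return found

# Hypothetical data: a general-purpose corpus and a sparse domain corpus.
general_corpus = [
    "engine contains piston",
    "car contains wheel",
    "wheel is part of car",
    "piston is part of engine",
    "car hit wheel",
]
seeds = {("piston", "engine"), ("wheel", "car")}
domain_corpus = [
    "scanner contains detector",
    "gantry is part of scanner",
    "scanner hit detector",
]

patterns = learn_patterns(general_corpus, seeds)
domain_pairs = apply_patterns(domain_corpus, patterns)
```

In this toy run the template learned from "car hit wheel" occurs only once, so it is discarded and does not produce a spurious pair in the domain corpus.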

Expert Systems With Applications | 2013

Term extraction from sparse, ungrammatical domain-specific documents

Ashwin Ittoo; Gosse Bouma

Existing term extraction (TE) systems have predominantly targeted large and well-written document collections, which provide reliable statistical and linguistic evidence to support term extraction. In this article, we address the term extraction challenges posed by sparse, ungrammatical texts with domain-specific contents, such as customer complaint emails and engineers' repair notes. To this end, we present ExtTerm, a novel term extraction system. Specifically, as our core innovations, we accurately detect rare (low-frequency) terms, overcoming the issue of data sparsity. These rare terms may denote critical events, but they are often missed by extant TE systems. ExtTerm also precisely detects multi-word terms of arbitrary length, e.g. with more than 2 words. This is achieved by exploiting fundamental theoretical notions underlying term formation, and by developing a technique to compute the collocation strength between any number of words. Thus, we address the limitation of existing TE systems, which are primarily designed to identify terms with 2 words. Furthermore, we show that open-domain (general) resources, such as Wikipedia, can be exploited to support domain-specific term extraction. Thus, they can be used to compensate for the unavailability of domain-specific knowledge resources. Our experimental evaluations reveal that ExtTerm outperforms a state-of-the-art baseline in extracting terms from a domain-specific, sparse and ungrammatical real-life text collection.
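The abstract does not say which measure ExtTerm uses to score collocation strength between more than two words. As one possible illustration only, the sketch below generalizes pointwise mutual information to n words by comparing the observed n-gram probability with the product of the unigram probabilities. The function name `ngram_pmi` and the sample tokens are hypothetical.

```python
import math
from collections import Counter

def ngram_pmi(tokens, ngram):
    """Generalized PMI: log(P(w1..wn) / (P(w1) * ... * P(wn)))."""
    ngram = tuple(ngram)
    n = len(ngram)
    total_ngrams = len(tokens) - n + 1
    if total_ngrams <= 0:
        return float("-inf")
    unigrams = Counter(tokens)
    ngrams = Counter(tuple(tokens[i:i + n]) for i in range(total_ngrams))
    if ngrams[ngram] == 0:
        # Unseen sequences get the lowest possible score.
        return float("-inf")
    p_joint = ngrams[ngram] / total_ngrams
    p_independent = math.prod(unigrams[w] / len(tokens) for w in ngram)
    return math.log(p_joint / p_independent)

# Hypothetical repair-note style text, already tokenized.
tokens = (
    "x ray tube failure after x ray tube replacement "
    "the tube after failure"
).split()

strong = ngram_pmi(tokens, ("x", "ray", "tube"))
weak = ngram_pmi(tokens, ("failure", "after", "x"))
```

A recurring sequence such as "x ray tube" scores higher than an incidental one such as "failure after x", and the same function works for any n.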

International Conference Natural Language Processing | 2011

Extracting explicit and implicit causal relations from sparse, domain-specific texts

Ashwin Ittoo; Gosse Bouma

Various supervised algorithms for mining causal relations from large corpora exist. These algorithms have focused on relations explicitly expressed with causal verbs, e.g. to cause. However, the challenges of extracting causal relations from domain-specific texts have been overlooked. Domain-specific texts are rife with causal relations that are implicitly expressed using verbal and non-verbal patterns, e.g. reduce, drop in, due to. Also, readily available resources to support supervised algorithms are nonexistent in most domains. To address these challenges, we present a novel approach for causal relation extraction. Our approach is minimally-supervised, alleviating the need for annotated data. Also, it identifies both explicit and implicit causal relations. Evaluation results revealed that our technique achieves state-of-the-art performance in extracting causal relations from domain-specific, sparse texts. The results also indicate that many of the domain-specific relations were unclassifiable in existing taxonomies of causality.

Business Information Systems | 2010

Textractor: A framework for extracting relevant domain concepts from irregular corporate textual datasets

Ashwin Ittoo; Laura Maruster; Hans Wortmann; Gosse Bouma

Various information extraction (IE) systems for corporate usage exist. However, none of them targets the product development and/or customer service domain, despite significant application potential and benefits. This domain also poses new scientific challenges, such as the lack of external knowledge resources, and irregularities like ungrammatical constructs in textual data, which compromise successful information extraction. To address these issues, we describe the development of Textractor, an application for accurately extracting relevant concepts from irregular textual narratives in datasets of product development and/or customer service organizations. The extracted information can subsequently be fed to a host of business intelligence activities. We present novel algorithms, combining both statistical and linguistic approaches, for the accurate discovery of relevant domain concepts from highly irregular/ungrammatical texts. Evaluations on real-life corporate data revealed that Textractor extracts domain concepts, realized as single or multi-word terms in ungrammatical texts, with high precision.

Applications of Natural Language to Data Bases | 2010

Extracting meronymy relationships from domain-specific, textual corporate databases

Ashwin Ittoo; Gosse Bouma; Laura Maruster; Hans Wortmann

Various techniques for learning meronymy relationships from open-domain corpora exist. However, extracting meronymy relationships from domain-specific, textual corporate databases has been overlooked, despite numerous application opportunities, particularly in domains like product development and/or customer service. These domains also pose new scientific challenges, such as the absence of elaborate knowledge resources, compromising the performance of supervised meronymy-learning algorithms. Furthermore, the domain-specific terminology of corporate texts makes it difficult to select appropriate seeds for minimally-supervised meronymy-learning algorithms. To address these issues, we develop and present a principled approach to extract accurate meronymy relationships from textual databases of product development and/or customer service organizations by leveraging reliable meronymy lexico-syntactic patterns harvested from an open-domain corpus. Evaluations on real-life corporate databases indicate that our technique extracts precise meronymy relationships that provide valuable operational insights on causes of product failures and customer dissatisfaction. Our results also reveal that the types of some of the domain-specific meronymy relationships, extracted from the corporate data, cannot be conclusively and unambiguously classified under well-known taxonomies of relationships.

Quality and Reliability Engineering International | 2012

Improving Product Quality and Reliability with Customer Experience Data

Ac Aarnout Brombacher; Eva Hopma; Ashwin Ittoo; Yuan Lu; Ilse Luyk; Laura Maruster; Joel Ribeiro; Ton Weijters; Hans Wortmann

Advanced technology development and wide use of the World Wide Web have made it possible for new product development organizations to access multiple sources of data related to customer complaints. However, the number of customer complaints about highly innovative consumer electronic products is still increasing; that is, product quality and reliability are at risk. This article aims to understand why existing solutions from literature as well as from industry to deal with these increasingly complex multiple data sources are not able to manage product quality and reliability. Three case studies in industry are discussed. On the basis of the case study results, this article also identifies a new research agenda that is needed to improve product quality and reliability under these circumstances. Copyright (c) 2011 John Wiley & Sons, Ltd.

Data and Knowledge Engineering | 2013

Editorial: Minimally-supervised learning of domain-specific causal relations using an open-domain corpus as knowledge base

Ashwin Ittoo; Gosse Bouma

We propose a novel framework for overcoming the challenges in extracting causal relations from domain-specific texts. Our technique is minimally-supervised, alleviating the need for manually-annotated, expensive training data. As our main contribution, we show that open-domain corpora can be exploited as knowledge bases to overcome data sparsity issues posed by domain-specific relation extraction, and that they enable substantial performance gains. We also address longstanding challenges of extant minimally-supervised approaches. To suppress the negative impact of semantic drift, we propose a technique based on the Latent Relational Hypothesis. In addition, our approach discovers both explicit (e.g. to cause) and implicit (e.g. to destroy) causal patterns/relations. Unlike existing minimally-supervised techniques, we adopt a principled seed selection strategy, which enables us to discover a more diverse set of causal patterns/relations. Our experiments reveal that our approach outperforms a state-of-the-art baseline in discovering causal relations from a real-life, domain-specific corpus.

Computer Science and Information Engineering | 2009

Ensemble Similarity Measures for Clustering Terms

Ashwin Ittoo; Laura Maruster

Clustering semantically related terms is crucial for many applications such as document categorization and word sense disambiguation. However, automatically identifying semantically similar terms is challenging. We present a novel approach for automatically determining the degree of relatedness between terms to facilitate their subsequent clustering. Using the analogy of ensemble classifiers in Machine Learning, we combine multiple techniques like contextual similarity and semantic relatedness to boost the accuracy of our computations. A new method, based on Yarowsky’s word sense disambiguation approach, to generate high-quality topic signatures for contextual similarity computations, is presented. A technique to measure semantic relatedness between multi-word terms, based on the work of Hirst and St. Onge, is also proposed. Experimental evaluation reveals that our method outperforms comparable related work. We also investigate the effects of assigning different importance levels to the different similarity measures based on the corpus characteristics.
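The idea of combining several similarity measures with different importance levels can be sketched as a weighted average. This is a minimal illustration under assumptions: the function name `ensemble_similarity`, the measure names, and the scores and weights are all hypothetical, and the abstract does not state how the paper actually combines its measures.

```python
def ensemble_similarity(scores, weights):
    """Weighted average of per-measure similarity scores (each in [0, 1])."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(weights[name] * value for name, value in scores.items())
    return weighted_sum / total_weight

# Hypothetical scores for one pair of terms from two measures.
scores = {"contextual": 0.8, "semantic": 0.4}

# Equal importance versus favouring contextual similarity.
balanced = ensemble_similarity(scores, {"contextual": 1, "semantic": 1})
contextual_heavy = ensemble_similarity(scores, {"contextual": 3, "semantic": 1})
```

Raising the weight of one measure pulls the combined score toward that measure, which is one simple way to model corpus-dependent importance levels.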

Industrial Engineering and Engineering Management | 2009

Development of an RMS model based on colored object-oriented Petri nets with changeable structures

Linda L. Zhang; Ashwin Ittoo

Current research on reconfigurable manufacturing systems (RMSs) consists mostly of isolated empirical studies, with only limited attempts to explore the modeling and design support issues surrounding this economically important class of system development problems. This paper proposes colored object-oriented Petri nets with changeable structures to shed light on how RMSs can dynamically reconfigure system components to accommodate production variations resulting from diverse changes in product design and order quantities. The proposed model is discussed in detail using a case example.
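To make the notion of a Petri net with a changeable structure concrete, here is a heavily simplified sketch. It is an assumption-laden illustration, not the paper's model: it uses a plain place/transition net, omits token colors and object orientation entirely, and the class name `PetriNet` and the place and transition names are hypothetical.

```python
class PetriNet:
    """Plain place/transition net whose transitions can change at runtime."""

    def __init__(self, marking):
        self.marking = dict(marking)   # place name -> token count
        self.transitions = {}          # transition name -> (inputs, outputs)

    def add_transition(self, name, inputs, outputs):
        """Reconfigure the net by adding (or redefining) a transition."""
        self.transitions[name] = (dict(inputs), dict(outputs))

    def remove_transition(self, name):
        """Reconfigure the net by removing a transition."""
        del self.transitions[name]

    def enabled(self, name):
        inputs, _ = self.transitions[name]
        return all(self.marking.get(p, 0) >= n for p, n in inputs.items())

    def fire(self, name):
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for place, n in inputs.items():
            self.marking[place] = self.marking.get(place, 0) - n
        for place, n in outputs.items():
            self.marking[place] = self.marking.get(place, 0) + n

# Hypothetical line: raw parts are milled, then a drilling step is added.
net = PetriNet({"raw": 2, "milled": 0})
net.add_transition("mill", {"raw": 1}, {"milled": 1})
net.fire("mill")

# Structural change: a new product variant requires drilling after milling.
net.add_transition("drill", {"milled": 1}, {"drilled": 1})
net.fire("drill")
```

The structural change happens while the marking is preserved, which is the basic property a reconfigurable model needs: work in progress survives the reconfiguration.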

Meeting of the Association for Computational Linguistics | 2010