Yuk Wah Wong | Researchain

Publication

Featured research published by Yuk Wah Wong.

Artificial Intelligence in Medicine | 2005

Comparative experiments on learning information extractors for proteins and their interactions

Razvan C. Bunescu; Ruifang Ge; Rohit J. Kate; Edward M. Marcotte; Raymond J. Mooney; Arun K. Ramani; Yuk Wah Wong

OBJECTIVE: Automatically extracting information from biomedical text holds the promise of easily consolidating large amounts of biological knowledge in computer-accessible form. This strategy is particularly attractive for extracting data relevant to genes of the human genome from the 11 million abstracts in Medline. However, extraction efforts have been frustrated by the lack of conventions for describing human genes and proteins. We have developed and evaluated a variety of learned information extraction systems for identifying human protein names in Medline abstracts and subsequently extracting information on interactions between the proteins.

METHODS AND MATERIAL: We used a variety of machine learning methods to automatically develop information extraction systems for extracting information on gene/protein name, function and interactions from Medline abstracts. We present cross-validated results on identifying human proteins and their interactions by training and testing on a set of approximately 1000 manually-annotated Medline abstracts that discuss human genes/proteins.

RESULTS: We demonstrate that machine learning approaches using support vector machines and maximum entropy are able to identify human proteins with higher accuracy than several previous approaches. We also demonstrate that various rule induction methods are able to identify protein interactions with higher precision than manually-developed rules.

CONCLUSION: Our results show that it is promising to use machine learning to automatically build systems for extracting information from biomedical text. The results also give a broad picture of the relative strengths of a wide variety of methods when tested on a reasonably large human-annotated corpus.
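The cross-validated precision comparisons mentioned in this abstract rest on standard bookkeeping. The following is a minimal Python sketch of that bookkeeping, assuming extracted protein names are scored against gold annotations as exact-match sets. The function names and the round-robin fold scheme are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Set, Tuple


def precision_recall_f1(predicted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Score a set of extracted items against a gold-standard set (exact match)."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def k_fold_splits(n_items: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return (train_indices, test_indices) for each of k round-robin folds."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        # Training data is every fold except the held-out one.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits
```

In a setup like this, each abstract would serve as test data exactly once, and the per-fold scores would be averaged before comparing methods.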

Language and Technology Conference | 2006

Learning for Semantic Parsing with Statistical Machine Translation

Yuk Wah Wong; Raymond J. Mooney

We present a novel statistical approach to semantic parsing, WASP, for constructing a complete, formal meaning representation of a sentence. A semantic parser is learned given a set of sentences annotated with their correct meaning representations. The main innovation of WASP is its use of state-of-the-art statistical machine translation techniques. A word alignment model is used for lexical acquisition, and the parsing model itself can be seen as a syntax-based translation model. We show that WASP performs favorably in terms of both accuracy and coverage compared to existing learning methods requiring a similar amount of supervision, and shows better robustness to variations in task complexity and word order.
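To illustrate how a word alignment model can support lexical acquisition, here is a small Python sketch. It uses IBM Model 1 trained with EM as a stand-in; the abstract does not say which alignment model WASP uses, and the toy sentences, the meaning symbols (`DEF`, `HOUSE`, and so on), and the function names are all invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Pair = Tuple[List[str], List[str]]  # (sentence words, meaning symbols)

# Toy training data (hypothetical): sentences paired with flat meaning symbols.
PAIRS: List[Pair] = [
    (["the", "house"], ["DEF", "HOUSE"]),
    (["the", "book"], ["DEF", "BOOK"]),
    (["a", "book"], ["INDEF", "BOOK"]),
]


def train_model1(pairs: List[Pair], iterations: int = 20) -> Dict[Tuple[str, str], float]:
    """Estimate t(word | symbol) with EM (IBM Model 1, no NULL symbol)."""
    vocab = {w for words, _ in pairs for w in words}
    uniform = 1.0 / len(vocab)
    # Start from a uniform table over every co-occurring (word, symbol) pair.
    t = {(w, s): uniform for words, symbols in pairs for w in words for s in symbols}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for words, symbols in pairs:
            for w in words:
                # E-step: spread each word's mass over the symbols in its pair.
                norm = sum(t[(w, s)] for s in symbols)
                for s in symbols:
                    frac = t[(w, s)] / norm
                    count[(w, s)] += frac
                    total[s] += frac
        # M-step: renormalise expected counts per symbol.
        t = {(w, s): c / total[s] for (w, s), c in count.items()}
    return t


def align(words: List[str], symbols: List[str],
          t: Dict[Tuple[str, str], float]) -> List[Tuple[str, str]]:
    """Link each word to the symbol in the same pair with the highest t(word | symbol)."""
    return [(w, max(symbols, key=lambda s: t.get((w, s), 0.0))) for w in words]
```

Calling `align(["a", "book"], ["INDEF", "BOOK"], train_model1(PAIRS))` pairs each word with its most likely symbol; the resulting word-symbol links are the kind of evidence from which a lexicon could be extracted.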

Knowledge Discovery and Data Mining | 2004

A system for automated mapping of bill-of-materials part numbers

Jayant R. Kalagnanam; Moninder Singh; Sudhir Verma; Michael Patek; Yuk Wah Wong

Part numbers are widely used within an enterprise throughout the manufacturing process. The point of entry of such part numbers into this process is normally via a Bill of Materials, or BOM, sent by a contract manufacturer or supplier. Each line of the BOM provides information about one part such as the supplier part number, the BOM receiver's corresponding internal part number, an unstructured textual part description, the supplier name, etc. However, in a substantial number of cases, the BOM receiver's internal part number is absent. Hence, before this part can be incorporated into the receiver's manufacturing process, it has to be mapped to an internal part (of the BOM receiver) based on the information of the part in the BOM. Historically, this mapping process has been done manually, which is a highly time-consuming, labor-intensive and error-prone process. This paper describes a system for automating the mapping of BOM part numbers. The system uses a two-step modeling and mapping approach. First, the system uses historical BOM data, the receiver's part specifications data and the receiver's part taxonomic data along with domain knowledge to automatically learn classification models for mapping a given BOM part description to successively lower levels of the receiver's part taxonomy, reducing the set of potential internal parts to which the BOM part could map. Then, information about various part parameters is extracted from the BOM part description and compared to the specifications data of the potential internal parts to choose the final mapped internal part. Mappings done by the system are very accurate, and the system is currently being deployed within IBM for mapping BOMs received by the corporate procurement/manufacturing divisions.
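The two-step approach in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the catalog, part numbers, keyword cues, and regular expressions are all hypothetical; a simple keyword match stands in for the learned classification models; and a single taxonomy level replaces the successively lower levels the abstract describes.

```python
import re
from typing import Dict, Optional

# Hypothetical internal catalog: part number -> taxonomy class and specifications.
CATALOG = {
    "INT-001": {"class": "capacitor", "specs": {"value": "10uf", "voltage": "16v"}},
    "INT-002": {"class": "capacitor", "specs": {"value": "10uf", "voltage": "25v"}},
    "INT-003": {"class": "resistor", "specs": {"value": "10kohm", "tolerance": "1%"}},
}

# Stand-in for a learned classifier: keyword cues per taxonomy class.
CLASS_KEYWORDS = {"capacitor": {"cap", "capacitor"}, "resistor": {"res", "resistor"}}

# Hypothetical patterns for pulling parameters out of free-text descriptions.
PARAM_PATTERNS = {
    "value": re.compile(r"(\d+(?:\.\d+)?)\s*(uf|nf|pf|kohm|ohm)"),
    "voltage": re.compile(r"(\d+(?:\.\d+)?)\s*(v)\b"),
    "tolerance": re.compile(r"(\d+(?:\.\d+)?)\s*(%)"),
}


def classify(description: str) -> Optional[str]:
    """Step 1: narrow the search to one taxonomy class (keyword overlap)."""
    tokens = set(re.findall(r"[a-z]+", description.lower()))
    scores = {cls: len(tokens & cues) for cls, cues in CLASS_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def extract_parameters(description: str) -> Dict[str, str]:
    """Step 2a: extract normalised parameter strings such as '10uf' or '16v'."""
    text = description.lower()
    params = {}
    for name, pattern in PARAM_PATTERNS.items():
        match = pattern.search(text)
        if match:
            params[name] = match.group(1) + match.group(2)
    return params


def map_part(description: str) -> Optional[str]:
    """Step 2b: choose the candidate whose specs agree with the most parameters."""
    part_class = classify(description)
    if part_class is None:
        return None
    params = extract_parameters(description)
    candidates = {pn: p for pn, p in CATALOG.items() if p["class"] == part_class}

    def score(part_number: str) -> int:
        specs = candidates[part_number]["specs"]
        return sum(1 for key, val in params.items() if specs.get(key) == val)

    best = max(candidates, key=score)
    return best if score(best) > 0 else None
```

For example, `map_part("CAP CER 10uF 16V")` first restricts the candidates to the two capacitors and then uses the extracted voltage to choose between them.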

Meeting of the Association for Computational Linguistics | 2007