Publication


Featured research published by Kwong Bor Ng.


Journal of the Association for Information Science and Technology | 2000

Predicting the effectiveness of Naïve data fusion on the basis of system characteristics

Kwong Bor Ng; Paul B. Kantor

Effective automation of the information retrieval task has long been an active area of research, leading to sophisticated retrieval models. With many IR schemes available, researchers have begun to investigate the benefits of combining the results of different IR schemes to improve performance, in the process called “data fusion.” There are many successful data fusion experiments reported in the IR literature, but there are also cases in which it did not work well. Thus, it would be quite valuable to have a theory that can predict, in advance, whether fusion of two or more retrieval schemes will be worth doing. In a previous study (Ng & Kantor, 1998), we identified two predictive variables for the effectiveness of fusion: (a) a list‐based measure of output dissimilarity, and (b) a pair‐wise measure of the similarity of performance of the two schemes. In this article we investigate the predictive power of these two variables in simple symmetrical data fusion. We use the IR systems participating in the TREC 4 routing task to train a model that predicts the effectiveness of data fusion, and use the IR systems participating in the TREC 5 routing task to test that model. The model asks, “when will fusion perform better than an oracle who uses the best scheme from each pair?” We explore statistical techniques for fitting the model to the training data and use the receiver operating characteristic curve of signal detection theory to represent the power of the resulting models. The trained prediction methods predict whether fusion will beat an oracle at levels much higher than could be achieved by chance.
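As a rough illustration of the question the abstract poses, the sketch below fuses two score lists symmetrically and compares the fused ranking with an oracle that picks the better of the two schemes. It is not the authors' procedure: the min-max normalization, the sum-of-scores fusion rule, the precision-at-k measure, and all function names and data are assumptions chosen for the example.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def ranking_of(scores):
    """Order documents by descending score, breaking ties by doc id."""
    return sorted(scores, key=lambda d: (-scores[d], d))


def fuse(scores_a, scores_b):
    """Symmetric fusion (assumed rule): add the normalized scores.

    A document missing from one scheme contributes 0 from that scheme.
    """
    a, b = min_max_normalize(scores_a), min_max_normalize(scores_b)
    fused = {doc: a.get(doc, 0.0) + b.get(doc, 0.0) for doc in set(a) | set(b)}
    return ranking_of(fused)


def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k documents that are relevant."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


# Hypothetical scores from two retrieval schemes and a hypothetical relevant set.
scheme_a = {"d1": 9.0, "d2": 7.0, "d3": 1.0, "d4": 0.5}
scheme_b = {"d2": 0.9, "d5": 0.8, "d1": 0.7, "d3": 0.1}
relevant = {"d1", "d2"}

fused_ranking = fuse(scheme_a, scheme_b)
fused_p = precision_at_k(fused_ranking, relevant, 2)
# The oracle always uses whichever single scheme performs better on this pair.
oracle_p = max(
    precision_at_k(ranking_of(scheme_a), relevant, 2),
    precision_at_k(ranking_of(scheme_b), relevant, 2),
)
fusion_beats_oracle = fused_p > oracle_p
print(fused_ranking, fused_p, oracle_p, fusion_beats_oracle)
```

In this toy case fusion only ties the oracle, which is the kind of outcome a predictive model would need to distinguish from a genuine gain.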


Journal of the Association for Information Science and Technology | 1999

A comparison of the two traditions of metadata development

Kathleen Burnett; Kwong Bor Ng; Soyeon Park

Metadata has taken on a more significant role than ever before in the emerging digital library context because the effective organization of networked information clearly depends on the effective management and organization of metadata. The issue of metadata has been approached variously by different intellectual communities. The two main approaches may be characterized as: (1) the bibliographic control approach (origins and major proponents in library science); and (2) the data management approach (origins and major proponents in computer science). This article examines the different conceptual foundations and orientations of the two major approaches contributing to the metadata discussion. An examination of the ongoing efforts to establish metadata standards, and a comparison of different metadata formats, support a proposal for an integrated concept of metadata to facilitate the merging of the two approaches.


North American Chapter of the Association for Computational Linguistics | 2003

Automatically predicting information quality in news documents

Rong Tang; Kwong Bor Ng; Tomek Strzalkowski; Paul B. Kantor

We report here empirical results of a series of studies aimed at automatically predicting information quality in news documents. Multiple research methods and data analysis techniques enabled a good level of machine prediction of information quality. Procedures regarding user experiments and statistical analysis are described.


Proceedings of the ASIST Annual Meeting | 2005

Toward machine understanding of information quality

Rong Tang; Kwong Bor Ng; Tomek Strzalkowski; Paul B. Kantor

In this paper we report preliminary results of a study to develop, and subsequently to automate, new metrics for assessment of information quality in text documents, particularly in news. Through focus group studies, quality judgment experiments, and textual feature extraction and analysis, we were able to generate nine quality aspects and apply them in human assessments. Experts and students participated in quality experiments, during which 1000 TREC documents were evaluated by participants from two sites: Albany and Rutgers. Data showed good interjudge agreement between judges from both sites. Principal component analysis revealed that the nine aspects form clusters of “content” and “presentation.” Automatic quality prediction has been derived based on statistical analysis of the association between textual features and human quality judgments.


Proceedings of the ASIST Annual Meeting | 2005

Identification of Effective Predictive Variables for Document Qualities.

Kwong Bor Ng; Rong Tang; Sharon G. Small; Tomek Strzalkowski; Paul B. Kantor; Robert Rittman; Peng Song; Ying Sun; Nina Wacholder

We analyzed textual properties of documents to identify predictive variables for various document qualities by means of statistical and linguistic methods. We created a collection of 1000 documents, each of which was judged in terms of nine document qualities (accuracy, reliability, objectivity, depth, author/producer credibility, readability, verbosity and conciseness, grammatical correctness, and one-sided or multiview). Employing statistical analyses, we considered a kind of linear combination, asking (1) whether it was possible to combine textual features linearly to predict document qualities; (2) which textual features had good predictive power; and (3) which textual features were minimally required for prediction with a detection rate much better than the false alarm rate. We present several promising results, indicating that with a small number of textual features, we can predict various document qualities much better than chance.
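The comparison of detection rate with false alarm rate can be made concrete with a small sketch. It assumes a fixed linear combination of two textual features and a single decision threshold. The feature values, weights, threshold, and labels are all hypothetical and are not taken from the study.

```python
def linear_score(features, weights, bias=0.0):
    """Linear combination of textual features: bias + sum(w_i * x_i)."""
    return bias + sum(w * x for w, x in zip(weights, features))


def detection_and_false_alarm(scores, labels, threshold):
    """Return (detection rate, false alarm rate) for a score threshold.

    Detection rate: share of truly high-quality documents flagged as high.
    False alarm rate: share of low-quality documents flagged as high.
    """
    predicted = [s >= threshold for s in scores]
    hits = sum(1 for p, y in zip(predicted, labels) if p and y)
    false_alarms = sum(1 for p, y in zip(predicted, labels) if p and not y)
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    return hits / positives, false_alarms / negatives


# Hypothetical documents, each described by two made-up textual features.
documents = [(0.2, 1.0), (0.8, 0.5), (0.6, 0.9), (0.1, 0.2), (0.9, 0.4)]
# Hypothetical human judgments: True means "high" on some quality.
labels = [True, True, True, False, False]
weights = (1.0, 0.5)  # assumed weights, not fitted to any data

scores = [linear_score(doc, weights) for doc in documents]
detection_rate, false_alarm_rate = detection_and_false_alarm(scores, labels, 0.8)
print(detection_rate, false_alarm_rate)
```

Sweeping the threshold over all score values and plotting the resulting pairs would trace a receiver operating characteristic curve; a useful feature set keeps the detection rate well above the false alarm rate.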


Journal of the Association for Information Science and Technology | 2006

Automated judgment of document qualities

Kwong Bor Ng; Paul B. Kantor; Tomek Strzalkowski; Nina Wacholder; Rong Tang; Bing Bai; Robert Rittman; Peng Song; Ying Sun

The authors report on a series of experiments to automate the assessment of document qualities such as depth and objectivity. The primary purpose is to develop a quality-sensitive functionality, orthogonal to relevance, to select documents for an interactive question-answering system. The study consisted of two stages. In the classifier construction stage, nine document qualities deemed important by information professionals were identified and classifiers were developed to predict their values. In the confirmative evaluation stage, the performance of the developed methods was checked using a different document collection. The quality prediction methods worked well in the second stage. The results strongly suggest that the best way to predict document qualities automatically is to construct classifiers on a person-by-person basis.


Proceedings of the ASIST Annual Meeting | 2005

Adjectives as indicators of subjectivity in documents

Robert Rittman; Nina Wacholder; Paul B. Kantor; Kwong Bor Ng; Tomek Strzalkowski; Ying Sun

The goal of this research is to automatically predict human judgments of document qualities such as subjectivity, verbosity and depth. In this paper, we explore the behavior of adjectives as indicators of subjectivity in documents. Specifically, we test whether a subset of automatically derived subjective adjectives (Wiebe, 2000b), selected a priori, behaves differently than other adjectives. 3,200 documents were ranked by 100 subjects as being high or low in nine document qualities (Tang, Ng, Strzalkowski, & Kantor, 2003). We report a statistically significant correlation between the occurrence of adjectives in documents and human judgments of subjectivity. More importantly, we find that the subset of subjective adjectives is more strongly correlated with subjectivity than adjectives in general. These results can be used to identify document qualities for use in information retrieval and question-answering systems.
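One way to picture the reported correlation is a plain Pearson coefficient between a per-document adjective count and a binary high/low subjectivity judgment. The abstract does not say which correlation statistic was used, so this choice, along with the counts and judgments below, is an assumption for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical data: adjectives counted per document, and whether judges
# rated the document high (1) or low (0) in subjectivity.
adjective_counts = [12, 30, 7, 25, 18, 4]
subjectivity = [0, 1, 0, 1, 1, 0]

r = pearson(adjective_counts, subjectivity)
print(round(r, 3))
```

Repeating the calculation with counts restricted to a chosen subset of adjectives would allow the two coefficients to be compared, which mirrors the comparison the abstract describes.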


Proceedings of the ASIST Annual Meeting | 2005

Exploration of a Geometric Model of Data Fusion.

Ulukbek Ibraev; Kwong Bor Ng; Paul B. Kantor

Some aspects of Data Fusion (DF) for Information Retrieval (IR) are explored using a set of data from the Fifth International Conference on Text Retrieval, TREC5. It has been observed from time to time that DF applied to a pair of systems or schemes for IR may yield results that are better than those of either participating scheme. It has been conjectured that this occurs only rarely, or occurs only when poor schemes are being combined, or occurs only for problems in which there are so few relevant documents that the results are probably due to statistical fluctuation. Based on a geometrical model of DF, we derive an equation for effective DF. This equation shows that in the ideal case the performance of a pair of IR schemes may be approximated by a quadratic polynomial. We statistically test this assumption for TREC5 Routing data. Results of the regression analysis show that our equation for the effect of DF is generally valid.


Proceedings of the ASIST Annual Meeting | 2005

The institutional dimension of document quality judgments

Bing Bai; Kwong Bor Ng; Ying Sun; Paul B. Kantor; Tomek Strzalkowski

In addition to relevance, there are other factors that contribute to the utility of a document. For example, content properties like depth of analysis and multiplicity of viewpoints, and presentational properties like readability and verbosity, all affect the usefulness of a document. These kinds of relevance-independent properties are difficult to determine, as their estimation is more likely to be affected by personal aspects of one's knowledge structure. Reliability of judgments on those properties may decrease when one moves from the personal level to the global level. In this paper, we report several experiments on document qualities. In our experiments, we explore the correlation of judgments on nine different document qualities to see if we can generate fewer dimensions underlying the judgments. We also investigate the issue of reliability of the judgments and discuss its theoretical implications. We find that, between the global level of agreement of judgment and the inter-personal level of agreement of judgment, there is an intermediate level of agreement, which can be characterized as an inter-institutional level.


Archive | 2013

Recent developments in the design, construction, and evaluation of digital libraries: case studies

Colleen Cool; Kwong Bor Ng

It is no secret that the world of libraries has rapidly evolved into an environment which will quickly be largely digitised. However, this digital shift has brought with it a unique set of challenges and issues for scholars and librarians to handle. Recent Developments in the Design, Construction and Evaluation of Digital Libraries not only addresses the challenges with digital libraries, but also describes the recent developments in the design, construction, and evaluation of these libraries in various environments. This leading publication compiles research from a wide array of specialists in a unified and comprehensive manner. Librarians, researchers, scholars, and professionals in this field will find the reference source beneficial for deepening their understanding of this continually growing field.

Collaboration


Top co-authors of Kwong Bor Ng.

Rong Tang

The Catholic University of America

Bing Bai

Princeton University