S. K. M. Wong
University of Regina
Publication
Featured research published by S. K. M. Wong.
International Journal of Human-computer Studies / International Journal of Man-machine Studies | 1992
Yiyu Yao; S. K. M. Wong
This paper explores the implications of approximating a concept based on the Bayesian decision procedure, which provides a plausible unification of the fuzzy set and rough set approaches for approximating a concept. We show that if a given concept is approximated by one set, the same result given by the α-cut in the fuzzy set theory is obtained. On the other hand, if a given concept is approximated by two sets, we can derive both the algebraic and probabilistic rough set approximations. Moreover, based on the well known principle of maximum (minimum) entropy, we give a useful interpretation of fuzzy intersection and union. Our results enhance the understanding and broaden the applications of both fuzzy and rough sets.
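As a rough illustration of the two-set approximation mentioned in this abstract, the following sketch classifies equivalence classes into positive, boundary, and negative regions by thresholding a conditional probability. The thresholds `alpha` and `beta`, the toy data, and the function names are assumptions made for illustration; the paper derives its thresholds from a Bayesian decision procedure, which is not reproduced here. Using a single threshold (keeping only the positive region) corresponds to an α-cut on the same probabilities.

```python
def conditional_probability(block, concept):
    """Fraction of an equivalence class that belongs to the concept."""
    return len(block & concept) / len(block)


def three_regions(blocks, concept, alpha, beta):
    """Split the universe into positive, boundary, and negative regions.

    A block goes to the positive region when P(concept | block) >= alpha,
    to the negative region when the probability is <= beta, and to the
    boundary region otherwise. Assumes beta < alpha.
    """
    positive, boundary, negative = set(), set(), set()
    for block in blocks:
        p = conditional_probability(block, concept)
        if p >= alpha:
            positive |= block
        elif p <= beta:
            negative |= block
        else:
            boundary |= block
    return positive, boundary, negative


# Hypothetical equivalence classes and concept, chosen only for the example.
blocks = [{1, 2, 3, 4}, {5, 6}, {7, 8, 9}]
concept = {1, 2, 3, 5, 9}
positive, boundary, negative = three_regions(blocks, concept, alpha=0.7, beta=0.4)
print(positive, boundary, negative)
```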
International ACM SIGIR Conference on Research and Development in Information Retrieval | 1985
S. K. M. Wong; Wojciech Ziarko; Patrick C. N. Wong
In information retrieval, it is common to model index terms and documents as vectors in a suitably defined vector space. The main difficulty with this approach is that the explicit representation of term vectors is not known a priori. For this reason, the vector space model adopted by Salton for the SMART system treats the terms as a set of orthogonal vectors. In such a model it is often necessary to adopt a separate, corrective procedure to take into account the correlations between terms. In this paper, we propose a systematic method (the generalized vector space model) to compute term correlations directly from an automatic indexing scheme. We also demonstrate how such correlations can be included with minimal modification in existing vector-based information retrieval systems. The preliminary experimental results obtained from the new model are very encouraging.
Journal of the Association for Information Science and Technology | 1986
Vijay V. Raghavan; S. K. M. Wong
Notations and definitions necessary to identify the concepts and relationships that are important in modelling information retrieval objects and processes in the context of vector spaces are presented. Earlier work on the use of the vector model is evaluated in terms of the concepts introduced, and certain problems and inconsistencies are identified. More importantly, this investigation should lead to a clear understanding of the issues and problems in using the vector space model in information retrieval.
ACM Transactions on Information Systems | 1995
S. K. M. Wong; Yiyu Yao
This article examines and extends the logical models of information retrieval in the context of probability theory. The fundamental notions of term weights and relevance are given probabilistic interpretations. A unified framework is developed for modeling the retrieval process with probabilistic inference. This new approach provides a common conceptual and mathematical basis for many retrieval models, such as the Boolean, fuzzy set, vector space, and conventional probabilistic models. Within this framework, the underlying assumptions employed by each model are identified, and the inherent relationships between these models are analyzed. Although this article is mainly a theoretical analysis of probabilistic inference for information retrieval, practical methods for estimating the required probabilities are provided by simple examples.
Linux Journal | 1997
Yiyu Yao; S. K. M. Wong; Tsau Young Lin
Since the introduction of the theory of rough sets in the early eighties, considerable work has been done on the development and application of this new theory. The paper provides a review of the Pawlak rough set model and its extensions, with emphasis on the formulation, characterization, and interpretation of various rough set models.
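For readers unfamiliar with the Pawlak model referred to above, a minimal sketch of its lower and upper approximations may help. The lower approximation is the union of equivalence classes fully contained in the concept, and the upper approximation is the union of classes that intersect it. The toy universe, the `key` function, and all names below are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def equivalence_classes(universe, key):
    """Group objects that are indiscernible under the description `key`."""
    classes = defaultdict(set)
    for obj in universe:
        classes[key(obj)].add(obj)
    return list(classes.values())


def approximations(universe, key, concept):
    """Return the (lower, upper) approximations of `concept`."""
    lower, upper = set(), set()
    for block in equivalence_classes(universe, key):
        if block <= concept:   # block lies entirely inside the concept
            lower |= block
        if block & concept:    # block overlaps the concept
            upper |= block
    return lower, upper


# Toy example: objects are described only by their remainder modulo 3.
universe = {1, 2, 3, 4, 5, 6}
concept = {1, 4, 5}
lower, upper = approximations(universe, lambda x: x % 3, concept)
print(lower, upper)
```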
ACM Transactions on Database Systems | 1987
S. K. M. Wong; Wojciech Ziarko; Vijay V. Raghavan; P. C. N. Wong
The Vector Space Model (VSM) has been adopted in information retrieval as a means of coping with inexact representation of documents and queries, and the resulting difficulties in determining the relevance of a document relative to a given query. The major problem in employing this approach is that the explicit representation of term vectors is not known a priori. Consequently, earlier researchers made the assumption that the vectors corresponding to terms are pairwise orthogonal. Such an assumption is clearly unrealistic. Although attempts have been made to compensate for this assumption by some separate, corrective steps, such methods are ad hoc and, in most cases, formally inconsistent. In this paper, a generalization of the VSM, called the GVSM, is advanced. The developments provide a solution not only for the computation of a measure of similarity (correlation) between terms, but also for the incorporation of these similarities into the retrieval process. The major strength of the GVSM derives from the fact that it is theoretically sound and elegant. Furthermore, experimental evaluation of the model on several test collections indicates that the performance is better than that of the VSM. Experiments have been performed on some variations of the GVSM, and all these results have also been compared to those of the VSM, based on inverse document frequency weighting. These results and some ideas for the efficient implementation of the GVSM are discussed.
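The idea of folding term correlations into the similarity measure can be sketched as a bilinear form over a term-term correlation matrix. This is a simplified stand-in and not the construction from the paper: here the correlation between two terms is assumed to be the cosine between their columns in a small document-term matrix, and all names and data are hypothetical.

```python
import math


def term_correlations(doc_term):
    """Cosine correlation between the term columns of a document-term matrix."""
    n_terms = len(doc_term[0])
    columns = [[row[j] for row in doc_term] for j in range(n_terms)]
    # Fall back to 1.0 for an all-zero column to avoid division by zero.
    norms = [math.sqrt(sum(v * v for v in col)) or 1.0 for col in columns]
    return [
        [
            sum(a * b for a, b in zip(columns[i], columns[j])) / (norms[i] * norms[j])
            for j in range(n_terms)
        ]
        for i in range(n_terms)
    ]


def similarity(doc, query, corr):
    """Compute sum_i sum_j doc[i] * corr[i][j] * query[j]."""
    return sum(
        doc[i] * corr[i][j] * query[j]
        for i in range(len(doc))
        for j in range(len(query))
    )


# Terms 0 and 1 always co-occur in this toy collection; term 2 stands alone.
doc_term = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
corr = term_correlations(doc_term)
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

doc, query = [0, 1, 0], [1, 0, 0]
orthogonal_score = similarity(doc, query, identity)  # pairwise-orthogonal terms
correlated_score = similarity(doc, query, corr)      # correlations included
print(orthogonal_score, correlated_score)
```

With the identity matrix, a document and query that share no terms score zero; with the correlation matrix, the co-occurring terms let the document match the query.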
Fuzzy Sets and Systems | 1987
S. K. M. Wong; Wojciech Ziarko
It is shown that the generalized notion (probabilistic approximate classification) of rough sets can be conveniently described by the concept of fuzzy sets. A discussion of the proper choice of the definition for the membership function of the intersection (union) of fuzzy sets is also presented. However, from the point of view of the probabilistic approximation space, it is argued that there does not exist a universal definition for the fuzzy intersection (union) operation.
ACM Transactions on Mathematical Software | 1988
S. J. Wan; S. K. M. Wong; P. Prusinkiewicz
A new divisive algorithm for multidimensional data clustering is suggested. Based on the minimization of the sum-of-squared-errors, the proposed method produces much smaller quantization errors than the median-cut and mean-split algorithms. It is also observed that the solutions obtained from our algorithm are close to the local optimal ones derived by the k-means iterative procedure.
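The general flavour of divisive clustering driven by the sum of squared errors (SSE) can be shown with a one-dimensional sketch. It is an assumption-laden simplification, not the authors' multidimensional algorithm: it repeatedly picks the cluster with the largest SSE and splits it at the cut point that minimizes the combined SSE of the two halves. All function names are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(values):
    """Split sorted values into two contiguous parts with minimal total SSE."""
    ordered = sorted(values)
    cut = min(
        range(1, len(ordered)),
        key=lambda k: sse(ordered[:k]) + sse(ordered[k:]),
    )
    return ordered[:cut], ordered[cut:]


def divisive_clustering(values, k):
    """Split the highest-SSE cluster until k clusters exist."""
    clusters = [sorted(values)]
    while len(clusters) < k:
        candidates = [c for c in clusters if len(c) > 1]
        if not candidates:
            break  # nothing left that can be split
        target = max(candidates, key=sse)
        clusters.remove(target)
        clusters.extend(best_split(target))
    return sorted(clusters)


clusters = divisive_clustering([1, 2, 3, 10, 11, 12, 50, 51], k=3)
print(clusters)
```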
International Journal of Human-computer Studies / International Journal of Man-machine Studies | 1986
S. K. M. Wong; Wojciech Ziarko; R. Li Ye
Quinlan suggested an inductive algorithm based on the statistical theory of information originally proposed by Shannon. Recently Pawlak showed that the principles of inductive learning (learning from examples) can be precisely formulated on the basis of the theory of rough sets. These two approaches are apparently very different, although in both methods objects in the knowledge base are assumed to be characterized by “features” (attributes and attribute values). The main objective of this paper is to show that the concept of “approximate classification” of a set is closely related to the statistical approach. In fact, in the design of inductive programs, the criterion for selecting dominant attributes based on the concept of rough sets is a special case of the statistical method if equally probable distribution of objects in the “doubtful region” of the approximation space is assumed.
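The statistical criterion mentioned here is commonly stated as information gain: the reduction in Shannon entropy of the class labels after partitioning by an attribute. The sketch below shows that calculation on a made-up table; the attribute names and data are hypothetical, and the rough-set criterion discussed in the paper is not implemented.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )


def information_gain(rows, attribute, target):
    """Entropy of `target` minus its expected entropy after splitting on `attribute`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    remainder = sum(
        len(group) / len(rows) * entropy(group) for group in groups.values()
    )
    return entropy([row[target] for row in rows]) - remainder


# Toy table: "colour" determines the class, "size" carries no information.
rows = [
    {"colour": "red", "size": "small", "label": "yes"},
    {"colour": "red", "size": "large", "label": "yes"},
    {"colour": "blue", "size": "small", "label": "no"},
    {"colour": "blue", "size": "large", "label": "no"},
]
gain_colour = information_gain(rows, "colour", "label")
gain_size = information_gain(rows, "size", "label")
print(gain_colour, gain_size)
```

An inductive learner using this criterion would select "colour" as the dominant attribute, since it has the larger gain.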
Systems, Man, and Cybernetics | 2000
S. K. M. Wong; Cory J. Butz; Dan Wu
The implication problem is to test whether a given set of independencies logically implies another independency. This problem is crucial in the design of a probabilistic reasoning system. We advocate that Bayesian networks are a generalization of standard relational databases. On the contrary, it has been suggested that Bayesian networks are different from the relational databases because the implication problem of these two systems does not coincide for some classes of probabilistic independencies. This remark, however, does not take into consideration one important issue, namely, the solvability of the implication problem. In this comprehensive study of the implication problem for probabilistic conditional independencies, it is emphasized that Bayesian networks and relational databases coincide on solvable classes of independencies. The present study suggests that the implication problem for these two closely related systems differs only in unsolvable classes of independencies. This means there is no real difference between Bayesian networks and relational databases, in the sense that only solvable classes of independencies are useful in the design and implementation of these knowledge systems. More importantly, perhaps, these results suggest that many current attempts to generalize Bayesian networks can take full advantage of the generalizations made to standard relational databases.