Martin Ester
Simon Fraser University
Publications
Featured research published by Martin Ester.
Bioinformatics | 2010
Nancy Y. Yu; James R. Wagner; Matthew R. Laird; Gabor Melli; Sébastien Rey; Raymond Lo; Phuong Dao; S. Cenk Sahinalp; Martin Ester; Leonard J. Foster; Fiona S. L. Brinkman
Motivation: PSORTb has remained the most precise bacterial protein subcellular localization (SCL) predictor since it was first made available in 2003. However, its recall needs to be improved, and no accurate SCL predictors yet make predictions for archaea or differentiate important localization subcategories, such as proteins targeted to a host cell or to bacterial hyperstructures/organelles. Such improvements should preferably be encompassed in a freely available web-based predictor that can also be used as a standalone program. Results: We developed PSORTb version 3.0 with improved recall, higher proteome-scale prediction coverage, and new refined localization subcategories. It is the first SCL predictor specifically geared to all prokaryotes, including archaea and bacteria with atypical membrane/cell wall topologies. It features an improved standalone program, with a new batch results delivery system complementing its web interface. We evaluated the most accurate SCL predictors using 5-fold cross-validation and performed an independent proteomics analysis, showing that PSORTb 3.0 is the most accurate but can benefit from being complemented by Proteome Analyst predictions. Availability: http://www.psort.org/psortb (download open source software or use the web interface). Contact: psort-mail@sfu.ca Supplementary Information: Supplementary data are available at Bioinformatics online.
Data Mining and Knowledge Discovery | 1998
Jörg Sander; Martin Ester; Hans-Peter Kriegel; Xiaowei Xu
The clustering algorithm DBSCAN relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape as well as to distinguish noise. In this paper, we generalize this algorithm in two important directions. The generalized algorithm, called GDBSCAN, can cluster point objects as well as spatially extended objects according to both their spatial and their nonspatial attributes. In addition, four applications using 2D points (astronomy), 3D points (biology), 5D points (earth science) and 2D polygons (geography) are presented, demonstrating the applicability of GDBSCAN to real-world problems.
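The density-based clustering idea underlying DBSCAN can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes Euclidean distance and a brute-force neighborhood query (the original uses an R*-tree for efficiency), and the toy data set is made up for demonstration.

```python
from math import dist

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Brute-force eps-neighborhood query (includes the point itself).
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:        # not a core point: tentatively noise
            labels[i] = -1
            continue
        cluster += 1                    # start a new cluster from this core point
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:         # former noise becomes a border point
                labels[j] = cluster
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            jn = neighbors(j)
            if len(jn) >= min_pts:      # j is itself core: keep expanding
                queue.extend(jn)
    return labels

# Two dense blobs and one isolated outlier:
labels = dbscan([(0, 0), (0, 1), (1, 0), (1, 1),
                 (10, 10), (10, 11), (11, 10), (11, 11),
                 (50, 50)], eps=2.0, min_pts=3)
```

With `eps=2.0` and `min_pts=3`, the two blobs come out as clusters 0 and 1, and the outlier is labeled noise (-1), regardless of the blobs' shapes.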
conference on recommender systems | 2010
Mohsen Jamali; Martin Ester
Recommender systems are becoming tools of choice to select the online information relevant to a given user. Collaborative filtering is the most popular approach to building recommender systems and has been successfully employed in many applications. With the advent of online social networks, the social network based approach to recommendation has emerged. This approach assumes a social network among users and makes recommendations for a user based on the ratings of the users that have direct or indirect social relations with the given user. As one of their major benefits, social network based approaches have been shown to reduce the problems with cold start users. In this paper, we explore a model-based approach for recommendation in social networks, employing matrix factorization techniques. Advancing previous work, we incorporate the mechanism of trust propagation into the model. Trust propagation has been shown to be a crucial phenomenon in the social sciences, in social network analysis and in trust-based recommendation. We have conducted experiments on two real life data sets, the public domain Epinions.com dataset and a much larger dataset that we have recently crawled from Flixster.com. Our experiments demonstrate that modeling trust propagation leads to a substantial increase in recommendation accuracy, in particular for cold start users.
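The core idea of combining matrix factorization with trust propagation can be sketched as follows. This is a simplified illustration in the spirit of the paper, not its actual model: each user's latent factor is additionally pulled toward the mean factor of the users they trust. The ratings, trust edges, and hyperparameter values are illustrative assumptions, and the social gradient is applied in a simplified one-directional form.

```python
import random

random.seed(0)
k = 2                                   # latent dimensionality (illustrative)
ratings = {(0, 1): 4.0, (0, 2): 1.0, (1, 1): 5.0, (2, 3): 3.0, (3, 0): 2.0}
trust = {0: [1], 1: [0, 2], 2: [1], 3: [2]}   # who each user trusts
U = [[0.1 * random.gauss(0, 1) for _ in range(k)] for _ in range(4)]
V = [[0.1 * random.gauss(0, 1) for _ in range(k)] for _ in range(5)]
lr, lam, lam_t = 0.05, 0.01, 0.5        # learning rate, regularization, trust weight

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def trusted_mean(u):
    f = trust[u]
    return [sum(U[v][d] for v in f) / len(f) for d in range(k)]

def loss():
    err = sum((r - dot(U[u], V[i])) ** 2 for (u, i), r in ratings.items())
    social = sum(sum((U[u][d] - m[d]) ** 2 for d in range(k))
                 for u in trust for m in [trusted_mean(u)])
    reg = sum(x * x for row in U + V for x in row)
    return err + lam * reg + lam_t * social

start = loss()
for _ in range(300):
    for (u, i), r in ratings.items():   # gradient step on the squared rating error
        e = r - dot(U[u], V[i])
        for d in range(k):
            U[u][d] += lr * (e * V[i][d] - lam * U[u][d])
            V[i][d] += lr * (e * U[u][d] - lam * V[i][d])
    for u in trust:                     # pull user factors toward trusted neighbors
        m = trusted_mean(u)
        for d in range(k):
            U[u][d] += lr * lam_t * (m[d] - U[u][d])
end = loss()
```

The trust term is what lets a cold start user (few or no ratings) inherit a useful latent profile from connected users, which is the mechanism the abstract credits for the accuracy gains.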
Bioinformatics | 2005
Jennifer L. Gardy; Matthew R. Laird; Fei Chen; Sébastien Rey; C. J. Walsh; Martin Ester; Fiona S. L. Brinkman
MOTIVATION PSORTb v.1.1 is the most precise bacterial localization prediction tool available. However, the program's predictive coverage and recall are low, and the method is only applicable to Gram-negative bacteria. The goals of the present work are as follows: increase PSORTb's coverage while maintaining the existing precision level, expand it to include Gram-positive bacteria, and then carry out a comparative analysis of localization. RESULTS An expanded database of proteins of known localization and new modules using frequent-subsequence-based support vector machines were introduced into PSORTb v.2.0. The program attains a precision of 96% for Gram-positive and Gram-negative bacteria and predictive coverage comparable to other tools for whole-proteome analysis. We show that the proportion of proteins at each localization is remarkably consistent across species, even in species with varying proteome size. AVAILABILITY Web-based version: http://www.psort.org/psortb. Standalone version: available through the website under the GNU General Public License. CONTACT psort-mail@sfu.ca, brinkman@sfu.ca SUPPLEMENTARY INFORMATION http://www.psort.org/psortb/supplementaryinfo.html.
knowledge discovery and data mining | 2009
Mohsen Jamali; Martin Ester
Collaborative filtering is the most popular approach to building recommender systems and has been successfully employed in many applications. However, it cannot make recommendations for so-called cold start users, who have rated only a very small number of items. In addition, these methods do not know how confident they are in their recommendations. Trust-based recommendation methods assume the additional knowledge of a trust network among users and can better deal with cold start users, since users only need to be connected to the trust network. On the other hand, the sparsity of the user-item ratings forces the trust-based approach to consider ratings of indirect neighbors that are only weakly trusted, which may decrease its precision. In order to find a good trade-off, we propose a random walk model combining the trust-based and the collaborative filtering approach to recommendation. The random walk model allows us to define and measure the confidence of a recommendation. We performed an evaluation on the Epinions dataset and compared our model with existing trust-based and collaborative filtering methods.
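A heavily simplified sketch of the random-walk idea: to predict user u's rating of item i, walks start at u and follow randomly chosen trusted neighbors until a user who has rated i is found or the walk exceeds a maximum depth. The prediction is the mean of the ratings the walks return, and the fraction of successful walks serves as a crude confidence measure. The full model additionally considers ratings of similar items and a depth-dependent stopping probability; all names and data below are illustrative.

```python
import random

def predict(ratings, trust, user, item, n_walks=1000, max_depth=6, seed=0):
    """Trust-based random-walk prediction: (predicted rating, confidence)."""
    rng = random.Random(seed)
    found = []
    for _ in range(n_walks):
        u = user
        for _ in range(max_depth):
            if (u, item) in ratings:          # walk reached a rater of the item
                found.append(ratings[(u, item)])
                break
            if not trust.get(u):              # dead end: user trusts nobody
                break
            u = rng.choice(trust[u])          # step to a random trusted neighbor
    if not found:
        return None, 0.0
    return sum(found) / len(found), len(found) / n_walks

# User 0 has no ratings but trusts two users who rated the item:
ratings = {(1, "movie"): 4.0, (2, "movie"): 2.0}
trust = {0: [1, 2]}
pred, conf = predict(ratings, trust, user=0, item="movie")
```

Here every walk succeeds in one step, so the confidence is 1.0 and the prediction lies between the two neighbors' ratings; for weakly connected users, fewer walks succeed and the confidence drops accordingly.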
knowledge discovery and data mining | 2002
Florian W. Beil; Martin Ester; Xiaowei Xu
Text clustering methods can be used to structure large sets of text or hypertext documents. The well-known methods of text clustering, however, do not really address the special problems of text clustering: very high dimensionality of the data, very large size of the databases and understandability of the cluster description. In this paper, we introduce a novel approach which uses frequent item (term) sets for text clustering. Such frequent sets can be efficiently discovered using algorithms for association rule mining. To cluster based on frequent term sets, we measure the mutual overlap of frequent sets with respect to the sets of supporting documents. We present two algorithms for frequent term-based text clustering, FTC which creates flat clusterings and HFTC for hierarchical clustering. An experimental evaluation on classical text documents as well as on web documents demonstrates that the proposed algorithms obtain clusterings of comparable quality significantly more efficiently than state-of-the-art text clustering algorithms. Furthermore, our methods provide an understandable description of the discovered clusters by their frequent term sets.
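The flat variant can be sketched as a greedy cover: candidate clusters are the frequent term sets (term sets supported by at least `min_sup` documents), and clusters are chosen so that each next term set overlaps as little as possible with the documents already covered. This is a simplified illustration in the spirit of FTC, using a naive overlap count rather than the paper's overlap measure, a brute-force frequent-set enumeration rather than association rule mining, and a toy corpus.

```python
from itertools import combinations

def frequent_term_sets(docs, min_sup, max_size=2):
    """Enumerate term sets (up to max_size terms) supported by >= min_sup docs."""
    terms = sorted({t for d in docs for t in d})
    out = []
    for size in range(1, max_size + 1):
        for ts in combinations(terms, size):
            cover = [i for i, d in enumerate(docs) if set(ts) <= d]
            if len(cover) >= min_sup:
                out.append((ts, cover))
    return out

def ftc(docs, min_sup):
    """Greedy flat clustering: each cluster is described by a frequent term set."""
    candidates = frequent_term_sets(docs, min_sup)
    covered, clusters = set(), []
    while len(covered) < len(docs) and candidates:
        # Pick the term set whose supporting documents overlap least
        # with the documents already assigned to a cluster.
        ts, cover = min(candidates, key=lambda c: len(set(c[1]) & covered))
        candidates.remove((ts, cover))
        new = [i for i in cover if i not in covered]
        if new:
            clusters.append((ts, new))
            covered |= set(new)
    return clusters

docs = [{"a", "b"}, {"a", "b"}, {"c", "d"}, {"c", "d"}, {"a", "c"}]
clusters = ftc(docs, min_sup=2)
```

Each resulting cluster carries its frequent term set as a human-readable label, which is the understandability benefit the abstract highlights.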
Lecture Notes in Computer Science | 1997
Martin Ester; Hans-Peter Kriegel; Jörg Sander
Knowledge discovery in databases (KDD) is an important task in spatial databases since both the number and the size of such databases are rapidly growing. This paper introduces a set of basic operations which should be supported by a spatial database system (SDBS) to express algorithms for KDD in SDBS. For this purpose, we introduce the concepts of neighborhood graphs and paths and a small set of operations for their manipulation. We argue that these operations are sufficient for KDD algorithms considering spatial neighborhood relations by presenting the implementation of four typical spatial KDD algorithms based on the proposed operations. Furthermore, the efficient support of operations on large neighborhood graphs and on large sets of neighborhood paths by the SDBS is discussed. Neighborhood indices are introduced to materialize selected neighborhood graphs in order to speed up the processing of the proposed operations.
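The neighborhood-graph abstraction can be sketched in a few lines: nodes are spatial objects, and an edge connects two objects whenever a chosen neighborhood relation holds between them; neighborhood paths are then walks in this graph. The distance-threshold relation below is one illustrative choice (the paper also covers topological and direction relations), and the objects are toy data.

```python
from math import dist

def neighborhood_graph(objects, relation):
    """Adjacency lists for the graph induced by a binary neighborhood relation."""
    return {a: [b for b in objects if b != a and relation(objects[a], objects[b])]
            for a in objects}

def neighborhood_paths(graph, start, max_len):
    """Enumerate simple neighborhood paths of at most max_len nodes from start."""
    paths, stack = [], [[start]]
    while stack:
        p = stack.pop()
        paths.append(p)
        if len(p) < max_len:
            for nxt in graph[p[-1]]:
                if nxt not in p:        # keep paths simple (no revisits)
                    stack.append(p + [nxt])
    return paths

objects = {"a": (0, 0), "b": (1, 0), "c": (5, 5)}
g = neighborhood_graph(objects, lambda p, q: dist(p, q) <= 2)
paths = neighborhood_paths(g, "a", max_len=2)
```

A neighborhood index in the paper's sense would simply materialize `g` (for selected relations) so that repeated graph and path operations avoid recomputing the relation.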
international conference on data engineering | 1998
Xiaowei Xu; Martin Ester; Hans-Peter Kriegel; Jörg Sander
The problem of detecting clusters of points belonging to a spatial point process arises in many applications. In this paper, we introduce the new clustering algorithm DBCLASD (Distribution-Based Clustering of LArge Spatial Databases) to discover clusters of this type. The results of experiments demonstrate that DBCLASD, contrary to partitioning algorithms such as CLARANS (Clustering Large Applications based on RANdomized Search), discovers clusters of arbitrary shape. Furthermore, DBCLASD does not require any input parameters, in contrast to the clustering algorithm DBSCAN (Density-Based Spatial Clustering of Applications with Noise) requiring two input parameters, which may be difficult to provide for large databases. In terms of efficiency, DBCLASD is between CLARANS and DBSCAN, close to DBSCAN. Thus, the efficiency of DBCLASD on large spatial databases is very attractive when considering its nonparametric nature and its good quality for clusters of arbitrary shape.
SSD '95 Proceedings of the 4th International Symposium on Advances in Spatial Databases | 1995
Martin Ester; Hans-Peter Kriegel; Xiaowei Xu
Both the number and the size of spatial databases are rapidly growing because of the large amount of data obtained from satellite images, X-ray crystallography or other scientific equipment. Therefore, automated knowledge discovery becomes more and more important in spatial databases. So far, most of the methods for knowledge discovery in databases (KDD) have been based on relational database systems. In this paper, we address the task of class identification in spatial databases using clustering techniques. We put special emphasis on the integration of the discovery methods with the DB interface, which is crucial for the efficiency of KDD on large databases. The key to this integration is the use of a well-known spatial access method, the R*-tree. The focusing component of a KDD system determines which parts of the database are relevant for the knowledge discovery task. We present several strategies for focusing: selecting representatives from a spatial database, focusing on the relevant clusters and retrieving all objects of a given cluster. We have applied the proposed techniques to real data from a large protein database used for predicting protein-protein docking. A performance evaluation on this database indicates that clustering on large spatial databases can be performed both efficiently and effectively.
knowledge discovery and data mining | 1999
Mihael Ankerst; Christian Elsen; Martin Ester; Hans-Peter Kriegel
Satisfying the basic requirements of accuracy and understandability of a classifier, decision tree classifiers have become very popular. Instead of constructing the decision tree by a sophisticated algorithm, we introduce a fully interactive method based on a multidimensional visualization technique and appropriate interaction capabilities. Thus, domain knowledge of an expert can be profitably included in the tree construction phase. Furthermore, after the interactive construction of a decision tree, the user has a much deeper understanding of the data.