Kadim Tasdemir
Rice University
Publication
Featured research published by Kadim Tasdemir.
IEEE Transactions on Neural Networks | 2009
Kadim Tasdemir; Erzsébet Merényi
The self-organizing map (SOM) is a powerful method for visualization, cluster extraction, and data mining. It has been used successfully for data of high dimensionality and complexity where traditional methods may often be insufficient. In order to analyze data structure and capture cluster boundaries from the SOM, one common approach is to represent the SOM's knowledge by visualization methods. Different aspects of the information learned by the SOM are presented by existing methods, but data topology, which is present in the SOM's knowledge, is greatly underutilized. We show in this paper that data topology can be integrated into the visualization of the SOM and thereby provide a more elaborate view of the cluster structure than existing schemes. We achieve this by introducing a weighted Delaunay triangulation (a connectivity matrix) and draping it over the SOM. This new visualization, CONNvis, also shows both forward and backward topology violations along with the severity of forward ones, which indicate the quality of the SOM learning and the data complexity. CONNvis greatly assists in detailed identification of cluster boundaries. We demonstrate the capabilities on synthetic data sets and on a real 8D remote sensing spectral image.
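The abstract does not spell out how the connectivity matrix is weighted. As a minimal sketch, assume the weight between two prototypes is the number of data points whose best and second-best matching prototypes are that pair. This rule, the function name `connectivity_matrix`, and the toy data are illustrative assumptions, not taken from the paper.

```python
from math import dist  # Python 3.8+


def connectivity_matrix(data, prototypes):
    """Count, for each prototype pair, the data points whose two nearest
    prototypes are that pair (assumed weighting, see lead-in)."""
    n = len(prototypes)
    conn = [[0] * n for _ in range(n)]
    for x in data:
        # Rank prototype indices by Euclidean distance to the data point.
        ranked = sorted(range(n), key=lambda k: dist(x, prototypes[k]))
        best, second = ranked[0], ranked[1]
        conn[best][second] += 1
        conn[second][best] += 1  # keep the matrix symmetric
    return conn


# Hypothetical toy data: three prototypes on a line, four data points.
prototypes = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
data = [(0.1, 0.0), (0.4, 0.1), (0.9, 0.0), (4.0, 0.0)]
conn = connectivity_matrix(data, prototypes)
print(conn)  # [[0, 3, 0], [3, 0, 1], [0, 1, 0]]
```

A zero entry means no data point lies between the two prototypes, so a weak or missing connection between neighboring SOM units hints at a cluster boundary.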
urban remote sensing joint event | 2007
Erzsébet Merényi; Beata Csatho; Kadim Tasdemir
With all the exciting advances in sensor fusion and data interpretation technologies in recent years, including co-registration, 3-D surface reconstruction, object recognition, spatial reasoning, and more, high-quality, detailed, and precise segmentation of remote sensing spectral images remains a much needed key component in the comprehensive analysis and understanding of surfaces. Urban surfaces are no exception. In fact, urban surfaces can pose a greater challenge than many other types because of the very large variety of materials concentrated in relatively small areas. Segmentation (unsupervised clustering) or supervised classification based on spectral signatures from multi- and hyperspectral imagery, or based on other multidimensional signatures from stacked disparate (multi-source) imagery, provides delineation of materials with various compositional and physical properties in a scene. Such a cluster or classification map lends critical support to further reasoning for accurate identification of surface objects and conditions. It is, therefore, imperative to develop methods whose data exploitation power matches the discriminating power of the data acquisition instrument. We present a study of unsupervised segmentation, comparing the performances of ISODATA clustering and self-organized manifold learning on an urban image from a Daedalus multi-spectral scanner and on an AVIRIS hyperspectral image.
systems man and cybernetics | 2011
Kadim Tasdemir; Erzsébet Merényi
Evaluation of how well the extracted clusters fit the true partitions of a data set is one of the fundamental challenges in unsupervised clustering because the data structure and the number of clusters are unknown a priori. Cluster validity indices are commonly used to select the best partitioning from different clustering results; however, they are often inadequate unless clusters are well separated or have parametrical shapes. Prototype-based clustering (finding of clusters by grouping the prototypes obtained by vector quantization of the data), which is becoming increasingly important for its effectiveness in the analysis of large high-dimensional data sets, adds another dimension to this challenge. For validity assessment of prototype-based clusterings, previously proposed indices (mostly devised for the evaluation of point-based clusterings) usually perform poorly. The poor performance is made worse when the validity indices are applied to large data sets with complicated cluster structure. In this paper, we propose a new index, Conn_Index, which can be applied to data sets with a wide variety of clusters of different shapes, sizes, densities, or overlaps. We construct Conn_Index based on inter- and intra-cluster connectivities of prototypes. Connectivities are defined through a “connectivity matrix”, which is a weighted Delaunay graph where the weights indicate the local data distribution. Experiments on synthetic and real data indicate that Conn_Index outperforms the existing validity indices considered in this paper for the evaluation of prototype-based clustering results.
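The notion of intra- and inter-cluster connectivity can be illustrated with a simplified sketch. This is not the published Conn_Index formula, which the abstract does not give. The function `connectivity_split`, the example matrix, and the labels are hypothetical. The sketch only splits the total connection weight of a symmetric connectivity matrix into the part inside clusters and the part across cluster boundaries.

```python
def connectivity_split(conn, labels):
    """Return (intra, inter): total connection weight between prototypes
    in the same cluster and between prototypes in different clusters."""
    intra = inter = 0
    n = len(conn)
    for i in range(n):
        for j in range(i + 1, n):  # visit each undirected pair once
            if labels[i] == labels[j]:
                intra += conn[i][j]
            else:
                inter += conn[i][j]
    return intra, inter


# Hypothetical symmetric connectivity matrix for three prototypes,
# with prototypes 0 and 1 assigned to one cluster and prototype 2 to another.
conn = [[0, 3, 0], [3, 0, 1], [0, 1, 0]]
labels = [0, 0, 1]
intra, inter = connectivity_split(conn, labels)
intra_share = intra / (intra + inter)
print(intra, inter, intra_share)  # 3 1 0.75
```

A partition that keeps strongly connected prototypes together and cuts only weak connections gives a high intra share. A validity index built on connectivities rewards that pattern.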
Similarity-Based Clustering | 2009
Erzsébet Merényi; Kadim Tasdemir; Lili Zhang
In this paper we elaborate on the challenges of learning manifolds that have many relevant clusters, and where the clusters can have widely varying statistics. We call such data manifolds highly structured. We describe approaches to structure identification through self-organized learning, in the context of such data. We present some of our recently developed methods to show that self-organizing neural maps contain a great deal of information that can be unleashed and put to use to achieve detailed and accurate learning of highly structured manifolds, and we also offer some comparisons with existing clustering methods on real data.
international symposium on neural networks | 2007
Kadim Tasdemir; Erzsébet Merényi
One of the fundamental challenges of clustering is how to evaluate, without auxiliary information, to what extent the obtained clusters fit the natural partitions of the data set. A common approach for evaluation of clustering results is to use validity indices. We propose a new validity index, Conn Index, for prototype based clustering. Conn Index is applicable to data sets with a wide variety of cluster characteristics (different shapes, sizes, densities, overlaps). We construct Conn Index based on inter- and intra-cluster connectivities of prototypes, which are found through a weighted Delaunay triangulation called connectivity matrix (1), where the weights indicate the data distribution. We compare the performance of Conn Index to commonly used indices on synthetic and real data sets.
I. INTRODUCTION
Clustering means splitting a data set into groups such that the data samples within a group are more similar to each other than to the data samples in other groups. Clustering is done with many methods which can be categorized in several ways where the two major ones are partitioning and hierarchical clustering. For any method, clustering the data directly becomes computationally heavy as the size of the data set increases. In order to significantly reduce the computational cost, two-step algorithms have been proposed (2), (3), (4), (5). Two-step algorithms (prototype based clustering) first find the quantization prototypes of data, and then cluster the prototypes. Using the prototypes instead of data can also reduce noise because the prototypes are the local averages of the data. A widely and successfully used neural paradigm for finding prototypes is the Self-Organizing Map (SOM). The SOM is a spatially ordered quantization of a data space where the quantization prototypes are adaptively determined for optimal approximation of the (unknown) distribution of the data.
The SOM also facilitates visualization of the structure of a higher-dimensional data space in one or two dimensions, which can guide semi-manual clustering. Thus, the SOM is a powerful aid in capturing clusters in high-dimensional intricate data sets (1), (2), (3), (6). With any clustering method, whether clustering the data itself or its prototypes, the main problems are to determine the number of clusters and to evaluate the validity of the clusters. A validity measure of the clustering ideally shows
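The two-step (prototype based) scheme described in the introduction can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method. Plain k-means stands in for the SOM as the quantizer, and a simple distance-threshold merge stands in for the prototype clustering step. The names `quantize` and `cluster_prototypes`, the `radius` parameter, and the toy data are all hypothetical.

```python
from math import dist  # Python 3.8+


def quantize(data, k, iters=10):
    """Step 1: find k prototypes by k-means (a stand-in for SOM learning)."""
    step = len(data) // k
    protos = [data[i * step] for i in range(k)]  # deterministic initialization
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: dist(x, protos[j]))
            groups[nearest].append(x)
        # Each prototype becomes the local average of its assigned data;
        # a prototype with no assigned data keeps its old position.
        protos = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else p
            for g, p in zip(groups, protos)
        ]
    return protos


def cluster_prototypes(protos, radius):
    """Step 2: merge prototypes that lie closer than radius to each other."""
    labels = list(range(len(protos)))
    for i in range(len(protos)):
        for j in range(i + 1, len(protos)):
            if dist(protos[i], protos[j]) < radius:
                old, new = labels[j], labels[i]
                labels = [new if lab == old else lab for lab in labels]
    return labels


# Hypothetical toy data: two well-separated groups of four points each.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
protos = quantize(data, k=4)
proto_labels = cluster_prototypes(protos, radius=2.0)
# Each data point inherits the cluster label of its nearest prototype.
data_labels = [
    proto_labels[min(range(len(protos)), key=lambda j: dist(x, protos[j]))]
    for x in data
]
print(proto_labels)  # [0, 0, 2, 2]
print(data_labels)   # [0, 0, 0, 0, 2, 2, 2, 2]
```

The second step works on four prototypes instead of eight data points. On large data sets this reduction is the source of the computational saving the text mentions.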
Proceedings of SPIE, the International Society for Optical Engineering | 2008
Erzsébet Merényi; Kadim Tasdemir; William H. Farrand
Effective scientific exploration of remote targets such as solar system objects increasingly calls for autonomous data analysis and decision making on-board. Today, robots in space missions are programmed to traverse from one location to another without regard to what they might be passing by. By not processing data as they travel, they can miss important discoveries, or will need to travel back if scientists on Earth find the data warrant backtracking. This is a suboptimal use of resources even on relatively close targets such as the Moon or Mars. The farther mankind ventures into space, the longer the delay in communication, due to which interesting findings from data sent back to Earth are made too late to command a (roving, floating, or orbiting) robot to further examine a given location. However, autonomous commanding of robots in scientific exploration can only be as reliable as the scientific information extracted from the data that is collected and provided for decision making. In this paper, we focus on the discovery scenario, where information extraction is accomplished with unsupervised clustering. For high-dimensional data with complicated structure, detailed segmentation that identifies all significant groups and discovers the small, surprising anomalies in the data, is a challenging task at which conventional algorithms often fail. We approach the problem with precision manifold learning using self-organizing neural maps with non-standard features developed in the course of our research. We demonstrate the effectiveness and robustness of this approach on multi-spectral imagery from the Mars Exploration Rovers Pancam, and on synthetic hyperspectral imagery.
discovery science | 2008
Kadim Tasdemir; Erzsébet Merényi
The Self-Organizing Map (SOM), a powerful method for clustering and knowledge discovery, has been used effectively for remote sensing spectral images which often have high-dimensional feature vectors (spectra) and many meaningful clusters with varying statistics. However, a learned SOM needs postprocessing to identify the clusters, which is typically done interactively from various visualizations. Which aspects of the SOM's knowledge a visualization presents is of great importance for cluster capture. We present our recent scheme, CONNvis, which achieves detailed delineation of cluster boundaries by rendering data topology on the SOM lattice. We show discovery through CONNvis clustering in a remote sensing spectral image from the Mars Exploration Rover Spirit.
international conference on system of systems engineering | 2007
Erzsébet Merényi; Lili Zhang; Kadim Tasdemir
Fast identification of critical information in a changing environment is difficult, yet it is key to dynamical decision support in general. Finding critical information in large and complex data volumes is a challenge that real systems, and systems of systems, increasingly pose. Moreover, many of these real systems are desired to operate highly autonomously, using extracted critical information and discovered and distilled knowledge directly for decisions. Spacecraft or rover navigation based on scientific findings derived by onboard computation from continuously collected data is one example. This highlights the importance of the quality of information extraction. The knowledge discovery process must be intelligent enough to produce useful details; reliable; robust; and fast. This paper focuses on the first three of these quality aspects through precision manifold learning, in an onboard decision making scenario of a space mission.
the european symposium on artificial neural networks | 2006
Kadim Tasdemir; Erzsébet Merényi
Computers and Electronics in Agriculture | 2012
Kadim Tasdemir; Pavel Milenov; Brooke Tapsall