Manjeet Rege
Rochester Institute of Technology
Publication
Featured research published by Manjeet Rege.
Knowledge and Information Systems | 2008
Yanhua Chen; Manjeet Rege; Ming Dong; Jing Hua
Traditional clustering algorithms are inapplicable to many real-world problems where limited knowledge from domain experts is available. Incorporating the domain knowledge can guide a clustering algorithm, consequently improving the quality of clustering. In this paper, we propose SS-NMF: a semi-supervised non-negative matrix factorization framework for data clustering. In SS-NMF, users are able to provide supervision for clustering in terms of pairwise constraints on a few data objects specifying whether they “must” or “cannot” be clustered together. Through an iterative algorithm, we perform symmetric tri-factorization of the data similarity matrix to infer the clusters. Theoretically, we show the correctness and convergence of SS-NMF. Moreover, we show that SS-NMF provides a general framework for semi-supervised clustering, of which existing approaches can be considered special cases. Through extensive experiments conducted on publicly available datasets, we demonstrate the superior performance of SS-NMF for clustering.
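The abstract does not say how the pairwise constraints enter the similarity matrix, so the following is only a minimal sketch of one plausible encoding: must-link pairs get a similarity bonus and cannot-link pairs get a penalty, clipped at zero so the matrix stays non-negative for a later factorization. The function name `apply_pairwise_constraints` and the `reward` and `penalty` parameters are hypothetical and do not come from the paper.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[int, int]


def apply_pairwise_constraints(
    similarity: List[List[float]],
    must_link: Iterable[Pair],
    cannot_link: Iterable[Pair],
    reward: float = 1.0,   # assumed bonus for must-link pairs
    penalty: float = 1.0,  # assumed penalty for cannot-link pairs
) -> List[List[float]]:
    """Return a copy of a symmetric similarity matrix adjusted by constraints."""
    # Copy row by row so the caller's matrix is left untouched.
    adjusted = [row[:] for row in similarity]
    for i, j in must_link:
        # Raise similarity symmetrically for pairs that must share a cluster.
        adjusted[i][j] += reward
        adjusted[j][i] += reward
    for i, j in cannot_link:
        # Lower similarity, clipping at zero to keep the matrix non-negative.
        adjusted[i][j] = max(0.0, adjusted[i][j] - penalty)
        adjusted[j][i] = max(0.0, adjusted[j][i] - penalty)
    return adjusted


# Three objects with uniform off-diagonal similarity.
sim = [[1.0, 0.5, 0.5],
       [0.5, 1.0, 0.5],
       [0.5, 0.5, 1.0]]
adjusted = apply_pairwise_constraints(
    sim, must_link=[(0, 1)], cannot_link=[(1, 2)], reward=0.5, penalty=1.0
)
```

In this sketch the adjusted matrix, rather than the raw one, would be handed to whichever factorization routine infers the clusters. How strongly to weight the constraints is a design choice the abstract leaves open.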
International Conference on Data Mining | 2006
Manjeet Rege; Ming Dong; Farshad Fotouhi
In this paper, we present a novel graph theoretic approach to the problem of document-word co-clustering. In our approach, documents and words are modeled as the two sets of vertices of a bipartite graph. We then propose the Isoperimetric Co-clustering Algorithm (ICA), a new method for partitioning the document-word bipartite graph. ICA requires a simple solution to a sparse system of linear equations instead of the eigenvalue or SVD problem in the popular spectral co-clustering approach. Our extensive experiments performed on publicly available datasets demonstrate the advantages of ICA over the spectral approach in terms of the quality, efficiency and stability in partitioning the document-word bipartite graph.
International World Wide Web Conferences | 2008
Manjeet Rege; Ming Dong; Jing Hua
With the explosive growth of the Web and recent developments in digital media technology, the number of images on the Web has grown tremendously. Consequently, Web image clustering has emerged as an important application. Some of the initial efforts in this direction revolved around clustering Web images based on the visual features of images or on textual features drawn from the text surrounding the images. However, not much work has been done on using multimodal information for clustering Web images. In this paper, we propose a graph theoretical framework for simultaneously integrating visual and textual features for efficient Web image clustering. Specifically, we model visual features, images and words from surrounding text using a tripartite graph. Partitioning this graph leads to clustering of the Web images. Although graph partitioning approaches have been adopted before, the main contribution of this work lies in a new algorithm that we propose, Consistent Isoperimetric High-order Co-clustering (CIHC), for partitioning the tripartite graph. Computationally, CIHC is very quick as it requires a simple solution to a sparse system of linear equations. Our theoretical analysis and extensive experiments performed on real Web images demonstrate the performance of CIHC in terms of the quality, efficiency and scalability in partitioning the visual feature-image-word tripartite graph.
International Conference on Web Services | 2010
Qi Yu; Manjeet Rege
Efficient and accurate discovery of user-desired Web services is a key component for achieving the full potential of service computing. However, service discovery is a non-trivial task considering the large and fast-growing service space. Meanwhile, Web services are typically autonomous and a priori unknown, which further complicates the service discovery problem. We propose a service community learning algorithm that can generate homogeneous communities from the heterogeneous service space. This can greatly facilitate the service discovery process as the users only need to search within their desired service communities. A key ingredient of the community learning algorithm is a co-clustering scheme that leverages the duality relationship between services and operations. Experimental results on both synthetic and real Web services demonstrate the effectiveness of the proposed service community learning algorithm.
IEEE Transactions on Knowledge and Data Engineering | 2012
Lijun Wang; Manjeet Rege; Ming Dong; Yongsheng Ding
Traditional clustering techniques are inapplicable to problems where the relationships between data points evolve over time. Not only is it important for the clustering algorithm to adapt to the recent changes in the evolving data, but it also needs to take the historical relationship between the data points into consideration. In this paper, we propose ECKF, a general framework for evolutionary clustering of large-scale data based on low-rank kernel matrix factorization. To the best of our knowledge, this is the first work that clusters large evolutionary data sets by combining low-rank matrix approximation methods with matrix factorization-based clustering. Since the low-rank approximation provides a compact representation of the original matrix and, in particular, the near-optimal low-rank approximation can preserve the sparsity of the original data, ECKF gains computational efficiency and hence is applicable to large evolutionary data sets. Moreover, matrix factorization-based methods have been shown to effectively cluster high-dimensional data in text mining and multimedia data analysis. From a theoretical standpoint, we mathematically prove the convergence and correctness of ECKF, and provide detailed analysis of its computational efficiency (both time and space). Through extensive experiments performed on synthetic and real data sets, we show that ECKF outperforms the existing methods in evolutionary clustering.
Service-Oriented Computing and Applications | 2010
Qi Yu; Manjeet Rege; Athman Bouguettaya; Brahim Medjahed; Mourad Ouzzani
Service-oriented computing is gaining momentum as the next technological tool to leverage the huge investments in Web application development. The expected large number of Web services poses a set of new challenges for efficiently accessing these services. We propose an integrated service query framework that helps users access their desired services. The framework incorporates a service query model and a two-phase optimization strategy. The query model defines service communities that are used to organize the large and heterogeneous service space. The service communities allow users to use declarative queries to retrieve their desired services without worrying about the underlying technical details. The two-phase optimization strategy automatically generates feasible service execution plans and selects the plan with the best user-desired quality. In particular, we present an evolutionary algorithm that is able to “co-evolve” multiple feasible execution plans simultaneously and allows them to compete with each other to generate the best plan. We conduct a set of experiments to assess the performance of the proposed algorithms.
Data Mining and Knowledge Discovery | 2008
Manjeet Rege; Ming Dong; Farshad Fotouhi
Data co-clustering refers to the problem of simultaneous clustering of two data types. Typically, the data is stored in a contingency or co-occurrence matrix C where rows and columns of the matrix represent the data types to be co-clustered. An entry Cij of the matrix signifies the relation between the data type represented by row i and column j. Co-clustering is the problem of deriving sub-matrices from the larger data matrix by simultaneously clustering rows and columns of the data matrix. In this paper, we present a novel graph theoretic approach to data co-clustering. The two data types are modeled as the two sets of vertices of a weighted bipartite graph. We then propose the Isoperimetric Co-clustering Algorithm (ICA), a new method for partitioning the bipartite graph. ICA requires a simple solution to a sparse system of linear equations instead of the eigenvalue or SVD problem in the popular spectral co-clustering approach. Our theoretical analysis and extensive experiments performed on publicly available datasets demonstrate the advantages of ICA over other approaches in terms of the quality, efficiency and stability in partitioning the bipartite graph.
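To make the "linear system instead of eigenproblem" idea concrete, here is a rough sketch of generic isoperimetric graph partitioning applied to the bipartite graph built from a small matrix C. It is not the authors' ICA implementation. The ground-vertex choice (largest degree), the threshold sweep over the isoperimetric ratio, the dense Gaussian elimination and all function names are assumptions made for illustration. The sketch also assumes a connected graph in which every vertex has positive degree.

```python
from typing import List, Set, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented copy, so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def isoperimetric_cocluster(c: List[List[float]]) -> Tuple[Set[str], Set[str]]:
    """Split rows ("r0", ...) and columns ("c0", ...) of c into two co-clusters."""
    rows, cols = len(c), len(c[0])
    n = rows + cols
    # Symmetric weights of the bipartite graph: row vertices first, then columns.
    w = [[0.0] * n for _ in range(n)]
    for i in range(rows):
        for j in range(cols):
            w[i][rows + j] = w[rows + j][i] = float(c[i][j])
    degree = [sum(row) for row in w]
    ground = max(range(n), key=lambda v: degree[v])  # assumed ground vertex
    keep = [v for v in range(n) if v != ground]
    # Reduced Laplacian L0 = D0 - W0 with the ground row and column removed.
    l0 = [[(degree[u] if u == v else 0.0) - w[u][v] for v in keep] for u in keep]
    x0 = solve_linear_system(l0, [degree[v] for v in keep])
    potential = [0.0] * n  # the ground vertex keeps potential 0
    for v, value in zip(keep, x0):
        potential[v] = value
    order = sorted(range(n), key=lambda v: potential[v])
    total_volume = sum(degree)
    best_ratio, best_k = float("inf"), 1
    # Sweep thresholds and keep the split with the lowest isoperimetric ratio.
    for k in range(1, n):
        inside = set(order[:k])
        cut = sum(w[u][v] for u in inside for v in range(n) if v not in inside)
        volume = sum(degree[u] for u in inside)
        ratio = cut / min(volume, total_volume - volume)
        if ratio < best_ratio:
            best_ratio, best_k = ratio, k

    def name(v: int) -> str:
        return f"r{v}" if v < rows else f"c{v - rows}"

    return ({name(v) for v in order[:best_k]},
            {name(v) for v in order[best_k:]})


first, second = isoperimetric_cocluster([[5, 1], [1, 5]])
print(sorted(first), sorted(second))  # ['c0', 'r0'] ['c1', 'r1']
```

For real data the reduced Laplacian is sparse, so a sparse solver would replace the dense elimination used here. The dense version only keeps the sketch dependency-free.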
International Conference on Web Services | 2010
Xumin Liu; Chunmei Liu; Manjeet Rege; Athman Bouguettaya
We propose an integrated framework that manages changes in long-term composed services. The main procedure of change reaction is presented. One of the most challenging research issues in change management is how to automate the process of change reaction. To address this issue, we propose a semantic support mechanism, which centers around a tree-structured Web service ontology. The ontology is expected to provide sufficient semantics for change reaction. We propose a set of algorithms for efficiently querying semantics from the ontology. We conduct a set of experiments to evaluate the performance of the proposed algorithms.
International Conference on Image Processing | 2006
Manjeet Rege; Ming Dong; Farshad Fotouhi
In this paper, we present a novel idea of co-clustering image features and semantic concepts. We accomplish this by modelling user feedback logs and low-level features using a bipartite graph. Our experiments demonstrate that (1) incorporating semantic information achieves better image clustering and (2) feature selection in co-clustering narrows the semantic gap, thus enabling efficient image retrieval.
International Symposium on Neural Networks | 2011
Nathan Green; Manjeet Rege; Xumin Liu; Reynold J. Bailey
Co-clustering is the problem of deriving sub-matrices from the larger data matrix by simultaneously clustering rows and columns of the data matrix. Traditional co-clustering techniques are inapplicable to problems where the relationship between the instances (rows) and features (columns) evolves over time. Not only is it important for the clustering algorithm to adapt to the recent changes in the evolving data, but it also needs to take the historical relationship between the instances and features into consideration. We present ESCC, a general framework for evolutionary spectral co-clustering. We are able to efficiently co-cluster evolving data by incorporating historical clustering results. Under the proposed framework, we present two approaches, Respect To the Current (RTC) and Respect To Historical (RTH). The two approaches differ in the way the historical cost is computed. In RTC, the present clustering quality is of most importance and the historical cost is calculated with only one previous time-step. RTH, on the other hand, attempts to keep instances and features tied to the same clusters between time-steps. Extensive experiments performed on synthetic and real-world data demonstrate the effectiveness of the approach.