Vivekanand Gopalkrishnan
Nanyang Technological University
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Vivekanand Gopalkrishnan.
ACM Transactions on Intelligent Systems and Technology | 2011
Bin Li; Steven C. H. Hoi; Vivekanand Gopalkrishnan
Machine learning techniques have been adopted to select portfolios from financial markets in some emerging intelligent business applications. In this article, we propose a novel learning-to-trade algorithm termed CORrelation-driven Nonparametric learning strategy (CORN) for actively trading stocks. CORN effectively exploits statistical relations between stock market windows via a nonparametric learning approach. We evaluate the empirical performance of our algorithm extensively on several large historical and latest real stock markets, and show that it can easily beat both the market index and the best stock in the market substantially (without or with small transaction costs), and also surpass a variety of state-of-the-art techniques significantly.
database systems for advanced applications | 2010
Hoang Vu Nguyen; Hock Hee Ang; Vivekanand Gopalkrishnan
Outlier detection has many practical applications, especially in domains that have scope for abnormal behavior. Despite the importance of detecting outliers, defining outliers in fact is a nontrivial task which is normally application-dependent. On the other hand, detection techniques are constructed around the chosen definitions. As a consequence, available detection techniques vary significantly in terms of accuracy, performance and issues of the detection problem which they address. In this paper, we propose a unified framework for combining different outlier detection algorithms. Unlike existing work, our approach combines non-compatible techniques of different types to improve the outlier detection accuracy compared to other ensemble and individual approaches. Through extensive empirical studies, our framework is shown to be very effective in detecting outliers in the real-world context.
systems man and cybernetics | 1998
Vivekanand Gopalkrishnan; Qing Li; Kamalakar Karlapalem
Data warehouses contain a vast amount of data, often from different sources. In order to support complex queries of various decision support systems, we need to store materialized views of data. These views represent integrated data based on complex aggregate queries, and should be available consistently and instantaneously, which would not be possible if those queries had to be invoked each time on the varied data sources. Conventional approaches have been based on the relational model where materialized views are stored as tables in the warehouse. This also means that semantics of the data warehouse structure are hidden. Using object-oriented methodology, we can explicitly represent the semantics and reuse view (class) definitions based on the ISA hierarchy and the class composition hierarchies, thereby resulting in a more efficient view mechanism. This paper deals with the issues concerned with providing an object-relational view (ORV) mechanism for the data warehouse. Primary issues are: providing object views for the relational data and for aggregate/summary data. These include translating the relational data structures into a class hierarchy, defining class structures for the summary views, supporting object ids for object instances of the views (classes) generated handling those classes with respect to maintenance and providing links to other classes in the hierarchy, and accessing and querying these view classes.
international conference on data mining | 2006
Kelvin Sim; Jinyan Li; Vivekanand Gopalkrishnan; Guimei Liu
We introduce an unsupervised process to co-cluster groups of stocks and financial ratios, so that investors can gain more insight on how they are correlated. Our idea for the co-clustering is based on a graph concept called maximal quasi-bicliques, which can tolerate erroneous or/and missing information that are common in the stock and financial ratio data. Compared to previous works, our maximal quasi-bicliques require the errors to be evenly distributed, which enable us to capture more meaningful co-clusters. We develop a new algorithm that can efficiently enumerate maximal quasi-bicliques from an undirected graph. The concept of maximal quasi-bicliques is domain-independent; it can be extended to perform co-clustering on any set of data that are modeled by graphs.
ACM Transactions on Knowledge Discovery From Data | 2013
Bin Li; Steven C. H. Hoi; Peilin Zhao; Vivekanand Gopalkrishnan
Online portfolio selection has been attracting increasing attention from the data mining and machine learning communities. All existing online portfolio selection strategies focus on the first order information of a portfolio vector, though the second order information may also be beneficial to a strategy. Moreover, empirical evidence shows that relative stock prices may follow the mean reversion property, which has not been fully exploited by existing strategies. This article proposes a novel online portfolio selection strategy named Confidence Weighted Mean Reversion (CWMR). Inspired by the mean reversion principle in finance and confidence weighted online learning technique in machine learning, CWMR models the portfolio vector as a Gaussian distribution, and sequentially updates the distribution by following the mean reversion trading principle. CWMR’s closed-form updates clearly reflect the mean reversion trading idea. We also present several variants of CWMR algorithms, including a CWMR mixture algorithm that is theoretical universal. Empirically, CWMR strategy is able to effectively exploit the power of mean reversion for online portfolio selection. Extensive experiments on various real markets show that the proposed strategy is superior to the state-of-the-art techniques. The experimental testbed including source codes and data sets is available online.
knowledge discovery and data mining | 2009
Ardian Kristanto Poernomo; Vivekanand Gopalkrishnan
Fault-tolerant frequent itemsets (FTFI) are variants of frequent itemsets for representing and discovering generalized knowledge. However, despite growing interest in this field, no previous approach mines proportional FTFIs with their exact support (FT-support). This problem is difficult because of two concerns: (a) non anti-monotonic property of FT-support when relaxation is proportional, and (b) difficulty in computing FT-support. Previous efforts on this problem either simplify the general problem by adding constraints, or provide approximate solutions without any error guarantees. In this paper, we address these concerns in the general FTFI mining problem. We limit the search space by providing provably correct anti monotone bounds for FT-support and develop practically efficient means of achieving them. Besides, we also provide an efficient and exact FT-support counting procedure. Extensive experiments using real datasets validate that our solution is reasonably efficient for completely mining FTFIs. Implementations for the algorithms are available from www.cais.ntu.edu.sg/~vivek/pubs/ftfim09.
international conference on data mining | 2010
Kelvin Sim; Zeyar Aung; Vivekanand Gopalkrishnan
Subspace clusters represent useful information in high-dimensional data. However, mining significant subspace clusters in continuous-valued 3D data such as stock-financial ratio-year data, or gene-sample-time data, is difficult. Firstly, typical metrics either find subspaces with very few objects, or they find too many insignificant subspaces – those which exist by chance. Besides, typical 3D subspace clustering approaches abound with parameters, which are usually set under biased assumptions, making the mining process a ‘guessing game’. We address these concerns by proposing an information theoretic measure, which allows us to identify 3D subspace clusters that stand out from the data. We also develop a highly effective, efficient and parameter-robust algorithm, which is a hybrid of information theoretical and statistical techniques, to mine these clusters. From extensive experimentations, we show that our approach can discover significant 3D subspace clusters embedded in 110 synthetic datasets of varying conditions. We also perform a case study on real-world stock datasets, which shows that our clusters can generate higher profits compared to those mined by other approaches.
european conference on machine learning | 2009
Nguyen Hoang Vu; Vivekanand Gopalkrishnan
Outlier detection finds many applications, especially in domains that have scope for abnormal behavior. In this paper, we present a new technique for detecting distance-based outliers, aimed at reducing execution time associated with the detection process. Our approach operates in two phases and employs three pruning rules. In the first phase, we partition the data into clusters, and make an early estimate on the lower bound of outlier scores. Based on this lower bound, the second phase then processes relevant clusters using the traditional block nested-loop algorithm. Here two efficient pruning rules are utilized to quickly discard more non-outliers and reduce the search space. Detailed analysis of our approach shows that the additional overhead of the first phase is offset by the reduction in cost of the second phase. We also demonstrate the superiority of our approach over existing distance-based outlier detection methods by extensive empirical studies on real datasets.
european conference on machine learning | 2008
Hock Hee Ang; Vivekanand Gopalkrishnan; Steven C. H. Hoi; Wee Keong Ng
The goal of distributed learning in P2P networks is to achieve results as close as possible to those from centralized approaches. Learning models of classification in a P2P network faces several challenges like scalability, peer dynamism, asynchronism and data privacy preservation. In this paper, we study the feasibility of building SVM classifiers in a P2P network. We show how cascading SVM can be mapped to a P2P network of data propagation. Our proposed P2P SVM provides a method for constructing classifiers in P2P networks with classification accuracy comparable to centralized classifiers and better than other distributed classifiers. The proposed algorithm also satisfies the characteristics of P2P computing and has an upper bound on the communication overhead. Extensive experimental results confirm the feasibility and attractiveness of this approach.
database systems for advanced applications | 2011
Hoang Vu Nguyen; Vivekanand Gopalkrishnan; Ira Assent
Traditional outlier detection techniques usually fail to work efficiently on high-dimensional data due to the curse of dimensionality. This work proposes a novel method for subspace outlier detection, that specifically deals with multidimensional spaces where feature relevance is a local rather than a global property. Different from existing approaches, it is not grid-based and dimensionality unbiased. Thus, its performance is impervious to grid resolution as well as the curse of dimensionality. In addition, our approach ranks the outliers, allowing users to select the number of desired outliers, thus mitigating the issue of high false alarm rate. Extensive empirical studies on real datasets show that our approach efficiently and effectively detects outliers, even in highdimensional spaces.