Vivekanand Gopalkrishnan

Publication

Featured research published by Vivekanand Gopalkrishnan.

ACM Transactions on Intelligent Systems and Technology | 2011

CORN: Correlation-driven nonparametric learning approach for portfolio selection

Bin Li; Steven C. H. Hoi; Vivekanand Gopalkrishnan

Machine learning techniques have been adopted to select portfolios from financial markets in some emerging intelligent business applications. In this article, we propose a novel learning-to-trade algorithm termed CORrelation-driven Nonparametric learning strategy (CORN) for actively trading stocks. CORN effectively exploits statistical relations between stock market windows via a nonparametric learning approach. We evaluate the empirical performance of our algorithm extensively on several large historical and latest real stock markets, and show that it can easily beat both the market index and the best stock in the market substantially (without or with small transaction costs), and also surpass a variety of state-of-the-art techniques significantly.
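The abstract does not spell out how CORN compares market windows, so the following is only a minimal sketch of the general idea, assuming that "statistical relations between stock market windows" means a Pearson correlation between the most recent window of price relatives and each earlier window. The names `pearson`, `flatten`, `similar_days`, the window length `w`, and the threshold `rho` are hypothetical and chosen for illustration. The sketch stops at finding the historical days that followed a similar window; how a portfolio is then chosen from those days is not shown.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def flatten(window):
    """Concatenate the per-day price-relative vectors of a window into one list."""
    return [x for day in window for x in day]


def similar_days(price_relatives, w, rho):
    """Return indices i whose preceding window [i-w, i) correlates with the
    latest window [t-w, t) at level rho or higher (assumed similarity rule)."""
    t = len(price_relatives)
    if t <= w:
        return []  # not enough history to compare any earlier window
    latest = flatten(price_relatives[t - w:t])
    similar = []
    for i in range(w, t):
        past = flatten(price_relatives[i - w:i])
        if pearson(past, latest) >= rho:
            similar.append(i)  # day i followed a window similar to the latest one
    return similar


# Toy data: two assets whose price relatives alternate day by day.
history = [[1.1, 0.9], [0.9, 1.1], [1.1, 0.9], [0.9, 1.1], [1.1, 0.9]]
print(similar_days(history, w=1, rho=0.5))
```

The approach is nonparametric in the sense that it assumes no price model: the set of similar days is read directly from the data.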

Database Systems for Advanced Applications | 2010

Mining outliers with ensemble of heterogeneous detectors on random subspaces

Hoang Vu Nguyen; Hock Hee Ang; Vivekanand Gopalkrishnan

Outlier detection has many practical applications, especially in domains that have scope for abnormal behavior. Despite the importance of detecting outliers, defining outliers is in fact a nontrivial task that is normally application-dependent. Detection techniques, in turn, are constructed around the chosen definitions. As a consequence, available detection techniques vary significantly in accuracy, performance, and the aspects of the detection problem they address. In this paper, we propose a unified framework for combining different outlier detection algorithms. Unlike existing work, our approach combines non-compatible techniques of different types to improve the outlier detection accuracy compared to other ensemble and individual approaches. Through extensive empirical studies, our framework is shown to be very effective in detecting outliers in the real-world context.
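The abstract does not state how scores from incompatible detectors are combined. As a generic illustration only, and not the paper's framework, one simple way to put heterogeneous scores on a common scale is min-max normalization followed by averaging. The function names `min_max` and `combine` are hypothetical, and the random-subspace step mentioned in the title is not shown.

```python
def min_max(scores):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def combine(score_lists):
    """Average the normalized scores that several detectors assign to the same points."""
    normalized = [min_max(scores) for scores in score_lists]
    n_points = len(normalized[0])
    n_detectors = len(normalized)
    return [sum(det[i] for det in normalized) / n_detectors for i in range(n_points)]


# Two hypothetical detectors scoring the same three points on different scales.
distance_scores = [1.0, 2.0, 11.0]
density_scores = [0.1, 0.1, 0.9]
combined = combine([distance_scores, density_scores])
print(combined.index(max(combined)))  # index of the point ranked most outlying
```

Normalizing before averaging keeps a detector with a large numeric range from dominating the ensemble.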

Systems, Man, and Cybernetics | 1998

Issues of object-relational view design in data warehousing environment

Vivekanand Gopalkrishnan; Qing Li; Kamalakar Karlapalem

Data warehouses contain a vast amount of data, often from different sources. In order to support complex queries of various decision support systems, we need to store materialized views of data. These views represent integrated data based on complex aggregate queries, and should be available consistently and instantaneously, which would not be possible if those queries had to be invoked each time on the varied data sources. Conventional approaches have been based on the relational model, where materialized views are stored as tables in the warehouse. This also means that the semantics of the data warehouse structure are hidden. Using object-oriented methodology, we can explicitly represent the semantics and reuse view (class) definitions based on the ISA hierarchy and the class composition hierarchies, thereby resulting in a more efficient view mechanism. This paper deals with the issues concerned with providing an object-relational view (ORV) mechanism for the data warehouse. The primary issues are providing object views for the relational data and for aggregate/summary data. These include translating the relational data structures into a class hierarchy, defining class structures for the summary views, supporting object ids for object instances of the generated views (classes), handling those classes with respect to maintenance, providing links to other classes in the hierarchy, and accessing and querying these view classes.

International Conference on Data Mining | 2006

Mining Maximal Quasi-Bicliques to Co-Cluster Stocks and Financial Ratios for Value Investment

Kelvin Sim; Jinyan Li; Vivekanand Gopalkrishnan; Guimei Liu

We introduce an unsupervised process to co-cluster groups of stocks and financial ratios, so that investors can gain more insight into how they are correlated. Our idea for the co-clustering is based on a graph concept called maximal quasi-bicliques, which can tolerate the erroneous and/or missing information that is common in stock and financial ratio data. Compared to previous works, our maximal quasi-bicliques require the errors to be evenly distributed, which enables us to capture more meaningful co-clusters. We develop a new algorithm that can efficiently enumerate maximal quasi-bicliques from an undirected graph. The concept of maximal quasi-bicliques is domain-independent; it can be extended to perform co-clustering on any set of data that is modeled by graphs.
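The abstract does not give the formal definition, so the sketch below assumes one plausible reading of "errors evenly distributed": every vertex on either side of the candidate co-cluster may miss at most `eps` edges to the other side. The function name `is_quasi_biclique`, the parameter `eps`, and the toy stock and ratio labels are hypothetical. The check only validates a given candidate; it does not enumerate maximal quasi-bicliques.

```python
def is_quasi_biclique(edges, xs, ys, eps):
    """Check that each vertex in xs misses at most eps vertices of ys, and vice versa.

    edges is an iterable of (x, y) pairs of a bipartite graph.
    """
    edge_set = set(edges)
    for x in xs:
        missing = sum((x, y) not in edge_set for y in ys)
        if missing > eps:
            return False
    for y in ys:
        missing = sum((x, y) not in edge_set for x in xs)
        if missing > eps:
            return False
    return True


# Toy bipartite graph: stocks s1..s3 linked to financial ratios r1, r2.
edges = [("s1", "r1"), ("s1", "r2"), ("s2", "r1"), ("s2", "r2"), ("s3", "r1")]
stocks, ratios = ["s1", "s2", "s3"], ["r1", "r2"]
print(is_quasi_biclique(edges, stocks, ratios, eps=1))  # one missing edge is tolerated
print(is_quasi_biclique(edges, stocks, ratios, eps=0))  # an exact biclique is required
```

Bounding the missing edges per vertex, rather than in total, prevents a co-cluster from admitting a vertex that is barely connected to the other side.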

ACM Transactions on Knowledge Discovery From Data | 2013

Confidence Weighted Mean Reversion Strategy for Online Portfolio Selection

Bin Li; Steven C. H. Hoi; Peilin Zhao; Vivekanand Gopalkrishnan

Online portfolio selection has been attracting increasing attention from the data mining and machine learning communities. All existing online portfolio selection strategies focus on the first-order information of a portfolio vector, though the second-order information may also be beneficial to a strategy. Moreover, empirical evidence shows that relative stock prices may follow the mean reversion property, which has not been fully exploited by existing strategies. This article proposes a novel online portfolio selection strategy named Confidence Weighted Mean Reversion (CWMR). Inspired by the mean reversion principle in finance and the confidence weighted online learning technique in machine learning, CWMR models the portfolio vector as a Gaussian distribution, and sequentially updates the distribution by following the mean reversion trading principle. CWMR’s closed-form updates clearly reflect the mean reversion trading idea. We also present several variants of CWMR algorithms, including a CWMR mixture algorithm that is theoretically universal. Empirically, the CWMR strategy is able to effectively exploit the power of mean reversion for online portfolio selection. Extensive experiments on various real markets show that the proposed strategy is superior to the state-of-the-art techniques. The experimental testbed, including source code and data sets, is available online.

Knowledge Discovery and Data Mining | 2009

Towards efficient mining of proportional fault-tolerant frequent itemsets

Ardian Kristanto Poernomo; Vivekanand Gopalkrishnan

Fault-tolerant frequent itemsets (FTFI) are variants of frequent itemsets for representing and discovering generalized knowledge. However, despite growing interest in this field, no previous approach mines proportional FTFIs with their exact support (FT-support). This problem is difficult because of two concerns: (a) the non-anti-monotonic property of FT-support when relaxation is proportional, and (b) the difficulty of computing FT-support. Previous efforts on this problem either simplify the general problem by adding constraints, or provide approximate solutions without any error guarantees. In this paper, we address these concerns in the general FTFI mining problem. We limit the search space by providing provably correct anti-monotone bounds for FT-support and develop practically efficient means of achieving them. We also provide an efficient and exact FT-support counting procedure. Extensive experiments using real datasets validate that our solution is reasonably efficient for completely mining FTFIs. Implementations of the algorithms are available from www.cais.ntu.edu.sg/~vivek/pubs/ftfim09.

International Conference on Data Mining | 2010

Discovering Correlated Subspace Clusters in 3D Continuous-Valued Data

Kelvin Sim; Zeyar Aung; Vivekanand Gopalkrishnan

Subspace clusters represent useful information in high-dimensional data. However, mining significant subspace clusters in continuous-valued 3D data, such as stock-financial ratio-year data or gene-sample-time data, is difficult. First, typical metrics either find subspaces with very few objects, or they find too many insignificant subspaces, that is, those which exist by chance. Second, typical 3D subspace clustering approaches abound with parameters, which are usually set under biased assumptions, making the mining process a ‘guessing game’. We address these concerns by proposing an information theoretic measure, which allows us to identify 3D subspace clusters that stand out from the data. We also develop a highly effective, efficient and parameter-robust algorithm, which is a hybrid of information theoretical and statistical techniques, to mine these clusters. Through extensive experiments, we show that our approach can discover significant 3D subspace clusters embedded in 110 synthetic datasets of varying conditions. We also perform a case study on real-world stock datasets, which shows that our clusters can generate higher profits compared to those mined by other approaches.

European Conference on Machine Learning | 2009

Efficient Pruning Schemes for Distance-Based Outlier Detection

Nguyen Hoang Vu; Vivekanand Gopalkrishnan

Outlier detection finds many applications, especially in domains that have scope for abnormal behavior. In this paper, we present a new technique for detecting distance-based outliers, aimed at reducing execution time associated with the detection process. Our approach operates in two phases and employs three pruning rules. In the first phase, we partition the data into clusters, and make an early estimate on the lower bound of outlier scores. Based on this lower bound, the second phase then processes relevant clusters using the traditional block nested-loop algorithm. Here two efficient pruning rules are utilized to quickly discard more non-outliers and reduce the search space. Detailed analysis of our approach shows that the additional overhead of the first phase is offset by the reduction in cost of the second phase. We also demonstrate the superiority of our approach over existing distance-based outlier detection methods by extensive empirical studies on real datasets.
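The abstract does not detail its three pruning rules or the clustering phase, so the sketch below shows only the classic cutoff pruning commonly used with nested-loop distance-based detection, as an assumed baseline rather than the paper's method. It assumes the outlier score of a point is the distance to its k-th nearest neighbor and that the dataset holds more than `k` points. The function name `top_n_outliers` and its parameters are hypothetical.

```python
import heapq
from math import dist


def top_n_outliers(points, k, n):
    """Return the n (score, index) pairs with the largest k-th nearest neighbor distance."""
    top = []       # min-heap of (score, index) holding the current top-n outliers
    cutoff = 0.0   # smallest score among the current top-n, once n are known
    for i, p in enumerate(points):
        nearest = []   # negated distances: a max-heap of the k smallest distances seen
        pruned = False
        for j, q in enumerate(points):
            if i == j:
                continue
            d = dist(p, q)
            if len(nearest) < k:
                heapq.heappush(nearest, -d)
            elif d < -nearest[0]:
                heapq.heapreplace(nearest, -d)
            # The running k-th distance only shrinks, so it upper-bounds the final
            # score; once it drops below the cutoff, p cannot enter the top-n.
            if len(nearest) == k and -nearest[0] < cutoff:
                pruned = True
                break
        if pruned:
            continue
        score = -nearest[0]
        if len(top) < n:
            heapq.heappush(top, (score, i))
        elif score > top[0][0]:
            heapq.heapreplace(top, (score, i))
        if len(top) == n:
            cutoff = top[0][0]
    return sorted(top, reverse=True)


# Four points on a unit square plus one far-away point.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(top_n_outliers(data, k=2, n=1))
```

Pruning pays off most when strong outliers are found early, because a high cutoff lets the inner loop stop after only a few distance computations for typical points.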

European Conference on Machine Learning | 2008

Cascade RSVM in Peer-to-Peer Networks

Hock Hee Ang; Vivekanand Gopalkrishnan; Steven C. H. Hoi; Wee Keong Ng

The goal of distributed learning in P2P networks is to achieve results as close as possible to those from centralized approaches. Learning models of classification in a P2P network faces several challenges like scalability, peer dynamism, asynchronism and data privacy preservation. In this paper, we study the feasibility of building SVM classifiers in a P2P network. We show how cascading SVM can be mapped to a P2P network of data propagation. Our proposed P2P SVM provides a method for constructing classifiers in P2P networks with classification accuracy comparable to centralized classifiers and better than other distributed classifiers. The proposed algorithm also satisfies the characteristics of P2P computing and has an upper bound on the communication overhead. Extensive experimental results confirm the feasibility and attractiveness of this approach.

Database Systems for Advanced Applications | 2011

An unbiased distance-based outlier detection approach for high-dimensional data

Hoang Vu Nguyen; Vivekanand Gopalkrishnan; Ira Assent

Traditional outlier detection techniques usually fail to work efficiently on high-dimensional data due to the curse of dimensionality. This work proposes a novel method for subspace outlier detection that specifically deals with multidimensional spaces where feature relevance is a local rather than a global property. Unlike existing approaches, it is not grid-based and is dimensionality-unbiased. Thus, its performance is impervious to grid resolution as well as the curse of dimensionality. In addition, our approach ranks the outliers, allowing users to select the number of desired outliers, thus mitigating the issue of a high false alarm rate. Extensive empirical studies on real datasets show that our approach efficiently and effectively detects outliers, even in high-dimensional spaces.
