Featured Research

Statistical Finance

Checking account activity and credit default risk of enterprises: An application of statistical learning methods

The existence of asymmetric information has always been a major concern for financial institutions. Financial intermediaries such as commercial banks need to study the quality of potential borrowers in order to make their decisions on corporate loans. Classical methods model the default probability from financial ratios using logistic regression. As one of the major commercial banks in France, we have access to the account activities of corporate clients. We show that this transactional data outperforms classical financial ratios in predicting the default event. As the new data reflects the real-time status of cash flow, this result confirms our intuition that liquidity plays an important role in the phenomenon of default. Moreover, the two data sets are complementary to a certain extent: the merged data has better predictive power than either data set alone. We have adopted some advanced machine learning methods and analyzed their characteristics. The correct use of these methods helps us acquire a deeper understanding of the role of central factors in the phenomenon of default, such as credit line violations and cash inflows.


Choosing News Topics to Explain Stock Market Returns

We analyze methods for selecting topics in news articles to explain stock returns. We find, through empirical and theoretical results, that supervised Latent Dirichlet Allocation (sLDA) implemented through Gibbs sampling in a stochastic EM algorithm will often overfit returns to the detriment of the topic model. We obtain better out-of-sample performance through a random search of plain LDA models. A branching procedure that reinforces effective topic assignments often performs best. We test methods on an archive of over 90,000 news articles about S&P 500 firms.


Christmas Jump in LIBOR

A short-term pattern in LIBOR dynamics was discovered: 2-month LIBOR experiences a jump after Christmas. The sign and size of the jump depend on the trend in the data over the 21 days before Christmas.


Cluster analysis of stocks using price movements of high frequency data from National Stock Exchange

This paper aims to develop new techniques for describing the joint behavior of stocks, beyond regression and correlation. For example, we want to identify clusters of stocks that move together. Our work applies Kernel Principal Component Analysis (KPCA) and Functional Principal Component Analysis (FPCA) to high-frequency data from the NSE. Since we deal with high-frequency data with a tick interval of 30 seconds, FPCA seems to be an ideal choice. FPCA is a functional variant of PCA in which each sample point is considered to be a function in the Hilbert space L^2. KPCA, on the other hand, is an extension of PCA using kernel methods. Results obtained from FPCA and Gaussian kernel PCA seem to be in synergy, but with a lag. Two prominent clusters showed up in our analysis, one corresponding to the banking sector and the other to the IT sector. Smaller clusters were seen for the automobile industry and the energy sector. The IT sector was seen interacting with these smaller clusters. The insight gained from these interactions is substantial, as it can be used to develop strategies for intraday traders.


Clustering patterns in efficiency and the coming-of-age of the cryptocurrency market

The efficient market hypothesis has far-reaching implications for financial trading and market stability. Whether or not cryptocurrencies are informationally efficient has therefore been the subject of intense recent investigation. Here, we use permutation entropy and statistical complexity over sliding time-windows of price log returns to quantify the dynamic efficiency of more than four hundred cryptocurrencies. We consider a cryptocurrency to be efficient within a time-window when these two complexity measures are statistically indistinguishable from their values obtained on randomly shuffled data. We find that 37% of the cryptocurrencies in our study stay efficient more than 80% of the time, whereas 20% are informationally efficient less than 20% of the time. Our results also show that efficiency is not correlated with the market capitalization of the cryptocurrencies. A dynamic analysis of informational efficiency over time reveals clustering patterns in which different cryptocurrencies with similar temporal patterns form four clusters; moreover, younger currencies in each group appear poised to follow the trend of their 'elders'. The cryptocurrency market thus already shows notable adherence to the efficient market hypothesis, although the data also reveal that the coming-of-age of digital currencies is in this regard still very much underway.
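As a rough illustration of the first of the two measures, the sketch below computes a normalized permutation entropy (the Bandt-Pompe ordinal-pattern entropy) and the same quantity on shuffled copies of a series. It is not the authors' code: the function names, the embedding order of 3, and the number of shuffles are assumptions chosen for illustration, and the statistical complexity measure and the sliding-window loop are omitted.

```python
import math
import random
from collections import Counter


def log_returns(prices):
    """Log returns r_t = log(p_t / p_{t-1}) of a positive price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def permutation_entropy(series, order=3):
    """Normalized permutation entropy in [0, 1] for a given embedding order.

    Each window of `order` consecutive values is mapped to the permutation
    that sorts it; the Shannon entropy of the pattern frequencies is then
    divided by log(order!), its maximum possible value.
    """
    n_windows = len(series) - order + 1
    if n_windows <= 0:
        raise ValueError("series is shorter than the embedding order")
    patterns = Counter(
        tuple(sorted(range(order), key=lambda k: series[i + k]))
        for i in range(n_windows)
    )
    entropy = -sum(
        (count / n_windows) * math.log(count / n_windows)
        for count in patterns.values()
    )
    return entropy / math.log(math.factorial(order))


def shuffled_entropies(series, order=3, n_shuffles=100, seed=0):
    """Permutation entropy of randomly shuffled copies of the series.

    Hypothetical helper: the spread of these values gives a reference
    against which the entropy of the original ordering can be compared.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_shuffles):
        shuffled = list(series)
        rng.shuffle(shuffled)
        values.append(permutation_entropy(shuffled, order))
    return values
```

Under these assumptions, a window whose entropy falls inside the range of its shuffled counterparts would be a candidate for "efficient", while a value well below that range points to ordinal structure in the returns.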


Co-existence of Trend and Value in Financial Markets: Estimating an Extended Chiarella Model

Trend and Value are pervasive anomalies, common to all financial markets. We address the problem of their co-existence and interaction within the framework of Heterogeneous Agent Based Models (HABM). More specifically, we extend the Chiarella (1992) model by adding noise traders and a non-linear demand of fundamentalists. We use Bayesian filtering techniques to calibrate the model on time series of prices across a variety of asset classes since 1800. The fundamental value is an output of the calibration, and does not require the use of an external pricing model. Our extended model reproduces many empirical observations, including the non-monotonic relation between past trends and future returns. The destabilizing activity of trend-followers leads to a qualitative change of mispricing distribution, from unimodal to bimodal, meaning that some markets tend to be over- (or under-) valued for long periods of time.


Co-jumping of Treasury Yield Curve Rates

We study the role of co-jumps in the interest rate futures markets. To disentangle the continuous part of quadratic covariation from co-jumps, we localize the co-jumps precisely through wavelet coefficients and identify the statistically significant ones. Using high-frequency data on U.S. and European yield curves, we quantify the effect of co-jumps on their correlation structure. Empirical findings reveal much stronger co-jumping behavior in the U.S. yield curve than in the European one. Further, we connect co-jumping behavior to monetary policy announcements, and study the effect of 103 FOMC and 119 ECB announcements on the identified co-jumps during the period from January 2007 to December 2017.


Cointegration in high frequency data

In this paper, we consider a framework adapting the notion of cointegration when two asset prices are generated by a driftless Itô-semimartingale featuring jumps with infinite activity, observed regularly and synchronously at high frequency. We develop a regression-based method for estimating the cointegrating relations and establish the related consistency and central limit theory when there is cointegration within that framework. We also provide a Dickey-Fuller type residual-based test for the null of no cointegration against the alternative of cointegration, along with its limit theory. Under no cointegration, the asymptotic limit is the same as that of the original Dickey-Fuller residual-based test, so that critical values can be easily tabulated in the same way. Finite-sample evidence indicates adequate size and good power properties in a variety of realistic configurations, outperforming the original Dickey-Fuller and Phillips-Perron type residual-based tests, whose sizes are distorted by non-ergodic time-varying variance and whose power is altered by price jumps. Two empirical examples consolidate the Monte Carlo evidence that the adapted tests can reject the null when the original tests do not, and vice versa.
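For orientation, the classical residual-based Dickey-Fuller procedure that the abstract uses as its baseline can be sketched as follows. This is the textbook two-step version, not the adapted high-frequency test of the paper: the function names are hypothetical, the first-stage regression includes an intercept by assumption, and no jump or time-varying-variance correction is attempted. The returned statistic would have to be compared against tabulated residual-based critical values, which are not reproduced here.

```python
import math
import random


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual_df_statistic(x, y):
    """Classical two-step residual-based Dickey-Fuller statistic.

    Step 1: regress y on x and keep the residuals u_t.
    Step 2: regress the differences u_t - u_{t-1} on u_{t-1} (no constant)
            and return the t-statistic of that coefficient.
    Strongly negative values speak against the null of no cointegration.
    """
    slope, intercept = ols_slope_intercept(x, y)
    u = [b - intercept - slope * a for a, b in zip(x, y)]
    lagged = u[:-1]
    diffs = [u[t] - u[t - 1] for t in range(1, len(u))]
    sum_lagged_sq = sum(l * l for l in lagged)
    rho = sum(l * d for l, d in zip(lagged, diffs)) / sum_lagged_sq
    errors = [d - rho * l for l, d in zip(lagged, diffs)]
    sigma2 = sum(e * e for e in errors) / (len(errors) - 1)
    return slope, rho / math.sqrt(sigma2 / sum_lagged_sq)
```

On simulated data where one series is a random walk and the other is a multiple of it plus stationary noise, the statistic comes out strongly negative, which is the behavior the test relies on.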


Combination of window-sliding and prediction range method based on LSTM model for predicting cryptocurrency

The present study aims to model cryptocurrency price trends, drawing on financial theory, using an LSTM model with multiple combinations of window length and prediction horizon. A random walk model is also applied with different parameter settings.
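The window-length and prediction-horizon combinations mentioned above amount to different ways of slicing one price series into input/target pairs. The sketch below shows one plausible slicing together with a naive random-walk forecast as a baseline. The names `make_windows` and `random_walk_forecast` are hypothetical, the LSTM itself is not shown, and the study's actual preprocessing may differ.

```python
def make_windows(series, window, horizon):
    """Split a series into (inputs, target) pairs.

    Each sample uses `window` consecutive observations as inputs and the
    observation `horizon` steps after the last input as the target.
    """
    samples = []
    for start in range(len(series) - window - horizon + 1):
        end = start + window
        samples.append((series[start:end], series[end + horizon - 1]))
    return samples


def random_walk_forecast(inputs):
    """Naive random-walk baseline: predict the last observed value."""
    return inputs[-1]


def mean_absolute_error(samples, forecast):
    """Average absolute error of a forecast function over the samples."""
    return sum(abs(forecast(x) - y) for x, y in samples) / len(samples)
```

For example, `make_windows([1, 2, 3, 4, 5, 6], window=3, horizon=2)` yields `([1, 2, 3], 5)` and `([2, 3, 4], 6)`. Looping over several `(window, horizon)` pairs and scoring each with `mean_absolute_error` gives the kind of grid the abstract describes.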


Company classification using machine learning

The recent advancements in computational power and machine learning algorithms have led to vast improvements in manifold areas of research. Especially in finance, the application of machine learning enables both researchers and practitioners to gain new insights into financial data and well-studied areas such as company classification. In our paper, we demonstrate that unsupervised machine learning algorithms can be used to visualize and classify company data in an economically meaningful and effective way. In particular, we implement the data-driven dimension reduction and visualization tool t-distributed stochastic neighbor embedding (t-SNE) in combination with spectral clustering. The resulting company groups can then be utilized by experts in the field for empirical analysis and optimal decision making. By providing an exemplary out-of-sample study within a portfolio optimization framework, we show that the application of t-SNE and spectral clustering improves the overall portfolio performance. Therefore, we introduce our approach to the financial community as a valuable technique in the context of data analysis and company classification.
