Image Processing Tools for Financial Time Series Classification
Bairui Du
Department of Computer Science, Faculty of Engineering, University College London, Gower Street, London, WC1E 6BT, UK. [email protected]

Delmiro Fernandez-Reyes
Department of Computer Science, Faculty of Engineering, University College London, Gower Street, London, WC1E 6BT, UK. [email protected]

Paolo Barucca ∗
Department of Computer Science, Faculty of Engineering, University College London, Gower Street, London, WC1E 6BT, UK. [email protected]
August 8, 2020

Abstract
The application of deep learning to time series forecasting is one of the major challenges in present machine learning. We propose a novel methodology that combines machine learning and image processing methods to define and predict market states with intraday financial data. A wavelet transform is applied to the log-return of stock prices for both image extraction and denoising. A convolutional neural network then extracts patterns from the denoised wavelet images to classify daily time series, i.e. a market state is associated with the binary prediction of the daily close price movement based on the wavelet image constructed from the price changes in the first hours of the day. This method overcomes the low signal-to-noise ratio problem in financial time series and achieves a competitive prediction accuracy for the market states 'Up' and 'Down' of financial data, as tested on the S&P 500.

Keywords: Continuous and discrete wavelet transform | Image processing | Financial computing | Financial time series classification | Convolutional neural network
Time series prediction is a challenge for many complex systems, yet in finance predictions are hindered by the very nature of how financial markets work. In efficient markets, the opportunities for stock price predictions leading to profitable trades are supposed to rapidly disappear. In the growing industry of high-frequency trading, the competition over extracting predictions on stock prices from the increasing amount of available information for performing profitable trades is becoming more and more severe. With the development of big data analysis and advanced deep learning methodologies, traders hope to fruitfully analyse market information, e.g. price time series, through machine learning. Spot prices of stocks provide a simple snapshot representation of a financial market. Stock prices fluctuate over time, affected by numerous factors, and the prediction of their changes is at the core of both long-term and short-term financial investing. The collective patterns of price movements are generally referred to as market states. As a paramount example, when stock prices follow an upward trend, it is called a bull market, and when stock prices follow a downward trend, it is called a bear market [1].

However, in both bullish and bearish market trends there are many noisy oscillations, requiring analysts to apply noise reduction methodologies to extract meaningful predictions over trends. The objective of this study is to test a general time-series prediction model which extracts a denoised wavelet image from a time series in order to leverage convolutional neural network (CNN) architectures. We apply continuous wavelet transforms to the log-returns of financial time series and convert them to a greyscale wavelet transform spectrum. We then build one shallow and one deep convolutional neural network (CNN) model and train them with the spectrum images as input to capture hidden patterns. The main novelty and contribution of our study is to define stock market states based on intraday financial time series whilst also providing accurate predictions, demonstrating the ability of image processing tools to overcome the low signal-to-noise ratio of financial time series, and providing a promising toolbox for the analysis of noisy time series found in many complex systems.

∗ Corresponding Author: Paolo Barucca, Financial Computing and Analytics Group, Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK. Email: [email protected]
Market states forecasting is based on the analysis of historical data [2], yet as summarised in [3], the quest for predictions in financial data needs to take into account a series of empirical laws of financial markets: "Market action discounts everything; Prices move in trends; History tends to repeat itself". In this paper, we look at different financial indicators and analyse them to identify patterns, trends, periods or cycles. Compared to other time series, financial time series display a significant amount of uncertainty and unpredictability [4]. As the raw price time series will often contain a trend, using log-returns instead of prices is an established method to transform raw data [5, 1, 6], returns being a good scale-free summary of the outcome of investment decisions within a given time interval. In most quantitative financial research and applications, log-returns are regarded as more tractable, as having more robust and characteristic statistical properties, e.g. probability distribution over a given period, and, from a practical point of view, it is possible to quickly produce multi-period returns from single-period ones [1, 4].

Financial forecasting can be framed as a signal processing problem [5] on which neural networks can be applied to provide testable solutions. In [5] the authors consider log-returns of stock prices, denoise the log-return time series, and apply a self-organizing map (SOM) - an unsupervised learning method that learns the distribution of a set of patterns without any class information - to make predictions. More recently, researchers have tried to apply deep learning methods such as convolutional and recurrent neural networks to predict the behavior of stock markets. Advanced machine learning methods have a high representation power for empirical asset pricing, being able to re-create arbitrarily complex non-linear multi-variate functions and not requiring an arbitrary feature selection pre-processing that could dilute the information content of the original time series [7]. In [7] the authors perform a comparative analysis of different methods such as simple linear and penalised linear models, principal components regression (PCR), partial least squares (PLS), regression trees and random forests, and find that regression trees have the best prediction accuracy [7]. They consider a continuous-variable regression problem rather than a classification problem. At odds with results in image and bio-metric pattern recognition, where the deeper the neural network the better, they find that shallow learning outperforms deeper learning, the reasons for this phenomenon being (1) the low signal-to-noise ratio of financial data and (2) the comparative scarcity of the data for the price prediction problem [7, 6]. They also show that, compared to traditional prediction methods, machine learning provides an improved description of the behavior of expected returns.

In order to reduce the complexity and improve the accuracy of the forecast, we consider the stock forecasting problem not as a regression problem, but as a classification problem for determining a market state [1]. In [1] the authors give a model-based clustering method which clusters financial time series via a maximum-likelihood model. Their clustering procedure uses a likelihood measure adjusted for temporal coherence. This procedure is shown to be numerically efficient and suitable for high-dimensional datasets, alternating (a) the update of the network structure constructed by the TMFG-LoGo algorithm and (b) the assignment of points to clusters in a time-consistent manner through dynamic programming, i.e. using the Viterbi algorithm [1]. The model both identifies the current market state, i.e. bull or bear market, and yields predictions for future market states. The final values of accuracy for these two cases are above 50%. Our study combines the time series prediction task explored in deep learning applications in finance and the market state identification problem investigated within statistical approaches to derive a predictive classification of the intraday behavior of financial indices, introducing a methodology for time series analysis applicable in the broader context of complex systems.
Figure 1: Flowchart of using the wavelet transform and a CNN for financial time series prediction
The Fourier transform is a powerful data analysis tool that represents any complex signal as a sum of sines and cosines and transforms the signal from the time domain to the frequency domain [8]. Nevertheless, the Fourier transform can show which frequencies are present in the signal, whilst it cannot show when these frequencies appear. The short-time Fourier transform divides the original signal into several parts using a sliding window to fix this problem [9].

The wavelet transform is a more suitable method for analysing dynamic signals, as it identifies the frequencies present in the original signal and also when these frequencies appear and disappear, by controlling the scale change of the wavelet. Therefore, the wavelet transform yields a high resolution in both the frequency domain and the time domain. A wavelet is a rapidly decaying, wave-like oscillation that has zero mean and is localised in both time and frequency space [10]. Unlike sine waves, which extend and repeat to infinity, a wavelet exists for a finite duration.

Real-world signals do not always change slowly, and they often oscillate or exhibit transient changes. In financial time series analysis, these abrupt changes can be associated with turning points which can be crucial for stock market forecasting. However, high-frequency noise may hinder the detection of these turning points. In most cases, noise is modelled as Gaussian white noise [11, 1]. The Fourier transform does not represent these abrupt changes efficiently, because it does not consider time-dependence in the signal decomposition. Daubechies points out that the wavelet transform can be used to analyse time series that contain non-stationary power at many different frequencies [12].
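As a minimal illustration of this point (not part of the original experiments), the sketch below, using NumPy and PyWavelets, builds a toy signal with an abrupt high-frequency burst: the FFT reports which frequencies are present, while the CWT scalogram also localises the burst in time. The signal, scales and wavelet choice are illustrative assumptions.

```python
import numpy as np
import pywt

# Toy signal: a 5 Hz sine with a short high-frequency burst in the middle.
fs = 200                                   # sampling frequency (Hz), illustrative
t = np.arange(0, 4, 1 / fs)
signal = np.sin(2 * np.pi * 5 * t)
signal[400:420] += np.sin(2 * np.pi * 40 * t[400:420])   # transient burst near t = 2 s

# Fourier transform: frequency content only, no time localisation.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)
print("dominant FFT frequency:", freqs[np.argmax(spectrum)])

# Continuous wavelet transform: coefficients indexed by (scale, time).
scales = np.arange(1, 64)
coeffs, cwt_freqs = pywt.cwt(signal, scales, "morl", sampling_period=1 / fs)
power = np.abs(coeffs) ** 2
# The time index with the strongest small-scale (high-frequency) power
# points at the location of the burst.
print("burst located near t =", t[np.argmax(power[:10].sum(axis=0))], "s")
```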
The wavelet transform threshold denoising method was first proposed in [13]. The basic idea of wavelet threshold denoising is that after the signal is transformed (e.g. using the Mallat algorithm) [14, 15], it is further decomposed into approximation coefficients and detail coefficients. The detail coefficients are also called wavelet coefficients. The wavelet coefficients with larger amplitudes are assumed to be significant for representing the original signal, while coefficients with smaller amplitudes are generally associated with noise [14]. The threshold denoising method finds a suitable threshold, retains the wavelet coefficients larger than the threshold, filters out the wavelet coefficients smaller than the threshold accordingly, and then restores the denoised signal according to the processed wavelet coefficients [13].
Figure 2: DWT decomposition into approximation coefficients and detail coefficients (wavelet coefficients)

Wavelet denoising can also be regarded as a low-pass filter. It removes high-frequency noise while retaining the characteristics of the low-frequency components of the signal. Hence, wavelet denoising is a combination of feature extraction and low-pass filtering. The wavelet transform has good time-frequency localisation characteristics, which can preserve relevant signal spikes and sudden changes [12, 16]. Therefore, the wavelet transform is suitable for removing transient signals, as well as suppressing the interference of high-frequency noise, and effectively distinguishing low-frequency information from high-frequency noise.
Decomposition

$$x_{j+1,L}[n] = \sum_{k=0}^{K-1} x_{j,L}[2n-k]\, g[k], \qquad x_{j+1,H}[n] = \sum_{k=0}^{K-1} x_{j,L}[2n-k]\, h[k] \quad (1)$$

Here x[n] is the discrete input signal of length N; g[n] is a low-pass filter that filters out the high-frequency part of the input signal and outputs the low-frequency part; h[n] is a high-pass filter that filters out the low-frequency part and outputs the high-frequency part. The theoretical maximum decomposition level for wavelet transforms is $j = \lfloor \log_2 n \rfloor$, where n is the signal length. The larger the decomposition level, the more evident the different characteristics of noise and signal, and the more conducive to the separation of signal and noise; yet for reconstruction, the higher the number of decomposition levels, the greater the reconstruction error. The available maximum decomposition level is related to the signal-to-noise ratio (SNR) of the original signal, but the SNR cannot be obtained from the measured data. In order to avoid signal distortion and achieve the best noise reduction effect, noise verification is performed on the DWT detail coefficients.

The Daubechies wavelets, based on the work of Ingrid Daubechies [17], are a family of orthogonal wavelets defining a discrete wavelet transform and characterised by a maximal number of vanishing moments for a given support [12]. We use the db4 (Daubechies wavelet of order 4) wavelet shown in Figure 7 (c) as the mother wavelet to perform a level-5 DWT. The approximation and detail coefficients are shown in Figure 2. The approximation coefficients represent the output of the low-pass filter (averaging filter) of the DWT, while the detail coefficients represent the output of the high-pass filter (difference filter). The difference between the CWT and the DWT is that the DWT uses discrete values for the scale a and the translation factor b; the DWT is only discrete in the scale and translation domain, not in the time domain. On the left of Figure 2, we can see a schematic representation of the DWT applied to the signal as a low-pass filter at each level. The detail coefficients of levels 1 and 2 are the most consistent with white Gaussian noise characteristics.
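A minimal sketch of this decomposition step using PyWavelets is given below; the db4 wavelet and level 5 match the description above, while the input series is a placeholder.

```python
import numpy as np
import pywt

# Placeholder intraday series; in the paper this would be the price (or
# log-return) series of the first 360 minutes of a trading day.
signal = np.cumsum(np.random.randn(360))

wavelet = pywt.Wavelet("db4")

# Theoretical bound on the decomposition level, given the signal length
# and the filter length of the chosen wavelet.
max_level = pywt.dwt_max_level(len(signal), wavelet.dec_len)
print("maximum decomposition level:", max_level)

# Level-5 DWT: returns [cA5, cD5, cD4, cD3, cD2, cD1], i.e. one set of
# approximation coefficients and five sets of detail (wavelet) coefficients.
coeffs = pywt.wavedec(signal, wavelet, level=5)
approx, details = coeffs[0], coeffs[1:]
print("approximation length:", len(approx))
print("detail lengths:", [len(d) for d in details])
```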
Threshold Denoising and Reconstruction

Figure 3: 2D wavelet transform spectrum with COI (Cone of Influence) after denoising

Threshold denoising performs nonlinear thresholding on the wavelet transform coefficients of the measured signal. The high-frequency coefficients of each layer, from the 1st to the Nth layer, are filtered by the threshold function, and the low-frequency coefficients of each layer are left unchanged. The hard threshold function (8) is discontinuous at the threshold λ, which causes the denoised signal to oscillate around it, yet a hard threshold function can perform better than a soft threshold function in terms of mean square error. In this study we apply soft thresholding (9), prioritising the overall continuity of the wavelet coefficients ensured by a continuous function around the threshold.

In the threshold processing function, the selection of the threshold λ directly affects the denoising performance. There are four types of threshold rules for the wavelet transform threshold denoising method: general threshold rules, minimax (minimum-maximum variance) threshold rules, Stein's Unbiased Risk Estimate (SURE) rules, and heuristic threshold rules. In this study we use the rigrsure threshold method, which is an adaptive threshold selection based on the principle of Stein's unbiased risk estimation. It first estimates the risk for different λ values, and then minimises it to obtain the selected threshold. Then, we reconstruct the signal from the filtered wavelet coefficients. The signal is reconstructed from the low-frequency coefficients of the Nth layer of the wavelet decomposition and the processed high-frequency coefficients of the first N-1 layers, so as to obtain the denoised values of the original signal. The blue signals in Figures 2 and 3 are the original signal, and the orange signals are the signal after noise reduction. Finally, we perform the log-return transformation on the price signal, as shown in Figure 3.
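A sketch of the full denoise-and-reconstruct step with PyWavelets is shown below. The threshold selection follows the standard SURE ("rigrsure") risk minimisation (the appendix gives the authors' own formulation); this is a sketch under those assumptions, not the authors' exact code.

```python
import numpy as np
import pywt

def rigrsure_threshold(detail):
    """Adaptive threshold via Stein's Unbiased Risk Estimate (SURE):
    sort the squared coefficients, evaluate the risk of each candidate
    threshold and keep the minimiser."""
    n = len(detail)
    f = np.sort(np.abs(detail)) ** 2          # squared coefficients, ascending
    cumsum = np.cumsum(f)
    k = np.arange(1, n + 1)
    risk = (n - 2 * k + cumsum + (n - k) * f) / n
    k_min = np.argmin(risk)
    return np.sqrt(f[k_min])

def wavelet_denoise(signal, wavelet="db4", level=5):
    """DWT -> soft-threshold the detail coefficients -> inverse DWT."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    denoised = [coeffs[0]]                    # keep the approximation coefficients
    for detail in coeffs[1:]:
        lam = rigrsure_threshold(detail)
        denoised.append(pywt.threshold(detail, lam, mode="soft"))
    return pywt.waverec(denoised, wavelet)[: len(signal)]

# Example on a synthetic noisy price path (placeholder data).
prices = np.cumsum(np.random.randn(360)) + 100
clean = wavelet_denoise(prices)
```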
Different applications may require different mother wavelets, and there are two important wavelet transform concepts to be considered: scaling and shifting. Given a signal Ψ(t), scaling refers to the process of stretching or shrinking the signal in time [18], which can be expressed as

$$\Psi\!\left(\frac{t}{s}\right), \quad s > 0 \quad (2)$$

where s is the scaling factor that represents how much the signal is rescaled in time. The scale factor is inversely proportional to frequency: in a wavelet, there is a reciprocal relationship between scale and frequency with a constant of proportionality (COP) [19]. The mother wavelet has a characteristic frequency band. Mathematically, the equivalent frequency is defined as

$$F_{eq} = \frac{C_f}{s\,\delta t} \quad (3)$$

where $C_f$ represents the center frequency of the mother wavelet, s the wavelet scale, and δt the sampling interval.
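The relationship in Eq. (3) is, for instance, what PyWavelets' scale2frequency helper implements; a minimal check, with an assumed one-minute sampling interval:

```python
import pywt

dt = 60.0                                  # sampling interval: 1 minute in seconds (assumption)
scale = 8.0

cf = pywt.central_frequency("morl")        # C_f of the Morlet mother wavelet
f_eq = cf / (scale * dt)                   # Eq. (3): equivalent frequency in Hz

# Same quantity via the library helper (which returns cycles per sample).
f_eq_lib = pywt.scale2frequency("morl", scale) / dt
print(f_eq, f_eq_lib)                      # the two values agree
```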
Continuous and Discrete Wavelet Transforms are the two major wavelet analysis methods. The CWT is mainly used for time-frequency analysis and for filtering time-localised frequency components. The DWT is ideal for denoising and compressing signals and images, as it helps represent many naturally occurring signals and images with fewer coefficients [18]. The difference between the CWT and the DWT is how they discretise the scale and the translation parameters [20]. The CWT of a discrete sequence $x_n$ is defined as the convolution of $x_n$ with a scaled and translated version of the wavelet $\psi(\eta)$ [18]:

$$W_n(s) = \sum_{n'=0}^{N-1} x_{n'}\, \psi^{*}\!\left[\frac{(n'-n)\,\delta t}{s}\right] \quad (4)$$

The wavelets in Figure 7 are some of the well-known mother wavelets, and they have different sizes and shapes. We use the Morlet wavelet to generate the power spectrum of the denoised signals. Equation (5) expresses the Morlet wavelet used in this study: a plane wave modulated by a Gaussian, where $\omega_0$ is a non-dimensional frequency and t is a non-dimensional "time" parameter,

$$\Psi(t) = \pi^{-1/4}\, e^{i\omega_0 t}\, e^{-t^2/2} \quad (5)$$

The output of the CWT is a set of coefficients, which are a function of scale, frequency and time [20]. The higher the number of scales per octave, the finer the scale discretisation. When we compute the CWT, each scaled wavelet is shifted in time and compared with the original signal; repeating this process for all the scales results in coefficients that are a function of the scales and of the shift distance of the wavelet [18]. For example, a signal with 10,000 samples analysed with 50 scales will generate 500,000 coefficients. In this way, oscillatory behaviour in signals can be characterised in more detail.

We consider 505 stocks in the tickers list of the S&P 500, widely regarded as the best single gauge of large-cap U.S. equities [21]. (Eight of the stocks could not be downloaded for reasons related to corporate reorganisation and name changes.) First, we calculate the adjusted closing price and then we clean the data. The raw data of the S&P 500 index have 'Open', 'Close', 'High', 'Low' prices and 'Volume'. Given that only one year out of ten includes 'Volume' data, we could not include it as a feature in our analysis. Some days, when the U.S. market closes early or delays opening, only have half-day data points; these days, together with weekends and holidays, have been removed. Further, on a full trading day we should have 390 data points, but 127 intraday series turn out to be incomplete. At this point, we have to deal with invalid and missing values, which are due to the absence of transaction data for some minutes. The easiest way to do this is to replace the invalid and missing values with the sample mean, median, or mode of a variable. This method is simple but does not adequately consider the information already in the data, and the error may be significant. Sequential data in finance are significantly time-dependent. Therefore, for days when the intraday data is missing fewer than 20 data points, we fill NA/NaN values using forward filling, which propagates the last valid observation forward to the next valid one. If one day's data is missing more than 20 data points, we simply do not consider the intraday data from that day. This ensures that the input variables are consistent in time. The cleaned closing price is shown in Figure 4: the blue part, ten years of price data, is used as the training data, and the red part, one year of data, is used as the test set.
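A sketch of this cleaning step with pandas is given below; the one-row-per-minute layout and the 20-point cutoff follow the description above, while the file name, column names and the reindexing convention are placeholder assumptions.

```python
import pandas as pd

# Placeholder input: minute bars with a DatetimeIndex and a 'Close' column.
intraday = pd.read_csv("spx_minute.csv", index_col=0, parse_dates=True)

cleaned_days = []
for day, bars in intraday.groupby(intraday.index.date):
    # Reindex to a regular 390-minute trading grid for the day, so that
    # missing minutes show up as NaN rows.
    grid = pd.date_range(bars.index[0], periods=390, freq="1min")
    bars = bars.reindex(grid)
    n_missing = bars["Close"].isna().sum()
    if n_missing > 20:            # too incomplete (includes half days): drop the day
        continue
    cleaned_days.append(bars.ffill())   # propagate the last valid observation forward

cleaned = pd.concat(cleaned_days)
```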
Figure 4: Wavelet spectrum in timeline

Then we consider log-returns, obtained by taking the difference of log-transformed prices at two consecutive points in time in the raw price time series [1, 5, 6]. Figure 3 (a, b) shows the raw price time series and the log-return time series.
Figure 3 shows the wavelet power spectrum. The abscissa displays time (390 minutes), and the y-axis is log-scaled because of the wide range of power spectrum values. The shaded region in the image is the cone of influence (COI): the scalogram is potentially affected by edge-effect artifacts, and the unshaded region is a confidence area that should not be influenced by edge effects [22, 18]. The wavelet transform is time-sensitive and provides an image representation from which the convolutional neural network can learn to recognise and extract hidden patterns regarding the underlying market state.
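A minimal sketch of how such a scalogram image can be produced with PyWavelets and Matplotlib is shown below. The log-return step, Morlet wavelet and one-minute sampling follow the text; the number of scales and the greyscale rendering are illustrative assumptions.

```python
import numpy as np
import pywt
import matplotlib.pyplot as plt

# Placeholder: denoised minute prices for the first 360 minutes of a day.
prices = np.cumsum(np.random.randn(361)) + 100
log_returns = np.diff(np.log(prices))            # log-return series (length 360)

scales = np.arange(1, 65)                        # illustrative scale range
coeffs, freqs = pywt.cwt(log_returns, scales, "morl", sampling_period=1.0)
power = np.abs(coeffs) ** 2                      # wavelet power spectrum

# Greyscale scalogram: time on the x-axis, scale on a log-scaled y-axis.
plt.imshow(power, aspect="auto", cmap="gray",
           extent=[0, len(log_returns), scales[-1], scales[0]])
plt.yscale("log")
plt.xlabel("time (minutes)")
plt.ylabel("scale")
plt.savefig("wavelet_spectrum.png")
```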
We have generated different y labels, shown in Figure 8 and Figure 5. Figure 5 is the log-return between the average price from minute 1 to minute 360 of the window and the price at the last minute of the day. The reason we choose y_mean as the classification label is that, compared with other labels, it yields a broader distribution, which should translate into a greater margin for a good prediction from the convolutional neural network. Moreover, this label is more practical: compared with the forecast given by the other label designs, it has more potential investment value.

Figure 5: Return frequency distribution histogram of y_mean. The x-axis is the return value and the y-axis is the count.

Feature selection refers to the identification of a set of prominent features for the task under study, selected according to a-priori criteria and preliminary investigations. We used the Maximum Information Coefficient (MIC) method to select the top five training indicators as the input features. The mutual information (MI) of two random variables is a measure of the mutual dependence between the two variables [23]. The formula for calculating the maximum information coefficient is as follows:

$$\mathrm{MIC}[x; y] = \max_{|X||Y| < B(n)} \frac{I[X; Y]}{\log_2 \min(|X|, |Y|)} \quad (6)$$

where the maximum is taken over grids partitioning x and y into |X| and |Y| bins, with B(n) a bound on the grid size.
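One way to compute such MIC scores is with the minepy package, as sketched below; the package choice, its parameters, the indicator names and the ranking loop are placeholder assumptions, not part of the original study.

```python
import numpy as np
from minepy import MINE   # assumed MIC implementation (minepy package)

def mic_score(x, y):
    """Maximum Information Coefficient between a candidate indicator x
    and the target series y."""
    mine = MINE(alpha=0.6, c=15)     # default minepy parameters
    mine.compute_score(x, y)
    return mine.mic()

# Placeholder candidate indicators (arrays aligned with the target).
rng = np.random.default_rng(0)
target = rng.normal(size=500)
indicators = {"EMA": rng.normal(size=500),
              "RSI": rng.normal(size=500),
              "MA": rng.normal(size=500)}

ranked = sorted(indicators, key=lambda k: mic_score(indicators[k], target),
                reverse=True)
print("indicators ranked by MIC:", ranked)
```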
Table 1: The top five indicators selected by the MIC method: EMA, RSI, MA, CORREL, PriceChangeRatio.
In this study, the market price prediction problem is treated as a classification problem with two classes, so that the output of the model is simply given by two labels, 'Up' or 'Down', that provide a prediction for the price movement during the last interval of a trading day. The trading time of a US stock market on a normal trading day is 390 minutes. The 30 minutes after the opening and before the close are the most dramatic and uncertain [24]. In the half-hour before the close, traders may need to close their positions, make sure they execute an order, process new information from the day, and act more or less rationally based on the daily trend, making predictions over the last price movements very challenging. In our experiment we use data from the first 360 minutes of a trading day as input to predict the closing market state, jumping beyond the unpredictable 30 minutes before the closing time. The 'Up' or 'Down' label is calculated by comparing the average price over the first 360 minutes with the closing price of the stock market:

$$\text{label } y = \begin{cases} 1, & \frac{1}{360}\sum_{i=1}^{360} \Psi(i) < \Psi(390) \\ 0, & \frac{1}{360}\sum_{i=1}^{360} \Psi(i) \ge \Psi(390) \end{cases} \quad (7)$$

where Ψ is the price signal.

The reasons we choose a CNN over other neural networks for image classification are three specific properties. The first property is locality: some patterns are much smaller than the whole image, and a set of neurons does not have to see the whole image to discover a pattern. The second property is parameter sharing: the same patterns may appear in different regions, and these patterns may have the same shape and therefore the same parameters, i.e. network weights and biases; in CNNs, neurons can share parameters to reduce their overall number. The third property is subsampling: we can subsample the pixels, reducing the number of parameters needed to process the image.
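The paper does not spell out the exact architectures of its shallow and deep CNNs, so the snippet below is only an illustrative shallow binary classifier on multi-channel spectrogram images using Keras; the input size, filter counts and training settings are assumptions.

```python
import numpy as np
from tensorflow.keras import layers, models

H, W, C = 64, 360, 5      # assumed image height, width (minutes), indicator channels

model = models.Sequential([
    layers.Input(shape=(H, W, C)),
    layers.Conv2D(16, (3, 3), activation="relu", padding="same"),   # local patterns
    layers.MaxPooling2D((2, 2)),                                    # subsampling
    layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),   # P('Up')
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Placeholder training data: spectrogram images and 'Up'/'Down' labels.
x_train = np.random.rand(32, H, W, C).astype("float32")
y_train = np.random.randint(0, 2, size=32)
model.fit(x_train, y_train, epochs=1, batch_size=8)
```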
Figure 6: The prediction results of our model (left) and of random prediction (right). Green corresponds to a correct prediction, and red to a wrong prediction.
From Tables 2 and 3 below, we can see that this algorithm has a high accuracy rate compared to random prediction. Figure 6 shows more clearly the prediction results of our method and of random prediction on the test set: green corresponds to correct predictions, and red to wrong predictions. Although the true negative rate (TNR) of the denoised signal decreases compared with the original signal, the overall prediction accuracy and F1 score improve. Taking into account the noise of financial data and the variability of samples, different experimental designs and label choices will affect the accuracy. The performance of this model on the S&P 500 index is competitive.
Denoised signal
              Actual 1    Actual 0
Predicted 1   TP = 104    FP = 67
Predicted 0   FN = 43     TN = 45
Loss 0.722588   Accuracy 0.577220   TPR 0.707483   TNR 0.401786   F1 score 0.654088

Raw signal
              Actual 1    Actual 0
Predicted 1   TP = 70     FP = 50
Predicted 0   FN = 77     TN = 62
Loss 0.893730   Accuracy 0.507772   TPR 0.476190   TNR 0.553571   F1 score 0.524345

Table 2: Confusion matrix and accuracy.

Random prediction
              Actual 1    Actual 0
Predicted 1   TP = 58     FP = 50
Predicted 0   FN = 89     TN = 62
Loss /   Accuracy 0.463320   TPR 0.394558   TNR 0.553571   F1 score 0.454902

Table 3: Random prediction.
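For reference, the short sketch below shows how the reported rates follow from the confusion-matrix counts (using the denoised-signal column of Table 2); the log-loss cannot be recovered from the counts alone, so it is omitted.

```python
# Confusion-matrix counts for the denoised signal (Table 2).
tp, fp, fn, tn = 104, 67, 43, 45

accuracy = (tp + tn) / (tp + fp + fn + tn)
tpr = tp / (tp + fn)                      # true positive rate (recall)
tnr = tn / (tn + fp)                      # true negative rate
precision = tp / (tp + fp)
f1 = 2 * precision * tpr / (precision + tpr)

print(f"accuracy={accuracy:.4f}  TPR={tpr:.4f}  TNR={tnr:.4f}  F1={f1:.4f}")
# TPR, TNR and F1 match the values reported in Table 2 to rounding.
```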
The methodology developed in this study classifies the stock market on a given day into two basic market states, 'Up' and 'Down', providing better-than-random predictions for the market state, measured as the price movement in the last time interval of a given trading day. This study defines and addresses a specific classification task where machine learning can achieve superhuman performance in finance, i.e. the prediction of the final price movement in a given market day based on the return time series observed earlier in the day, denoised and wavelet-transformed in order to be processed as an image by a convolutional neural network.

The model uses the discrete wavelet transform to reduce noise. Then a continuous wavelet transform is applied to the denoised signal to generate a spectrogram. We performed this processing for multiple indicators to obtain multiple spectrograms, which are then converted into a multi-channel 2D image used as the input of a convolutional neural network that predicts the final market state of the given trading day. The model provides accurate predictions on the S&P 500 index when compared with a random null model.
The promising results observed in this challenging financial context - with hardly predictable data and a limited set of relevant features - constitute a solid basis for further applications of this method to other noisy sequential data characterising complex systems. The method has been shown to overcome limitations for noisy time series with low predictability, and could outperform other methodologies for more predictable data, such as biological and medical data, e.g. ECG signals, or weather and climate data, e.g. wind speed and temperature time series.
Figure 7 shows the four mother wavelets. The Haar wavelet (a) and the Daubechies wavelet of order 4 (c) are discrete wavelets; the Gaussian wavelet of order 1 (b) and the Morlet wavelet (d) are continuous wavelets.

Figure 7: Four commonly used mother wavelets: (a) Haar wavelet, (b) Gaussian wavelet of order 1, (c) Daubechies wavelet of order 4, and (d) Morlet wavelet [25].
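These wavelet functions can be inspected directly in PyWavelets, e.g. as in the short sketch below; the plotting layout is an assumption.

```python
import pywt
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

# Discrete wavelets: wavefun returns the scaling function phi, the wavelet
# psi and the x grid.
for ax, name in zip(axes[0], ["haar", "db4"]):
    phi, psi, x = pywt.Wavelet(name).wavefun(level=8)
    ax.plot(x, psi)
    ax.set_title(name)

# Continuous wavelets: wavefun returns psi and the x grid.
for ax, name in zip(axes[1], ["gaus1", "morl"]):
    psi, x = pywt.ContinuousWavelet(name).wavefun(precision=10)
    ax.plot(x, psi)     # PyWavelets' 'morl' is real-valued
    ax.set_title(name)

fig.savefig("mother_wavelets.png")
```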
Hard Thresholding
$$\sigma^{H}_{\lambda}(w) = \begin{cases} w, & |w| \ge \lambda \\ 0, & |w| < \lambda \end{cases} \quad (8)$$

Soft Thresholding
$$\sigma^{s}_{\lambda}(w) = \begin{cases} \mathrm{sgn}(w)\,(|w| - \lambda), & |w| \ge \lambda \\ 0, & |w| < \lambda \end{cases} \quad (9)$$

Here w is the wavelet coefficient (detail coefficient) and λ is the selected threshold.

The rigrsure threshold is an adaptive threshold selection using the principle of Stein's Unbiased Risk Estimate (SURE).
(1) Take the absolute values of the elements of the signal Ψ[t], sort them from small to large, and square each element to obtain a new sequence f(k) [26]:
$$f(k) = (\mathrm{sort}(|\Psi|))^{2}, \quad k = 0, 1, \ldots, N-1 \quad (10)$$
(2) If the threshold is taken as the square root of the k-th element of f(k),
$$\lambda_{k} = \sqrt{f(k)}, \quad k = 0, 1, \ldots, N-1 \quad (11)$$
the risk generated by this threshold is
$$\mathrm{Risk}(k) = \left[ N - 2k + \sum_{j=1}^{k} f(j) + (N-k)\, f(N-k) \right] / N \quad (12)$$
(3) According to the obtained risk curve Risk(k), take the $k_{\min}$ corresponding to the minimum-risk point; the rigrsure threshold is then
$$\lambda = \sqrt{f(k_{\min})} \quad (13)$$

Figure 8 shows the return distributions for another four label designs: the 360th-minute price compared with the 361st-minute price; the 360th-minute price compared with the average price of the last 30 minutes; the 360th-minute price compared with the 390th-minute (market closing) price; and y_mean-mean, the average price from the 1st to the 360th minute compared with the average price of the last 30 minutes.

Figure 8: The return frequency distributions for another four label designs.

References
[1] Pier Francesco Procacci and Tomaso Aste. Forecasting market states. Quantitative Finance, 19(9):1491-1498, 2019.
[2] Sreelekshmy Selvin, R Vinayakumar, EA Gopalakrishnan, Vijay Krishna Menon, and KP Soman. Stock price prediction using LSTM, RNN and CNN-sliding window model. In , pages 1643-1647. IEEE, 2017.
[3] James J Murphy. Technical Analysis of the Financial Markets. New York Institute of Finance, 1999.
[4] Ruey Tsay. Financial Time Series and Their Characteristics, pages 1-27. Wiley, Hoboken, New Jersey, 08 2010.
[5] C Lee Giles, Steve Lawrence, and Ah Chung Tsoi. Noisy time series prediction using recurrent neural networks and grammatical inference. Machine Learning, 44(1-2):161-183, 2001.
[6] Ashwin Siripurapu. Convolutional networks for stock trading. Stanford Univ Dep Comput Sci, 1(2):1-6, 2014.
[7] Shihao Gu, Bryan Kelly, and Dacheng Xiu. Empirical asset pricing via machine learning. Technical report, National Bureau of Economic Research, 2018.
[8] M Portnoff. Time-frequency representation of digital signals and systems based on short-time Fourier analysis. IEEE Transactions on Acoustics, Speech, and Signal Processing, 28(1):55-69, 1980.
[9] Daniel Griffin and Jae Lim. Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243, 1984.
[10] Marie Farge. Wavelet transforms and their applications to turbulence. Annual Review of Fluid Mechanics, 24(1):395-458, 1992.
[11] Francis X Diebold. Elements of Forecasting. Citeseer, 1998.
[12] Ingrid Daubechies. The wavelet transform, time-frequency localization and signal analysis. IEEE Transactions on Information Theory, 36(5):961-1005, 1990.
[13] David L Donoho and Iain M Johnstone. Ideal spatial adaptation by wavelet shrinkage. Biometrika, 81(3):425-455, 1994.
[14] Mark J Shensa. The discrete wavelet transform: wedding the à trous and Mallat algorithms. IEEE Transactions on Signal Processing, 40(10):2464-2482, 1992.
[15] David L Donoho and Iain M Johnstone. Adapting to unknown smoothness via wavelet shrinkage. Journal of the American Statistical Association, 90(432):1200-1224, 1995.
[16] Z Tufekci and John N Gowdy. Feature extraction using discrete wavelet transform for speech recognition. In Proceedings of the IEEE SoutheastCon 2000, 'Preparing for The New Millennium' (Cat. No. 00CH37105), pages 116-123. IEEE, 2000.
[17] Ingrid Daubechies. Orthonormal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics, 41(7):909-996, 1988.
[18] Christopher Torrence and Gilbert P Compo. A practical guide to wavelet analysis. Bulletin of the American Meteorological Society, 79(1):61-78, 1998.
[19] Prabhishek Singh and Raj Shree. Statistical quality analysis of wavelet based SAR images in despeckling process. Asian J. Electrical Sci. (AJES), 6(2):1-18, 2017.
[20] Marc Antonini, Michel Barlaud, Pierre Mathieu, and Ingrid Daubechies. Image coding using wavelet transform. IEEE Transactions on Image Processing, 1(2):205-220, 1992.
[21] S&P Dow Jones Indices. S&P US indices methodology, 2019.
[22] Aslak Grinsted, John C Moore, and Svetlana Jevrejeva. Application of the cross wavelet transform and wavelet coherence to geophysical time series. Nonlinear Processes in Geophysics, 11(5/6):561-566, 2004.
[23] Ehsan Asgarian, Mohsen Kahani, and Shahla Sharifi. The impact of sentiment features on the sentiment polarity classification in Persian reviews. Cognitive Computation, 10(1):117-135, 2018.
[24] Jean-Philippe Bouchaud, Marc Mézard, Marc Potters, et al. Statistical properties of stock order books: empirical results and models. Quantitative Finance, 2(4):251-256, 2002.
[25] Jack W Baker. Quantitative classification of near-fault ground motions using wavelet analysis. Bulletin of the Seismological Society of America, 97(5):1486-1501, 2007.
[26] Daniel Valencia, David Orejuela, Jeferson Salazar, and Jose Valencia. Comparison analysis between rigrsure, sqtwolog, heursure and minimaxi techniques using hard and soft thresholding methods. In 2016 XXI Symposium on Signal Processing, Images and Artificial Vision (STSIVA)