Image Processing Tools for Financial Time Series Classification
Bairui Du
Department of Computer Science, Faculty of Engineering, University College London, Gower Street, London, WC1E 6BT, UK. [email protected]

Delmiro Fernandez-Reyes
Department of Computer Science, Faculty of Engineering, University College London, Gower Street, London, WC1E 6BT, UK. [email protected]

Paolo Barucca ∗
Department of Computer Science, Faculty of Engineering, University College London, Gower Street, London, WC1E 6BT, UK. [email protected]
August 8, 2020

Abstract
The application of deep learning to time series forecasting is one of the major challenges in present machine learning. We propose a novel methodology that combines machine learning and image processing methods to define and predict market states with intraday financial data. A wavelet transform is applied to the log-return of stock prices for both image extraction and denoising. A convolutional neural network then extracts patterns from the denoised wavelet images to classify daily time series, i.e. a market state is associated with the binary prediction of the daily close price movement based on the wavelet image constructed from the price changes in the first hours of the day. This method overcomes the low signal-to-noise ratio problem in financial time series and achieves a competitive prediction accuracy for the market states 'Up' and 'Down' of financial data, as tested on the S&P 500.

Keywords: Continuous and discrete wavelet transform | Image processing | Financial computing | Financial time series classification | Convolutional neural network
Time series prediction is a challenge for many complex systems, yet in finance predictions are hindered by the very nature of how financial markets work. In efficient markets, the opportunities for stock price predictions leading to profitable trades are supposed to rapidly disappear. In the growing industry of high-frequency trading, the competition over extracting predictions on stock prices from the increasing amount of available information for performing profitable trades is becoming more and more severe. With the development of big data analysis and advanced deep learning methodologies, traders hope to fruitfully analyse market information, e.g. price time series, through machine learning. Spot prices of stocks provide a simple snapshot representation of a financial market. Stock prices fluctuate over time, affected by numerous factors, and the prediction of their changes is at the core of both long-term and short-term financial investing. The collective patterns of price movements are generally referred to as market states. As a paramount example, when stock prices follow an upward trend, it is called a bull market, and when stock prices follow a downward trend, it is called a bear market [1].

However, in both bullish and bearish market trends there are many noisy oscillations, requiring analysts to apply noise reduction methodologies to extract meaningful predictions over trends. The objective of this study is to test a general time-series prediction model which extracts a denoised wavelet image from a time series in order to leverage convolutional neural network (CNN) architectures. We apply continuous wavelet transforms to the log-returns of financial time series and convert them to a greyscale wavelet transform spectrum. We then build one shallow and one deep convolutional neural network (CNN) model and train them with the spectrum images as input to capture hidden patterns. The main novelty and contribution of our study is to define stock market states based on intraday financial time series whilst also providing accurate predictions, demonstrating the ability of image processing tools to overcome the low signal-to-noise ratio of financial time series, and providing a promising toolbox for the analysis of noisy time series found in many complex systems.

∗ Corresponding Author: Paolo Barucca, Financial Computing and Analytics Group, Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK. Email: [email protected]
Market states forecasting is based on the analysis of historical data [2], yet as summarised in [3], the quest for predictions in financial data needs to take into account a series of empirical laws of financial markets: "Market action discounts everything; Prices move in trends; History tends to repeat itself". In this paper, we look at different financial indicators and analyse them to identify patterns, trends, periods or cycles. Compared to other time series, financial time series display a significant amount of uncertainty and unpredictability [4]. As the raw price time series will often contain a trend, using log-returns instead of prices is an established method to transform raw data [5, 1, 6], returns being a good scale-free summary of the outcome of investment decisions within a given time interval. In most quantitative financial research and applications, log-returns are regarded as more tractable, as having more robust and characteristic statistical properties, e.g. probability distribution over a given period, and, from a practical point of view, it is possible to quickly produce multi-period returns from single-period ones [1, 4].

Financial forecasting can be framed as a signal processing problem [5] on which neural networks can be applied to provide testable solutions. In [5] the authors consider log-returns of stock prices, denoise the log-return time series, and apply a self-organizing map (SOM) - an unsupervised learning method that learns the distribution of a set of patterns without any class information - to make predictions. More recently, researchers have tried to apply deep learning methods such as convolutional and recurrent neural networks to predict the behavior of stock markets. Advanced machine learning methods have a high representation power for empirical asset pricing, being able to re-create arbitrarily complex non-linear multi-variate functions and not requiring an arbitrary feature selection pre-processing that could dilute the information content of the original time series [7]. In [7] the authors perform a comparative analysis of different methods such as simple linear and penalised linear models, principal components regression (PCR), partial least squares (PLS), regression trees and random forests, and find that regression trees have the best prediction accuracy [7]. They consider a continuous-variable regression problem rather than a classification problem. At odds with results in image and bio-metric pattern recognition, where the deeper the neural network the better, they find that shallow learning outperforms deeper learning, the reasons for this phenomenon being (1) the low signal-to-noise ratio of financial data and (2) the comparative scarcity of the data for the price prediction problem [7, 6]. They also show that, compared to traditional prediction methods, machine learning provides an improved description of the behavior of expected returns.

In order to reduce the complexity and improve the accuracy of the forecast, we consider the stock forecasting problem not as a regression problem, but as a classification problem for determining a market state [1]. In [1] the authors give a model-based clustering method which clusters financial time series via a maximum-likelihood model. Their clustering procedure uses a likelihood measure adjusted for temporal coherence. This procedure is shown to be numerically efficient and suitable for high-dimensional datasets, alternating (a) the update of the network structure constructed by the TMFG-LoGo algorithm and (b) the assignment of points to clusters in a time-consistent manner through dynamic programming, i.e. using the Viterbi algorithm [1]. The model both identifies the current market state, i.e. bull or bear market, and yields predictions for future market states. The final values of accuracy for these two cases are above 50%. Our study combines the time series prediction task explored in deep learning applications in finance and the market state identification problem investigated within statistical approaches to derive a predictive classification of the intraday behavior of financial indices, introducing a methodology for time series analysis applicable in the broader context of complex systems.
Figure 1: Flowchart of using the wavelet transform and a CNN for financial time series prediction
The Fourier transform is a powerful data analysis tool that represents any complex signal as a sum of sines and cosines and transforms the signal from the time domain to the frequency domain [8]. Nevertheless, the Fourier transform can show which frequencies are present in the signal, whilst it cannot show when these frequencies appear. The short-time Fourier transform divides the original signal into several parts using a sliding window to fix this problem [9].

The wavelet transform is a more suitable method for analysing dynamic signals, as it identifies the frequencies present in the original signal and also when these frequencies appear and disappear, by controlling the scale change of the wavelet. Therefore, the wavelet transform yields a high resolution in both the frequency domain and the time domain. A wavelet is a rapidly decaying, wave-like oscillation that has zero mean and is localised in both time and frequency space [10]. Unlike sine waves, which extend and repeat to infinity, a wavelet exists for a finite duration.

Real-world signals do not always change slowly, and they often oscillate or exhibit transient changes. In financial time series analysis, these abrupt changes can be associated with turning points which can be crucial for stock market forecasting. However, high-frequency noise may hinder the detection of these turning points. In most cases, noise is modelled as Gaussian white noise [11, 1]. The Fourier transform does not represent these abrupt changes efficiently, because it does not consider time-dependence in the signal decomposition. Daubechies points out that the wavelet transform can be used to analyse time series that contain non-stationary power at many different frequencies [12].
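As a minimal illustration of this point (not part of the original experiments), the sketch below, using NumPy and PyWavelets, builds a toy signal with an abrupt high-frequency burst: the FFT reports which frequencies are present, while the CWT scalogram also localises the burst in time. The signal, scales and wavelet choice are illustrative assumptions.

```python
import numpy as np
import pywt

# Toy signal: a 5 Hz sine with a short high-frequency burst in the middle.
fs = 200                                   # sampling frequency (Hz), illustrative
t = np.arange(0, 4, 1 / fs)
signal = np.sin(2 * np.pi * 5 * t)
signal[400:420] += np.sin(2 * np.pi * 40 * t[400:420])   # transient burst near t = 2 s

# Fourier transform: frequency content only, no time localisation.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)
print("dominant FFT frequency:", freqs[np.argmax(spectrum)])

# Continuous wavelet transform: coefficients indexed by (scale, time).
scales = np.arange(1, 64)
coeffs, cwt_freqs = pywt.cwt(signal, scales, "morl", sampling_period=1 / fs)
power = np.abs(coeffs) ** 2
# The time index with the strongest small-scale (high-frequency) power
# points at the location of the burst.
print("burst located near t =", t[np.argmax(power[:10].sum(axis=0))], "s")
```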
The wavelet transform threshold denoising method was first proposed in [13]. The basic idea of wavelet threshold denoising is that after the signal is transformed (e.g. using the Mallat algorithm) [14, 15], it is further decomposed into approximation coefficients and detail coefficients. The detail coefficients are also called wavelet coefficients. The wavelet coefficients with larger amplitudes are assumed to be significant for representing the original signal, while coefficients with smaller amplitudes are generally associated with noise [14]. The threshold denoising method finds a suitable threshold, retains the wavelet coefficients larger than the threshold, filters out the wavelet coefficients smaller than the threshold accordingly, and then restores the denoised signal according to the processed wavelet coefficients [13].
Figure 2: DWT decomposition into approximation coefficients and detail coefficients (wavelet coefficients)

Wavelet denoising can also be regarded as a low-pass filter. It removes high-frequency noise while retaining the characteristics of the low-frequency components of the signal. Hence, wavelet denoising is a combination of feature extraction and low-pass filtering. The wavelet transform has good time-frequency localisation characteristics, which can preserve relevant signal spikes and sudden changes [12, 16]. Therefore, the wavelet transform is suitable for removing transient signals, as well as suppressing the interference of high-frequency noise, and effectively distinguishing low-frequency information from high-frequency noise.
Decomposition

$$x_{j+1,L}[n] = \sum_{k=0}^{K-1} x_{j,L}[2n-k]\, g[k], \qquad x_{j+1,H}[n] = \sum_{k=0}^{K-1} x_{j,L}[2n-k]\, h[k] \quad (1)$$

Here x[n] is the discrete input signal of length N; g[n] is a low-pass filter that filters out the high-frequency part of the input signal and outputs the low-frequency part; h[n] is a high-pass filter that filters out the low-frequency part and outputs the high-frequency part. The theoretical maximum decomposition level for wavelet transforms is $j = \lfloor \log_2 n \rfloor$, where n is the signal length. The larger the decomposition level, the more evident the different characteristics of noise and signal, and the more conducive to the separation of signal and noise; yet for reconstruction, the higher the number of decomposition levels, the greater the reconstruction error. The available maximum decomposition level is related to the signal-to-noise ratio (SNR) of the original signal, but the SNR cannot be obtained from the measured data. In order to avoid signal distortion and achieve the best noise reduction effect, noise verification is performed on the DWT detail coefficients.

The Daubechies wavelets, based on the work of Ingrid Daubechies [17], are a family of orthogonal wavelets defining a discrete wavelet transform and characterised by a maximal number of vanishing moments for a given support [12]. We use the db4 (Daubechies wavelet of order 4) wavelet shown in Figure 7 (c) as the mother wavelet to perform a level-5 DWT. The approximation and detail coefficients are shown in Figure 2. The approximation coefficients represent the output of the low-pass filter (averaging filter) of the DWT, while the detail coefficients represent the output of the high-pass filter (difference filter). The difference between the CWT and the DWT is that the DWT uses discrete values for the scale a and the translation factor b; the DWT is only discrete in the scale and translation domain, not in the time domain. On the left of Figure 2, we can see a schematic representation of the DWT applied to the signal as a low-pass filter at each level. The detail coefficients of levels 1 and 2 are the most consistent with white Gaussian noise characteristics.
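A minimal sketch of this decomposition step using PyWavelets is given below; the db4 wavelet and level 5 match the description above, while the input series is a placeholder.

```python
import numpy as np
import pywt

# Placeholder intraday series; in the paper this would be the price (or
# log-return) series of the first 360 minutes of a trading day.
signal = np.cumsum(np.random.randn(360))

wavelet = pywt.Wavelet("db4")

# Theoretical bound on the decomposition level, given the signal length
# and the filter length of the chosen wavelet.
max_level = pywt.dwt_max_level(len(signal), wavelet.dec_len)
print("maximum decomposition level:", max_level)

# Level-5 DWT: returns [cA5, cD5, cD4, cD3, cD2, cD1], i.e. one set of
# approximation coefficients and five sets of detail (wavelet) coefficients.
coeffs = pywt.wavedec(signal, wavelet, level=5)
approx, details = coeffs[0], coeffs[1:]
print("approximation length:", len(approx))
print("detail lengths:", [len(d) for d in details])
```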
Threshold Denoising and Reconstruction

Figure 3: 2D wavelet transform spectrum with COI (Cone of Influence) after denoising

Threshold denoising performs nonlinear thresholding on the wavelet transform coefficients of the measured signal. The high-frequency coefficients of each layer, from the 1st to the Nth layer, are filtered by the threshold function, and the low-frequency coefficients of each layer are left unchanged. The hard threshold function (8) is discontinuous at the threshold λ, which causes the denoised signal to oscillate around it, yet a hard threshold function can perform better than a soft threshold function in terms of mean square error. In this study we apply soft thresholding (9), prioritising the overall continuity of the wavelet coefficients ensured by a continuous function around the threshold.

In the threshold processing function, the selection of the threshold λ directly affects the denoising performance. There are four types of threshold rules for the wavelet transform threshold denoising method: general threshold rules, minimax (minimum-maximum variance) threshold rules, Stein's Unbiased Risk Estimate (SURE) rules, and heuristic threshold rules. In this study we use the rigrsure threshold method, which is an adaptive threshold selection based on the principle of Stein's unbiased risk estimation. It first estimates the risk for different λ values, and then minimises it to obtain the selected threshold. Then, we reconstruct the signal from the filtered wavelet coefficients. The signal is reconstructed from the low-frequency coefficients of the Nth layer of the wavelet decomposition and the processed high-frequency coefficients of the first N-1 layers, so as to obtain the denoised values of the original signal. The blue signals in Figures 2 and 3 are the original signal, and the orange signals are the signal after noise reduction. Finally, we perform the log-return transformation on the price signal, as shown in Figure 3.
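A sketch of the full denoise-and-reconstruct step with PyWavelets is shown below. The threshold selection follows the standard SURE ("rigrsure") risk minimisation (the appendix gives the authors' own formulation); this is a sketch under those assumptions, not the authors' exact code.

```python
import numpy as np
import pywt

def rigrsure_threshold(detail):
    """Adaptive threshold via Stein's Unbiased Risk Estimate (SURE):
    sort the squared coefficients, evaluate the risk of each candidate
    threshold and keep the minimiser."""
    n = len(detail)
    f = np.sort(np.abs(detail)) ** 2          # squared coefficients, ascending
    cumsum = np.cumsum(f)
    k = np.arange(1, n + 1)
    risk = (n - 2 * k + cumsum + (n - k) * f) / n
    k_min = np.argmin(risk)
    return np.sqrt(f[k_min])

def wavelet_denoise(signal, wavelet="db4", level=5):
    """DWT -> soft-threshold the detail coefficients -> inverse DWT."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    denoised = [coeffs[0]]                    # keep the approximation coefficients
    for detail in coeffs[1:]:
        lam = rigrsure_threshold(detail)
        denoised.append(pywt.threshold(detail, lam, mode="soft"))
    return pywt.waverec(denoised, wavelet)[: len(signal)]

# Example on a synthetic noisy price path (placeholder data).
prices = np.cumsum(np.random.randn(360)) + 100
clean = wavelet_denoise(prices)
```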
Different applications may require different mother wavelets, and there are two important wavelet transform concepts to be considered: scaling and shifting. Given a signal Ψ(t), scaling refers to the process of stretching or shrinking the signal in time [18], which can be expressed as

$$\Psi\!\left(\frac{t}{s}\right), \quad s > 0 \quad (2)$$

where s is the scaling factor that represents how much the signal is rescaled in time. The scale factor is inversely proportional to frequency: in a wavelet, there is a reciprocal relationship between scale and frequency with a constant of proportionality (COP) [19]. The mother wavelet has a characteristic frequency band. Mathematically, the equivalent frequency is defined as

$$F_{eq} = \frac{C_f}{s\,\delta t} \quad (3)$$

where $C_f$ represents the center frequency of the mother wavelet, s the wavelet scale, and δt the sampling interval.
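The relationship in Eq. (3) is, for instance, what PyWavelets' scale2frequency helper implements; a minimal check, with an assumed one-minute sampling interval:

```python
import pywt

dt = 60.0                                  # sampling interval: 1 minute in seconds (assumption)
scale = 8.0

cf = pywt.central_frequency("morl")        # C_f of the Morlet mother wavelet
f_eq = cf / (scale * dt)                   # Eq. (3): equivalent frequency in Hz

# Same quantity via the library helper (which returns cycles per sample).
f_eq_lib = pywt.scale2frequency("morl", scale) / dt
print(f_eq, f_eq_lib)                      # the two values agree
```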
Continuous and Discrete Wavelet Transforms are the two major wavelet analysis methods. The CWT is mainly used for time-frequency analysis and for filtering time-localised frequency components. The DWT is ideal for denoising and compressing signals and images, as it helps represent many naturally occurring signals and images with fewer coefficients [18]. The difference between the CWT and the DWT is how they discretise the scale and the translation parameters [20]. The CWT of a discrete sequence $x_n$ is defined as the convolution of $x_n$ with a scaled and translated version of the wavelet $\psi(\eta)$ [18]:

$$W_n(s) = \sum_{n'=0}^{N-1} x_{n'}\, \psi^{*}\!\left[\frac{(n'-n)\,\delta t}{s}\right] \quad (4)$$

The wavelets in Figure 7 are some of the well-known mother wavelets, and they have different sizes and shapes. We use the Morlet wavelet to generate the power spectrum of the denoised signals. Equation (5) expresses the Morlet wavelet used in this study: a plane wave modulated by a Gaussian, where $\omega_0$ is a non-dimensional frequency and t is a non-dimensional "time" parameter,

$$\Psi(t) = \pi^{-1/4}\, e^{i\omega_0 t}\, e^{-t^2/2} \quad (5)$$

The output of the CWT is a set of coefficients, which are a function of scale, frequency and time [20]. The higher the number of scales per octave, the finer the scale discretisation. When we compute the CWT, each scaled wavelet is shifted in time and compared with the original signal; repeating this process for all the scales results in coefficients that are a function of the scales and of the shift distance of the wavelet [18]. For example, a signal with 10,000 samples analysed with 50 scales will generate 500,000 coefficients. In this way, oscillatory behaviour in signals can be characterised in more detail.

We consider 505 stocks in the tickers list of the S&P 500, widely regarded as the best single gauge of large-cap U.S. equities [21]. (Eight of the stocks could not be downloaded for reasons related to corporate reorganisation and name changes.) First, we calculate the adjusted closing price and then we clean the data. The raw data of the S&P 500 index have 'Open', 'Close', 'High', 'Low' prices and 'Volume'. Given that only one year out of ten includes 'Volume' data, we could not include it as a feature in our analysis. Some days, when the U.S. market closes early or delays opening, only have half-day data points; these days, together with weekends and holidays, have been removed. Further, on a full trading day we should have 390 data points, but 127 intraday series turn out to be incomplete. At this point, we have to deal with invalid and missing values, which are due to the absence of transaction data for some minutes. The easiest way to do this is to replace the invalid and missing values with the sample mean, median, or mode of a variable. This method is simple but does not adequately consider the information already in the data, and the error may be significant. Sequential data in finance are significantly time-dependent. Therefore, for days when the intraday data is missing fewer than 20 data points, we fill NA/NaN values using forward filling, which propagates the last valid observation forward to the next valid one. If one day's data is missing more than 20 data points, we simply do not consider the intraday data from that day. This ensures that the input variables are consistent in time. The cleaned closing price is shown in Figure 4: the blue part, ten years of price data, is used as the training data, and the red part, one year of data, is used as the test set.
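A sketch of this cleaning step with pandas is given below; the one-row-per-minute layout and the 20-point cutoff follow the description above, while the file name, column names and the reindexing convention are placeholder assumptions.

```python
import pandas as pd

# Placeholder input: minute bars with a DatetimeIndex and a 'Close' column.
intraday = pd.read_csv("spx_minute.csv", index_col=0, parse_dates=True)

cleaned_days = []
for day, bars in intraday.groupby(intraday.index.date):
    # Reindex to a regular 390-minute trading grid for the day, so that
    # missing minutes show up as NaN rows.
    grid = pd.date_range(bars.index[0], periods=390, freq="1min")
    bars = bars.reindex(grid)
    n_missing = bars["Close"].isna().sum()
    if n_missing > 20:            # too incomplete (includes half days): drop the day
        continue
    cleaned_days.append(bars.ffill())   # propagate the last valid observation forward

cleaned = pd.concat(cleaned_days)
```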
Figure 4: Wavelet spectrum in timeline

Then we consider log-returns, obtained by taking the difference of log-transformed prices at two consecutive points in time in the raw price time series [1, 5, 6]. Figure 3 (a, b) shows the raw price time series and the log-return time series.
Figure 3 shows the wavelet power spectrum. The abscissa displays time (390 minutes), and the y-axis is log-scaled because of the wide range of power spectrum values. The shaded region in the image is the cone of influence (COI): the scalogram is potentially affected by edge-effect artifacts, and the unshaded region is a confidence area that should not be influenced by edge effects [22, 18]. The wavelet transform is time-sensitive and provides an image representation from which the convolutional neural network can learn to recognise and extract hidden patterns regarding the underlying market state.
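A minimal sketch of how such a scalogram image can be produced with PyWavelets and Matplotlib is shown below. The log-return step, Morlet wavelet and one-minute sampling follow the text; the number of scales and the greyscale rendering are illustrative assumptions.

```python
import numpy as np
import pywt
import matplotlib.pyplot as plt

# Placeholder: denoised minute prices for the first 360 minutes of a day.
prices = np.cumsum(np.random.randn(361)) + 100
log_returns = np.diff(np.log(prices))            # log-return series (length 360)

scales = np.arange(1, 65)                        # illustrative scale range
coeffs, freqs = pywt.cwt(log_returns, scales, "morl", sampling_period=1.0)
power = np.abs(coeffs) ** 2                      # wavelet power spectrum

# Greyscale scalogram: time on the x-axis, scale on a log-scaled y-axis.
plt.imshow(power, aspect="auto", cmap="gray",
           extent=[0, len(log_returns), scales[-1], scales[0]])
plt.yscale("log")
plt.xlabel("time (minutes)")
plt.ylabel("scale")
plt.savefig("wavelet_spectrum.png")
```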
We have generated different y labels, shown in Figure 8 and Figure 5. Figure 5 is the log-return between the average price from minute 1 to minute 360 of the window and the price at the last minute of the day. The reason we choose y_mean as the classification label is that, compared with other labels, it yields a broader distribution, which should translate into a greater margin for a good prediction from the convolutional neural network. Moreover, this label is more practical: compared with the forecast given by the other label designs, it has more potential investment value.

Figure 5: Return frequency distribution histogram of y_mean. The x-axis is the return value and the y-axis is the count.

Feature selection refers to the identification of a set of prominent features for the task under study, selected according to a-priori criteria and preliminary investigations. We used the Maximum Information Coefficient (MIC) method to select the top five training indicators as the input features. The mutual information (MI) of two random variables is a measure of the mutual dependence between the two variables [23]. The formula for calculating the maximum information coefficient is as follows:

$$\mathrm{MIC}[x; y] = \max_{|X||Y| < B(n)} \frac{I[X; Y]}{\log_2 \min(|X|, |Y|)} \quad (6)$$

where the maximum is taken over grids partitioning x and y into |X| and |Y| bins, with B(n) a bound on the grid size.
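One way to compute such MIC scores is with the minepy package, as sketched below; the package choice, its parameters, the indicator names and the ranking loop are placeholder assumptions, not part of the original study.

```python
import numpy as np
from minepy import MINE   # assumed MIC implementation (minepy package)

def mic_score(x, y):
    """Maximum Information Coefficient between a candidate indicator x
    and the target series y."""
    mine = MINE(alpha=0.6, c=15)     # default minepy parameters
    mine.compute_score(x, y)
    return mine.mic()

# Placeholder candidate indicators (arrays aligned with the target).
rng = np.random.default_rng(0)
target = rng.normal(size=500)
indicators = {"EMA": rng.normal(size=500),
              "RSI": rng.normal(size=500),
              "MA": rng.normal(size=500)}

ranked = sorted(indicators, key=lambda k: mic_score(indicators[k], target),
                reverse=True)
print("indicators ranked by MIC:", ranked)
```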
Table 1: The top five indicators selected by the MIC method: EMA, RSI, MA, CORREL, PriceChangeRatio.
In this study, the market price prediction problem is treated as a classification problem with two classes, so that the output of the model is simply given by two labels, 'Up' or 'Down', that provide a prediction for the price movement during the last interval of a trading day. The trading time of a US stock market on a normal trading day is 390 minutes. The 30 minutes after the opening and before the close are the most dramatic and uncertain [24]. In the half-hour before the close, traders may need to close their positions, make sure they execute an order, process new information from the day, and act more or less rationally based on the daily trend, making predictions over the last price movements very challenging. In our experiment we use data from the first 360 minutes of a trading day as input to predict the closing market state, jumping beyond the unpredictable 30 minutes before the closing time. The 'Up' or 'Down' label is calculated by comparing the average price over the first 360 minutes with the closing price of the stock market:

$$\text{label } y = \begin{cases} 1, & \frac{1}{360}\sum_{i=1}^{360} \Psi(i) < \Psi(390) \\ 0, & \frac{1}{360}\sum_{i=1}^{360} \Psi(i) \ge \Psi(390) \end{cases} \quad (7)$$

where Ψ is the price signal.

The reasons we choose a CNN over other neural networks for image classification are three specific properties. The first property is locality: some patterns are much smaller than the whole image, and a set of neurons does not have to see the whole image to discover a pattern. The second property is parameter sharing: the same patterns may appear in different regions, and these patterns may have the same shape and therefore the same parameters, i.e. network weights and biases; in CNNs, neurons can share parameters to reduce their overall number. The third property is subsampling: we can subsample the pixels, reducing the number of parameters needed to process the image.
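The paper does not spell out the exact architectures of its shallow and deep CNNs, so the snippet below is only an illustrative shallow binary classifier on multi-channel spectrogram images using Keras; the input size, filter counts and training settings are assumptions.

```python
import numpy as np
from tensorflow.keras import layers, models

H, W, C = 64, 360, 5      # assumed image height, width (minutes), indicator channels

model = models.Sequential([
    layers.Input(shape=(H, W, C)),
    layers.Conv2D(16, (3, 3), activation="relu", padding="same"),   # local patterns
    layers.MaxPooling2D((2, 2)),                                    # subsampling
    layers.Conv2D(32, (3, 3), activation="relu", padding="same"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),   # P('Up')
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Placeholder training data: spectrogram images and 'Up'/'Down' labels.
x_train = np.random.rand(32, H, W, C).astype("float32")
y_train = np.random.randint(0, 2, size=32)
model.fit(x_train, y_train, epochs=1, batch_size=8)
```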
Figure 6: The prediction results of our model (left) and of random prediction (right). Green corresponds to a correct prediction, and red to a wrong prediction.
From Tables 2 and 3 below, we can see that this algorithm has a high accuracy rate compared to random prediction. Figure 6 shows more clearly the prediction results of our method and of random prediction on the test set: green corresponds to correct predictions, and red to wrong predictions. Although the true negative rate (TNR) of the denoised signal decreases compared with the original signal, the overall prediction accuracy and F1 score improve. Taking into account the noise of financial data and the variability of samples, different experimental designs and label choices will affect the accuracy. The performance of this model on the S&P 500 index is competitive.
Denoised signal
              Actual 1    Actual 0
Predicted 1   TP = 104    FP = 67
Predicted 0   FN = 43     TN = 45
Loss 0.722588   Accuracy 0.577220   TPR 0.707483   TNR 0.401786   F1 score 0.654088

Raw signal
              Actual 1    Actual 0
Predicted 1   TP = 70     FP = 50
Predicted 0   FN = 77     TN = 62
Loss 0.893730   Accuracy 0.507772   TPR 0.476190   TNR 0.553571   F1 score 0.524345

Table 2: Confusion matrix and accuracy.

Random prediction
              Actual 1    Actual 0
Predicted 1   TP = 58     FP = 50
Predicted 0   FN = 89     TN = 62
Loss /   Accuracy 0.463320   TPR 0.394558   TNR 0.553571   F1 score 0.454902

Table 3: Random prediction.
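For reference, the short sketch below shows how the reported rates follow from the confusion-matrix counts (using the denoised-signal column of Table 2); the log-loss cannot be recovered from the counts alone, so it is omitted.

```python
# Confusion-matrix counts for the denoised signal (Table 2).
tp, fp, fn, tn = 104, 67, 43, 45

accuracy = (tp + tn) / (tp + fp + fn + tn)
tpr = tp / (tp + fn)                      # true positive rate (recall)
tnr = tn / (tn + fp)                      # true negative rate
precision = tp / (tp + fp)
f1 = 2 * precision * tpr / (precision + tpr)

print(f"accuracy={accuracy:.4f}  TPR={tpr:.4f}  TNR={tnr:.4f}  F1={f1:.4f}")
# TPR, TNR and F1 match the values reported in Table 2 to rounding.
```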
The methodology developed in this study classifies the stock market on a given day into two basic market states, 'Up' and 'Down', providing better-than-random predictions for the market state, measured as the price movement in the last time interval of a given trading day. This study defines and addresses a specific classification task where machine learning can achieve superhuman performance in finance, i.e. the prediction of the final price movement in a given market day based on the return time series observed earlier in the day, denoised and wavelet-transformed in order to be processed as an image by a convolutional neural network.

The model uses the discrete wavelet transform to reduce noise. Then a continuous wavelet transform is applied to the denoised signal to generate a spectrogram. We performed this processing for multiple indicators to obtain multiple spectrograms, which are then converted into a multi-channel 2D image used as the input of a convolutional neural network that predicts the final market state of the given trading day. The model provides accurate predictions on the S&P 500 index when compared with a random null model.
The promising results observed in this challenging financial context - with hardly predictable data and a limited set of relevant features - constitute a solid basis for further applications of this method to other noisy sequential data characterising complex systems. The method has been shown to overcome limitations for noisy time series with low predictability, and could outperform other methodologies for more predictable data, such as biological and medical data, e.g. ECG signals, or weather and climate data, e.g. wind speed and temperature time series.
Figure 7 shows the four mother wavelets. The Haar wavelet (a) and the Daubechies wavelet of order 4 (c) are discrete wavelets; the Gaussian wavelet of order 1 (b) and the Morlet wavelet (d) are continuous wavelets.

Figure 7: Four commonly used mother wavelets: (a) Haar wavelet, (b) Gaussian wavelet of order 1, (c) Daubechies wavelet of order 4, and (d) Morlet wavelet [25].
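These wavelet functions can be inspected directly in PyWavelets, e.g. as in the short sketch below; the plotting layout is an assumption.

```python
import pywt
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

# Discrete wavelets: wavefun returns the scaling function phi, the wavelet
# psi and the x grid.
for ax, name in zip(axes[0], ["haar", "db4"]):
    phi, psi, x = pywt.Wavelet(name).wavefun(level=8)
    ax.plot(x, psi)
    ax.set_title(name)

# Continuous wavelets: wavefun returns psi and the x grid.
for ax, name in zip(axes[1], ["gaus1", "morl"]):
    psi, x = pywt.ContinuousWavelet(name).wavefun(precision=10)
    ax.plot(x, psi)     # PyWavelets' 'morl' is real-valued
    ax.set_title(name)

fig.savefig("mother_wavelets.png")
```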
Hard Thresholding
$$\sigma^{H}_{\lambda}(w) = \begin{cases} w, & |w| \ge \lambda \\ 0, & |w| < \lambda \end{cases} \quad (8)$$

Soft Thresholding
$$\sigma^{s}_{\lambda}(w) = \begin{cases} \mathrm{sgn}(w)\,(|w| - \lambda), & |w| \ge \lambda \\ 0, & |w| < \lambda \end{cases} \quad (9)$$

Here w is the wavelet coefficient (detail coefficient) and λ is the selected threshold.

The rigrsure threshold is an adaptive threshold selection using the principle of Stein's Unbiased Risk Estimate (SURE).
(1) Take the absolute values of the elements of the signal Ψ[t], sort them from small to large, and square each element to obtain a new sequence f(k) [26]:
$$f(k) = (\mathrm{sort}(|\Psi|))^{2}, \quad k = 0, 1, \ldots, N-1 \quad (10)$$
(2) If the threshold is taken as the square root of the k-th element of f(k),
$$\lambda_{k} = \sqrt{f(k)}, \quad k = 0, 1, \ldots, N-1 \quad (11)$$
the risk generated by this threshold is
$$\mathrm{Risk}(k) = \left[ N - 2k + \sum_{j=1}^{k} f(j) + (N-k)\, f(N-k) \right] / N \quad (12)$$
(3) According to the obtained risk curve Risk(k), take the $k_{\min}$ corresponding to the minimum-risk point; the rigrsure threshold is then
$$\lambda = \sqrt{f(k_{\min})} \quad (13)$$

Figure 8 shows the return distributions for another four label designs: the 360th-minute price compared with the 361st-minute price; the 360th-minute price compared with the average price of the last 30 minutes; the 360th-minute price compared with the 390th-minute (market closing) price; and y_mean-mean, the average price from the 1st to the 360th minute compared with the average price of the last 30 minutes.

Figure 8: The return frequency distributions for another four label designs.

References
[1] Pier Francesco Procacci and Tomaso Aste. Forecasting market states. Quantitative Finance, 19(9):1491-1498, 2019.
[2] Sreelekshmy Selvin, R Vinayakumar, EA Gopalakrishnan, Vijay Krishna Menon, and KP Soman. Stock price prediction using LSTM, RNN and CNN-sliding window model. In , pages 1643-1647. IEEE, 2017.
[3] James J Murphy. Technical Analysis of the Financial Markets. New York Institute of Finance, 1999.
[4] Ruey Tsay. Financial Time Series and Their Characteristics, pages 1-27. Wiley, Hoboken, New Jersey, 08 2010.
[5] C Lee Giles, Steve Lawrence, and Ah Chung Tsoi. Noisy time series prediction using recurrent neural networks and grammatical inference. Machine Learning, 44(1-2):161-183, 2001.
[6] Ashwin Siripurapu. Convolutional networks for stock trading. Stanford Univ Dep Comput Sci, 1(2):1-6, 2014.
[7] Shihao Gu, Bryan Kelly, and Dacheng Xiu. Empirical asset pricing via machine learning. Technical report, National Bureau of Economic Research, 2018.
[8] M Portnoff. Time-frequency representation of digital signals and systems based on short-time Fourier analysis. IEEE Transactions on Acoustics, Speech, and Signal Processing, 28(1):55-69, 1980.
[9] Daniel Griffin and Jae Lim. Signal estimation from modified short-time Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing, 32(2):236-243, 1984.
[10] Marie Farge. Wavelet transforms and their applications to turbulence. Annual Review of Fluid Mechanics, 24(1):395-458, 1992.
[11] Francis X Diebold. Elements of Forecasting. Citeseer, 1998.
[12] Ingrid Daubechies. The wavelet transform, time-frequency localization and signal analysis. IEEE Transactions on Information Theory, 36(5):961-1005, 1990.
[13] David L Donoho and Iain M Johnstone. Ideal spatial adaptation by wavelet shrinkage. Biometrika, 81(3):425-455, 1994.
[14] Mark J Shensa. The discrete wavelet transform: wedding the à trous and Mallat algorithms. IEEE Transactions on Signal Processing, 40(10):2464-2482, 1992.
[15] David L Donoho and Iain M Johnstone. Adapting to unknown smoothness via wavelet shrinkage. Journal of the American Statistical Association, 90(432):1200-1224, 1995.
[16] Z Tufekci and John N Gowdy. Feature extraction using discrete wavelet transform for speech recognition. In Proceedings of the IEEE SoutheastCon 2000, 'Preparing for The New Millennium' (Cat. No. 00CH37105), pages 116-123. IEEE, 2000.
[17] Ingrid Daubechies. Orthonormal bases of compactly supported wavelets. Communications on Pure and Applied Mathematics, 41(7):909-996, 1988.
[18] Christopher Torrence and Gilbert P Compo. A practical guide to wavelet analysis. Bulletin of the American Meteorological Society, 79(1):61-78, 1998.
[19] Prabhishek Singh and Raj Shree. Statistical quality analysis of wavelet based SAR images in despeckling process. Asian J. Electrical Sci. (AJES), 6(2):1-18, 2017.
[20] Marc Antonini, Michel Barlaud, Pierre Mathieu, and Ingrid Daubechies. Image coding using wavelet transform. IEEE Transactions on Image Processing, 1(2):205-220, 1992.
[21] S&P Dow Jones Indices. S&P US indices methodology, 2019.
[22] Aslak Grinsted, John C Moore, and Svetlana Jevrejeva. Application of the cross wavelet transform and wavelet coherence to geophysical time series. Nonlinear Processes in Geophysics, 11(5/6):561-566, 2004.
[23] Ehsan Asgarian, Mohsen Kahani, and Shahla Sharifi. The impact of sentiment features on the sentiment polarity classification in Persian reviews. Cognitive Computation, 10(1):117-135, 2018.
[24] Jean-Philippe Bouchaud, Marc Mézard, Marc Potters, et al. Statistical properties of stock order books: empirical results and models. Quantitative Finance, 2(4):251-256, 2002.
[25] Jack W Baker. Quantitative classification of near-fault ground motions using wavelet analysis. Bulletin of the Seismological Society of America, 97(5):1486-1501, 2007.
[26] Daniel Valencia, David Orejuela, Jeferson Salazar, and Jose Valencia. Comparison analysis between rigrsure, sqtwolog, heursure and minimaxi techniques using hard and soft thresholding methods. In 2016 XXI Symposium on Signal Processing, Images and Artificial Vision (STSIVA)