A Stochastic Time Series Model for Predicting Financial Trends using NLP
Pratyush Muthukumar, Department of Computer Science, University of California, Irvine
Jie Zhong, Department of Mathematics, California State University, Los Angeles
February 3, 2021
Abstract
Stock price forecasting is a highly complex and vitally important field of research. Recent advancements in deep neural network technology allow researchers to develop highly accurate models to predict financial trends. We propose a novel deep learning model called ST-GAN, or Stochastic Time-series Generative Adversarial Network, that analyzes both financial news texts and financial numerical data to predict stock trends. We utilize cutting-edge technology like the Generative Adversarial Network (GAN) to learn the correlations among textual and numerical data over time. We develop a new method of training a time-series GAN directly using the learned representations of Naive Bayes' sentiment analysis on financial text data alongside technical indicators from numerical data. Our experimental results show significant improvement over various existing models and prior research on deep neural networks for stock price forecasting.
Stock market prediction has been a topic of interest among researchers, corporations, and market enthusiasts for decades. While the lucrative and prophetic nature of accurate stock market prediction is widely sought after, it may be difficult to grasp the efficacy of financial forecasting given the volatile nature of global stock markets. The latter sentiment is not new; in Cowles 3rd (1933), the US economist Alfred Cowles wrote that even the most successful stock market forecasters did "little, if any, better than what might be expected to result from pure chance". However, with the rise of big data systems and machine learning, we find that the perceived stochasticity of financial markets may actually contain patterns embedded in data that spans across financial sectors and fields of interest (Lobato and Savin, 1998).

Most machine learning approaches to predicting financial trends focus on an analysis of numerical features and technical indicators of stocks. Various machine learning models including Linear Regression, Logistic Regression, Support Vector Machines (SVMs), Decision Trees, and Random Forests (RFs) have been implemented to tackle this problem; see the survey paper (Kimoto et al., 1990) and references therein. More recently, deep learning models including Artificial Neural Networks (ANNs), Recurrent Neural Networks (RNNs), and Convolutional Neural Networks (CNNs) have also been used to predict stock prices (Tsantekidis et al., 2017; Sokolov et al., 2020). The large majority of these works utilize machine learning models trained solely on financial numerical data. However, stock market prediction is an invaluable market research strategy: stock prices are determined by the behavior of retail investors, who themselves base decisions on available news sources and statistical measures.
Thus, an entirely different category of feature information can be learned through an analysis of textual data. The impact of financial news and text data on the trends of the stock market cannot be overstated. The financial market is highly dependent on the decisions retail investors make, which in turn are dependent on the daily news and information sources they read. Thus, by incorporating the information of textual data into a financial forecasting model in addition to the available technical indicators, a machine learning model can understand full-fledged patterns of financial trends.

There has been previous work on utilizing Earnings Conference Calls (ECCs), quarterly public briefings by a company's upper management team, investment team, and legal team. ECCs provide valuable insight into the financial health of a company. Recent advancements in the field of Natural Language Processing (NLP) allow state-of-the-art language models including Google BERT (Devlin et al., 2019) and OpenAI's GPT series (Radford et al., 2019) to be applied to financial text data to learn word embeddings and representations which can later be used to predict financial trends. These models rely on discerning the sentiment of words or sentences of a financial text article, a task referred to as sentiment analysis in the field of NLP. Applying pretrained language models to perform sentiment analysis is just one of many machine learning approaches to this task; others include Naive Bayes classification models, Convolutional Neural Networks, and Recurrent Neural Networks; see the survey paper (Das et al., 2019) and references therein.

Although we see great strides of innovation in machine learning models which seek to predict financial trends through solely numerical features or solely textual features as input, we seldom see a model that encompasses information of both textual and numerical format as the basis for predicting the stock market.
Hence, we seek to utilize advanced machine learning models that extract details from financial text sources and numerical indicators to predict stock trends. We define our domain of interest to be the financial market trends in the aerospace industry. The aerospace industry serves as the perfect litmus test for the rest of the financial market, as the key public and private corporations in this sector contributed $151 billion in export sales to the U.S. economy in 2018 alone. The aerospace industry is also inherently global, and according to Dussauge and Garrette (1995), forecasting the outlook of this industry gives us a glimpse into the world economy as well as the localized United States economy.

The challenge we faced when constructing such a hybrid model predominantly stems from the question of effectively joining numerical and textual features (or rather, utilizing the output of a model trained solely on textual features as an input to one trained on numerical features). Additionally, it is imperative that we construct a neural network architecture with the capability to learn from data that may be sparse or weakly correlated, given the nature of financial forecasting.

We propose a novel deep learning model in this paper which utilizes state-of-the-art machine learning models including the Generative Adversarial Network (GAN) to predict stock trends using financial text data and financial numerical data. We invent a new method of directly using sentiment analysis results on financial text data as an input to the time-series GAN. Our model outperforms existing researched models in deep learning for financial forecasting over various forecasting time horizons.

The remainder of the paper is structured as follows. Section 2 illustrates our key contributions. Section 2.1 describes our model architecture. Sections 3.1 and 3.2 outline our experimental testing methodologies. Section 3.3 describes our model's experimental results.
Section 4 concludes our findings and discusses our future work.

Our proposed model provides the following technical contributions to the field of financial forecasting with machine learning.
Robust Textual Understanding of Financial Data
We utilized state-of-the-art sentiment analysis techniques from the field of NLP trained on continually updated data with high temporal resolution. Moreover, we surpassed the standard depth of use for these sentiment analysis techniques.
Advanced Time-Series Generative Prediction
We modified the increasingly popular Generative Adversarial Network (GAN) to predict for data indexed over time through a generative process. Adel et al. (2018) have shown that generative models outperform traditional discriminative models like Logistic Regression, Support Vector Regression, and Conditional Random Fields in certain time-series tasks.
Novel Sequential Textual Embedding Technique
The novel aspect of our research is the hybridization of the modular financial textual network and financial numerical network. To the best of our knowledge, our technique is the first to directly include textual embeddings to generate stock forecasts through our deep generative model. We invented a new method to use the financial text sentiment analysis results as a latent space input to the generator of the Generative Adversarial Network trained on numerical features. Our results show a 32.2% decrease in averaged NRMSE error over multiple forecasting time horizons compared to the current best performing research utilizing other deep learning models for stock price prediction.
The structure of our two-stage model follows a sequential organization, where the first unit of our model feeds as input to the second unit. The first section of the model is a Naive Bayes' classifier model that performs sentiment analysis on our textual data inputs. The goal of sentiment analysis is to systematically identify trends in textual data, often word-by-word or sentence-by-sentence (Narayanan et al., 2009). We performed both on a number of subsets of our textual data. The output of this Naive Bayes' classifier model is a set of sentiments, where each sentiment is drawn from the set {positive, neutral, negative}. It is evident why a classification model works best for sentiment analysis, as the classifier predicts the sentiment class for each sentence of a financial text document.

The Naive Bayes' algorithm is a fundamental generative model used widely in machine learning. Tan et al. (2009) first proposed using Naive Bayes' classification for sentiment analysis due to the relative independence of words in a document. The goal of Naive Bayes' sentiment analysis is to use initial training sentiments of a set of individual words to predict the sentiments of unknown words and thus the sentiment of each sentence in the document. Specifically, given a set of word vectors $X = \{x_1, x_2, \ldots, x_k\}$, we predict $Y$, a label with $M$ possible classes. Models that seek to find $P(Y = y \mid X = \{x_1, x_2, \ldots, x_k\})$ directly are discriminative. However, as a generative model, the Naive Bayes' classifier reverses the conditioning via the classical Bayes formula, $P(Y \mid X) = P(X \mid Y) P(Y) / P(X)$. Now, the most likely of all possible classes $Y$ given the data $X$ is

$$\arg\max_{y \in Y} P(Y \mid X) = \arg\max_{y \in Y} \frac{P(X \mid Y) P(Y)}{P(X)} = \arg\max_{y \in Y} P(X \mid Y) P(Y) = \arg\max_{y \in Y} P(x_1, x_2, \ldots, x_k \mid Y) P(Y).$$

In practice, calculating the probability $P(x_1, x_2, \ldots, x_k \mid Y)$ directly is computationally infeasible, as we would require $M^k$ computations to find $P(X \mid Y)$, where $k$ is the number of words in the document. Instead, we assume independence, i.e.,

$$P(x_1, x_2, \ldots, x_k \mid Y) = P(x_1 \mid Y) \cdot P(x_2 \mid Y) \cdots P(x_k \mid Y).$$

Then, the most likely class becomes

$$\arg\max_{y \in Y} P(x_1, x_2, \ldots, x_k \mid Y) P(Y) = \arg\max_{y \in Y} P(Y) \prod_{x_i \in X} P(x_i \mid Y). \quad (1)$$

The independence assumption here is plausible due to the nature of language: individual words are not dependent on other words except for context and grammar (Lewis, 1998). Using independence, we perform the algorithm in linear time: to compute the most likely class, we perform $M \cdot k$ computations. It is important to note that the output of Naive Bayes' sentiment analysis on a document is an $n$-dimensional vector $v$, where each component $v_i \in \{-1, 0, 1\}$ and $n$ denotes the number of sentences in the document, assuming we are performing sentence-by-sentence Naive Bayes' sentiment analysis. We feed this output vector $v$ as input to the second section of our model in a novel procedure.

The second section of our model is a deep predictive model capable of utilizing the numerical data and textual sentiment analysis embeddings to identify patterns among the input data. To achieve this goal, we select the Generative Adversarial Network (GAN) architecture. The GAN was developed as a generative variant of the traditional Convolutional Neural Network used for image-based deep learning (Goodfellow et al., 2014). GANs generate samples similar to the training data through adversarial training.

The generative nature of the network allows GANs to be particularly effective for financial data with multiple dimensions, low correlation, or sparsity (Muthukumar and Zhong, 2020). This is because GANs are inherently an unsupervised learning model, since the predictive process of the network does not require ground truth training data.
Instead, the generator module of a GAN uses randomly sampled input variables to create a prediction. In financial forecasting, the generative model introduced in Krishnan et al. (2018) provides benefits which allow for higher quality prediction on high-dimensional, sparse data.

The Generative Adversarial Network is comprised of two sub-networks: the generator and the discriminator. The generator is a neural network which traditionally uses a fixed-length random vector as input to generate a sample. The discriminator is another neural network which traditionally classifies between training data and generated samples. It is important to note that the generator is never exposed to the real-world training data, as that is only available to the discriminator. The generator and discriminator engage in a two-player minimax game throughout the process of training, where the discriminator seeks to minimize the discriminator loss (accurately classify between a real and a generated sample), while the generator seeks to maximize the discriminator loss ("fool" the discriminator).

Let the generative network $G$ have a latent space input $x_z \sim p_z$, where $x_z$ acts as a random "seed" for the generator to sample. In a traditional GAN, $x_z$ is a set of latent variables sampled from a prior random noise distribution $p_z$ (Voynov and Babenko, 2020). Let the discriminative network $D$ have input either $x_t \sim p_t$ or $x_g \sim p_g$, where $x_t$ and $x_g$ denote the ground truth training data and the generated samples, respectively. The loss function for the GAN, described in Goodfellow et al. (2014), is derived from the binary cross entropy loss and is defined as

$$E(G, D) = \tfrac{1}{2}\left(\mathbb{E}_{x \sim p_t}[1 - D(x)] + \mathbb{E}_{x \sim p_g}[D(x)]\right),$$

where the goal of the algorithm is

$$\max_G \min_D E(G, D) = \max_G \min_D \tfrac{1}{2}\left(\mathbb{E}_{x \sim p_t}[1 - D(x)] + \mathbb{E}_{x \sim p_g}[D(x)]\right).$$
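As a minimal sketch (not the paper's implementation), the value function $E(G, D)$ above can be estimated from sample batches with a Monte Carlo average; `toy_discriminator` here is a hypothetical stand-in for a trained discriminator:

```python
import numpy as np

def gan_value(discriminator, real_batch, fake_batch):
    """Monte Carlo estimate of E(G, D) = 1/2 (E_{x~p_t}[1 - D(x)] + E_{x~p_g}[D(x)]).
    The discriminator drives this value down; the generator drives it up."""
    return 0.5 * (np.mean(1.0 - discriminator(real_batch)) +
                  np.mean(discriminator(fake_batch)))

# Hypothetical discriminator for illustration: sigmoid of a linear score.
def toy_discriminator(batch):
    return 1.0 / (1.0 + np.exp(-batch.sum(axis=1)))
```

A perfect discriminator ($D(x_t) = 1$, $D(x_g) = 0$) attains $E(G, D) = 0$, its optimum; a generator that fully fools the discriminator pushes the fake-sample term to its maximum.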
Intuitively, the discriminator seeks to minimize the likelihood of incorrectly classifying samples, while the generator seeks to maximize this likelihood $E(G, D)$.

An oversight of the traditional GAN model is that the input of the generator is sampled from a prior noise distribution without information from the training dataset. Recently, Chen et al. (2016) proposed the model InfoGAN, which alters the minimax loss function to include characteristics of the random latent space input $x_z$. This allows the generator and discriminator to iteratively improve the latent space variables, but the initialization of the latent space variables remains as data points from a random sample.

We present a novel method of inputting relevant high-level embeddings as a replacement for the random latent inputs $x_z$ used by the generator model of the GAN. The key idea of our work lies in the way in which the Naive Bayes' sentiment analysis outputs are used as inputs for the Generative Adversarial Network. Our architecture allows us to use high-level embeddings of financial text data as an additional input when predicting future stock trends on a deep neural network with financial numerical data inputs. The Naive Bayes' sentiment analysis on financial news texts produces an $n$-dimensional vector $v$, where $n$ denotes the number of sentences in the text data document.
We use this vector $v$ as the latent variable "seed" $x_z$ in the GAN model, which uses financial numerical data as the ground truth training data to generate predictions. We build on the work from InfoGAN by not only using the tweaked discriminator loss function that includes the latent variable $x_z$ as a tuneable parameter, but also directly initializing $x_z$ with the $n$-dimensional vector $v$ instead of sampling from the random noise distribution $p_z$.

In the traditional GAN model supported in the PyTorch module, the default value for the latent vector $x_z$ is a 100-dimensional vector created by sampling 100 points from a standard normal distribution. For our model, we instead defined $x_z$ in our PyTorch GAN model using the Naive Bayes' sentiment analysis vector output. To do so, we select the top 100 sentiment analysis outputs with the highest confidence probability. That is, the top 100 highest probabilities of the most likely class over all financial text documents analyzed, defined as

$$\arg\max_{y \in Y} P(Y) \prod_{x_i \in X} P(x_i \mid Y),$$

are selected to form the 100-dimensional vector $x_z$. We then normalize the resulting vector such that the mean of its entries is zero and their standard deviation is one. That is, for each $x_z^{(i)} \in x_z$,

$$x_z^{(i)} := \frac{x_z^{(i)} - \mu_{x_z}}{\sigma_{x_z}},$$

where $i = 1, \ldots, 100$, $\mu_{x_z}$ is the mean of the elements in the vector $x_z$, and $\sigma_{x_z}$ is the standard deviation of the elements in $x_z$.

Thus, we have initialized the latent variables for the generator network of the GAN model with the scaled vector of the top 100 strongest sentiments of sentences in financial news documents for a stock. The GAN model continually updates the initialized latent vector $x_z$ through the tweaked discriminator network loss function

$$E(G, D) = \tfrac{1}{2}\left(\mathbb{E}_{x \sim p_t}[1 - D(x)] + \mathbb{E}_{x \sim p_z}[D(x)]\right),$$

as described in Chen et al. (2016).
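A minimal sketch of this initialization step, under the assumption that per-sentence class probabilities and sentiment labels are available as parallel arrays (`confidences` and `sentiments` are hypothetical names, not from the paper):

```python
import numpy as np

def build_latent_seed(confidences, sentiments, dim=100):
    """Pick the `dim` sentence sentiments with the highest Naive Bayes class
    probability, then standardize them to zero mean and unit standard deviation."""
    top = np.argsort(confidences)[::-1][:dim]   # indices of the strongest sentiments
    x_z = sentiments[top].astype(float)
    return (x_z - x_z.mean()) / x_z.std()       # x_z^(i) := (x_z^(i) - mu) / sigma
```

The standardized vector matches the scale of the standard-normal seed it replaces, so the generator's input statistics stay comparable.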
In this way, we ensure that meaningful representations from the output of Naive Bayes' sentiment analysis on financial texts are used as an input to a Generative Adversarial Network trained on financial numerical data. We also ensure that throughout the training process, the latent variable input is fine-tuned for better prediction accuracy.

Our numerical dataset of financial indicators was sourced from the Yahoo Finance historical stock price dataset. We select 8 stocks to apply in our model, all of which represent a company in the aerospace industry. We focus on the financial trends of the aerospace industry because this microcosm of the financial sector accurately responds to the long-term and short-term effects of current events globally (Bae and Duvall, 1996). The eight stocks we select are BA (Boeing), LMT (Lockheed Martin), AIR.PA (Airbus SE), RTX (Raytheon), GE (General Electric), ERJ (Embraer SA), NOC (Northrop Grumman), and HON (Honeywell). This portfolio of aerospace stocks is highly diverse and global, with corporate headquarters in North America, South America, and Europe. We use historical daily stock prices for all eight aerospace stocks from January 1, 2010 to March 6, 2020. We choose this timeframe as a decade's worth of data exhibits both short-term and long-term effects of current events on the financial sector. The training data spans from January 1, 2010 to January 24, 2020. The testing data spans from January 25, 2020 to March 6, 2020. The stock data includes the maximum price, minimum price, opening price, closing price, adjusted closing price, and volume for each day. Figure 1 displays a visualization of the daily closing prices in USD for the eight stocks throughout the selected timeframe.

Our financial text dataset is collected from various financial news outlets. We primarily use financial texts from Seeking Alpha, while also using texts from Forbes, MarketWatch, and Twitter.
We created a web scraper to collect and download news articles whose headlines contained at least one of the names of the eight aerospace stocks we selected. The bulk of our financial texts are in the form of Earnings Conference Calls (ECCs). The subjects and keywords spoken during these ECCs can have direct impacts on the future stock trends of a company, so we analyze the sentiments of the ECCs using the Naive Bayes' classifier.

For the financial text documents, in addition to applying the Naive Bayes' classifier on each sentence to generate the latent vector input to the generator network of the GAN model, we also perform other variants of sentiment analysis, statistical calculations, and pre-processing techniques to analyze the financial texts further. We invent and perform context-based sentiment analysis on company names mentioned in the text documents that are not among the eight aerospace companies we are predicting. Here, context-based sentiment analysis is the technique in which we average the sentiment values of words found around a specific keyword to determine its relative sentiment. Figure 2 shows a visualization of the results of context-based sentiment analysis on non-aerospace company names in Boeing financial text documents. We can see that the relative sentiment values for companies like Korea Electric Power Corp and Banco Santander SA are significantly higher than those of other company names mentioned in financial news regarding Boeing stock. Intuitively, these results show that when financial news articles mention these company names and Boeing stock together, they are mentioned in a positive view and often lead to financial gain for Boeing.

We perform data pre-processing and compute various statistical features from our raw numerical stock data in order to generate additional technical indicators for our model. For example, we calculate the 7-day and 21-day moving averages of all stocks for the timeframe we selected.
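Returning to the context-based sentiment analysis described above, a minimal sketch follows; the window size and the word-level sentiment lexicon `word_sentiment` are illustrative assumptions, not values from the paper:

```python
def context_sentiment(tokens, keyword, word_sentiment, window=5):
    """Average the sentiment of words within `window` tokens on either side of
    each occurrence of `keyword`; returns None if the keyword never appears."""
    scores = []
    for i, tok in enumerate(tokens):
        if tok != keyword:
            continue
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        vals = [word_sentiment[w] for w in context if w in word_sentiment]
        if vals:
            scores.append(sum(vals) / len(vals))
    return sum(scores) / len(scores) if scores else None
```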
We also calculate the MACD, or Moving Average Convergence Divergence, by subtracting the 26-period Exponential Moving Average (EMA) from the 12-period EMA. The general formula for the EMA is

$$\mathrm{EMA}(t_n) = \mathrm{Price}(t_n) \times k + \mathrm{EMA}(t_{n-1}) \times (1 - k),$$

where $t_n$ and $t_{n-1}$ denote the $n$th and $(n-1)$th timesteps, $k$ is defined as $2/(N+1)$, and $N$ denotes the number of timesteps in the EMA period. We also calculate the upper and lower Bollinger bands by adding and subtracting the 20-day rolling standard deviation to and from the 21-day moving average, respectively. A visualization of the technical indicators for the eight aerospace stocks over the dataset time period is shown in Figure 3.

We extract global and local trends in our stock prices using Fourier transforms, which decompose a function into various sine waves that approximate it. Fourier transforms are often used as a financial forecasting tool, converting time-series data into the frequencies and amplitudes of patterns. We decompose the historical close prices of the eight stocks into sine functions of various amplitudes and frequencies. The Fourier transform of a function $g(t)$ is defined as

$$G(\omega) = \int_{-\infty}^{\infty} g(t)\, e^{-i 2\pi \omega t}\, dt.$$

Figure 4 displays various Fourier transform sine wave decompositions of historical Boeing stock close prices.

We also gathered additional numerical features by running an Auto-Regressive Integrated Moving Average (ARIMA) model to forecast future stock prices using the historical stock price data. Although the forecasted stock prices were not exceedingly accurate, we are able to include the auto-regressive parameters of the best-fit ARIMA model in our deep learning model. We find that an ARIMA(5,1,0), or an ARIMA model with 5 autoregressive terms, first-order differencing, and no moving average terms, performs the best with the given data.
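As an illustrative sketch of extracting such auto-regressive parameters, the AR part of an ARIMA(5,1,0) can be approximated with a plain least-squares fit on the differenced series (a simplification of the full maximum-likelihood ARIMA fit the paper relies on):

```python
import numpy as np

def ar_coefficients(prices, p=5):
    """Approximate the AR(p) coefficients of the first-differenced price series,
    i.e. the AR part of an ARIMA(p, 1, 0), by ordinary least squares."""
    d = np.diff(prices)  # first-order differencing: the "I(1)" step
    # Regress d_t on its p previous values d_{t-1}, ..., d_{t-p}.
    X = np.column_stack([d[p - k - 1:len(d) - k - 1] for k in range(p)])
    y = d[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef
```

For $p = 5$, the five fitted coefficients then join the feature set fed to the deep model.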
We include information on the autocorrelation and partial autocorrelation functions used to determine the optimal auto-regressive parameters in our feature set as well.
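The moving-average indicators described above can be sketched as follows; the window lengths follow the text, while everything else is a minimal illustration rather than the paper's pipeline:

```python
import numpy as np

def ema(prices, n):
    """Exponential moving average with smoothing factor k = 2 / (N + 1)."""
    k = 2.0 / (n + 1)
    out = np.empty(len(prices))
    out[0] = prices[0]                      # seed the recurrence with the first price
    for t in range(1, len(prices)):
        out[t] = prices[t] * k + out[t - 1] * (1.0 - k)
    return out

def macd(prices):
    """MACD: 12-period EMA minus 26-period EMA."""
    return ema(prices, 12) - ema(prices, 26)

def bollinger_bands(prices, ma_n=21, std_n=20):
    """Upper/lower bands: 21-day moving average +/- 20-day rolling standard deviation."""
    ma = np.convolve(prices, np.ones(ma_n) / ma_n, mode="valid")
    std = np.array([prices[i:i + std_n].std() for i in range(len(prices) - std_n + 1)])
    m = min(len(ma), len(std))              # align the two rolling windows
    return ma[:m] + std[:m], ma[:m] - std[:m]
```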
Figure 4: Various Fourier Transform Sine Wave Decompositions for Historical Daily Boeing Stock Close Price.
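A decomposition like Figure 4 can be reproduced by keeping only the lowest-frequency FFT components and inverting the transform (a sketch; the component counts follow the figure legend):

```python
import numpy as np

def fourier_approximation(series, n_components):
    """Keep the n_components lowest-frequency FFT terms (plus the mirrored
    conjugate terms needed for a real-valued signal) and invert the transform."""
    coeffs = np.fft.fft(series)
    filtered = np.zeros_like(coeffs)
    filtered[:n_components] = coeffs[:n_components]
    if n_components > 1:                    # mirror bins so the inverse stays real
        filtered[-(n_components - 1):] = coeffs[-(n_components - 1):]
    return np.fft.ifft(filtered).real
```

With 3, 6, 9, and 100 components one obtains progressively closer approximations of the close-price curve, as in Figure 4.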
We create a Naive Bayes' classifier to perform Naive Bayes' sentiment analysis on the vectorized representations of the financial text documents. We utilize one-hot encoding to vectorize each sentence of the text dataset in order to predict sentiments on a sentence-by-sentence level. The loss function we use to optimize the most likely sentiment, or class, is defined in (1).

For the second component of our model, we use a Long Short-Term Memory (LSTM) network for the generator and a Convolutional Neural Network (CNN) for the discriminator. The generator architecture consists of an LSTM layer with 128 input units for the 128 daily numerical features processed from historical data for the eight aerospace stocks, and a dense layer with one output to generate the stock price for a given time. The discriminator architecture consists of a 1-dimensional CNN with three convolution layers and two dense layers with one output to generate the classification signal. The structure of the GAN model is illustrated in Figure 5. The hyperparameter values for our model after tuning are shown in Table 1.
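As a minimal sketch of the first stage (toy training data; log-probabilities with Laplace smoothing are an implementation choice, not prescribed by the paper), a sentence-level Naive Bayes classifier optimizing (1) might look like:

```python
import math
from collections import Counter

def train_naive_bayes(sentences, labels):
    """Estimate log P(Y) and log P(x_i | Y) with Laplace smoothing from
    (tokenized sentence, sentiment label) pairs."""
    classes = sorted(set(labels))
    priors = Counter(labels)
    word_counts = {c: Counter() for c in classes}
    vocab = set()
    for words, y in zip(sentences, labels):
        word_counts[y].update(words)
        vocab.update(words)
    log_prior = {c: math.log(priors[c] / len(labels)) for c in classes}
    log_like = {}
    for c in classes:
        total = sum(word_counts[c].values()) + len(vocab) + 1
        log_like[c] = {w: math.log((word_counts[c][w] + 1) / total) for w in vocab}
        log_like[c][None] = math.log(1 / total)   # unseen-word fallback
    return log_prior, log_like

def classify(words, log_prior, log_like):
    """arg max_y log P(y) + sum_i log P(x_i | y): Eq. (1) in log space."""
    return max(log_prior, key=lambda c: log_prior[c] +
               sum(log_like[c].get(w, log_like[c][None]) for w in words))
```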
We seek to predict future stock prices over a variety of time frames using both financial news texts and numerical features. We present results and error margins for predicting future stock prices 1 day, 15 days, and 30 days in the future. It is important to note that no additional ground truth data is given within the 15- or 30-day prediction time frame, so the model uses its own previous days' predictions. By the nature of Long Short-Term Memory (LSTM) models, our model provides predictions at a daily frequency, and this measure ensures that we are not performing data leakage when predicting over 15- and 30-day forecast horizons. Our results show that our model has the lowest average error and is thus the best performing model among all baselines and existing deep learning models tested.
Figure 5: Model Architecture of the LSTM Generator Network and CNN DiscriminatorNetwork of our GAN Model.
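A sketch of this architecture in PyTorch follows; the layer sizes come from Figure 5 and Table 1, while the discriminator's channel widths and the LeakyReLU slope are assumptions, since the paper does not list them all:

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """LSTM generator: 128 daily features in, 500 hidden units, one price per step."""
    def __init__(self, n_features=128, hidden=500):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.dense = nn.Linear(hidden, 1)

    def forward(self, x):            # x: (batch, seq_len, n_features)
        h, _ = self.lstm(x)
        return self.dense(h)         # (batch, seq_len, 1)

class Discriminator(nn.Module):
    """1-D CNN discriminator: three conv layers, two dense layers, one logit."""
    def __init__(self, seq_len=30):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=5, stride=2), nn.LeakyReLU(0.2),
            nn.Conv1d(32, 64, kernel_size=5, stride=2), nn.LeakyReLU(0.2),
            nn.Conv1d(64, 128, kernel_size=5, stride=2), nn.LeakyReLU(0.2),
        )
        with torch.no_grad():        # infer the flattened size for the dense head
            n_flat = self.conv(torch.zeros(1, 1, seq_len)).numel()
        self.dense = nn.Sequential(
            nn.Flatten(), nn.Linear(n_flat, 64), nn.LeakyReLU(0.2), nn.Linear(64, 1))

    def forward(self, x):            # x: (batch, 1, seq_len) price sequence
        return self.dense(self.conv(x))
```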
Hyperparameter                      Value
LSTM Weight Initializer             Xavier
LSTM Hidden Units                   500
LSTM Sequence Length (in days)      30
LSTM Learning Rate                  0.01
LSTM Optimizer                      Adam
LSTM Regularization                 L1 (Lasso)
Conv1D Kernel Size                  5
Conv1D Stride                       2
Conv1D Padding                      None
Conv1D Activation Function          LeakyReLU
Epochs                              500
Batch Size                          16

Table 1: Model Hyperparameter Values

Metric   Model                N = 1     N = 15    N = 30
RMSE     ST-GAN               0.16      2.39      4.–
         GAN                  0.74      11.74     20.41
         FC-LSTM              0.41      6.13      13.24
         ARIMA(5,1,0)         1.94      19.34     32.43
         Sentiment Analysis   6.89      90.24     174.87
         GAN-FD               0.28      3.41      8.25
         VolTAGE              0.33      7.25      5.11
         DP-LSTM              0.65      5.34      14.09
NRMSE    ST-GAN               0.–       –         –
         GAN                  0.00229   0.03693   0.06193
         FC-LSTM              0.00127   0.01928   0.04018
         ARIMA(5,1,0)         0.00600   0.06083   0.09841
         Sentiment Analysis   0.02133   0.28383   0.53063
         GAN-FD               0.00196   0.01114   0.02961
         VolTAGE              0.00152   0.04241   0.01441
         DP-LSTM              0.00162   0.00993   0.05114

Table 2: RMSE and NRMSE Error Values for N-day Forecast Horizons on Boeing Stock

To measure the accuracy of our model, we used the Root Mean Square Error (RMSE) and Normalized RMSE (NRMSE) error metrics. The RMSE is calculated as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2},$$

where $n$ is the number of observations, $\hat{y}$ is the predicted value, and $y$ is the ground truth. The NRMSE is calculated as

$$\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{\bar{y}},$$

where $\bar{y}$ is the mean of the ground truth. We included the NRMSE error metric in our error analysis because it allows us to compare our model's accuracy against models with different scales and prediction targets. Table 2 shows the RMSE and NRMSE results of our model on various N-day forecast horizons, where $N \in \{1, 15, 30\}$. We show various baseline model error results against our model,
ST-GAN, or Stochastic Time-series Generative Adversarial Network. The baseline models were all run in the same experimental setting described for our model. In addition to the generic baselines (GAN, FC-LSTM, ARIMA, and Sentiment Analysis), we also show the results of running other existing stock price forecasting models in our experimental environment. We show RMSE and NRMSE error values for the GAN-FD model proposed by Zhou et al. (2018), which uses a GAN model with an LSTM generator network and a CNN discriminator network on numerical stock price data. We also tested the VolTAGE model, proposed by Sawhney et al. (2020), which uses a Graph Convolutional Network (GCN) model on Earnings Conference Call data and historical stock prices, in our experimental setting. Finally, we tested the DP-LSTM model proposed by Li et al. (2019), which uses a modified LSTM on time-series historical stock prices and financial news data.

Figure 6: ST-GAN Performance Against All Experimental Models, 30-day Forecast Horizon, Predicted Boeing Stock Price Graph (1/24/20 - 2/25/20).

Figure 7: ST-GAN Performance Against Previous Research Models, 30-day Forecast Horizon, Predicted Boeing Stock Price Graph (1/24/20 - 2/25/20).

Figure 6 shows a graph of the predicted stock prices for Boeing stock over a 30-day forecast horizon from January 25, 2020 to February 24, 2020. We show the performance of the baseline models against our model, ST-GAN. We also show the performance of our model against only the previous research models in Figure 7.
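The two error metrics used throughout this section can be sketched as:

```python
import numpy as np

def rmse(y_hat, y):
    """Root mean square error between predictions and ground truth."""
    y_hat, y = np.asarray(y_hat, float), np.asarray(y, float)
    return float(np.sqrt(np.mean((y_hat - y) ** 2)))

def nrmse(y_hat, y):
    """RMSE normalized by the mean of the ground truth, so models with
    different price scales can be compared."""
    return rmse(y_hat, y) / float(np.mean(y))
```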
In this work, we have introduced a novel deep learning model that analyzes financial news texts and financial numerical data to predict future aerospace stock trends in the short and long term with unparalleled accuracy. We proposed a novel architecture that applies Naive Bayes' sentiment analysis to financial news texts and uses the learned representations alongside financial numerical data to train a Generative Adversarial Network (GAN) for time-series prediction of stock prices. Our experimental results show that our model, which utilizes cutting-edge technologies, may have a significant impact on the practice of portfolio management.

Our error analysis using the RMSE and NRMSE error metrics shows significant improvement over existing deep learning models in this field using similar technologies. We believe our contributions can be applied to help researchers, scholars, portfolio managers, investment officers, trustees, and consultants make better decisions and predictions in the financial sector.

In the future, we expect to measure the isolated impact of our model's robust textual understanding from sentiment analysis on the overall prediction accuracy. We also hope to increase the temporal frequency of our predictions and expand our model's prediction targets to a more diverse portfolio of stocks.

References
Adel, T., Ghahramani, Z., and Weller, A. (2018). Discovering interpretable representations for both deep generative and discriminative models. In International Conference on Machine Learning, pages 50–59.

Bae, S. C. and Duvall, G. J. (1996). An empirical analysis of market and industry factors in stock returns of US aerospace industry. Journal of Financial and Strategic Decisions, 9(2):85–95.

Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., and Abbeel, P. (2016). InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2172–2180.

Cowles 3rd, A. (1933). Can stock market forecasters forecast? Econometrica: Journal of the Econometric Society, pages 309–324.

Das, S. R., Kim, S., and Kothari, B. (2019). Zero-Revelation RegTech: Detecting risk through linguistic analysis of corporate emails and news. The Journal of Financial Data Science, 1(2):8–34.

Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186. Association for Computational Linguistics.

Dussauge, P. and Garrette, B. (1995). Determinants of success in international strategic alliances: Evidence from the global aerospace industry. Journal of International Business Studies, 26(3):505–530.

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2014). Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680.

Kimoto, T., Asakawa, K., Yoda, M., and Takeoka, M. (1990). Stock market prediction system with modular neural networks. In 1990 IJCNN International Joint Conference on Neural Networks, pages 1–6. IEEE.

Krishnan, R., Liang, D., and Hoffman, M. (2018). On the challenges of learning with inference networks on sparse, high-dimensional data. In International Conference on Artificial Intelligence and Statistics, pages 143–151. PMLR.

Lewis, D. D. (1998). Naive (Bayes) at forty: The independence assumption in information retrieval. In European Conference on Machine Learning, pages 4–15. Springer.

Li, X., Li, Y., Yang, H., Yang, L., and Liu, X.-Y. (2019). DP-LSTM: Differential privacy-inspired LSTM for stock prediction using financial news. arXiv preprint arXiv:1912.10806.

Lobato, I. N. and Savin, N. E. (1998). Real and spurious long-memory properties of stock-market data. Journal of Business & Economic Statistics, 16(3):261–268.

Muthukumar, P. and Zhong, J. (2020). A stochastic time series model for predicting financial trends with NLP. In Proceedings of the Southern California Machine Learning Symposium (SCMLS), pages 68–71.

Narayanan, R., Liu, B., and Choudhary, A. (2009). Sentiment analysis of conditional sentences. In Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 180–189.

Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., and Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI Blog, 1(8):9.

Sawhney, R., Khanna, P., Aggarwal, A., Jain, T., Mathur, P., and Shah, R. (2020). VolTAGE: Volatility forecasting via text-audio fusion with graph convolution networks for earnings calls. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 8001–8013.

Sokolov, A., Mostovoy, J., Parker, B., and Seco, L. (2020). Neural embeddings of financial time-series data. The Journal of Financial Data Science, 2(4):33–43.

Tan, S., Cheng, X., Wang, Y., and Xu, H. (2009). Adapting Naive Bayes to domain adaptation for sentiment analysis. In European Conference on Information Retrieval, pages 337–349. Springer.

Tsantekidis, A., Passalis, N., Tefas, A., Kanniainen, J., Gabbouj, M., and Iosifidis, A. (2017). Forecasting stock prices from the limit order book using convolutional neural networks. In 2017 IEEE 19th Conference on Business Informatics (CBI), volume 1, pages 7–12. IEEE.

Voynov, A. and Babenko, A. (2020). Unsupervised discovery of interpretable directions in the GAN latent space. arXiv preprint arXiv:2002.03754.

Zhou, X., Pan, Z., Hu, G., Tang, S., and Zhao, C. (2018). Stock market prediction on high-frequency data using generative adversarial nets. Mathematical Problems in Engineering, 2018.