Adversarial Examples in Deep Learning for Multivariate Time Series Regression
Gautam Raj Mode and Khaza Anuarul Hoque
Department of Electrical Engineering & Computer Science, University of Missouri, Columbia, MO, USA
[email protected], [email protected]
Abstract—Multivariate time series (MTS) regression tasks are common in many real-world data mining applications, including finance, cybersecurity, energy, healthcare, prognostics, and many others. Due to the tremendous success of deep learning (DL) algorithms in various domains, including image recognition and computer vision, researchers have started adopting these techniques for solving MTS data mining problems, many of which target safety-critical and cost-critical applications. Unfortunately, DL algorithms are known for their susceptibility to adversarial examples, which makes DL regression models for MTS forecasting vulnerable to such attacks as well. To the best of our knowledge, no previous work has explored the vulnerability of DL MTS regression models to adversarial time series examples, which is an important step, specifically when the forecasts from such models are used in safety-critical and cost-critical applications. In this work, we leverage existing adversarial attack generation techniques from the image classification domain and craft adversarial multivariate time series examples for three state-of-the-art deep learning regression models, specifically the Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU). We evaluate our study using the Google stock and household power consumption datasets. The obtained results show that all the evaluated DL regression models are vulnerable to adversarial attacks, that these attacks are transferable, and that they can thus lead to catastrophic consequences in safety-critical and cost-critical domains, such as energy and finance.
Index Terms—Multivariate time series, Regression, Deep learning, Adversarial examples, FGSM, BIM.
I. INTRODUCTION
Time series forecasting is an important problem in data mining with many real-world applications, including finance [1]–[4], weather forecasting [5], [6], power consumption monitoring [7], [8], industrial maintenance [9], [10], occupancy monitoring in smart buildings [11], [12], and many others. Recently, deep learning (DL) models have shown tremendous success in analyzing time series data [1], [13] when compared to other traditional methods. This is due to the fact that DL models can automatically learn complex mappings from multiple inputs to outputs. Interestingly, DL models can be easily fooled by adversarial examples [14], [15]. From the perspective of image processing or computer vision, an adversarial example can be an image formed by making small perturbations (insignificant to the human eye) to an example image. Another interesting fact is that adversarial examples can often transfer from one model to another, enabling so-called black-box attacks, which means that it is possible to attack DL models to which the adversary does not have access [16], [17]. In recent years, many techniques have been proposed for increasing the robustness of DL algorithms against adversarial examples [18]–[23]; however, most of them have been shown to be vulnerable to subsequent attacks [24].

Adversarial attacks in deep learning have been extensively explored for image recognition and classification applications. However, their application to the non-image domain is vastly under-explored. This includes the lack of studies applying adversarial examples to time series analysis, despite the increasing popularity of DL models in time series analysis. Very recently, the authors in [25] showed that a deep neural network (DNN) univariate time series classifier (specifically ResNet [26]) is vulnerable to adversarial attacks.
Unfortunately, to the knowledge of the authors, there exists no research work to date evaluating the impact of adversarial attacks on multivariate time series (MTS) deep learning regression models. This is indeed a major concern, as potential adversarial attacks are present in many safety-critical applications that exploit DL models for time series forecasting. For instance, adding small perturbations to the multivariate time series data (using false data injection methods [27]) feeding a DL regression model [28] for smart grid electric load forecasting can generate wrong predictions, and thus may lead to a nation-wide power outage.

In this paper, we apply and transfer adversarial attacks from the image domain to deep learning regression models for MTS forecasting. We present two experimental studies using two datasets from the finance and energy domains. The obtained results show that modern DL regression models are prone to adversarial attacks. We also show that adversarial time series examples crafted for one network architecture can be transferred to other architectures, and thus hold the transferability property [16]. Therefore, this work highlights the importance of protecting against adversarial attacks in deep learning regression models for safety-critical MTS forecasting applications.

To summarize, the main contributions of this paper are:

• Formalizing adversarial attacks on DL regression models for MTS forecasting.

• Crafting adversarial attacks for MTS DL regression models using methods that are popular in the image domain and applying them to the finance and energy domains. To be specific, we use the fast gradient sign method (FGSM) [14] and the basic iterative method (BIM) [15] to craft adversarial examples for Long Short-Term Memory (LSTM) [29], Gated Recurrent Unit (GRU) [30], and Convolutional Neural Network (CNN) [31] regression models.

• An empirical study of adversarial attacks on two datasets from the finance and energy domains. We highlight the impact of such attacks in real-life scenarios using the Google stock [32] and household electric power consumption [33] datasets.

• A comprehensive study of the transferability property of adversarial examples in DL regression models.

• A discussion of the potential defense techniques to be considered in future research on this topic.

Fig. 1: Example of perturbing a multivariate time series X (of dimension N and length T) by adding an imperceptible perturbation η, yielding the perturbed MTS X′ = X + η, whose forecast from the DL regression model f differs from that of the original.

The rest of the paper is organized as follows. Section II briefly discusses deep learning for multivariate time series regression and adversarial attacks. Section III formalizes MTS DL regression and explains the FGSM and BIM algorithms for crafting adversarial examples. Section IV compares the performance of CNN, LSTM, and GRU on the Google stock and household electric power consumption datasets, and evaluates the impact of the crafted adversarial examples on their performance. The transferability property of the attacks is also evaluated in that section, along with a brief discussion of potential defense mechanisms. Section V concludes the paper.

II. BACKGROUND
In this section, we provide an overview of DL MTS regression models and adversarial attacks in deep learning, together with a brief survey of state-of-the-art methods in these two areas.
A. Deep learning for time series forecasting
Time series forecasting is a challenging and important problem in the data science and data mining community; hundreds of methods have been proposed for time series analysis [34]. With the success of machine learning (ML) algorithms in different domains, ML techniques for time series forecasting have also become popular [35], [36]. However, among these methods, only a few (when compared to the non-DL methods) have considered DL for time series forecasting [37]–[39]. In this work, we focus on the time/cost-sensitive and safety-critical applications of deep learning time series forecasting, which motivates us to investigate the impact of adversarial attacks on them. Specifically, we explore the impact of adversarial attacks on LSTM, CNN, and GRU models, all of which are known for their effectiveness in time series forecasting.

LSTM is capable of learning long-term dependencies using several gates and thus suits time series forecasting problems well. In [40], the authors employ an LSTM model for predicting traffic flow with missing data. Other successful applications of LSTM in time series forecasting include petroleum production forecasting [41], financial time series forecasting [42], solar radiation forecasting [43], and remaining useful life prediction of aircraft engines [44]. GRU is an improved variant of the Recurrent Neural Network (RNN) [45] and is also effective in time series forecasting [46]. For instance, in [47], the authors employ 1D convnets and bidirectional GRUs for air pollution forecasting in Beijing, China. Other applications of GRU models in time series forecasting include personalized healthcare and climate forecasting [48], mine gas concentration forecasting [49], and smart grid bus load forecasting [50]. In [51], the authors present a CNN-based bagging model for forecasting hourly loads in a smart grid.
Apart from the energy domain, CNNs are also useful for financial time series forecasting [1], [52].

In [53], time series data from smart grids are analyzed for the detection of electricity theft. In such use cases, perturbed data can help thieves avoid detection. Using adversarial attacks, a hacker might generate such perturbed synthetic data to bypass the system's attack detection techniques without even having access to, or knowledge of, the DL model used for decision making [16], [17]. Perturbing the data recorded by sensors placed on safety-critical systems (using false data injection techniques [27], [54]) such as aircraft engines, smart grids, and gas pipelines can have a catastrophic impact on human lives and productivity, whereas attacks on financial data [55]–[57] have a direct impact on the economy. Indeed, the list of potential attacks presented in this section is not exhaustive, due to space limitations.
B. Adversarial attacks
The concept of an adversarial attack was first proposed by Szegedy et al. [58] for image recognition. The main idea is to add a small perturbation to the input images that is insignificant to human eyes but causes the target model to misclassify the input with high confidence. The severity of such attacks was shown by researchers in a recent experiment where a strip of tape on a 35 mph speed limit sign tricked a self-driving car into accelerating to 85 mph [59]. Based on the idea proposed in [58], many researchers have developed algorithms [14], [15], [58], [60] for constructing such adversarial examples, relying on the architecture and parameters of the DL model. Most of these adversarial attacks were proposed for image recognition tasks. The fast gradient sign method (FGSM) [14], introduced in 2014, demonstrated the presence of adversarial examples in image recognition tasks. Following FGSM, an iterative version of it, known as the basic iterative method (BIM) [15], was proposed in 2016. BIM is more effective in crafting stealthy adversarial examples; however, it comes with a higher computational cost. Comprehensive reviews of adversarial attacks on DL models in different applications can be found in [61]–[64].

Interestingly, adversarial attack approaches for multivariate time series DL regression models have been ignored by the community. There are only two previous works that consider adversarial attacks on time series. In [65], the authors adopt a soft K-Nearest-Neighbours (KNN) classifier coupled with Dynamic Time Warping (DTW) and show that adversarial examples can fool the proposed classifier on a simulated dataset. Unfortunately, the KNN classifier is no longer considered the state-of-the-art classifier for time series data [66]. The authors in [25] utilize the FGSM and BIM attacks to fool Residual Network (ResNet) classifiers in univariate time series classification tasks.
In our work, we also employ the FGSM and BIM attacks; however, we apply and evaluate their impact on DL regression models for multivariate time series forecasting. In summary, our work sheds light on the resiliency of DL regression models for multivariate time series forecasting in real-world safety-critical and cost-critical applications (as explained in Section II-A). This will guide data mining, data science, and machine learning researchers in developing techniques for detecting and mitigating adversarial attacks on time series data.

III. ADVERSARIAL EXAMPLES FOR MULTIVARIATE TIME SERIES
In this section, we formalize the problem and present the FGSM and BIM attack algorithms that we use to generate adversarial MTS examples for the DL models.
A. Formalization of MTS regression

Definition 1: Let X be a multivariate time series (MTS). X can be defined as a sequence X = [x_1, x_2, ..., x_T], where T = |X| is the length of X and x_i ∈ R^N is an N-dimensional data point at time i ∈ [1, T].

Definition 2: D = {(x_1, F_1), (x_2, F_2), ..., (x_T, F_T)} is the dataset of pairs (x_i, F_i), where F_i is the label corresponding to x_i.

Definition 3: The time series regression task consists of training a model on D in order to predict F̂ from the possible inputs. Let f(·): R^{N×T} → F̂ represent a DL model for regression.

Definition 4: J_f(·, ·) denotes the cost function (e.g., mean squared error) of the model f.

Algorithm 1: FGSM attack on multivariate time series
  Input: Original MTS X and its F̂
  Output: Perturbed MTS X′
  Parameter: ε
  η = ε · sign(∇_x J_f(X, F̂));
  X′ = X + η;

Definition 5: X′ denotes the adversarial example, a perturbed version of X such that F̂ ≠ F̂′ and ‖X − X′‖ ≤ ε, where ε ≥ 0, ε ∈ R, is the maximum perturbation magnitude.

Given a trained deep learning model f and an input MTS X, crafting an adversarial example X′ can be described as a box-constrained optimization problem [64]:

  min_{X′} ‖X′ − X‖  s.t.  f(X′) = F̂′, f(X) = F̂, and F̂ ≠ F̂′

Let η = X′ − X be the perturbation added to X. Fig. 1 shows the process where a perturbation η is added to the original MTS X to craft an adversarial example X′.

B. Fast gradient sign method
FGSM was first proposed in [14], where it was able to fool the GoogLeNet model by generating stealthy adversarial images. FGSM calculates the gradient of the cost function with respect to the neural network input. This attack is also known as a one-shot method, as the adversarial perturbation is generated by a single-step computation. Note that FGSM is an approximate solution based on a linear hypothesis [61]. Adversarial examples are produced by the following formulas:

  η = ε · sign(∇_x J_f(X, F̂))    (1)

  X′ = X + η    (2)

Here, J_f is the cost function of model f, ∇_x indicates the gradient of the cost with respect to the original MTS X with the correct label F̂, ε denotes the hyper-parameter that controls the amplitude of the perturbation, and X′ is the adversarial MTS. Algorithm 1 shows the steps of the FGSM attack.

C. Basic iterative method
BIM [15] is an extension of FGSM. In BIM, FGSM is applied multiple times with a small step size, and clipping is performed after each step to ensure that the perturbed values stay in the range [X − ε, X + ε], i.e., in the ε-neighbourhood of the original MTS X. BIM is therefore also known as Iterative-FGSM, as FGSM is iterated with smaller step sizes. Algorithm 2 shows the steps of the BIM attack, which requires three hyperparameters: the per-step small perturbation α, the maximum perturbation amount ε, and the number of iterations I. Note that BIM does not rely on a linear approximation of the model, and the adversarial examples crafted through BIM are closer to the original samples than those of FGSM. Because the perturbations are added iteratively, they have a better chance of fooling the network. However, compared to FGSM, BIM is computationally more expensive and slower.

Algorithm 2: BIM attack on multivariate time series
  Input: Original MTS X and its F̂
  Output: Perturbed MTS X′
  Parameters: I, ε, α
  X′ ← X;
  while i = 1 ≤ I do
    η = α · sign(∇_x J_f(X′, F̂));
    X′ = X′ + η;
    X′ = min{X + ε, max{X − ε, X′}};
    i++;
  end

IV. RESULTS
In this section, we evaluate the crafted adversarial examples on two datasets (from the finance and energy domains) and present the obtained results. We also provide a brief discussion of potential defense mechanisms for detecting adversarial MTS examples in DL regression models. For the sake of reproducibility, and to allow the research community to build on our findings, the artifacts (source code, datasets, etc.) of the following experiments are publicly available in our GitHub repository: https://github.com/dependable-cps/adversarial-MTSR.

A. Attacks on household power consumption

Due to the increasing demand for energy, there is a need for a smart infrastructure that meets growing demands and generates energy more efficiently. Recently, deep learning [67]–[69] has shown tremendous success in forecasting energy demand by training on past power consumption data and forecasting future energy consumption. This helps in making an informed decision about how much energy should be generated on a given day in the near future, avoids the excessive generation of surplus energy, and thus reduces the loss of resources, manpower, and cost. In this context, an adversarial attack could result in incorrect predictions of global active power, which is the power consumed by electrical appliances other than the sub-metered appliances. Such an incorrect forecast may lead to either an excessive surplus or an inadequate generation of energy, both of which have a direct impact on cost, productivity, available resources, and the environment.

In this work, we evaluate the impact of adversarial attacks on household energy forecasting using the individual household electric power consumption dataset [33]. This is a multivariate time series dataset that includes measurements of electric power consumption in one household with a one-minute sampling rate over almost 4 years (December 2006 to November 2010), collected via sub-meters placed in three distinct areas. The dataset comprises seven variables (besides the date and time): global active power, global reactive power, voltage, global intensity, and sub-metering (1 to 3). We re-sample the dataset from minutes to hours and then predict global active power using the seven variables as input features.
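This preprocessing can be sketched with pandas and NumPy. The column name, timestamps, and array shapes below are illustrative assumptions rather than verbatim dataset fields; the windowing mirrors the fixed-length input sequences fed to the models.

```python
import numpy as np
import pandas as pd

# Minute-level readings resampled to hourly means. The column name and
# timestamps are illustrative stand-ins, not verbatim dataset fields.
idx = pd.date_range("2006-12-16 00:00", periods=180, freq="min")
df = pd.DataFrame({"Global_active_power": np.linspace(1.0, 2.0, 180)},
                  index=idx)
hourly = df.resample("h").mean()          # 180 minutes -> 3 hourly rows

def make_windows(values, seq_len):
    """Cut a (T, N) array into (samples, seq_len, N) model inputs, with the
    next step of the first column (global active power) as the target."""
    X, y = [], []
    for i in range(len(values) - seq_len):
        X.append(values[i:i + seq_len])
        y.append(values[i + seq_len, 0])
    return np.array(X), np.array(y)

# Shapes are illustrative: 100 hourly steps, 7 input features, windows of 14
windows, targets = make_windows(np.random.default_rng(1).normal(size=(100, 7)),
                                seq_len=14)
```

Each window then serves as one supervised training sample for the regression models.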
We then use the first three years (2006 to 2009) to train our three DL models (LSTM, GRU, and CNN), and the last year's data to test them. The architectures of the DL models can be represented as LSTM(100,100,100) lh(14), GRU(100,100,100) lh(14), and CNN(60,60,60) lh(14). The notation LSTM(100,100,100) lh(14) refers to a network with 100 nodes in the hidden layers of each of the first, second, and third LSTM layers, and a sequence length of 14. At the end, there is a 1-dimensional output layer. In Fig. 2, we compare the performance of these three DL architectures in terms of their root mean squared error (RMSE) [70]. From Fig. 2, it is evident that LSTM(100,100,100) has the best performance (lowest RMSE) when predicting the global active power (without attack); it was trained for 250 epochs using the Adam optimizer [71], with grid search [72] for hyperparameter optimization to minimize the objective cost function, mean squared error (MSE). The hyperparameter settings for the evaluated DL models are shown in Table I.

Fig. 2: Comparison of deep learning algorithms for the power consumption dataset (true vs. predicted global active power): (a) CNN(60,60,60) lh(14), RMSE=0.562; (b) LSTM(100,100,100) lh(14), RMSE=0.541; (c) GRU(100,100,100) lh(14), RMSE=0.543.

Fig. 3 shows an example of the normalized FGSM and BIM attack signatures (adversarial examples) generated for the global reactive power variable (an input feature in the form of a time series). Similar adversarial examples are generated for the remaining input features to evaluate their impact on the LSTM, GRU, and CNN models for energy consumption prediction (global active power prediction). As shown in Fig. 3, the adversarial examples generated using BIM stay close to the original time series data, which makes such attacks stealthy, hard to detect, and often able to bypass attack detection algorithms.

Fig. 3: Attack signatures (original vs. FGSM vs. BIM) for the power consumption dataset, crafted for (a) CNN, (b) LSTM, and (c) GRU; FGSM (ε = 0.) and BIM (α = 0., ε = 0., and I = 200).

The impact of the generated adversarial examples on the household electric power consumption dataset is shown in Fig. 4. For the FGSM attack (with ε = 0.), we observe that the RMSE of the CNN, LSTM, and GRU models (under attack) increases by 19.9%, 12.3%, and 11%, respectively, when compared to the models without attack. For the BIM attack (with α = 0., ε = 0., and I = 200), we observe a similar trend: the RMSE of the CNN, LSTM, and GRU models increases by 25.9%, 22.9%, and 21.7%, respectively. For both FGSM and BIM, it is evident that the CNN model is more sensitive to adversarial attacks than the other DL models. Also, BIM results in a larger RMSE than FGSM. This means BIM is not only stealthier than FGSM, but also has a stronger impact on the DL regression models for this dataset.

For instance, as shown in Fig. 4a, the CNN MTS regression model forecasts the global active power (without attack) to be 2.10 kW and 4.51 kW at the 161st and 219th hours, respectively. After the FGSM and BIM attacks, the same CNN MTS regression model forecasts the global active power to be 1.36 kW and 0.37 kW at the 161st hour, and 5.24 kW and 6.94 kW at the 219th hour, respectively. This represents a 35.2% and 82.3% decrease, and a 16% and 53.8% increase, in the predicted values at the 161st and 219th hours, respectively (when compared to the no-attack situation). Such an under-prediction as a consequence of an attack may result in inadequate generation of energy, leading to a failure to meet future energy demands and a potential power outage. In contrast, over-prediction may result in surplus generation of energy, leading to increased cost and wasted resources.

TABLE I: Hyperparameter settings for the DL models

             Power consumption dataset      Google stock dataset
DL models    Hidden      Batch   Epochs     Hidden       Batch   Epochs
             neurons     size               neurons      size
CNN          60,60,60    512     200        60,60,60     14      250
LSTM         30,30,30    32      250        100,100,100  14      300
GRU          30,30,30    32      250        100,100,100  14      300
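As a concrete illustration of how the perturbed inputs behind Figs. 3 and 4 are produced, the following sketch implements Algorithms 1 and 2 against a toy linear model whose input gradient is analytic; with the actual CNN/LSTM/GRU models, the gradient ∇_x J_f would instead come from the framework's automatic differentiation. All names, shapes, and values here are illustrative, not the paper's implementation.

```python
import numpy as np

def fgsm(X, F, grad_J, eps):
    """One-shot FGSM (Algorithm 1): X' = X + eps * sign(dJ/dX)."""
    return X + eps * np.sign(grad_J(X, F))

def bim(X, F, grad_J, alpha, eps, iters):
    """Iterative FGSM (Algorithm 2), clipped to [X - eps, X + eps]."""
    X_adv = X.copy()
    for _ in range(iters):
        X_adv = X_adv + alpha * np.sign(grad_J(X_adv, F))
        X_adv = np.minimum(X + eps, np.maximum(X - eps, X_adv))
    return X_adv

# Toy differentiable "regression model": f(X) = sum(W * X), J = (f(X) - F)^2
rng = np.random.default_rng(0)
W = rng.normal(size=(7, 14))              # N=7 features, T=14 time steps

def predict(X):
    return float(np.sum(W * X))

def grad_J(X, F):
    return 2.0 * (predict(X) - F) * W     # analytic dJ/dX for squared error

X = rng.normal(size=(7, 14))              # original MTS window
F = predict(X) + 1.0                      # label offset so the gradient is nonzero

X_fgsm = fgsm(X, F, grad_J, eps=0.1)
X_bim = bim(X, F, grad_J, alpha=0.01, eps=0.1, iters=200)
```

Both attacks keep every perturbed value within the ε-neighbourhood of the original window while increasing the model's loss, which is exactly the behavior exploited in the experiments.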
B. Attacks on stock prices
Data scientists and financial theorists have, for the past 50 years, worked to make sense of the market and increase the return on investment. However, the multidimensional nature of the problem, its scale, and its inherent variation with time make this an overwhelming task. Advances in DL algorithms and their application to finance [1]–[4] have shown tremendous promise for revolutionizing this domain, including stock market analysis and prediction. DL algorithms can learn the multivariate nature of stocks and make more accurate predictions [73], [74]. In this context, an adversarial attack could result in incorrect stock price predictions, which may in turn diminish the return on investment and have a significant impact on the stock market.

Fig. 4: Power consumption prediction after FGSM (ε = 0.) and BIM (α = 0., ε = 0., and I = 200): (a) CNN during FGSM (RMSE=0.674) and BIM (RMSE=0.708); (b) LSTM during FGSM (RMSE=0.608) and BIM (RMSE=0.665); (c) GRU during FGSM (RMSE=0.603) and BIM (RMSE=0.661).

In this work, we evaluate the impact of adversarial attacks on Google stock prediction using the Google stock dataset [32], which contains Google stock prices for the past 5 years. This multivariate time series dataset has six variables, namely date, close, open, volume, high, and low. We use the latest 30% of the stock data as our test dataset and train our three DL models (LSTM, GRU, and CNN) on the remaining 70%. To predict the Google stock prices, we consider the average stock prices and the volume of stocks traded on the previous days as input features. As the Google stock price prediction depends on multiple input features, it is a multivariate regression problem. We utilize the past 60 days of data to predict the stock price of the next day. The architectures of our DL models can be represented as LSTM(30,30,30) lh(60), GRU(30,30,30) lh(60), and CNN(60,60,60) lh(60). From Fig. 5, it is evident that GRU(30,30,30) has the best performance (lowest RMSE) when predicting stock opening prices (without attack); it was trained for 300 epochs using the Adam optimizer [71], with grid search [72] for hyperparameter optimization to minimize the objective cost function, mean squared error (MSE). The hyperparameter settings for the evaluated DL models are shown in Table I.

Fig. 5: Comparison of deep learning algorithms for the Google stock dataset (true vs. predicted normalized stock opening price): (a) CNN(60,60,60) lh(60), RMSE=0.81; (b) LSTM(30,30,30) lh(60), RMSE=0.77; (c) GRU(30,30,30) lh(60), RMSE=0.76.

Fig. 6 shows an example of the normalized FGSM and BIM attack signatures (adversarial examples) generated for the volume of stocks traded (an input feature in the form of a time series). Similar adversarial examples are generated for the other input features to evaluate their impact on the LSTM, GRU, and CNN models for Google stock prediction (stock opening price). From Fig. 6, we observe that the adversarial examples generated using BIM stay close to the original time series data, which makes such attacks hard to detect and thus likely to bypass attack detection methods.

Fig. 6: Attack signatures (original vs. FGSM vs. BIM) for the Google stock dataset, crafted for (a) CNN, (b) LSTM, and (c) GRU; FGSM (ε = 0.) and BIM (α = 0., ε = 0., and I = 200).

The impact of the crafted adversarial examples on the Google stock dataset is shown in Fig. 7. For the FGSM attack (with ε = 0.), we observe that the RMSE of the CNN, LSTM, and GRU models (under attack) increases by 16%, 12.9%, and 13.1%, respectively, when compared to the models without attack. For the BIM attack (with α = 0., ε = 0., and I = 200), we again observe a similar trend: the RMSE of the CNN, LSTM, and GRU models (under attack) increases by 35.2%, 27.2%, and 28.9%, respectively. As with the power consumption dataset, we notice that the CNN model is more sensitive to adversarial attacks than the other DL models, and that BIM results in a larger RMSE than FGSM.

Fig. 7: Stock price prediction after FGSM (ε = 0.) and BIM (α = 0., ε = 0., and I = 200): (a) CNN during FGSM (RMSE=0.94) and BIM (RMSE=1.1); (b) LSTM during FGSM (RMSE=0.87) and BIM (RMSE=0.98); (c) GRU during FGSM (RMSE=0.86) and BIM (RMSE=0.98).

For instance, as shown in Fig. 7a, the CNN MTS regression model forecasts the normalized stock opening price (without attack) to be $0.781 on day 11 and $0.662 on day 297. After the FGSM and BIM attacks, the same CNN MTS regression model forecasts the normalized stock opening price to be $0.864 and $0.975 on day 11, and $0.607 and $0.556 on day 297, respectively. This represents a 10.6% and 24.8% increase, and an 8.3% and 16% decrease, in the predicted stock prices on days 11 and 297, respectively (when compared to the no-attack situation). Such over-prediction and under-prediction of stock prices may lead investors to invest more or less in a particular stock while the actual price is decreasing or increasing, respectively, leading to a loss in the return on investment in both cases.

C. Performance variation vs. the amount of perturbation
In Fig. 8, we evaluate the performance of the LSTM and GRU regression models with respect to the amount of perturbation allowed for crafting the adversarial MTS examples. We pick LSTM and GRU because they showed the best performance on the MTS regression tasks in Fig. 2 and Fig. 5. We observe that for larger values of ε, FGSM is not very effective at generating adversarial MTS examples that fool the LSTM and GRU regression models. In comparison, with larger values of ε, BIM crafts more devastating adversarial MTS examples for both regression models, and thus the RMSE follows an increasing trend. This is due to the fact [15] that BIM adds a small amount of perturbation α on each iteration, whereas FGSM adds an ε amount of noise to each data point in the MTS, which may not be very effective at producing inaccurate forecasts with higher RMSE values.
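An ε-sweep of this kind can be sketched with the same style of toy differentiable model used earlier, standing in for the trained LSTM/GRU (the model, shapes, and ε grid are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

# Toy linear "regression model" with an analytic input gradient
rng = np.random.default_rng(0)
W = rng.normal(size=(7, 14))
predict = lambda X: float(np.sum(W * X))

Xs = [rng.normal(size=(7, 14)) for _ in range(50)]
ys = [predict(X) + 1.0 for X in Xs]      # offset labels so gradients are nonzero

rmses = []
for eps in [0.0, 0.05, 0.1, 0.2]:
    preds = []
    for X, y in zip(Xs, ys):
        grad = 2.0 * (predict(X) - y) * W                # dJ/dX for squared error
        preds.append(predict(X + eps * np.sign(grad)))   # FGSM with budget eps
    rmses.append(rmse(ys, preds))
```

Plotting `rmses` against the ε grid reproduces, in miniature, the kind of RMSE-vs-perturbation curve shown in Fig. 8 (for this linear toy model the curve is monotonically increasing).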
Fig. 8: RMSE variation with respect to the amount of perturbation (ε) for the FGSM and BIM attacks: (a) power consumption dataset (LSTM model); (b) Google stock dataset (GRU model).

TABLE II: Transferability of FGSM and BIM attacks for the power consumption and Google stock datasets. The notation X/Y represents the percentage of RMSE increase using FGSM/BIM.
DL models | Power consumption dataset       | Google stock dataset
          | CNN       LSTM       GRU        | CNN       LSTM       GRU
CNN       | -         10.2/18.7  10.8/18.1  | -         16.9/24.1  16.2/23.4
LSTM      | 8.3/16.9  -          7.5/11.2   | 13.1/18.6 -          11.1/16.4
GRU       | 9.2/16.5  6.6/11.7   -          | 13.8/19.7 11.6/16.3  -
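To make the attack mechanics concrete, the following sketch implements the FGSM and BIM update rules on a toy differentiable regressor. The linear model, its weights, and the example input are illustrative stand-ins for the paper's CNN/LSTM/GRU models, not their actual implementations.

```python
import numpy as np

# Toy differentiable "regression model": y = w . x with squared-error loss.
# All names below (w, x, t) are illustrative, not from the paper's setup.

def input_gradient(w, x, t):
    """Gradient of the loss (w.x - t)^2 with respect to the input x."""
    return 2.0 * (w @ x - t) * w

def fgsm(w, x, t, eps):
    """Fast Gradient Sign Method: one eps-sized step along sign of the gradient."""
    return x + eps * np.sign(input_gradient(w, x, t))

def bim(w, x, t, eps, alpha, iters):
    """Basic Iterative Method: repeated alpha-sized steps, clipped so the
    total perturbation stays within the eps-ball around the original input."""
    x_adv = x.copy()
    for _ in range(iters):
        x_adv = x_adv + alpha * np.sign(input_gradient(w, x_adv, t))
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

w = np.array([1.0, -2.0])   # illustrative model weights
x = np.array([0.5, 0.5])    # one (flattened) MTS input window
t = 0.0                     # ground-truth target

x_fgsm = fgsm(w, x, t, eps=0.1)
x_bim = bim(w, x, t, eps=0.1, alpha=0.05, iters=200)
# Both attacks increase the regression error: |w.x_adv - t| > |w.x - t|.
```

Note how BIM clips the iterate back into the ε-ball around the original input on every step, which is why a small α with many iterations can stay stealthier than a single ε-sized FGSM step.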
D. Transferability of adversarial examples
To evaluate the transferability of adversarial attacks, we apply the adversarial examples crafted for one DL MTS regression model to the other DL models. Table II summarizes the obtained results on transferability. We observe that, for both datasets, the adversarial examples crafted for CNN are the most transferable. This means a higher RMSE is observed when adversarial examples crafted for the CNN model are transferred to the other models. For instance, adversarial MTS examples crafted using BIM for the CNN regression model (Google stock dataset) cause a 23.4% increase in RMSE when transferred to the GRU regression model. A similar trend, although with lower percentage increases, is also observed when adversarial examples crafted for the GRU and LSTM regression models are transferred to the other DL regression models. In addition, the obtained results also show that BIM is better than FGSM at fooling the DL models for MTS regression tasks even when the examples are transferred, i.e., BIM increases the RMSE more than FGSM does. Overall, the results show that the adversarial examples are capable of generalizing to a different DL network architecture. This property enables black-box attacks, where the attackers do not have access to the target model's internal parameters, yet they are able to generate perturbed time series that fool the DL models for MTSR tasks.
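The transferability measurement reported in Table II can be sketched as follows: craft adversarial examples against a source model, then compare a separate target model's RMSE on clean versus transferred inputs. Both linear "models", their weights, and the noise level are hypothetical placeholders for the trained DL regressors.

```python
import numpy as np

def rmse(pred, target):
    """Root mean squared error, the metric used throughout the paper."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                       # illustrative input windows
t = X @ np.array([1.0, -2.0]) + 0.1 * rng.normal(size=200)  # noisy targets

# Two hypothetical linear regressors standing in for the DL models:
w_src = np.array([1.0, -2.0])   # source model the attacker can access
w_tgt = np.array([0.9, -1.8])   # unseen target model

# FGSM examples crafted against the *source* model only.
grad = 2.0 * (X @ w_src - t)[:, None] * w_src       # d/dx of (w.x - t)^2
X_adv = X + 0.1 * np.sign(grad)

clean_rmse = rmse(X @ w_tgt, t)
transfer_rmse = rmse(X_adv @ w_tgt, t)              # examples transferred to w_tgt
pct_increase = 100.0 * (transfer_rmse - clean_rmse) / clean_rmse
```

The X/Y entries in Table II report exactly this kind of percentage RMSE increase on the target model, computed with FGSM- and BIM-crafted examples respectively.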
E. Defense against adversarial attacks
Researchers have proposed different types of adversarial attack defense strategies so far [61], most of which are applicable to the image domain. The existing adversarial attack defense strategies can be divided into three categories: modifying data, modifying models, and using auxiliary tools. Modifying data refers to modifying the training dataset in the training stage or changing the input data in the testing stage; it includes adversarial training [14], blocking transferability [75], data compression [76], gradient hiding [77], and data randomization [78]. In contrast, modifying models refers to the modification of the DL models themselves, such as defensive distillation [79], feature squeezing [80], regularization [81], deep contractive networks [82], and mask defense [83]. Adding extra components alongside the DL models is referred to as using auxiliary tools, which includes the use of Defense-GAN [84], MagNet [85], and high-level representation guided denoisers [86]. Unfortunately, most of these detectors are themselves prone to adversarial attacks, since attacks can be designed specifically to fool such detectors [64]. Hence, the time series, data mining, and machine learning communities need to pay special attention to this area, as DL MTS regression models are gaining popularity in safety-critical and cost-critical application domains. A potential idea for the detection of adversarial examples in MTS DL regression models is the use of the inductive conformal anomaly detection method [87], [88]. Another potential idea is to leverage the decades of research into non-probabilistic classifiers, such as the nearest neighbor coupled with DTW [64].
V. CONCLUSION
In this paper, we introduced the concept of adversarial attacks on deep learning (DL) regression models for multivariate time series (MTS) regression. We formalized and evaluated two adversarial example generation techniques, originally proposed for the image domain, for the MTS regression task. The obtained results showed how adversarial attacks can induce inaccurate forecasting when evaluated on the household power consumption and Google stock datasets. We also observed that BIM is not only a stealthier attack but also causes higher damage to DL MTS regression models. Finally, among the three evaluated DL regression models, the obtained results revealed that the adversarial examples crafted for CNN are more transferable than those crafted for the others. Through our work, we shed light on the importance of acknowledging adversarial attacks as one of the prominent threats to DL MTS regression models as they find their applications in safety-critical and cost-critical domains.
In the future, we would like to extend our work by adapting other adversarial attacks from the image domain and evaluating them for MTS DL regression. In addition, we also plan to investigate defense strategies to detect and mitigate adversarial threats in DL regression models.
REFERENCES
[1] O. B. Sezer, M. U. Gudelek, and A. M. Ozbayoglu, "Financial time series forecasting with deep learning: A systematic literature review: 2005–2019," Applied Soft Computing, vol. 90, p. 106181, 2020.
[2] L. Gan, H. Wang, and Z. Yang, "Machine learning solutions to challenges in finance: An application to the pricing of financial products," Technological Forecasting and Social Change, vol. 153, p. 119928, 2020.
[3] J. Sirignano and R. Cont, "Universal features of price formation in financial markets: perspectives from deep learning," Quantitative Finance, vol. 19, no. 9, pp. 1449–1459, 2019.
[4] S. I. Lee and S. J. Yoo, "Multimodal deep learning for finance: integrating and forecasting international stock markets," The Journal of Supercomputing, pp. 1–19, 2019.
[5] A. G. Salman, B. Kanigoro, and Y. Heryadi, "Weather forecasting using deep learning techniques." IEEE, 2015, pp. 281–285.
[6] M. Hossain, B. Rekabdar, S. J. Louis, and S. Dascalu, "Forecasting the weather of nevada: A deep learning approach." IEEE, 2015, pp. 1–6.
[7] M. Khan, N. Javaid, M. N. Iqbal, M. Bilal, S. F. A. Zaidi, and R. A. Raza, "Load prediction based on multivariate time series forecasting for energy consumption and behavioral analytics," in Conference on Complex, Intelligent, and Software Intensive Systems. Springer, 2018, pp. 305–316.
[8] S. Chan, I. Oktavianti, and V. Puspita, "A deep learning cnn and ai-tuned svm for electricity consumption forecasting: Multivariate time series data." IEEE, 2019, pp. 0488–0494.
[9] C. H. Fontes and O. Pereira, "Pattern recognition in multivariate time series–a case study applied to fault detection in a gas turbine," Engineering Applications of Artificial Intelligence, vol. 49, pp. 10–18, 2016.
[10] J. Lei, C. Liu, and D. Jiang, "Fault diagnosis of wind turbine based on long short-term memory networks," Renewable Energy, vol. 133, pp. 422–432, 2019.
[11] H. Zou, Y. Zhou, J. Yang, and C. J. Spanos, "Towards occupant activity driven smart buildings via wifi-enabled iot devices and deep learning," Energy and Buildings, vol. 177, pp. 12–22, 2018.
[12] W. Zhang, W. Hu, and Y. Wen, "Thermal comfort modeling for smart buildings: A fine-grained deep learning approach," IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2540–2549, 2018.
[13] H. I. Fawaz, G. Forestier, J. Weber, L. Idoumghar, and P.-A. Muller, "Deep neural network ensembles for time series classification." IEEE, 2019, pp. 1–6.
[14] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," arXiv preprint arXiv:1412.6572, 2014.
[15] A. Kurakin, I. Goodfellow, and S. Bengio, "Adversarial examples in the physical world," arXiv preprint arXiv:1607.02533, 2016.
[16] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, "Intriguing properties of neural networks," arXiv preprint arXiv:1312.6199, 2013.
[17] Y. Liu, X. Chen, C. Liu, and D. Song, "Delving into transferable adversarial examples and black-box attacks," arXiv preprint arXiv:1611.02770, 2016.
[18] A. Goel, A. Singh, A. Agarwal, M. Vatsa, and R. Singh, "Smartbox: Benchmarking adversarial detection and mitigation algorithms for face recognition." IEEE, 2018, pp. 1–7.
[19] I. Rosenberg, A. Shabtai, Y. Elovici, and L. Rokach, "Defense methods against adversarial examples for recurrent neural networks," arXiv preprint arXiv:1901.09963, 2019.
[20] G. Goswami, N. Ratha, A. Agarwal, R. Singh, and M. Vatsa, "Unravelling robustness of deep learning based face recognition against adversarial attacks," in Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
[21] S. Kokalj-Filipovic, R. Miller, N. Chang, and C. L. Lau, "Mitigation of adversarial examples in rf deep classifiers utilizing autoencoder pre-training," arXiv preprint arXiv:1902.08034, 2019.
[22] L. Song, R. Shokri, and P. Mittal, "Privacy risks of securing machine learning models against adversarial examples," arXiv preprint arXiv:1905.10291, 2019.
[23] C. Song, H.-P. Cheng, H. Yang, S. Li, C. Wu, Q. Wu, Y. Chen, and H. Li, "Mat: A multi-strength adversarial training method to mitigate adversarial attacks." IEEE, 2018, pp. 476–481.
[24] N. Carlini, G. Katz, C. Barrett, and D. L. Dill, "Provably minimally-distorted adversarial examples," arXiv preprint arXiv:1709.10207, 2017.
[25] H. I. Fawaz, G. Forestier, J. Weber, L. Idoumghar, and P.-A. Muller, "Adversarial attacks on deep neural networks for time series classification." IEEE, 2019, pp. 1–8.
[26] Z. Wang, W. Yan, and T. Oates, "Time series classification from scratch with deep neural networks: A strong baseline." IEEE, 2017, pp. 1578–1585.
[27] A. S. Musleh, G. Chen, and Z. Y. Dong, "A survey on the detection algorithms for false data injection attacks in smart grids," IEEE Transactions on Smart Grid, 2019.
[28] A. Gasparin, S. Lukovic, and C. Alippi, "Deep learning for time series forecasting: The electric load case," arXiv preprint arXiv:1907.09207, 2019.
[29] S. Hochreiter and J. Schmidhuber, "Lstm can solve hard long time lag problems," in Advances in Neural Information Processing Systems, 1997, pp. 473–479.
[30] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, "Learning phrase representations using rnn encoder-decoder for statistical machine translation," arXiv preprint arXiv:1406.1078, 2014.
[31] J. Gu, Z. Wang, J. Kuen, L. Ma, A. Shahroudy, B. Shuai, T. Liu, X. Wang, G. Wang, J. Cai et al., "Recent advances in convolutional neural networks," Pattern Recognition.
International Journal of Forecasting, vol. 22, no. 3, pp. 443–473, 2006.
[35] N. Naing, W. Yan, and Z. Z. Htike, "State of the art machine learning techniques for time series forecasting: A survey," Advanced Science Letters, vol. 21, no. 11, pp. 3574–3576, 2015.
[36] A. Tealab, "Time series forecasting using artificial neural networks methodologies: A systematic review," Future Computing and Informatics Journal, vol. 3, no. 2, pp. 334–340, 2018.
[37] A. Borovykh, S. Bohte, and C. W. Oosterlee, "Conditional time series forecasting with convolutional neural networks," arXiv preprint arXiv:1703.04691, 2017.
[38] J. C. B. Gamboa, "Deep learning for time-series analysis," arXiv preprint arXiv:1701.01887, 2017.
[39] D. Vengertsev, "Deep learning architecture for univariate time series forecasting," Cs229, pp. 3–7, 2014.
[40] Y. Tian, K. Zhang, J. Li, X. Lin, and B. Yang, "Lstm-based traffic flow prediction with missing data," Neurocomputing, vol. 318, pp. 297–305, 2018.
[41] A. Sagheer and M. Kotb, "Time series forecasting of petroleum production using deep lstm recurrent networks," Neurocomputing, vol. 323, pp. 203–213, 2019.
[42] J. Cao, Z. Li, and J. Li, "Financial time series forecasting model based on ceemdan and lstm," Physica A: Statistical Mechanics and its Applications, vol. 519, pp. 127–139, 2019.
[43] M. C. Sorkun, Ö. D. İncel, and C. Paoli, "Time series forecasting on multivariate solar radiation data using deep learning (lstm)," Turkish Journal of Electrical Engineering & Computer Sciences, vol. 28, no. 1, pp. 211–223, 2020.
[44] M. Yuan, Y. Wu, and L. Lin, "Fault diagnosis and remaining useful life estimation of aero engine using lstm neural network." IEEE, 2016, pp. 135–140.
[45] R. Jozefowicz, W. Zaremba, and I. Sutskever, "An empirical exploration of recurrent network architectures," in International Conference on Machine Learning, 2015, pp. 2342–2350.
[46] P. T. Yamak, L. Yujian, and P. K. Gadosey, "A comparison between arima, lstm, and gru for time series forecasting," in Proceedings of the 2019 2nd International Conference on Algorithms, Computing and Artificial Intelligence, 2019, pp. 49–55.
[47] Q. Tao, F. Liu, Y. Li, and D. Sidorov, "Air pollution forecasting using a deep learning model based on 1d convnets and bidirectional gru," IEEE Access, vol. 7, pp. 76690–76698, 2019.
[48] E. De Brouwer, J. Simm, A. Arany, and Y. Moreau, "Gru-ode-bayes: Continuous modeling of sporadically-observed time series," in Advances in Neural Information Processing Systems, 2019, pp. 7377–7388.
[49] P. Jia, H. Liu, S. Wang, and P. Wang, "Research on a mine gas concentration forecasting model based on a gru network," IEEE Access, vol. 8, pp. 38023–38031, 2020.
[50] M. Shen, Q. Xu, K. Wang, M. Tu, and B. Wu, "Short-term bus load forecasting method based on cnn-gru neural network," in Proceedings of PURPLE MOUNTAIN FORUM 2019-International Forum on Smart Grid Protection and Control. Springer, 2020, pp. 711–722.
[51] X. Dong, L. Qian, and L. Huang, "A cnn based bagging learning approach to short-term load forecasting in smart grid." IEEE, 2017, pp. 1–6.
[52] A. Arratia and E. Sepúlveda, "Convolutional neural networks, image recognition and financial time series forecasting," in Workshop on Mining Data for Financial Applications. Springer, 2019, pp. 60–69.
[53] Z. Zheng, Y. Yang, X. Niu, H.-N. Dai, and Y. Zhou, "Wide and deep convolutional neural networks for electricity-theft detection to secure smart grids," IEEE Transactions on Industrial Informatics, vol. 14, no. 4, pp. 1606–1615, 2017.
[54] G. R. Mode, P. Calyam, and K. A. Hoque, "Impact of false data injection attacks on deep learning enabled predictive analytics," in NOMS 2020-2020 IEEE/IFIP Network Operations and Management Symposium. IEEE, 2020, pp. 1–7.
[55] E. W. Ngai, Y. Hu, Y. H. Wong, Y. Chen, and X. Sun, "The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature," Decision Support Systems, vol. 50, no. 3, pp. 559–569, 2011.
[56] S. Das, A. Mukhopadhyay, and M. Anand, "Stock market response to information security breach: A study using firm and attack characteristics," Journal of Information Privacy and Security, vol. 8, no. 4, pp. 27–55, 2012.
[57] M. S. Akshaya and G. Padmavathi, "Taxonomy of security attacks and risk assessment of cloud computing," in Advances in Big Data and Cloud Computing. Springer, 2019, pp. 37–59.
[58] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, "Intriguing properties of neural networks," arXiv preprint arXiv:1312.6199, 2013.
arXiv preprint arXiv:1706.06083, 2017.
[61] S. Qiu, Q. Liu, S. Zhou, and C. Wu, "Review of artificial intelligence adversarial attack and defense technologies," Applied Sciences, vol. 9, no. 5, p. 909, 2019.
[62] H. Xu, Y. Ma, H. Liu, D. Deb, H. Liu, J. Tang, and A. Jain, "Adversarial attacks and defenses in images, graphs and text: A review," arXiv preprint arXiv:1909.08072, 2019.
[63] B. Biggio, P. Russu, L. Didaci, F. Roli et al., "Adversarial biometric recognition: A review on biometric system security from the adversarial machine-learning perspective," IEEE Signal Processing Magazine, vol. 32, no. 5, pp. 31–41, 2015.
[64] X. Yuan, P. He, Q. Zhu, and X. Li, "Adversarial examples: Attacks and defenses for deep learning," IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 9, pp. 2805–2824, 2019.
[65] I. Oregi, J. Del Ser, A. Perez, and J. A. Lozano, "Adversarial sample crafting for time series classification with elastic similarity measures," in International Symposium on Intelligent and Distributed Computing. Springer, 2018, pp. 26–39.
[66] A. Bagnall, J. Lines, A. Bostrom, J. Large, and E. Keogh, "The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances," Data Mining and Knowledge Discovery, vol. 31, no. 3, pp. 606–660, 2017.
[67] T.-Y. Kim and S.-B. Cho, "Predicting residential energy consumption using cnn-lstm neural networks," Energy, vol. 182, pp. 72–81, 2019.
[68] Z. Wang, T. Hong, and M. A. Piette, "Data fusion in predicting internal heat gains for office buildings through a deep learning approach," Applied Energy, vol. 240, pp. 386–398, 2019.
[69] J. Moon, J. Park, E. Hwang, and S. Jun, "Forecasting power consumption for higher educational institutions based on machine learning," The Journal of Supercomputing, vol. 74, no. 8, pp. 3778–3800, 2018.
[70] T. Chai and R. R. Draxler, "Root mean square error (rmse) or mean absolute error (mae)?–arguments against avoiding rmse in the literature," Geoscientific Model Development, vol. 7, no. 3, pp. 1247–1250, 2014.
[71] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.
[72] M.-A. Zöller and M. F. Huber, "Survey on automated machine learning," arXiv preprint arXiv:1904.12054, 2019.
[73] W. Long, Z. Lu, and L. Cui, "Deep learning-based feature engineering for stock price movement prediction," Knowledge-Based Systems, vol. 164, pp. 163–173, 2019.
[74] Y. Song, J. W. Lee, and J. Lee, "A study on novel filtering and relationship between input-features and target-vectors in a deep learning model for stock price prediction," Applied Intelligence, vol. 49, no. 3, pp. 897–911, 2019.
[75] H. Hosseini, Y. Chen, S. Kannan, B. Zhang, and R. Poovendran, "Blocking transferability of adversarial examples in black-box learning systems," arXiv preprint arXiv:1703.04318, 2017.
[76] N. Das, M. Shanbhogue, S.-T. Chen, F. Hohman, L. Chen, M. E. Kounavis, and D. H. Chau, "Keeping the bad guys out: Protecting and vaccinating deep learning with jpeg compression," arXiv preprint arXiv:1705.02900, 2017.
[77] N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami, "Practical black-box attacks against machine learning," in Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security. ACM, 2017, pp. 506–519.
[78] C. Xie, J. Wang, Z. Zhang, Y. Zhou, L. Xie, and A. Yuille, "Adversarial examples for semantic segmentation and object detection," in Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 1369–1378.
[79] N. Papernot, P. McDaniel, X. Wu, S. Jha, and A. Swami, "Distillation as a defense to adversarial perturbations against deep neural networks." IEEE, 2016, pp. 582–597.
[80] W. Xu, D. Evans, and Y. Qi, "Feature squeezing: Detecting adversarial examples in deep neural networks," arXiv preprint arXiv:1704.01155, 2017.
[81] B. Biggio, B. Nelson, and P. Laskov, "Support vector machines under adversarial label noise," in Asian Conference on Machine Learning, 2011, pp. 97–112.
[82] S. Gu and L. Rigazio, "Towards deep neural network architectures robust to adversarial examples," arXiv preprint arXiv:1412.5068, 2014.
[83] J. Gao, B. Wang, Z. Lin, W. Xu, and Y. Qi, "Deepcloak: Masking deep neural network models for robustness against adversarial samples," arXiv preprint arXiv:1702.06763, 2017.
[84] P. Samangouei, M. Kabkab, and R. Chellappa, "Defense-gan: Protecting classifiers against adversarial attacks using generative models," arXiv preprint arXiv:1805.06605, 2018.
[85] D. Meng and H. Chen, "Magnet: a two-pronged defense against adversarial examples," in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2017, pp. 135–147.
[86] F. Liao, M. Liang, Y. Dong, T. Pang, X. Hu, and J. Zhu, "Defense against adversarial attacks using high-level representation guided denoiser," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 1778–1787.
[87] V. Balasubramanian, S.-S. Ho, and V. Vovk, Conformal Prediction for Reliable Machine Learning: Theory, Adaptations and Applications. Newnes, 2014.
[88] D. Volkhonskiy, I. Nouretdinov, A. Gammerman, V. Vovk, and E. Burnaev, "Inductive conformal martingales for change-point detection," arXiv preprint arXiv:1706.03415.