Analysis of stock index with a generalized BN-S model: an approach based on machine learning and fuzzy parameters

Abstract

We use the superposition of Lévy processes to improve the classic BN-S model. Because price parameters fluctuate frequently and are difficult to estimate accurately in the model, we preprocess the price data based on fuzzy theory. The prices of S&P 500 stock index options over the past ten years are analyzed, and the deterministic fluctuations are captured by machine learning methods. The results show that the new model in a fuzzy environment solves the long-term dependence problem of the classic model with fewer parameter changes, and effectively analyzes the random dynamic characteristics of stock index option price time series.


arXiv [q-fin.MF], Jan

Xianfei Hui

School of Management, Harbin Institute of Technology, Harbin, 150001, China; Department of Mathematics, North Dakota State University, Fargo, North Dakota 58108, USA

Baiqing Sun

School of Management, Harbin Institute of Technology, Harbin, 150001, China

Hui Jiang

College of Management and Economics, Tianjin University, Tianjin 300072, China

Indranil SenGupta

Department of Mathematics, North Dakota State University, Fargo, North Dakota 58108, USA

Keywords: Barndorff-Nielsen and Shephard model, Lévy process, Fuzzy sets, Machine learning, Stock index options.

The stock index, the most important indicator that reflects and predicts financial market fluctuations and global economic changes, is playing an increasingly important role in the capital market. The total value and trading volume of globally traded stock index option products are increasing year by year. A stock index is often used as a benchmark to measure the performance of investment funds, as the basis of passive management that replicates its performance, as an underlying for derivative instruments, as a benchmark indicator for financial contracts, and as a refined risk management tool [20]. Most collective investment funds are based on indexes [23]. For example, the assets under management of US $ [...]

[...] $\tau(\nu, \alpha)$ and inverse Gaussian distribution, and modeled to realize the exchange of variance and volatility. Considering the possibility of two-way jumps in the asset price process, Bannör et al. [6] proposed an extended form of the BN-S stochastic volatility model, the double-sided BN-S model, and calibrated the new model to the prices of foreign exchange options. Based on the stationary and self-decomposable distribution of the variance process, Awasthi et al. [2] gave an approximate expression of the BN-S model, and analyzed the first exit time and its distribution for the combination of Brownian motion and Lévy subordinator terms.

Machine learning is affecting the financial field in a pervasive way [9]. In research on topics such as price prediction, risk control, volatility simulation, quantitative trading, data processing, and trend analysis, machine learning technology is widely used. Guo et al. [11] used machine learning to study the short-term predictability of the Bitcoin market against fluctuations of the US dollar exchange rate. By applying deep learning methods to high-frequency financial data sets, Sirignano et al. [27] found non-parametric evidence for the existence of a general and fixed price formation mechanism related to price dynamics. Xu [36] proposed an effective moment-matching machine learning method to manage and hedge the risks associated with variable annuity (VA) products. Cont et al. [8] proposed a stochastic machine learning algorithm, calculated the best trading strategy, and studied the sensitivity of the solution to various parameters in cross-platform complex transactions. Shoshi et al. [31] proposed the volatility method and the duration method to capture the random behavior of a time series, analyzed the Bakken crude oil data with various machine learning and deep learning algorithms, found the deterministic model parameters, and obtained variance swap pricing and hedging strategies. Roberts et al. [24, 25] proposed a sequential hypothesis test that detects the general jump size distribution and applied it to a crude oil price data set, improving the stochastic model with various machine learning and deep learning algorithms.

Fuzzy theory is a powerful tool for describing uncertain events [40], and has a wide range of applications in management science, decision science, and intelligent control. The financial market exhibits both randomness and ambiguity, mainly reflected in the uncertainty of market changes, the information asymmetry between the parties to a transaction, and the real-time fluctuation of commodity transaction data. Some scholars have applied fuzzy theory to financial research with good results. Hatami-Marbini et al. [14] used fuzzy numbers to describe the indicators and factors of stock performance, and obtained fuzzy distances from expert confidence and fuzzy performance levels to optimize stock portfolios. The combination of fuzzy theory and deep learning has been used to predict changes in high-frequency financial data [10]. Gui et al. [12] converted financial time series into fuzzy granular series to predict the range of market trends. Wang et al. [33] optimized the compound option pricing model based on fuzzy interest rates and fuzzy volatility. Thavaneswaran [32] used fuzzy set theory to price binary options using trapezoidal, parabolic and adaptive fuzzy numbers for the stock price at maturity. Zhang [41] considered the crisp possibilistic mean of fuzzy numbers, and obtained the corresponding option pricing formula in a double exponential jump-diffusion model under a fuzzy environment. Zmeškal [42] proposed a generalized fuzzy stochastic binomial tree model to study the pricing of American real options. Nowak [21, 22] drew on expert opinions and imprecise information estimates, using the martingale method and fuzzy set theory to study European option valuation problems. The use of fuzzy theory has clear advantages in financial data processing: data processing involving fuzzy parameters can describe the fuzzy changes of real market information accurately.

In summary, building on previous research, we apply fuzzy sets and machine learning to the analysis of stock index option price volatility. The paper is organized as follows. Section 2 describes the effectiveness and application of the superposition of Lévy processes in the BN-S model, and compares the generalized BN-S model with the classic BN-S model in long-term applications. The generalized BN-S model overcomes the lack of long-term dependence through the superposition of Lévy subordinators, and is more effective in processing option price data. The financial data processing problem in a fuzzy random environment is introduced in Section 3. In Section 4, we analyze the price volatility data of S&P 500 options over the past ten years, and find the deterministic components of random volatility using machine learning algorithms. A brief conclusion is provided in Section 5.
The Barndorff-Nielsen and Shephard (BN-S) model is a stochastic volatility model commonly used to describe the dynamics of asset prices. We use it to capture some stylized features of financial asset time series in historical market data. The non-Gaussian Ornstein-Uhlenbeck (OU) process in the BN-S model is driven by an increasing Lévy process (a subordinator), and is a positive, mean-reverting stochastic process.

Consider a frictionless financial market where a stock and a risk-free asset with a fixed rate of return $r$ are traded up to a horizon date $T$. The BN-S model assumes that the price process of a stock or commodity, $S = (S_t)_{t \ge 0}$, is defined on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{0 \le t \le T}, P)$ and is given by

$$S_t = S_0 \exp(X_t), \tag{2.1}$$

where the log-return $X_t$ is governed by

$$dX_t = (\mu + \beta\sigma_t^2)\,dt + \sigma_t\,dW_t + \rho\,dZ_{\lambda t}. \tag{2.2}$$

Here $\sigma_t$ is the volatility at time $t$, $\mathbb{R}$ is the set of real numbers, and the parameters satisfy $\mu, \beta, \rho \in \mathbb{R}$ with $\rho \le 0$. The variance process is given by

$$d\sigma_t^2 = -\lambda\sigma_t^2\,dt + dZ_{\lambda t}, \quad \sigma_0^2 > 0, \tag{2.3}$$

where $\lambda \in \mathbb{R}$ and $\lambda > 0$. Under the probability measure $P$, $W = (W_t)$ is a standard Brownian motion, and the process $Z = (Z_{\lambda t})$ is a subordinator, also known as the background driving Lévy process (BDLP). It is assumed that the processes $W$ and $Z$ are independent, and that $(\mathcal{F}_t)$ is the usual augmentation of the filtration generated by $(W, Z)$.

There are still some problems in applying the classic BN-S model. Both the log-return and the variance are driven by a single BDLP, which makes them completely dependent on each other and leads to inaccurate volatility simulations. This perfect dependence means that the model fails over a longer time frame, and such a "longer" time frame may be as short as a few days. The model cannot consistently capture the basic characteristics of the relevant time series. In historical data, jumps in volatility are not completely synchronized with jumps in stock prices. The volatility $\sigma_t$ usually cannot respond immediately to sudden fluctuations in stock or commodity prices, which causes the classic BN-S model to fail.

These problems are addressed by the generalized BN-S model, which is built on a superposition of Lévy subordinators [29]. The new model describes option prices and volatility in a related but distinct way. We assume that $Z$ and $Z^*$ are two independent Lévy subordinators with the same (finite) variance. Then there exists a Lévy subordinator $\bar{Z}$, independent of $W$, given by

$$d\bar{Z}_{\lambda t} = \rho'\,dZ_{\lambda t} + \sqrt{1 - (\rho')^2}\,dZ^*_{\lambda t}, \quad 0 \le \rho' \le 1. \tag{2.4}$$

The dynamics of $\sigma_t^2$ are then

$$d\sigma_t^2 = -\lambda\sigma_t^2\,dt + d\bar{Z}_{\lambda t}, \quad \sigma_0^2 > 0, \tag{2.5}$$

where $\bar{Z} = (\bar{Z}_{\lambda t})$ is correlated with the process $Z$ in (2.3) and is also a subordinator independent of $W$. Referring to the log-return $X_t$ in (2.2), on some risk-neutral filtered probability space the dynamics with a convex combination of the subordinators $Z$ and $Z^{(b)}$ can be written as

$$dX_t = (\mu + \beta\sigma_t^2)\,dt + \sigma_t\,dW_t + \rho\left((1-\theta)\,dZ_{\lambda t} + \theta\,dZ^{(b)}_{\lambda t}\right), \tag{2.6}$$

where $\theta \in \mathbb{R}$.
$\theta$ is a deterministic parameter whose value lies between 0 and 1, and $\lambda > 0$. $Z_{\lambda t}$ and $Z^{(b)}_{\lambda t}$ are independent Lévy processes. Compared with $Z_{\lambda t}$, $Z^{(b)}_{\lambda t}$ corresponds to the greater Lévy intensity [30]. The new variance process is then given by

$$d\sigma_t^2 = -\lambda\sigma_t^2\,dt + (1-\theta')\,dZ_{\lambda t} + \theta'\,dZ^{(b)}_{\lambda t}, \quad \sigma_0^2 > 0, \tag{2.7}$$

where $\theta' \in [0, 1]$ is deterministic. The sum of $(1-\theta)Z_{\lambda t}$ and $\theta Z^{(b)}_{\lambda t}$ is also a Lévy process and is positively correlated with both $Z_{\lambda t}$ and $Z^{(b)}_{\lambda t}$.

The integrated variance over the time period $[t, T]$ is $\sigma_I^2 = \int_t^T \sigma_s^2\,ds$. Solving (2.7) gives

$$\sigma_I^2 = \epsilon(t, T)\,\sigma_t^2 + \int_t^T \epsilon(s, T)\left((1-\theta')\,dZ_{\lambda s} + \theta'\,dZ^{(b)}_{\lambda s}\right), \quad \epsilon(s, T) = \frac{1 - \exp(-\lambda(T-s))}{\lambda}, \quad t \le s \le T. \tag{2.8}$$

The realized variance over $[0, T]$ is written as

$$\sigma_R^2 = \frac{1}{T}\int_0^T \sigma_t^2\,dt + \rho^2(1-\theta)^2\lambda\,\mathrm{Var}[Z_1] + \rho^2\theta^2\lambda\,\mathrm{Var}[Z^{(b)}_1]. \tag{2.9}$$

We set $\theta = \theta'$ for convenience of calculation in the following text.

We assume that $J_Z$ is the jump measure associated with the subordinator $Z$, that $J^{(b)}_Z$ is the jump measure associated with the subordinator $Z^{(b)}$, and that

$$J(s) = \int_0^s \int_{\mathbb{R}^+} J_Z(\lambda\,d\tau, dy), \qquad J^{(b)}(s) = \int_0^s \int_{\mathbb{R}^+} J^{(b)}_Z(\lambda\,d\tau, dy).$$

Then the correlations of the log-returns in the classic BN-S model and in the generalized BN-S model are, respectively,

$$\mathrm{Corr}(X_t, X_s) = \frac{\int_0^s \sigma_\tau^2\,d\tau + \rho^2 J(s)}{\sqrt{\left(\int_0^t \sigma_\tau^2\,d\tau + t\rho^2\lambda\,\mathrm{Var}(Z_1)\right)\left(\int_0^s \sigma_\tau^2\,d\tau + s\rho^2\lambda\,\mathrm{Var}(Z_1)\right)}}, \quad t > s, \tag{2.10}$$

$$\mathrm{Corr}(X_t, X_s) = \frac{\int_0^s \sigma_\tau^2\,d\tau + \rho^2(1-\theta)^2 J(s) + \rho^2\theta^2 J^{(b)}(s)}{\sqrt{\alpha(t)\,\alpha(s)}}, \quad t > s, \tag{2.11}$$

where $\alpha(\nu) = \int_0^\nu \sigma_\tau^2\,d\tau + \nu\rho^2\lambda\left((1-\theta)^2\,\mathrm{Var}(Z_1) + \theta^2\,\mathrm{Var}(Z^{(b)}_1)\right)$.

In (2.10), for a fixed $s$, $\mathrm{Corr}(X_t, X_s)$ rapidly becomes smaller as $t$ increases. This shows that the fit of the classic BN-S model to random fluctuations degrades over time, which leads to inaccurate volatility simulation. This decay means that the model becomes severely invalid over a longer time range and cannot accurately capture the basic characteristics of the relevant time series. In (2.11), by contrast, the value of the parameter $\theta$ ensures that, for a fixed $s$, $\mathrm{Corr}(X_t, X_s)$ never becomes "too small", because the value of $t$ always has an upper limit. Compared with the classic model, the generalized BN-S model introduced in this article extracts deterministic components from a completely random process.
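To make the dynamics (2.6) and (2.7) concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes $\theta = \theta'$, an Euler discretization, and, purely for illustration, compound Poisson subordinators with exponentially distributed jumps, where $Z^{(b)}$ is given a higher jump rate than $Z$. All function names and parameter values are hypothetical.

```python
import math
import random


def subordinator_increment(rng, rate, mean_jump, length):
    """Increment of a compound Poisson subordinator over a time span `length`.

    Jumps arrive at `rate` per unit time and have Exp-distributed sizes with
    mean `mean_jump` (an illustrative choice of BDLP, not taken from the paper).
    """
    total = 0.0
    t = rng.expovariate(rate)          # time of the first jump
    while t < length:
        total += rng.expovariate(1.0 / mean_jump)
        t += rng.expovariate(rate)     # waiting time to the next jump
    return total


def simulate_generalized_bns(n_steps=252, dt=1.0 / 252, mu=0.05, beta=0.0,
                             rho=-0.5, lam=2.0, theta=0.3, sigma2_0=0.04,
                             seed=0):
    """Euler scheme for the log-return X_t (2.6) and variance sigma_t^2 (2.7)."""
    rng = random.Random(seed)
    x, sigma2 = 0.0, sigma2_0
    xs, sigma2s = [x], [sigma2]
    for _ in range(n_steps):
        # Increments of Z_{lambda t} and Z^{(b)}_{lambda t}: time runs at speed lam.
        dz = subordinator_increment(rng, rate=5.0, mean_jump=0.01, length=lam * dt)
        dzb = subordinator_increment(rng, rate=20.0, mean_jump=0.01, length=lam * dt)
        mix = (1.0 - theta) * dz + theta * dzb   # convex combination of the jumps
        dw = rng.gauss(0.0, math.sqrt(dt))       # Brownian increment
        x += (mu + beta * sigma2) * dt + math.sqrt(sigma2) * dw + rho * mix
        sigma2 += -lam * sigma2 * dt + mix       # stays positive while lam * dt < 1
        xs.append(x)
        sigma2s.append(sigma2)
    return xs, sigma2s
```

A price path then follows from (2.1) as `[s0 * math.exp(x) for x in xs]` for a chosen initial price `s0`. Setting `theta=0` recovers the classic model in this sketch, with the log-return and the variance driven by the single subordinator $Z$.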
The new model improves the long-term dependence of the classic model with fewer parameter changes, and offers clear advantages for the time series analysis of stock index option price fluctuations.

There are many uncertainties in the financial market, including randomness and fuzziness. The advantage of treating the daily price of stock index option assets as a fuzzy parameter is that it can accurately describe the range of price fluctuations. It can reduce the errors caused by unreasonable records in the historical data, and increase the accuracy and operability of the yield and risk quantification process. Fuzzy numbers are often used to describe uncertain information. Suppose $S$ is a domain and $x$ is any element of it, taken from the real number set $\mathbb{R}$. For a fuzzy set $A$, there is a mapping $\mu_A(x) \in [0, 1]$ corresponding to it. $\mu_A(x)$ represents the degree of membership of $x$ in $A$ and is called the membership function of $A$; it also defines a fuzzy number. We form fuzzy parameters by associating $\mu_A(x)$ with real data, which quantifies the fuzzy environment in the financial market. The general representation of the membership function can be written as

$$\mu_A(x) = \begin{cases} L(x), & l \le x \le m, \\ R(x), & m \le x \le r, \end{cases} \tag{3.1}$$

where $L(x)$ is a right-continuous increasing function with $0 \le L(x) \le 1$, and $R(x)$ is a left-continuous decreasing function with $0 \le R(x) \le 1$. The membership (confidence) level $\alpha$ is usually expressed through $A_\alpha = \{x : \mu_A(x) \ge \alpha\}$: the $\alpha$-level set of the fuzzy set $A$ consists of all elements whose membership in $A$ is greater than or equal to $\alpha$, so $\alpha \in [0, 1]$.

Figure 1: Triangular fuzzy number distribution

The triangular fuzzy number is one of the classic forms of fuzzy number and is widely used in fuzzy evaluation systems. The membership function $\mu_A(x)$ shows the degree to which the element $x$ belongs to the fuzzy set $A$. As shown in Figure 1, the membership function of a triangular fuzzy number is normal, continuous and convex, composed of a linear non-decreasing part and a linear non-increasing part. Generally, the membership function of the triangular fuzzy number $A = (a_l, a_m, a_u)$, with $a_l \le a_m \le a_u$, is expressed as follows:

$$\mu_A(x) = \begin{cases} 0, & \text{if } x \le a_l, \\ \dfrac{x - a_l}{a_m - a_l}, & \text{if } a_l \le x \le a_m, \\ \dfrac{a_u - x}{a_u - a_m}, & \text{if } a_m \le x \le a_u, \\ 0, & \text{if } x \ge a_u. \end{cases} \tag{3.2}$$

$a_l$ and $a_u$ are called the leftmost and rightmost values of the fuzzy set $A$, respectively, and describe the lower and upper limits of the triangular fuzzy number $A$. Their difference indicates the degree of fuzziness of $A$. $a_m$ is called the kernel of $A = (a_l, a_m, a_u)$ and represents the most likely value of the triangular fuzzy number. In particular, if $a_l = a_m = a_u$, the fuzzy number degenerates into a real number. In addition, the $\alpha$-level set ($\alpha$-cut) of $A$ is $A_\alpha = [(1-\alpha)a_l + \alpha a_m,\ (1-\alpha)a_u + \alpha a_m]$, with endpoints $A^L_\alpha = (1-\alpha)a_l + \alpha a_m$ and $A^R_\alpha = (1-\alpha)a_u + \alpha a_m$. If both $A^L_\alpha$ and $A^R_\alpha$ are integrable, the expectation $E(A)$ of $A$ is

$$E(A) = \frac{(1-\lambda)a_l + a_m + \lambda a_u}{2}, \quad 0 \le \lambda \le 1.$$

The value of $\lambda$ depends on the relative importance of the two fuzzy boundaries. (This weight $\lambda$ is unrelated to the parameter $\lambda$ of the BN-S model in Section 2.)

Table 1: Properties of the empirical data set

            Daily Price Change    Daily Price Change %    Daily Volatility Range
Mean               0.83                  0.04                    22.36
Median             1.43                  0.07                    16.59
Minimum         -228.62                 -8.56                     0
Maximum          180.36                  8.04                   218.96

To better describe financial market volatility in the BN-S model, and to address the problem that some input parameters of the model are difficult to estimate accurately, we consider the daily closing, highest and lowest prices of stock index options, and use triangular fuzzy numbers to describe the fuzzy daily closing price. Fuzzifying the closing price variable of stock index options, we obtain the closing price in fuzzy form, $S = (s_l, s_m, s_u)$. The new fuzzy closing price is composed of three real numbers, which correspond to the lowest, closing and highest prices of the day. We fix the value of $\lambda$ to obtain the expectation of the fuzzy closing price $\tilde{S} = (\mathrm{Price}_{\min}, \mathrm{Price}_{\mathrm{close}}, \mathrm{Price}_{\max})$, which serves as the daily fuzzy price in the subsequent analysis.
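The triangular fuzzy number operations above can be sketched in a few lines of Python. This is an illustrative sketch under the definitions of this section (membership function, $\alpha$-cut, and expectation with boundary weight $\lambda$, written `lam` in the code); the function names are hypothetical and not taken from the paper.

```python
def membership(x, a_l, a_m, a_u):
    """Membership degree of x in the triangular fuzzy number (a_l, a_m, a_u)."""
    if x == a_m:
        return 1.0                      # the kernel always has full membership
    if x <= a_l or x >= a_u:
        return 0.0
    if x < a_m:
        return (x - a_l) / (a_m - a_l)  # linear non-decreasing part
    return (a_u - x) / (a_u - a_m)      # linear non-increasing part


def alpha_cut(a_l, a_m, a_u, alpha):
    """Endpoints of the alpha-level set of the triangular fuzzy number."""
    left = (1.0 - alpha) * a_l + alpha * a_m
    right = (1.0 - alpha) * a_u + alpha * a_m
    return left, right


def expected_value(a_l, a_m, a_u, lam=0.5):
    """Expectation E(A) = ((1 - lam) * a_l + a_m + lam * a_u) / 2, 0 <= lam <= 1."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return ((1.0 - lam) * a_l + a_m + lam * a_u) / 2.0


def fuzzy_price(low, close, high, lam=0.5):
    """Daily fuzzy price from the day's lowest, closing and highest prices."""
    if not low <= close <= high:
        raise ValueError("expected low <= close <= high")
    return expected_value(low, close, high, lam)
```

For a day with low 99, close 100 and high 103, `fuzzy_price(99.0, 100.0, 103.0, 0.5)` gives 100.5; moving `lam` toward 0 pulls the fuzzy price toward the lower boundary and moving it toward 1 pulls it toward the upper boundary.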

This section gives a numerical example to find the value of $\theta$ in the above model. Stock index option price data over 10 years are considered. We select the S&P 500 (^GSPC) price data set from November 1, 2010 to October 30, 2020 (source: https://finance.yahoo.com/). The fluctuation characteristics of the empirical data set over time are shown in Table 1. From the daily closing price data, the daily rise and fall of the S&P 500 stock index option price follows directly. Figure 2 shows the annual curves of the close price, the daily rise and fall, and the volatility in the data set.

Affected by many factors inside and outside the financial market, the price of stock index options fluctuates many times within a day. In looking for the deterministic component in the random price time series, we hope to find a suitable daily price parameter to describe this fluctuation, which has both randomness and ambiguity. Data preprocessing using fuzzy parameters can describe the fuzzy range of price changes more accurately. We adjust the value of $\lambda$ in the triangular fuzzy number so that the fuzzy price reflects different risk preferences, different market trends, and different investment objectives. For investors with different risk preferences, the value of $\lambda$ is different. Risk-averse investors are more likely to be affected by the lower limit of the fuzzy price boundary, so we take the degree of risk aversion of investors to be inversely related to the value of $\lambda$. Assuming that the $\lambda$ corresponding to a risk-neutral investor is 0.5, an aggressive risk seeker corresponds to a higher $\lambda$. The value of $\lambda$ is also affected by the market environment. For example, in a bull market the overall trend is upward and the upper limit of the fuzzy boundary is more important, so the value of $\lambda$ is larger than in a bear market. Different values of $\lambda$ can also describe different investment objectives. We can use $\lambda$ closer to 0 to describe the volatility of put options, and $\lambda$ closer to 1 to describe the volatility of call options.

Generally, the daily closing price is the volume-weighted average price of all transactions one minute before the last transaction of the option on that day. The daily highest price and daily lowest price describe the degree of price change. In this paper, we use triangular fuzzy numbers to calculate the daily fuzzy price of the stock index option based on three variables: the daily lowest price, the daily highest price and the daily closing price. In Table 3, we list the feature estimators of fuzzy price data when $\lambda$ takes different values. Figure 3 provides a time series chart of the fuzzy price of S&P 500 stock index options. We take $\lambda$ = [...]

[Table: feature estimators of the fuzzy price, with columns for $\lambda = 0.3$, $\lambda = 0.5$ and a third value of $\lambda$]

Figure 2: S&P 500 close price, daily rise and fall and volatility

Figure 3: S&P 500 fuzzy price (November 1, 2010 to October 30, 2020)

We index the available fuzzy price data by date. Based on the attributes of the data set, we construct a machine learning classification problem that provides quantitative decision support for estimating the value of $\theta$ with reasonable accuracy. The specific steps are as follows:

Step 1 We select the available data and arrange the preprocessed fuzzy price data in order of date (from November 1, 2010 to October 30, 2020).

Figure 4: Moving average for the fuzzy price (legend: Fuzzy Price, 5d, 42d, 252d)

Figure 5: Yearly boxplot for the fuzzy price

Figure 6: Distribution plot for fuzzy price data

Figure 7: Bar chart for fuzzy price

Step 3 Price fluctuations are the focus of our attention. Based on the preceding step, we calculate the daily changes of the fuzzy prices and visualize these changes. Figures 8 and 9 are histograms of the daily change and the daily change percentage of the fuzzy price.

Step 4 We continue to quantify price volatility. Based on the preceding step, we calculate the realized volatility and the realized volatility return of the fuzzy price sequence. Figures 10 and 11 provide the heatmap and line plot of the realized volatility, and Figures 12 and 13 show the heatmap and line plot of the realized volatility return.
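The daily change and daily change percentage of the fuzzy price described in the steps above can be computed as in the following sketch. The paper does not state its formula for realized volatility, so the rolling definition used here (square root of the sum of squared log-returns over a window) is an assumption, as are the function names and the default window length.

```python
import math


def daily_changes(prices):
    """Daily change and daily change percentage of a price sequence."""
    pairs = list(zip(prices, prices[1:]))
    changes = [curr - prev for prev, curr in pairs]
    pct_changes = [100.0 * (curr - prev) / prev for prev, curr in pairs]
    return changes, pct_changes


def rolling_realized_volatility(prices, window=21):
    """Rolling realized volatility (assumed definition, see the lead-in).

    For each window of `window` consecutive log-returns, returns the square
    root of the sum of squared log-returns in that window.
    """
    log_returns = [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]
    result = []
    for end in range(window, len(log_returns) + 1):
        squared = sum(r * r for r in log_returns[end - window:end])
        result.append(math.sqrt(squared))
    return result
```

The percentage series returned by `daily_changes` is the quantity that the later steps compare against the threshold $C$.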

Figure 8: Histogram for daily change in fuzzy price

Figure 9: Histogram for daily change percentage in fuzzy price

Figure 10: Heatmap for the realized volatility of the fuzzy price over ten years (axes: Year, Month)

Figure 11: Line plot for the realized volatility of the fuzzy price

Figure 12: Heatmap for the realized volatility return in percentage over the ten years for the fuzzy price (axes: Year, Month)

Figure 13: Line plot for the realized volatility return in percentage for the fuzzy price

Step 5 We define the threshold value of the fuzzy price change percentage as $C$, and mark as a "big jump" of the fuzzy price any date on which the fuzzy price is $C$ percent lower than on the previous day. (For example, if $C = 1$, a date on which the fuzzy price is 1% lower than on the previous business day is a "big jump".)

Step 6 Referring to the figures and tables in the above steps, we summarize the fluctuation characteristics of the data set and divide the empirical data. This step creates a new data structure from the existing data set: the fuzzy price change percentages of 10 consecutive days form one row with 10 elements, and the rows are stacked to form a matrix of fuzzy price fluctuations. Each row has the form

$$(a_k,\ a_{k+1},\ a_{k+2},\ \cdots,\ a_{k+9}),$$

where $a_k$ is the fuzzy price change percentage on day $k$ and $k$ is the first day of the row's 10-day window.

Step 7 We add a new target column $\theta$ to the new data frame. If there are at least two "big jumps" in the next 10 days, the value of $\theta$ in the target column of the row is 1. Otherwise, we set $\theta = 0$ for the row.

Step 8 We run various machine learning and deep learning algorithms in Python to classify the new matrix data. The input is the daily change percentage of the fuzzy price for 10 consecutive days, and the output is the $\theta$ value (0 or 1) in the target column.

It is worth noting that we can improve the result by adjusting the value of $C$ in Step 5. Adjusting the data division in Step 6 can also improve the validity of the results. The various machine learning and deep learning models involved in Step 8 provide $\theta$ values between 0 and 1. By implementing the above steps, we can find $\theta$ with reasonable accuracy and apply it to the generalized BN-S model.
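Steps 5 to 7 can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes that a "big jump" is a day whose change percentage is at most $-C$, that consecutive rows are shifted by one day, and that the "next 10 days" are the 10 days immediately after each row's window. The function name and keyword arguments are hypothetical.

```python
def build_dataset(pct_changes, c=1.0, window=10, horizon=10, min_jumps=2):
    """Build the feature matrix and the target column theta.

    pct_changes: daily change percentages of the fuzzy price, in date order.
    Returns (rows, targets): each row holds `window` consecutive values, and
    the target is 1 when at least `min_jumps` big jumps occur in the
    following `horizon` days, else 0.
    """
    # Step 5: a day is a big jump if the price fell by C percent or more (assumed).
    is_jump = [p <= -c for p in pct_changes]

    rows, targets = [], []
    last_start = len(pct_changes) - window - horizon
    for start in range(last_start + 1):
        end = start + window
        # Step 6: one row of `window` consecutive change percentages.
        rows.append(list(pct_changes[start:end]))
        # Step 7: count big jumps in the `horizon` days after the window.
        n_jumps = sum(is_jump[end:end + horizon])
        targets.append(1 if n_jumps >= min_jumps else 0)
    return rows, targets
```

The resulting `rows` and `targets` can be passed to any binary classifier; changing `c` or the windowing corresponds to the adjustments of Steps 5 and 6 mentioned above.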
The machine learning and deep learning algorithms used in Step 8, and their calculation results, are introduced below.

(A) Logistic regression

Logistic regression is one of the commonly used classification models in machine learning, and is often used for binary classification. We assume that the fuzzy price data obey the following continuous distribution function and density function:

$$F(x) = P(X \le x) = \frac{1}{1 + e^{-(x-\mu)/\gamma}}, \qquad f(x) = F'(x) = \frac{e^{-(x-\mu)/\gamma}}{\gamma\left(1 + e^{-(x-\mu)/\gamma}\right)^2}, \tag{4.1}$$

where $\mu$ is the location parameter and $\gamma > 0$ is the scale parameter of the logistic distribution. The estimate of $\theta$ can then be obtained through maximum likelihood estimation.

(B) Decision tree

As a supervised classification algorithm, a decision tree generates decision results (or a tree diagram) based on historical experience (the training set). We perform attribute selection on the fuzzy price data, measure and determine the topological structure between the characteristic attributes, construct a decision tree, and then find the value of $\theta$ through pruning to support the overall optimal decision result.

(C) Random forest

Random forest is a classifier in machine learning. It contains many decision trees and can be run effectively on large data sets. Its output is the mode of the categories output by the individual trees. We divide the fuzzy price data set into a training set and a test set, instantiate the model and fit it to the standardized data, measure the accuracy of the model on the training data, and then predict the $\theta$ value.

(D) Neural network

The neural network is the basic algorithm of deep learning. It performs distributed and parallel information processing by connecting multiple "neurons" into a multilayer network. We build a neural network with two hidden layers and an output layer to predict $\theta$. If the output probability of the softmax activation function corresponding to $\theta = 1$ is greater than 0.3, we take $\theta = 1$.

(E) Long short-term memory neural network

The long short-term memory neural network (LSTM) is a special recurrent neural network (RNN) that can maintain long-term memory of information. We put the implementation of the LSTM in the LstmLayer class, use the forward method for forward calculation, and use the backward method for back propagation. The activation function of the gates is the sigmoid function, and the output activation function is tanh.

(F) Batch normalizer (BN) in LSTM network

The batch normalizer mitigates the problem of vanishing and exploding gradients by adjusting the input of the activation function, and it helps to improve the training speed of the LSTM network.

In the result tables (Tables 3 to 12), we provide classification reports of the above algorithms. The results of machine learning are often not completely accurate, so we report several measures to assess the feasibility of each algorithm and the possibility of misjudging $\theta$ in the calculation. We use "precision" to represent the accuracy of all $\theta = 1$ ($\theta = 0$) prediction results, quantified as the ratio of the number of accurately predicted $\theta = 1$ ($\theta = 0$) to the number of all $\theta = 1$ ($\theta = 0$) prediction results. "Recall" describes how completely $\theta = 1$ ($\theta = 0$) is accurately predicted, quantified as the ratio of the number of accurately predicted $\theta = 1$ ($\theta = 0$) to the actual number of $\theta = 1$ ($\theta = 0$). The values of "precision" and "recall" are positively correlated with the accuracy of the algorithm's prediction.
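The per-class measures reported in the result tables can be computed as in the sketch below. The definitions of precision and recall follow the text; the f1-score is taken to be the standard harmonic mean of the two, and the 0.3 probability threshold is the one stated for the neural network in (D). The function names are hypothetical.

```python
def predict_theta(probabilities, threshold=0.3):
    """Map predicted probabilities of theta = 1 to labels 0 or 1."""
    return [1 if p > threshold else 0 for p in probabilities]


def class_metrics(y_true, y_pred, label):
    """Precision, recall, f1-score and support for one class label."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    predicted = sum(1 for p in y_pred if p == label)   # all predictions of `label`
    support = sum(1 for t in y_true if t == label)     # actual occurrences of `label`
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / support if support else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall,
            "f1-score": f1, "support": support}
```

Calling `class_metrics(y_true, y_pred, 1)` and `class_metrics(y_true, y_pred, 0)` yields the two blocks of rows ($\theta = 1$ and $\theta = 0$) that make up each column of a result table.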
We propose "f1-score" [...]

Table 3: [...]

                              (A)    (B)    (C)    (D)    (E)    (F)
precision ($\theta = 0$)      0.73   0.75   0.79   0.62   0.71   0.51
recall ($\theta = 0$)         0.80   0.48   0.64   0.28   0.43   0.7
f1-score ($\theta = 0$)       0.76   0.59   0.71   0.39   0.53   0.35
support ($\theta = 0$)        75     75     75     75     75     75
precision ($\theta = 1$)      0.44   0.36   0.44   0.28   0.33   0.21
recall ($\theta = 1$)         0.35   0.65   0.62   0.62   0.62   0.44
f1-score ($\theta = 1$)       0.39   0.46   0.51   0.39   0.43   0.29
support ($\theta = 1$)        34     34     34     34     34     34

Table 4: Evaluation of calculation results within 2 years
(Train set: 11/01/2018-05/13/2020, Test set: 05/14/2020-10/30/2020)

                              (A)    (B)    (C)    (D)    (E)    (F)
precision ($\theta = 0$)      0.71   0.67   0.76   0.76   0.66   0.81
recall ($\theta = 0$)         0.89   0.49   0.77   0.68   0.36   0.59
f1-score ($\theta = 0$)       0.79   0.57   0.77   0.72   0.47   0.68
support ($\theta = 0$)        75     75     75     75     75     75
precision ($\theta = 1$)      0.47   0.30   0.48   0.43   0.29   0.44
recall ($\theta = 1$)         0.21   0.47   0.47   0.53   0.59   0.71
f1-score ($\theta = 1$)       0.29   0.36   0.48   0.47   0.39   0.54
support ($\theta = 1$)        34     34     34     34     34     34

Table 5: Evaluation of calculation results within 3 years
(Train set: 11/01/2017-07/29/2019, Test set: 07/30/2019-10/30/2020)

                              (A)    (B)    (C)    (D)    (E)    (F)
precision ($\theta = 0$)      0.71   0.73   0.76   0.71   0.76   0.75
recall ($\theta = 0$)         0.93   0.64   0.76   0.70   0.69   0.57
f1-score ($\theta = 0$)       0.80   0.68   0.76   0.71   0.72   0.65
support ($\theta = 0$)        203    203    203    203    203    203
precision ($\theta = 1$)      0.66   0.44   0.54   0.44   0.50   0.44
recall ($\theta = 1$)         0.27   0.54   0.55   0.45   0.58   0.63
f1-score ($\theta = 1$)       0.39   0.48   0.54   0.45   0.54   0.52
support ($\theta = 1$)        106    106    106    106    106    106

Table 6: Evaluation of calculation results within 4 years
(Train set: 11/01/2016-07/29/2019, Test set: 07/30/2019-10/30/2020)

                              (A)    (B)    (C)    (D)    (E)    (F)
precision ($\theta = 0$)      0.71   0.70   0.77   0.68   0.77   0.84
recall ($\theta = 0$)         0.93   0.68   0.79   0.78   0.61   0.72
f1-score ($\theta = 0$)       0.80   0.70   0.78   0.73   0.68   0.78
support ($\theta = 0$)        203    203    203    203    203    203
precision ($\theta = 1$)      0.66   0.45   0.58   0.41   0.47   0.58
recall ($\theta = 1$)         0.25   0.50   0.54   0.29   0.65   0.74
f1-score ($\theta = 1$)       0.37   0.47   0.56   0.34   0.54   0.65
support ($\theta = 1$)        106    106    106    106    106    106

Table 7: Evaluation of calculation results within 5 years
(Train set: 11/01/2015-10/09/2018, Test set: 10/10/2018-10/30/2020)

                              (A)    (B)    (C)    (D)    (E)    (F)
precision ($\theta = 0$)      0.70   0.73   0.76   0.70   0.76   0.71
recall ($\theta = 0$)         0.97   0.63   0.76   0.87   0.67   0.64
f1-score ($\theta = 0$)       0.81   0.68   0.76   0.77   0.71   0.67
support ($\theta = 0$)        336    336    336    336    336    336
precision ($\theta = 1$)      0.74   0.43   0.54   0.51   0.48   0.42
recall ($\theta = 1$)         0.18   0.54   0.54   0.27   0.59   0.50
f1-score ($\theta = 1$)       0.29   0.48   0.54   0.35   0.53   0.45
support ($\theta = 1$)        173    173    173    173    173    173

Table 8: Evaluation of calculation results within 6 years
(Train set: 11/01/2014-10/09/2018, Test set: 10/10/2018-10/30/2020)

                              (A)    (B)    (C)    (D)    (E)    (F)
precision ($\theta = 0$)      0.69   0.70   0.75   0.71   0.78   0.75
recall ($\theta = 0$)         0.99   0.64   0.70   0.83   0.71   0.56
f1-score ($\theta = 0$)       0.82   0.67   0.72   0.77   0.75   0.64
support ($\theta = 0$)        336    336    336    336    336    336
precision ($\theta = 1$)      0.90   0.41   0.48   0.51   0.53   0.43
recall ($\theta = 1$)         0.15   0.47   0.55   0.34   0.62   0.64
f1-score ($\theta = 1$)       0.26   0.44   0.51   0.41   0.57   0.51
support ($\theta = 1$)        173    173    173    173    173    173

Table 9: Evaluation of calculation results within 7 years
(Train set: 11/01/2013-12/21/2017, Test set: 12/22/2017-10/30/2020)

                              (A)    (B)    (C)    (D)    (E)    (F)
precision ($\theta = 0$)      0.71   0.74   0.75   0.69   0.70   0.71
recall ($\theta = 0$)         0.98   0.69   0.88   0.88   0.66   0.66
f1-score ($\theta = 0$)       0.82   0.72   0.81   0.77   0.68   0.68
support ($\theta = 0$)        481    481    481    481    481    481
precision ($\theta = 1$)      0.77   0.43   0.59   0.38   0.35   0.38
recall ($\theta = 1$)         0.16   0.49   0.38   0.16   0.39   0.43
f1-score ($\theta = 1$)       0.26   0.46   0.46   0.23   0.37   0.40
support ($\theta = 1$)        228    228    228    228    228    228

Table 10: Evaluation of calculation results within 8 years
(Train set: 11/01/2012-12/21/2017, Test set: 12/22/2017-10/30/2020)

                              (A)    (B)    (C)    (D)    (E)    (F)
precision ($\theta = 0$)      0.71   0.74   0.75   0.70   0.75   0.74
recall ($\theta = 0$)         0.98   0.68   0.84   0.87   0.64   0.73
f1-score ($\theta = 0$)       0.82   0.70   0.79   0.77   0.69   0.74
support ($\theta = 0$)        481    481    481    481    481    481
precision ($\theta = 1$)      0.75   0.42   0.54   0.43   0.42   0.45
recall ($\theta = 1$)         0.14   0.49   0.40   0.21   0.56   0.46
f1-score ($\theta = 1$)       0.24   0.45   0.46   0.28   0.48   0.45
support ($\theta = 1$)        228    228    228    228    228    228

Table 11: Evaluation of calculation results within 9 years
(Train set: 11/01/2011-10/13/2016, Test set: 10/14/2016-10/30/2020)

                              (A)    (B)    (C)    (D)    (E)    (F)
precision ($\theta = 0$)      0.79   0.84   0.83   0.78   0.79   0.79
recall ($\theta = 0$)         0.99   0.72   0.89   0.88   0.75   0.76
f1-score ($\theta = 0$)       0.88   0.77   0.86   0.83   0.77   0.77
support ($\theta = 0$)        776    776    776    776    776    776
precision ($\theta = 1$)      0.79   0.36   0.52   0.31   0.29   0.29
recall ($\theta = 1$)         0.13   0.54   0.40   0.18   0.33   0.33
f1-score ($\theta = 1$)       0.22   0.43   0.46   0.23   0.31   0.31
support ($\theta = 1$)        233    233    233    233    233    233

Table 12: Evaluation of calculation results within 10 years
(Train set: 11/01/2010-10/13/2016, Test set: 10/14/2016-10/30/2020)

                              (A)    (B)    (C)    (D)    (E)    (F)
precision ($\theta = 0$)      0.80   0.80   0.83   0.79   0.83   0.85
recall ($\theta = 0$)         0.98   0.71   0.86   0.78   0.64   0.61
f1-score ($\theta = 0$)       0.88   0.75   0.85   0.79   0.72   0.71
support ($\theta = 0$)        776    776    776    776    776    776
precision ($\theta = 1$)      0.72   0.29   0.48   0.31   0.32   0.33
recall ($\theta = 1$)         0.18   0.39   0.43   0.32   0.56   0.65
f1-score ($\theta = 1$)       0.29   0.33   0.45   0.31   0.41   0.44
support ($\theta = 1$)        233    233    233    233    233    233

The above tables show the dynamic estimation results for the value of $\theta$. Different algorithms have different prediction performance on the same data set, but the different combinations of results and accuracy provide additional decision support indicators for determining the value of $\theta$. Given the fuzzy data of stock index options over different time spans, machine learning and deep learning algorithms can update the predicted value of $\theta$ in time and report the accuracy of the prediction. This helps us find the deterministic component in the random price fluctuations of stock index options. Once $\theta$ is determined, we can apply it to the generalized BN-S model introduced in Section 2.

Conclusions

Stock index options are a risk management tool widely used by investors and an important part of the international derivatives market. The random fluctuation of stock index options has been a focus of attention in the financial products market. The generalized BN-S model inherits many advantages of the traditional BN-S model in analyzing financial product price fluctuations through the superposition of Lévy processes, and it addresses the traditional BN-S model's lack of long-term dependence. This paper takes stock index options as an example to explore the broader application scenarios of the generalized BN-S model. Based on machine learning algorithms and fuzzy theory, a deterministic component θ is extracted from completely random time series fluctuations. Data preprocessing involving fuzzy parameters can more accurately describe the range of fuzzy changes in market prices. Fuzzy prices reflecting different risk preferences, different market trends, and different investment objectives help us find a more suitable θ. Machine learning algorithms are believed to help mine the effective information hidden in big data. We apply various supervised learning and deep learning techniques to identify θ. Through the learning of fuzzy price data, the fitting of random fluctuations is realized. The calculation results provide decision support for determining the value of θ.

The development and application of the stochastic model introduced in this paper can optimize the function of the traditional model, realize accurate dynamic fluctuation analysis, and enrich the theoretical basis of financial risk management. Future research will continue to focus on the application of the generalized BN-S model in financial market analysis.

Acknowledgments

This work is supported in part by the National Key Research and Development Program of China (2017YFB1401801), the National Natural Science Foundation of China (71774042, 71532004), and the China Scholarship Council (201906120273). The authors would like to thank the anonymous reviewers for their careful reading of the manuscript and for suggesting points to improve the quality of the paper.
