Analysis of stock index with a generalized BN-S model: an approach based on machine learning and fuzzy parameters
arXiv [q-fin.MF]
Xianfei Hui
School of Management, Harbin Institute of Technology, Harbin, 150001, China
Department of Mathematics, North Dakota State University, Fargo, North Dakota 58108, USA
Baiqing Sun
School of Management, Harbin Institute of Technology, Harbin, 150001, China
Hui Jiang
College of Management and Economics, Tianjin University, Tianjin 300072, China
Indranil SenGupta
Department of Mathematics, North Dakota State University, Fargo, North Dakota 58108, USA
Abstract: We use a superposition of Lévy processes to improve the classic BN-S model. Because the frequently fluctuating price parameters in the model are difficult to estimate accurately, we preprocess the price data using fuzzy theory. S&P 500 stock index option prices over the past ten years are analyzed, and the deterministic fluctuations are captured by machine learning methods. The results show that the new model in a fuzzy environment solves the long-term dependence problem of the classic model with fewer parameter changes, and effectively analyzes the random dynamic characteristics of stock index option price time series.
Keywords: Barndorff-Nielsen and Shephard model, Lévy process, fuzzy sets, machine learning, stock index options.
The stock index, a key indicator that reflects and predicts financial market fluctuations and global economic changes, plays an increasingly important role in capital markets. The total value and trading volume of stock index option products traded globally are increasing year by year. A stock index is often used as a benchmark for measuring the performance of investment funds, as the basis of passive management that replicates its performance, as an underlying for derivative instruments, as a benchmark indicator for financial contracts, and as a refined risk management tool [20]. Most collective investment funds are based on indexes [23]. For example, the assets under management of US$ […]

[…] $\tau(\nu, \alpha)$ and inverse Gaussian distribution, and modeled to realize the exchange of variance and volatility. Considering the possibility of two-way jumps in the asset price process, Bannör et al. [6] proposed an extended form of the BN-S stochastic volatility model, the double-sided BN-S model, and calibrated the new model to foreign exchange option prices. Based on the stationary and self-decomposable distribution of the variance process, Awasthi et al. [2] gave an approximate expression of the BN-S model, and analyzed the first exit time, and its distribution, of the combination of Brownian motion and Lévy subordinator terms.

Machine learning is now widely used in finance [9]. It has been adopted in research on price prediction, risk control, volatility simulation, quantitative trading, data processing, and trend analysis. Guo et al. [11] used machine learning to study the short-term predictability of fluctuations in the Bitcoin to US dollar exchange rate. By applying deep learning methods to high-frequency financial data sets, Sirignano et al. [27] found non-parametric evidence for the existence of general and fixed price formation mechanisms related to price dynamics. Xu [36] proposed an effective moment-matching machine learning method to manage and hedge the risks associated with variable annuity (VA) products. Cont et al. [8] proposed a stochastic machine learning algorithm, computed the best trading strategy, and studied the sensitivity of the solution to various parameters in complex cross-platform transactions. Shoshi et al. [31] proposed the volatility method and the duration method to capture the random behavior of a time series, analyzed Bakken crude oil data with various machine learning and deep learning algorithms, found the deterministic model parameters, and obtained a variance swap pricing and hedging strategy. Roberts et al. [24, 25] proposed a sequential hypothesis test that detects a general jump size distribution, applied it to a crude oil price data set, and used various machine learning and deep learning algorithms to improve the stochastic model.

Fuzzy theory is a powerful tool for describing uncertain events [40], and has a wide range of applications in management science, decision science, and intelligent control. Financial markets exhibit both randomness and ambiguity, reflected mainly in the uncertainty of market changes, the asymmetry of information between the parties to a transaction, and the real-time fluctuation of transaction data. Several authors have applied fuzzy theory to financial research with good results. Hatami-Marbini et al. [14] used fuzzy numbers to describe the indicators and factors of stock performance, and obtained fuzzy distances from expert confidence and fuzzy performance levels to optimize stock portfolios. The combination of fuzzy theory and deep learning has been used to predict changes in high-frequency financial data [10]. Gui et al. [12] converted financial time series into fuzzy granular series to predict the range of market trends. Wang et al. [33] optimized the compound option pricing model based on a fuzzy interest rate and fuzzy volatility. Thavaneswaran [32] used fuzzy set theory to price binary options using trapezoidal, parabolic and adaptive fuzzy numbers for the stock price at maturity. Zhang [41] considered the crisp probabilistic mean of the fuzzy number and obtained the corresponding option pricing formula in the fuzzy double exponential jump diffusion model. Zmeškal [42] proposed a generalized fuzzy stochastic binomial tree model to study the pricing of American real options. Nowak [21, 22] drew on expert opinions and imprecise information estimates, using the martingale method and fuzzy set theory to study European option valuation problems. Fuzzy theory has clear advantages in financial data processing: data processing involving fuzzy parameters can describe the fuzzy changes of real market information.

In summary, building on this previous research, we apply fuzzy sets and machine learning to the analysis of stock index option price volatility. The paper is organized as follows. Section 2 describes the effectiveness and application of the superposition of Lévy processes in the BN-S model, and compares the generalized BN-S model with the classic BN-S model in long-term applications. The generalized BN-S model overcomes the lack of long-term dependence through the superposition of Lévy subordinators, and is more effective in processing option price data. The financial data processing problem in a fuzzy random uncertain environment is introduced in Section 3. In Section 4, we analyze the price volatility data of S&P 500 options over the past ten years, and find the deterministic component of the random volatility using machine learning algorithms. A brief conclusion is provided in Section 5.
The Barndorff-Nielsen and Shephard (BN-S) model is a stochastic volatility model commonly used to describe the dynamics of asset prices. We use it to capture some stylized features of financial asset time series in large historical market data sets. The non-Gaussian Ornstein-Uhlenbeck (OU) process in the BN-S model is driven by a Lévy process with positive increments, and is a positive, mean-reverting process.

Consider a frictionless financial market where a stock and a risk-free asset with a fixed rate of return $r$ are traded up to a horizon date $T$. The BN-S model assumes that the price process of a stock or commodity, $S = (S_t)_{t \ge 0}$, is defined on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{0 \le t \le T}, P)$ and is given by

$$S_t = S_0 \exp(X_t), \tag{2.1}$$

where the log-return $X_t$ is governed by

$$dX_t = (\mu + \beta \sigma_t^2)\,dt + \sigma_t\,dW_t + \rho\,dZ_{\lambda t}. \tag{2.2}$$

Here $\sigma_t$ is the volatility at time $t$, $\mathbb{R}$ is the set of real numbers, and the parameters satisfy $\mu, \beta, \rho \in \mathbb{R}$ with $\rho \le 0$. The variance process follows

$$d\sigma_t^2 = -\lambda \sigma_t^2\,dt + dZ_{\lambda t}, \quad \sigma_0^2 > 0, \tag{2.3}$$

where $\lambda \in \mathbb{R}$ and $\lambda > 0$. Under the probability measure $P$, $W = (W_t)$ is a standard Brownian motion, and the process $Z = (Z_{\lambda t})$ is a subordinator, also known as the background driving Lévy process (BDLP). The processes $W$ and $Z$ are assumed to be independent, and $(\mathcal{F}_t)$ is the usual augmentation of the filtration generated by $(W, Z)$.

Some problems remain in applications of the classic BN-S model. Both the log-return and the variance contain a single BDLP, which makes them completely dependent on each other and leads to inaccurate volatility simulations. This complete correlation means that the model fails over a longer time frame, and such a "longer" time frame may span only a few days. The model cannot consistently capture the basic characteristics of the relevant time series. In historical data, jumps in volatility are not completely synchronized with jumps in stock prices. The volatility $\sigma_t$ usually cannot respond immediately to sudden fluctuations in stock or commodity prices, which causes the classic BN-S model to fail.

These problems are solved in the generalized BN-S model, which is built on a superposition of Lévy subordinators [29]. The new model simulates option prices and volatility in an interrelated but different way. We assume that $Z_t$ and $Z_t^*$ are two independent Lévy subordinators with the same (finite) variance. Then there exists a Lévy subordinator $\bar{Z}_{\lambda t}$, independent of $W$, given by

$$d\bar{Z}_{\lambda t} = \rho'\,dZ_{\lambda t} + \sqrt{1 - (\rho')^2}\,dZ^*_{\lambda t}, \quad 0 \le \rho' \le 1. \tag{2.4}$$

The dynamics of $\sigma_t^2$ are then

$$d\sigma_t^2 = -\lambda \sigma_t^2\,dt + d\bar{Z}_{\lambda t}, \quad \sigma_0^2 > 0, \tag{2.5}$$

where $\bar{Z} = (\bar{Z}_{\lambda t})$ is related to the corresponding $Z$ in (2.3) and is also a subordinator independent of $W$. Referring to the log-return $X_t$ in (2.2), on some risk-neutral filtered probability space the dynamics can be written with a convex combination of the subordinators $Z$ and $Z^{(b)}$:

$$dX_t = (\mu + \beta \sigma_t^2)\,dt + \sigma_t\,dW_t + \rho\left((1 - \theta)\,dZ_{\lambda t} + \theta\,dZ^{(b)}_{\lambda t}\right), \tag{2.6}$$

where $\theta \in \mathbb{R}$.
$\theta$ is a deterministic parameter whose value lies between 0 and 1, and $\lambda > 0$. $Z_{\lambda t}$ and $Z^{(b)}_{\lambda t}$ are independent Lévy processes. Compared with $Z_{\lambda t}$, $Z^{(b)}_{\lambda t}$ corresponds to the greater Lévy intensity [30]. The new variance process is then given by

$$d\sigma_t^2 = -\lambda \sigma_t^2\,dt + (1 - \theta')\,dZ_{\lambda t} + \theta'\,dZ^{(b)}_{\lambda t}, \quad \sigma_0^2 > 0, \tag{2.7}$$

where $\theta' \in [0, 1]$ is deterministic, and $Z_{\lambda t}$ and $Z^{(b)}_{\lambda t}$ are independent Lévy processes. The sum of $(1 - \theta) Z_{\lambda t}$ and $\theta Z^{(b)}_{\lambda t}$ is also a Lévy process and is positively correlated with $Z_{\lambda t}$ and $Z^{(b)}_{\lambda t}$.

The integrated variance over the time period $[t, T]$ is $\sigma_I^2 = \int_t^T \sigma_s^2\,ds$. Solving (2.7) gives

$$\sigma_I^2 = \epsilon(t, T)\,\sigma_t^2 + \int_t^T \epsilon(s, T)\left((1 - \theta')\,dZ_{\lambda s} + \theta'\,dZ^{(b)}_{\lambda s}\right), \quad \epsilon(s, T) = \frac{1 - \exp(-\lambda (T - s))}{\lambda}, \quad t \le s \le T. \tag{2.8}$$

Over the interval $[0, T]$, the variance $\sigma_R^2$ is written as

$$\sigma_R^2 = \frac{1}{T} \int_0^T \sigma_t^2\,dt + \rho^2 (1 - \theta)^2 \lambda\,\mathrm{Var}[Z_1] + \rho^2 \theta^2 \lambda\,\mathrm{Var}[Z^{(b)}_1]. \tag{2.9}$$

We set $\theta = \theta'$ for convenience of calculation in what follows.

Let $J_Z$ be the jump measure associated with the subordinator $Z$ and $J^{(b)}_Z$ the jump measure associated with the subordinator $Z^{(b)}$, and define $J(s) = \int_0^s \int_{\mathbb{R}^+} J_Z(\lambda\,d\tau, dy)$ and $J^{(b)}(s) = \int_0^s \int_{\mathbb{R}^+} J^{(b)}_Z(\lambda\,d\tau, dy)$. Then the log-returns of the classic BN-S model and of the generalized BN-S model satisfy, respectively,

$$\mathrm{Corr}(X_t, X_s) = \frac{\int_0^s \sigma_\tau^2\,d\tau + \rho^2 J(s)}{\sqrt{\left(\int_0^t \sigma_\tau^2\,d\tau + t \rho^2 \lambda\,\mathrm{Var}(Z_1)\right)\left(\int_0^s \sigma_\tau^2\,d\tau + s \rho^2 \lambda\,\mathrm{Var}(Z_1)\right)}}, \quad t > s, \tag{2.10}$$

$$\mathrm{Corr}(X_t, X_s) = \frac{\int_0^s \sigma_\tau^2\,d\tau + \rho^2 (1 - \theta)^2 J(s) + \rho^2 \theta^2 J^{(b)}(s)}{\sqrt{\alpha(t)\,\alpha(s)}}, \quad t > s, \tag{2.11}$$

where $\alpha(\nu) = \int_0^\nu \sigma_\tau^2\,d\tau + \nu \rho^2 \lambda \left((1 - \theta)^2\,\mathrm{Var}(Z_1) + \theta^2\,\mathrm{Var}(Z^{(b)}_1)\right)$.

In (2.10), for a fixed $s$, $\mathrm{Corr}(X_t, X_s)$ rapidly becomes smaller as $t$ increases. This shows that the fit of the classic BN-S model to random fluctuations degrades with time, which leads to inaccurate volatility simulation. This decay means that the model becomes severely invalid over a longer time range and cannot accurately capture the basic characteristics of the relevant time series. In (2.11), because of the parameter $\theta$ and because $t$ is always bounded above, $\mathrm{Corr}(X_t, X_s)$ for a fixed $s$ never becomes "too small". Compared with the classic model, the generalized BN-S model introduced in this article extracts deterministic components from a completely random process. The new model improves the long-term dependence of the classic model with fewer parameter changes, and provides dynamic characteristics with clear advantages for time series analysis of stock index option price fluctuations.

Financial markets contain many uncertainties, including randomness and fuzziness. The advantage of treating the daily price of a stock index option as a fuzzy parameter is that it describes the range of price fluctuations. It can overcome the errors caused by some unreasonable values in the historical data, and increase the accuracy and operability of the yield and risk quantification process. Fuzzy numbers are often used to describe uncertain information. Let $S$ be a domain and $x$ any element of the real number set $\mathbb{R}$. For a fuzzy set $A$ there is a corresponding mapping $\mu_A(x) \in [0, 1]$. $\mu_A(x)$ represents the degree of membership of $x$ in $A$ and is called the membership function of $A$, which is also a fuzzy number. We form fuzzy parameters by associating $\mu_A(x)$ with real data, in order to quantify the fuzzy environment of the financial market. The general representation of the membership function $\mu$ can be written as

$$\mu_A(x) = \begin{cases} L(x), & l \le x \le m, \\ R(x), & m \le x \le r, \end{cases} \tag{3.1}$$

where $L(x)$ is a right-continuous increasing function with $0 \le L(x) \le 1$, and $R(x)$ is a left-continuous decreasing function with $0 \le R(x) \le 1$. The $\alpha$-level set for a membership (confidence) value $\alpha$ is usually expressed as $A_\alpha = \{x : \mu_A(x) \ge \alpha\}$: the $\alpha$-level of the fuzzy set $A$ consists of all elements of the complete set whose membership of $A$ is greater than or equal to $\alpha$, so $\alpha \in [0, 1]$.
Figure 1: Triangular fuzzy number distribution
The triangular fuzzy number is one of the classic forms of fuzzy number and is widely used in fuzzy evaluation systems. The membership function $\mu_A(x)$ shows the degree to which the element $x$ belongs to the fuzzy set $A$. As shown in Figure 1, the triangular fuzzy number is normal, continuous and convex, and its membership function is composed of a linear non-decreasing part and a linear non-increasing part. Generally, the membership function of the triangular fuzzy number $A = (a_l, a_m, a_u)$, with $a_l \le a_m \le a_u$, is expressed as follows:

$$\mu_A(x) = \begin{cases} 0, & \text{if } x \le a_l, \\ \dfrac{x - a_l}{a_m - a_l}, & \text{if } a_l \le x \le a_m, \\ \dfrac{a_u - x}{a_u - a_m}, & \text{if } a_m \le x \le a_u, \\ 0, & \text{if } x \ge a_u. \end{cases} \tag{3.2}$$

$a_l$ and $a_u$ are called the leftmost and rightmost values of the fuzzy set $A$, respectively, and describe the lower and upper limits of the triangular fuzzy number $A$. Their difference indicates the degree of fuzziness of the fuzzy set $A$. $a_m$ is called the kernel of $A = (a_l, a_m, a_u)$ and represents the most likely value of the triangular fuzzy number $A$. In particular, if $a_l = a_m = a_u$, the fuzzy number degenerates into a real number.

Table 1: Properties of the empirical data set

           Daily Price Change   Daily Price Change %   Daily Volatility Range
Mean                     0.83                   0.04                    22.36
Median                   1.43                   0.07                    16.59
Minimum               -228.62                  -8.56                     0
Maximum                180.36                   8.04                   218.96

In addition, the $\alpha$-level set ($\alpha$-cut set) of $A$ is denoted $A_\alpha = [A_L, A_R] = [(1 - \alpha) a_l + \alpha a_m, (1 - \alpha) a_u + \alpha a_m]$, with $A_L = (1 - \alpha) a_l + \alpha a_m$ and $A_R = (1 - \alpha) a_u + \alpha a_m$. If both $A_L$ and $A_R$ are integrable, then the expectation $E(A)$ of the set $A$ is

$$E(A) = \frac{(1 - \lambda) a_l + a_m + \lambda a_u}{2}, \quad 0 \le \lambda \le 1,$$

which follows from weighting the averages of $A_L$ and $A_R$ over $\alpha \in [0, 1]$, namely $(a_l + a_m)/2$ and $(a_u + a_m)/2$, by $1 - \lambda$ and $\lambda$. The value of $\lambda$ depends on the importance given to each fuzzy boundary.

To better describe financial market volatility in the BN-S model, and to address the problem that some input parameters of the model are difficult to estimate accurately, we consider the daily closing, highest and lowest prices of stock index options, and use triangular fuzzy numbers to describe the fuzzy daily closing price. Fuzzifying the closing price of a stock index option gives the closing price in fuzzy form, $S = (s_l, s_m, s_u)$. The new fuzzy closing price is composed of three real numbers, which correspond to the lowest, closing and highest prices in the historical data. We fix the value of $\lambda$ to obtain the expectation of the new fuzzy closing price $\tilde{S} = (Price_{min}, Price_{close}, Price_{max})$, which serves as the daily fuzzy price in the subsequent analysis.
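The construction of the daily fuzzy price can be illustrated with a minimal Python sketch. It implements the triangular membership function (3.2), the $\alpha$-cut, and the expectation $E(A)$ for one trading day. The class name `TriangularFuzzyNumber`, the helper `fuzzy_price`, and the sample prices are hypothetical and chosen only for illustration; they are not taken from the paper's data or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFuzzyNumber:
    """Triangular fuzzy number A = (a_l, a_m, a_u) with a_l <= a_m <= a_u."""
    low: float   # a_l, e.g. daily lowest price
    mode: float  # a_m, e.g. daily closing price
    high: float  # a_u, e.g. daily highest price

    def membership(self, x: float) -> float:
        """Membership degree mu_A(x) as in (3.2)."""
        if x == self.mode:
            return 1.0  # kernel; also covers the degenerate real-number case
        if x <= self.low or x >= self.high:
            return 0.0
        if x < self.mode:
            return (x - self.low) / (self.mode - self.low)
        return (self.high - x) / (self.high - self.mode)

    def alpha_cut(self, alpha: float) -> tuple[float, float]:
        """Interval [A_L, A_R] of the alpha-level set."""
        left = (1 - alpha) * self.low + alpha * self.mode
        right = (1 - alpha) * self.high + alpha * self.mode
        return left, right

    def expectation(self, lam: float) -> float:
        """E(A) = [(1 - lam) a_l + a_m + lam a_u] / 2 for 0 <= lam <= 1."""
        return ((1 - lam) * self.low + self.mode + lam * self.high) / 2


def fuzzy_price(low: float, close: float, high: float, lam: float = 0.5) -> float:
    """Daily fuzzy price from the lowest, closing and highest prices."""
    return TriangularFuzzyNumber(low, close, high).expectation(lam)


# Hypothetical prices for a single day (not real S&P 500 data).
day = TriangularFuzzyNumber(low=3300.0, mode=3310.0, high=3330.0)
print(day.membership(3305.0))   # halfway up the left branch
print(day.alpha_cut(0.5))
print(fuzzy_price(3300.0, 3310.0, 3330.0, lam=0.5))
```

A larger `lam` moves the fuzzy price toward the daily high and a smaller one toward the daily low, which is how $\lambda$ weights the two fuzzy boundaries in $E(A)$.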
This section gives a numerical example to find the value of $\theta$ in the above model. Stock index option price data over 10 years are considered. We select the S&P 500 (^GSPC) price data set from November 1, 2010 to October 30, 2020 (https://finance.yahoo.com/). The fluctuation characteristics of the empirical data set over time are shown in Table 1. The daily rise and fall of the S&P 500 stock index option price follows directly from the daily closing price data. Figure 2 shows the yearly curves of the closing price, the daily rise and fall, and the volatility in the data set.

Affected by many factors inside and outside the financial market, the price of stock index options fluctuates many times within a day. In looking for the deterministic component in the random price time series, we want a daily price parameter that describes this fluctuation, which has both randomness and ambiguity. Data preprocessing with fuzzy parameters describes the fuzzy range of price changes more accurately. We adjust the value of $\lambda$ in the triangular fuzzy number so that the fuzzy price reflects different risk preferences, different market trends, and different investment objectives.

For investors with different risk preferences, the value of $\lambda$ is different. Risk-averse investors are more likely to be affected by the lower limit of the fuzzy price boundary, so we take the degree of risk aversion of investors to be inversely related to the value of $\lambda$. Assuming that the $\lambda$ corresponding to a risk-neutral investor is 0.5, an aggressive risk seeker corresponds to a higher $\lambda$. The value of $\lambda$ is also affected by the market environment. For example, in a bull market the overall trend is upward, and the upper limit of the fuzzy boundary is more important, so the value of $\lambda$ is larger than in a bear market. Different values of $\lambda$ can also describe different investment objectives. We can use $\lambda$ closer to 0 to describe the volatility of put options, and $\lambda$ closer to 1 to describe the volatility of call options.

Generally, the daily closing price is the volume-weighted average price of all transactions in the minute before the last transaction of the option on that day. The daily highest price and daily lowest price describe the degree of price change. In this paper, we use triangular fuzzy numbers to calculate the daily fuzzy price of the stock index option from three variables: the daily lowest price, the daily highest price and the daily closing price. In Table 2, we list the feature estimators of the fuzzy price data when $\lambda$ takes different values. Figure 3 provides a time series chart of the fuzzy price of S&P 500 stock index options. We take $\lambda$ = 0.[…]

($\lambda$ = 0.3)   ($\lambda$ = 0.5)   ($\lambda$ = 0.[…])
Figure 2: S&P 500 close price, daily rise and fall, and volatility
Figure 3: S&P 500 fuzzy price (November 1, 2010 to October 30, 2020)
We index the available fuzzy price data by date. Based on the attributes of the data set, we construct a machine learning classification problem that provides quantitative decision support for estimating the value of $\theta$ with reasonable accuracy. The specific steps are as follows:

Step 1 We select the available data and arrange the preprocessed fuzzy price data in order of date (from November 1, 2010 to October 30, 2020).

Figure 4: Moving average for the fuzzy price (5-day, 42-day and 252-day windows)
Figure 5: Yearly boxplot for the fuzzy price
Figure 6: Distribution plot for fuzzy price data
Figure 7: Bar chart for fuzzy price

Step 3 Price fluctuations are the focus of our attention. Building on the preceding step, we calculate the daily changes of the fuzzy prices and visualize them. Figures 8 and 9 are histograms of the daily change and the daily change percentage of the fuzzy price.

Step 4 We continue to quantify price volatility. Building on the preceding step, we calculate the realized volatility and the realized volatility return of the fuzzy price sequence. Figure 10 and Figure 11 provide the heatmap and line plot of the realized volatility, and Figures 12 and 13 show the heatmap and line plot of the realized volatility return.
Figure 8: Histogram for daily change in fuzzy price
Figure 9: Histogram for daily change percentage in fuzzy price
Figure 10: Heatmap for the realized volatility of the fuzzy price over ten years
Figure 11: Line plot for the realized volatility of the fuzzy price
Figure 12: Heatmap for the realized volatility return in percentage over the ten years for the fuzzy price
Figure 13: Line plot for the realized volatility return in percentage for the fuzzy price
Step 5 We define a threshold $C$ for the fuzzy price change percentage, and mark as a "big jump" of the fuzzy price any date on which the fuzzy price falls by at least $C$ percentage points relative to the previous day. (For example, if $C = 1$, a date on which the fuzzy price is at least 1% lower than on the previous business day is a "big jump".)

Step 6 Referring to the figures and tables in the above steps, we summarize the fluctuation characteristics of the data set and divide the empirical data. This step creates a new data structure from the existing data set: the fuzzy price change percentages of 10 consecutive days form one row with 10 elements, and the rows are stacked to form a matrix of fuzzy price fluctuations of the form

$$\begin{pmatrix} a_{1,1} & a_{1,2} & a_{1,3} & \cdots & a_{1,10} \\ a_{2,1} & a_{2,2} & a_{2,3} & \cdots & a_{2,10} \\ a_{3,1} & a_{3,2} & a_{3,3} & \cdots & a_{3,10} \\ \vdots & & & & \vdots \\ a_{n,1} & a_{n,2} & a_{n,3} & \cdots & a_{n,10} \end{pmatrix}$$

where row $i$ holds the fuzzy price change percentages of 10 consecutive days.

Step 7 We add a new target column $\theta$ to the new data frame. If there are at least two "big jumps" in the next 10 days, the value of $\theta$ in the target column of the row is 1. Otherwise, we set $\theta = 0$ for the row.

Step 8 We run various machine learning and deep learning algorithms in Python to classify the new matrix data. The input is the daily change percentage of the fuzzy price for 10 consecutive days, and the output is the $\theta$ value (0 or 1) in the target column.

Note that the result can be improved by adjusting the value of $C$ in Step 5. Adjusting the data division in Step 6 can also improve the validity of the results. The machine learning and deep learning models involved in Step 8 provide $\theta$ values between 0 and 1. By implementing the above steps, we can find $\theta$ with reasonable accuracy and apply it to the generalized BN-S model.
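The labeling in Steps 5 to 7 can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the authors' implementation: a "big jump" is taken to be a daily change of at most $-C$ percent, the rows are overlapping windows that slide forward one day at a time, and "the next 10 days" are the 10 days immediately after each window. The function names and the toy data are hypothetical.

```python
def pct_change(prices):
    """Daily change percentage of a price sequence (one value per day after the first)."""
    return [100.0 * (today - prev) / prev for prev, today in zip(prices, prices[1:])]


def big_jumps(changes, c):
    """Step 5 (assumed reading): a day is a 'big jump' if the price fell by at least c percent."""
    return [change <= -c for change in changes]


def build_dataset(changes, c, window=10, horizon=10, min_jumps=2):
    """Steps 6 and 7: rows of `window` consecutive changes plus a 0/1 theta target.

    Assumption: windows overlap and advance one day at a time, and the target
    counts big jumps in the `horizon` days immediately after each window.
    """
    jumps = big_jumps(changes, c)
    rows, targets = [], []
    for start in range(len(changes) - window - horizon + 1):
        end = start + window
        rows.append(changes[start:end])
        future_jumps = sum(jumps[end:end + horizon])
        targets.append(1 if future_jumps >= min_jumps else 0)
    return rows, targets


# Toy change percentages with shortened windows, only to show the mechanics.
toy_changes = [0.5, -1.2, 0.3, -2.0, -1.5, 0.1, 0.2, 0.4]
rows, targets = build_dataset(toy_changes, c=1.0, window=3, horizon=3)
print(rows)
print(targets)
```

With the defaults `window=10` and `horizon=10`, each row corresponds to one input row of the matrix in Step 6 and each target to the $\theta$ column of Step 7; the resulting `rows` and `targets` can be passed to any classifier in Step 8.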
The algorithms used in Step 8 are introduced below, followed by the calculation results.

(A) Logistic regression
Logistic regression is one of the most commonly used classification models in machine learning, and is often used for binary classification. We assume that the fuzzy price data obey the following continuous distribution function and density function:

$$F(x) = P(X \le x) = \frac{1}{1 + e^{-(x - \mu)/\gamma}}, \qquad f(x) = F'(x) = \frac{e^{-(x - \mu)/\gamma}}{\gamma \left(1 + e^{-(x - \mu)/\gamma}\right)^2}. \tag{4.1}$$

$\theta$ can be estimated through maximum likelihood estimation.

(B) Decision tree
As a supervised classification algorithm, a decision tree generates decision results (a tree diagram) from historical experience (a training set). We perform attribute selection on the fuzzy price data, measure and determine the topological structure between the characteristic attributes, construct a decision tree, and then find the value of $\theta$ through pruning to support the overall optimal decision result.

(C) Random forest
A random forest is a classifier that contains many decision trees and runs effectively on large data sets. Its output is the mode of the classes output by the individual trees. We divide the fuzzy price data set into a training set and a test set, instantiate the model and fit it to the standardized data, measure the accuracy of the model on the training data, and then predict the $\theta$ value.

(D) Neural network
The neural network is the basic algorithm of deep learning. It performs distributed and parallel information processing by connecting multiple "neurons" into a multilayer network. We build a neural network with two hidden layers and an output layer to predict $\theta$. If the output probability of the softmax activation function corresponding to $\theta = 1$ is greater than 0.3, we take $\theta = 1$.

(E) Long short-term memory neural network
The long short-term memory neural network (LSTM) is a special recurrent neural network (RNN) that can maintain long-term memory of information. We put the implementation of the LSTM in the LstmLayer class, use the forward method for forward calculation, and use the backward method for back propagation. The activation function of the gates is the sigmoid function, and the output activation function is tanh.

(F) Batch normalizer (BN) in LSTM network
The batch normalizer mitigates vanishing and exploding gradients by adjusting the input of the activation function, and it helps to improve the training speed of the LSTM network.

In the result tables (Tables 3 to 12), we provide classification reports of the above algorithms. The results of machine learning are often not completely accurate, so we report measures of the feasibility of each algorithm and of the possibility of misjudging $\theta$. We use "precision" to represent the accuracy of all $\theta = 1$ ($\theta = 0$) predictions, quantified as the ratio of the number of correctly predicted $\theta = 1$ ($\theta = 0$) to the number of all $\theta = 1$ ($\theta = 0$) predictions. "Recall" describes how completely $\theta = 1$ ($\theta = 0$) is correctly predicted, quantified as the ratio of the number of correctly predicted $\theta = 1$ ($\theta = 0$) to the actual number of $\theta = 1$ ($\theta = 0$). Higher values of "precision" and "recall" indicate more accurate predictions by the algorithm.
We propose the "f1-score" […]

Table 3: Evaluation of calculation results within 1 year

                     (A)    (B)    (C)    (D)    (E)    (F)
precision  θ = 0    0.73   0.75   0.79   0.62   0.71   0.51
recall     θ = 0    0.80   0.48   0.64   0.28   0.43   0.7
f1-score   θ = 0    0.76   0.59   0.71   0.39   0.53   0.35
support    θ = 0      75     75     75     75     75     75
precision  θ = 1    0.44   0.36   0.44   0.28   0.33   0.21
recall     θ = 1    0.35   0.65   0.62   0.62   0.62   0.44
f1-score   θ = 1    0.39   0.46   0.51   0.39   0.43   0.29
support    θ = 1      34     34     34     34     34     34

Table 4: Evaluation of calculation results within 2 years
(Train set: 11/01/2018-05/13/2020, Test set: 05/14/2020-10/30/2020)

                     (A)    (B)    (C)    (D)    (E)    (F)
precision  θ = 0    0.71   0.67   0.76   0.76   0.66   0.81
recall     θ = 0    0.89   0.49   0.77   0.68   0.36   0.59
f1-score   θ = 0    0.79   0.57   0.77   0.72   0.47   0.68
support    θ = 0      75     75     75     75     75     75
precision  θ = 1    0.47   0.30   0.48   0.43   0.29   0.44
recall     θ = 1    0.21   0.47   0.47   0.53   0.59   0.71
f1-score   θ = 1    0.29   0.36   0.48   0.47   0.39   0.54
support    θ = 1      34     34     34     34     34     34

Table 5: Evaluation of calculation results within 3 years
(Train set: 11/01/2017-07/29/2019, Test set: 07/30/2019-10/30/2020)

                     (A)    (B)    (C)    (D)    (E)    (F)
precision  θ = 0    0.71   0.73   0.76   0.71   0.76   0.75
recall     θ = 0    0.93   0.64   0.76   0.70   0.69   0.57
f1-score   θ = 0    0.80   0.68   0.76   0.71   0.72   0.65
support    θ = 0     203    203    203    203    203    203
precision  θ = 1    0.66   0.44   0.54   0.44   0.50   0.44
recall     θ = 1    0.27   0.54   0.55   0.45   0.58   0.63
f1-score   θ = 1    0.39   0.48   0.54   0.45   0.54   0.52
support    θ = 1     106    106    106    106    106    106

Table 6: Evaluation of calculation results within 4 years
(Train set: 11/01/2016-07/29/2019, Test set: 07/30/2019-10/30/2020)

                     (A)    (B)    (C)    (D)    (E)    (F)
precision  θ = 0    0.71   0.70   0.77   0.68   0.77   0.84
recall     θ = 0    0.93   0.68   0.79   0.78   0.61   0.72
f1-score   θ = 0    0.80   0.70   0.78   0.73   0.68   0.78
support    θ = 0     203    203    203    203    203    203
precision  θ = 1    0.66   0.45   0.58   0.41   0.47   0.58
recall     θ = 1    0.25   0.50   0.54   0.29   0.65   0.74
f1-score   θ = 1    0.37   0.47   0.56   0.34   0.54   0.65
support    θ = 1     106    106    106    106    106    106

Table 7: Evaluation of calculation results within 5 years
(Train set: 11/01/2015-10/09/2018, Test set: 10/10/2018-10/30/2020)

                     (A)    (B)    (C)    (D)    (E)    (F)
precision  θ = 0    0.70   0.73   0.76   0.70   0.76   0.71
recall     θ = 0    0.97   0.63   0.76   0.87   0.67   0.64
f1-score   θ = 0    0.81   0.68   0.76   0.77   0.71   0.67
support    θ = 0     336    336    336    336    336    336
precision  θ = 1    0.74   0.43   0.54   0.51   0.48   0.42
recall     θ = 1    0.18   0.54   0.54   0.27   0.59   0.50
f1-score   θ = 1    0.29   0.48   0.54   0.35   0.53   0.45
support    θ = 1     173    173    173    173    173    173

Table 8: Evaluation of calculation results within 6 years
(Train set: 11/01/2014-10/09/2018, Test set: 10/10/2018-10/30/2020)

                     (A)    (B)    (C)    (D)    (E)    (F)
precision  θ = 0    0.69   0.70   0.75   0.71   0.78   0.75
recall     θ = 0    0.99   0.64   0.70   0.83   0.71   0.56
f1-score   θ = 0    0.82   0.67   0.72   0.77   0.75   0.64
support    θ = 0     336    336    336    336    336    336
precision  θ = 1    0.90   0.41   0.48   0.51   0.53   0.43
recall     θ = 1    0.15   0.47   0.55   0.34   0.62   0.64
f1-score   θ = 1    0.26   0.44   0.51   0.41   0.57   0.51
support    θ = 1     173    173    173    173    173    173

Table 9: Evaluation of calculation results within 7 years
(Train set: 11/01/2013-12/21/2017, Test set: 12/22/2017-10/30/2020)

                     (A)    (B)    (C)    (D)    (E)    (F)
precision  θ = 0    0.71   0.74   0.75   0.69   0.70   0.71
recall     θ = 0    0.98   0.69   0.88   0.88   0.66   0.66
f1-score   θ = 0    0.82   0.72   0.81   0.77   0.68   0.68
support    θ = 0     481    481    481    481    481    481
precision  θ = 1    0.77   0.43   0.59   0.38   0.35   0.38
recall     θ = 1    0.16   0.49   0.38   0.16   0.39   0.43
f1-score   θ = 1    0.26   0.46   0.46   0.23   0.37   0.40
support    θ = 1     228    228    228    228    228    228

Table 10: Evaluation of calculation results within 8 years
(Train set: 11/01/2012-12/21/2017, Test set: 12/22/2017-10/30/2020)

                     (A)    (B)    (C)    (D)    (E)    (F)
precision  θ = 0    0.71   0.74   0.75   0.70   0.75   0.74
recall     θ = 0    0.98   0.68   0.84   0.87   0.64   0.73
f1-score   θ = 0    0.82   0.70   0.79   0.77   0.69   0.74
support    θ = 0     481    481    481    481    481    481
precision  θ = 1    0.75   0.42   0.54   0.43   0.42   0.45
recall     θ = 1    0.14   0.49   0.40   0.21   0.56   0.46
f1-score   θ = 1    0.24   0.45   0.46   0.28   0.48   0.45
support    θ = 1     228    228    228    228    228    228

Table 11: Evaluation of calculation results within 9 years
(Train set: 11/01/2011-10/13/2016, Test set: 10/14/2016-10/30/2020)

                     (A)    (B)    (C)    (D)    (E)    (F)
precision  θ = 0    0.79   0.84   0.83   0.78   0.79   0.79
recall     θ = 0    0.99   0.72   0.89   0.88   0.75   0.76
f1-score   θ = 0    0.88   0.77   0.86   0.83   0.77   0.77
support    θ = 0     776    776    776    776    776    776
precision  θ = 1    0.79   0.36   0.52   0.31   0.29   0.29
recall     θ = 1    0.13   0.54   0.40   0.18   0.33   0.33
f1-score   θ = 1    0.22   0.43   0.46   0.23   0.31   0.31
support    θ = 1     233    233    233    233    233    233

Table 12: Evaluation of calculation results within 10 years
(Train set: 11/01/2010-10/13/2016, Test set: 10/14/2016-10/30/2020)

                     (A)    (B)    (C)    (D)    (E)    (F)
precision  θ = 0    0.80   0.80   0.83   0.79   0.83   0.85
recall     θ = 0    0.98   0.71   0.86   0.78   0.64   0.61
f1-score   θ = 0    0.88   0.75   0.85   0.79   0.72   0.71
support    θ = 0     776    776    776    776    776    776
precision  θ = 1    0.72   0.29   0.48   0.31   0.32   0.33
recall     θ = 1    0.18   0.39   0.43   0.32   0.56   0.65
f1-score   θ = 1    0.29   0.33   0.45   0.31   0.41   0.44
support    θ = 1     233    233    233    233    233    233

The above tables show the dynamic estimation results for the value of $\theta$. Different algorithms have different prediction performance on the same data set, but the different combinations of results and accuracy provide more decision support indicators for determining the value of $\theta$. For the fuzzy data of stock index options over different time spans, the machine learning and deep learning algorithms can update the predicted value of $\theta$ in time and report the accuracy of the prediction. This helps us find the deterministic component in the random price fluctuations of stock index options. Once $\theta$ is determined, we can apply it to the generalized BN-S model introduced in Section 2.

Conclusions
Stock index options are a risk management tool widely used by investors and an important part of the international derivatives market. Their random fluctuations have attracted sustained attention in the financial products market. Through the superposition of Lévy processes, the generalized BN-S model inherits many advantages of the traditional BN-S model in analyzing the price fluctuations of financial products, and it addresses the traditional model's lack of long-term dependence. This paper takes stock index options as an example to explore broader application scenarios for the generalized BN-S model. Based on machine learning algorithms and fuzzy theory, the deterministic component θ is extracted from time series fluctuations that appear completely random. Data preprocessing involving fuzzy parameters can describe the range of fuzzy changes in market prices more accurately. Fuzzy prices that reflect different risk preferences, market trends, and investment objectives help us find a more suitable θ. Machine learning algorithms are believed to help mine the useful information hidden in big data. We apply various supervised learning and deep learning techniques to identify θ. By learning from the fuzzy price data, we fit the random fluctuations. The calculation results serve as decision support for determining the value of θ.

The development and application of the stochastic model introduced in this paper can improve on the traditional model, enable accurate analysis of dynamic fluctuations, and enrich the theoretical basis of financial risk management. Future research will continue to focus on the application of the generalized BN-S model in financial market analysis.

Acknowledgments
This work is supported in part by the National Key Research and Development Program of China (2017YFB1401801), the National Natural Science Foundation of China (71774042, 71532004), and the China Scholarship Council (201906120273). The authors would like to thank the anonymous reviewers for their careful reading of the manuscript and for suggesting points to improve the quality of the paper.
References

[1] Anadu, K., Kruttli, M., McCabe, P., & Osambela, E. (2020). The Shift from Active to Passive Investing: Risks to Financial Stability? Financial Analysts Journal, 76(4), 23-39.
[2] Awasthi, S., & SenGupta, I. (2020). First exit-time analysis for an approximate Barndorff-Nielsen and Shephard model with stationary self-decomposable variance process. arXiv preprint arXiv:2006.07167.
[3] Barndorff-Nielsen, O. E. (2001). Superposition of Ornstein–Uhlenbeck type processes. Theory of Probability & Its Applications, 45(2), 175-194.
[4] Barndorff-Nielsen, O. E., & Shephard, N. (2001). Non-Gaussian Ornstein–Uhlenbeck-based models and some of their uses in financial economics. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2), 167-241.
[5] Barndorff-Nielsen, O. E., Jensen, J. L., & Sørensen, M. (1998). Some stationary processes in discrete and continuous time. Advances in Applied Probability, 989-1007.
[6] Bannör, K. F., & Scherer, M. (2013). A BNS-Type Stochastic Volatility Model With Two-Sided Jumps With Applications to FX Options Pricing. Wilmott, 2013(65), 58-69.
[7] Benth, F. E. (2011). The stochastic volatility model of Barndorff-Nielsen and Shephard in commodity markets. Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics, 21(4), 595-625.
[8] Cont, R., & Kukanov, A. (2017). Optimal order placement in limit order markets. Quantitative Finance, 17(1), 21-39.
[9] Culkin, R., & Das, S. R. (2017). Machine learning in finance: The case of deep learning for option pricing. Journal of Investment Management, 15(4), 92-100.
[10] Deng, Y., Ren, Z., Kong, Y., Bao, F., & Dai, Q. (2016). A hierarchical fused fuzzy deep neural network for data classification. IEEE Transactions on Fuzzy Systems, 25(4), 1006-1012.
[11] Guo, T., & Antulov-Fantulin, N. (2018). Predicting short-term Bitcoin price fluctuations from buy and sell orders. arXiv preprint arXiv:1802.04065.
[12] Gui, B., Wei, X., Shen, Q., Qi, J., & Guo, L. (2014, November). Financial time series forecasting using support vector machine. In 2014 Tenth International Conference on Computational Intelligence and Security (pp. 39-43). IEEE.
[13] Habtemicael, S., Ghebremichael, M., & SenGupta, I. (2019). Volatility and variance swap using superposition of the Barndorff-Nielsen and Shephard type Lévy processes. Sankhya B, 1-18.
[14] Hatami-Marbini, A., & Kangi, F. (2017). An extension of fuzzy TOPSIS for a group decision making with an application to Tehran stock exchange. Applied Soft Computing, 52, 1084-1097.
[15] Ihsan, A., & SenGupta, I. (2018). Moments of the asset price for the Barndorff-Nielsen and Shephard model. Lithuanian Mathematical Journal, 58(4), 408-420.
[16] Issaka, A., & SenGupta, I. (2017). Analysis of variance based instruments for Ornstein–Uhlenbeck type models: swap and price index. Annals of Finance, 13(4), 401-434.
[17] Issaka, A., & SenGupta, I. (2017). Feynman path integrals and asymptotic expansions for transition probability densities of some Lévy driven financial markets. Journal of Applied Mathematics and Computing, 54(1), 159-182.
[18] Jawadi, F., Louhichi, W., Cheffou, A. I., & Randrianarivony, R. (2016). Intraday jumps and trading volume: a nonlinear Tobit specification. Review of Quantitative Finance and Accounting, 47(4), 1167-1186.
[19] Kallsen, J., Muhle-Karbe, J., & Voß, M. (2011). Pricing options on variance in affine stochastic volatility models. Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics, 21(4), 627-641.
[20] Grillet-Aubert, L. (2020). Opportunities and Risks in the Financial Index Market.
[21] Nowak, P., & Romaniuk, M. (2010). Computing option price for Lévy process with fuzzy parameters. European Journal of Operational Research, 201(1), 206-210.
[22] Nowak, P., & Romaniuk, M. (2014). Application of Lévy processes and Esscher transformed martingale measures for option pricing in fuzzy framework. Journal of Computational and Applied Mathematics, 263, 129-151.
[23] Petry, J., Fichtner, J., & Heemskerk, E. (2019). Steering capital: the growing private authority of index providers in the age of passive asset management. Review of International Political Economy, 1-25.
[24] Roberts, M., & SenGupta, I. (2020). Infinitesimal generators for two-dimensional Lévy process-driven hypothesis testing. Annals of Finance, 16(1), 121-139.
[25] Roberts, M., & SenGupta, I. (2021). Sequential hypothesis testing in machine learning, and crude oil price jump size detection. To appear in Applied Mathematical Finance (accepted December 2020).
[26] Ruan, X. (2020). Volatility-of-volatility and the cross-section of option returns. Journal of Financial Markets, 48, 100492.
[27] Sirignano, J., & Cont, R. (2019). Universal features of price formation in financial markets: perspectives from deep learning. Quantitative Finance, 19(9), 1449-1459.
[28] SenGupta, I. (2016). Generalized BN–S stochastic volatility model for option pricing. International Journal of Theoretical and Applied Finance, 19(02), 1650014.
[29] SenGupta, I. (2014). Pricing Asian options in financial markets using Mellin transforms. Electronic Journal of Differential Equations, 234, 1-9.
[30] SenGupta, I., Nganje, W., & Hanson, E. (2020). Refinements of Barndorff-Nielsen and Shephard model: an analysis of crude oil price with machine learning. Annals of Data Science, 1-17.
[31] Shoshi, H., & SenGupta, I. (2020). Hedging and machine learning driven crude oil data analysis using a refined Barndorff-Nielsen and Shephard model. arXiv preprint arXiv:2004.14862.
[32] Thavaneswaran, A., Appadoo, S. S., & Frank, J. (2013). Binary option pricing using fuzzy numbers. Applied Mathematics Letters, 26(1), 65-72.
[33] Wang, X., He, J., & Li, S. (2014). Compound option pricing under fuzzy environment. Journal of Applied Mathematics, 2014.
[34] Wu, H. C. (2004). Pricing European options based on the fuzzy pattern of Black–Scholes formula. Computers & Operations Research, 31(7), 1069-1081.
[35] Wu, H. C. (2007). Using fuzzy sets theory and Black–Scholes formula to generate pricing boundaries of European options. Applied Mathematics and Computation, 185(1), 136-146.
[36] Xu, W., Chen, Y., Coleman, C., & Coleman, T. F. (2018). Moment matching machine learning methods for risk management of large variable annuity portfolios. Journal of Economic Dynamics and Control, 87, 1-20.
[37] Yarovaya, L., Brzeszczyński, J., & Lau, C. K. M. (2016). Intra- and inter-regional return and volatility spillovers across emerging and developed markets: Evidence from stock indices and stock index futures. International Review of Financial Analysis, 43, 96-114.
[38] Yoshida, Y. (2003). The valuation of European options in uncertain environment. European Journal of Operational Research, 145(1), 221-229.
[39] Yoshida, Y., Yasuda, M., Nakagami, J. I., & Kurano, M. (2006). A new evaluation of mean value for fuzzy numbers and its application to American put option under uncertainty. Fuzzy Sets and Systems, 157(19), 2614-2626.
[40] Zadeh, L. A., Klir, G. J., & Yuan, B. (1996). Fuzzy sets, fuzzy logic, and fuzzy systems: selected papers (Vol. 6), 394-432.
[41] Zhang, L. H., Zhang, W. G., Xu, W. J., & Xiao, W. L. (2012). The double exponential jump diffusion model for pricing European options under fuzzy environments. Economic Modelling, 29(3), 780-786.
[42] Zmeškal, Z. (2010). Generalised soft binomial American real option pricing model (fuzzy–stochastic approach). European Journal of Operational Research, 207(2), 1096-1103.