A Bayesian regularization-backpropagation neural network model for peeling computations
Saipraneeth Gouravaraju, Jyotindra Narayan, Roger A. Sauer, and Sachin Singh Gautam*

Indian Institute of Technology Guwahati, Guwahati, India 781039
Graduate School, Aachen Institute for Advanced Study in Computational Engineering Science (AICES), RWTH Aachen University, Templergraben 55, 52056 Aachen, Germany
Department of Mechanical Engineering, Indian Institute of Technology Kanpur, UP 208016, India

* Corresponding Author, email: [email protected]
Abstract
Bayesian regularization-backpropagation neural network (BR-BPNN), a machine learning algorithm, is employed to predict certain aspects of gecko spatula peeling, such as the variation of the maximum normal and tangential pull-off forces and the resultant force angle at detachment with the peeling angle. The input data is taken from finite element (FE) peeling results. The neural network is trained with 75% of the FE dataset. The remaining 25% is utilized to predict the peeling behavior. The training performance is evaluated for every change in the number of hidden layer neurons to determine the optimal network structure. The relative error is calculated to draw a clear comparison between predicted and FE results. It is observed that BR-BPNN models have significant potential to estimate the peeling behavior.
Keywords: Machine learning, Adhesion, Peeling, Artificial neural networks, Bayesian regularization.
Introduction

The study of peeling is essential in understanding the adhesion characteristics in many applications such as adhesive tapes, micro- and nano-electronics [1, 2], coatings [3], microfiber arrays [4, 5], wearable medical bands [6], and cell adhesion [7]. Peeling problems have been used by many researchers to analyze multiscale adhesion in biological adhesive pads such as in geckos, insects, and spiders [8–11], where peeling is an important aspect of detachment. For example, the nanoscale spatulae in geckos are very thin structures (approximately 5–10 nm thick) with a width of around 200 nm that can be modeled effectively as a thin strip [12–15]. These nanoscale structures interact with substrates via short-range van der Waals forces [16]. Peeling of the various components in the hierarchical microstructure of the adhesive pads has been studied extensively using analytical [12, 17, 18], experimental [16, 19–21], and numerical models [14, 15, 22, 23] to gain new insight into their mechanics. Several researchers used thin film peeling models to understand various aspects of gecko adhesion such as reversible adhesion [22, 24], pre-straining [18, 25, 26], and dynamic self-cleaning [27].

Adhesive friction model

In this section, the adhesive friction model of Mergel et al. [72] and its application to gecko spatula peeling by Gouravaraju et al. [30, 70] are briefly described.

The "Model EA" of Mergel et al. [72] defines a sliding traction threshold T_s that is non-zero even for tensile normal forces. This sliding threshold depends on the magnitude of the normal traction T_n = ||T_n|| due to adhesion between the spatula and the substrate. Further, it is assumed that the interfacial frictional forces act only up to a certain cut-off distance r_c. Then we have

\[
T_s(r) = \begin{cases} \mu_f \, J_c \left\langle T_n(r) - T_n(r_c) \right\rangle, & r < r_c, \\[4pt] 0, & r \ge r_c, \end{cases} \tag{1}
\]

where J_c is the local contact surface stretch (= 1 for rigid substrates), µ_f is the friction coefficient, and r denotes the minimum distance between the interacting surfaces.

The normal traction T_n is obtained from the variation of the total adhesion potential, which is the summation of the individual adhesion potentials acting between the molecules of the substrate and the spatula, and is given as [73]

\[
\boldsymbol{T}_n = \frac{A_H}{2\pi r_0^3} \left[ \frac{1}{45}\left( \frac{r_0}{r} \right)^{9} - \frac{1}{3}\left( \frac{r_0}{r} \right)^{3} \right] \boldsymbol{n}_s, \tag{2}
\]

where r_0 is the equilibrium distance of the Lennard-Jones potential, A_H is Hamaker's constant, and n_s is the normal to the substrate.

Similar to Coulomb's friction model, the magnitude of the frictional traction T_f is governed by

\[
\| \boldsymbol{T}_f \| \begin{cases} < T_s & \text{for sticking}, \\ = T_s & \text{for sliding}, \end{cases} \tag{3}
\]

and is computed using a predictor-corrector algorithm [30]. A Neo-Hookean material model is employed to model the spatula response [74]. For further details on the application of the adhesive friction model, we refer to Gouravaraju et al. [30].

The spatula is modeled as a thin two-dimensional strip as shown in Fig. 1. A displacement ū is applied to the spatula shaft at an angle called the peeling angle θ_p, and a nonlinear finite element analysis is employed to solve the resulting mechanical boundary value problem.
Figure 1: Peeling of a deformable strip from a rigid substrate. The strip adheres over 75% of its surface.
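To make the traction laws concrete, the following minimal Python sketch evaluates Eqs. (1)–(3) pointwise. All parameter values here are illustrative assumptions, not the values used in the FE study of [30]; only the functional form follows the equations above.

```python
import numpy as np

# Illustrative (assumed) parameters -- not the values used in [30].
A_H  = 1.0e-19   # Hamaker constant [J]
r0   = 0.4e-9    # Lennard-Jones equilibrium distance [m]
mu_f = 0.3       # friction coefficient
r_c  = 1.1e-9    # cut-off distance for friction [m]
J_c  = 1.0       # contact surface stretch (rigid substrate)

def T_n_mag(r):
    """Magnitude of the van der Waals normal traction, Eq. (2)."""
    return abs(A_H / (2.0 * np.pi * r0**3)
               * ((r0 / r)**9 / 45.0 - (r0 / r)**3 / 3.0))

def T_s(r):
    """Sliding traction threshold of 'Model EA', Eq. (1)."""
    if r >= r_c:
        return 0.0
    # Macaulay bracket: only a positive difference contributes.
    return mu_f * J_c * max(T_n_mag(r) - T_n_mag(r_c), 0.0)

# Eq. (3): a trial (predictor) frictional traction is capped at T_s
# (corrector step); below the cap the point sticks, at the cap it slides.
r = 0.6e-9
T_f_trial = 0.5 * T_s(r)      # placeholder trial value
T_f = min(T_f_trial, T_s(r))
print(T_n_mag(r), T_s(r), T_f)
```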
The entire peeling of the spatula can be divided into two phases based on the evolution of the normal and tangential pull-off forces shown in Fig. 2. In the first phase (from the initial displacement ū = 0 to ū_max), the spatula continuously undergoes stretching because it is in a state of partial sliding/sticking near the peeling front. Thus, it accumulates strain energy. At ū_max the spatula is stretched to the maximum as the pull-off forces reach a maximum value. During the second phase (from ū_max to ū_det) the spatula fully slides on the substrate. As a result, the spatula relaxes and releases the accumulated energy until it detaches from the substrate spontaneously at ū_det. Similar peeling curves are obtained for other peeling angles.
Figure 2: Evolution of normal (F_n) and tangential (F_t) pull-off forces with the applied displacement ū for peeling angle θ_p = 45°.

In this study, the focus is on three aspects of the peeling process, viz. the maximum normal pull-off force F_n^max, the maximum tangential pull-off force F_t^max, and the resultant force angle α = arctan(F_n/F_t) at detachment. It has been shown that, depending on the peeling angle θ_p, the maximum pull-off forces F_n^max and F_t^max, the corresponding displacement ū_max, and the detachment displacement ū_det vary considerably [30]. On the other hand, it has been observed that the resultant force angle at detachment α_det remains the same irrespective of the peeling angle (see Table A1 in Appendix A).

Bayesian regularization backpropagation neural network

In this section, a backpropagation neural network (BPNN) along with the Bayesian regularization learning algorithm are described.

A classical neural network architecture mimics the function of the human brain. The brain neurons and their connections with each other form an equivalence relation with neural network neurons and their associated weight values (w). In a single-layer network with multiple neurons, each element u_j of an input vector is associated with each neuron i through a corresponding weight w_ij. A constant scalar term called bias b_i corresponding to each neuron, which acts like a weight, is generally introduced in order to increase the flexibility of the network. This bias b_i is multiplied by a scalar input value (chosen to be 1 here) and is added to the weighted sum Σ_j w_ij u_j of the vector components u_j to form a net input n_i. This net input n_i is then passed to an activation function f (also called transfer function) that produces an output value a_i. In general, a neural network consists of two or more layers. Adding a hidden layer of neurons between the input layer and output layer constitutes a multi-layer neural network, also named a shallow neural network. Furthermore, the addition of more than one hidden layer to the multi-layer neural network yields a deep neural network.

Traditionally, a BPNN model, a kind of multi-layer neural network, comprises three layers: an input layer, one or more hidden layers, and an output layer, as shown in Fig. 3. The input layer associates the input vector u having R elements with the input weight matrix W^1 and first bias vector b^1 to yield an effective input n^1 to the activation function f^1, which produces an output vector a^1. The output vector a^1 from the first layer forms the input to the hidden layer and is associated with the weight matrix W^2 and bias vector b^2 of the hidden layer. At last, the hidden layer output a^2 is given as an input to the output layer, which delivers the predicted output a^3 with weight matrix W^3 and bias vector b^3. In a neural network with a total of n_l layers, the weight matrix W^l and bias vector b^l for layer l (where l = 1, 2, ..., n_l) can be written as

\[
\boldsymbol{W}^l = \begin{bmatrix} w^l_{11} & w^l_{12} & \cdots & w^l_{1R} \\ w^l_{21} & w^l_{22} & \cdots & w^l_{2R} \\ \vdots & \vdots & \ddots & \vdots \\ w^l_{N_l 1} & w^l_{N_l 2} & \cdots & w^l_{N_l R} \end{bmatrix}, \qquad \boldsymbol{b}^l = \begin{bmatrix} b^l_1 \\ b^l_2 \\ \vdots \\ b^l_{N_l} \end{bmatrix}, \tag{4}
\]

where N_l denotes the number of neurons in layer l, and the effective input n^l is then given as

\[
\boldsymbol{n}^l = \boldsymbol{W}^l \boldsymbol{a}^{l-1} + \boldsymbol{b}^l, \qquad \text{with } \boldsymbol{a}^0 = \boldsymbol{u}. \tag{5}
\]

The number of neurons in the input layer (N_1) and output layer (N_3) is linked to the number of input and output vectors, respectively.
However, the number of neurons in the hidden layer (N_2) is responsible for the quantification of the weights and biases. The optimal network structure is determined by the optimum number of neurons in each layer required for the training and is denoted as N_1-N_2-N_3. A variety of activation functions are used in backpropagation neural networks, viz. hard limit, linear, sigmoid, log-sigmoid, and hyperbolic tangent sigmoid [75]. In the current work, linear activation functions are employed in all the layers, according to which the output is equal to the input, i.e. a^l = n^l.
Figure 3: A typical backpropagation neural network with input, hidden, and output layers. Adapted from [75].
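As a concrete illustration of Eqs. (4)–(5), the short sketch below propagates a normalized input through a three-layer network with the linear activation used here (a^l = n^l). The 1-8-2 layer sizes mirror the optimal BPNN-II structure found later; the weights and biases are random placeholders, not trained values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed layer sizes N_1-N_2-N_3 = 1-8-2 (cf. the BPNN-II structure).
sizes = [1, 8, 2]

# Weight matrices W^l of shape (N_l x N_{l-1}) and bias vectors b^l,
# filled with random placeholder values.
W = [rng.standard_normal((n, m)) for m, n in zip(sizes[:-1], sizes[1:])]
b = [rng.standard_normal((n, 1)) for n in sizes[1:]]

def forward(u):
    """Eq. (5): n^l = W^l a^(l-1) + b^l with a^0 = u.
    Linear activation, so a^l = n^l at every layer."""
    a = u
    for W_l, b_l in zip(W, b):
        a = W_l @ a + b_l
    return a

u = np.array([[45.0 / 90.0]])  # peeling angle, normalized by its maximum
print(forward(u))              # two outputs, e.g. (u_max, F_t^max)
```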
The network error e is calculated by subtracting the predicted output a_o from the target output t_o. The sensitivity s, i.e. the measure of how the output of the network changes due to perturbations in the input, is back-propagated from the output layer (s^3) to the input layer (s^1) via the hidden layer (s^2). Through the backpropagation process, the error of the neurons in the hidden layer is estimated as the backward weighted sum of the sensitivity. Thereafter, to update the weights, different learning algorithms are used in association with the sensitivity, such as the steepest descent, LM, and conjugate gradient algorithms. The sensitivity at layer l is calculated using the recurrence relation [75]

\[
\boldsymbol{s}^l = \dot{\boldsymbol{F}}^l(\boldsymbol{n}^l)\,\big(\boldsymbol{W}^{l+1}\big)^{\mathrm{T}}\,\boldsymbol{s}^{l+1}, \qquad \text{where } l = n_l - 1, \ldots, 2, 1, \tag{6}
\]

with

\[
\boldsymbol{s}^{n_l} = -2\,\dot{\boldsymbol{F}}^{n_l}(\boldsymbol{n}^{n_l})\,(\boldsymbol{t}_o - \boldsymbol{a}_o), \tag{7}
\]

where Ḟ^l(n^l) is a diagonal matrix containing the partial derivatives of the activation function f^l with respect to the net inputs n^l and is given as

\[
\dot{\boldsymbol{F}}^l(\boldsymbol{n}^l) = \begin{bmatrix} \dot{f}^l(n^l_1) & & & 0 \\ & \dot{f}^l(n^l_2) & & \\ & & \ddots & \\ 0 & & & \dot{f}^l(n^l_{N_l}) \end{bmatrix}, \qquad \dot{f}^l(n^l_j) = \frac{\partial f^l(n^l_j)}{\partial n^l_j}, \tag{8}
\]

which, for the considered linear activation function, is equal to the identity matrix.
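The following sketch traces one backpropagation step through Eqs. (6)–(8) for the linear-activation case, where the Jacobian Ḟ^l of Eq. (8) reduces to the identity. The target values are placeholders, and the steepest-descent update is chosen here purely for illustration from the learning algorithms mentioned above.

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [1, 8, 2]                       # assumed 1-8-2 network
W = [rng.standard_normal((n, m)) for m, n in zip(sizes[:-1], sizes[1:])]
b = [rng.standard_normal((n, 1)) for n in sizes[1:]]

u = np.array([[0.5]])                   # input
t_o = np.array([[0.4], [0.9]])          # placeholder target output

# Forward pass, storing a^0 ... a^(n_l) (linear activations: a^l = n^l).
a = [u]
for W_l, b_l in zip(W, b):
    a.append(W_l @ a[-1] + b_l)

# Backward pass. With linear activations, F'^l(n^l) in Eq. (8) is the
# identity and drops out of Eqs. (6)-(7).
s = [None] * len(W)
s[-1] = -2.0 * (t_o - a[-1])            # Eq. (7) at the output layer
for l in range(len(W) - 2, -1, -1):
    s[l] = W[l + 1].T @ s[l + 1]        # Eq. (6), back through the layers

# One steepest-descent update: the gradient w.r.t. W^l is s^l (a^(l-1))^T.
lr = 0.01
for l in range(len(W)):
    W[l] -= lr * s[l] @ a[l].T
    b[l] -= lr * s[l]
```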
The purpose of a backpropagation neural network model is to ensure a network with small deviations for the training dataset and to handle unknown inputs effectively. The intricacy of the BPNN, governed by the neurons in the hidden layer and their associated weights, can lead to overfitting, i.e. the network tries to make the error as small as possible for the training set but performs poorly when new data is presented. However, a robust network model should be able to generalize well, i.e. it should predict well even when presented with new data. Therefore, Bayesian regularization based learning of BPNN models is utilized to achieve better generalization and minimal overfitting for the trained networks [61, 62].

Consider a neural network with training dataset D having n_t input u and target t_o vector pairs, i.e.

\[
D = \left\{ (\boldsymbol{u}_1, \boldsymbol{t}_{o1}), (\boldsymbol{u}_2, \boldsymbol{t}_{o2}), \ldots, (\boldsymbol{u}_{n_t}, \boldsymbol{t}_{o n_t}) \right\}. \tag{9}
\]

For each input to the network, the difference between the target output (t_o) and the predicted output (a_o) is computed as the error e. In order to evaluate the performance of the network, i.e. how well the neural network is fitting the data, a quantitative measure is needed. This measure is called the performance index of the network and is used to optimize the network parameters. The standard performance index F(w̄) is governed by the sum of the squared errors (SSE)

\[
F(\bar{\boldsymbol{w}}) = E_D = \sum_{i=1}^{n_t} (e_i)^2 = \sum_{i=1}^{n_t} (\boldsymbol{t}_{oi} - \boldsymbol{a}_{oi})^{\mathrm{T}} (\boldsymbol{t}_{oi} - \boldsymbol{a}_{oi}), \tag{10}
\]

where w̄ denotes the vector of size K containing all the weights and biases of the network,

\[
\bar{\boldsymbol{w}}^{\mathrm{T}} = \left[ \boldsymbol{w}^1, \boldsymbol{w}^2, \ldots, \boldsymbol{w}^{n_l} \right]_{1 \times K}, \tag{11}
\]

where

\[
K = N_1 (R + 1) + N_2 (N_1 + 1) + \ldots + N_{n_l} (N_{n_l - 1} + 1), \tag{12}
\]

and

\[
\big(\boldsymbol{w}^l\big)^{\mathrm{T}} = \left[ w^l_{11}, w^l_{12}, \ldots, w^l_{N_l R}, b^l_1, b^l_2, \ldots, b^l_{N_l} \right]. \tag{13}
\]

As described in the introduction, in order to generalize the neural network, the performance index of Eq. (10) is modified using a regularization method. A penalty term E_w, weighted by a regularization parameter, is added to the performance index F(w̄) [76],

\[
F(\bar{\boldsymbol{w}}) = \mu E_w + \nu E_D, \tag{14}
\]

where µ and ν are the regularization parameters and E_w represents the sum of the squared network weights (SSW), i.e.

\[
E_w = \bar{\boldsymbol{w}}^{\mathrm{T}} \bar{\boldsymbol{w}}. \tag{15}
\]

Finding the optimum values for µ and ν is a challenging task, as their relative values set up the basis for the training error. If µ ≪ ν, smaller errors are generated, while if µ ≫ ν, the weight size is reduced at the cost of network errors [64]. For the purpose of finding the optimum regularization parameters, a Bayesian regularization method is employed.
Considering the network weights w̄ as random variables, the aim is to choose the weights that maximize the posterior probability distribution of the weights P(w̄ | D, µ, ν, M_N) given the data D. According to Bayes' rule [61], the posterior distribution of the weights depends on the likelihood function P(D | w̄, ν, M_N), the prior density P(w̄ | µ, M_N), and the normalization factor P(D | µ, ν, M_N) for a particular neural network model M_N, and can be evaluated from

\[
P(\bar{\boldsymbol{w}} \mid D, \mu, \nu, M_N) = \frac{P(D \mid \bar{\boldsymbol{w}}, \nu, M_N)\, P(\bar{\boldsymbol{w}} \mid \mu, M_N)}{P(D \mid \mu, \nu, M_N)}. \tag{16}
\]

Considering that the noise in the training set has a Gaussian distribution, the likelihood function is given by

\[
P(D \mid \bar{\boldsymbol{w}}, \nu, M_N) = \frac{\exp(-\nu E_D)}{Z_D(\nu)}, \tag{17}
\]

where Z_D = (π/ν)^{Q/2} and Q = n_t × N_{n_l}. Similarly, assuming a Gaussian distribution for the network weights, the prior probability density P(w̄ | µ, M_N) is given as

\[
P(\bar{\boldsymbol{w}} \mid \mu, M_N) = \frac{\exp(-\mu E_w)}{Z_w(\mu)}, \tag{18}
\]

where Z_w = (π/µ)^{K/2}. The posterior probability of the network weights w̄ can then be expressed as [64]

\[
P(\bar{\boldsymbol{w}} \mid D, \mu, \nu, M_N) = \frac{\exp(-\mu E_w - \nu E_D)}{Z_F(\mu, \nu)} = \frac{\exp\!\big(-F(\bar{\boldsymbol{w}})\big)}{Z_F(\mu, \nu)}, \tag{19}
\]

where Z_F(µ, ν) = Z_D(ν) Z_w(µ) is the normalization factor.

The complexity of the model M_N is governed by the regularization parameters µ and ν, which need to be estimated from the data. Therefore, Bayes' rule is again applied to optimize them as follows:

\[
P(\mu, \nu \mid D, M_N) = \frac{P(D \mid \mu, \nu, M_N)\, P(\mu, \nu \mid M_N)}{P(D \mid M_N)}, \tag{20}
\]

where P(µ, ν | M_N) denotes the assumed uniform prior density for the parameters µ and ν. From Eq. (20), it is evident that maximizing the likelihood function P(D | µ, ν, M_N) eventually maximizes the posterior probability P(µ, ν | D, M_N). Moreover, it can be noted that the likelihood function in Eq. (20) is the normalization factor of Eq. (16). Therefore, solving for the likelihood function P(D | µ, ν, M_N) and expanding the objective function in Eq. (14) around the minimal point w̄* via a Taylor series expansion, the optimum values of the regularization parameters can be evaluated as follows [77]:

\[
\mu^* = \frac{\gamma}{2 E_w(\bar{\boldsymbol{w}}^*)} \qquad \text{and} \qquad \nu^* = \frac{Q - \gamma}{2 E_D(\bar{\boldsymbol{w}}^*)}, \tag{21}
\]

where γ signifies the "number" of effective parameters used in minimizing the error function,

\[
\gamma = K - 2\mu^*\, \mathrm{tr}\!\left(\big(\boldsymbol{H}^*\big)^{-1}\right), \qquad \text{for } 0 \le \gamma \le K, \tag{22}
\]

and H* is the Hessian matrix of the objective function evaluated at w̄*, calculated using the Gauss-Newton approximation as [64]

\[
\boldsymbol{H}^* \approx 2\nu\, \boldsymbol{J}^{\mathrm{T}} \boldsymbol{J} + 2\mu\, \boldsymbol{I}_K, \tag{23}
\]

where J is the Jacobian matrix formed by the first derivatives of the network errors e with respect to the network weights w_ij, and I_K is the K × K identity matrix. In Eq. (22), tr(·) denotes the trace operator. The normalization factor Z_F(µ, ν) can then be approximated as [75]

\[
Z_F(\mu, \nu) \approx (2\pi)^{K/2} \left( \det(\boldsymbol{H}^*) \right)^{-1/2} \exp\!\big(-F(\bar{\boldsymbol{w}}^*)\big). \tag{24}
\]

At the end of the training, a few checks regarding the number of effective parameters are required for better performance of the network [64]. The problem of computing the Hessian matrix at the minimal point w̄* is implicitly solved in the Levenberg-Marquardt (LM) training algorithm while finding the minimum of F(w̄).
In the LM algorithm, the network weights and biases at the k-th iteration are adjusted according to [61, 77]

\[
\bar{\boldsymbol{w}}_{k+1} = \bar{\boldsymbol{w}}_k - \left[ \boldsymbol{J}^{\mathrm{T}} \boldsymbol{J} + \lambda \boldsymbol{I} \right]^{-1} \boldsymbol{J}^{\mathrm{T}} \boldsymbol{e}, \tag{25}
\]

where λ denotes the Levenberg damping factor and Jᵀe is the error gradient, which needs to be close to zero at the end of the training.
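To show how Eqs. (21), (22), and (25) interact, the sketch below runs a few Bayesian-regularized LM iterations on a deliberately tiny curve-fitting problem. The two-parameter model, the finite-difference Jacobian, and all starting values are illustrative assumptions; the Hessian approximation follows Eq. (23).

```python
import numpy as np

# Tiny illustrative dataset: fit y = w0 + w1*x (K = 2 parameters).
x = np.linspace(0.0, 1.0, 9)
t = 2.0 + 3.0 * x + 0.01 * np.random.default_rng(1).standard_normal(9)

def residuals(w):
    return t - (w[0] + w[1] * x)          # network errors e

def jacobian(w, h=1e-6):
    """Finite-difference Jacobian of e w.r.t. w (illustration only)."""
    J = np.zeros((x.size, w.size))
    for j in range(w.size):
        dw = np.zeros_like(w)
        dw[j] = h
        J[:, j] = (residuals(w + dw) - residuals(w - dw)) / (2.0 * h)
    return J

w = np.zeros(2)                           # initial weights
mu, nu, lam = 0.01, 1.0, 0.1              # assumed starting values
K, Q = w.size, t.size

for k in range(50):
    J = jacobian(w)
    e = residuals(w)
    # Eq. (25): LM step; J^T e is the error gradient.
    w = w - np.linalg.solve(J.T @ J + lam * np.eye(K), J.T @ e)
    # Re-evaluate at the updated weights for the Bayesian update.
    J = jacobian(w)
    e = residuals(w)
    H = 2.0 * nu * (J.T @ J) + 2.0 * mu * np.eye(K)    # Eq. (23)
    gamma = K - 2.0 * mu * np.trace(np.linalg.inv(H))  # Eq. (22)
    mu = gamma / (2.0 * (w @ w))                       # Eq. (21), E_w
    nu = (Q - gamma) / (2.0 * (e @ e))                 # Eq. (21), E_D

print(w, gamma)   # weights approach (2, 3); gamma counts effective parameters
```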
Neural network datasets

In this work, the inputs u of the BR-BPNN models are the seventeen peeling angle values θ_p ranging from 10° to 90° at an interval of 5°. The seventeen peeling angles are divided into the training, validation, and testing sub-datasets as shown in Table 1. The training dataset is used to train the neural network model using the Bayesian regularization method, and the trained model is further validated with the validation dataset. The validation dataset, in other backpropagation training algorithms, is used to optimize the hyperparameters for effective training. The hyperparameters, like the number of neurons in the hidden layer and the learning parameters such as γ and λ, are defined as the variables required for training the neural network. However, for BR based learning networks, the hyperparameters in the form of the regularization parameters (µ, ν) are implicitly optimized using Eq. (14). Therefore, the validation set is not essentially required in this case for optimizing the network hyperparameters. Finally, the testing dataset is utilized to predict the targeted output t_o and to analyze the model performance. Appendix B presents a simple algorithmic overview of BR-BPNN.

Table 1: Division of the input dataset.

Input dataset     Peeling angles (θ_p)
Training set      10°, 15°, 20°, 25°, 30°, 35°, 40°, 45°, 60°, 65°, 70°, 75°, 80°
Validation set    Not required
Testing set       50°, 55°, 85°, 90°
Table 2: Output dataset for the three different BR-BPNN models (see Appendix A for the FE results).
BPNN-I
  Applied displacement at force maximum: ū_max = [ū_max,1, ū_max,2, ..., ū_max,17]^T
  Maximum normal pull-off force: F_n^max = [F_n,1^max, F_n,2^max, ..., F_n,17^max]^T
BPNN-II
  Applied displacement at force maximum: ū_max = [ū_max,1, ū_max,2, ..., ū_max,17]^T
  Maximum tangential pull-off force: F_t^max = [F_t,1^max, F_t,2^max, ..., F_t,17^max]^T
BPNN-III
  Applied displacement at detachment: ū_det = [ū_det,1, ū_det,2, ..., ū_det,17]^T
  Resultant force angle at detachment: α_det = [α_det,1, α_det,2, ..., α_det,17]^T

Next, three BR-BPNN models are formed with three different output datasets, each having two output vectors, as shown in Table 2. The two output vectors for BPNN-I are the applied displacement at force maximum ū_max and the maximum normal pull-off force F_n^max; for BPNN-II they are the applied displacement at force maximum ū_max and the maximum tangential pull-off force F_t^max; and for BPNN-III they are the applied displacement at detachment ū_det and the resultant force angle at detachment α_det. Each output vector consists of 17 elements in all three models. However, only the thirteen elements corresponding to the input training dataset (see Table 1) are selected for training the BPNN models. Then, the input and output vectors are normalized by their corresponding maximum values. The performance of the BR-BPNN models is estimated by comparing the MSE values for different numbers of neurons in the hidden layer and determining the optimal number. The MSE is computed using the network error and is defined as the mean of the sum of squared network errors, i.e.

\[
\mathrm{MSE} = \frac{1}{n_t} E_D. \tag{26}
\]

Results and discussion

This section presents the BR based backpropagation neural network predictions of the maximum normal pull-off force F_n^max, the maximum tangential pull-off force F_t^max, and the resultant force angle at detachment α_det, along with the corresponding displacements ū_max and ū_det. The predictions of the networks are then compared with the FE results of Gouravaraju et al. [30, 70] that have not been used for training.
Figure 4: Mean square error vs. number of neurons in the hidden layer for the different BPNN models.
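The data handling described above can be summarized in a few lines. The sketch below assembles the seventeen-angle dataset, applies the Table 1 split, normalizes by the maxima, and evaluates the MSE of Eq. (26); the FE target values are random placeholders standing in for the Appendix A data.

```python
import numpy as np

# Seventeen peeling angles, 10 to 90 degrees in steps of 5 (Table 1).
theta = np.arange(10, 95, 5, dtype=float)

test_mask = np.isin(theta, [50.0, 55.0, 85.0, 90.0])   # Table 1 split
train_mask = ~test_mask                                # 13 training angles

# Placeholder FE targets for BPNN-I (u_max, F_n^max); see Appendix A
# for the actual values of Gouravaraju et al. [30, 70].
rng = np.random.default_rng(2)
u_max = 30.0 + 0.2 * theta + rng.normal(0.0, 0.5, theta.size)
F_n_max = 140.0 - 0.5 * theta + rng.normal(0.0, 1.0, theta.size)

# Normalize inputs and outputs by their maximum values.
u_in = theta / theta.max()
t_out = np.stack([u_max / u_max.max(), F_n_max / F_n_max.max()])

# Eq. (26): MSE over the n_t training pairs, here for a dummy prediction.
a_out = t_out[:, train_mask] + 0.01     # stand-in network output
e = t_out[:, train_mask] - a_out
mse = np.sum(e**2) / train_mask.sum()
print(mse)
```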
To define the optimal structure of each network model, the mean square error (MSE) of Eq. (26) is investigated against the number of neurons (1 to 16) in the hidden layer. For the three BR based BPNN models (BPNN-I, BPNN-II, and BPNN-III), training is performed with 1 to 16 hidden neurons. The MSE values for all three models with only one hidden neuron are found to be very high, i.e. 152, .47, and 5.79, a single hidden neuron being incapable of forming an efficient network. However, as the number of hidden neurons increases to two, a sudden drop in the MSE values (0. , . , and 1.24) is recorded. Each model is trained 15 times independently for each number of neurons to mitigate the unfavorable effects of choosing random initial weights. Each network model is trained for a maximum of 2000 epochs. An epoch is completed when the entire training dataset has passed forward and backward through the network, thus updating the weights once. For BPNN-II, the mean square error attains a broad minimum and continues to decrease between 8 and 10 hidden neurons, as shown in Fig. 4. For N_2 greater than 11, the MSE value again starts to rise due to overfitting of the network models. Therefore, for BPNN-II the number of neurons in the hidden layer is selected as 8. The number of neurons in the input and output layers is taken as 1 and 2, as there is one input vector and two output vectors for each model. The optimal network structure of BPNN-II is thus 1-8-2. Following a similar trend, the optimal number of hidden neurons for the BPNN-I and BPNN-III models is found to be 6 and 5, forming the network structures 1-6-2 and 1-5-2, respectively (see Table 3).

One of the following criteria is used to terminate or complete the training process: maximum number of epochs reached, minimum value of the performance gradient reached, minimum constant value of effective parameters (γ) reached, maximum value of the Levenberg damping factor (λ) attained, or MSE within the performance limits. The training results are achieved at 717, 1992, and 1000 epochs for the three different models, having MSE values of 0.003, 0.090, and 0.006, respectively. The numbers of effective parameters γ (Eq. (22)) for the three models are recorded as 19. , . , and 14.2, respectively. The other network training parameters like the training time, sum of squared errors (SSE) (Eq. (10)), sum of squared weights (SSW) (Eq. (15)), Levenberg damping factor, and error gradient (Eq. (25)) values are also shown in Table 3.
Table 3: Training parameters for the three BR-BPNN models.
Network model   Structure   Epochs   Time [min:sec]   MSE     SSE (E_D)   SSW (E_w)   Effective parameters (γ)   LM parameter (λ)   Gradient (Jᵀe)
BPNN-I          1-6-2       717      00:07            0.003   0.053       137         19.–                       –                  –
BPNN-II         1-8-2       1992     00:55            0.090   1.097       70.–        –                          –                  –
BPNN-III        1-5-2       1000     00:10            0.006   0.036       82.–        14.2                       –                  –

After training the models with the input-output dataset containing thirteen values, the testing dataset having the four peeling angles 50°, 55°, 85°, and 90° is utilized to predict the corresponding desired output values. The relative error (RE) is used to measure the accuracy of the network predictions. The RE is calculated as the deviation of the predicted result from the desired target result, i.e.

\[
\mathrm{RE} = \left( \frac{t_i - a_i}{t_i} \right), \tag{27}
\]

where t_i and a_i denote the desired target result and the network prediction for a particular peeling angle of the testing dataset, respectively.

Case I: Maximum normal pull-off force

Based on the training parameters from Table 3, Fig. 5 presents the predicted (BPNN-I) results of the maximum normal pull-off force F_n^max and the corresponding applied displacement ū_max. It can be seen from Fig. 5a that the predicted values of F_n^max for θ_p = 50°, 55°, and 85° are very close to the desired target results (obtained by FE). However, for θ_p = 90°, the predicted results show slightly more deviation compared to the other testing angles. The predictions are similar for ū_max, as shown in Fig. 5b. This can also be observed from Table 4, which lists the relative error (RE) for the four testing angles.
Figure 5: Predictions from BR-BPNN-I: (a) maximum normal pull-off force F_n^max and (b) applied displacement values at the force maximum ū_max.

Table 4: Relative error (RE) for the predictions of the BR-BPNN-I model.
Peeling angle θ_p [degrees]   RE in ū_max [%]   RE in F_n^max [%]
50                            –                 –
55                            –                 –
85                            –                 –
90                            –                 –

Case II: Maximum tangential pull-off force

From Table 3, the training parameters of BPNN-II are utilized to predict the maximum tangential pull-off force F_t^max and the corresponding applied displacement ū_max at the four testing angles. As shown in Fig. 6a, the deviations of the values predicted using BR-BPNN for all the testing angles θ_p = 50°, 55°, 85°, and 90° are very small. However, for ū_max the predicted values show a slightly larger deviation from the desired target results for θ_p = 90°. The RE for both F_t^max and ū_max is given in Table 5. These results show that, except for ū_max at θ_p = 90°, the current neural network is very accurate.
Figure 6: Predictions from BR-BPNN-II: (a) maximum tangential pull-off force F_t^max and (b) applied displacement values at the force maximum ū_max, at the four testing peeling angles.
Table 5: Relative error (RE) for the predictions of the BR-BPNN-II model.
Peeling angle θ_p [degrees]   RE in ū_max [%]   RE in F_t^max [%]
50                            –                 –
55                            –                 –
85                            –                 –
90                            –                 –

Case III: Resultant force angle at detachment

Figure 7 shows the predictions for the output dataset of BPNN-III, i.e. the applied displacement at detachment ū_det and the resultant force angle at detachment α_det, using the corresponding training parameters from Table 3. As shown in Fig. 7a, the predicted values of α_det are very close to the desired target results, and the RE values corresponding to the α_det predictions for 50°, 55°, 85°, and 90° are estimated to be very small. The predicted values of ū_det for θ_p = 50° and 90° are likewise found to be very close to the desired target results (see Fig. 7b). This can also be observed from the RE results in Table 6. It can be observed that the predictions are very accurate even outside of the training dataset.
Figure 7: Predictions from BR-BPNN-III: (a) the resultant force angle at detachment α_det and (b) the applied displacement at detachment ū_det, at the four testing peeling angles.
Table 6: Relative error (RE) for the predictions of the BR-BPNN-III model.
Peeling angle θ_p [degrees]   RE in ū_det [%]   RE in α_det [%]
50                            –                 –
55                            –                 –
85                            –                 –
90                            –                 –

Overall, the relative errors are larger for ū_max than for F_n^max and F_t^max. BPNN-III performs better even for the data outside the training dataset, which could be due to the fact that α_det is almost constant for all the peeling angles θ_p, whereas in the case of BPNN-I and BPNN-II, ū_max, F_n^max, and F_t^max vary quite abruptly near θ_p = 90°.

Conclusions
An artificial neural network is employed to study the peeling behavior of a thin strip, such as a gecko spatula. In particular, the variation of the maximum normal and tangential pull-off forces and the resultant force angle at detachment with the peeling angle is investigated. Bayesian regularization is used to improve the robustness of the backpropagation neural network and to eliminate cross-validation. The input data is obtained from the finite element analysis of Gouravaraju et al. [30, 70]. Three networks corresponding to the maximum normal pull-off force, the maximum tangential pull-off force, and the resultant force angle at detachment, together with their corresponding displacements, are formed. The number of hidden neurons in each model is selected based on the respective mean square errors. From all the results, the maximum and minimum relative deviations of the predicted values from the FE results are found to be 8.52% and 0.–%, respectively.

Appendix A. Results from finite element simulations
Table A1 lists the values of the maximum normal force F_n^max, the maximum tangential force F_t^max, the applied displacement at force maximum ū_max, the applied displacement at detachment ū_det, and the resultant force angle at detachment α_det for different peeling angles, as obtained by Gouravaraju et al. [30, 70] using nonlinear finite element analysis.
Table A1: Data from the finite element results of Gouravaraju et al. [30, 70].
Case   Peeling angle θ_p [degrees]   ū_max [nm]   F_n^max [nN]   F_t^max [nN]   ū_det [nm]   α_det [degrees]
1      10                            –            –              –              –            –
2      15                            –            –              –              –            –
3      20                            –            –              –              –            –
4      25                            –            –              –              –            –
5      30                            –            –              –              –            –
6      35                            –            –              –              –            –
7      40                            –            –              –              –            –
8      45                            –            –              –              –            –
9      50                            –            –              –              –            –
10     55                            –            –              –              –            –
11     60                            –            –              –              –            –
12     65                            –            –              –              –            –
13     70                            –            –              –              –            –
14     75                            –            –              –              –            –
15     80                            –            –              –              –            –
16     85                            –            –              –              –            –
17     90                            –            –              –              –            –
Appendix B. Framework of Bayesian regularization based backpropagation

The algorithm for the Bayesian regularization based backpropagation is composed of the following steps:

1. Pick the training dataset D containing the 13 cases specified in Tables 1 and 2 and Appendix A.
   (a) Input vector u: peeling angles θ_p.
   (b) Target output vector t_o: ū_max, F_n^max (for BPNN-I); ū_max, F_t^max (for BPNN-II); ū_det, α_det (for BPNN-III).
2. Initialize the neural network with:
   (a) Number of neurons in the input layer equal to the number of input vectors, which is 1 for all three BPNN models as described in step 1(a), i.e. N_1 = 1.
   (b) Number of neurons in the output layer equal to the number of output vectors, which is 2 for all three BPNN models as described in Table 2, i.e. N_3 = 2.
   (c) Number of neurons in the hidden layer equal to one, i.e. N_2 = 1.
3. Set the learning method to Bayesian regularization.
   (a) Set the maximum number of epochs to 2000.
   (b) Divide the training dataset as per Table 1.
4. Train the network:
   (a) Compute the regularization parameters µ and ν using Eq. (21).
   (b) Backpropagate the sensitivities calculated using Eqs. (6) and (7).
   (c) Update the weights using Eq. (25).
5. Compute the mean square error (MSE) using Eq. (26).
6. Loop over steps 4 and 5 with different numbers of neurons in the hidden layer.
7. Plot the MSE vs. the number of neurons in the hidden layer as in Fig. 4.
8. Select the number of neurons in the hidden layer as the value at which the MSE attains a broad minimum before rising again as N_2 is further increased. This determines the optimal network structure N_1-N_2-N_3.
9. Retrain the neural network model with the optimal network structure from step 8.
10. Save the model parameters (as in Table 3) along with the weights and biases.
11. Using the saved parameters from step 10, predict for the testing dataset in Table 1.

References

[1] K. Komvopoulos. Adhesion and friction forces in microelectromechanical systems: mechanisms, measurement, surface modification techniques, and adhesion theory. J. Adhes. Sci. Technol., 17(4):477–517, 2003. doi:10.1163/15685610360554384.

[2] X. Zhang, Y. Liu, Y. Liu, and S.I.-U. Ahmed. Controllable and switchable capillary adhesion mechanism for bio-adhesive pads: Effect of micro patterns.
Sci. Bull., 54(10):1648–1654, 2009. doi:10.1007/s11434-009-0234-z.

[3] M. Sexsmith and T. Troczynski. Peel adhesion test for thermal spray coatings. J. Therm. Spray Technol., 3(4):404–411, 1994. doi:10.1007/BF02658986.

[4] C. Majidi, R.E. Groff, Y. Maeno, B. Schubert, S. Baek, B. Bush, R. Maboudian, N. Gravish, M. Wilkinson, K. Autumn, and R.S. Fearing. High Friction from a Stiff Polymer Using Microfiber Arrays. Phys. Rev. Lett., 97(7):076103, 2006. doi:10.1103/PhysRevLett.97.076103.

[5] B. Schubert, C. Majidi, R.E. Groff, S. Baek, B. Bush, R. Maboudian, and R.S. Fearing. Towards friction and adhesion from high modulus microfiber arrays. J. Adhes. Sci. Technol., 21(12-13):1297–1315, 2007. doi:10.1163/156856107782328344.

[6] D. Drotlef, M. Amjadi, M. Yunusa, and M. Sitti. Bioinspired Composite Microfibers for Skin Adhesion and Signal Amplification of Wearable Sensors. Adv. Mater., 29(28):1701353, 2017. doi:10.1002/adma.201701353.

[7] C. Zhu. Kinetics and mechanics of cell adhesion. J. Biomech., 33(1):23–33, 2000. doi:10.1016/S0021-9290(99)00163-3.

[8] B.N.J. Persson and S. Gorb. The effect of surface roughness on the adhesion of elastic plates with application to biological systems. J. Chem. Phys., 119(21):11437–11444, 2003. doi:10.1063/1.1621854.

[9] R.A. Sauer. Multiscale modelling and simulation of the deformation and adhesion of a single gecko seta. Comput. Methods Biomech. Biomed. Engin., 12(6):627–640, 2009. doi:10.1080/10255840902802917.

[10] D. Labonte and W. Federle. Biomechanics of shear-sensitive adhesion in climbing animals: peeling, pre-tension and sliding-induced changes in interface strength. J. R. Soc. Interface, 13(122):20160373, 2016. doi:10.1098/rsif.2016.0373.

[11] W. Federle and D. Labonte. Dynamic biological adhesion: mechanisms for controlling attachment during locomotion. Philos. Trans. R. Soc. B Biol. Sci., 374(1784):20190199, 2019. doi:10.1098/rstb.2019.0199.

[12] Y. Tian, N. Pesika, H. Zeng, K. Rosenberg, B. Zhao, P. McGuiggan, K. Autumn, and J. Israelachvili. Adhesion and friction in gecko toe attachment and detachment. Proc. Natl. Acad. Sci., 103(51):19320–19325, 2006. doi:10.1073/pnas.0608841103.

[13] N.S. Pesika, Y. Tian, B. Zhao, K. Rosenberg, H. Zeng, P. McGuiggan, K. Autumn, and J.N. Israelachvili. Peel-Zone Model of Tape Peeling Based on the Gecko Adhesive System. J. Adhes., 83(4):383–401, 2007. doi:10.1080/00218460701282539.

[14] Z.L. Peng, S.H. Chen, and A.K. Soh. Peeling behavior of a bio-inspired nano-film on a substrate.
Int. J. Solids Struct., 47(14-15):1952–1960, 2010. doi:10.1016/j.ijsolstr.2010.03.035.

[15] R.A. Sauer. The Peeling Behavior of Thin Films with Finite Bending Stiffness and the Implications on Gecko Adhesion. J. Adhes., 87(7-8):624–643, 2011. doi:10.1080/00218464.2011.596084.

[16] K. Autumn, Y.A. Liang, S.T. Hsieh, W. Zesch, W.P. Chan, T.W. Kenny, R. Fearing, and R.J. Full. Adhesive force of a single gecko foot-hair. Nature, 405(6787):681–685, 2000. doi:10.1038/35015073.

[17] K. Takahashi, J.O.L. Berengueres, K.J. Obata, and S. Saito. Geckos' foot hair structure and their ability to hang from rough surfaces and move quickly. Int. J. Adhes. Adhes., 26(8):639–643, 2006. doi:10.1016/j.ijadhadh.2005.12.002.

[18] B. Chen, P. Wu, and H. Gao. Pre-tension generates strongly reversible adhesion of a spatula pad on substrate. J. R. Soc. Interface, 6(35):529–537, 2009. doi:10.1098/rsif.2008.0322.

[19] K. Autumn. Mechanisms of Adhesion in Geckos. Integr. Comp. Biol., 42(6):1081–1090, 2002. doi:10.1093/icb/42.6.1081.

[20] G. Huber, S.N. Gorb, R. Spolenak, and E. Arzt. Resolving the nanoscale adhesion of individual gecko spatulae by atomic force microscopy. Biol. Lett., 1(1):2–4, 2005. doi:10.1098/rsbl.2004.0254.

[21] W. Sun, P. Neuzil, T.S. Kustandi, S. Oh, and V.D. Samper. The Nature of the Gecko Lizard Adhesive Force. Biophys. J., 89(2):L14–L17, 2005. doi:10.1529/biophysj.105.065268.

[22] H. Gao, X. Wang, H. Yao, S. Gorb, and E. Arzt. Mechanics of hierarchical adhesion structures of geckos. Mech. Mater., 37(2-3):275–285, 2005. doi:10.1016/j.mechmat.2004.03.008.

[23] R.A. Sauer and M. Holl. A detailed 3D finite element analysis of the peeling behaviour of a gecko spatula. Comput. Methods Biomech. Biomed. Engin., 16(6):577–591, 2013. doi:10.1080/10255842.2011.628944.

[24] K. Autumn, A. Dittmore, D. Santos, M. Spenko, and M. Cutkosky. Frictional adhesion: a new angle on gecko attachment. J. Exp. Biol., 209(18):3569–3579, 2006. doi:10.1242/jeb.02486.

[25] Q.H. Cheng, B. Chen, H.J. Gao, and Y.W. Zhang. Sliding-induced non-uniform pre-tension governs robust and reversible adhesion: a revisit of adhesion mechanisms of geckos. J. R. Soc. Interface, 9(67):283–291, 2012. doi:10.1098/rsif.2011.0254.

[26] Z. Peng and S. Chen. Effect of pre-tension on the peeling behavior of a bio-inspired nano-film and a hierarchical adhesive structure. Appl. Phys. Lett., 101(16):163702, 2012. doi:10.1063/1.4758481.

[27] S. Hu, S. Lopez, P.H. Niewiarowski, and Z. Xia. Dynamic self-cleaning in gecko setae via digital hyperextension.
J. R. Soc. Interface, 9(76):2781–2790, 2012. doi:10.1098/rsif.2012.0108.

[28] M.R. Begley, R.R. Collino, J.N. Israelachvili, and R.M. McMeeking. Peeling of a tape with large deformations and frictional sliding. J. Mech. Phys. Solids, 61(5):1265–1279, 2013. doi:10.1016/j.jmps.2012.09.014.

[29] R.R. Collino, N.R. Philips, M.N. Rossol, R.M. McMeeking, and M.R. Begley. Detachment of compliant films adhered to stiff substrates via van der Waals interactions: role of frictional sliding during peeling. J. R. Soc. Interface, 11(97):20140453, 2014. doi:10.1098/rsif.2014.0453.

[30] S. Gouravaraju, R.A. Sauer, and S.S. Gautam. Investigating the normal and tangential peeling behaviour of gecko spatulae using a coupled adhesion-friction model. J. Adhes., pages 1–32, 2020. doi:10.1080/00218464.2020.1719838.

[31] J.A. Williams and J.J. Kauzlarich. The influence of peel angle on the mechanics of peeling flexible adherends with arbitrary load-extension characteristics. Tribol. Int., 38(11-12):951–958, 2005. doi:10.1016/j.triboint.2005.07.024.

[32] Z. Peng, H. Yin, Y. Yao, and S. Chen. Effect of thin-film length on the peeling behavior of film-substrate interfaces. Phys. Rev. E, 100(3):032804, 2019. doi:10.1103/PhysRevE.100.032804.

[33] A.M. Garner, A.Y. Stark, S.A. Thomas, and P.H. Niewiarowski. Geckos go the Distance: Water's Effect on the Speed of Adhesive Locomotion in Geckos. J. Herpetol., 51(2):240–244, 2017. doi:10.1670/16-010.

[34] R.A. Sauer. A Survey of Computational Models for Adhesion. J. Adhes., 92(2):81–120, 2016. doi:10.1080/00218464.2014.1003210.

[35] G. Yagawa and H. Okuda. Neural networks in computational mechanics. Arch. Comput. Methods Eng., 3(4):435–512, 1996. doi:10.1007/BF02818935.

[36] R. Hambli. Numerical procedure for multiscale bone adaptation prediction based on neural networks and finite element simulation. Finite Elem. Anal. Des., 47(7):835–842, 2011. doi:10.1016/j.finel.2011.02.014.

[37] N. Sengupta, Md. Sahidullah, and G. Saha. Lung sound classification using cepstral-based statistical features. Comput. Biol. Med., 75:118–129, 2016. doi:10.1016/j.compbiomed.2016.05.013.

[38] E. Alizadeh, S.M. Lyons, J.M. Castle, and A. Prasad. Measuring systematic changes in invasive cancer cell shape using Zernike moments. Integr. Biol., 8(11):1183–1193, 2016. doi:10.1039/C6IB00100A.

[39] A. Oishi and G. Yagawa. Computational mechanics enhanced by deep learning. Comput. Methods Appl. Mech. Eng., 327:327–351, 2017. doi:10.1016/j.cma.2017.08.040.

[40] S. Islam and A. Kim. Machine Learning Enabled Wearable Brain Deformation Sensing System. In 2019 IEEE Signal Processing in Medicine and Biology Symposium (SPMB), pages 1–3. IEEE, 2019. doi:10.1109/SPMB47826.2019.9037843.

[41] B. Sattari Baboukani, Z. Ye, K.G. Reyes, and P.C. Nalam. Prediction of nanoscale friction for two-dimensional materials using a machine learning approach.
Tribol. Lett., 68:57, 2020. doi:10.1007/s11249-020-01294-w.

[42] I.E. Lagaris, A. Likas, and D.I. Fotiadis. Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans. Neural Networks, 9(5):987–1000, 1998. doi:10.1109/72.712178.

[43] S.J.S. Hakim and H. Abdul Razak. Structural damage detection of steel bridge girder using artificial neural networks and finite element models. Steel Compos. Struct., 14(4):367–377, 2013. doi:10.12989/scs.2013.14.4.367.

[44] H. Sadegh, A.N. Mehdi, and A. Mehdi. Classification of acoustic emission signals generated from journal bearing at different lubrication conditions based on wavelet analysis in combination with artificial neural network and genetic algorithm. Tribol. Int., 95:426–434, 2016. doi:10.1016/j.triboint.2015.11.045.

[45] L. Liang, M. Liu, C. Martin, and W. Sun. A deep learning approach to estimate stress distribution: a fast and accurate surrogate of finite-element analysis. J. R. Soc. Interface, 15(138):20170844, 2018. doi:10.1098/rsif.2017.0844.

[46] J. Zurada. Introduction to Artificial Neural Systems. West Publishing Co., USA, 1992.

[47] S. Yoshimura, A. Matsuda, and G. Yagawa. New regularization by transformation for neural network based inverse analyses and its application to structure identification. Int. J. Numer. Methods Eng., 39(23):3953–3968, 1996. doi:10.1002/(SICI)1097-0207(19961215)39:23<…>.

[48] Int. J. Numer. Methods Eng., 75(11):1341–1360, 2008. doi:10.1002/nme.2304.

[49] T. Furukawa and G. Yagawa. Implicit constitutive modelling for viscoplasticity using neural networks. Int. J. Numer. Methods Eng., 43(2):195–219, 1998. doi:10.1002/(SICI)1097-0207(19980930)43:2<…>.

[50] Comput. Methods Appl. Mech. Eng., 191(3-5):353–384, 2001. doi:10.1016/S0045-7825(01)00278-X.

[51] H. Man and T. Furukawa. Neural network constitutive modelling for non-linear characterization of anisotropic materials. Int. J. Numer. Methods Eng., 85(8):939–957, 2011. doi:10.1002/nme.2999.

[52] S.W. Liu, J.H. Huang, J.C. Sung, and C.C. Lee. Detection of cracks using neural networks and computational mechanics. Comput. Methods Appl. Mech. Eng., 191(25-26):2831–2845, 2002. doi:10.1016/S0045-7825(02)00221-9.

[53] J. Zacharias, C. Hartmann, and A. Delgado. Damage detection on crates of beverages by artificial neural networks trained with finite-element data. Comput. Methods Appl. Mech. Eng., 193(6-8):561–574, 2004. doi:10.1016/j.cma.2003.10.009.

[54] A. Oishi and S. Yoshimura. A new local contact search method using a multi-layer neural network. Comput. Model. Eng. Sci., 2007. doi:10.3970/cmes.2007.021.093.

[55] A. Oishi and G. Yagawa. A surface-to-surface contact search method enhanced by deep learning.
Comput. Mech., 65(4):1125–1147, 2020. doi:10.1007/s00466-019-01811-2.

[56] L. Manevitz, M. Yousef, and D. Givoli. Finite-Element Mesh Generation Using Self-Organizing Neural Networks. Comput. Civ. Infrastruct. Eng., 12(4):233–250, 1997. doi:10.1111/0885-9507.00060.

[57] L.A. Gyurova and K. Friedrich. Artificial neural networks for predicting sliding friction and wear properties of polyphenylene sulfide composites. Tribol. Int., 44(5):603–609, 2011. doi:10.1016/j.triboint.2010.12.011.

[58] K.M. Hamdia, H. Ghasemi, Y. Bazi, H. AlHichri, N. Alajlan, and T. Rabczuk. A novel deep learning based method for the computational material design of flexoelectric nanostructures with topology optimization. Finite Elem. Anal. Des., 2019. doi:10.1016/j.finel.2019.07.001.

[59] D. Nowell and P.W. Nowell. A machine learning approach to the prediction of fretting fatigue life. Tribol. Int., 141:105913, 2020. doi:10.1016/j.triboint.2019.105913.

[60] I.I. Argatov and Y.S. Chai. An artificial neural network supported regression model for wear rate. Tribol. Int., 138:211–214, 2019. doi:10.1016/j.triboint.2019.05.040.

[61] D.J.C. MacKay. A Practical Bayesian Framework for Backpropagation Networks. Neural Comput., 4(3):448–472, 1992. doi:10.1162/neco.1992.4.3.448.

[62] F. Burden and D. Winkler. Bayesian Regularization of Neural Networks. In Methods Mol. Biol., pages 23–42. 2008. doi:10.1007/978-1-60327-101-1_3.

[63] L.M. Saini and M.K. Soni. Artificial neural network based peak load forecasting using Levenberg-Marquardt and quasi-Newton methods. IEE Proc. - Gener. Transm. Distrib., 149(5):578, 2002. doi:10.1049/ip-gtd:20020462.

[64] M. Kayri. Predictive Abilities of Bayesian Regularization and Levenberg-Marquardt Algorithms in Artificial Neural Networks: A Comparative Empirical Study on Social Data. Math. Comput. Appl., 21(2):20, 2016. doi:10.3390/mca21020020.

[65] M. Lefik and B.A. Schrefler. Artificial neural network as an incremental non-linear constitutive model for a finite element code. Comput. Methods Appl. Mech. Eng., 192(28-30):3265–3283, 2003. doi:10.1016/S0045-7825(03)00350-5.

[66] H.-F. Li and S.-Y. Lee. Mining frequent itemsets over data streams using efficient window sliding techniques. Expert Syst. Appl., 36(2):1466–1477, 2009. doi:10.1016/j.eswa.2007.11.061.

[67] J.L. Ticknor. A Bayesian regularized artificial neural network for stock market forecasting.
Expert Syst. Appl., 40(14):5501–5506, 2013. doi:10.1016/j.eswa.2013.04.013.

[68] S. Koroglu, P. Sergeant, and N. Umurkan. Comparison of analytical, finite element and neural network methods to study magnetic shielding. Simul. Model. Pract. Theory, 18(2):206–216, 2010. doi:10.1016/j.simpat.2009.10.007.

[69] K. Yang and M. Xiong. Prediction of CH4 adsorption on different activated carbons by developing an optimal multilayer perceptron artificial neural network. Energy Sources, Part A Recover. Util. Environ. Eff., 41(17):2061–2072, 2019. doi:10.1080/15567036.2018.1549161.

[70] S. Gouravaraju, R.A. Sauer, and S.S. Gautam. On the presence of a critical detachment angle in gecko spatula peeling - a numerical investigation using an adhesive friction model. J. Adhes., 2020. doi:10.1080/00218464.2020.1746652.

[71] R.A. Sauer and S. Li. A contact mechanics model for quasi-continua. Int. J. Numer. Methods Eng., 71(8):931–962, 2007. doi:10.1002/nme.1970.

[72] J.C. Mergel, R. Sahli, J. Scheibert, and R.A. Sauer. Continuum contact models for coupled adhesion and friction. J. Adhes., 95(12):1101–1133, 2019. doi:10.1080/00218464.2018.1479258.

[73] R.A. Sauer and P. Wriggers. Formulation and analysis of a three-dimensional finite element implementation for adhesive contact at the nanoscale. Comput. Methods Appl. Mech. Eng., 198(49-52):3871–3883, 2009. doi:10.1016/j.cma.2009.08.019.

[74] J. Bonet and R.D. Wood. Nonlinear Continuum Mechanics for Finite Element Analysis. Cambridge University Press, London, 2nd edition, 2008.

[75] H.B. Demuth, M.H. Beale, O. De Jess, and M.T. Hagan. Neural Network Design. Martin Hagan, 2nd edition, 2014.

[76] A.N. Tikhonov. Solution of ill-posed problems and the regularization method. Dokl. Akad. Nauk SSSR, 151:501–504, 1963.

[77] F. Dan Foresee and M.T. Hagan. Gauss-Newton approximation to Bayesian learning. In Proceedings of International Conference on Neural Networks (ICNN'97), volume 3, pages 1930–1935. IEEE, 1997.