Spatio-Temporal Activation Function To Map Complex Dynamical Systems
Parth Mahendra
Reading School, Erleigh Road, Reading, Berkshire, RG1 5LW, UK.
Most of the real world is governed by complex and chaotic dynamical systems. All of these dynamical systems pose a challenge in modelling them using neural networks. Currently, reservoir computing, which is a subset of recurrent neural networks, is actively used to simulate complex dynamical systems. In this work, a two-dimensional activation function is proposed which includes an additional temporal term to impart dynamic behaviour on its output. The inclusion of a temporal term alters the fundamental nature of an activation function: it provides the capability to capture the complex dynamics of time series data without relying on recurrent neural networks.

a) From 27th September 2020, Department of Computer Science, University of Warwick, CV4 7AL, UK.

I. INTRODUCTION

Most of the real world is governed by complex and chaotic dynamical systems, such as vibration engineering, cardiac arrhythmia and stock prices. All of these dynamical systems pose a challenge when modelling them using neural networks.
There are several types of neural networks, such as feed-forward, CNN (Convolutional Neural Networks), RNN (Recurrent Neural Networks), autoencoders, etc. Feed-forward and convolutional neural networks are made up of neurons connected in layers, which take inputs from the previous layer and pass on the output to the next layer. Recurrent neural networks differ from both feed-forward and convolutional neural networks in that their neurons are connected in a "recurrent" manner and may have feedback loops between sets of neurons. The feedback loop gives the network additional capability to capture temporal dynamics and learn time series data. Reservoir computing, which is a subset of recurrent neural networks, was first proposed by Jaeger. In the reservoir computing approach, a large pool of neurons acts as a "reservoir" in which the input data is imparted a dynamic behaviour to generate the output.
The primary motivation behind this research is to model complex dynamics without relying on a recurrent neural network, by including a temporal part in an activation function. The activation function fluctuates based on a time-dependent parameter that can be generated in different ways, such as the logistic or cubic map.

II. SPATIO-TEMPORAL ACTIVATION FUNCTION
Each type of network may differ in structural topology, but all share the same basic building blocks: the neuron and an activation function. In neural networks, the activation function acts as a way to map the weighted sum of the inputs going into each neuron to a useful output. There are four major types of activation functions: binary step, sigmoid, tanh, and ReLU. These functions vary in different ways; for example, the sigmoid function generates output that is normalised between 0 and 1, while the tanh output is normalised between -1 and 1. The activation function proposed in this work is a function of the input z and also of a temporal term. The proposed spatio-temporal function S(z, t) is

    S(z, t) = \frac{1}{1 + \exp(-\phi(t) z)}    (1)

The derivative of S(z, t) with respect to the input z is

    \frac{dS(z, t)}{dz} = \phi(t) S(z, t) (1 - S(z, t))    (2)

The derivative of the new activation function, S(z, t), is similar to the Gaussian probability distribution, and the temporal term φ(t) plays the role of varying its standard deviation. Figure 1 shows the activation function and its derivative for constant φ(t) = 1 and φ(t) = 2.8.

FIG. 1. Activation function and its derivative for φ(t) = 1 and φ(t) = 2.8.
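As a concrete illustration, the following is a minimal NumPy sketch of Eqs. (1) and (2); the function names are illustrative and not part of the original work.

import numpy as np

def spatio_temporal_sigmoid(z, phi_t):
    # Eq. (1): a sigmoid whose steepness is set by the temporal term phi(t)
    return 1.0 / (1.0 + np.exp(-phi_t * z))

def spatio_temporal_sigmoid_grad(z, phi_t):
    # Eq. (2): dS/dz = phi(t) * S * (1 - S), a bell-shaped curve whose width
    # varies with phi(t), much like the standard deviation of a Gaussian
    s = spatio_temporal_sigmoid(z, phi_t)
    return phi_t * s * (1.0 - s)

# Sample the two curves of Figure 1 at a few points
z = np.linspace(-6.0, 6.0, 7)
for phi in (1.0, 2.8):
    print(phi, spatio_temporal_sigmoid(z, phi), spatio_temporal_sigmoid_grad(z, phi))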
The temporal term φ(t) in the present case is a linear function of a time-dependent parameter α(t), as follows

    \phi(t) = \phi_0 + k \alpha(t)    (3)

where φ_0 and k are constants. Figure 4 shows the schematic of the activation function. In this case, α(t) can follow the dynamics of the logistic map with growth parameter r, as follows

    \alpha(t + 1) = r \alpha(t) (1 - \alpha(t))    (4)

The function f(α) = rα(1 - α) is the logistic map, f : [0, 1] → [0, 1] for 0 < r ≤ 4, giving rise to chaotic dynamical behaviour as shown in Figure 2.

FIG. 2. Chaotic dynamical plot of α(t) for r = 4.
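The chaotic driving signal of Figure 2 can be reproduced with a few lines of Python; a minimal sketch, in which the initial value and step count are arbitrary choices:

def logistic_map_series(r, alpha0=0.3, n_steps=1000):
    # Iterate Eq. (4): alpha(t+1) = r * alpha(t) * (1 - alpha(t))
    series = [alpha0]
    for _ in range(n_steps - 1):
        series.append(r * series[-1] * (1.0 - series[-1]))
    return series

# r = 4 puts the map in the fully chaotic regime plotted in Figure 2
alpha_t = logistic_map_series(r=4.0, n_steps=200)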
It should be noted that Eqn. (4) governs the evolution of α(t) in time, while Eqn. (2) is the derivative of the sigmoid function, and both follow a similar functional form, one in the time domain t and the other in the input domain z. Interestingly, the derivative of the activation function in Eqn. (2) is also of the form of the logistic map. The temporal function φ(t) is normalised between (φ_min, φ_max) as follows

    \phi(t) = \phi_{min} + \frac{\alpha(t) - \alpha_{min}}{\alpha_{max} - \alpha_{min}} (\phi_{max} - \phi_{min})    (5)

where α_max and α_min are the maximum and minimum values taken by the function α(t).
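Eq. (5) is a standard min-max rescaling; a minimal sketch follows, in which the φ bounds are placeholder values:

def phi_series(alpha_t, phi_min, phi_max):
    # Eq. (5): rescale alpha(t) from [alpha_min, alpha_max] onto [phi_min, phi_max]
    a_min, a_max = min(alpha_t), max(alpha_t)
    scale = (phi_max - phi_min) / (a_max - a_min)
    return [phi_min + (a - a_min) * scale for a in alpha_t]

# Rescale a logistic-map orbit (r = 4, so alpha spans roughly [0, 1])
alpha_t, a = [], 0.3
for _ in range(200):
    alpha_t.append(a)
    a = 4.0 * a * (1.0 - a)
phi_t = phi_series(alpha_t, phi_min=0.9, phi_max=1.0)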
III. RESPONSE OF SPATIO-TEMPORAL ACTIVATION FUNCTION

In the case of the logistic map, when r = 4, α_min is 0 and α_max is 1. These values vary with r and can be read from the bifurcation diagram of the logistic map, shown in Figure 3. The values of φ_max and φ_min depend on the range of the data to be generated: increasing the gap between φ_max and φ_min increases the range of the neuron's output. For example, with α_max = 1 and α_min = 0, the function φ(t) reduces to

    \phi(t) = \phi_{min} + \alpha(t) (\phi_{max} - \phi_{min})    (6)

FIG. 3. Bifurcation diagram of the logistic map.

Figure 6 shows the output from a neuron displaying the chaotic behaviour using φ_max = 1.0 and φ_min = 0.9.
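Putting Eqs. (1), (4) and (6) together, the response of a single neuron to a constant input can be sketched as follows; the flat input z = 1.0 is an arbitrary illustrative choice:

import numpy as np

def neuron_output(z, r=4.0, phi_min=0.9, phi_max=1.0, n_steps=500, alpha0=0.3):
    # Drive one neuron with a constant input z while phi(t) follows the
    # logistic map; Eq. (6) applies since alpha stays within [0, 1] at r = 4
    alpha, out = alpha0, []
    for _ in range(n_steps):
        phi = phi_min + alpha * (phi_max - phi_min)   # Eq. (6)
        out.append(1.0 / (1.0 + np.exp(-phi * z)))    # Eq. (1)
        alpha = r * alpha * (1.0 - alpha)             # Eq. (4)
    return np.array(out)

series = neuron_output(z=1.0)   # chaotic output from a flat input, cf. Figure 6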
The standard deviation of the output corresponding to this range of φ_max and φ_min is small, as shown in Figure 5. By increasing the range of the bounds (φ_max - φ_min), the chaotic behaviour of the resulting output of the neuron varies with larger deviations around the mean; e.g. for φ_max = 1.0 and φ_min = 0.8, the standard deviation increases proportionately. The time-dependent parameter α(t) can also be generated by the cubic map, x_{t+1} = r x_t - x_t^3, which likewise produces chaotic behaviour for r between approximately 2 and 3, as shown in Figure 8.
As described earlier, α_min and α_max need to be set correctly by checking the range of the bifurcation diagram of the particular chaotic function being used; they can be worked out by looking at the maximum and minimum values of the diagram for the chosen r values. Table I gives α_min and α_max for different values of r for the logistic map, x_{t+1} = r x_t (1 - x_t). Table II gives the α_min and α_max values for the cubic map, x_{t+1} = r x_t - x_t^3. Both of these are rough estimates measured by generating time series data for different values of r, which can be improved by generating longer time series data.

FIG. 4. Neuron with the spatio-temporal activation function.

TABLE I. α_max and α_min for different values of r in the logistic map.

TABLE II. α_max and α_min for different values of r in the cubic map.
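The rough estimation procedure behind Tables I and II can be reproduced as below; this is a sketch, and the r values passed in are illustrative:

def orbit_extremes(step, r, x0, n_transient=1_000, n_samples=100_000):
    # Empirically estimate alpha_min / alpha_max for a 1-D map at a given r:
    # discard the transient, then track the extreme values visited by the orbit
    x = x0
    for _ in range(n_transient):
        x = step(x, r)
    lo = hi = x
    for _ in range(n_samples):
        x = step(x, r)
        lo, hi = min(lo, x), max(hi, x)
    return lo, hi

logistic = lambda x, r: r * x * (1.0 - x)   # map behind Table I
cubic = lambda x, r: r * x - x**3           # map behind Table II
print(orbit_extremes(logistic, r=4.0, x0=0.3))   # approximately (0, 1)
print(orbit_extremes(cubic, r=2.9, x0=0.5))

As the text notes, longer orbits (larger n_samples) tighten these estimates.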
IV. NUMERICAL EXPERIMENT

A numerical experiment was done with a flat time series as the input and a chaotic time series as the output, as shown in Figure 9. A single neuron using the proposed spatio-temporal activation function was trained against this chaotic data while the input was held flat, with φ_max set to approximately 3, a negative φ_min, and α_max and α_min of 1 and 0 respectively. After training, the same flat input data was fed to the neural network, and the generated output time series compared well with the actual chaotic time series data, as shown in Figure 10.

FIG. 5. Standard deviation corresponding to different ranges of (φ_max - φ_min).

FIG. 6. Output of the neuron displaying the chaotic dynamical behaviour using flat input data.

FIG. 7. Autocorrelation of the resulting output of the neuron.

FIG. 8. Bifurcation diagram of the cubic map.

FIG. 9. Chaotic time series and flat input data.

FIG. 10. Chaotic time series vs. output of the neural network.

FIG. 11. Instances of the activation function with different φ values during the generation of the chaotic time series output.

In this study, the Lyapunov exponent, λ, is used as a measure of chaos. The Lyapunov exponent of the function f(x) is defined as follows:

    \lambda = \frac{1}{n} \sum_{i=0}^{n-1} \ln |f'(x_i)|    (7)

where x_{i+1} = f(x_i).
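A direct implementation of Eq. (7), shown here for the logistic map at r = 4 (where the exact value is ln 2 ≈ 0.693), is a useful sanity check; the map and starting point are illustrative choices:

import math

def lyapunov_exponent(f, f_prime, x0, n=100_000, n_transient=1_000):
    # Eq. (7): average ln|f'(x_i)| along the orbit x_{i+1} = f(x_i)
    # (a guard against f'(x) = 0 is omitted for brevity)
    x = x0
    for _ in range(n_transient):
        x = f(x)
    total = 0.0
    for _ in range(n):
        total += math.log(abs(f_prime(x)))
        x = f(x)
    return total / n

r = 4.0
lam = lyapunov_exponent(lambda x: r * x * (1.0 - x),    # logistic map
                        lambda x: r * (1.0 - 2.0 * x),  # its derivative
                        x0=0.3)
print(lam)   # approximately ln 2 = 0.693; a positive value indicates chaos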
For the chaotic data, the Lyapunov exponent is λ = -1.01, while the neural network's output gives λ = -0.98. The close agreement shows that the new activation function is able to map the chaotic time series data successfully. Figure 11 shows the different instances of the activation function while generating the output time series depicted in Figure 10. An important thing to notice is that the activation function occasionally flips, because the φ_min taken in this example is negative. Though it may seem odd, this flipping helps the neuron generate chaotic data that matches the range of the expected output.

V. CONCLUSION
In the present research work, a two-dimensional activation function is proposed by including an additional temporal term to impart dynamic behaviour on the output without relying on recurrent neural networks. The temporal term can follow the logistic map, the cubic map, or any other chaotic map. The derivative at any instance of the activation function is similar to the Gaussian probability distribution with a varying, time-dependent standard deviation. Consequently, when the temporal term changes with time, the activation function fluctuates, leading to dynamical behaviour in the output. The new activation function is able to successfully map the chaotic data. Further research is required to investigate the spatio-temporal activation function in a multilayered feed-forward neural network and its ability to capture chaos using the logistic/cubic map.
REFERENCES

H. Jaeger, "Short term memory in echo state networks," GMD-Report 152, German National Research Institute for Computer Science, 2002.

H. Jaeger and H. Haas, "Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication," Science, 304, 2004.

S. Herzog, F. Wörgötter and U. Parlitz, "Data-Driven Modeling and Prediction of Complex Spatio-Temporal Dynamics in Excitable Media," Frontiers in Applied Mathematics and Statistics, 4, 2018.

L. M. Berliner, "Statistics, Probability and Chaos," Statistical Science, 7, 1992.