Brain-Inspired Stigmergy Learning
Xing Hsu, Zhifeng Zhao, Rongpeng Li, Honggang Zhang
College of Information Science and Electronic Engineering, Zhejiang University, Zheda Road 38, Hangzhou 310027, China
Emails: {hsuxing, zhaozf, lirongpeng, honggangzhang}@zju.edu.cn

Abstract—Stigmergy has proved its great superiority in terms of distributed control, robustness and adaptability, and is thus regarded as an ideal solution for large-scale swarm control problems. Based on new discoveries on the role of astrocytes in regulating synaptic transmission in the brain, this paper maps the stigmergy mechanism onto the interaction between synapses and investigates its characteristics and advantages. In particular, we divide the interaction between synapses that are not directly connected into three phases and propose a stigmergic learning model. In this model, the state change of a stigmergy agent expands its influence to affect the states of others. The strength of the interaction is determined by the level of neural activity as well as the distance between stigmergy agents. Inspired by the morphological and functional changes in astrocytes during environmental enrichment, it is likely that the regulation of the distance between stigmergy agents plays a critical role in the stigmergy learning process. Simulation results verify its importance and indicate that a well-regulated distance between stigmergy agents can help to obtain stigmergy learning gain.
Index Terms—Stigmergy, Astrocytes, Synapses, Calcium Waves, Neural Networks, Artificial Intelligence, Machine Learning
I. INTRODUCTION
Stigmergy was first introduced by the French entomologist Pierre-Paul Grassé in the 1950s [1] [2] when studying the behavior of social insects. The word stigmergy is a combination of the Greek words “stigma” (outstanding sign) and “ergon” (work), indicating that some activities of agents are triggered by external signs, which themselves may be generated by agent activities [3]. Stigmergy allowed Grassé to explain why insects of very limited intelligence, without apparent communication, can collaboratively tackle complex tasks, such as building a nest. Another definition, given by Heylighen [4], is that stigmergy is an indirect, mediated mechanism of coordination between actions in which a perceived effect of an action stimulates the performance of a subsequent action.

Stigmergy has been widely studied in the behavior of social insects [5] [6]. But this concept is rarely mentioned in connection with the brain, which has been regarded as the most complex system so far. As important glial cells in the brain's Central Nervous System (CNS), astrocytes are traditionally placed in a subservient position, supporting the physiology of the associated neurons. However, recent experimental neuroscience evidence indicates that astrocytes also interact closely with neurons and participate in the regulation of synaptic neurotransmission [7]. This evidence has motivated new perspectives for the research of stigmergy in the brain.

There are a large number of complicated biochemical reactions between synapses and astrocytes that support the implementation of various brain functions. Basically, each astrocyte contains hundreds or thousands of branch microdomains, and each of them encloses a synapse [8], as illustrated in Fig. 1. Synaptic activity elevates the concentration of Ca$^{2+}$ in the corresponding microdomain. Ca$^{2+}$ and inositol 1,4,5-trisphosphate (IP$_3$), as important messengers within astrocytes, are believed to expand the influence of synaptic activities [9].
The range of influence is determined by the level of synaptic activity as well as the distance between coupled branch microdomains. Besides, the branch microdomains with an elevated concentration of Ca$^{2+}$ provide an adjustment for the synapses they wrap. Therefore, as explained in Section II, general stigmergy can be mapped onto this neural process, in which astrocytes play the role of medium carriers that provide adjustments for the involved synapses.

The interaction between several synapses, which is mainly mediated by the propagation of Ca$^{2+}$ in the cytosol within astrocytes, can be divided into three important phases. These phases are described concretely in Section III so as to constitute the stigmergic system model. In this fundamental model, the strength of interaction between stigmergy agents is determined by the level of stimulation as well as the distance between them, which is consistent with the strength of interaction between synapses via astrocytes. Inspired by the morphological and functional changes in astrocytes during environmental enrichment, it is likely that the regulation of the distance between stigmergy agents is critical to obtaining stigmergy learning gain. Accordingly, the importance of this regulation is verified in two different stigmergic scenarios.

The remainder of this paper is organized as follows. In Section II, new discoveries on the role of astrocytes in regulating synaptic transmission are introduced and the existence of general stigmergy in the brain is explored. In Section III, three important phases within the interaction between synapses are described. This interaction is used as the reference for setting up the stigmergic system model and learning algorithm. In Section IV, simulations of the proposed stigmergy learning are carried out in two different stigmergic scenarios in order to verify its effectiveness and advantages. Finally, we conclude this paper with a summary.

II. STIGMERGY IN THE BRAIN
In CNS, general stigmergy can be mapped onto the interaction between synapses, which is mainly mediated by Ca$^{2+}$ within astrocytes. As important medium carriers, astrocytes are coupled together by gap junctions to provide nervous regulation.

A. Glial Cells in CNS
In the process of nerve conduction, action potentials, represented by the purple dotted arrow in Fig. 1, are conducted along the axon to the pre-synaptic terminal. Then a large quantity of neurotransmitters is released into the synaptic cleft through exocytosis. These molecules diffuse and bind with various receptors on the surface of the post-synaptic terminal. Besides, they can also diffuse and bind with receptors of surrounding glial cells, which release neuromodulators in return [10]. In essence, there are three types of glial cells in CNS: microglia, oligodendrocytes, and astrocytes [11].

Microglia, as illustrated in Fig. 1, are the macrophages of the CNS. Their key roles are immune surveillance as well as responding to infections or other pathological states such as neurological diseases or injury [12] [13]. For synaptic activity, microglia play the role of supervision and protection.

Oligodendrocytes can contribute to the plasticity of nervous systems in the process of nerve conduction. An action potential needs a certain amount of time to reach the pre-synaptic terminal. Many factors affect the conduction velocity, such as the thickness of the myelin sheath, the axon diameter, and the spacing and width of the Ranvier nodes [14]. Increasing the thickness of the myelin sheath can significantly improve the velocity, which helps to form saltatory conduction. In this way, high-speed nerve pulses jump along the axon towards the pre-synaptic terminal, leading to faster conduction. Oligodendrocytes play a critical role in this process because they can regulate the production of lecithin, which is an important substance in the composition of myelin [15], as illustrated in Fig. 1.
In this sense, the process of nerve conduction can be seen as the adjustment of the arrival time of different nerve pulses, which can be achieved by continuously changing the thickness of the myelin sheath on each axon branch.

Astrocytes are enriched with various receptors on their surface in order to support the implementation of different functions [16]. The fact that the synaptic terminals as well as the cleft are wrapped by surrounding astrocytes gives rise to the structure of the tripartite synapse [17], which is illustrated in detail in Fig. 1. In Fig. 1, the pre- and post-synaptic terminals are represented by the blue parts. The branch microdomain within astrocytes is represented by the yellow part. The transient calcium elevation in the microdomain may result from the binding of glutamate (Glu), which is released from the pre-synaptic terminal, or from the propagation of calcium waves from other microdomains. There are a large number of biochemical interactions between astrocytes and synapses, and various neuromodulators are released from astrocytes due to the transient calcium elevation. These neuromodulators can act on purinergic A$_1$ (or A$_{2A}$) receptors on the pre-synaptic terminal to reduce (or increase) the amount of exocytosis. Besides, Ca$^{2+}$ at high concentration can diffuse to the other microdomains in the form of calcium waves within astrocytes [18]. In this way, synapses wrapped by different branch microdomains can interact with each other, with the propagation of calcium waves constituting the main means of communication.

B. Regulation with Two Different Types
When an action potential reaches the pre-synaptic terminal, a large quantity of Glu is released into the synaptic cleft. These molecules diffuse and act on metabotropic glutamate receptors (mGluRs) located at adjacent branch microdomains, evoking the production of a fixed amount of IP$_3$ [19]. This process is schematically illustrated in the upper part of Fig. 2. The concentration of IP$_3$ within astrocytes is believed to be the key factor in evoking the elevation of intracellular calcium [9]. Moreover, as shown in Fig. 2, IP$_3$ is considered to be the second messenger that triggers the release of Ca$^{2+}$ from the endoplasmic reticulum (ER). The ER can be considered as a reservoir with a higher concentration of Ca$^{2+}$ than that in the cytosol.

A basic model in [20] has been used to describe the dynamics of Ca$^{2+}$ in the cytosol due to the binding of IP$_3$ with IP$_3$ receptors (IP$_3$Rs) in the ER. There are three flows, which are shown in the ER area in Fig. 2. $J_{Leak}$ represents the leakage flux of Ca$^{2+}$ from the ER into the cytosol, which is directly proportional to the concentration gradient of Ca$^{2+}$ between the ER and the cytosol. $J_{Pump}$ represents the pump flux from the cytosol into the ER, which consumes energy to maintain a concentration gradient. $J_{Channel}$ represents the channel flux from the ER into the cytosol, which is generated by the binding of IP$_3$ with IP$_3$Rs. The elevated concentration of Ca$^{2+}$ in the cytosol further increases the open probability of IP$_3$Rs and ryanodine receptors (RyRs) [21], constituting the mechanism known as Calcium-Induced Calcium-Release (CICR). Nevertheless, an excessive concentration of Ca$^{2+}$ in the cytosol brings down the open probability of IP$_3$Rs and RyRs, and the pump flux $J_{Pump}$ becomes the main factor until a concentration gradient is re-established.

Calcium waves can propagate between astrocytes to incur calcium oscillations [22]. Many studies have tried to describe and model the properties of the gap junction between astrocytes [23], as illustrated in Fig. 2.
A large number of observations indicate that the gap junction between astrocytes has a smaller conductance for Ca$^{2+}$ but a larger one for IP$_3$ [24]. Therefore, the above-mentioned IP$_3$ is the main factor promoting the propagation of calcium waves between astrocytes. Besides, the activation of phospholipase Cδ is also required for the propagation of calcium waves [24]. An intuitive map of the transient calcium elevation resulting from IP$_3$ in or between astrocytes is given in Fig. 2.

Fig. 1. An intuitive diagram of the tripartite synapse.

Fig. 2. The transient calcium elevation resulting from IP$_3$.

On the other hand, neuroscience experiments have revealed different characteristics of Ca$^{2+}$ between the soma and the microdomains. Typically, calcium elevations occurring in the microdomains are much more frequent and transient than those in the soma [25]. Researchers in [26] indicated that there should be Transient Receptor Potential Ankyrin type 1 (TRPA1) or receptor-gated Ca$^{2+}$-permeable ion channels in the astrocyte membrane, through which Ca$^{2+}$ could flow into the cell from the extracellular matrix. Recent studies indicate that there are actually two different types of regulation within astrocytes [27]. The short-range regulation in response to a low-intensity stimulus is induced by rapid and short-term calcium elevations. The long-range regulation in response to a high-intensity stimulus is induced by slow and long-term calcium elevations. The former provides regulation locally, within the scale of several synapses, and Ca$^{2+}$ influx through the receptor-gated ion channels can be the main factor. The latter provides regulation among different astrocytes through the propagation of calcium waves across the gap junction, and IP$_3$ can be the main factor. In this paper, we focus our attention on the short-range regulation.

C. Astrocytes as Regulation Networks
Astrocytes occupy a fundamental position in synaptic activity. It is suggested that the efficiency of synaptic transmission through the pre-synaptic terminal is greatly decreased without the calcium signal [10]. A microdomain with elevated calcium generates an effect on the synapse it wraps. Many researchers have tried to decode the calcium signal [28] [19]. Receptors on the membrane of the post-synaptic terminal have low affinity, whereas the interaction between synapses and astrocytes is mediated by receptors with high affinity and slow desensitization. This means that the influence between synapses and astrocytes does not disappear immediately.

TABLE I
THE MAIN SYMBOLS AND ACRONYMS.

Acronym    Description
CNS        Central Nervous System
Glu        Glutamate
IP$_3$     Inositol 1,4,5-trisphosphate
ER         Endoplasmic Reticulum
IP$_3$Rs   Inositol 1,4,5-trisphosphate receptors
RyRs       Ryanodine receptors
CICR       Calcium-Induced Calcium-Release
mGluRs     Metabotropic glutamate receptors
AMPA       α-Amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid
NMDA       N-methyl-D-aspartic acid
TRPA1      Transient Receptor Potential Ankyrin type 1
SOM        Self-Organizing Mapping

In general, the arrival times of consecutive action potentials at a certain synapse can be regarded as a discrete-time pulse sequence. Each of them can change the synaptic state into excitatory or inhibitory. The synaptic state change generates calcium elevations in surrounding microdomains, which provide feedback in return. The synapse gradually recovers towards its original state until the arrival of the next action potential. In this situation, the duration of the elevation depends on the length of the arrival time interval, and a shorter interval produces a longer duration. Therefore, the level of synaptic activity can be measured by the level of calcium elevation. Researchers in [8] found that increasing the level of synaptic activity leads Ca$^{2+}$ to diffuse into the adjacent microdomains, and that a persistently high level eventually fills the whole astrocyte with Ca$^{2+}$, as depicted in Fig. 3. In Fig. 3, (a) represents the response of astrocytes under a low-intensity stimulus. The red solid arrow represents the diffusion direction of Ca$^{2+}$ while the black dotted arrow represents the feedback effect. Fig. 3 (b) is the intermediate result of increasing the intensity of the stimulus. Fig. 3 (c) shows the final diffusion effect caused by a synapse which is stimulated by consecutive action potentials.

Fig. 3. Different levels of calcium elevations caused by different levels of synaptic activities.
Many low-level calcium elevations, which can generate the above-mentioned short-range regulations within the scale of several synapses, can jointly constitute a large state change that can be detected in the whole astrocyte. Furthermore, these calcium elevations form a Ca$^{2+}$ concentration map within astrocytes. This consideration brings about the concept that astrocytes can act as an encoder that encodes the temporal properties of synaptic activities into spatial patterns. Meanwhile, astrocytes can be regarded as a spatial regulation network, in which the activity of a synapse can influence the states of other synapses not only in the adjacent area but also in distant regions with the help of calcium waves. As described in Fig. 4, this regulation network can provide cross regulation for the nervous system. Different from nerve conduction, synapses can, by means of the spatial regulation network of astrocytes, activate neighboring neurons to which they have no direct connection.

Fig. 4. A cross regulation provided by astrocytes for nervous system.
D. Stigmergy in the Brain
In the hippocampal stratum radiatum, detailed 3D reconstruction work shows that 80% of synapses are coupled with the branch microdomains, and astrocytes almost completely wrap synapses which are rich in docked vesicles [29]. A large number of synapses with certain functions are coupled together through astrocytes to form a potentially collaborative nervous system. Calcium waves constitute the main means of communication between synapses which are not directly connected. Accordingly, general stigmergy can be mapped onto the mechanism of cooperative interaction between synapses.

In the brain's nervous system, various synapses can be regarded as different stigmergy agents, and the map of Ca$^{2+}$ concentration within astrocytes can be regarded as the medium. Action potentials can change the synaptic state into excitatory or inhibitory, which generates different levels of calcium elevation in the corresponding microdomains. This process can be regarded as leaving traces in the medium, as in general stigmergy. With the help of Ca$^{2+}$ and IP$_3$, calcium waves can expand their influence throughout astrocytes. The superposition of different calcium elevations is linear, so the effects of local traces can be integrated to adjust the whole stigmergic environment. A comparison between general stigmergy and the mechanism of stigmergic interactions between synapses and astrocytes is shown in Fig. 5.

Fig. 5. A comparison between general stigmergy and the mechanism of stigmergic interactions between synapses and astrocytes.
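The linear superposition of traces described above can be illustrated with a small sketch. It is not the authors' model: the one-dimensional medium, the Gaussian distance decay in `trace`, and all numeric values are assumptions chosen only to show how individual calcium elevations add up into a single concentration map.

```python
import math

def trace(x, source, amplitude, width=1.0):
    """Elevation at position x left by one agent at `source`.
    The Gaussian distance decay is an illustrative assumption."""
    return amplitude * math.exp(-((x - source) ** 2) / (2.0 * width ** 2))

def concentration_map(positions, deposits, width=1.0):
    """Medium state: the linear superposition of all individual traces.
    `deposits` is a list of (source position, amplitude) pairs."""
    return [sum(trace(x, s, a, width) for s, a in deposits) for x in positions]

# Hypothetical 1-D medium with two active agents of different activity levels.
positions = [0.1 * k for k in range(101)]
deposits = [(3.0, 1.0), (7.0, 0.5)]
ca_map = concentration_map(positions, deposits)
```

Because the map is a plain sum, the contribution of each agent can be added or removed independently, which is the property the stigmergic medium relies on.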
Astrocytes can be regarded as significant medium carriers, which maintain the map of Ca$^{2+}$ concentration. Astrocytes can also provide regulation for the involved synapses, whose implementation benefits from a large number of receptors of different types between synapses and astrocytes. This effect can be regarded as the condition provided by the medium for stigmergy agents. Besides, the concentration of Ca$^{2+}$ in astrocytes decays with time, which constitutes a negative feedback loop and provides stability for the nervous system with controlled cycles. Because of the limited range of influence, only regulations reflecting the right condition of the nervous system will superpose and have a longer duration. Through this kind of stigmergic process, astrocytes integrate the calcium elevations generated by different synapses and provide cross-regulation for various individual synapses in the nervous system.

III. STIGMERGY LEARNING MECHANISM AND MODEL
Based on the aforementioned analyses, the interaction process within the scale of several synapses which are not directly connected can be modeled. The implementation of their interactions, which is regarded as the short-range regulation, mainly relies on the propagation of Ca$^{2+}$ throughout astrocytes. Hereinafter, we divide this process into three phases, which are described in Fig. 6.

Fig. 6. Three phases included in the interaction between synapses. They are respectively numbered by I, II and III.

The first phase, which represents the generation of the calcium elevation in the microdomain resulting from the activated synapse, is indicated by I in Fig. 6. At first, the release of Glu due to the arrival of an action potential is modeled by [30]:

$$[T_{Neur}] = \frac{T_{max}}{1 + \exp\left(-\frac{V_d - V_{base}}{K_N}\right)} \tag{1}$$

where $[T_{Neur}]$ is the concentration of Glu in the synaptic cleft, and $T_{max}$ represents its maximum. $V_d$ is the voltage of the dendrite in the Pinsky-Rinzel model [31]. $V_{base}$ and $K_N$ are parameters used to modify the sigmoid function curve. Then Glu diffuses and acts on receptors on the membrane of the branch microdomain to increase the concentration of Ca$^{2+}$:

$$\frac{d[Ca^{2+}]}{dt} = v_{Ca} \ast \frac{[T_{Neur}]^n}{k_{Ca}^n + [T_{Neur}]^n} - \tau_{Ca}\left([Ca^{2+}] - [Ca^{2+}]^{\ast}\right) \tag{2}$$

where $v_{Ca}$ and $k_{Ca}$ are regulating parameters, $n$ is an adjusting factor, $\tau_{Ca}$ is a decay constant, and $[Ca^{2+}]^{\ast}$ represents the concentration of Ca$^{2+}$ at equilibrium in the cytosol. The first term in the equation expresses the increment of the Ca$^{2+}$ concentration in the microdomain. The second term indicates that the concentration also decreases with time because of the concentration gradient of Ca$^{2+}$ between the cytosol and the extracellular matrix.

The second phase, which considers the passive diffusion of Ca$^{2+}$ from one microdomain to others, is indicated by II in Fig. 6. The passive diffusion of Ca$^{2+}$ can be calculated by the Telegraph Equation [32]:

$$\tau_d \frac{\partial^2 c(x,t)}{\partial t^2} + \frac{\partial c(x,t)}{\partial t} = D \nabla^2 c(x,t) + b(x_0,t) \tag{3}$$

where $\tau_d$ is the relaxation factor accounting for a finite propagation speed, $c(x,t)$ is the concentration of Ca$^{2+}$ at location $x$ and time $t$, and $D$ is the diffusion coefficient. Furthermore, $b(x_0,t)$, representing the change rate of the concentration at the initial point, is given by [33]:

$$b(x_0,t) = \frac{dc(x_0,t)}{dt} \tag{4}$$

The third phase, which considers the regulation provided for synapses by astrocytes with elevated calcium, is indicated by III in Fig. 6. The relationship between the concentration of Ca$^{2+}$ in the corresponding microdomain and the amplitude of slow inward currents in the pre-synaptic terminal has been used to describe the regulation [19]:

$$I_{current} = k_I\,\Theta(\ln y)\,\ln y, \qquad y = [Ca^{2+}] - I_{th} \tag{5}$$

With regard to Eq. (5), there is a threshold value $I_{th}$ that the concentration of Ca$^{2+}$ must exceed before a regulation is provided for the pre-synaptic terminal. $k_I$ is a scale factor. $\Theta$ represents the Heaviside function. A natural logarithmic function is used to describe the strength of regulation, which generates an effect until the concentration of Ca$^{2+}$ falls below a certain threshold. Therefore, this function determines a scope for the propagation of calcium waves.

Fig. 7. The strength of synaptic interaction at different distances.
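As a rough illustration of phases I and III, the sketch below integrates Eq. (2) with a forward-Euler step and evaluates the regulation current of Eq. (5). Everything numeric here is an assumption made for the example (parameter values, units, step size), as is the reading of $\tau_{Ca}$ as a decay rate that multiplies the deviation from equilibrium. The diffusion phase of Eq. (3) is left out to keep the sketch short.

```python
import math

def glutamate(v_d, t_max=1.0, v_base=-20.0, k_n=5.0):
    """Eq. (1): sigmoid Glu release as a function of the dendritic voltage."""
    return t_max / (1.0 + math.exp(-(v_d - v_base) / k_n))

def calcium_gain(t_neur, v_ca=1.0, k_ca=0.5, n=2):
    """First term of Eq. (2): Hill-type increase driven by Glu."""
    return v_ca * t_neur ** n / (k_ca ** n + t_neur ** n)

def calcium_step(ca, t_neur, dt, tau_ca=0.5, ca_eq=0.1):
    """One forward-Euler step of Eq. (2); tau_ca is treated as a decay rate."""
    return ca + dt * (calcium_gain(t_neur) - tau_ca * (ca - ca_eq))

def regulation_current(ca, i_th=0.2, k_i=1.0):
    """Eq. (5): k_I * Theta(ln y) * ln y with y = [Ca] - I_th."""
    y = ca - i_th
    if y <= 1.0:  # ln y <= 0 or undefined: the Heaviside factor switches regulation off
        return 0.0
    return k_i * math.log(y)

# Hypothetical run: constant depolarisation, calcium relaxes to its steady state.
ca = 0.1
for _ in range(2000):
    ca = calcium_step(ca, glutamate(v_d=0.0), dt=0.01)
current = regulation_current(ca)
```

With these illustrative values the calcium level settles where gain and decay balance, and the regulation current is nonzero only once the level exceeds the threshold by more than one unit, which reflects the Heaviside factor in Eq. (5).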
We can further integrate the above three phases to describe the strength of synaptic interaction at different distances, which is represented by the Diffusion curve in Fig. 7. In Fig. 7, the Diffusion curve has been normalized to match the scale of the Gaussian function. Compared with the Gaussian function, which is widely used as the neighborhood function in Self-Organizing Mapping (SOM) neural networks, the Diffusion curve has a similar downward trend. Regardless of the initial stimulus intensity, synapses at larger distances have smaller response amplitudes through the Diffusion process. We will take advantage of this relationship between synapses to coordinate the behaviours of stigmergy agents.

More specifically, rooted in the above three interactive phases, a stigmergic learning mechanism is proposed and illustrated in Fig. 8, in which the communications between different stigmergy agents (i.e. synapses), represented by different colors, are indirect. When receiving a stimulus input, a stigmergy agent leaves traces (i.e. calcium elevation), expressed by the red solid arrow, in the outside environmental medium to affect the states of other agents. As illustrated in Fig. 8, the amplitude of the response to the interactive influence, indicated by the blue dotted arrow, is determined by the inter-synapse distance $x$ between stigmergy agents as well as the intensity of the initial stimulus $s$. In the nervous system, the intensity of the initial stimulus is consistent with the level of synaptic activity, while the synaptic distance is determined by the distance between the coupled branch microdomains within astrocytes.

Fig. 8. The stigmergic learning mechanism.

Environmental enrichment, which is the stimulation of the brain by its physical and social surroundings, is known to induce increases in synaptic and spine densities. Researchers in [34] found that great changes appear in astrocytic morphology and that a large number of branch microdomains emerge in this process.
Astrocytes, by virtue of these emerging microdomains, can coordinate synaptic activities and change the amplitude of responses between these neurons. Therefore, the inter-synapse distance adaptation between stigmergy agents in the stigmergic learning mechanism can be leveraged and well-regulated to support efficient mutual collaboration.

Furthermore, a multi-agent cooperation approach is considered, which takes advantage of the regulation of the inter-synapse distance between stigmergy agents to obtain stigmergic learning gain. Within this approach, stigmergy agents are continuously selected for certain tasks in each turn until their common object requirements are reached. In particular, we leverage and modify the strategy proposed in [35] in each selection round:

$$p_{i,j}(t) = \frac{s_j^n(t)}{s_j^n(t) + \alpha\theta_{i,j}^n(t) + \beta\varphi_{i,j}^n(t)} \tag{6}$$

$$p_{i,j}(t) = \frac{s_j^n(t)}{s_j^n(t) + \alpha\theta_{i,j}^n(t) \ast \beta\varphi_{i,j}^n(t)} \tag{7}$$

where $p_{i,j}(t)$ is the probability of the $i$th stigmergy agent being selected for the $j$th task, $s_j(t)$ is the emergency degree of the $j$th task, and $\alpha$, $\beta$ and $n$ are adjusting factors. $\theta_{i,j}(t)$ is the state value of the $i$th agent for the $j$th task. $\varphi_{i,j}(t)$ is a heuristic factor. Compared with Eq. (6), in the modified strategy of Eq. (7), “+” has been changed into “∗” in order to remove the asynchronous variation which causes jitter at steady state. After each action, the update process of $s_j(t)$ is modified as:

$$R_j(t) = R_j(t-1) + \sum_{m \in S_j(t-1)} r_{m,j}(t) \tag{8}$$

$$s_j(t) = R_j(t) / T_j \tag{9}$$

where $r_{m,j}(t)$ is the reward that the $m$th agent obtains in the $j$th task at time $t$ from the outside environmental medium, $R_j(t)$ is the sum of all rewards at time $t$, $T_j$ is the expected object requirement for task $j$, and $S_j(t-1)$ is the set of stigmergy agents which participate in the $j$th task at time $t-1$. As more stigmergy agents participate in this task, $s_j(t)$ gets bigger (approaching 1) and thus provides a stimulated collaboration with higher intensity.

The state values of different stigmergy agents for the same task can be different in Eq. (7). After taking an action, this state value is updated according to the following equations:

$$\theta_{i,j}(t) = \theta_{i,j}(t-1) + \Delta\theta^{(1)}_{i,j}(t-1) \tag{10}$$

$$\Delta\theta^{(1)}_{i,j}(t-1) = \rho_1 \ast \left( \frac{1}{|S_j(t-1)|} \sum_{m \in S_j(t-1)} r_{m,j} - r_{i,j} \right) \tag{11}$$

where $\rho_1$ is a scale factor. $\Delta\theta^{(1)}_{i,j}(t-1)$ can be positive or negative, which corresponds to a low or high reward respectively. According to the proposed stigmergic learning mechanism, the state change of a stigmergy agent expands its influence to affect the states of other agents. Therefore, the state value should be further updated by:

$$\theta_{i,j}(t) = \theta_{i,j}(t) + \Delta\theta^{(2)}_{i,j}(t) \tag{12}$$

$$\Delta\theta^{(2)}_{i,j}(t) = \sum_{k \in \pi_i(t-1)} D(d_{k,i}(t-1)) \ast \Delta\theta^{(1)}_{k,j}(t-1) \ast \rho_2 \tag{13}$$

where $\rho_2$ is a scale factor and $\pi_i(t-1) = \{X_k \mid k \neq i,\ d_{k,i}(t-1)$ [...]

[...] $(t-1) + factor$, otherwise (14)

$$\phi = \Delta\theta_{i,j}(t) \ast \Delta\theta_{k,j}(t-1) \tag{15}$$

where $factor$ is a constant. To some extent, the distance between stigmergy agents also represents the similarity of these agents participating in the same task. Therefore, the strength of interaction is larger if the similarity between two stigmergy agents is higher. The regulation of the inter-synapse distance can adjust the strength of synaptic interactions between stigmergy agents and thus bring systematic stigmergy learning gain.

IV. NUMERICAL SIMULATION AND RESULTS
In order to verify the effectiveness and advantages of the proposed stigmergy learning model, a number of numerical simulations with different kinds of tasks have been carried out.

A. The Stigmergy Learning Gain
In the first simulation, there is only one task to be finished. During initialization, the distance between neural agents, which can represent the similarity of these agents participating in this task, is set to the median value of its range. The same method is applied to the setting of the state value. A random reward is assigned to each neural agent and is not changed during the whole simulation process. Several neural agents are allowed to take an action together as a batch. There is a fixed cost for each action, which is the same for all agents. Besides, an ability value is randomly assigned to each neural agent, indicating the number of actions it can still take; it is also used as the heuristic factor in Eq. (7) and normalized to match the scale of the state value. In each turn, several neural agents are selected as a batch according to the selection probability (Eq. (7)) to participate in the target task, until the overall reward is above the object requirement. After each turn, a feedback is returned, which equals the sum of the rewards of all neural agents selected in the last turn. This feedback is used to regulate the distance between neural agents. The target of the simulation is to satisfy the object requirement with the maximum utility value (reward/cost). The main parameters of the simulation are shown in Table II.
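The selection loop described above can be sketched as follows. This is a simplified illustration rather than the simulation actually used: the weighted sampling without replacement, the floor `s_floor` that keeps the urgency positive in the first turn, the constant heuristic factor, the reward range, and the values of `alpha` and `beta` are all assumptions, and the ability limit and the distance regulation are omitted.

```python
import random

def selection_probability(s, theta, phi, alpha=1.0, beta=1.0, n=2):
    """Eq. (7): probability of selecting an agent with state value theta."""
    num = s ** n
    return num / (num + alpha * theta ** n * beta * phi ** n)

def select_batch(probs, batch_size, rng):
    """Draw distinct agents, each draw weighted by its Eq. (7) probability."""
    candidates = list(range(len(probs)))
    weights = list(probs)
    batch = []
    for _ in range(min(batch_size, len(candidates))):
        pick = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        batch.append(candidates.pop(pick))
        weights.pop(pick)
    return batch

def run_task(rewards, theta, phi, target, batch_size=5,
             s_floor=0.05, seed=0, max_turns=10000):
    """Repeat batch selection until the accumulated reward reaches the target."""
    rng = random.Random(seed)
    accumulated, turns = 0.0, 0
    while accumulated < target and turns < max_turns:
        s = max(accumulated / target, s_floor)         # Eq. (9), with an assumed floor
        probs = [selection_probability(s, th, ph) for th, ph in zip(theta, phi)]
        batch = select_batch(probs, batch_size, rng)
        accumulated += sum(rewards[m] for m in batch)  # Eq. (8)
        turns += 1
    return accumulated, turns

# Hypothetical setup: 30 agents, batches of 5, object requirement 1100.
setup_rng = random.Random(1)
rewards = [setup_rng.uniform(1.0, 10.0) for _ in range(30)]
total, turns = run_task(rewards, [0.5] * 30, [0.5] * 30, target=1100.0)
```

As the accumulated reward approaches the requirement, `s` grows towards 1 and every selection probability rises, which mirrors the statement that a larger emergency degree stimulates collaboration with higher intensity.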
TABLE II
THE MAIN PARAMETERS.

Item                 Description
Agent number         30
Object requirement   1100
Batch size           5
Agent reward         [1, ...]
$\alpha$             [...]
$\beta$              2
$n$                  2
$\rho_1$             [...]
$\rho_2$             [...]

According to the selection probability, neural agents with higher rewards are assumed to have smaller state values and are thus more likely to be selected. These neural agents shorten their distance to other agents with similarly high rewards to form a cluster. Based on the proposed stigmergy learning model (Eqs. (7)-(15)), and aiming at forming a spatial neural cluster, the task is carried out repeatedly 500 times to regulate the distance between neural agents. The state value and the average distance of neural agents are given in Fig. 9. In Fig. 9, neural agents are arranged in descending order of their state values. The average distance, which is the average distance of a neural agent from all the other agents, is arranged in the corresponding order. It can be observed that neural agents with lower state values have smaller average distances. Therefore, a neural cluster can be formed automatically by agents with higher rewards. Members of the neural cluster have smaller distances than the others, so the stimulus resulting from their state change is responded to more easily.
Fig. 9. The state value and the average distance of neural agents.
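To make the link between rewards, state values and distances concrete, the sketch below applies the direct update of Eqs. (10)-(11) and then the propagated update of Eqs. (12)-(13) to a toy example. The Gaussian stand-in for $D(\cdot)$, the `radius` test used to decide which neighbours belong to $\pi_i$, and the values of `rho1` and `rho2` are assumptions made for illustration only.

```python
import math

def direct_update(theta, rewards, batch, rho1=0.1):
    """Eqs. (10)-(11): each selected agent moves by rho1 * (batch mean reward - own reward)."""
    mean_reward = sum(rewards[m] for m in batch) / len(batch)
    delta = {i: rho1 * (mean_reward - rewards[i]) for i in batch}
    for i, change in delta.items():
        theta[i] += change
    return delta

def neighborhood(d, sigma=1.0):
    """Assumed Gaussian stand-in for the distance weighting D(d)."""
    return math.exp(-d * d / (2.0 * sigma * sigma))

def propagate(theta, delta, dist, radius=2.0, rho2=0.05):
    """Eqs. (12)-(13): agent i absorbs distance-weighted changes of nearby agents k."""
    for i in range(len(theta)):
        theta[i] += rho2 * sum(
            neighborhood(dist[k][i]) * change
            for k, change in delta.items()
            if k != i and dist[k][i] < radius
        )

# Toy example: agents 0 and 1 act; agent 2 is close to the high-reward agent 0.
rewards = [8.0, 2.0, 5.0]
theta = [0.5, 0.5, 0.5]
dist = [[0.0, 3.0, 0.5],
        [3.0, 0.0, 3.0],
        [0.5, 3.0, 0.0]]
delta = direct_update(theta, rewards, batch=[0, 1])
propagate(theta, delta, dist)
```

In this toy run the high-reward agent ends with a lower state value, the low-reward agent with a higher one, and the bystander close to the high-reward agent is pulled slightly downwards, which is the clustering effect described for Fig. 9.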
The obtained neural cluster can be used to generate the stigmergy learning gain. We have compared the utility value of each batch with that in general stigmergy, in which neither the distance adjustment between stigmergic agents nor the formation of the neural cluster is taken into account. The comparison results are provided in Fig. 10. In Fig. 10, the curve without distance adjustment represents the utility value of the traditional stigmergy mechanism, while the curves with distance adjustment represent the utility values of the proposed stigmergy learning model. $m$ in Fig. 10 represents the maximum of the state change $\Delta\theta_{i,j}$ in Eq. (11), while $n$ represents the maximum of the state change $\Delta\theta_{i,j}$ in Eq. (13). Because of the limitation of the ability value, the last few rounds in each scheme have to adopt neural agents with lower rewards, which causes the decline of each curve at the end. With the regulated distance, the task is started with higher efficiency and completed earlier. Accordingly, neural agents with higher rewards are more easily selected in the task and further activate those with similarly high rewards because of the neural clustering. Therefore, as a key element of the proposed stigmergy learning model, the regulation of the distance between neural agents through spatial clustering can bring the expected gain for the learning system.

Fig. 10. The system gain provided by the regulated distance between neural agents.
B. The Impact of Distance Regulation
Different from the first simulation, the aim of the second simulation is to test whether we can adjust the current states of neural agents to converge to a target pattern (e.g. a target picture of Arabic numerals). The selection of neural agents is different from that in the first simulation: agents are randomly selected to form a neural group in each turn. The current state of each neural agent is redefined by:

$$y_j = \Theta\left( \sum_{i \neq j} v_i \ast D(d_{i,j}) - base \right) \tag{16}$$

where $base$ is a constant, $\Theta$ represents the Heaviside function, $v_i$ represents the input of the $i$th agent, and $y_j$ represents the output of the $j$th agent. $y_j = 1$ means that the current state of the $j$th agent is excitatory, while $y_j = 0$ means that the current state of the $j$th agent is inhibitory. Self-feedback is not considered here. The relationship between neural agents is illustrated in Fig. 11 (a), in which distances between neural agents are directed. Regardless of the initial input, according to Eq. (16), the current state of a neural agent is determined by its distance from the other agents.

Fig. 11. The relationship between neural agents.
In each turn, all neural agents are given a unit input to determine the current states of the neural agents in the group. These states are then used to calculate the feedback, which equals the sum of the rewards of all neural agents in the group. The reward for each state of a neural agent is provided in Table III. In Table III, the value of a pixel refers to the binary value in the original picture. The size of the original picture is × , in which each pixel is represented by the state of a neural agent in the corresponding location.

TABLE III
THE REWARD FOR EACH STATE.

state   pixel   reward
1       1       0
1       0       -1
0       1       +1
0       0       0

According to the selection method, different neural groups, which may contain the same members, are selected in different turns. Neural agents in each group regulate the distance to adjust their current states according to the corresponding feedback. Concretely, as shown in Fig. 11 (b), we compare the feedback of two neural groups in two consecutive turns and change the distance between any two members in different neural groups according to the result. The distance from neural agents in the group with larger feedback to the others in the group with smaller feedback increases by a constant in each turn. As mentioned before, the distance between neural agents describes the strength of the interaction between them, and a shorter distance indicates a higher strength. In this way, neural agents which should be excitatory for the target pattern obtain shorter distances and are more easily activated.
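The feedback computation and the distance regulation described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function names `feedback` and `regulate`, the step size `0.1`, the initial distances, and the four-agent example are all hypothetical; only the reward values (Table III) and the direction of the update are taken from the text.

```python
# Table III as a lookup: (state, pixel) -> reward.
REWARD = {(1, 1): 0, (1, 0): -1, (0, 1): 1, (0, 0): 0}

def feedback(group, states, pixels):
    """Feedback of a group: the sum of the rewards of its member agents."""
    return sum(REWARD[(states[a], pixels[a])] for a in group)

def regulate(dist, group_a, fb_a, group_b, fb_b, step):
    """Increase, by a constant step, the directed distances from agents in
    the group with larger feedback to agents in the group with smaller
    feedback. Nothing changes when the two feedback values are equal
    (an assumption; the text does not cover this case)."""
    if fb_a == fb_b:
        return
    src, dst = (group_a, group_b) if fb_a > fb_b else (group_b, group_a)
    for i in src:
        for j in dst:
            if i != j:  # groups may share members; skip self-distances
                dist[i][j] += step

# Illustrative example with four agents (values are assumptions).
pixels = [1, 0, 1, 0]   # binary target picture, one pixel per agent
states = [0, 0, 1, 1]   # current agent states
dist = [[0.0 if i == j else 1.0 for j in range(4)] for i in range(4)]

group_a, group_b = [0, 1], [2, 3]   # groups selected in two consecutive turns
fb_a = feedback(group_a, states, pixels)
fb_b = feedback(group_b, states, pixels)
regulate(dist, group_a, fb_a, group_b, fb_b, step=0.1)
print(fb_a, fb_b, dist[0][2], dist[2][0])
```

Because the distances are directed, only the entries from the higher-feedback group to the lower-feedback group change; the reverse entries are left untouched.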
Fig. 12. The learning process of the stigmergy learning system.
The relevant simulation results are shown in Fig. 12. n is the number of learning iterations. d represents the corresponding average distance from the others to the neural agents which should be excitatory at each iteration step. Each picture in Fig. 12 describes the current states of all neural agents during the learning process. White points in each picture indicate that neural agents at those locations are excitatory, while black points indicate that neural agents at those locations are inhibitory. As the number of iterations increases, the learning system gradually converges to the clear target pattern of number 4 or 8. At the same time, the average distance d decreases gradually, indicating that the average amplitude of response of these neural agents increases gradually. After n = 2400, the learning system starts to change its regulation and to express another target pattern of number 8. As before, the stigmergy learning system finally learns the target pattern of number 8 after 7000 iterations. Fig. 13 shows the similarity between the original target picture and the one formed by the learning system during the whole process.

Fig. 13. The utility value during the stigmergy learning process.

The above results show that the proposed stigmergy learning system can be adjusted to learn the target patterns, which is accomplished by activating the relevant sets of neural agents. Regardless of the initial stimulus, the activation of a neural agent is determined by its distances to the other neural agents. Note that the distance represents the strength of the interaction between neural agents, which are activated more easily when the distances from the others are shorter. In summary, the regulation of inter-synapse distance plays an important role in the cooperation of neural agents in the stigmergy learning mechanism.

V. CONCLUSIONS
Stigmergy phenomena are widely observed in natural colonies and perform well through collective collaboration. Inspired by the new discoveries on the role of astrocytes in synaptic transmission, we have explored and mapped stigmergy onto the regulation of synaptic activities in the brain. In particular, the interaction between neural agents (synapses) is divided into three important phases, and a stigmergic learning system model has been put forward. We have found that the regulation of distance between neural agents plays an important role in the proposed model. A well-regulated distance between neural agents can bring gain for the system and help to learn the target patterns. Its importance has been verified in two different simulations. Note that the interaction between synapses within a certain range has been regarded as short-range regulation. For long-range regulation, the participation of IP$_3$ must be taken into account, which will be our future research direction.