Decisional Processes with Boolean Neural Network: the Emergence of Mental Schemes
Graziano Barnabei, Franco Bagnoli, Ciro Conversano, Elena Lensi
Dept. of Systems and Computer Science, University of Florence, via S. Marta 3, 50139 Firenze, Italy; CSDC, Via G. Sansone 1, 50019 Sesto F.no, Firenze, Italy. email: [email protected]
Franco Bagnoli
Dept. of Energy, University of Florence, via S. Marta 3, 50139 Firenze, Italy; CSDC and INFN, sez. Firenze. email: franco.bagnoli@unifi.it
Ciro Conversano, Elena Lensi
Dept. of Psychiatry, Neurobiology, Pharmacology and Biotechnology, University of Pisa, Via Roma 67, 56126 Pisa, Italy.

Human decisional processes result from the employment of selected quantities of relevant information, generally synthesized from environmental incoming data and stored memories. Their main goal is the production of an appropriate and adaptive response to a cognitive or behavioral task. Different strategies of response production can be adopted, among which haphazard trials, the formation of mental schemes, and heuristics. In this paper, we propose a model of Boolean neural network that incorporates these strategies by resorting to global optimization strategies during the learning session. The model also characterizes the passage from an unstructured/chaotic attractor neural network, typical of data-driven processes, to a faster, forward-only one, representative of schema-driven processes. Moreover, a simplified version of the Iowa Gambling Task (IGT) is introduced in order to test the model. Our results match experimental data and point out some relevant knowledge coming from the psychological domain.

PACS numbers: 87.85.dq

∗ Presented at the Summer Solstice 2009 International Conference on Discrete Models of Complex Systems, Gdansk, Poland, 22-24 June 2009.
1. Introduction
Humans usually categorize incoming information into stable concepts which can be upgraded, related and nested one into another. The characteristics of the information are analyzed and classified into (i.e. they activate) existing concepts but, whenever they represent a novelty, they induce the formation of a new concept or the upgrade of the existing ones. This adaptive modality of knowledge organization makes the cognitive system able to classify, store and employ at best the incoming information, in order to solve the eventual cognitive demand during the next steps of processing. Subsequently, human cognitive or behavioral responses to a given set of inputs are built following several different decisional strategies. The role of the context, the kind of information and past experiences are central for the choice of which kind of decision making will be adopted. In general, we can take into consideration at least five strategies of output production:
1) Reflexive responses: direct associations of inputs with an output pattern. They require no attentional resources and are beyond possible control. Typical examples are reflexive, genetically implemented motor responses (e.g. the evade reflex) and associative behaviors (e.g. Pavlov's dog salivation reflex).
2) Automatic processes: standardized quick responses associated to the frequent activation of simple concepts while the behavioral relevance of the input/event stays under an "alert red-line" (i.e. it requires a behavioral response but not direct attentional monitoring, for instance during vehicle driving).
3) Routine processes: processes triggered by several related concepts or complex events that are sufficiently frequent to constitute a stereotypical routine. Routines can be solved by a script [1, 2] and need the emergence of mental schemes [2, 3, 4], namely representations of complex concepts or events easily connectable to a fast cognitive or behavioral response. Note that even if this strategy requires attentive control, it does not involve the same set of cognitive abilities needed during a problem solving task, like the resolution of a syllogism or the Wason selection task [5] (while the abstract version of the Wason selection task leads to a correct performance of 3.9%, the concrete one leads to a correct performance of 91%; these results are interpreted by recurring to the availability heuristic applied to past experiences).
4) Reasoning: higher cognitive strategy of understanding and production, mainly used during problem solving. Given a set of premises, humans seem to employ rules like those involved in formal logic [6], which establish the correct formal solution. Propositional reasoning makes no distinction about the contents of a statement, but deals only with its syntactical structure. Unfortunately, human judgements are sometimes very far from the correct formal solutions. For these reasons, a theory of mental models [7, 8] has been proposed, which claims that hypothetical-deductive reasoning has three stages of thought: an understanding of the premises which leads up to model construction, a formulation of provisional conclusions, and a revision procedure that verifies whether other models are possible. Errors occur because of the limitedness of working memory: the bigger the number of models that we have to manage, the harder the problem becomes. Thus, errors are conclusions that have not been rigorously verified.
5) Heuristic behavior: modus operandi typical of situations in which there is either a lack of information, or different mental schemes run into conflict, or there is no time to reason, or the task is too difficult. In these cases, individuals adopt strategies more similar to an attempt than to the formal solution. Note that, in some cases, the use of heuristics is mandatory and constitutes a cognitive bias. The most important heuristics in psychology are anchoring/adjustment, availability, and representativeness [10].

While strategies 1-4 belong to a hierarchy of use and exploitation of cognitive resources, heuristics take place only after strategy 3, and only if there are no conditions to apply the other ones or their application fails. Finally, it is possible to condense the aforementioned considerations into the simple cognitive model reported in Fig. 1. Neglecting earlier input systems and concept stages, our main purpose is to formalize into a neural model the cognitive constructs of mental scheme and heuristic, showing how these can modify the production of a response. Besides, we will propose a simplified version of the IGT [11, 12] in order to test our model.

The paper is organized as follows. In the next section we introduce the neural model, the fourth section is devoted to model fitting, and in the final sections we show the main results of our simulations and discuss future perspectives.
2. The Model
The model assumes that cognitive activities during a task resolution can be represented as a Boolean neural network, whose nodes do not necessarily correspond to single biological neurons but rather to organized sets of neurons, named functional areas. We choose this formalization in order to remain within the framework of the neural domain.

The basic computational entities, namely the formal neurons, are described by the following parameters:

1. σ: the internal and external state of activation, σ ∈ {-1, +1};
2. b: the threshold, b ∈ [0, 1];
3. c: the connectivity degree, i.e. the number of afferent synapses of the neuron.
Fig. 1. The cognitive model. Flowchart symbols as defined in [14].

The N bipolar neurons are linked by connections, named synapses, each bearing a weight w ∈ {-1, 0, +1}; a weight of 0 corresponds to the absence of a link, and we do not allow auto-synapses, or self-recurring links. At time t, the i-th node computes the incoming signals from its afferent neurons and, at time t + 1, produces a signal, i.e. fires, according to the following update law:

σ_i^{t+1} = sgn( (1/c_i) Σ_{j=1}^{N} w_{ij} σ_j^t − b_i ),    (1)

where sgn(x) returns the sign of the real number x, w_{ij} is the incoming weighted synapse of the i-th neuron from the j-th one, and c_i = Σ_j |w_{ij}|. This formalization makes the adopted formal neuron similar to that of McCulloch & Pitts [9].

Dynamics: search of an asymptotic configuration. By generating an arbitrary ~σ and ~b, and a connection structure W with entries w_{ij} uniformly distributed in {-1, 0, +1}, we are able to define the starting configuration ζ^t, composed by (W, ~b, ~σ^t), i.e. the initial condition of the dynamics. (Here we have introduced the vector notation for the matrix of synaptic weights W = (w_{ij}), the state vector ~σ^t = (σ_1^t, ..., σ_N^t), the threshold vector ~b = (b_1, ..., b_N) and the connectivity vector ~c = (c_1, ..., c_N).) Neurons are synchronously updated by applying iteratively Eq. (1) for a sufficiently large time t_meas, the maximum convergence time allowed. In principle, the longest convergence time t_c should be 2^N, the maximal periodical orbit of a finite-size discrete system of N units having 2 possible values; for reasons of feasibility of the simulations, fixing a reasonable t_max, t_meas will correspond to min{t_c, t_max}. If W is asymmetric and according to its asymmetry degree, a periodical orbit of length l is reached after a transient τ, giving the asymptotic configuration ζ^{t_meas}, composed by (W, ~b, ~σ^{t_meas}). In the following, all the procedures will make use of this concept of asymptotic configuration and of its eventual distance from the correct behavior. This is motivated by the assumption that only stable asymptotical configurations can account for the stability and invariance of response typical of human cognitive processes. In principle, t_c can be viewed as the time needed by the cognitive elaboration.
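For concreteness, the update law of Eq. (1) and the search for an asymptotic configuration can be sketched in Python as follows. The function names, the convention sgn(0) = +1, the guard against isolated neurons and the orbit detection by hashing visited states are illustrative choices of ours, not prescriptions of the model.

import numpy as np

def update(sigma, W, b, c, n_inputs=0):
    # One synchronous application of Eq. (1); the first n_inputs neurons are
    # input neurons and keep their value during the dynamical update.
    field = W @ sigma / c - b
    new = np.where(field >= 0, 1, -1)          # sgn, with the convention sgn(0) = +1
    new[:n_inputs] = sigma[:n_inputs]          # clamped input neurons
    return new

def asymptotic_configuration(sigma0, W, b, t_max, n_inputs=0):
    # Iterate Eq. (1) until a periodic orbit closes or t_max is reached.
    # Returns (tau, l, orbit): transient length, orbit length and the orbit states;
    # l is None when no orbit was found within t_max.
    c = np.maximum(np.abs(W).sum(axis=1), 1)   # c_i = sum_j |w_ij|, guarded against 0
    seen, trajectory = {}, []
    sigma = np.array(sigma0, dtype=int)
    for t in range(t_max):
        key = sigma.tobytes()
        if key in seen:                        # the orbit has closed
            tau = seen[key]
            return tau, t - tau, trajectory[tau:]
        seen[key] = t
        trajectory.append(sigma.copy())
        sigma = update(sigma, W, b, c, n_inputs)
    return t_max, None, trajectory             # no convergence within t_max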
Training phase: a problem of global optimization. We choose a subset of n Boolean functions as possible instances of the training problem. For each γ-th function we select N_I^γ input neurons, from which the remaining ones take entries during the update dynamics, and just one output neuron, that is N_O^γ = 1, from which, after the attainment of a ζ^{t_meas}, we will measure the computed output. Moreover, for each γ-th function we generate the corresponding training set ε^γ, in which all possible examples are given by coupling one of the inputs ξ^{γ,inp}_µ with the respective wanted output ξ^{γ,out}_µ, for µ = 1, ..., P^γ. (Consequently, given n functions to learn, the input vector I will have N_I = Σ_{γ=1}^n N_I^γ elements, the output vector O will have N_O = Σ_{γ=1}^n N_O^γ = n elements, and the size of ε will be P = Σ_{γ=1}^n P^γ, where P^γ = 2^{N_I^γ}. Notably, the input vector I cannot change its entries during the dynamical update, so the maximal convergence time becomes 2^{N−N_I}. We also notice that, since we do not need to test our network on the complementary subset of ε^γ, we can freely present, for each γ-th instance of the problem, the entire corresponding training set ε^γ during the training phase.) The presentation of the entire training set ε is consequently just one training epoch. At the end of each epoch, the error signal E(ζ^{t_meas}) will be

E(ζ^{t_meas}) = (1 / (2 P l_w N_O)) Σ_{i=N−N_O+1}^{N} Σ_{µ=1}^{P} Σ_{j=1}^{l_w} |S^{out}_{i,µ} − ξ^{out}_{i,µ}| + τ_w / 2^{N−N_I},   if t_c ≤ t_max,
E(ζ^{t_meas}) = 10 · N_O,   otherwise,    (2)

where S^{out}_{i,µ} and ξ^{out}_{i,µ} are respectively the computed and the wanted output for the µ-th pattern of ε. (The first term in the first case of Eq. (2) is just the normalized average Hamming distance, generalized over the entire orbital length.) The addition of a term for the transient will justify the passage from a computationally slow attractor neural network, prototypical of data-driven processes, to a faster and oriented one, representative of schema-driven processes. Each asymptotic configuration having E = 0 is one of the possible configurations able to solve the n incoming instances of the problem and therefore the presented task.

As there are no clues about the way ζ^{t_meas} has to be modified, we choose to proceed by trial and error. Consequently, the learning problem can be associated with a problem of global optimization, where E is the objective function to be minimized. The choice of the optimization strategy will produce different behaviors of the network during the learning phase and, consequently, we will be able to associate them to different cognitive strategies of task resolution.
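Before turning to the optimization strategies, here is an illustrative evaluation of E over one training epoch, using the per-pattern orbits returned by the sketch above. The aggregation of the transient term over the patterns and the normalization constants are our reading of Eq. (2), not a prescription of the paper.

import numpy as np

def error_signal(per_pattern, targets, N, N_I, N_O=1):
    # Error of Eq. (2) accumulated over one training epoch.
    # per_pattern: one tuple (tau, l, orbit_outputs) per pattern mu, with
    #              orbit_outputs of shape (l, N_O); l is None if no convergence.
    # targets:     array (P, N_O) of wanted outputs xi^out.
    P = len(per_pattern)
    E = 0.0
    for (tau, l, out), xi in zip(per_pattern, targets):
        if l is None:                                  # dynamics did not converge
            return 10.0 * N_O
        # normalized Hamming distance between computed and wanted outputs,
        # generalized over the whole orbit of length l
        E += np.abs(np.asarray(out) - xi).sum() / 2.0 / (l * N_O)
        # normalized transient term (maximal convergence time 2**(N - N_I))
        E += tau / float(2 ** (N - N_I))
    return E / P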
Haphazard trials. No optimization strategy is applied. Starting from an arbitrarily generated condition, a series of local perturbations is produced, by modifying just one entry, either in W or in ~b, selected at random with uniformly distributed probability. For each perturbation, the corresponding E(ζ^{t_meas}) is calculated. The resulting trend is a random walk of E. This non-strategic behavior is directly affected by the number N of involved functional areas, namely by the size of the solution space. Consequently, this strategy can need a very long time to reach the solution of the problem and to produce the correct response to the task.
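A haphazard-trials session then reduces to accepting every local perturbation and recording the resulting random walk of E; a minimal sketch, in which the evaluate callable stands for the computation of E(ζ^{t_meas}) over the whole training set:

import numpy as np

def perturb(W, b, rng):
    # Local perturbation: redraw exactly one entry, chosen uniformly among the
    # off-diagonal weights of W (no auto-synapses) and the thresholds b.
    W, b = W.copy(), b.copy()
    N = b.size
    n_w = N * (N - 1)
    if rng.integers(n_w + N) < n_w:                    # perturb a synaptic weight
        i, j = rng.integers(N), rng.integers(N)
        while i == j:
            i, j = rng.integers(N), rng.integers(N)
        W[i, j] = rng.choice([-1, 0, 1])
    else:                                              # perturb a threshold
        b[rng.integers(N)] = rng.random()
    return W, b

def haphazard_trials(W, b, evaluate, rng, max_steps=100_000):
    # No optimization: every perturbation is kept, so E performs a random walk.
    history = []
    for _ in range(max_steps):
        W, b = perturb(W, b, rng)
        E = evaluate(W, b)
        history.append(E)
        if E == 0:                                     # an occasional solving configuration
            break
    return W, b, history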
Emergence of mental schemes. The optimized asymptotical configuration must be able to store the n presented instances as mental schemes, whose future activation will produce a fast and cognitively cheap response. We choose to formalize the emergence of mental schemes as a simulated annealing procedure with geometrical cooling ratio cl < 1, fixed once for all, and with E as the energy. Starting from the arbitrarily generated ζ^{t_meas}, a local perturbation is produced with the same modalities of the haphazard trials. The resulting ζ'^{t_meas} differs in E by a value ΔE from the previous one. For the acceptance of the new configuration, we refer to the Metropolis algorithm: the new configuration is kept with probability

p = 1,   if ΔE < 0,
p = exp(−ΔE / T),   otherwise,    (3)

where T modulates the cooling schedule. The perturbation procedure continues until E(ζ^{t_meas}) = 0 for a reasonable number of epochs. At this point the system has inferred and stored the n instances. When a future behavioral situation poses the same set of inputs, the stable reached configuration will be reactivated and, having τ and E equal to zero and l equal to 1, it will produce one of the fastest correct responses admissible by the task.
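The corresponding annealing loop, built around the Metropolis rule of Eq. (3), might look as follows. The starting temperature T0 = 5 follows the text; the cooling ratio cl = 0.9 and the fixed cooling interval are illustrative stand-ins for the schedule used in the paper, which triggers cooling from the sampled acceptance probability.

import numpy as np

def metropolis_accept(delta_E, T, rng):
    # Eq. (3): moves lowering E are always kept, the others with probability exp(-dE/T).
    return delta_E < 0 or rng.random() < np.exp(-delta_E / T)

def anneal(W, b, evaluate, perturb, rng, T0=5.0, cl=0.9, steps_per_T=10, max_steps=100_000):
    # Simulated annealing with geometric cooling T <- cl * T (emergence of a scheme).
    T, E = T0, evaluate(W, b)
    for step in range(max_steps):
        W_new, b_new = perturb(W, b, rng)
        E_new = evaluate(W_new, b_new)
        if metropolis_accept(E_new - E, T, rng):
            W, b, E = W_new, b_new, E_new
        if E == 0:                                     # the n instances have been stored
            break
        if (step + 1) % steps_per_T == 0:
            T *= cl
    return W, b, E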
3. Results
Figure 2 shows how the attainment of a configuration having E = 0 can depend on the size of the feasible region, which in our model is a function of N. During a series of local perturbations, all of them accepted, the greater N is, the more difficult the search of a configuration able to solve the task becomes. Anyhow, this dependency is also found by applying the algorithms of global optimization during the learning phase.

Fig. 2. Error signal during the perturbation phase. Each time step is just one local perturbation. The circles mark the occasional configurations with E = 0. a) N = 5; b) N = 10.

Results of the optimization procedure are presented in Fig. 3. Comparing Fig. 3a with Fig. 2b, it is clear what happens to the slow dynamics during optimization. Having at the beginning a large temperature T, all moves can be accepted in spite of their respective error E, allowing the passage among basins. (The starting T is fixed once for all at 5. Every 10 epochs we sample the acceptance probability ap; when ap reaches the prescribed band, the temperature is lowered as T_{k+1} = cl · T_k.) By decreasing T, only moves that decrease the error E begin to be accepted, see Eq. (3), causing a more exhaustive exploration of the small-E part of the basin up to the reaching of the global minimum. The dependence of E on n and N shown in Fig. 3b,c can be easily related to the task difficulty, typically correlated with the number of instances of the learning problem and the number of involved functional areas.
Fig. 3. Mean error signal over 30 learning sessions with the Metropolis algorithm. Each time step corresponds to the acceptance of just one local perturbation. a) Modulation of T on E, fixed N = 10, n = 1; b) fixed N = 10, n variable; c) N variable, fixed n = 1.

Figure 4 shows the passage from an initial unconstrained dynamics to an optimized one, ruled by the learning of a scheme. This transition also corresponds to a passage from a ζ^t having some of the possible values of τ and l (Fig. 4a) to a ζ^{t_meas} having τ and l equal to zero and one, respectively (Fig. 4b).
Fig. 4. Dynamics transition. Neuron activation (dark/black = +1; light/white = -1) as a function of time. Each time step is just one application of Eq. (1).
4. Testing the model
In this section we introduce a simplified version of the IGT in order to test qualitatively the predictions of the model. The task consists of trials in which a subject must select a card, reporting both a term of winning and a term of loss, from 4 decks (A', B', C' and D', respectively) of 60 cards. The main goal of the subject is to maximize the budget after a 100-trial session. The temporal series of the decks are different: decks A' and B' promise strong winnings in the short period but stronger losses in the long period, while C' and D', promising small winnings but also smaller losses, assure a better budget at the end of the task.

For our purpose, the task performed by our network takes into consideration only the two native decks B' (min = -2330; max = 170; mean = -62.5) and D' (min = -310; max = 95; mean = -31.25), whose length is maintained at 60. While the choice of the first card is random, the following ones are given by the output of the optimized network, composed of 5 neurons, one of which is the input and one the output.
At each trial Tr, the network computes all the Tr − 1 previous outputs S^out and takes the terms of winning or loss in the corresponding temporal series as wanted outputs ξ^out. This procedure assumes, in psychological terms, an infinite working memory, and explicitly makes use of the heuristic of availability: in the absence of relevant information about the covered cards, humans tend to use the available information stored from past trials. Once optimized, the output computed by the asymptotical configuration becomes the choice of the new trial, and its value is registered for future choices.

Figure 5 shows the typical behavior of the network while performing the simplified IGT. The winnings early promised by B' prematurely influence the network response in favor of deck B' but, after the first severe losses, the functional E associated to B' becomes too large and the computed output switches in favor of D'. From this moment on, almost all the perturbations to the asymptotical configuration ζ^{t_meas} are rejected by the simulated annealing and the network quickly produces its choice trial by trial. It is interesting to point out that the transition from a regime distinguished by choices in B' to one distinguished by choices in D' happens approximately at the same time as in humans [12, 13], implying that the time scales on which the mental model becomes effective are comparable between humans and our model. Moreover, after the first great loss given by B', the network tends to persevere with choices in the same deck, as both normal subjects and pathological gamblers do (the original IGT is a psychological task used within larger test batteries to qualitatively study the behavior of normal subjects during simulated real-life decision making; in neuropsychological practice it is administered to pathological gamblers). This strange behavior can be interpreted from a psychological point of view as a persistence of the use of the heuristics of anchoring and adjustment during the earlier trials after the loss, while in our network it is due to the fact that the error quotas related to B' and D' become comparable.

Fig. 5. Behavior of the network over 30 runs. Each time step is just one trial. Red starred dots mark the most frequent choice of B', green ones that of D'.
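For reference, the trial loop of the simplified IGT can be organized as in the sketch below. The actual card sequences of B' and D' are not reproduced (only their summary statistics are given above), and the train and choose callables stand, respectively, for the re-optimization of the network on all past trials and for the mapping of its asymptotic output onto a deck; both are placeholders of ours.

import numpy as np

def simplified_igt(deck_B, deck_D, train, choose, n_trials=60, seed=0):
    # deck_B, deck_D: temporal series of net winnings/losses for decks B' and D'.
    # train(history): re-optimizes the network on all past trials (infinite working
    #                 memory / availability heuristic) and returns it.
    # choose(net):    asymptotic output of the optimized network, mapped to 'B' or 'D'.
    rng = np.random.default_rng(seed)
    decks = {'B': list(deck_B), 'D': list(deck_D)}
    history, budget = [], 0.0
    choice = rng.choice(['B', 'D'])                    # the first card is picked at random
    for _ in range(n_trials):
        outcome = decks[choice].pop(0)                 # draw the next card of the chosen deck
        budget += outcome
        history.append((choice, outcome))              # registered for future choices
        net = train(history)                           # e.g. simulated annealing on past trials
        choice = choose(net)                           # output of zeta^{t_meas} -> next choice
    return budget, history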
5. Conclusion and future perspectives
We have presented preliminary results of the application of a Boolean model of neural network to relevant cognitive strategies involved in decision making tasks. At the moment only mental schemes have been studied. The choice of such a formalization is due to the possibilities that Boolean neural networks offer in terms of robustness, ease of simulation and easy generation of samples for data fitting.

The model appears to capture the most relevant psychological knowledge regarding the domain of application. By shifting from an unstructured slow attractor neural network to a quicker forward-only one, it agrees with learning studies about task complexity and the number of employed cognitive resources. As defined, mental schemes become fast and adaptive cognitive strategies of behavioral response.

Regarding the model fitting, the behavior of our network on the simplified version of the IGT produces results qualitatively comparable with those of humans.

Future studies will be targeted to include into the model aspects regarding probabilistic and hypothetical-deductive reasoning, while future applications will take into consideration pathological gambling.

Appendix A
Implementation algorithm
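A possible top-level organization of a learning session, assembled from the illustrative routines sketched in Sec. 2 (update, asymptotic_configuration, error_signal, perturb, anneal). All names and the example task are ours; this is a reading guide under the stated assumptions, not the authors' implementation.

import itertools
import numpy as np

def epoch_error(W, b, func, N, N_I, t_max=200):
    # One training epoch for a single Boolean function `func` of N_I inputs
    # (n = 1, N_O = 1): every input pattern is presented and Eq. (2) is evaluated.
    per_pattern, targets = [], []
    for bits in itertools.product([-1, 1], repeat=N_I):
        sigma0 = np.ones(N, dtype=int)                 # free neurons start from +1 (simplification)
        sigma0[:N_I] = bits                            # clamp the input neurons
        tau, l, orbit = asymptotic_configuration(sigma0, W, b, t_max, n_inputs=N_I)
        outputs = [state[-1:] for state in orbit]      # the last neuron is taken as the output
        per_pattern.append((tau, l, outputs))
        targets.append([func(bits)])
    return error_signal(per_pattern, np.array(targets), N, N_I)

def learning_session(func, N=10, N_I=2, seed=0):
    # Arbitrary starting configuration (W, b), then simulated annealing until E = 0.
    rng = np.random.default_rng(seed)
    W = rng.integers(-1, 2, size=(N, N))
    np.fill_diagonal(W, 0)                             # no auto-synapses
    b = rng.random(N)
    evaluate = lambda W, b: epoch_error(W, b, func, N, N_I)
    return anneal(W, b, evaluate, perturb, rng)

# Example: learn the 2-input AND function, with inputs and output coded in {-1, +1}.
# W, b, E = learning_session(lambda bits: 1 if bits == (1, 1) else -1)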