An Entropic Associative Memory
Luis A. Pineda, Universidad Nacional Autónoma de México
Gibrán Fuentes, Universidad Nacional Autónoma de México
Rafael Morales, Universidad de Guadalajara
Abstract
Natural memories are associative, declarative and distributed. Symbolic computing memories resemble natural memories in their declarative character, and information can be stored and recovered explicitly; however, they lack the associative and distributed properties of natural memories. Sub-symbolic memories developed within the connectionist or artificial neural networks paradigm are associative and distributed, but are unable to express symbolic structure and information cannot be stored and retrieved explicitly; hence, they lack the declarative property. To address this dilemma, we use Relational-Indeterminate Computing to model associative memory registers that hold distributed representations of individual objects. This mode of computing has an intrinsic computing entropy which measures the indeterminacy of representations. This parameter determines the operational characteristics of the memory. Associative registers are embedded in an architecture that maps concrete images expressed in modality-specific buffers into abstract representations, and vice versa, and the memory system as a whole fulfills the three properties of natural memories. The system has been used to model a visual memory holding the representations of handwritten digits, and recognition and recall experiments show that there is a range of entropy values, not too low and not too high, in which associative memory registers have a satisfactory performance. The similarity between the cue and the object recovered in memory retrieve operations depends on the entropy of the memory register holding the representation of the corresponding object. The experiments were implemented in a simulation using a standard computer, but a parallel architecture may be built where the memory operations would take a very reduced number of computing steps.
Keywords: Associative Memory, Relational-Indeterminate Computing, Computing Entropy, Table Computing, Cognitive Architecture
1. Associative Memory
Natural memories of humans and other animals with a developed enough neural system are associative [1]. An image, a word or an odor can start a chain of remembrances on the basis of their meanings or contents. Natural memories contrast strongly with standard computer memories in that the latter consist of place-holders –containing strings of symbols that are interpreted as representations– that are accessed through their addresses. Computational models of associative memories have been extremely difficult to create within the symbolic paradigm, and although there have been important attempts using semantic networks since very early [2], and production systems more recently [3], practical symbolic associative memories are still lacking.

This limitation was one of the original motivations for the parallel distributed processing program, including connectionist systems and neural networks [4], which questioned explicitly the capability of Turing Machines to properly address associative memories, among other high-level cognitive functions (see the introduction of the cited Rumelhart's book). The subject has been a main subject matter within artificial neural networks, and there have been very influential proposals, such as Hopfield's model [5] or Kosko's Bidirectional Associative Memory [6].
2. Relational-Indeterminate Computing
The present associative memory system is defined with a novel mode of computing that is referred to here as Relational-Indeterminate Computing (RIC) [11, 12]. The basic object of computing in this mode is the mathematical relation, such that an object in the domain may be related to several objects in the codomain. The specification is presented by Pineda [11] as follows (for a more general discussion see [13]): Let the sets $A = \{a_1, ..., a_n\}$ and $V = \{v_1, ..., v_m\}$, of cardinalities $n$ and $m$, be the domain and the codomain of a finite relation $r: A \to V$. The objects in the domain and codomain are referred to here as the arguments and the values respectively. For purposes of notation, for any relation $r$ we define a function $R: A \times V \to \{0, 1\}$ –the relation in lower case and the function in upper case letters– such that $R(a_i, v_j) = 1$ or true if the argument $a_i$ is related to the value $v_j$ in $r$, and $R(a_i, v_j) = 0$ or false otherwise.

In this formalism, evaluating a relation is construed as selecting randomly one among the values associated to the given argument. In the same way that "$f(a_i) = v_j$" is interpreted as stating that the value of the function $f$ for the argument $a_i$ is $v_j$, "$r(a_i) = v_j$" states that the value of the relation $r$ for the argument $a_i$ is an object $v_j$ that is selected randomly –with an appropriate distribution– among the values for which $R(a_i, v_j)$ is true.

RIC has three basic operations: abstraction, containment and reduction. Let $r_f$ and $r_a$ be two arbitrary relations from $A$ to $V$, and $f_a$ be a function with the same domain and codomain. The operations are defined as follows:

• Abstraction: $\lambda(r_f, r_a) = q$, such that $Q(a_i, v_j) = R_f(a_i, v_j) \lor R_a(a_i, v_j)$ for all $a_i \in A$ and $v_j \in V$ –i.e., $\lambda(r_f, r_a) = r_f \cup r_a$.

• Containment: $\eta(r_a, r_f)$ is true if $R_a(a_i, v_j) \to R_f(a_i, v_j)$ for all $a_i \in A$ and $v_j \in V$ (i.e., material implication), and false otherwise.

• Reduction: $\beta(f_a, r_f) = f_v$ such that, if $\eta(f_a, r_f)$ holds, $f_v(a_i) = r_f(a_i)$ for all $a_i$, where the random distribution is centered around $f_a$, as elaborated below. If $\eta(f_a, r_f)$ does not hold, $\beta(f_a, r_f)$ is undefined –i.e., $f_v(a_i)$ is undefined– for all $a_i$.

Abstraction is a construction operation that produces the union of two relations. A function is a relation and can be an input to the abstraction operation. Any relation can be constructed out of the incremental abstraction of an appropriate set of functions. The construction can be pictured graphically by overlapping the graphical representations of the included functions on an empty table, such that the columns correspond to the arguments, the rows to the values, and the functional relation is depicted by a mark in the intersecting cells.

The containment operation verifies whether all the values associated to an argument $a_i$ in $r_a$ are associated to the same argument in $r_f$, for all the arguments, such that $r_a \subseteq r_f$. The containment relation is false only in case $R_a(a_i, v_j) = 1$ and $R_f(a_i, v_j) = 0$ –or if $R_a(a_i, v_j) > R_f(a_i, v_j)$– for at least one $(a_i, v_j)$.

The set of functions that are contained in a relation, which are referred to here as the constituent functions, may be larger than the set used in its construction. The constituent functions are the combinations that can be formed by taking one value among the ones that the relation assigns to an argument, for all the arguments.
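These definitions translate directly into code. The following is a minimal sketch –not the authors' implementation– assuming a relation is stored as an $m \times n$ Boolean NumPy array with one column per argument and one row per value, and using a uniform selection in reduction (the distribution centered on the cue is elaborated below):

```python
import numpy as np

rng = np.random.default_rng()

def abstraction(r_f, r_a):
    """lambda: cell-wise disjunction, i.e., the union of the two relations."""
    return r_f | r_a

def containment(r_a, r_f):
    """eta: true iff R_a(a_i, v_j) -> R_f(a_i, v_j) for every cell."""
    return bool(np.all(~r_a | r_f))

def reduction(f_a, r_f):
    """beta: if the cue f_a is contained in r_f, build a new function by
    selecting one marked value per argument; undefined (None) otherwise."""
    if not containment(f_a, r_f):
        return None
    f_v = np.zeros_like(r_f)
    for i in range(r_f.shape[1]):               # one column per argument a_i
        values = np.flatnonzero(r_f[:, i])      # rows marked for a_i
        if values.size:                         # skip arguments with no value
            f_v[rng.choice(values), i] = True   # uniform choice (a simplification)
    return f_v
```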
The table format allows performing the abstraction operation by direct manipulation and the containment test by inspection. The construction consists of forming a function by taking a value corresponding to a marked cell of each column, for all values and for all columns. The containment test is carried out by verifying whether the table representing the function is contained within the table representing the relation, by testing all the corresponding cells through material implication. For this reason, the abstraction operation and the containment test are productive. This is analogous to the generalization power of standard supervised machine-learning algorithms, which recognize not only the objects included in the training set but also other objects that are similar enough to the objects in such a set.

Reduction is the functional application operation. If the argument function $f_a$ is contained in the relation $r_f$, reduction generates a new function such that its value for each of its arguments is selected from the values in the relation $r_f$ for the same argument. In the basic case, the selection function is the identity function –i.e., $\beta(f_a, r_f) = f_a$. However, $\beta$ is a constructive operation such that the argument function $f_a$ is the cue for another function recovered from $r_f$: $v_j$ is selected from $\{v_j \mid (a_i, v_j) \in r_f\}$ using an appropriate random distribution function centered around $f_a(a_i)$. If $f_a$ is not contained in $r_f$, the value of such a functional application operation is not defined.

Relations have an associated entropy, which is defined here as the average indeterminacy of the relation $r$. Let $\mu_i$ be the number of values assigned to the argument $a_i$ in $r$; let $\nu_i = 1/\mu_i$ and $n$ the number of arguments in the domain. In case $r$ is partial, we define $\nu_i = 1$ for all $a_i$ not included in $r$. The computational entropy $e(r)$ –or the entropy of a relation– is defined here as:

$$e(r) = -\frac{1}{n} \sum_{i=1}^{n} \log_2(\nu_i).$$

A function is a relation that has at most one value for any of its arguments, and its entropy is zero. Partial functions do not have a value for all the arguments, but this is fully determined, and the entropy of partial functions is also zero.
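Under the same Boolean-table encoding, the entropy can be computed in a few lines (a sketch; base-2 logarithms are assumed here):

```python
import numpy as np

def entropy(r):
    """e(r) = -(1/n) * sum_i log2(nu_i), with nu_i = 1/mu_i, and nu_i = 1
    for arguments with no value, so functions and partial functions get 0."""
    mu = r.sum(axis=0)                          # mu_i: number of values per argument
    nu = np.where(mu > 0, 1.0 / np.maximum(mu, 1), 1.0)
    return float(-np.mean(np.log2(nu)))
```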
3. Table Computing
The implementation of RIC in table format is referred to as Table Computing [12]. The representation consists of a set of tables with $n$ columns and $m$ rows, where each table is an Associative Register that contains a relation between a set of arguments $A = \{a_1, ..., a_n\}$ and a set of values $V = \{v_1, ..., v_m\}$. Let $[R_k]_t$ be the content of the register $R_k$ at time $t$ and $\leftarrow$ an assignment operator such that $R_k \leftarrow R_j$ assigns $[R_j]_t$ to $[R_k]_{t+1}$, where $j$ and $k$ may be equal. This corresponds to the standard assignment operator of imperative programming languages. The machine also includes the conditional operator if, relating a condition pred to the operations $op_1$ and $op_2$ –i.e., if pred then $op_1$ else $op_2$, where $op_2$ is optional. The initialization of a register $R$, such that all its cells are set to 0 or to 1, is denoted $R \leftarrow 0$ and $R \leftarrow 1$ respectively; $R \leftarrow f$ denotes that a function $f: A \to V$ is input into the register $R$. The system also includes the operators $\lambda$, $\eta$ and $\beta$ for computing the corresponding operations. These are all the operations in table computing.

Let $K$ be a class, $O_k$ a set of objects of class $K$, and $F_O$ a set of functions with $n$ arguments and $m$ values, such that each function $f_i \in F_O$ represents a concrete instance $o_i \in O_k$ in terms of $n$ features, each associated to one of $m$ possible discrete values. The function $f_i$ may be partial –i.e., some features may have no value.

Let $R_k$ be an associative memory register and $R_{k-i/o}$ an auxiliary input and output register, both of size $n \times m$. The distributed representation of the objects $O_k$ is created by the algorithm Memory_Register$(f_i, R_k)$ for all $f_i \in F_O$ as follows:

• Memory_Register$(f_i, R_k)$:
1. $R_{k-i/o} \leftarrow f_i$
2. $R_k \leftarrow \lambda(R_k, R_{k-i/o})$
3. $R_{k-i/o} \leftarrow 0$

The recognition of an object $o \in O_k$ –or of class $K$– represented by the function $f$ is performed by the algorithm Memory_Recognize as follows:

• Memory_Recognize$(f, R_k)$:
1. $R_{k-i/o} \leftarrow f$
2. If $\eta(R_{k-i/o}, R_k)$ then ($R_{k-i/o} \leftarrow 0$ and the object is accepted) else ($R_{k-i/o} \leftarrow 0$ and it is rejected)

The recovery of an object out of a cue $f$ is performed by the algorithm Memory_Retrieve as follows:

• Memory_Retrieve$(f, R_k)$:
1. $R_{k-i/o} \leftarrow f$
2. If $\eta(R_{k-i/o}, R_k)$ then $R_{k-i/o} \leftarrow \beta(R_{k-i/o}, R_k)$ else $R_{k-i/o} \leftarrow 0$

[Figure 1: Associative Memory Architecture]

The interpretation conventions state that the content of the associative register is interpreted as an abstract concept, and the content of the input auxiliary register is interpreted as a concrete concept –the concept of an individual object.
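In terms of the primitives sketched in Section 2, the three algorithms are one- or two-step table operations. The following illustrative wrappers –not the paper's code– keep the auxiliary register implicit by passing the cue's Boolean table directly:

```python
def memory_register(f_i, amr):
    """AMR <- lambda(AMR, f_i): abstract the instance into the register."""
    return abstraction(amr, f_i)

def memory_recognize(f, amr):
    """eta(f, AMR): accept iff f is a constituent function of the relation."""
    return containment(f, amr)

def memory_retrieve(f, amr):
    """beta(f, AMR): construct a new function out of the register,
    or None if the cue is rejected."""
    return reduction(f, amr)
```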
4. Architecture
Table Computing was used to implement a novel associative memory system. The conceptual architecture is illustrated in Figure 1. The main components are:

• A set of Associative Memory Registers (AMRs) of size $n \times m$, for $n, m \geq 1$, with their corresponding Auxiliary Registers (Aux. Reg.), each having a Control I/O unit.

• A central control unit sending the operation to be performed to all AMRs, and receiving the final status of the operation (i.e., whether it was successful or not) and the entropy of each AMR (not shown in the diagram).

• A bus with $n$ tracks, each representing a characteristic or feature, with its corresponding value: a binary number from 0 to $2^m - 1$.

• An input processing unit constituted by:
– An input modal pixel buffer with the concrete representation of images produced by the observations made by the computing agent directly. For instance, the input buffer can contain $k$ pixels with 256 gray levels represented by integers from 0 to 255.
– An analysis module mapping concrete representations to abstract modality-independent representations constituted by $n$ characteristics with their corresponding real values.
– A quantizing and sampling module mapping the real values of the $n$ characteristics into $2^m$ levels, represented by binary digits of length $m$, which are written on the corresponding tracks of the bus (a sketch of this step is given after this list).

• An output processing unit constituted by:
– A Digital/Real conversion module that maps binary numbers in a track of the bus to real numbers, for the $n$ tracks.
– A synthesis module mapping abstract modality-independent representations constituted by $n$ characteristics with their corresponding values into concrete representations with $k$ pixels with their values.
– An output modal buffer with the concrete representation of images produced by the synthesis module. The contents of the output buffer are rendered by an appropriate device, and constitute the actions of the system.

The bus holds the representations of functions with domain $A = \{a_1, ..., a_n\}$ and range $V = \{v_1, ..., v_n\}$ where $0 \leq v_j \leq 2^m - 1$. The input and output protocols are as follows:

• Input Protocol:
1. Sense the object $o_i$ and place it in the input modal buffer;
2. Produce its abstract representation $f_i$ through the analysis module;
3. Produce its corresponding quantized representation and write it on the bus;
4. Write the values of all $n$ arguments diagrammatically into the corresponding rows of the auxiliary register $R_{k-i/o}$.

• Output Protocol:
1. Write the content of the auxiliary register $R_{k-i/o}$ on the bus as binary numbers with $m$ digits for all $n$ arguments;
2. Transform the digital values of the $n$ characteristics into real values;
3. Generate the concrete representation of the object and place it on the output buffer through the synthesis module.

The core operations of the memory register, memory recognition and memory retrieve algorithms are carried out directly on the AMRs and their corresponding auxiliary registers in two or three computing steps –i.e., the operations $\lambda$, $\eta$ and $\beta$, in addition to the corresponding assignment operations $R_{k-i/o} \leftarrow f_i$, $R_{k-i/o} \leftarrow 0$ and $R_{k-i/o} \leftarrow 1$. At the system level the operations proceed as follows:

• Memory_Register$(f_i, R_k)$: The register $AMR_k$ is set on, and the remaining AMRs are set off; the input protocol is performed; the Memory_Register operation is performed.

• Memory_Recognize$(f_i, R_k)$: All AMRs are set on; the input protocol is performed; the Memory_Recognize operation is performed; all AMRs send their status and entropy to the control unit; if no AMR's recognition operation is successful the object is rejected.

• Memory_Retrieve$(f_i, R_k)$: All AMRs are set on; the input protocol is performed; the Memory_Retrieve operation is performed; all AMRs send their status and entropy to the control unit; all AMRs but the selected one are set off; the output protocol is executed; the recovered object is placed on the output buffer.
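As an illustration of the quantizing and sampling step referred to above, the sketch below maps the $n$ real-valued characteristics into $2^m$ levels and builds the diagrammatic one-mark-per-column table written into the auxiliary register. The feature range bounds lo and hi are assumptions, not given in the text:

```python
import numpy as np

def quantize(features, m, lo, hi):
    """Map n real features into integer levels 0..2**m - 1."""
    levels = 2 ** m
    scaled = (features - lo) / (hi - lo)                 # normalize to [0, 1]
    return np.clip((scaled * levels).astype(int), 0, levels - 1)

def to_table(level_per_argument, m):
    """Diagrammatic form: a 2**m x n Boolean table, one mark per column."""
    n = level_per_argument.shape[0]
    table = np.zeros((2 ** m, n), dtype=bool)
    table[level_per_argument, np.arange(n)] = True
    return table
```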
5. Analysis and Synthesis
The Analysis module maps the concrete information that is sensed from the environment and placed in the input buffer –where the characteristics stand for external signals– into the abstract representation that characterizes the membership of the object within a class. Both concrete and abstract representations are expressed as functions, but while in the former case the arguments have a spatial interpretation –such as the pixels of an image– in the latter the functions stand for modality-independent information.

The analysis module in the present architecture is constituted by a neural network with three convolutional layers [14]. The training phase was configured by adding a classifier, which was a fully connected neural network (FCNN) with two layers. The analysis module was trained in a standard supervised manner with back-propagation, as illustrated in Figure 2. Once the analysis module has been trained, the FCNN is removed and the objects in the input buffer can be mapped into their corresponding representations as sets of abstract features through a bottom-up "analysis operation". The information is fed into the AMRs through the output of the convolutional layer. The purpose of the analysis module is to map concrete images into abstract representations, not to perform classification.

[Figure 2: Training the Analysis Module]

The diagram shows the case in which the input has 784 inputs, corresponding to a pixel buffer of size 28 × 28, each taking one out of 256 gray levels, while its output is a function with 64 arguments with real values.

The objects recovered from the AMRs are mapped into the corresponding concrete representations and placed in the output buffer by the synthesis module. This consists of a transposed convolutional network that computes the inverse function of the input convolutional network. The two neural networks together constitute what is known as a convolutional autoencoder [15, 16]. If the function produced by the analysis module is input directly into the synthesis module, the image in the output buffer should be the same as the one originally placed in the input buffer. However, the transposed convolutional network was trained independently, so the recovered image is slightly different.
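A minimal TensorFlow/Keras sketch of such an analysis–synthesis pair is given below. Only the 28 × 28 input, the three convolutional layers and the 64-feature code are stated in the text; the filter counts, kernel sizes and strides used here are illustrative assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Analysis: three convolutional layers mapping 28x28 gray images to 64 features.
encoder = tf.keras.Sequential([
    layers.Input((28, 28, 1)),
    layers.Conv2D(32, 3, strides=2, padding="same", activation="relu"),
    layers.Conv2D(64, 3, strides=2, padding="same", activation="relu"),
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.Flatten(),
    layers.Dense(64),                     # the 64 abstract characteristics
])

# Synthesis: a transposed convolutional network approximating the inverse map.
decoder = tf.keras.Sequential([
    layers.Input((64,)),
    layers.Dense(7 * 7 * 64, activation="relu"),
    layers.Reshape((7, 7, 64)),
    layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu"),
    layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu"),
    layers.Conv2D(1, 3, padding="same", activation="sigmoid"),  # 28x28 image
])
```

During training, a two-layer fully connected classifier would be appended to the encoder, as described above, and removed afterwards; the decoder is trained separately on the encoder's features (see Section 7).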
6. A Visual Memory for Handwritten Digits
The associative memory system was simulated through the construction of a visual memory for storing distributed representations of handwritten digits from "0" to "9". The system was built and tested using the MNIST database (http://yann.lecun.com/exdb/mnist/). In this resource each digit is defined as a 28 × 28 pixel array with 256 gray levels. There are 70,000 instances available, and the instances of the ten digit types are mostly balanced. The corpus was divided into three disjoint sets:

• Training Corpus (TrainCorpus): for training the analysis and synthesis convolutional and transposed networks (57%).

• Remembered Corpus (RemCorpus): for filling in the Associative Memory Registers (33%).

• Test Corpus (TestCorpus): for testing (10%).

The corpus partitions were rotated through a standard 10-fold cross-validation procedure. Four experiments supported by given analysis and synthesis modules were performed:

1. Experiment 1: Define an associative memory system including an AMR for holding the distributed representation of each one of the ten digits. Determine the precision and recall of the individual AMRs and of the overall system. Identify the size of the AMRs with satisfactory performance.

2. Experiment 2: Investigate whether AMRs can hold distributed representations of more than one individual object. For this, an associative memory system including an AMR for holding the distributed representation of two "overlapped" digits is defined. Determine the precision and recall of the individual AMRs and of the overall system.

3. Experiment 3: Determine the overall precision and recall for different levels of entropy of the AMRs, for the AMR with the best performance identified in Experiment 1.

4. Experiment 4: Retrieve objects out of a cue for different levels of entropy and generate their corresponding images –with the same AMR used in Experiment 3. Assess the similarity between the cue and the recovered object at different levels of entropy.

In all four experiments each instance digit is mapped into a set of 64 features through the analysis module. Hence, each instance is represented as a function $f_i$ with domain $\{a_1, ..., a_{64}\}$ where each argument $a_i$ is mapped to a real value $v_j$ –i.e., $f_i(a_i) = v_j$. The values are quantized in $2^m$ levels, so the tables or associative registers have sizes of $64 \times 2^m$; the parameter $m$ determines the granularity of the table. Experiments 1 and 2 were performed with granularities $2^m$ for $0 \leq m \leq 9$, so the memory was tested with 10 granularities in each setting. The source code for replicating the experiments, including the detailed results and the specifications of the hardware used, is available on GitHub at https://github.com/LA-Pineda/Associative-Memory-Experiments.

6.1. Experiment 1

The purpose of this experiment was to compute the operational characteristics of AMRs of size $64 \times 2^m$ for $0 \leq m \leq 9$. The procedure is as follows:

1. Register all the instances of RemCorpus in their corresponding registers through the Memory_Register operation;
2. Test the recognition performance with all the instances of the test corpus through the Memory_Recognize operation;
3. Compute the average precision, recall and entropy of the individual memories;
4. Select a unique object to be recovered by the Memory_Retrieve operation; compute the average precision and recall of the integrated system when this choice has been made.

The average precision, recall and entropy of the ten AMRs are shown in Figure 3 (a). The precision for the smallest AMR, with only one row, is 10% –the proportion of the test data of each digit– and the recall is 100% –as all the information is confused and everything is accepted. The precision grows with the size of the AMRs and has a very satisfactory value up from 32 rows. The recall, on its part, remains very high until the granularity of the table is too fine, when it starts to decrease slightly. The entropy increases almost linearly with the AMR size, starting from 0, where the relations have only one value.

[Figure 3: Results of Experiment 1]

The average precision, recall and entropy of the integrated system are shown in Figure 3 (b). The precision has a similar pattern to the one above, but the recall lowers significantly in AMRs with a small $m$ –the precision and recall are practically the same for $m \leq 4$. The reason for this decrease is that when the size of the AMR is small, there is a large number of false positives, and several AMRs different from the right one may accept the object; however, one register must be selected for the memory retrieve operation, and there is no information to decide which one. This decision was made using the AMR with the minimal entropy, although this choice does not improve over a random choice using a normal distribution.

The average number of accepting AMRs for each instance per AMR size is shown in Figure 3 (c). As can be seen, this number goes from 10 for AMRs with one row to 1 for AMRs with 8 and 16 rows, where the precision is very high because every AMR recognizes only one instance on average. This effect is further illustrated in Figure 3 (d).
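The evaluation loop of Experiment 1 can be sketched as follows, reusing the helpers introduced earlier; the minimal-entropy tie-break among accepting registers follows the selection rule just described (corpus handling and the feature bounds lo and hi are simplified assumptions):

```python
import numpy as np

def experiment_1(rem_corpus, test_corpus, m, lo, hi):
    """rem_corpus, test_corpus: lists of (digit, features) pairs."""
    amrs = [np.zeros((2 ** m, 64), dtype=bool) for _ in range(10)]

    # Registration: abstract every remembered instance into its digit's AMR.
    for digit, features in rem_corpus:
        table = to_table(quantize(features, m, lo, hi), m)
        amrs[digit] = memory_register(table, amrs[digit])

    # Recognition: test each instance against every AMR, then select one
    # accepting register by minimal entropy.
    correct = total = 0
    for digit, features in test_corpus:
        table = to_table(quantize(features, m, lo, hi), m)
        accepting = [k for k in range(10) if memory_recognize(table, amrs[k])]
        if accepting:
            chosen = min(accepting, key=lambda k: entropy(amrs[k]))
            correct += int(chosen == digit)
        total += 1
    return correct / total
```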
6.2. Experiment 2

In this experiment each associative register holds the representation of two different digits: "0" and "1", "2" and "3", "4" and "5", "6" and "7", and "8" and "9". The procedure is analogous to that of Experiment 1. The results are shown in Figure 4. The results of both experiments are also analogous, with the only difference that the entropies of the AMRs holding two digits are larger than the entropies of the AMRs holding a single digit. This experiment shows that it is possible to create associative memories holding overlapped distributed representations of more than one individual object that have a satisfactory performance.

[Figure 4: Results of Experiment 2]
6.3. Experiment 3

The purpose of this experiment was to investigate the performance of an AMR with satisfactory operational characteristics in relation to its entropy or information content. Experiment 1 shows that AMRs of sizes 64 × 32 and 64 × 64 satisfy this requirement. As their performances are practically the same, the smaller one was chosen for basic economy criteria.

The AMRs were filled up with varying proportions of the RemCorpus –1%, 2%, 4%, 8%, 16%, 32%, 64% and 100%– as shown in Figure 5. The entropy increases according to the amount of remembered data, as expected. Precision is very high for very low entropy values and decreases slightly when the entropy is increased, although it remains very high when the whole of RemCorpus is considered. Recall, on its part, is very low for very low levels of entropy but increases very rapidly when the AMR is filled up with more data.

[Figure 5: Results of Experiment 3]
6.4. Experiment 4

The final experiment consists of assessing the similarity of the objects retrieved from the memory out of a cue. In the basic scenario, if the cue matches perfectly the recovered object, the image in the output buffer should be the same as the image placed in the input. However, memory retrieve is a constructive operation that renders a novel object which may be somewhat different from the cue. The object is constructed by the $\beta$ operation, as described above. In the present experiment a random triangular distribution is used for selecting the values of the arguments of the retrieved object out of the potential values of the AMR for the corresponding arguments (a sketch of this selection mechanism is given at the end of Section 7).

The hypothesis is that the increase of memory recall goes hand in hand with higher entropy, but the space of indeterminacy of the AMR impacts negatively on the resemblance or similarity between the cue and the retrieved object. Figure 5 suggests that this effect is significant only for memories with a low entropy.

[Figure 6: Similarity between the cue and the recovered digits as a function of the entropy]

The top rows of Figure 6 show the cue and the image decoded directly, which corresponds to selecting $\beta$ as the identity function –although such a choice would remove the constructive aspect of the memory retrieve operation, and memory recognition and memory retrieve would amount to the same operation. The decoded image is very similar to the cue, but it is not an exact copy. The synthesis module should compute the inverse function of the one computed by the analysis module, but the convolutional and the transposed networks are trained independently, and this is only an approximation.

The remaining images, from top to bottom, correspond to the retrieved objects for the nine levels of the RemCorpus that are considered (the codified image corresponds to $e = 0$). The rows for the ten digits suggest that the highest similarity is achieved when the entropy is very low.

The overall behavior of the system suggests that the AMRs are very tolerant for memory recognition, but very restrictive for retrieving objects. This is consistent with the general intuition that recognizing objects in memory is much easier than retrieving them.

7. Experimental Setting

The programming for all the experiments was carried out in Python 3.8 on the Anaconda distribution. The neural networks were implemented with TensorFlow 2.3.0, and most of the graphs were produced using Matplotlib. The experiments were run on an Alienware Aurora R5 with an Intel Core i7-6700 processor, 16 GBytes of RAM and an NVIDIA GeForce GTX 1080 graphics card. The images shown in Figure 6 were selected one column per experimental run, and the criterion for selection was, when possible, to have some resemblance to the corresponding digit in the first (1%) and last (100%) stages.

Regarding the neural networks, the classifier and the decoder were trained separately. Firstly, the classifier was trained (convolutional section plus fully connected section) using MNIST data. Secondly, the fully connected layers were removed from the classifier, and the features for all images in MNIST were generated and stored using the trained convolutional part only. Thirdly, the decoder (transposed convolutional network) was trained using the features as input data and the original images as labels. Finally, the trained decoder was used to generate the images from the features produced by the memory retrieve algorithm in Experiment 4.
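To make the constructive retrieval of Experiment 4 concrete, the uniform selection in the reduction sketch of Section 2 can be replaced by a triangular distribution whose mode is the cue's value for each argument. The exact parameters of the distribution are not given in the text, so the bounds used below (the smallest and largest marked values of each column) are assumptions:

```python
import numpy as np

rng = np.random.default_rng()

def reduction_triangular(f_a, r_f):
    """beta with a random triangular distribution centered on the cue f_a;
    assumes the cue is a total function (one mark per column)."""
    if not containment(f_a, r_f):
        return None
    f_v = np.zeros_like(r_f)
    for i in range(r_f.shape[1]):
        values = np.flatnonzero(r_f[:, i])          # candidate rows for a_i
        cue = int(np.flatnonzero(f_a[:, i])[0])     # the cue's value f_a(a_i)
        lo, hi = int(values.min()), int(values.max())
        if lo == hi:
            pick = lo                               # only one candidate
        else:
            draw = rng.triangular(lo, cue, hi)      # mode at the cue
            pick = int(values[np.argmin(np.abs(values - draw))])
        f_v[pick, i] = True
    return f_v
```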
8. Discussion
In this paper a memory system that is associative and distributed, but also declarative, is presented. Individual instances of represented objects are characterized at three different levels: i) as concrete modality-specific representations in the input and output buffers –which can be sensed or rendered directly– or, alternatively, as functions from pixels to values; ii) as abstract modality-independent representations in a space of characteristics, which are functions standing in a one-to-one relation to their corresponding concrete representations in the first level; and iii) as distributed representations holding the disjunctive abstraction of a number, possibly large, of instance objects expressed in the second level. The first level can be considered declarative and symbolic; the second is still declarative but is independent of representation, so it can hold and integrate representations of objects presented in different modalities; and the third is a sub-symbolic structure holding the abstraction of a set of objects of the second level.

The memory register and recognition operations use only logical disjunction and material implication, which are performed by direct manipulation, cell to cell in the tables, and information is taken from and placed on the bus by direct manipulation too, enhancing the declarative aspect of the system.

The associative property depends on the dual role played by the intermediate representations, which express content and at the same time select their corresponding Associative Memory Registers through the memory recognition and recovery operations. The memory register operation is analogous to the training phase of supervised machine learning, and it presupposes an attention mechanism that selects the AMR in which the information is input. Addressing this restriction is left for further work.

The analysis and synthesis modules mapping concrete into abstract representations and vice versa are serial computations from a conceptual perspective –although their internal micro-operations can be performed in parallel using GPUs– but the memory operations manipulate the symbols stored in the corresponding cells of the tables directly, taking very few computing steps, which can be performed in parallel if the appropriate hardware is made available. In the present architecture the memory operations involve the simultaneous activation of all the associative memory registers, and this parallelism takes place not only at the algorithmic and implementation levels but also at the computational or functional system level, in Marr's sense [17].

The analysis and synthesis mechanisms are implemented here through standard deep neural networks, but this is a contingency. From the conceptual perspective this functionality can be achieved with other modes of computing that map concrete representations into the abstract characteristics space by other means.

The functionality of the memory proper can also be distinguished from the input and output processes in terms of the indeterminacy of the computing objects. The analysis and synthesis modules compute a function whose domain and range are sets of functions, and these processes are fully determined: they always provide the same value for the same argument. Hence, these are zero-entropy computations. However, the distributed representations stored in the memory registers have a degree of indeterminacy, which is measured by the computing entropy.

The entropy is a parameter of the system performance, as can be seen in the four experiments.
First, it measures the operational range of the associative registers, as shown in Experiments 1 and 2. If the entropy is too low, precision and recall are low overall, but if it is too high, recall is also diminished. However, there is an entropy level in which both precision and recall are quite satisfactory. The experiments showed that memory registers of sizes 64 × 32 and 64 × 64, with entropy values of around 3 to 4, perform satisfactorily.

Second, the entropy reflects the amount of information held in a register, as shown in Experiment 3: recall is very low when the entropy is very low, and increases rapidly as more data is remembered. However, if the information is increased with noise, the entropy has a very large value, and although recognition recall will not decrease, the information is confused and recognition precision lowers significantly. Hence, there is again a range of entropy, not too low and not too high, in which the amount of information is rich and the memory is effective.

Experiment 4 asked how similar the objects recovered by the memory retrieval operation are to the cue or key used as the retrieval descriptor. The results show that high similarity is only achieved for very low levels of entropy. In the basic case, when the entropy is zero, the retrieved object is the same as the cue, and memory recognition and memory retrieval are not distinguished. This corresponds to the Random Access Memories (RAM) of standard digital computers, where the content of a RAM register is "copied" but not really extracted or recovered in a memory read operation.

Natural memories are constructive in the sense that the memory retrieve operation renders a genuinely novel object. This is the reason for defining the $\beta$ operator using a random distribution. Whenever the cue is accepted, the retrieval operation selects an object whose representation is within the relation's constituent functions. The retrieved object may or may not have been registered explicitly, but it is always the product of a construction operation.

However, the similarity experiment showed that high resemblance between the cue and the recovered object is only achieved when the entropy has very low values. If the entropy is zero, the retrieved object is a "photographic copy" –Figure 6 shows some distortions, but these are due to the disparity between the analysis and synthesis modules. The reconstructions resemble the cue well for entropies around 2 –using only 1% or 2% of the remembered corpus– but from then on the similarity is quite random. This result suggests that although there is flexibility in memory recognition, memory retrieval is quite constrained, and hence a much harder operation. The entropy range also suggests that retrieval goes from "photographic" to "recovered" to "imaged" objects to noise. Once again, operational memories have an entropy range in which the entropy is not too low and not too high. This pattern seems to be very general and is referred to as the Entropy Trade-off [11].

The study of memory mechanisms for storing the representations of individual objects is central to cognition and to computer applications, such as information systems and robotics. The sense data may be presented to cognition in a natural format in which spatial and temporal information may be accessed directly, but may be stored as a highly abstract modality-independent representation. Such representations may be retrieved directly by perception in the production of interpretations, by thought in decision making and planning, and by the motor module when motor abilities are deployed.

Associative memory mechanisms should support long-term memory, both episodic and semantic, and may be accessed on demand by working memory. Such devices may be used in the construction of episodic memories and composite concepts that may be stored in associative memory registers themselves, or in higher-level structures that rely on basic memory units. Associative memory models may be essential for the construction of lexicons, encyclopedic memories, and modality-specific memories, such as faces, prototypical shapes or voices, both in cognitive studies and in applications.

The present investigation addressed the design and construction of the full associative memory system for a simple domain, and the current result can be seen as a proof of concept. We leave the investigation of larger and more realistic domains for further work.

The present investigation also addressed the case in which the images were complete objects; however, this is only the basic condition, as interpretation is often performed in noisy environments and with incomplete information. For instance, the objects may be partially covered and/or seen from different perspectives.
The cues or descriptors generated in such conditions would carry much poorer information, memory recognition would be harder, and the objects recovered from memory would involve larger reconstructions than in the present case. The investigation of the associative memory system with incomplete information is also left for further research.

Acknowledgments

We thank Iván Torres and Raúl Peralta for their help in the implementation of preliminary recognition experiments. The first author also acknowledges the partial support of grant PAPIIT-UNAM IN112819, México.
References

[1] J. R. Anderson, G. H. Bower, Human Associative Memory: A Brief Edition, Lawrence Erlbaum Associates, Hillsdale, New Jersey, 1980.

[2] M. R. Quillian, Semantic memory, in: M. Minsky (Ed.), Semantic Information Processing, MIT Press, 1968, pp. 227–270.

[3] J. R. Anderson, D. Bothell, M. D. Byrne, S. Douglass, C. Lebiere, Y. Qin, An integrated theory of the mind, Psychological Review 111 (2004) 1036–1060.

[4] D. E. Rumelhart, J. L. McClelland, the PDP Research Group, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations, The MIT Press, Cambridge, Mass., 1986.

[5] J. J. Hopfield, Neural networks and physical systems with emergent collective computational abilities, Proceedings of the National Academy of Sciences of the USA 79 (8) (1982) 2554–2558.

[6] B. Kosko, Bidirectional associative memories, IEEE Transactions on Systems, Man, and Cybernetics 18 (1) (1988) 49–60.

[7] Q. Ma, H. Isahara, Semantic networks represented by adaptive associative memories, Neurocomputing 34 (2000) 207–225.

[8] A. Graves, G. Wayne, I. Danihelka, Neural Turing machines, CoRR abs/1410.5401. arXiv:1410.5401. URL http://arxiv.org/abs/1410.5401

[9] A. Graves, G. Wayne, et al., Hybrid computing using a neural network with dynamic external memory, Nature 538 (2016) 471–476. doi:10.1038/nature20101.

[10] J. A. Fodor, Z. W. Pylyshyn, Connectionism and cognitive architecture: A critical analysis, Cognition 28 (1–2) (1988) 3–71.

[11] L. A. Pineda, Entropy, computing and rationality (2020). arXiv:2009.10224.

[12] L. A. Pineda, The mode of computing, CoRR abs/1903.10559. arXiv:1903.10559. URL http://arxiv.org/abs/1903.10559

[13] L. A. Pineda, Racionalidad Computacional, Academia Mexicana de Computación, A. C., Ciudad de México, 2020, to be published.

[14] Y. LeCun, Y. Bengio, G. Hinton, Deep learning, Nature 521 (2015) 436–444. doi:10.1038/nature14539.

[15] G. E. Hinton, R. R. Salakhutdinov, Reducing the dimensionality of data with neural networks, Science 313 (5786) (2006) 504–507. doi:10.1126/science.1125249.

[16] J. Masci, U. Meier, D. Cireşan, J. Schmidhuber, Stacked convolutional auto-encoders for hierarchical feature extraction, in: T. Honkela, W. Duch, M. Girolami, S. Kaski (Eds.), Artificial Neural Networks and Machine Learning – ICANN 2011, Vol. 6791 of Lecture Notes in Computer Science, Springer, 2011, pp. 52–59. doi:10.1007/978-3-642-21735-7_7.

[17] D. Marr, Vision: A Computational Investigation into the Human Representation and Processing of Visual Information, W. H. Freeman, San Francisco, 1982.