Feature extraction without learning in an analog Spatial Pooler memristive-CMOS circuit design of Hierarchical Temporal Memory
Olga Krestinskaya · Alex Pappachen James
Abstract
Hierarchical Temporal Memory (HTM) is a neuromorphic algorithm that emulates sparsity, hierarchy and modularity resembling the working principles of the neocortex. Feature encoding is an important step in creating sparse binary patterns. This sparsity is introduced by the binary weights and the random weight assignment in the initialization stage of the HTM. We propose an alternative deterministic method for the HTM initialization stage, which connects the HTM weights to the input data and preserves the natural sparsity of the input information. Further, we introduce a hardware implementation of the deterministic approach and compare it to the traditional HTM and an existing hardware implementation. We test the proposed approach on a face recognition problem and show that it outperforms the conventional HTM approach.
Keywords
Hierarchical Temporal Memory · Memristors · Spatial Pooler · Rule-based approach · Analog circuits
1 Introduction

Hierarchical Temporal Memory (HTM) is a neuromorphic algorithm that emulates the structure and functionality of cortical neural networks [11]. HTM can serve as a tool for intelligent data processing in edge computing devices. The increase in the number of edge computing devices and Internet of Things (IoT) applications in recent years has led to a demand for on-sensor processing using analog hardware.
Therefore, the translation of the HTM algorithm into analog hardware can produce a promising solution to the computational speed problem [2,7,12].

HTM is divided into two parts: the HTM Spatial Pooler (HTM SP) and the HTM Temporal Memory (HTM TM). The HTM SP has proven useful for visual data processing and classification problems, whereas the HTM TM is used as a prediction and learning algorithm. In this work, we focus on the SP part of HTM. The main functionality of the HTM SP is to form a sparse distributed pattern from the input data and perform feature encoding. Recent works show that it is useful for feature extraction and pattern recognition problems [12].

In this work, we investigate the initialization stage of the HTM SP and propose a rule-based deterministic approach instead of the random weight approach for the initial weight assignment. The main purpose of the rule-based approach is to connect the input to the HTM weights, which allows the natural sparsity and structural information of the inputs to be preserved. Moreover, we propose a hardware implementation of the rule-based approach and compare it with the conventional random weight approach in terms of power dissipation and on-chip area requirements. Also, we test the system-level implementation of the proposed approach on a face recognition problem and show improvements in the recognition accuracy [10,12].

This paper is organized into 8 sections. Section 2 provides an overview of the HTM algorithm and introduces the mathematical framework of HTM. Section 3 illustrates the difference between the conventional approach and the proposed rule-based approach. Section 4 discusses the hardware implementation of the HTM SP and illustrates the proposed hardware architecture. Section 5 shows how the system-level HTM SP algorithm can be used for the face recognition problem. Section 6 shows the results of the system-level and analog hardware implementations. Section 7 provides a discussion of the proposed rule-based method and the corresponding analog hardware. Section 8 concludes the paper.
Fig. 1
The hierarchical structure of HTM. The example shows a 3-level HTM, where each level consists of regions with columns, and columns are built from cells. HTM is connected to the input space by synapses.
If a connected synapse is connected to an active input, it is considered active connected. In the overlap phase, the number of active connected synapses is computed. In the inhibition stage, the k columns with the highest overlap values become active (assigned high, 1), and the other columns are inhibited (assigned low, 0). In the learning phase, the HTM SP weights of the synapses are updated based on Hebb's learning rules. After the update process, all phases except the initialization phase are repeated.

2.2 Mathematical framework of the HTM SP

The arrangement of the HTM SP input space and the output space arranged into mini-columns is shown in Fig. 2. The parameter x_j denotes the j-th input neuron in the input space, and y_i refers to the i-th output SP mini-column in the output space, which is connected to a region of the input space containing its potential connections.

The synapses of the i-th SP mini-column are located in a hypercube of the input space centered at x_ci with edge length γ. The potential connections are defined in Eq. 1, where ı(x_j; x_ci, γ) = 1, ∀ x_j ∈ (x_ci, γ), and z_ij ∼ U(0, 1) is selected randomly following a uniform distribution (U has the range [0, 1]); ρ denotes the assigned percentage of inputs that are considered to be potential connections within the hypercube of the input space.

Fig. 2 The arrangement of the HTM SP input space and the output space, containing the mini-columns.
PI(i) = { j | ı(x_j; x_ci, γ) and (z_ij < ρ) }    (1)

For all synapses, a synaptic permanence value (weight) is assigned. The synaptic permanence from the j-th input to the i-th SP mini-column is represented by the matrix S_ij ∈ [0, 1] shown in Eq. 2. If the synapse is located within the region of potential inputs, the synaptic permanence value S_ij is drawn from a uniform random distribution between 0 and 1; otherwise, the synaptic permanence is 0, and the synapse is not connected.

S_ij = U(0, 1) if j ∈ PI(i), 0 otherwise    (2)

All connected synapses are represented by the binary matrix B shown in Eq. 3. Based on the synaptic permanence value, a synapse is either connected or not connected. If S_ij is greater than the threshold value θ_c, the synapse is connected and B_ij = 1, and vice versa. The threshold θ_c determines the percentage of connected synapses.

B_ij = 1 if S_ij ≥ θ_c, 0 otherwise    (3)

In Eq. 4, the inhibition neighborhood N_i of the i-th SP mini-column is determined. The parameter ‖y_i − y_j‖ refers to the Euclidean distance between the mini-columns i and j, and the parameter φ controls the inhibition radius.

N_i = { j | ‖y_i − y_j‖ < φ, i ≠ j }    (4)

In the overlap phase of the HTM SP, the activation of the SP mini-columns for a particular input pattern Z is determined. The input overlap calculation is shown in Eq. 5, where β_i is a boosting factor that refers to the excitability of the SP mini-column.

o_i = β_i Σ_j B_ij Z_j    (5)

In the inhibition phase, the activation of the SP mini-columns occurs. The activation depends on two conditions: the input overlap of the SP mini-column should be above the threshold θ_s and within the top s percent of the other SP mini-columns in the inhibition neighborhood. The selection of the active columns is shown in Eq. 6, where α_i is the activity of the SP mini-column, prctile is a percentile function, and NO(i) = { o_j | j ∈ N(i) } with the target activation density s. The activation of the columns is implemented according to the k-winners-take-all rule considering all mini-columns in the particular neighborhood.

α_i = 1, if (o_i ≥ prctile(NO(i), 1 − s)) and (o_i ≥ θ_s)    (6)

In the original HTM algorithm, the parameter k can be changed based on the desired number of winning columns for a particular application [8].
However, in most of the existing hardware implementations of the HTM SP, k = 1 due to the limitations of the Winner-Takes-All (WTA) circuits [7].

In the learning phase of the HTM SP, feed-forward connections are learned using Hebb's learning rule, and the boosting factor is updated. Hebb's rule for connection learning implies that the permanence value of a connection is either increased or decreased by the value ρ. The update process of the boosting factor is performed considering the time-averaged activity of the SP mini-columns ᾱ_i(t) and the recent activity of the SP mini-columns ⟨ᾱ_i(t)⟩ [8]. Eq. 7 shows the calculation of the time-averaged activity of the SP mini-columns at time t, where T is the number of considered previous inputs, and α_i(t) is the current activity of the i-th mini-column.

ᾱ_i(t) = ((T − 1) · ᾱ_i(t − 1) + α_i(t)) / T    (7)

Equation 8 shows the calculation of the recent activity.

⟨ᾱ_i(t)⟩ = (1 / |N(i)|) Σ_{j ∈ N(i)} ᾱ_j(t)    (8)
Equation 9 refers to the update process of the boosting factor, where η controls the adaptation of the HTM SP.

β_i(t) = exp(−η(ᾱ_i(t) − ⟨ᾱ_i(t)⟩))    (9)

To improve the initialization phase of the HTM SP, we propose a rule-based approach for the weight assignment instead of the uniform weight distribution. In the rule-based approach, we establish a connection between the input space and the synaptic permanence values (the weights of the synapses). Eq. 10 shows how the synaptic permanence weights are assigned in the rule-based approach; it is used instead of Eq. 2 and Eq. 3.

S_ij = 1 if j ∈ PI(i) and x_j > mean(x_PI(i)), 0 otherwise    (10)

In the rule-based approach, the synaptic permanence value is assigned based on the mean value of the inputs within the input space region with the potential connections. If the input is greater than the mean of the inputs within this neighborhood, the synaptic permanence S_ij is 1; otherwise, S_ij = 0.

In this work, we focus on the first three phases of the HTM SP: initialization, overlap and inhibition. Algorithm 1 summarizes the proposed approach: lines 1-15 represent the HTM SP initialization stage, lines 16-19 refer to the overlap stage, and lines 20-24 correspond to the inhibition stage of the HTM SP.

Algorithm 1 The HTM SP algorithm
1: ▷ HTM SP initialization
2: Define the size of the input neighborhood with potential connections, x_ci, γ, ρ, η, θ_c, the size of the local inhibition region, θ_s
3: Determine φ by multiplying the average number of connected input spans of all the SP mini-columns by the number of mini-columns per input
4: z_ij ∼ U(0, 1)
5: if ∀ x_j ∈ (x_ci, γ) then
6:   ı(x_j; x_ci, γ) = 1
7: for ı(x_j; x_ci, γ) and (z_ij < ρ) do
8:   PI(i) = j
9: if j ∈ PI(i) and x_j > mean(x_PI(i)) then
10:   S_ij = 1
11: else
12:   S_ij = 0
13: B_ij = S_ij
14: for ‖y_i − y_j‖ < φ, i ≠ j do
15:   N_i = j
16: ▷ HTM SP overlap
17: o_i = β_i Σ_j B_ij Z_j
18: for j ∈ N(i) do
19:   NO(i) = o_j
20: ▷ HTM SP inhibition
21: if (o_i ≥ prctile(NO(i), 1 − s)) and (o_i ≥ θ_s) then
22:   α_i = 1
23: else
24:   α_i = 0

In the original HTM SP, the winning columns are selected based on the k largest overlap values. In the modified version of the HTM SP, the selection of the winning columns occurs based on the mean value of the overlap in the inhibition region. If the overlap value of a column is greater than the mean value of the overlaps in the inhibition region, the column is activated; otherwise, it is inhibited. The modified approach for the inhibition region is represented in Eq. 11, which is used instead of Eq. 6.

α_i = 1 if o_i ≥ mean({o_j | j ∈ N(i)}), 0 otherwise    (11)

As it is proven in [12] that the modified HTM approach results in higher accuracy and reduced on-chip area and power consumption, in this work we focus on the modified HTM algorithm and check the effect of the rule-based initialization approach on the modified HTM hardware implementation. The overall architecture of the modified HTM is illustrated in Fig. 3. The receptor blocks correspond to the initialization and overlap calculation phases of the HTM SP, and the inhibition block refers to the HTM SP inhibition phase.

Fig. 3 The HTM SP structure containing the receptor block and the inhibition block [12].

The inhibition phase consists of the memristive threshold calculation block and the threshold comparison block. In the threshold calculation block, the threshold value is determined as the mean of all input overlap values, which corresponds to the modified HTM SP approach. The values of the memristors in the threshold calculation block are the same. The threshold comparison block consists of a set of comparators and inverters. Each overlap voltage, corresponding to a particular connection in the inhibition block, is followed by a single comparator and an inverter. The comparator is based on a low-voltage amplifier with 6 transistors and a current source. If the overlap value of a single column V_RBj is greater than the overall mean of all overlaps V_AVG, the output of the comparator is low, and vice versa. To invert the output of the comparator and normalize it to a certain level, a CMOS inverter is applied. The output of the inverter is the output of the inhibition block for the particular column, which shows whether the column is activated or inhibited.

4.2 Random weight approach implementation

The difference between the traditional random weight approach and the rule-based approach occurs in the receptor block of the hardware implementation of the HTM SP. The implementation of the traditional approach is illustrated in Fig. 4.

The receptor block structure for the conventional HTM SP approach consists of the randomized weight synapses and the receptor block mean calculator. The randomization of the weights of the synapses refers to the initialization stage, where the weights are completely randomized. This is implemented with a set of memristive mean circuits, where the resistances of the memristors are assigned randomly. Separate sets of memristors in the block of random weight synapses refer to several random iterations to ensure that the weights are completely randomized. The receptor block mean circuit performs the summation of all the columns for the overlap calculation.
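The algorithmic difference between the two initialization schemes can be seen in a few lines of NumPy. This is a software model of Eq. 10 and Eq. 11, not of the memristive circuits; the input vector and the potential-connection masks are illustrative.

```python
import numpy as np

def rule_based_init(x, potential):
    """Eq. 10: S_ij = 1 when input j is a potential input of mini-column i
    and exceeds the mean of the inputs in that column's potential region."""
    S = np.zeros(potential.shape, dtype=int)
    for i in range(potential.shape[0]):
        pi = potential[i]                     # boolean mask for PI(i)
        if pi.any():
            S[i, pi] = (x[pi] > x[pi].mean()).astype(int)
    return S                                  # already binary, so B = S

def mean_inhibition(o):
    """Eq. 11: a column is active when its overlap is at least the mean
    overlap of its inhibition neighborhood (here: all columns)."""
    return (o >= o.mean()).astype(int)

# Illustrative 6-pixel input and two mini-columns with disjoint potential regions.
x = np.array([0.1, 0.9, 0.4, 0.8, 0.2, 0.7])
potential = np.array([[1, 1, 1, 0, 0, 0],
                      [0, 0, 0, 1, 1, 1]], dtype=bool)
S = rule_based_init(x, potential)   # deterministic: no U(0, 1) draw, no theta_c
```

Unlike the random weight approach, no uniform draw and no permanence threshold θ_c are needed: the weights are fully determined by the data, which is what removes the memristor programming step in hardware.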
Fig. 4
The HTM SP receptor block structure for the random weight approach [12].
Fig. 5
The HTM SP receptor block structure for the rule-based approach.
The parameter V_RBj corresponds to the final overlap of the particular column. The traditional summation of the overlap values in the HTM SP algorithm is replaced with a mean calculation in hardware, which does not have any impact on the performance of the modified HTM SP. The resistances of the memristors in the receptor block mean circuit are the same.

4.3 Rule-based approach implementation

In this work, we propose an analog hardware implementation of the rule-based approach for the HTM SP. While the traditional hardware implementation of the HTM SP is based on memristive circuits, the rule-based approach is based on CMOS circuits. The proposed receptor block is shown in Fig. 5.

In the proposed architecture, the memristive mean calculation block and the CMOS comparator circuit correspond to the initialization phase of the rule-based HTM SP approach. The memristive mean calculation block calculates the mean of the inputs from the set of potential inputs. The voltage V_mean serves as the threshold for assigning the potential inputs as connected or disconnected. The CMOS comparator circuit compares the input of the particular column with the mean of all columns from the potential inputs. If the input is greater than the threshold, the output of the comparator V_comp is low, and vice versa. The CMOS analog switch block refers to the implementation of the overlap stage of the HTM SP. The voltage V_out = V_in if the comparator output V_comp is low, which corresponds to the case when the column is connected. The voltage V_out = 0 when V_comp is high, which means that the column is disconnected. The voltage V_out refers to the overlap value of the column.

In this work, we apply the HTM SP with two different initialization stage approaches to the face recognition problem. The overall system implementation of the face recognition module with the HTM SP is illustrated in Fig. 6. The input RGB images are read by the image sensor and applied to the input data controller. In this stage, the sampling process occurs if required, and the sampled images are preprocessed. In this method, we use only RGB to gray-scale conversion as a preprocessing step. In the existing HTM SP face and speech recognition systems [12], a standard deviation filter is applied in the preprocessing stage to improve the recognition process. However, in this work, we show the effect of the different approaches for the initialization stage; therefore, we remove the filtering stage to obtain the actual results from the HTM SP.

After the controller, the image is applied to the HTM SP stage, which performs the encoding of the image and outputs a sparse binary image with the important image features preserved. The output data controller controls where the images are directed in the training and testing stages.
In the training stage, the output from the HTM SP is stored in the training template storage. The training continues until all image class templates are stored. In the testing stage, the output data controller directs the images to the comparison circuit. The comparison circuit can be implemented as a memristive pattern matcher, which compares all templates from the training template storage with the current input image. Finally, the image class is determined.

The algorithmic implementation of the face recognition system is shown in Algorithm 2 in the Appendix.
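A minimal software model of this training/testing flow is sketched below. The function `htm_sp_encode` is a stand-in for the HTM SP encoding stage (a block-wise, mean-based binarization, in the spirit of the rule-based approach), and the Hamming-distance matcher stands in for the memristive pattern matcher; a real system would store several templates per class.

```python
import numpy as np

def htm_sp_encode(img, block=2):
    """Stand-in for the HTM SP stage: binarize each block against its mean,
    producing a sparse binary pattern (block size is illustrative)."""
    h, w = img.shape
    out = np.zeros_like(img, dtype=int)
    for r in range(0, h, block):
        for c in range(0, w, block):
            patch = img[r:r + block, c:c + block]
            out[r:r + block, c:c + block] = (patch > patch.mean()).astype(int)
    return out

class TemplateMatcher:
    """Training stores one encoded template per class; testing picks the
    class whose template has the smallest Hamming distance to the input."""
    def __init__(self):
        self.templates = {}

    def train(self, label, img):
        self.templates[label] = htm_sp_encode(img)

    def classify(self, img):
        code = htm_sp_encode(img)
        return min(self.templates,
                   key=lambda lbl: np.sum(self.templates[lbl] != code))
```

The training loop simply calls `train` once per class template; `classify` then implements the comparison-circuit role of the block diagram in Fig. 6.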
Table 1
The average and maximum recognition accuracies for different databases for the traditional random weight and proposed rule-based approaches.
Dataset   Random weight approach           Rule-based approach
          mean acc.      max. acc.         mean acc.      max. acc.
AR        47.992 %       59.231 %          82.855 %       83.231 %
YALE      91.852 %       98.667 %          85.538 %       86.308 %
ORL       49.056 %       69.000 %          85.5389 %      86.308 %
Fig. 6
The overall system implementation of the face recognition module with the HTM SP.
Fig. 7
Simulation results for the random weight approach: (a) input image, (b) grayscale image, (c) binary weights, (d) HTM SP overlap output and (e) HTM SP inhibition output.
Fig. 8
Simulation results for the rule-based approach with 2 inputs in the receptor region: (a) input image, (b) grayscale image, (c) binary weights, (d) HTM SP overlap output and (e) HTM SP inhibition output.
Table 2
Comparison of the random weight and rule-based approaches in terms of the on-chip area and maximum power consumption of a single receptor block.
Approach                  On-chip area   Power dissipation
Random weight approach    0. µm          . pW
Rule-based approach       13. µm         135 µW

Fig. 10(d) illustrates the final output of a single receptor block.

Table 2 compares the on-chip area and power dissipation of the random weight and rule-based approaches.

As illustrated in Section 6, the proposed rule-based approach outperforms the traditional HTM random weight approach. This can be explained by the fact
Fig. 9
Simulation results for the face recognition for the two methods for different databases: (a) AR, (b) ORL and (c) YALE.

that the rule-based approach draws a correlation between the HTM SP weights and the input space. The main goal of the HTM SP is to create an SDR from the input. However, facial images contain natural sparseness. The rule-based approach ensures the preservation of this natural sparseness of the images, which results in an increase in the recognition accuracy. In addition, it allows the structural information of the images, such as edges, to be preserved.

The hardware implementation of the rule-based approach requires a larger on-chip area and power consumption compared to the traditional random weight method.
Fig. 10
Timing diagram for the proposed receptor block for the rule-based HTM approach: (a) inputs from the neighborhood, (b) main input and mean of the inputs, (c) comparator output and (d) receptor block output.
However, to achieve high recognition accuracy, the rule-based approach does not require the image filtering stage, which is otherwise performed in separate software. Moreover, the rule-based approach does not require programming the memristors to random weights, which would otherwise be achieved by combining software-based or mixed-signal random number generation approaches. The programming of the memristors requires additional time and reduces the processing speed. Also, the high accuracy of the rule-based approach allows the learning phase to be removed from the HTM SP; the learning phase would otherwise be implemented using digital or analog circuits and would require a significant amount of extra power and on-chip area [12].
In this paper, the hardware implementation of a rule-based approach for the initialization phase of the HTM SP has been proposed. The proposed rule-based approach achieves a significant increase in recognition accuracy. The maximum accuracy is approximately 86%, which is equivalent to the processing of the HTM SP with the learning phase. The on-chip area and power requirements to implement the rule-based initialization phase of the HTM SP are 13. µm and 135 µW for a single receptor block, respectively.

Appendix
In Algorithm 2, line 2 refers to the preprocessing stage, lines 3-17 refer to the HTM SP processing, lines 18-19 correspond to the training phase, and lines 20-22 show the testing (recognition) phase.
Algorithm 2 System level implementation of HTM
1: Define neighborhood N
2: x = grayscale(x) ▷ PRE-PROCESSING
3: for p inhibition regions do ▷ HTM SP
4:   for k image blocks do
5:     for all i ∈ W do
6:       if x(i) ≥ mean(x(i) ∈ N) then
7:         W(i) = 1
8:       else
9:         W(i) = 0
10:   image.block(j) = W(j) × image.block(j)
11:   threshold.block = mean(y.image.blocks)
12:   for y image blocks do
13:     if image.block(y) > threshold.block then
14:       inhibition.region(y) = 1
15:     else
16:       inhibition.region(y) = 0
17:   x(p) = inhibition.region(p)
18: if training phase then
19:   Store image to the training template storage
20: else if testing phase then
21:   Compare image to all stored templates
22:   Determine image class
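The HTM SP stage of Algorithm 2 (lines 3-17) can be sketched in NumPy as follows. The block size and the example image are illustrative; the per-block score, the sum of the mean-binarized pixels multiplied by the image block, stands in for the block overlap of lines 10-11.

```python
import numpy as np

def sp_block_inhibition(img, block=2):
    """Sketch of Algorithm 2's HTM SP stage: binarize pixels against the
    block mean (lines 5-9), score each block (lines 10-11), then keep only
    blocks whose score exceeds the mean block score (lines 12-17)."""
    h, w = img.shape
    W = np.zeros_like(img, dtype=int)
    scores = {}
    for r in range(0, h, block):
        for c in range(0, w, block):
            patch = img[r:r + block, c:c + block]
            Wb = (patch >= patch.mean()).astype(int)   # lines 5-9
            W[r:r + block, c:c + block] = Wb
            scores[(r, c)] = (Wb * patch).sum()        # image.block = W x block
    thr = np.mean(list(scores.values()))               # threshold.block
    out = np.zeros_like(W)
    for (r, c), s in scores.items():
        if s > thr:                                    # inhibition.region = 1
            out[r:r + block, c:c + block] = W[r:r + block, c:c + block]
    return out
```

Blocks whose score falls below the mean are zeroed out, which is the block-level analogue of the mean-based column inhibition of Eq. 11.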
References
1. Ahdid, R., Safi, S., Manaut, B.: Approach of facial surfaces by contour. In: Multimedia Computing and Systems (ICMCS), 2014 International Conference on, pp. 465–468 (2014). DOI 10.1109/ICMCS.2014.6911284
2. Csapo, A.B., Baranyi, P., Tikk, D.: Object categorization using VFA-generated nodemaps and hierarchical temporal memories. In: Computational Cybernetics, 2007. ICCC 2007. IEEE International Conference on, pp. 257–262. IEEE (2007)
3. Cui, Y., Ahmad, S., Hawkins, J.: The HTM spatial pooler: a neocortical algorithm for online sparse distributed coding. bioRxiv p. 085035 (2017)
4. Fan, D., Sharad, M., Sengupta, A., Roy, K.: Hierarchical temporal memory based on spin-neurons and resistive memory for energy-efficient brain-inspired computing. IEEE Transactions on Neural Networks and Learning Systems (9), 1907–1919 (2016)
5. George, D., Hawkins, J.: A hierarchical Bayesian model of invariant pattern recognition in the visual cortex. In: Neural Networks, 2005. IJCNN'05. Proceedings. 2005 IEEE International Joint Conference on, vol. 3, pp. 1812–1817. IEEE (2005)
6. Hawkins, J., Blakeslee, S.: On Intelligence. St. Martin's Griffin, New York (2004), pp. 156–8
7. Ibrayev, T., James, A.P., Merkel, C., Kudithipudi, D.: A design of HTM spatial pooler for face recognition using memristor-CMOS hybrid circuits. In: 2016 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1254–1257 (2016)
8. Numenta Inc.: Hierarchical temporal memory including HTM cortical learning algorithms. Tech. rep. (2006)
9. Irmanova, A., James, A.P.: HTM sequence memory for language processing. In: Poster session presented at IEEE International Conference on Rebooting Computing (ICRC 2017) (2017)
10. James, A., Irmanova, A., Ibrayev, T.: Design of discrete-level memristive circuits for hierarchical temporal memory based spatio-temporal data classification system. IET Cyber-Physical Systems: Theory & Applications (2017)
11. James, A.P., Fedorova, I., Ibrayev, T., Kudithipudi, D.: HTM spatial pooler with memristor crossbar circuits for sparse biometric recognition. IEEE Transactions on Biomedical Circuits and Systems (2017)
12. Krestinskaya, O., Ibrayev, T., James, A.P.: Hierarchical temporal memory features with memristor logic circuits for pattern recognition. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems PP(99), 1–1 (2017). DOI 10.1109/TCAD.2017.2748024
13. Martínez, A., Benavente, R.: The AR face database. Rapport technique 24