Neural Storage: A New Paradigm of Elastic Memory
Prabuddha Chakraborty and Swarup Bhunia
Department of Electrical & Computer Engineering, University of Florida, Gainesville, FL, USA
Abstract—Storage and retrieval of data in a computer memory plays a major role in system performance. Traditionally, computer memory organization is 'static' – i.e., it does not change based on the application-specific characteristics in memory access behaviour during system operation. Specifically, the association of a data block with a search pattern (or cues) as well as the granularity of a stored data block do not evolve. Such a static nature of computer memory, we observe, not only limits the amount of data we can store in a given physical storage, but it also misses the opportunity for dramatic performance improvement in various applications. On the contrary, human memory is characterized by seemingly infinite plasticity in storing and retrieving data – as well as dynamically creating/updating the associations between data and corresponding cues. In this paper, we introduce Neural Storage (NS), a brain-inspired learning memory paradigm that organizes the memory as a flexible neural memory network. In NS, the network structure, strength of associations, and granularity of the data adjust continuously during system operation, providing unprecedented plasticity and performance benefits. We present the associated storage/retrieval/retention algorithms in NS, which integrate a formalized learning process. Using a full-blown operational model, we demonstrate that NS achieves an order of magnitude improvement in memory access performance for two representative applications when compared to traditional content-based memory.
I. Introduction
Digital memory is an integral part of a computer system. It plays a major role in defining system performance. Memory access behaviour largely depends on the nature of the incoming data and the specific information-processing tasks that operate on the data. Applications ranging from wildlife surveillance [2] to infrastructure damage monitoring [3], [4] that collect, store and analyze data often exhibit distinct memory access (e.g., storage and retrieval of specific data blocks) behaviour. Even within the same application, such behaviour may change with time. Hence, these systems with variable and constantly evolving memory access patterns can benefit from a memory organization that can dynamically tailor itself to meet the requirements. Furthermore, many applications deal with multi-modal data (e.g., image and sound) [5], [6], and in such applications, the data storage/access requires special considerations in terms of temporal importance and inter-modality relations. A data storage framework which can efficiently store and retrieve multi-modal data is crucial for these applications.

Fig. 1: Neural Storage: A new paradigm of elastic memory.

Many computing systems, specifically the emergent internet of things (IoT) edge devices, come with tight constraints on memory storage capacity, energy and communication bandwidth [7]. These systems often deal with a huge influx of data with varying degrees of relevance to the application. Hence, storing and transmitting less useful data at higher quality may not be optimal. Due to these requirements, it is important for a memory framework to be efficient in terms of energy, space and transmission bandwidth utilization by focusing on what is important for the specific application.

Based on these observations, an ideal data storage framework for these applications should be:
• Flexible and dynamic in nature to accommodate the constantly evolving application requirements and scenarios.
• Able to emulate a virtually infinite memory that can deal with a huge influx of sensor data, which is common in many IoT applications.
• Able to efficiently handle multi-modal data in the context of the application-specific requirements.
• Geared towards increasing storage, transmission and energy utilization efficiency.

Traditional memory frameworks [1] (both address-operated and content-operated) are not ideal for meeting these requirements due to the lack of flexibility in their memory organization and operations. In an address-operated memory, each address is associated with a data unit. And for a content-operated memory, each data-search-pattern (cue/tag) is associated with a single data unit. Hence, in both cases, the mapping is one-to-one and does not evolve without direct admin/user interference. Data in a traditional memory is also stored at a fixed quality/granularity. When a traditional memory runs out of space, it can either stop accepting new data or remove old data based on a specific data replacement policy. All these traits of a traditional memory are tied to its static nature, which makes it unsuitable for modern applications that have evolving needs and requirements as established earlier. For example, let us consider a wildlife image-based surveillance system which is geared towards detecting wolves. Any image frame that does not contain a wolf is considered to be of lower importance than any frame containing at least one wolf.
However, a traditional memory, due to lack of dynamism in terms of data granularity management, will store the image frames at the same quality regardless of their importance to the application. Additionally, due to lack of dynamism in memory organization, searching for a wolf image will take the same effort/time as searching for any rarely accessed and unimportant image. To meet the requirements of many modern applications, it is attractive to incorporate flexibility and dynamism in the digital memory, which we believe can be best achieved through statistics-guided learning. Artificial Intelligence (AI) and Machine Learning (ML) are widely used to solve different problems where static algorithms are not ideal. Similarly, meeting dynamic memory requirements may not be possible using static algorithms. Hence, incorporation of intelligence may be an ideal solution for addressing current digital memory limitations.

We draw inspiration from human biological memory, which has many useful properties that can be beneficial for a digital memory as well. A human brain, due to 'plasticity' [8], [9], undergoes internal change based on external stimuli and adapts to different scenarios presented to it. Data stored in a human brain is lossy in nature and is subject to decay and feature-loss. However, important memories decay at a slower rate, and repetition/priming can lead to prolonged retention of important mission-critical data [10], [11]. Human memory also receives and retains data from multiple sensory organs and intelligently stores this multi-modal data for optimal performance [10]. If these intelligence-guided human memory properties can be realized in a digital memory with the help of ML, then it would be ideal for the emergent applications.

With this vision in mind, we put forward a paradigm-shifting content-operated memory framework, Neural Storage (NS), which mimics the intelligence of the human brain for efficient storage and speedy access. In NS, the memory storage is a network of cues (search-patterns) and data, which we term the Neural Memory Network (NMN). Based on the feedback generated from each memory operation, we use reinforcement learning to (1) optimize the NMN data/cue organization and (2) adjust the granularity (feature quality) of specific data units. NS is designed to have the same interface as any traditional Content Addressable Memory (CAM). This allows NS to efficiently replace traditional CAMs in any application, as shown in Fig. 1. Applications which are resilient to imprecise data storage/retrieval and deal with storing data of varying importance will benefit the most from using NS.

For quantitatively analyzing the effectiveness of using NS as a memory system, we implement a NS memory simulator with an array of tunable hyperparameters. We run different real-life applications using NS and observe that the NS framework utilizes orders of magnitude less space and exhibits higher retrieval efficiency while incurring minimal impact on the application performance.

In summary, we make the following contributions:
1) We present a new paradigm of learning computer memory, called NS, that can track data access patterns to dynamically organize itself for providing high efficiency in terms of data storage and retrieval performance.
We describe the learnable parameters and the learning process, which are incorporated into the store, retrieve and retention operations.
2) For quantitatively analyzing the capabilities of NS, we present a memory performance simulator with an array of tunable hyperparameters.
3) We present a formal process to select and customize NS for a target application. We also provide a comprehensive performance analysis of NS using two separate datasets representative of real-life applications and demonstrate its merit compared to traditional memory.

The rest of the paper is organized as follows: Section II discusses different state-of-the-art digital memory frameworks and provides motivations for the proposed intelligent digital memory design. Section III describes the proposed memory framework in detail. Section IV quantitatively analyzes the effectiveness of the NS framework through multiple case-studies. Section V concludes the main paper. Appendix A provides additional details about the proposed hyperparameters. Appendix B provides detailed algorithms for different NS operations/procedures referenced in the main paper. Appendix C provides details about the experimental setup and hyperparameters used during the case studies. Appendix D analyzes the dynamism of NS in greater depth. Appendix E discusses different applications which may benefit from using NS.

II. Background and Motivation
In this section, we first discuss the major differences between our proposed memory framework (NS) and existing similar technologies. After that, we provide the motivations that led to the development of NS.
TABLE I: Comparison between NS and traditional memory frameworks. The dynamic nature of NS, guided by continuous reinforcement learning, makes it adaptable to the application requirements and the usage scenarios.

Memory Type | Learning | Memory Organization | Data Resolution | <Cue, Data> Association | <Data, Data> Association | <Cue, Cue> Association | Space Efficiency
Address Operated Memory | N/A | Fixed | User defined | N/A | N/A | N/A | Low
BCAM, TCAM, Associative | N/A | Fixed | User defined | One-to-One | N/A | N/A | Low
NS (Proposed) | Continuous | Changes based on access pattern | Changes based on access pattern | Many-to-Many | Many-to-Many | Many-to-Many | High
A. Computer Memory: A Brief Review
Computer memory is one of the key components of a computer system [12]. Different types of memory have been proposed, implemented and improved over the decades. However, digital memories can still be broadly divided into two categories based on how data is stored and retrieved: (1) address operated and (2) content operated [1]. In an address operated memory (for example a Random Access Memory or RAM [12], [13]), the access during read/write is done based on a memory address/location. During data retrieval/load, the memory system takes in an address as input and returns the data associated with it. Different variants of RAM such as SRAM (Static Random Access Memory) and DRAM (Dynamic Random Access Memory) are widely used [12]. In a content operated memory, on the contrary, memory access during read/write operations is performed based on a search pattern (i.e., content).

Fig. 2: Taxonomy of content operated memory used in computer systems. The proposed memory organization falls under the content-addressable memory category and is suitable for diverse application domains as shown.

A COM (Content Operated Memory) [14], [1] does not assign any specific data to a specific address during the store operation. During data retrieval/load, the user provides the memory system with a search pattern/tag, and the COM searches the entire memory and returns the address in the memory system where the required data is stored. This renders the search process extremely slow if performed sequentially. To speed up this process of content-based searching, parallelization is employed, which generally requires additional hardware. And adding more hardware makes the COM a rather expensive solution, limiting its large-scale usability. A COM can be implemented in several ways as shown in Fig. 2, each with its own set of advantages and disadvantages. In an associative memory, the data are stored with a varying degree of restrictions. In a direct-mapped memory, each data block can only be placed in one specific memory location. The restriction is less stringent in the case of set-associative memory, and in the case of fully associative memory, any data can reside anywhere in the memory. On the other hand, neuromorphic associative memory behaves in a similar way as the standard associative memory at a high level, but at a low level it exploits device properties to implement neuronal behaviour for increased efficiency [15]. A CAM (Content-Addressable Memory) is similar to an associative memory in regards to its read and write behaviour; however, the implementation is different. In a COM, there is a requirement for replacing old data units with new incoming data units in case the memory runs out of space. The data unit/block to replace is determined based on a predefined replacement policy. CAM is the most popular variant of COM and has been used for decades in the computing domain, but the high-level architecture of a CAM has not evolved much. Instead, researchers have mostly focused on how to best physically design a CAM to improve overall efficiency. SRAM bitcells are used as a backbone for any CAM cell [14], [1]. Extra circuitry is introduced to perform the parallel comparison between the search pattern and the stored data. This is typically implemented using an XOR operation. The high degree of parallelism increases the circuit area and power overheads along with the cost. Cells implemented using NAND gates are more energy-efficient at a cost of speed. NOR gate based cells are faster but more energy-intensive.
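To make the contrast concrete, the following is a minimal sketch (not a hardware model from the paper) of what a content-operated lookup degenerates into when done sequentially in software; the entry list, the bitwise-XOR exact match, and the function name are illustrative assumptions:

```python
# Hypothetical illustration: sequential content-operated lookup.
# A hardware CAM performs all comparisons in parallel via match-line
# circuitry; in software the same search is a linear scan, which is
# why dedicated XOR-based comparison hardware is used.

def com_lookup(entries, search_pattern):
    """Return the address of the first stored word matching the pattern.

    entries: list of integers (stored words), indexed by address.
    search_pattern: integer word to match exactly.
    """
    for address, word in enumerate(entries):
        if word ^ search_pattern == 0:  # XOR is zero only on an exact match
            return address
    return None  # no match found

memory = [0b1010, 0b0111, 0b1100]
assert com_lookup(memory, 0b0111) == 1
```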
Traditional CAMs are designed to be precise [1]. No data degradation happens over time, and in most cases a perfect match is required with respect to the search pattern/tag to qualify for a successful retrieval. This feature is essential for certain applications such as destination MAC address lookup for finding the forwarding port in a network device. However, there are several applications in the implantable, multimedia, Internet-of-Things (IoT) and data mining domains which can tolerate imprecise storage and retrieval. Ternary Content Addressable Memory (TCAM) is the only COM which allows a partial match using a mask, and TCAMs are widely used in layer 3 network switches.

NS and CAM are both content operated memory frameworks. However, there are several differences between a traditional CAM and NS, and some of the important ones are highlighted in Table I. For both Binary Content Addressable Memory (BCAM) and Ternary Content Addressable Memory (TCAM): (1) there are no learning components, (2) data resolution remains fixed unless directly manipulated by the user, (3) associations between search-pattern (tag/cue) and data remain static unless directly modified, (4) only a one-to-one mapping relation exists between search-pattern/cue and data units. Consequently, space and data fetch efficiency is generally low, and we provide supporting results for this claim in Section IV.

Fig. 3: Memory organization of NS with two memory hives. Data neurons (DN) store data and cue neurons (CN) store cues/tags. Each memory hive stores a specific data type (e.g., image, sound, etc.). Localities are designed to store data consisting of specific features. We refer to this memory organization, the weighted graph of DN(s) and CN(s), as Neural Memory Network (NMN).

Apart from standard computer memory organizations, researchers have also looked into different software-level memory organizations for efficient data storage and retrieval. Instance retrieval frameworks are some such software wrappers on top of traditional memory systems that are used for feature-based data storage and retrieval tasks [16]. These systems are mostly used for storing and retrieving images. During the training phase (code-book generation), visual words are identified/learned based on either SIFT features or CNN features of a set of image data. These visual words are, in most cases, cluster centroids of the feature distribution. Insertion of data in the system follows and is generally organized in a tree-like data structure. The location of each data item in this data structure is determined based on the visual words (previously learned) that exist in the input image. During the retrieval phase, a search-image is provided and, in an attempt to search for similar data in the framework, the tree is traversed based on the visual words in the search image. If a good match exists between the search image and a stored image, then that specific stored image is retrieved. The learning component in an instance retrieval framework is limited to the code-book generation phase, which takes place during initialization. Furthermore, once a data unit is inserted in the framework, no further location and accessibility change is possible. No association exists between data units, and the granularity of data units does not change. In contrast, the overall dynamism and possibilities of a NS framework are much greater.

Another software-level memory organization, proposed by Niederee et al., outlines the benefit of forgetfulness in a digital memory [17]. However, due to the lack of quantitative analysis and implementation details, it is unclear how effective this framework might be.
B. Motivation: Taking Inspiration from Human Memory
Computer and human memory are both designed to perform data storage, retention and retrieval. Although the functioning of human memory is far from being completely formalized and understood, it is clear that it is vastly different in the way data is handled. Several properties of the human brain have been identified which allow it to be far superior to traditional computer memory in certain aspects. We believe that if some of these properties can be realized in a digital computer memory, then many applications can benefit greatly. In the following subsections, we will look into some of the most important properties of the human brain and envision their potential digital counterparts.
1) Virtually Infinite Capacity:
The capacity of the human brain is difficult to estimate. John von Neumann, in his book "The computer and the brain" [18], estimated that the human brain has a capacity of about 10^20 bits under the assumptions that: (1) all the inputs to the brain in its entire lifetime are stored forever, and (2) there are about 10^10 neurons in our brain. Researchers now even believe that our working memory (short-term memory) can be increased through "plasticity", provided certain circumstances. According to Lövdén et al., "... increase in working-memory capacity constitutes a manifestation of plasticity ..." [9]. On top of that, due to intelligent pruning of unnecessary information, a human brain is able to retain only the key aspects of huge chunks of data for a long period of time.

If a digital memory can be designed with this human brain feature, then the computer system, through intelligent dynamic memory re-organization (learning-guided plasticity) and via pruning features of unnecessary data (learned from statistical feedback), can attain a state of virtually infinite capacity. For example, in a wildlife image-based surveillance system which is geared towards detecting wolves, the irrelevant data (non-wolf frames) can be subject to compression/feature-loss to save space without hampering the effectiveness of the application.
2) Imprecise/Imperfect Storage and Access:
The idea of pruning unnecessary data, as mentioned in the previous section, is possible because the human brain operates in an imprecise domain, contrary to most traditional digital memory frameworks. Human brain retrieval is imprecise in most situations [10], but intelligent feature extraction, analysis, and post-processing almost nullify the effect of this impreciseness. Also, certain tasks may not require precise memory storage and recall. For these tasks, only some high-level features extracted from the raw data are sufficient.

Hence, supporting the imprecise memory paradigm in a digital memory is crucial for attaining virtually infinite capacity and faster data access. For example, a wildlife image-based surveillance system can operate in the imprecise domain because some degree of compression/feature-reduction of images will not completely destroy the high-level features necessary for its detection tasks. This can lead to higher storage and transmission efficiency.
3) Dynamic Organization:
We have mentioned that plasticity can lead to increased memory capacity, but it also provides several other benefits in the human brain. According to Lindenberger et al., "Plasticity can be defined as the brain's capacity to respond to experienced demands with structural changes that alter the behavioural repertoire." [8]. Hence, plasticity leads to better accessibility of important and task-relevant data in the human brain. And the ease-of-access of a particular memory is adjusted over time based on an individual's requirements. This idea is also similar to priming [11], and it has been observed that priming a human brain with certain memories helps in quicker retrieval.

If we can design a digital memory which can re-organize itself based on data access patterns and statistical feedback, then there will be great benefits in terms of reducing the overall memory access effort. For example, a wildlife image-based surveillance system which is geared towards detecting wolves will have to deal with retrieval requests mostly related to frames containing wolves. Dynamically adjusting the memory organization can enable faster access to the data which are requested the most.
4) Learning Guided Memory Framework:
Ultimately, the human brain can boast of so many desirable qualities mainly due to its ability to learn and adapt. It is safe to say that the storage policies of the human brain also vary from person to person and from time to time [10]. Depending on the need and requirement, certain data are prioritized over others. The process of organizing the memories, feature reduction, and the storage and retrieval procedures change over time based on statistical feedback. This makes each human brain unique and tuned to excel at a particular task at a particular time.

Hence, the first step towards mimicking the properties of the human brain is to incorporate a learning component in the digital memory system. We envision that using this learning component, the digital memory will re-organize itself over time and alter the granularity of the data to become increasingly efficient (in terms of storage, retention and retrieval) at a particular task. For example, a wildlife image-based surveillance system which is geared towards detecting wolves will greatly benefit from a memory system which can learn to continuously re-organize itself to enable faster access to application-relevant data and continuously control the granularity of the stored data depending on the evolving usage scenario.
III. Neural Storage Framework
To incorporate dynamism and embody the desirable qualities of a human brain in a digital memory, we have designed NS. It is an intelligent, self-organizing, virtually infinite content addressable memory framework capable of dynamically modulating data granularity. We propose a novel memory architecture, geared for learning, along with algorithms for implementing standard operations/tasks such as store, retrieve and retention.
A. Memory Organization
The NS memory organization can be visualized as a network, and we refer to it as the "Neural Memory Network" (NMN). The NMN, as shown in Fig. 3, consists of multiple hives, each of which is used to store data of a specific modality (type). For example, if an application requires storing image and audio data, then the NS framework will instantiate a separate memory hive for each data modality. This allows the search to be more directed based on the query data type. It is hypothesized that human memories are grouped together to form small localities based on data similarity. We capture this idea by creating small memory localities within each hive that are designed to store similar data. The fundamental units of the NMN are (1) cue neurons and (2) data neurons. Each cue neuron stores a cue (data search pattern or tag) and each data neuron stores an actual data unit. Each data neuron is associated with a number denoting its 'memory strength', which governs the data feature details or quality of the data inside it. A cue is essentially a vector representing a certain concept, and it can be of two types: (1) coarse-grained cues and (2) fine-grained cues. Coarse-grained cues are used to navigate the NMN efficiently while searching (data retrieve operation) for a specific data item, and while navigating, the fine-grained cues are used to determine the data neuron(s) which is/are suitable for retrieval. Since a cue is a vector representing a particular concept, for the sake of simplicity we shall use specific words when referring to certain cues. For example, in a wildlife surveillance system, cue neurons may contain vectors corresponding to a "Wolf", "Deer", etc., but when talking about these cue neurons we shall refer to them directly by the name of the concept they represent. The data neurons, for this example, will be image frames containing wolves, deer, jungle background, etc. Furthermore, if the system is designed to detect wolves, then the NS framework can be configured to have a memory locality for wolf-frames and one for non-wolf frames.

Each hive comes with its own Cue Bank, which stores cue neurons arranged as a graph. The cue neuron and data neuron associations (<cue neuron, cue neuron>, <cue neuron, data neuron>, and <data neuron, data neuron>) change with time, based on the memory access pattern and framework hyperparameters. To facilitate multi-modal data search, connections between data neurons across memory hives are allowed. For example, when searched with the cue "Wolf" (the visual feature of a wolf), if the system is expected to fetch both image and sound data related to the concept of "Wolf", then this above-mentioned flexibility will save search effort.

It is important to note that the entire memory organization can be viewed as a single weighted graph where each node is either a data neuron or a cue neuron. The associations in the NMN are strengthened and weakened during store, retrieve and retention operations. With time, new associations are formed and old associations may get deleted. The data neuron memory strengths are also modulated during memory operations to increase storage efficiency. The rigidity provided by hives, localities and cue-banks can be adjusted based on the application requirements.
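To make the organization concrete, the following is a minimal Python sketch of the NMN described above; all class and field names are our own illustrative assumptions, not the paper's implementation:

```python
# Minimal sketch of the NMN organization; names are illustrative.
from dataclasses import dataclass, field

@dataclass
class CueNeuron:
    vector: list          # the cue: a vector representing a concept
    fine_grained: bool    # False: coarse-grained (navigation), True: matching

@dataclass
class DataNeuron:
    payload: bytes                  # the stored data unit (possibly compressed)
    memory_strength: float = 100.0  # governs granularity/quality, 100 = full

@dataclass
class Locality:
    decay_rate: float               # memory/association decay for this locality
    data_neurons: list = field(default_factory=list)

@dataclass
class MemoryHive:
    modality: str                                   # e.g., "image" or "sound"
    cue_bank: list = field(default_factory=list)    # CueNeuron objects
    localities: list = field(default_factory=list)  # Locality objects

# The whole NMN is one weighted graph over all neurons; associations
# (<cue,cue>, <cue,data>, <data,data>) are edge weights, e.g. a dict
# keyed by neuron-id pairs. Cross-hive <data,data> edges are allowed.
associations: dict[tuple[int, int], float] = {}
```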
B. Parametric Space

We propose several parameters for NS to help modulate how it functions. These parameters are of two types: (1) Learnable Parameters, which change throughout the system lifetime guided by reinforcement learning, and (2) Hyperparameters, which are determined during system initialization and changed infrequently by the memory user/admin.
1) Learnable Parameters:
We consider the following parameters as learnable parameters for NS (illustrated by the sketch below):
1) Data Neuron and Cue Neuron Weighted Graph: The weighted graph (NMN) directly impacts the data search efficiency (time and energy). The elements of the graph adjacency matrix are considered as learnable parameters. If there are D data neurons and C cue neurons at any given point of time, then the graph adjacency matrix will be of dimension (D + C, D + C).
2) Memory Strength Array: The memory strengths of all the data neurons are also considered as learnable parameters. They jointly dictate the space utilization, transmission efficiency and retrieved data quality.

These parameters constantly evolve based on the system usage via a reinforcement learning process. We do a deeper dive into the learning process in Section III-C.
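As a rough illustration, assuming NumPy and our own variable names, the learnable state can be held as an adjacency matrix plus a strength vector:

```python
# Sketch of the learnable parameters, assuming NumPy; names are ours.
import numpy as np

D, C = 4, 3                 # data neurons and cue neurons currently alive
n = D + C                   # total neurons in the NMN

# Adjacency matrix of the fully connected weighted graph (learnable).
A = np.full((n, n), fill_value=0.1)   # 0.1 stands in for epsilon

# Memory strength of each data neuron, in percent (learnable).
M = np.full(D, 100.0)       # new data neurons start at full strength
```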
2) Hyperparameters:
We have also defined a set of hyperparameters which influence the NS memory organization and the learning-guided operations. These hyperparameters can be set/changed by the user during setup or during the operational stage. The first hyperparameter is the number of memory hives, and we propose the following hyperparameters for each hive (a configuration sketch follows this list):
1) Number of localities: Each locality is used to store data of a specific nature. It is an unsigned integer value.
2) Memory decay rate of each locality: Controls the rate at which data neuron memory strength and features are lost due to inactivity.
3) Association decay rate of each locality: Controls the rate at which NMN associations lose strength due to inactivity.
4) Mapping between data features and localities: This mapping dictates the segregation of application-relevant data and their assignment to a locality with a low decay rate.
5) Data feature and cue extraction AI (Artificial Intelligence) models: These models are used to obtain more insights about the data. They should be selected based on the application and the data type being processed.
6) Data neuron matching metric: Used during the retrieve operation for finding a good match and during the store operation for data neuron merging. For example, this metric can be something like cosine similarity.
7) Neural elasticity parameters: Determine the aggressiveness with which unused data neurons are compressed in case of space shortage.
8) Association weight adjustment parameter: Used as a step size for increasing/decreasing association weights inside the NMN. A higher value will increase the dynamism but lower the stability.
9) Minimum association weight (ε): An unsigned integer which limits the decay of association weights beyond a certain point.
10) Degree of impreciseness (ϕ): Limits the amount of data features which are allowed to be lost due to memory strength decay and inactivity. It is a floating-point number in the range [0, 100]. 0 implies data can get completely removed if the need arises.
11) Frequency of retention procedure (N): NS has a retention procedure which brings in the effect of ageing. This hyperparameter is a positive integer denoting the number of normal operations to be performed before the retention procedure is called once. A lower value will increase dynamism at a cost of overall operation effort (energy and time consumption).
12) Compression techniques: For each memory hive we must specify the algorithm to be used for compressing the data when required. For example, we can use JPEG [19] compression in an image hive.
More details can be found in Appendix A.
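As a hedged illustration, such a per-hive configuration might be written down as follows; every field name and value here is an assumption for demonstration, not a prescribed setting:

```python
# Hypothetical per-hive hyperparameter configuration for an image hive.
image_hive_config = {
    "num_localities": 2,                    # e.g., wolf frames vs. others
    "memory_decay_rate": [0.01, 0.10],      # per locality; slower for locality 0
    "association_decay_rate": [0.01, 0.05], # per locality
    "feature_to_locality": {"wolf": 0, "other": 1},
    "feature_extraction_model": "vgg16",    # any suitable extractor (e.g., VGG [23])
    "matching_metric": "cosine_similarity",
    "elasticity_aggressiveness": 0.5,       # neural elasticity parameter
    "association_step": 0.2,                # weight adjustment step size
    "epsilon": 1,                           # minimum association weight
    "phi": 20.0,                            # degree of impreciseness, in [0, 100]
    "retention_frequency_N": 50,            # retention runs every N operations
    "compression": "jpeg",                  # lossy codec for this hive [19]
}
```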
Fig. 4: The proposed reinforcement learning architecture used to incorporate learning in the NS framework. Each memory operation generates a feedback (E), which is used to modify and optimize the NMN by changing its current state S to the newly computed state S′.

C. Learning Process

The learnable parameters governing the NMN of NS are updated based on feedback from memory operations. The goals/objectives of learning are to:
1) Reduce the space requirement while maintaining data retrieval quality and application performance. This should be achieved by learning the granularity at which each data neuron should be stored. Less important data should be compressed and subjected to feature-loss for saving space, while more important data should be kept at a good quality. Hence, this learning should be driven by the access pattern.
2) Increase memory search speed by learning the right NMN organization given the current circumstances and access-pattern bias.

In Fig. 4, we propose an external stimulus guided reactional reinforcement learning (RL) architecture for incorporating intelligence in NS. The NS framework consists of two main components: (1) the Neural Memory Network (NMN) and (2) the NS Controller, which manages the NMN.

The initial state (S) of the NMN consists of no data neurons (DN) and no cue neurons (CN). During an operation, when a new cue is identified (not present in the cue bank), a new cue neuron (CN) is generated for that cue. Similarly, when incoming data cannot be merged with an existing data neuron (DN), a new DN is created. Each new DN is initialized with a memory strength of 100% (this parameter dictates the data granularity/details for the DN). When a new DN or CN is created, the new neuron is connected with all other existing neurons (DNs and CNs) with an association weight of ε (a hyperparameter selected by the system admin/user). So in any state, all DNs and CNs form a fully connected weighted graph.

At the end of each operation, a feedback (E) is generated and sent to the NS Controller module along with a snapshot of the current state of the NMN (S). S (essentially the learnable parameters) has two components:
1) S→A: The adjacency matrix for the entire NMN.
2) S→M: The list of memory strengths of each data neuron.

For an NMN with n total neurons (DNs and CNs) and m DNs:

$$S{\rightarrow}A = \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{bmatrix} \qquad S{\rightarrow}M = \begin{bmatrix} s_{1} & s_{2} & \dots & s_{m} \end{bmatrix}$$

S and E, along with the learning goals/objectives (O), drive the reaction function f(O, E, S). The outputs of this function are:
1) An association weight adjustment matrix (ΔA) of dimension (n, n).
2) A memory strength adjustment vector (ΔM) of dimension (1, m).

These two components constitute ΔS = {ΔA, ΔM}:

$$\Delta A = \begin{bmatrix} \delta a_{11} & \delta a_{12} & \dots & \delta a_{1n} \\ \delta a_{21} & \delta a_{22} & \dots & \delta a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \delta a_{n1} & \delta a_{n2} & \dots & \delta a_{nn} \end{bmatrix} \qquad \Delta M = \begin{bmatrix} \delta s_{1} & \delta s_{2} & \dots & \delta s_{m} \end{bmatrix}$$

We compute the new state S′ as follows:

$$S'{\rightarrow}A = \begin{bmatrix} \max(\varepsilon, a_{11} - \delta a_{11}) & \dots & \max(\varepsilon, a_{1n} - \delta a_{1n}) \\ \vdots & \ddots & \vdots \\ \max(\varepsilon, a_{n1} - \delta a_{n1}) & \dots & \max(\varepsilon, a_{nn} - \delta a_{nn}) \end{bmatrix}$$

$$S'{\rightarrow}M = \begin{bmatrix} \min(100, \max(\phi, s_{1} - \delta s_{1})) & \dots & \min(100, \max(\phi, s_{m} - \delta s_{m})) \end{bmatrix}$$

where ϕ (the degree of impreciseness, a hyperparameter selected by the system admin/user) is the minimum memory strength a data neuron can have. The memory state is then updated with the newly computed one (S′).
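Under the same assumptions as the earlier sketches (NumPy arrays A and M, with ε and ϕ as scalars), the state update above reduces to a couple of clipped element-wise operations:

```python
# Sketch of the NS state update S -> S', assuming NumPy; delta_A and
# delta_M are the adjustment outputs of the reaction function f(O, E, S).
import numpy as np

def update_state(A, M, delta_A, delta_M, epsilon, phi):
    """Apply the reaction-function adjustments to the NMN state.

    A: (n, n) association weight matrix; M: (m,) memory strengths.
    Association weights never fall below epsilon; memory strengths
    stay within [phi, 100].
    """
    A_next = np.maximum(epsilon, A - delta_A)
    M_next = np.minimum(100.0, np.maximum(phi, M - delta_M))
    return A_next, M_next
```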
The function f(O, E, S) for computing ΔM and ΔA can be realized in many different ways depending on the implementation. The updates made to the matrices for a given state S can be made local in nature to reduce unnecessary computations and updates. The periodicity of the state update can also be controlled. For the current implementation of NS used for performing the case-studies, the reaction function is jointly implemented using Algo. 1, Algo. 2 and Algo. 4. The algorithms are discussed in Appendix B and the high-level concept is provided in Fig. 5 (a).

Fig. 5: (a) Flowcharts illustrating the major steps of store, retrieve and retention operations in NS. (b) Visualization of the NS memory structure during a sequence of operations. (c) Visualization of a traditional CAM for an identical sequence of operations.
D. Memory Operations
We have designed three fundamental NS memory operations (store, retrieve, and retention) which are analogous to similar operations that exist in any traditional CAM. The learning process is directly embedded as part of these operations. The flowcharts of these operations are shown in Fig. 5 (a), and the implementation-level details can be found in Appendix B.
1) Store:
The store operation (as shown in Fig. 5 (a)) starts by reading the input data (D) and insertion cues (C). Before storing the input data using a new data neuron, the NS framework attempts to merge it with an existing data neuron with similar content. This sub-operation (merge attempt) is designed to eliminate storing similar data multiple times. During the merge attempt, a set of candidate data neurons (selected based on accessibility with respect to the insertion cues, C) are examined for a good match, and the data neurons that do not match are penalized by being made less accessible in the NMN. If a data neuron having a good match with the input data (D) is found, then that matching data neuron is assigned a higher memory strength and made more accessible in the NMN. After the merge attempt, if a good match is not found, a new data neuron is instantiated for the input data (D). If a new cue (not present in the cue bank) is found among C, then a new cue neuron is instantiated for it. Depending on the merge attempt success/failure, the new data neuron or the matching data neuron, respectively, is associated (if already associated, then strengthened) with the insertion cues (C).

The learning aspect of this operation is guided by the input data (D) and cues (C) provided. The candidate data neurons for merging are selected using a graph traversal starting from the insertion cues (C). The graph traversal is guided by the NMN structure, hence wrong candidate data neuron selections are penalized by making those candidate data neurons less accessible (by NMN modification). On the other hand, selecting a candidate data neuron with a good match, with respect to the input data, is rewarded by making that candidate data neuron more accessible (by NMN modification). Association of the insertion cues (C) with the matching data neuron or the new data neuron can also be considered as a learning process.
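The paper's exact procedure is given by the algorithms in Appendix B; the sketch below only paraphrases the flow described above on plain data structures, with cosine similarity as the matching metric and our own names and step sizes as stand-ins:

```python
# Hedged sketch of the NS store operation on plain structures.
import numpy as np

def store(data_vecs, strengths, cue_index, assoc, data, cues,
          step=0.2, eps=0.1, threshold=0.9):
    """data_vecs: stored feature vectors; strengths: per-neuron strengths;
    cue_index: dict cue -> candidate data-neuron ids (accessibility order);
    assoc: dict (cue, dn_id) -> association weight."""
    matched = None
    for cue in cues:                              # traversal from insertion cues
        if matched is not None:
            break
        for dn in list(cue_index.get(cue, [])):
            v = data_vecs[dn]
            sim = float(np.dot(v, data) /
                        (np.linalg.norm(v) * np.linalg.norm(data) + 1e-12))
            if sim >= threshold:                  # merge with existing neuron
                matched = dn
                strengths[dn] = min(100.0, strengths[dn] + 100 * step)
                break
            # wrong candidate: weaken its accessibility, floored at eps
            assoc[(cue, dn)] = max(eps, assoc.get((cue, dn), eps) - step)
    if matched is None:                           # no good match: new neuron
        data_vecs.append(np.asarray(data, dtype=float))
        strengths.append(100.0)
        matched = len(data_vecs) - 1
    for cue in cues:                              # create/strengthen cue links
        assoc[(cue, matched)] = assoc.get((cue, matched), eps) + step
        if matched not in cue_index.setdefault(cue, []):
            cue_index[cue].append(matched)
    return matched
```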
2) Retrieve:
The retrieve operation (as shown in Fig. 5 (a)) starts by reading the search cues (C). The search cues consist of a set of coarse-grained cues (C_coarse) and, optionally, a set of fine-grained cues (C_fine). Based on C_coarse, a set of candidate data neurons are selected and checked for an acceptable match with respect to the fine-grained cues. The candidate data neurons that do not match any fine-grained cue in C_fine are made less accessible in the NMN, and if a candidate data neuron matches any fine-grained cue in C_fine, then it is made more accessible in the NMN. At the end of the search attempt, if a matching data neuron is located, it is provided as output, and it also gets associated with all the search cues (C) inside the NMN. In the absence of C_fine, the retrieve operation returns the first accessed candidate data neuron during the search phase.

Similar to the store operation, the learning in this operation is also driven by the candidate data neuron selection, which is primarily based on the NMN organization/structure. A wrong candidate selection is penalized and a good candidate selection is rewarded by making the necessary NMN state modifications. Association of the search cues (C) with the matching data neuron is also a part of the learning process.
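Reusing the plain structures from the store sketch above, a correspondingly hedged retrieve might look as follows; the accessibility ordering and the reward/penalty rule are again our assumptions:

```python
# Hedged sketch of the NS retrieve operation, paired with store() above.
import numpy as np

def retrieve(data_vecs, cue_index, assoc, coarse_cues, fine_cues=None,
             step=0.2, eps=0.1, threshold=0.9):
    for cue in coarse_cues:                  # navigate via coarse-grained cues
        # visit candidates in decreasing association weight (accessibility)
        candidates = sorted(cue_index.get(cue, []),
                            key=lambda dn: -assoc.get((cue, dn), eps))
        for dn in candidates:
            if not fine_cues:                # no fine cues: first hit wins
                return dn
            hit = any(
                np.dot(data_vecs[dn], f) /
                (np.linalg.norm(data_vecs[dn]) * np.linalg.norm(f) + 1e-12)
                >= threshold
                for f in fine_cues)
            if hit:                          # reward: make it more accessible
                assoc[(cue, dn)] = assoc.get((cue, dn), eps) + step
                return dn
            # penalty: make the non-matching candidate less accessible
            assoc[(cue, dn)] = max(eps, assoc.get((cue, dn), eps) - step)
    return None                              # no acceptable match located
```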
3) Retention:
In a traditional CAM, data retention involves maintaining the memory in a fixed state. NS, on the other hand, allows the NMN to change and restructure itself to show the effect of ageing, as shown in Fig. 5 (a). All the data neurons not accessed in the last N operations (N is a hyperparameter selected by the system admin/user) are weakened. Weakening a data neuron leads to data feature loss. This sub-operation is a form of reinforcement learning which considers the access pattern and determines which data neurons to shrink for saving space. The next sub-operation is also learning-driven, where the accessibility of unused data neurons is reduced based on the access pattern.
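In the same illustrative setting, retention reduces the strengths and association weights of neurons untouched for the last N operations; the multiplicative decay rule below is an assumption, not the paper's exact algorithm:

```python
# Hedged sketch of the NS retention procedure (ageing).
def retention(strengths, assoc, last_access, op_counter, N,
              memory_decay=0.05, assoc_decay=0.05, phi=20.0, eps=0.1):
    """last_access[dn]: operation counter at which data neuron dn was
    last touched; phi and eps floor the decay as in the state update."""
    for dn, strength in enumerate(strengths):
        if op_counter - last_access[dn] > N:      # inactive data neuron
            # weaken it: lose features, but never drop below phi
            strengths[dn] = max(phi, strength * (1.0 - memory_decay))
            # and reduce its accessibility in the NMN
            for edge in list(assoc):
                if edge[1] == dn:
                    assoc[edge] = max(eps, assoc[edge] * (1.0 - assoc_decay))
```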
E. Dynamic Behaviour of NS
In comparison to traditional CAM, NS is dynamic in several aspects. The NMN of NS changes after every operation, and the effect of ageing is captured using the retention procedure. In Fig. 5 (b), we illustrate the dynamic nature of NS by displaying how the NMN changes during a sequence of operations. The accessibility of different data neurons changes, and the memory strength of data neurons increases or decreases based on the feedback-driven reinforcement learning scheme. In contrast, as seen in Fig. 5 (c), the traditional CAM does not show any sign of intelligence or dynamism to facilitate data storage/retrieval. In Appendix D, a more detailed simulation-accurate depiction and description of NS's dynamism is provided.
F. NS Simulator
In order to quantitatively analyze the effectiveness of NS in a computer system, we have designed and implemented a NS simulator. It has the following features:
• It can simulate all memory operations and provide relative benefits with respect to traditional CAM in terms of operation cost.
• The framework is configurable through an array of hyperparameters introduced in Section III-B2.
• The NS simulator can be mapped to any application which is designed for using a CAM or a similar framework.
• The simulator implements the learning paradigm as shown in Fig. 5, and more details can be found in Appendix B.
• The NS simulator is scalable and can simulate a memory of arbitrarily large size.
• To ensure correctness, the simulator software is validated through manual verification of multiple random case studies with a large number of random operations.

We define operation cost as the amount of effort it takes to perform a particular operation. It is clear that the iterative search-section of the NS operations (Fig. 5) dominates over the remaining sub-operational steps in terms of effort. Hence, we approximately consider the operation cost for NS to be the number of times the search-section is executed, for both store and retrieve operations. For the traditional CAM, we consider the operation cost to be the number of data entries searched/looked-up. For both traditional CAM and NS, we do not consider any parallelism while searching, to ensure fairness. Also, the cost of writing the data to the memory for both NS and traditional CAM is not considered as a part of the operation cost. For NS, the effort of writing the data to the memory is less than or equal (in the worst case, due to data merging) to that of the traditional CAM.
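Under these definitions, cost accounting in a simulator reduces to counting comparisons; the two counters below are an illustrative reading of that rule, not the simulator's actual code:

```python
# Illustrative operation-cost accounting, mirroring the definitions above.
def cam_operation_cost(num_entries_scanned):
    # Traditional CAM (no parallelism assumed): one unit per entry looked up.
    return num_entries_scanned

def ns_operation_cost(search_section_executions):
    # NS: one unit per execution of the iterative search-section,
    # for both store and retrieve operations.
    return search_section_executions
```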
Fig. 6: Results for the wildlife surveillance application which prioritizes deer images (scenario 1). (a) Memory growth for traditional CAM and NS: the NS memory growth is logarithmic in nature, while the traditional CAM space utilization grows linearly. (b) The zoomed-in portion of the memory growth plot shows the space utilization fluctuation due to access-pattern driven, learning guided data compression in the NS framework. (c) Memory quality factor for NS and traditional CAM at different memory size limits: NS can operate more efficiently in space-constrained scenarios. (d, e) Retrieval operation costs for NS and the traditional CAM; NS appears to be far superior due to dynamism.

Fig. 7: Results for the wildlife surveillance application which prioritizes fox/wolf images (scenario 2). We show similar plots and highlight similar results as in Fig. 6. In summary, NS appears to be more efficient than traditional CAM.

G. Desirable Application Characteristics
In any application using a CAM or a similar framework, the memory can theoretically be replaced with NS. However, certain applications will benefit more than others. The two main traits of an application which will enhance the effectiveness of NS are described below.
1) Imprecise Store & Retrieval:
Although NS can be configured to operate at 100% data precision, it is recommended to use the framework in the imprecise mode for storage and search efficiency. Assume M is the set of data neurons in the memory at a given instance. For a given data item D_i ∈ M, if D_i is compressed (in a lossy manner) to D_i′ such that size(D_i′) = size(D_i) − ε_1, then in order for the application to operate in the imprecise domain, it must be that Quality(D_i′) = Quality(D_i) − ε_2, where size(X) is the size of the data neuron X, Quality(X) is the quality of the data in the data neuron X in light of the specific application, and ε_1 and ε_2 are small quantities. For example, in a wildlife surveillance system, if an image containing a wolf is compressed slightly, it will still look like an image with the same wolf.
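Read operationally, the condition says a modest size reduction must cost only a small quality reduction; a toy check of that property might look like this, where the thresholds are purely illustrative assumptions:

```python
def operates_in_imprecise_domain(size_drop, quality_drop,
                                 max_size_drop=0.2, max_quality_drop=0.05):
    """Toy check of the imprecision property: a modest size reduction
    (epsilon_1) should induce only a small quality reduction (epsilon_2).
    Both drops are fractions of the original; thresholds are assumed."""
    return size_drop <= max_size_drop and quality_drop <= max_quality_drop

# e.g., a 15% smaller file with 2% quality loss still qualifies
assert operates_in_imprecise_domain(0.15, 0.02)
```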
2) Notion of Object(s)-of-Interest:
NS works best if there exists a set of localities within each hive which are designated to store data containing specific objects. Each locality can be configured to have a different memory strength decay rate based on the importance of the data which is designated to be stored in the respective locality. Note that the definition of an object in this context implies specific features of the data. For example, in the case of image data, the objects can be literal objects in the image, but for sound data, the objects can be thought of as specific sound segments with certain attributes. Assume that D is a new incoming data item which must be stored in the memory and OL is the set of objects in data D. Then there may be situations where ∃ O_1, O_2 ∈ OL such that Imp(O_1) > Imp(O_2), where Imp(O_i) denotes the importance of the object O_i for the specific application. For example, in wildlife surveillance designed to detect wolves, frames containing at least one wolf should be considered with higher importance.
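Combining the two traits, the following sketch routes incoming data to a locality by the most important object it contains, matching the features-to-localities hyperparameter described earlier; the importance values and label set are stand-ins we have invented for illustration:

```python
# Hypothetical locality routing by object importance.
IMPORTANCE = {"wolf": 1.0, "deer": 0.6, "background": 0.1}  # assumed values
FEATURE_TO_LOCALITY = {"wolf": 0, "deer": 1, "background": 1}

def choose_locality(detected_objects):
    """Route a frame to the locality of its most important object.

    detected_objects: labels produced by the hive's feature-extraction
    model (e.g., a CNN classifier); locality 0 is the slow-decay one.
    """
    if not detected_objects:
        return 1                                   # default fast-decay bin
    top = max(detected_objects, key=lambda o: IMPORTANCE.get(o, 0.0))
    return FEATURE_TO_LOCALITY.get(top, 1)

assert choose_locality(["background", "wolf"]) == 0
```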
IV. Case Study
To evaluate the effectiveness of NS, we choose two image datasets from two representative applications: (1) a wildlife surveillance system and (2) a UAV-based security system. For both applications, traditional CAM can be used for efficient data storage and retrieval. We performed a comparative study between NS and traditional CAM for these datasets in terms of several key memory access parameters. To model NS behaviour for the target dataset, we use the simulator described in Section III-F, while traditional CAM behaviour is modelled based on standard CAM organization [20]. Next, we present a quantitative analysis of the benefits of using NS over traditional CAM. The hyperparameters and configuration details of the NS simulator used in the case studies are described in Appendix C.
A. Wildlife Surveillance
Image sensors are widely deployed in the wilderness for rare species tracking and poacher detection. The wilderness can be vast, and IoT devices operating in these regions often deal with low storage space, limited transmission bandwidth and power/energy shortages. This demands efficient data storage, transmission and power management. Interestingly, this specific application is resilient to imprecise data storage and retrieval because compression does not easily destroy high-level data features in images. Also, in the context of this application, certain objects such as a rare animal of a specific species are considered more important than an image with only trees or an unimportant animal. Hence, this application has the desirable characteristics suitable for using NS and will certainly benefit from NS's learning-guided preciseness modulation and plasticity schemes. Informed reduction of unnecessary data features will also lead to lower transmission bandwidth requirements. Memory power utilization is proportional to the time required to carry out store, load and other background tasks. And NS, due to its efficient learning-driven dynamic memory organization, can help reduce memory operation time and consequently reduce overall operation effort. Furthermore, transmitting the optimal amount of data (instead of the full data) will lead to lower energy consumption, as transmission power requirements are often much higher than computation [21].
1) Dataset Description:
To emulate a wildlife surveillance application, we construct an image dataset from wildlife camera footage containing 40 different animal sightings. The details of this dataset are located at the publicly available dataset repository that we have released [22]. We construct two different scenarios for carrying out experiments on this dataset:
• Scenario 1: The system user wishes to prioritize deer images and perform frequent deer image retrieval tasks.
• Scenario 2: The system user wishes to prioritize fox/wolf images and perform frequent fox/wolf image retrieval tasks.
2) Effectiveness of NS in Comparison to Traditional CAM:
Both the NS framework and the traditional CAM are first presented with all the images in the dataset sequentially, and then a randomly pre-generated access pattern (based on the scenario) is used to fetch 10,000 images (non-unique) sampled from the dataset. For scenario 1, as can be seen in Fig. 6 (a), NS has a clear advantage over traditional CAM in terms of total space utilization. We also observe in the zoomed-in graph, Fig. 6 (b), that the NS total space utilization fluctuates and slowly decreases after the end of the store phase. This is due to access-pattern-guided optimal data granularity learning, resulting in compression/feature-loss of less important data. In Fig. 6 (d) and Fig. 6 (e), we plot the operation cost (as described in Section III-F) during the first 50 retrieve operations for traditional CAM and NS, respectively. We observe that the operation cost for NS in comparison to traditional CAM is significantly lower.

In Table II, we present the numerical details of all the experiments. The average operation cost (store and retrieve combined) for NS is about 165 times less than that of traditional CAM. It is worth noting that the PSNR (peak signal-to-noise ratio) and SSIM (structural similarity) of the fetched images during retrieve operations for NS have similar values as those of traditional CAM. This ensures that using the NS framework for this application will not affect the effectiveness of the application. We next perform the same experiments with locality-0 tuned to store only fox/wolf frames (scenario 2). In Fig. 7 and Table II, we observe similar trends.
3) Effectiveness of NS in Constrained Settings:
Fig. 8: Results for the UAV-based car surveillance application which prioritizes car images. We show similar plots and highlight similar observations as in Fig. 6. In summary, it appears that NS is more efficient than traditional CAM in terms of both storage and retrieval efficiency.

TABLE II: Simulation results for different case-studies showing the relative effectiveness of NS with respect to traditional CAM. Judging by the PSNR and SSIM of the retrieved images, it is clear that NS provides similar performance in terms of the quality of retrieved data with respect to traditional (Trad.) CAM. This leads us to believe that the application performance will not be hampered by using NS. At the same time, NS utilizes much less space, and its average operation cost is far lower in comparison to traditional CAM.

Wildlife Surveillance: Emphasis on Deer (Scenario-1)
Mem. Type | PSNR | SSIM | Avg. Retrieval Op. Cost | Avg. Store Op. Cost | Avg. Op. Cost | Final Memory Size (MB)
Trad. CAM | 37.65 | 0.79 | 5573.93 | 0 | 2786.96 | 2688.79
NS | 38.39 | 0.82 | 5.51 | 28.23 | 16.87 | 89.46

Wildlife Surveillance: Emphasis on Fox (Scenario-2)
Mem. Type | PSNR | SSIM | Avg. Retrieval Op. Cost | Avg. Store Op. Cost | Avg. Op. Cost | Final Memory Size (MB)
Trad. CAM | 37.11 | 0.75 | 3306.83 | 0 | 1653.41 | 2688.79
NS | 38.52 | 0.79 | 4.39 | 22.96 | 13.67 | 97.40

UAV-based Surveillance for Safety: Emphasis on Car
Mem. Type | PSNR | SSIM | Avg. Retrieval Op. Cost | Avg. Store Op. Cost | Avg. Op. Cost | Final Memory Size (MB)
Trad. CAM | 30.57 | 0.71 | 1870.41 | 0 | 935.20 | 801.14
NS | 32.42 | 0.78 | 6.03 | 22.67 | 14.355 | 34.67

Most of the image sensors used in a wildlife surveillance system are deployed in remote locations and must make efficient use of bandwidth and storage space without sacrificing system performance. NS is designed to excel in this scenario, and to verify this, we limit the total memory size (X-axis) and plot the memory quality factor (Y-axis) in Fig. 6 (c) (scenario 1) and Fig. 7 (c) (scenario 2). The memory quality factor is defined in Eqn. 1. In both scenarios, we observe that NS is capable of functioning at a much lower memory capacity in comparison to traditional CAM. Lower space utilization also translates to less transmission bandwidth consumption, in case the system has to upload the stored data to the cloud or other IoT devices. Also, note that the quality factor of the NS framework increases exponentially with the increase in memory size limit, whereas the quality factor of the traditional CAM increases at a much slower pace.
Quality Factor = PSNR + (100 × SSIM)    (1)
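For reference, Eqn. 1 is directly computable from the reported metrics; e.g.:

```python
def quality_factor(psnr, ssim):
    """Memory quality factor per Eqn. 1: PSNR plus 100 times SSIM."""
    return psnr + 100.0 * ssim

# e.g., the NS scenario-1 figures from Table II
assert abs(quality_factor(38.39, 0.82) - 120.39) < 1e-9
```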
B. UAV-based Security System
UAV-based public safety monitoring is a critical application which is often deployed in remote areas with limited bandwidth and charging facilities. Additionally, UAVs by design must deal with small battery life and limited storage space. Hence, this application operates in a space-, power- and bandwidth-constrained environment. However, this application is resilient to imprecise storage and retrieval because it deals with images which retain most of their important features even after compression/approximation. Moreover, a UAV roaming over different regions captures plenty of unnecessary images which may not be important for the application's goal/purpose. Hence, there is a notion of object(s)-of-interest. All these characteristics and requirements make this application ideal for using NS.

1) Dataset Description:
To capture the application scenario, we have created a dataset containing UAV footage of a parking lot [22]. The UAV remains in motion and captures images of cars and people in the parking lot. We construct the experiments with the assumption that the system user wishes to prioritize car images and perform frequent car image retrieval tasks.
2) Effectiveness of NS in Comparison to Traditional CAM:
We notice a similar trend as observed for the wildlife surveillance system. The memory space utilization graph in Fig. 8 (a) shows that the NS framework is much more space-efficient than traditional CAM. In the zoomed-in portion, Fig. 8 (b), we observe that the memory space utilization decays after the store phase, due to compression of data which is not being accessed. In Fig. 8 (d, e), we observe that NS is more efficient in terms of retrieval operation cost due to its learning-guided dynamic memory organization (operation cost is estimated as described in Section III-F). From Table II, we observe that NS is about 65x more efficient in terms of average operation cost (store and retrieve combined). Furthermore, in Table II, we note that the PSNR and SSIM of the fetched images during retrieve operations are similar for NS and traditional CAM. So it is evident that the NS framework is equally effective as a traditional CAM in terms of serving the application.
3) Effectiveness of NS in Constrained Settings:
The UAV-based surveillance system may be required to operate in a resource-constrained environment. In Fig. 8 (c), we observe that the NS framework is much more suitable when it comes to functioning at extremely low memory space. On the other hand, the quality factor (defined in Eqn. 1) of the traditional CAM is much lower and increases very slowly as the storage space limitation is relaxed.
V. Conclusion
We have presented NS, a learning-guided memory paradigm which can provide a dramatic improvement in memory access performance and effective storage capacity in diverse applications. It draws inspiration from the human brain to systematically incorporate learning in the memory organization, which dynamically adapts to the data access behaviour for improving storage and access efficiency. We have presented the retrieve, store and retention processes in detail, which integrate and employ data-driven knowledge. We have developed a complete performance simulator for NS and compared its data storage behaviour with traditional content-based memory. Quantitative evaluation of NS for two representative applications shows that it vastly surpasses the storage and retrieval efficiency of traditional CAM. By dynamically adapting data granularity and adjusting the associations between data and search patterns, NS demonstrates a high level of plasticity that is not manifested by any existing computer memory organization. While we have worked with high-level memory organizational parameters here, our future work will focus on the physical implementation of NS. We believe the proposed paradigm can open up avenues for promising physical realizations to further advance the effectiveness of learning, and it can significantly benefit from the data storage behaviour of emergent non-silicon nanoscale memory devices (such as resistive or phase change memory devices).
References

[1] R. Karam, R. Puri, S. Ghosh, and S. Bhunia, "Emerging trends in design and applications of memory-based computing and content-addressable memories," Proceedings of the IEEE, vol. 103, no. 8, pp. 1311-1330, 2015.
[2] V. Dyo, S. A. Ellwood, D. W. Macdonald, A. Markham, C. Mascolo, B. Pásztor, S. Scellato, N. Trigoni, R. Wohlers, and K. Yousef, "Evolution and sustainability of a wildlife monitoring sensor network," in Proceedings of the 8th ACM Conference on Embedded Networked Sensor Systems, 2010, pp. 127-140.
[3] D. Duarte, F. Nex, N. Kerle, and G. Vosselman, "Towards a more efficient detection of earthquake induced facade damages using oblique UAV imagery," The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 42, p. 93, 2017.
[4] S. Li, H. Tang, S. He, Y. Shu, T. Mao, J. Li, and Z. Xu, "Unsupervised detection of earthquake-triggered roof-holes from UAV images using joint color and shape features," IEEE Geoscience and Remote Sensing Letters, vol. 12, no. 9, pp. 1823-1827, 2015.
[5] S. Park, "Portable surveillance camera and personal surveillance system using the same," Jan. 25, 2007, US Patent App. 10/561,607.
[6] C. Carthel, S. Coraluppi, and P. Grignan, "Multisensor tracking and fusion for maritime surveillance," in Nature Reviews Neuroscience, vol. 18, no. 5, pp. 261-262, 2017.
[9] M. Lövdén, L. Bäckman, U. Lindenberger, S. Schaefer, and F. Schmiedek, "A theoretical framework for the study of adult cognitive plasticity," Psychological Bulletin, vol. 136, no. 4, p. 659, 2010.
[10] R. C. Atkinson and R. M. Shiffrin, "Human memory: A proposed system and its control processes," Psychology of Learning and Motivation, vol. 2, no. 4, pp. 89-195, 1968.
[11] E. Tulving and D. L. Schacter, "Priming and human memory systems," Science, vol. 247, no. 4940, pp. 301-306, 1990.
[12] J. L. Hennessy and D. A. Patterson, Computer Architecture: A Quantitative Approach. Elsevier, 2011.
[13] S. Paul and S. Bhunia, "Reconfigurable computing using content addressable memory for improved performance and resource usage," in Proceedings of the 45th Annual Design Automation Conference, 2008, pp. 786-791.
[14] K. Pagiamtzis and A. Sheikholeslami, "Content-addressable memory (CAM) circuits and architectures: A tutorial and survey," IEEE Journal of Solid-State Circuits, vol. 41, no. 3, pp. 712-727, 2006.
[15] Y. V. Pershin and M. Di Ventra, "Neuromorphic, digital, and quantum computation with memory circuit elements," Proceedings of the IEEE, vol. 100, no. 6, pp. 2071-2080, 2011.
[16] L. Zheng, Y. Yang, and Q. Tian, "SIFT meets CNN: A decade survey of instance retrieval," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 5, pp. 1224-1244, 2018.
[17] C. Niederee, N. Kanhabua, F. Gallo, and R. H. Logie, "Forgetful digital memory: Towards brain-inspired long-term data and information management," SIGMOD Rec., vol. 44, no. 2, pp. 41-46, Aug. 2015. [Online]. Available: https://doi.org/10.1145/2814710.2814718
[18] J. Von Neumann and R. Kurzweil, The Computer and the Brain. Yale University Press, 2012.
[19] W. B. Pennebaker and J. L. Mitchell, JPEG: Still Image Data Compression Standard. Springer Science & Business Media, 1992.
[20] A. G. Hanlon, "Content-addressable and associative memory systems: A survey," IEEE Transactions on Electronic Computers, vol. EC-15, no. 4, pp. 509-521, 1966.
[21] C. M. Sadler and M. Martonosi, "Data compression algorithms for energy-constrained devices in delay tolerant networks," in Proceedings of the 4th International Conference on Embedded Networked Sensor Systems, 2006, pp. 265-278.
[22] "BINGO dataset," https://github.com/prabuddha1/BINGODataset, accessed: 11-18-2020.
[23] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[24] S. Haug and J. Ostermann, "A crop/weed field image dataset for the evaluation of computer vision based precision agriculture tasks," in European Conference on Computer Vision. Springer, 2014, pp. 105-116.
[25] A. Bauer, A. G. Bostrom, J. Ball, C. Applegate, T. Cheng, S. Laycock, S. M. Rojas, J. Kirwan, and J. Zhou, "Combining computer vision and deep learning to enable ultra-scale aerial phenotyping and precision agriculture: A case study of lettuce production," Horticulture Research, vol. 6, no. 1, pp. 1-12, 2019.
[26] M. T. Chiu, X. Xu, Y. Wei, Z. Huang, A. G. Schwing, R. Brunner, H. Khachatrian, H. Karapetyan, I. Dozier, G. Rose et al., "Agriculture-vision: A large aerial image database for agricultural pattern analysis," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 2828-2838.
[27] C. Yuan, Z. Liu, and Y. Zhang, "Aerial images-based forest fire detection for firefighting using optical remote sensing techniques and unmanned aerial vehicles," Journal of Intelligent & Robotic Systems, vol. 88, no. 2-4, pp. 635-654, 2017.
[28] Á. Restás, "Thematic division and tactical analysis of the UAS application supporting forest fire management," in Viegas, X. D., Ed., Advances in Forest Fire Research, 2014.
[29] L. Yu, N. Wang, and X. Meng, "Real-time forest fire detection with wireless sensor networks," in Proceedings of the 2005 International Conference on Wireless Communications, Networking and Mobile Computing, vol. 2, 2005, pp. 1214-1217.
[30] Ç. F. Özgenel and A. G. Sorguç, "Performance comparison of pretrained convolutional neural networks on crack detection in buildings," in ISARC: Proceedings of the International Symposium on Automation and Robotics in Construction, 2018.
[31] M. J. Way, J. D. Scargle, K. M. Ali, and A. N. Srivastava, Advances in Machine Learning and Data Mining for Astronomy. CRC Press, 2012.
[32] M. H. Carr, The Surface of Mars. Cambridge University Press, 2007, vol. 6.
[33] B. Rothrock, R. Kennedy, C. Cunningham, J. Papon, M. Heverly, and M. Ono, "SPOC: Deep learning-based terrain classification for Mars rover missions," in AIAA SPACE 2016, 2016, p. 5539.
Prabuddha Chakraborty is pursuing his Ph.D. at the University of Florida under the supervision of Dr. Swarup Bhunia. He received his M.Tech. from the Indian Institute of Technology (IIT), Kanpur. He has interned with Xilinx and Texas Instruments. His research interests include computer vision, system security and applied machine learning in various domains.
Swarup Bhunia received his Ph.D. from Purdue University, IN, USA. Currently, Dr. Bhunia is a professor and Semmoto Endowed Chair of IoT at the University of Florida, FL, USA. Earlier he was appointed as the T. and A. Schroeder associate professor of Electrical Engineering and Computer Science at Case Western Reserve University, Cleveland, OH, USA. He has over ten years of research and development experience with over 200 publications in peer-reviewed journals and premier conferences. His research interests include hardware security and trust, adaptive nanocomputing and novel test methodologies. Dr. Bhunia received the IBM Faculty Award (2013), the National Science Foundation career development award (2011), the Semiconductor Research Corporation Inventor Recognition Award (2009), the SRC technical excellence award as a team member (2005), and several best paper awards/nominations. He has been serving as an associate editor of IEEE Transactions on CAD, IEEE Transactions on Multi-Scale Computing Systems, and the ACM Journal of Emerging Technologies; he served as a guest editor of IEEE Design & Test of Computers (2010, 2013) and the IEEE Journal on Emerging and Selected Topics in Circuits and Systems (2014). He has served on the organizing and program committees of many IEEE/ACM conferences. He is a senior member of IEEE.

Appendix A: Hyperparameters
In this appendix, we describe the NS hyperparameters in greater detail. These hyperparameters can be set/changed by the user during the setup or during the operational stage. We define the number of memory hives as a system-level hyperparameter, and we propose the following hyperparameters for each memory hive (a possible configuration schema is sketched in code after the list):
1) Number of localities (L_n): Each locality is used to store data of a specific nature. It is an unsigned integer value. If there exist X types of objects-of-interest for an application, then using X + 1 localities is advised. Every object type can be assigned to a specific locality for optimal search efficiency and data granularity control. The last locality can be used for storing the unimportant data.
2) Memory decay rate of each locality: Controls the rate at which data neuron memory strength and features are lost due to inactivity. It is a list (of length L_n) containing positive floating-point values. Assume that two localities L_1 and L_2 hold data of importance I_1 and I_2 respectively; if I_1 > I_2, then it is advised to pick a decay rate for L_1 that is lower than that of L_2.
3) Association decay rate of each locality: Controls the rate at which NMN associations lose strength due to inactivity. It is a list (of length L_n) containing positive floating-point values. As above, if two localities L_1 and L_2 hold data of importance I_1 and I_2 with I_1 > I_2, then it is advised to pick a decay rate for L_1 that is lower than that of L_2.
4) Mapping between data features and localities: This mapping dictates the segregation of application-relevant data and their assignment to a locality with a lower decay rate. It is a dictionary with L_n entries (one for each locality). Each entry is a list of data features (vectors) which, when present in a data item, make it a fit for the respective locality.
5) Data features and cue extraction AI (Artificial Intelligence) models: These models are used to obtain more insights about the data during store/retrieve. They should be selected based on the data type being processed.
6) Data neuron matching metric: Used during the retrieve operation for finding a good match and during the store operation for data neuron merging. For example, this metric can be something like cosine similarity.
7) Neural elasticity parameters: Used to make space in case the memory becomes full. It is a dictionary with L_n entries. Each entry (corresponding to each locality) is a list of positive floating-point values. The values indicate the amount of memory strength loss imposed in successive iterations of the elasticity procedure.
8) Association weight adjustment parameter: Used as a step size for increasing/decreasing association weights inside the NMN. A higher value will increase the dynamism but lower the stability. An optimal balance should be determined based on the application and a test run.
9) Minimum association weight (ε): An unsigned integer which limits the decay of association weights beyond a certain point. A lower value will increase dynamism.
10) Degree of allowed impreciseness (ϕ): Limits the amount of data features which are allowed to be lost due to memory strength decay and inactivity. It is a floating-point number in the range [0, 100]. A value of 0 implies data can get completely removed if the need arises. Keeping the parameter at 1 will ensure everything remains in the memory while retaining some unimportant memories at extremely low quality.
11) Frequency of retention procedure: The retention procedure of NS brings in the effect of ageing. This hyperparameter is a positive integer denoting the number of normal operations to be performed before the retention procedure is called once. A lower value will increase dynamism at a cost of operation effort (energy and time consumption).
12) Compression techniques: For each memory hive, an algorithm for data compression must be specified. For example, we can use JPEG [19] compression for an image hive.
Additional hyperparameters can be added to the system to allow more fine-tuned adjustments.
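As one possible concrete organization of these knobs, the following Python sketch defines a per-hive hyperparameter record. All field names here are ours, chosen to mirror the list above; they are not part of the paper's specification.

from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class HiveHyperparams:
    num_localities: int                        # L_n
    memory_decay_rate: List[float]             # one entry per locality
    association_decay_rate: List[float]        # one entry per locality
    feature_to_locality: Dict[int, List]       # locality index -> feature vectors
    feature_extractor: Callable                # AI model for data features
    cue_extractor: Callable                    # AI model for cues
    matching_metric: Callable                  # e.g., cosine similarity
    elasticity_params: Dict[int, List[float]]  # locality -> per-iteration strength loss
    assoc_weight_adjustment: float             # step size η
    min_assoc_weight: float                    # ε
    allowed_impreciseness: float               # ϕ, in [0, 100]
    retention_frequency: int                   # operations between retention passes
    compression: str = "JPEG"                  # per-hive compression technique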
Appendix B: Operation Algorithms
In this appendix we provide implementation-level details of the different NS operations, sub-operations and procedures.
1) Write/Store Operation:
For storing a data element in the NS framework, we propose the algorithm depicted in Algo. 1. M is the NS memory system, D is the data to be stored, S is the search parameters used for data merging attempts, SearchLimit limits the amount of search effort spent on the merge attempt, up can limit the number of changes made to the data neuron search order, and k allows/disallows association weight decay. At first, the memory hive MH suitable for the data type DT (estimated from D) is selected (line 3). Params is the set of hyperparameters associated with MH (line 4). η (step-size) is the association weight adjustment parameter (line 5). From D, the data features are extracted and stored in DF (line 6). S has three components: C, the set of cues provided by the user that are associated with the given data D; T_1, the association strength threshold; and T_2, the matching threshold used when attempting to merge D.

The first step of data storage is to ensure that the memory hive MH has enough space to store the incoming data. If the memory lacks space, then, in order to emulate a virtually infinite memory, the framework reduces data features and details of less important data until enough space is created. For example, in a wildlife surveillance system for detecting wolves, a lack of space can lead to compression of image frames without wolves in them. This is an iterative procedure (lines 13-15) and the data-reduction/compression aggressiveness increases with each iteration, as shown in Algo. 5. Once enough space is made for the incoming data D, the next step is to determine the locality L which would be ideal for storing the data D (line 16). L is determined based on the hyperparameter which holds the mapping between data features and localities. Next, a search is carried out to determine if the data D can be merged with an existing data neuron (lines 21-30). Intelligently merging the incoming data with a pre-existing data neuron can help save space and increase search efficiency. For example, if the incoming data is almost the same as already existing data, then there is no need to store them separately. Before the search begins, the search order list is extracted and stored in SO (using Algo. 6). The search order list is maintained as a part of the memory M and is a dictionary containing x entries, where x is the number of cues currently in the cue bank for the hive MH. Each entry is an ordered list of <Path, DU> pairs arranged in order of decreasing search priority. The search terminates when either a good match is found or the SearchLimit is reached. During the search, at each iteration, if the data feature of the candidate/target data neuron (TargetDN → DF) has a good match with the data feature of the incoming data DF, then TargetDN is selected for merging. Using the REACTION procedure (Algo. 4), the NMN is updated using reinforcement learning to reflect a success (line 26). Otherwise, the REACTION procedure (Algo. 4) updates the NMN using reinforcement learning considering a failed merge attempt (line 28). If the merge candidate search terminates without finding a good match, then a new data neuron (DN_New) is initialized inside the locality L (line 32) and all the cue neurons corresponding to the respective cues in C are associated (if already associated, then strengthened) with it (lines 33-34). If any of these cues are not present in the cue bank, then new cue neurons are generated for those cues. The memory search order is also updated for MH to reflect the presence of the new data neuron. These <CN, DN> associations thus formed are also a form of learning.

Algorithm 1: Store
 1: procedure STORE(M, D, S, SearchLimit, up, k)
 2:     DT = D → Data_Type
 3:     MH = select_Hive(M, DT)
 4:     Params = MH → Params
 5:     η = Params → Assoc_Wt_Adj
 6:     DF = D → Features
 7:     C = S → Search_Cues
 8:     T_1 = S → Assoc_Thresh
 9:     T_2 = S → Match_Thresh
10:     C_ext = extract_Cue(Params, D)
11:     C.append(C_ext)
12:     elasticity_iter = 0
13:     while size(D) > remaining_space(MH) do
14:         ELASTICITY(elasticity_iter, MH)
15:         elasticity_iter++
16:     L = select_Locality(Params, C, D)
17:     found = False
18:     SO = GET_SEARCH_ORDER(C, MH, T_1, SearchLimit)
19:     index = 0
20:     DN_List = φ
21:     while found == False && index <= SearchLimit do
22:         {Path, TargetDN} = SO[index]
23:         if TargetDN ∉ DN_List then
24:             if match(TargetDN → DF, DF) > T_2 then
25:                 found = True
26:                 REACTION(MH, TargetDN, Path, η, 1, C, up, k)
27:             else
28:                 REACTION(MH, TargetDN, Path, η, 0, C, up, k)
29:             DN_List.append(TargetDN)
30:         index++
31:     if found == False then
32:         DN_New = initialize new DN with D in L
33:         for each c ∈ C do
34:             associate(c, DN_New)
35:         UPDATE_MEMORY_SEARCH_ORDER(MH)

The REACTION (Algo. 4) subroutine is a reinforcement learning guided procedure which is used during the store and retrieve operations for creating new associations, changing association weights, and updating the search order for cues. MH is the memory hive being updated, TargetDN is the data neuron which is the focus of the change, Path is the path to TargetDN, η (step-size) is the association weight adjustment parameter, flag indicates a search/merge success or failure, C is the set of cues used during the search/merge procedure, up is used to allow or disallow memory search order changes, and k allows/disallows association weight decay. Each association/connection a in the Path is either weakened or enhanced, depending on the values of flag and k, by an amount η (lines 2-6). If flag == 1, then the memory strength of TargetDN is increased (line 8). All the cues C_i in C are associated with TargetDN, and if a link already exists, it is strengthened (lines 9-10). Also, if any of these cues are not present in the cue bank, then new cue neurons are generated for them. Finally, if up == 1, then the memory search order for the memory hive MH is updated to reflect the alterations in the NMN.
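To make the REACTION update rule concrete, here is a minimal Python sketch under assumed in-memory types; the Association and DataNeuron structures, the strength ceiling of 100, and the ε floor value are illustrative choices for this sketch, not the authors' implementation.

from dataclasses import dataclass, field

@dataclass
class Association:
    weight: float = 1.0

@dataclass
class DataNeuron:
    memory_strength: float = 100.0
    cue_links: dict = field(default_factory=dict)  # cue -> Association

EPSILON = 1.0  # minimum association weight (ε): decay never drops below this floor

def reaction(path, target_dn, eta, flag, cues, k):
    # Strengthen every association along the traversed path on success;
    # weaken it on failure if decay is allowed (k == 1).
    for assoc in path:
        if flag == 1:
            assoc.weight += eta
        elif k == 1:
            assoc.weight = max(EPSILON, assoc.weight - eta)
    if flag == 1:
        target_dn.memory_strength = 100.0  # access restores memory strength
        for cue in cues:
            # Associate each cue with the neuron, or strengthen an existing link.
            link = target_dn.cue_links.setdefault(cue, Association())
            link.weight += eta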
2) Read/Load Operation:
For the retrieval/load operation, we propose the algorithm depicted in Algo. 2. In the current implementation, this operation only returns one data unit as output. M is the NS memory system, S is the search parameters used for retrieval, SearchLimit limits the amount of search effort spent on the searching attempt, up can limit the number of changes made to the search order, and k allows/disallows association weight decay. At first, the memory hive MH is selected based on the requested data type DT (line 3). Params is the set of hyperparameters of MH (line 4). From S, different components are extracted: C is the set of search cues provided by the memory user, DT is the data type, T_1 is the association strength threshold, T_2 is the matching threshold, C_1 is the set of coarse-grained search cues, and C_2 is the set of fine-grained search cues. Before the search begins, the search order list is extracted and stored in SO (using Algo. 6). Next, the search is carried out to determine if a good match can be found (lines 18-28). During the search, at each iteration, if the data feature of TargetDN has a good match with any of the fine-grained cues C_2, then TargetDN is selected for retrieval. In this situation, the REACTION procedure (Algo. 4) updates the NMN using reinforcement learning to reflect a success (line 23). Otherwise, the REACTION procedure (Algo. 4) updates the NMN using reinforcement learning considering a failure (line 26). Finally, the data selected for retrieval (Ret_Data) is returned.

Algorithm 2: Retrieve
 1: procedure RETRIEVE(M, S, SearchLimit, up, k)
 2:     DT = S → Data_Type
 3:     MH = select_Hive(M, DT)
 4:     Params = MH → Params
 5:     η = Params → Assoc_Wt_Adj
 6:     C = S → Search_Cues
 7:     C_ext = extract_Cue(S → Ref_D)
 8:     C.append(C_ext)
 9:     C_1 = C → Search_Cues_Coarse
10:     C_2 = S → Search_Cues_Fine
11:     T_1 = S → Assoc_Thresh
12:     T_2 = S → Match_Thresh
13:     found = False
14:     DN_List = φ
15:     Ret_Data = φ
16:     SO = GET_SEARCH_ORDER(C_1, MH, T_1, SearchLimit)
17:     index = 0
18:     while found == False && index <= SearchLimit do
19:         {Path, TargetDN} = SO[index]
20:         if TargetDN ∉ DN_List then
21:             if match(TargetDN, C_2) > T_2 then
22:                 found = True
23:                 REACTION(MH, TargetDN, Path, η, 1, C, up, k)
24:                 Ret_Data = TargetDN → Data
25:             else
26:                 REACTION(MH, TargetDN, Path, η, 0, C, up, k)
27:             DN_List.append(TargetDN)
28:         index++
29:     return Ret_Data
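As a concrete illustration of the two-level matching at the heart of this search loop, the following self-contained Python sketch pairs an ordered candidate list with cosine-similarity matching on fine-grained cues. The data-neuron attributes (features, data) are assumptions of this sketch, and the REACTION updates are indicated only as comments.

import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(candidates, fine_cues, match_thresh, search_limit):
    # candidates: ordered (path, data_neuron) pairs, e.g. from a search-order list.
    seen = set()
    for index, (path, dn) in enumerate(candidates):
        if index > search_limit:
            break
        if id(dn) in seen:
            continue  # skip a neuron already tested via another cue's list
        seen.add(id(dn))
        if any(cosine_similarity(dn.features, cue) > match_thresh
               for cue in fine_cues):
            # Success: a REACTION-style update would reinforce this path here.
            return dn.data
        # Failure: a REACTION-style update would weaken this path here.
    return None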
3) Retention:
In the human brain, memories lose features and prominence over time. To model this subconscious process, we introduce the retention procedure. This procedure can be repeated after a particular interval or can be invoked after certain events. The reinforcement learning guided algorithm to carry out this operation is depicted in Algo. 3. M is the memory system, N is the history threshold, and k allows/disallows association weight decay. During this operation, any connections/associations not accessed during the last N operations are weakened due to inactivity (lines 5-10) and any data neurons not accessed in the last N operations are subject to feature loss/compression (lines 11-18). The compression rate is limited by the maximum allowed degree of impreciseness (imp_degree) for the given memory hive (MH_ID). Also, the search order for the cues of each memory hive MH is updated to reflect any changes due to alteration of association weights (lines 19-20).

Algorithm 3: Retention
 1: procedure RETENTION(M, N, k)
 2:     C = M → Connections
 3:     D = M → Data_Neurons
 4:     if k == 1 then
 5:         for each c ∈ C do
 6:             if Not_Accessed_Recently(c, N) then
 7:                 MH_ID = Find_Memory_Hive_ID(c)
 8:                 L_ID = Find_Memory_Locality_ID(c)
 9:                 a_decay = Find_Assoc_Decay(MH_ID, L_ID, M)
10:                 Weaken(c, a_decay)
11:     for each d ∈ D do
12:         if Not_Accessed_Recently(d, N) then
13:             MH_ID = Find_Memory_Hive_ID(d)
14:             L_ID = Find_Memory_Locality_ID(d)
15:             d_decay = Find_Mem_Decay(MH_ID, L_ID, M)
16:             imp_degree = Find_Imprec_Degree(MH_ID)
17:             d_str_new = reduce_mem_str(d, d_decay)
18:             Compress_Mem(d, d_str_new, imp_degree)
19:     for each MH ∈ M do
20:         UPDATE_MEMORY_SEARCH_ORDER(MH)

Algorithm 4: Reaction
 1: procedure REACTION(MH, TargetDN, Path, η, flag, C, up, k)
 2:     for each a ∈ Path do
 3:         if flag == 1 then
 4:             Increase_Assoc_Weight(a, η)
 5:         if flag == 0 && k == 1 then
 6:             Decrease_Assoc_Weight(a, η)
 7:     if flag == 1 then
 8:         Increase_Mem_Str(TargetDN)
 9:         for each C_i ∈ C do
10:             associate_enhance(C_i, TargetDN)
11:     if up == 1 then
12:         UPDATE_MEMORY_SEARCH_ORDER(MH)
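A toy Python rendering of the retention pass is given below, under assumed structures (connections and data neurons carrying an ops_since_access counter and a locality index); reencode is a stand-in for the hive's compression technique, e.g., a JPEG re-encode for an image hive.

def retention(memory, history_threshold, k, epsilon=1.0):
    # Weaken associations that have been idle for more than N operations.
    if k == 1:
        for assoc in memory.connections:
            if assoc.ops_since_access > history_threshold:
                decay = memory.assoc_decay_rate[assoc.locality]
                assoc.weight = max(epsilon, assoc.weight - decay)
    # Decay and compress data neurons that have been idle.
    for dn in memory.data_neurons:
        if dn.ops_since_access > history_threshold:
            decay = memory.mem_decay_rate[dn.locality]
            dn.memory_strength = max(0.0, dn.memory_strength - decay)
            compress(dn, dn.memory_strength, memory.imprecision_degree)

def compress(dn, strength, imprecision_floor):
    # Re-encode the payload at a quality tied to strength, never below the floor
    # implied by the allowed degree of impreciseness.
    quality = max(imprecision_floor, strength)
    dn.data = reencode(dn.data, quality)

def reencode(data, quality):
    # Stand-in for the hive's compression algorithm (e.g., a JPEG re-encode).
    return data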
A. Elasticity
The ELASTICITY (Algo. 5) subroutine is used during the store operation for making space in case of a memory shortage. The NS framework is designed to operate as a virtually infinite memory where no data is ever deleted; instead, unimportant data are subject to feature loss over time. The elasticity hyperparameters are first extracted into elast_param from the memory hive MH (line 2). Depending on the current iteration of elasticity (elasticity_iter) and the locality (L), we obtain the factor of memory strength decay, ef (line 5). For each data neuron D in the locality L, the new memory strength is computed and the data is compressed if required (lines 6-8). The compression rate is limited by the maximum allowed degree of impreciseness (imp_degree) for the given memory hive (MH).
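The escalation across iterations can be sketched in Python as follows; this reuses the compress/reencode stand-ins from the retention sketch above, and interpreting each schedule entry as a cap on memory strength is our assumption, since the text specifies only that aggressiveness increases with each iteration.

def make_space(hive, incoming_size):
    # Each pass imposes a harsher strength cap, following the per-locality
    # schedule (e.g., [80, 70, ..., 1] in Appendix C).
    iteration = 0
    while incoming_size > hive.remaining_space():
        for locality in hive.localities:
            schedule = locality.elasticity_schedule
            cap = schedule[min(iteration, len(schedule) - 1)]
            for dn in locality.data_neurons:
                dn.memory_strength = min(dn.memory_strength, cap)
                compress(dn, dn.memory_strength, hive.imprecision_degree)
        iteration += 1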
B. Get Search Order

Algo. 6 is a subroutine which is used to fetch the search order for a search/merge attempt. C is the set of cues provided for the operation, MH is the memory hive where the search/merge attempt is to take place, T_1 is the association strength threshold, and SearchLimit is used to limit the number of candidates selected. For each cue C_i in C, the search order list is fetched from MH and stored in SO (line 5). Then, for each candidate in SO, if the average association strength of the candidate is greater than T_1, the candidate is appended to the CandidateList. The final CandidateList is returned at the end of the function (line 14) or at line 11, in case of an early exit.
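A compact Python sketch of this candidate collection, assuming each per-cue list is already sorted by decreasing average association strength (which is what justifies the early break):

def get_search_order(cues, search_order_table, assoc_thresh, search_limit):
    # search_order_table: cue -> list of (path, data_neuron, avg_assoc_strength),
    # each list pre-sorted by decreasing strength.
    candidates = []
    for cue in cues:
        for path, dn, strength in search_order_table.get(cue, []):
            if strength > assoc_thresh:
                candidates.append((path, dn))
                if len(candidates) >= search_limit:
                    return candidates  # early exit once the limit is reached
            else:
                break  # sorted list: every remaining entry is weaker
    return candidates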
C. Update Memory Search Order
Algo. 7 is a subroutine which is used to update the search order for a given memory hive (MH). Cues holds all the cues in MH. For every cue C in Cues, the path to each reachable data neuron D_i with the highest average association strength is selected, and the entries are stored in decreasing order of average association strength. The NewSearchOrder replaces the previous search order for MH (line 12).
D. Update Operation
In traditional CAM, the update operation corresponds to changing the data associated with a particular tag/cue or changing the tag/cue associated with particular data. Such updates are also possible in NS using the retrieve, store and retention procedures. For example, to associate a new tag/cue with a data item, one can simply issue a retrieve operation on the target data with the new cue. This will cause the NS system to associate the new cue with the target data. Any old associations of the data with other cues will lose prominence over time if those associations do not get referenced. The storage, retrieval and retention algorithms can be used in many different ways to automatically form new associations and modify existing associations. Traditional CAM also supports data deletion, which can also be enabled in NS by setting the 'degree of allowed impreciseness' hyperparameter to 0. Deletion is not directly achieved in NS, but less important data (worthy of deletion) will slowly decay in terms of memory strength due to lack of access and ultimately disappear.
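As an illustration of this update-by-access behaviour, the following hypothetical snippet shows how a single retrieve call could attach a new cue; the ns handle and its retrieve signature are invented here, loosely mirroring the parameters of Algo. 2.

# Hypothetical usage only: `ns` and its API are invented for illustration.
new_cue = "Canis"
result = ns.retrieve(
    search_params={"coarse_cues": [new_cue],
                   "fine_cues": [target_feature_vector]},
    search_limit=100, up=1, k=0)
# On a successful match, REACTION links `new_cue` to the matched data neuron;
# stale cue associations later fade through retention-driven decay.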
Algorithm 5: Elasticity
 1: procedure ELASTICITY(elasticity_iter, MH)
 2:     elast_param = MH → Params → elast_param
 3:     imp_degree = Find_Imprec_Degree(MH)
 4:     for each L ∈ MH do
 5:         ef = elast_param[L → Index][elasticity_iter]
 6:         for each D ∈ L do
 7:             d_str_new = Decrease_Mem_Str(D, ef)
 8:             Compress_Mem(D, d_str_new, imp_degree)

Algorithm 6: Get Search Order
 1: procedure GET_SEARCH_ORDER(C, MH, T_1, SearchLimit)
 2:     CandidateList = φ
 3:     Count = 0
 4:     for each C_i ∈ C do
 5:         SO = MH → Search_Order[C_i → Index]
 6:         for each candidate ∈ SO do
 7:             if candidate → assoc_Str > T_1 then
 8:                 CandidateList.append(candidate)
 9:                 Count++
10:                 if Count >= SearchLimit then
11:                     return CandidateList
12:             else
13:                 break loop
14:     return CandidateList

Algorithm 7: Update Memory Search Order
 1: procedure UPDATE_MEMORY_SEARCH_ORDER(MH)
 2:     Cues = MH → Cues
 3:     NewSearchOrder = φ
 4:     for each C ∈ Cues do
 5:         cueSearchOrder = φ
 6:         D = all data neurons reachable from C
 7:         for each D_i ∈ D do
 8:             SP = find_Average_Strongest_path(C, D_i)
 9:             assoc_Str = average_Assoc_Str(SP)
10:             insert_Sort(cueSearchOrder, D_i, SP, assoc_Str)
11:         NewSearchOrder[C → Index] = cueSearchOrder
12:     MH → Search_Order = NewSearchOrder
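Under the single-hop simplification used in the case studies (Appendix C), this rebuild can be sketched in Python as follows; the cue.links mapping is an assumed representation of cue-to-data-neuron associations, not the authors' data structure.

def update_search_order(hive):
    # Rebuild every cue's priority list from the current association weights.
    new_order = {}
    for cue in hive.cues:
        entries = []
        for dn, assoc in cue.links.items():  # single-hop path: one association
            entries.append((assoc.weight, [assoc], dn))
        entries.sort(key=lambda e: e[0], reverse=True)  # strongest paths first
        new_order[cue] = [(path, dn) for _, path, dn in entries]
    hive.search_order = new_order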
Appendix C: NS Simulator Case-Study Configuration

In this appendix, we provide the NS simulator configuration used during the case studies. In the context of the case studies, we allow the following simplifications over the proposed organization and algorithms (presented in Appendix B):
• Connections between cue neurons are not formed.
• Search order entries have a path length of 1.
• Every locality has a "default cue" which is connected with all the data neurons in the locality and is used to access data neurons which are not otherwise accessible from normal cues. This construct emulates <DN, DN> associations.

A. NS Hyperparameters Used
For all the case studies, we use a single memory hive for holding the image data. For this specific hive we use the following hyperparameters:
• Number of localities (L_n): 2.
• Memory decay rate of each locality: [0.5, 1]
• Association decay rate of each locality: [0, 0]
• Mapping between data features and localities: Depends on the application and case study.
  - Wildlife Surveillance: For scenario 1, deer images are mapped to locality 0 and other images are mapped to locality 1. For scenario 2, wolf/fox images are mapped to locality 0 and other images are mapped to locality 1.
  - UAV-based Security System: Car images are mapped to locality 0 and other images are mapped to locality 1.
• Data features and cue extraction AI (Artificial Intelligence) models: VGG 16 predicted classes are used as coarse-grained cues [23]. For data features and fine-grained cues, we use the VGG 16 fc2 activations.
• Data neuron matching metric: Cosine similarity.
• Neural elasticity parameters:
  0 → [80, 70, 60, 50, 40, 30, 20, 10, 1]
  1 → [80, 70, 60, 50, 40, 30, 20, 10, 1]
• Association weight adjustment parameter: 20
• Degree of allowed impreciseness: 1
• Frequency of retention procedure: 500
• Compression technique: We use JPEG compression for the image hive.
B. NS Operation Parameters Used
These operation parameters (as described in Appendix B) are used during all store, retrieve and retention operations/procedures unless otherwise specified. For convenience, the full configuration is also transcribed as a code sketch after this list.
• S → Assoc_Thresh: 0
• S → Match_Thresh: 0.95
• SearchLimit: Maximum (Int_Max)
• up = 1
• k = 0
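The following Python transcription collects the values above into plain dictionaries; the key names are ours, and SearchLimit is represented as None to stand for "unbounded" (Int_Max in the text).

hive_config = {
    "num_localities": 2,
    "memory_decay_rate": [0.5, 1],
    "association_decay_rate": [0, 0],
    "cue_extractor": "VGG16 predicted classes",    # coarse-grained cues [23]
    "feature_extractor": "VGG16 fc2 activations",  # features / fine-grained cues
    "matching_metric": "cosine_similarity",
    "elasticity_params": {
        0: [80, 70, 60, 50, 40, 30, 20, 10, 1],
        1: [80, 70, 60, 50, 40, 30, 20, 10, 1],
    },
    "assoc_weight_adjustment": 20,
    "allowed_impreciseness": 1,
    "retention_frequency": 500,
    "compression": "JPEG",
}

operation_params = {
    "assoc_thresh": 0,
    "match_thresh": 0.95,
    "search_limit": None,  # unbounded (Int_Max)
    "up": 1,
    "k": 0,
}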
Appendix D: Dynamism of NS: A Detailed Analysis

To visually illustrate the dynamism of NS, we showcase a series of NS framework snapshots in Fig. 9. The same operation parameters and hyperparameters as mentioned in Appendix C are used, with the following exceptions:
• Memory decay rate of each locality: [10, 20]
• Mapping between data features and localities: All fox/wolf images are designated to stay in locality 0 and the rest are assigned to locality 1.
• Association weight adjustment parameter: 10
• Frequency of retention procedure: 1

Fig. 9: A peek inside the NS framework for understanding its dynamism and plasticity.

The top 4 graphs in Fig. 9 plot the data neuron memory strength (left Y-axis) and size (right Y-axis) against the number of operations performed (X-axis) for each of the 4 data neurons in the scenario. The 12 snapshots (IDs provided in the lower-left corners) in Fig. 9 are described below:
1) In the initial state, there are two data neurons in locality 0 and one data neuron in locality 1.
2) The first operation starts with the coarse-grained cue "Wolf" and the fine-grained cue corresponding to the data feature of DN1. Using the coarse-grained cue "Wolf", the NS framework first compares the fine-grained cue with the data feature of DN0. The matching fails due to lack of similarity.
3) In this step, the previous operation continues and the NS framework reaches DN1 using the coarse-grained cue "Wolf". The fine-grained cue and the data feature of DN1 match, leading to a successful retrieval. The weight of the <Wolf, DN1> association is increased. Also, note that the memory strength of all data neurons decreases and the memory strength of DN1 is restored back to 100 (indicated by the red arrows).
4) The next operation is of type store. The cue provided is "Wolf" and new data with nothing similar in locality 0 is provided. The NS framework first attempts to merge the incoming data with an existing data neuron. The first match with DN1 fails because they are not similar. Note that DN1 is searched first because the <Wolf, DN1> association is stronger than <Wolf, DN0>.
5) The match with DN0 also fails due to lack of data similarity.
6) Given that the merge attempt failed, a new data neuron DN3 is generated for the new incoming data. DN3 gets associated with the other DNs via the default cue neuron (cyan coloured circle in the middle) and also gets associated with the cue "Wolf". Furthermore, the memory strength of all remaining data neurons decreases.
7) The next operation is a retrieve operation with the coarse-grained cue "Canis" and the fine-grained cue same as the data feature of DN1. There is no cue neuron for "Canis", so the search is carried out via the default cue neuron (locality 0). The first search yields DN0, which is not a good match.
8) The second search yields the correct output. The <default cue neuron, DN1> strength increases. A new cue neuron for "Canis" is generated and linked with DN1. Also, the memory strength of all data neurons except DN1 decreases and the memory strength of DN1 is restored back to 100.
9) The next operation is also a retrieve operation, with the coarse-grained cue "Wolf" and the fine-grained cue same as the data feature of DN0. <DN1, Wolf> has the highest strength among the "Wolf" associations, so DN1 is compared with first. This, however, does not yield a good match.
10) The next data neuron searched is DN0 and a good similarity is found. The association <Wolf, DN0> increases in strength. The memory strength of DN0 is restored back to 100 and the memory strength of all remaining data neurons decreases.
11) The next operation is of type retrieve, with the coarse-grained cue "Wolf" and the fine-grained cue same as the data feature of DN0. The association <Wolf, DN0> is explored first (being first in the search order for cue "Wolf") and a good match is found. The memory strength of DN0 is restored back to 100 and the memory strengths of all remaining data neurons decrease. Also, the <Wolf, DN0> association is strengthened.
12) The next operation is a store operation and data very similar to DN0 is provided with the cue "Wolf". The first merge attempt with DN0 is a success and no new data neuron is generated for the incoming data. The memory strength of DN0 is restored back to 100 and the memory strengths of all remaining data neurons decrease. Also, the <Wolf, DN0> association is strengthened.

In the DN0 graph (Fig. 9), we observe that the data size and memory strength are maintained at relatively high values throughout the case study. This is because DN0 has been accessed the most among all data neurons. For DN2 (the background image in locality 1), the memory strength constantly decays at a higher rate due to lack of access and importance.

Appendix E: Additional Applications Suitable for NS

We have observed that the wildlife surveillance system and the UAV-based security system can benefit greatly from using NS. We believe that there are many other such applications suitable for NS which are widely used in different domains. Some of these applications are discussed next.
A. Agriculture Automation
Automation in agriculture, such as weed detection [24], aerial phenotyping [25] and agricultural pattern analysis [26], requires storing and subsequently analyzing huge amounts of image data. A dynamic, intelligent and virtually infinite memory framework like NS can adapt to this demand while providing optimal performance.
B. Forest Fire Detection
Forest fire detection and prompt reaction can save lives and reduce the impact on air and soil. Different proposed forest fire detection systems capture, store and analyse multi-modal data such as images, sound, gas profiles, temperature and humidity [27], [28], [29]. Efficiently managing and segregating (important from unimportant) this huge amount of constantly streaming data is crucial for success, and an intelligent priority-driven memory system such as NS can help optimize storage, retrieval and analysis performance.
C. Post-Disaster Infrastructure Inspection
Automatic detection techniques for inspecting infrastructure damage due to natural disasters such as earthquakes and hurricanes are being widely explored [4], [3], [30]. Most of these techniques deal with a huge influx of data that may not always be relevant to the task. An intelligent memory framework such as NS can optimize the entire data retention process to boost overall system performance.
D. Maritime Surveillance
Detecting and monitoring maritime activities often involves SAR (synthetic-aperture radar) data, standard radar data, infrared data and video data [6]. Efficiently handling this huge amount of multi-modal data is crucial for success, and NS would be ideal for such applications.