On the Equivalence of Neural and Production Networks
Roy Gernhardt, Bjorn Persson

May 4, 2020
Abstract
This paper identifies for the first time the mathematical equivalence between economic networks of Cobb-Douglas agents and Artificial Neural Networks. It explores two implications of this equivalence under general conditions. First, a burgeoning literature has established that network propagation can transform microeconomic perturbations into large aggregate shocks. Neural network equivalence amplifies the magnitude and complexity of this phenomenon. Second, if economic agents adjust their production and utility functions in optimal response to local conditions, market pricing is a sufficient and robust channel for information feedback leading to global, macro-scale learning at the level of the economy as a whole.
There is a burgeoning literature surrounding the network origins of aggregate shocks. Network economics can model how an individual's behavior affects the society in which she is embedded (Goyal, 2015). Several mathematical frameworks have been proposed that amplify small microeconomic fluctuations into large macroeconomic movements (Acemoglu et al., 2012, 2015) which would otherwise be modeled as exogenous shocks. This paper's core innovation is that it identifies a more natural and profoundly more flexible mathematical model than previously identified. This model is intrinsic to existing theory. Specifically, economic networks of Cobb-Douglas agents facing nonlinear inverse demand functions are mathematically equivalent to Artificial Neural Networks (ANNs).

We survey the pertinent mathematics at the intersection of the two disciplines of Economics and Machine Learning to demonstrate this equivalence. We explore two implications of modeling the economy as a Neural Network (NN) under general conditions. The first is a consequence of the functional completeness of NNs: neither the amplitude nor complexity of an aggregate shock in response to a given microeconomic input can be reasonably bounded. Any conceivable functional form is possible. Second, if the producers and consumers are free to adjust their production and utility functions in optimal response to local conditions, market pricing is a sufficient and robust channel for information feedback leading to global, macro-scale learning at the level of the economy as a whole. This learning is transparent to the individual producers and consumers. For aesthetic reasons we address these three topics - mathematical equivalence, endogeneity/exogeneity of macro shocks, and global learning behavior - in reverse order.

As an initial exploration into this topic, imagine the fictional country of Islandia, a small, isolated nation somewhere in the South Pacific.
Its economy is driven by the import of raw materials and the export of finished goods. Islandia's imports consist of only two products: raw steel and brass. The country uses these two raw materials to fashion intermediate goods which are in turn used to fashion its single export commodity – a product for which Islandia is renowned throughout the world – clockwork chess players. Each month, a cargo ship from the nearest Economic Super-Power offloads a shipment of brass and steel and hauls away a load of Mechanical Turks to be sold in catalogs, websites, and shopping malls. It takes exactly one month for the small nation to convert brass and steel into chess players.

The producers in Islandia form a simple hierarchy. Eight work with the raw steel and brass to create tools, tubing, spring steel, gear blanks, cog blanks, and the like. These eight producers are the first level in the economy, and their only inputs are the two imported raw materials. There are then eight producers in the second level using some of each of the first level's products as inputs. Levels three and four each have eight more producers who use the previous level's outputs as inputs. The eight outputs of level four are the sub-components used by the single level-five producer to make clockwork chess players. So there are 33 individual producers in Islandia, all of whom are organized into well-defined layers where 32 producers manufacture intermediate goods, and one producer manufactures the final good.

The producers in Islandia all have Cobb-Douglas production functions of the form:

    Y = A \prod_{i=1}^{n} x_i^{\alpha_i},    (1)

where A is a technology parameter shared by all of the producers within the same layer.

Figure 1: Input/Output matrix of Islandia economy. Arbitrary producers are highlighted for visual clarity.

The producers each attempt to maximize their own profits:

    \pi = P Y - \sum_{i=1}^{n} p_i x_i,    (2)

where P is the anticipated price (the price the producer believes she will get for her finished good at the time she makes her production choices), p_i is the price of the input commodity x_i, and n is the number of input commodities (always eight except for the first layer, where it is two).

Prices are set through Walrasian tatonnement. Therefore, the accuracy of the anticipated price cannot be known at the time of production. The actual price manufacturer m receives for her output minus the price she expected to receive gives m a value for her "pricing error." She uses the pricing error to estimate her production error – that is, the amount that she over- or under-produced. This production error could also be called, using the terminology of Artificial Neural Networks (ANNs), her "cost function." She is motivated to minimize this cost function, and makes two types of adjustments to do so.

Once the producer has calculated her cost function, she first adjusts her anticipated price. She uses a simple moving average for this adjustment. Next, she employs a single step of the gradient descent procedure (or similar) to make small adjustments to the exponents in her production function (Ruder, 2016). She is able to make these adjustments through slight changes to the processes and technologies that she employs.

Let E be the production error, or cost, let \partial Y / \partial \alpha_i be the partial derivative of output with respect to exponent \alpha_i, and let \mu be a "learning rate" parameter. Using gradient descent, producer m will update each of the exponents of her production function according to the rule:

    \alpha_i' = \alpha_i - E \mu \frac{\partial Y}{\partial \alpha_i}.    (3)

All manufacturers on Islandia use the same learning process as producer m.
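The update rule in equation (3) can be sketched in code. For a Cobb-Douglas function, \partial Y / \partial \alpha_i = Y \ln x_i, which makes each step direct. This is a minimal sketch under our own naming and sample values, not code from the paper's simulation:

```python
import math

def cobb_douglas(A, x, alpha):
    """Y = A * prod_i x_i**alpha_i  (equation 1)."""
    return A * math.prod(xi**ai for xi, ai in zip(x, alpha))

def update_exponents(A, x, alpha, E, mu):
    """One gradient-descent step on the exponents (equation 3).

    Since dY/dalpha_i = Y * ln(x_i), each exponent moves against the
    production error E, scaled by the learning rate mu.
    """
    Y = cobb_douglas(A, x, alpha)
    return [ai - E * mu * Y * math.log(xi) for ai, xi in zip(alpha, x)]
```

With a positive production error and input quantities above one, every exponent shrinks slightly; with zero error, the exponents are unchanged.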
Their information set does not allow any of them to make adjustments to their production based on the quantities of imports or exports from the nation – but together, their economy forms what we call an Economic Neural Network (ENN) that defines and refines decision boundaries – regions of the two-dimensional (brass and steel) import quantity map for which different production levels are preferred.

The arrangement of the interconnections between neurons in a Neural Network (NN) is known as the Neural Architecture (Misra and Saha, 2010; Svozil et al., 1997). Islandia has a simple "Feed Forward" architecture – that is, each neuron outputs only to neurons on the next layer. There are no outputs to neurons of the same or previous layers. Feed Forward networks can vary by the number of "hidden layers" (a term referring to all layers except the input and output layers – the Brass and Steel and the Mechanical Turks in Islandia) and also the number of neurons in each layer. In general, the number of neurons per hidden layer defines the number of dimensions used to calculate the decision boundary while the number of layers defines how convoluted the decision boundary can be.

We note that the ENN on Islandia performs very poorly by Artificial Neural Network standards. Unlike the NNs used in Artificial Intelligence applications, the economy of the little nation is not optimized for network performance. Every agent optimizes their choices based only on the local price feedback mechanism (the local error), not on the component share in the global error (as calculated by Islandia's export market and final producer – as would be the case in an efficient ANN). Because price information must trickle down the network layer by layer, the information a producer uses to adjust production is delayed by several periods compared to the information that was used to direct that production.
This phase delay scrambles much of the information about the economy's overall optimization problem and prevents the country's ENN from learning to predict with accuracies that rival ANNs. Economies less rigid than Islandia's are not as highly constrained. Price is not the only information channel available to most producers. Economic reports, stock market performance, communication with suppliers and customers, and other business news sources give each producer direct information about global system performance.

Figure 2: Maps of the six training dataset result values as a function of input quantities. For odd-numbered datasets the shaded region is true and the unshaded region is false, vice-versa for even-numbered datasets. Each training dataset consists of 100 periods' worth of randomly generated input-result vector pairs. The 2D input vector consists of quantities imported of steel and brass and the 1D boolean result vector is true if demand in the period is higher than average. Using dataset 1 as an example, let the randomly generated input vector be 40 units of steel and 40 units of brass. The map defines the matching output vector as true. In contrast, dataset 5 would map the same point to false.

Another factor that prevents Islandia's ENN from approaching the performance of an optimized ANN is that the Cobb-Douglas production functions are strictly increasing with respect to the input quantity of goods. Neurons within ANNs, on the other hand, use functions that can be either increasing or decreasing, as needed, with respect to their inputs. This means, for example, that Islandia cannot manufacture more Mechanical Turks in response to a shortage in both steel and brass – even if the corresponding increase in steel and brass prices consistently signals booms in demand for clockwork chess players next period. We will examine the mathematics of this phenomenon below and demonstrate realistic economic models that do not suffer from this weakness.

All Neural Networks, whether ANNs, biological systems, or ENNs, are subject to trouble with local optima. This is certainly an issue faced by each of Islandia's producers and the country as a whole, but there is another notable and idiosyncratic barrier to learning in Islandia. Every producer has amnesia. They have complete medium- and long-term memory loss. Therefore each producer can make exactly one adjustment to their production function per period based only on the information in front of them. This is equivalent to a hyper-restricted "online learning" paradigm in an ANN (Misra and Saha, 2010; Svozil et al., 1997). High-power ANNs, however, typically use "batch-learning" processes.
In batch-learning, the error function is calculated from a large number of training examples (in Islandia's case, periods) simultaneously. This is advantageous because a single training example might lead a Neural Network to update in such a way that it improves performance for that single example but degrades performance for all others. Given the further possibility of local optima, this means a memoryless online learning paradigm will sometimes lead to network "unlearning" – that is, training decreases performance rather than improves it. Fortunately, most producers in the world economy have a memory so this is only a problem in Islandia.
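The contrast between the two paradigms can be made concrete with a toy example. This is our own illustration using a simple squared-error objective, not the paper's pricing model:

```python
def online_step(theta, grad, example, mu):
    """Update from a single training example, as Islandia's producers must."""
    return theta - mu * grad(theta, example)

def batch_step(theta, grad, examples, mu):
    """Average the gradient over many examples before updating."""
    g = sum(grad(theta, ex) for ex in examples) / len(examples)
    return theta - mu * g

# Toy objective: L(theta; ex) = (theta - ex)^2, so the gradient is 2*(theta - ex).
squared_error_grad = lambda theta, ex: 2.0 * (theta - ex)
```

With examples 1.0 and 3.0 the batch step moves the parameter toward their mean, while successive online steps chase whichever example was seen last, which is the source of the "unlearning" behavior described above.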
Table 1:
Average Islandia ENN Learning Performance Over 20 Trials

            Initial Accuracy   Post-training Accuracy   Training Improvement
            (Std. Error)       (Std. Error)             (t-score)
Dataset 1   20.15%             32.2%                    12.05%**
            (0.69%)            (3.35%)                  (3.3503)
Dataset 2   76.35%             90.3%                    13.95%***
            (1.34%)            (0.5624%)                (9.586)
Dataset 3   70.2%              76.85%                   6.65%***
            (0.73%)            (1.0495%)                (5.1905)
Dataset 4   27.65%             79.1%                    51.45%***
            (0.84%)            (0.764%)                 (45.2923)
Dataset 5   49.7%              54.8%                    5.1%*
            (1.13%)            (2.0424%)                (2.1865)
Dataset 6   49.05%             54%                      4.95%**
            (1.10%)            (1.4654%)                (2.7024)
Average Percent Accuracy of Islandia's ENN before and after training using the six different datasets illustrated in Figure 2 on page 5. Each result is the average of 20 independent trials with random initialization of the ENN's parameters. Training Improvement is defined as (Percent Accuracy post-training) minus (Percent Accuracy pre-training).
*, **, and *** denote greater than 95%, 99%, and 99.99% confidence rejection of the one-sided null hypothesis of no average training improvement.
Even with Islandia's limitations, however, its economy shows a remarkable ability to learn – that is, its output changes in response to patterns in the input prices. Although this learning is driven by individual microeconomic optimization choices, the individual producers are unaware learning is occurring. It happens in the aggregate, on the macroeconomic level, and not on the level of the producers who drive it.

Islandia is worth studying in this regard because of the simple structure of its economy. We are able to create a precise and tractable computer model by which to simulate its learning performance. The learning response of the model of Islandia's economy with six different datasets is reported in Table 1. The datasets consist of 100 periods of import supply and export demand data randomly generated to conform to the patterns in Figure 2 on page 5. For each period there is a pair of input quantities (representing raw steel and brass), each ranging between 0 and 100, and a Boolean value assigned to "true" if Islandia can expect higher than normal demand for their clockwork chess players. The six datasets each exhibit a different pattern determining which regions of the input/output map are true and which regions are false. The ENN attempts to converge to a different optimal decision boundary for each dataset.

We initialized twenty independent model economies for each dataset, and in each case trained the economies over 600 randomly drawn rounds (each round consisting of a single training example from the dataset). To measure learning, we first define the threshold output quantity as the average output of the untrained economy across a subset of the training set inputs. We then interpret an output above this threshold as corresponding to the Boolean true value and an output below the threshold as corresponding to Boolean false. The Percent Accuracy of the network is the proportion of output true/false values that match the input true/false values.
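The data generation and evaluation procedure can be sketched as follows. The region rule below is a hypothetical stand-in for the patterns of Figure 2 (the actual region shapes are only given graphically), while the threshold logic mirrors the description above:

```python
import random

def make_dataset(n=100, rule=lambda s, b: s + b > 100, seed=0):
    """Randomly generated periods: (steel, brass) pairs in [0,100]^2
    with a boolean demand label assigned by a region rule.  The default
    rule (s + b > 100) is an illustrative placeholder, not Figure 2."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        s, b = rng.uniform(0, 100), rng.uniform(0, 100)
        data.append(((s, b), rule(s, b)))
    return data

def percent_accuracy(outputs, labels, threshold):
    """Interpret output > threshold as boolean true (high demand),
    then score the proportion of matches against the labels."""
    hits = sum((y > threshold) == lab for y, lab in zip(outputs, labels))
    return hits / len(labels)
```

The threshold itself would be computed as the mean output of the untrained economy over a subset of training inputs, as described in the text.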
We define a simple measure of learning performance as the increase in Percent Accuracy – that is, the Percent Accuracy of the trained network minus the Percent Accuracy of the untrained network.

Learning performance is reported in Table 1 on page 6. Using dataset 1, the ENN Percent Accuracy increased 12.1% on average. This was the most difficult dataset for the ENN to respond to since lower input quantities of the raw materials needed to correspond with higher output quantities of the finished good. With dataset 2, Percent Accuracy increased 14.0% on average and with dataset 3 it increased 6.7%. Dataset 4 yielded an accuracy increase of 51.5%, dataset 5 of 5.1%, and dataset 6 accuracy increased by 5.0%.

At first blush, the equivalence between ENNs and ANNs might seem a mere curiosity. So we must address two practical questions. How does this model fit within the field of economics' current understanding? And why is this important?

The ENN model is certainly distinct from the canonical models which are based on smooth, typically monotonic and convex functions that describe aggregations of microeconomic agents. But the ENN model does not challenge the existing models. The two types of models ask and answer different questions.

Imagine we are analyzing a teenager's trip to a movie theater which they make every Saturday. There are twelve turns, each at the next intersection. The turns are, in order:
L, L, R, L, L, R, R, R, R, L, R, L. If we are interested in the wear on the car's tires, we might ask about the number of turns and their distribution between left and right. The fact that there are twelve turns, half of them in each direction, is useful information in that context. If this information was given to another teenage driver looking for directions to the movie theater, however, it would be of almost no value. With this information, the driver could do no better than flip a coin at each intersection, turning left on heads and right on tails. Directions are logical instructions which cannot be decoded from differentiable probability distributions. This illustrates the inadequacy of modeling any logical system in the aggregate – the encoded logical function is completely invisible to the aggregate model. It is washed out and subsumed in what appears to be random noise. Since, as we show, the macroeconomy can be understood as a logical system, aggregate modeling cannot be sufficient to account for every aspect of the economy's behavior. The ENN, however, models the economy as a logical function rather than a smooth function, so it is subject to a different set of limitations.

As for the importance of the ENN model, the Islandia computer simulations illustrate that ENNs retain some of the learning capabilities of ANNs even under highly restrictive conditions. This meta-learning is a high-order emergent phenomenon which is entirely novel to the field of economics. Next we explore how small, local shocks can propagate through an ENN to generate sometimes highly surprising aggregate behaviors. Here we build upon the arguments of Vasco Carvalho (2014), who uses network structures to "confront a deep-seated and influential logic which, to this day, justifies the continued appeal to an exogenous synchronization device, in the form of aggregate shocks." Our model amplifies Carvalho's reasoning.
When a laborer takes maternity or bereavement leave or contracts a serious illness, or when a company changes hands through inheritance, these micro shocks are not necessarily lost in the random noise of the aggregate economy. Instead, they can become important signals which reverberate through the markets and change the direction of the economy as a whole.

A modern economy can be described as an intricate system of networks in which consumers and producers interact via reasonably efficient markets. Within these networks, small changes directly affecting only a small set of producers within a sector of the economy may translate to large and unforeseeable changes in the aggregate. This system is too complex to describe in detail, and impossible to model explicitly using standard economic models. Standard models often aggregate heterogeneity into distributions according to various assumptions and thereby lose this important feature of the economy they seek to model. Attempts have been made, however, to better understand the inner workings of production networks and how micro shocks in them produce macroeconomic fluctuations (see e.g. Acemoglu et al. (2012), Carvalho (2014)).

Here we are considering a world where explicit modeling is not possible due to the complexity of the various networks that comprise the economy. A key feature is to retain the idiosyncrasies that characterize real economies and how they propagate through the network, producing sometimes unpredictable, inexplicable, and, not infrequently, unintended consequences.

For tractability, we specifically consider a simple network architecture with one input, one hidden layer, and a dual output. The set-up is inspired by Rosenblatt's perceptron model in which binary classifiers learn to recognize patterns through an algorithm when fed training data (Rosenblatt, 1961). Our model is a slightly modified version of the original perceptron. We will address the mathematical justification for this in detail below.
The network is one-directional, so there are no feedback loops. Consequently, unlike both the perceptron model and Islandia, learning cannot occur. Output seems quite random, yet follows logical principles that are unobservable. Outside empirical data can be fitted to the model, and this data will be consistent with reality. Naturally, this data will be overfitted. Standard models rely on exogenous variation (i.e., exogenous shocks) to explain e.g. aggregate business cycle fluctuations – here these fluctuations arise naturally from individual changes in behavior that may occur for endogenous reasons.

We use a basic Cobb-Douglas technology in which a single input price is fed to a set of up to ten producers who supply intermediate goods to the final producer in the economy. The final producer then chooses between two kinds of outputs. We illustrate the ratio of these two outputs in Figure 4 on page 11 with respect to the price of the input commodity. In order for the impacts of these behaviors to be as clear as possible, it is helpful to imagine the two products are Labor and Leisure. Suppose the final producer is a hot dog vendor choosing whether to provide snacks at a baseball game or instead to take a seat and watch the Red Sox fight off their arch rivals (who shall remain nameless). The modification of the vendor's production function or the cascading effects of modifications to intermediate producers' production functions lead to unforeseen and sometimes highly unpredictable and non-monotonic changes in the labor/leisure decision of the vendor. In other words, unscalable individual decisions produce aggregate outcomes that could not be easily modeled within the canonical neoclassical framework.

There is a complicating factor which limits this computer model's accuracy with regard to a producer's labor/leisure choice.
It would not be an issue if choosing between, say, hot dogs and hamburgers, but with Labor/Leisure, total output is limited by the number of hours in the day. The addition of this constraint yields a maximization problem that is not generally closed-form and convex. This can only amplify the behaviors we examine. We stand by our labor/leisure narrative because it elegantly illustrates the essential parts of our message. But for simplicity of optimization our math ignores the constraint imposed by a 24-hour day.
Figure 3:
The four models in Figure 4 are Cobb-Douglas producer networks. The so-called hidden layer has five producers in models two and three and ten producers in models one and four. Hidden layer members are either intermediate good producers or leisure good producers. These goods are inputs to the network's final output producer. All hidden layer production functions take the form Y_i = A_i x^{\alpha_i}, where x is the quantity of the sole raw material required to produce intermediate and leisure goods in this economy, Y_i is output, and A_i is the technology parameter. The network's final producer outputs Y_l = A_l \prod_{k=1}^{K} x_k^{a_k}, where l \in \{Intermediate, Leisure\} and k is the index of the kth labor or leisure good (as applicable).

Figure 4 illustrates four examples of the hot dog vendor network. Relatively small changes to the producers' production functions alter the output ratio considerably, and in highly unpredictable ways. Two things are worth noticing. First, the hot dog vendor behaves almost erratically as the input price changes. The ratio of Labor to Leisure takes very different values in a relatively small input range. Second, small changes in the Cobb-Douglas parameters yield substantial changes in the shape of the graphs. This again underlines the difficulty of predicting output variation caused by production changes in the hot dog technology.

Of course the choices of a single hot dog vendor are not enough to move the needle on national unemployment or GDP figures. But the capacity of this type of network to generate and transmit these erratic behaviors only increases with the network's complexity. When Labor and Leisure choices are writ large, unpredictable behavior can be catastrophic.
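The production side of this network can be sketched directly from the functional forms in the Figure 3 caption. The parameter values here are illustrative placeholders rather than the Table 2 values, and we assume for simplicity that the final producers consume the hidden-layer outputs directly, abstracting from the price-feedback mechanism that generates the nonmonotonic behavior in Figure 4:

```python
import math

# Hypothetical parameters: two intermediate and two leisure producers.
A_hidden = [1.5, 0.4, 2.0, 0.8]      # technology parameters A_i
alpha_hidden = [0.6, 0.9, 0.5, 0.7]  # exponents alpha_i
A_out = {"Intermediate": 1.2, "Leisure": 0.9}
a_out = {"Intermediate": [0.3, 0.2], "Leisure": [0.25, 0.4]}

def hidden_outputs(x):
    """Hidden layer: Y_i = A_i * x**alpha_i for the sole raw material x."""
    return [A * x**a for A, a in zip(A_hidden, alpha_hidden)]

def final_output(l, inputs):
    """Final producer of type l: Y_l = A_l * prod_k inputs[k]**a_lk."""
    return A_out[l] * math.prod(q**a for q, a in zip(inputs, a_out[l]))

def labor_leisure_ratio(x):
    Y = hidden_outputs(x)
    labor = final_output("Intermediate", Y[:2])   # first two goods feed labor
    leisure = final_output("Leisure", Y[2:])      # last two feed leisure
    return labor / leisure
```

This direct composition is smooth in x; the erratic regime changes shown in Figure 4 emerge only once each producer's quantity choice responds to prices through profit maximization.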
Panel titles of Figure 4:
1) Improved RTS for an intermediate good: \alpha^{Int} increases from 0.675 to 0.875
2) Leisure technology improves: all leisure good tech parameters double
3) Improved RTS for final producer: multiply \alpha^{Out}_{lk} by 2 then by 1.5 for all l, k
4) Improved RTS for all Leisure goods: \alpha^{Leis}_k doubled for all k
Figure 4:
The simple single-hidden-layer network architecture (inspired by Rosenblatt's perceptron) used in these functions is described in Figure 3. See Table 2 for initial parameters. These hypothetical (but feasible) labor/leisure ratio functions are nonmonotonic, even chaotic. As the log price of the single raw material increases, the functions change regimes sharply and unpredictably. An observer without a network-based economic model must describe such regime changes as exogenous shocks. An additional source of such shocks is also illustrated here: adjustments to the initial model parameters. These can cause unexpected and counterintuitive changes to the labor/leisure ratio function both in level and in shape.
The dashed line denotes the initial parameters, the dotted line (where present) denotes intermediate parameters, and the solid line denotes the final parameters.
Table 2:
Parameters for Figure 4

Model    Type      A_i^{Int/Leis}  alpha_i^{Int/Leis}  A_l^{out}  alpha_lk^{out}
Model 1  Intermed  9               0.675               1.684      0.091
         "         0.022           0.954               "          0.091
         "         0.389           0.964               "          0.13
         "         5,066           0.265               "          0.091
         "         2.2             0.53                "          0.01
         Leisure   11              0.53                1.5        0.01
         "         0.083           0.974               "          0.083
         "         2.399           0.964               "          0.091
         "         85.516          0.742               "          0.001
         "         22              0.53                "          0.142
Model 2  Intermed  0.002           0.909               1.6        0.375
         "         50              0.95                "          0.01
         "         0.383           0.98                "          0.057
         Leisure   0.406           0.968               1.5        0.091
         "         0.002           0.98                "          0.091
Model 3  Intermed  0.005           0.909               847.277    0.111
         "         41.389          0.909               "          0.067
         "         41.389          0.909               "          0.067
         Leisure   0.439           0.909               0.1        0.111
         "         34,600,299      0.909               "          0.067
Model 4  Intermed  47.275          0.476               6.009      0.01
         "         41.389          0.455               "          0.01
         "         2,105           0.417               "          0.01
         "         0.003           0.49                "          0.048
         "         0.541           0.484               "          0.038
         Leisure   34,600,299      0.455               1.5        0.01
         "         2.707           0.455               "          0.02
         "         0.002           0.455               "          0.231
         "         0.383           0.49                "          0.029
         "         5,530           0.476               "          0.005
The networks' initial parameters are described as follows: A_i^{Int/Leis} refers to the Cobb-Douglas technology parameter of the ith intermediate or ith leisure good; \alpha_i^{Int/Leis} refers to the CD exponent on the sole (raw material) input of the ith intermediate or ith leisure good; A_l^{out} refers to the technology parameters of the output producer; and \alpha_{lk}^{out} refers to the output producer's exponent on the kth input of type l, where l \in \{Intermediate, Leisure\}.

Mathematical Relationship Between Cobb-Douglas Producers and Artificial Neurons
Islandia and the hot dog vendor behave like Artificial Neural Networks for two reasons. First, the microeconomic agents are mathematically analogous to artificial neurons, the fundamental components of ANNs. Second, the microeconomic agents interact with each other in a way that is mathematically analogous to the interactions between neurons in an ANN. To demonstrate this equivalence, we will begin by modelling an ANN neuron.

ANN neurons combine weighted input signals, which are the outputs of other neurons, in a nonpolynomial fashion. Theoretically, the inputs can be weighted in any fashion (exponentially, polynomially, etc.), but in practice what we call a Standard Artificial Neuron (SAN) takes the form of Equation 4 (Minsky et al., 2017):

    P_i = a(b_i + \omega_i \cdot p_k),    (4)

where a(x) is an "activation function"; p_k is a vector of output values from the previous layer k; \omega_i is a vector of "synaptic weights" or coefficients belonging to neuron i; b_i is a "bias" belonging to neuron i (often in practice included as the first entry in \omega_i while p_k is padded with a leading entry equal to 1); and P_i is the output of neuron i. P_i is destined to be one of the entries in the vector p_{k+1} to be used in the next layer.

The activation function, a(x), can take many forms, with different functions having different properties, but as long as the activation function is nonpolynomial an ANN with linear weights will function to some degree (Cybenko, 1989; Ritter and Sussner, 1996).

Due to a number of desirable properties, one commonly used activation function is the "Softplus" function expressed in Equation 5 (Glorot et al., 2011):

    \sigma(x) = \ln(1 + e^x).    (5)

We will modify this equation slightly:

    a(x) = -\sigma(x) = \ln\left(\frac{1}{1 + e^x}\right).    (6)

This will have the exact same properties as the Softplus, but it will reflect the output around the x-axis. This inversion has no effect on the network because the following layer's neurons can neutralize it by inverting the sign of their weight vectors. The ANN neuron specification using the inverted Softplus is equivalent (with the constraints we discuss below) to a profit maximizing, price-taking Cobb-Douglas producer facing an unknown inverse demand of the form:

    \rho(Y_i) = \frac{1}{1 + Y_i},    (7)

where Y_i is the fully differentiated but substitutable output of producer i and \rho(Y_i) yields the price of each unit of Y_i. (The inverse demand function is unknown to the producer in this model for two reasons: first, it is a reasonable economic assumption that a price-taking producer's price prediction is imperfect and second, it makes the math more tractable. If we instead use the strong assumption that the function is known to the producer, as is often done in simplified economic models, a less elegant activation function is implied and no closed-form profit maximization solution exists for this particular inverse demand function. This strong assumption would not affect NN equivalence because the price of the producer's output would still be a nonpolynomial function of the prices of her inputs. Therefore, our analysis still holds because the ENN is still functionally complete – that is, any final output price function can be approximated to an arbitrary degree of precision with correctly chosen parameters and a large enough network.) This inverse demand function is not mathematically necessary for economic agents to form an ENN, but it has reasonable theoretical properties: it exhibits a fixed maximum price, is decreasing in Y, and approaches zero asymptotically. Furthermore, it is exactly the function needed to make plain the equivalence between standard economic agents and Softplus-activated neurons, hence our reason for choosing it.

Now we examine the Cobb-Douglas producer-cum-neuron in the ENN. Let m be an index corresponding to a neuron in the previous layer; A_i be the "technology parameter"; x_{mi} be the input quantity demanded by producer i for the mth good; \alpha_{mi} be the exponent in i's Cobb-Douglas function parameterizing the mth good; P_i be the price that producer i expects to get for each unit of her output at the time of her production decision; and p_m be the price per unit of input m.

The Decreasing Returns to Scale (DRS) Cobb-Douglas production function is:

    Y_i = A_i \prod_{m=1}^{n} x_{mi}^{\alpha_{mi}},    (8)

where x_{mi} > 0, 0 < \alpha_{mi} < 1, \Phi_i < 1, with \Phi_i \equiv \sum_{m=1}^{n} \alpha_{mi} and \eta_i \equiv \Phi_i - \alpha_{1i}. The profit function is:

    \pi_i = P_i Y_i - \sum_{m=1}^{n} p_m x_{mi}.    (9)

Maximizing profit, the first order conditions give us equations 10 and 11:

    x_{1,i} = \left[ (P_i A_i)^{-1} \left(\frac{p_1}{\alpha_{1i}}\right)^{1 - \eta_i} \prod_{m=2}^{n} \left(\frac{p_m}{\alpha_{mi}}\right)^{\alpha_{mi}} \right]^{\frac{1}{\Phi_i - 1}},    (10)

    x_{mi} = \frac{x_{1i}\, p_1\, \alpha_{mi}}{p_m\, \alpha_{1i}}.    (11)

The ENN model requires output in terms of prices alone. Inserting equations 10 and 11 into equation 8 yields equation 12:

    Y_i = \left[ P_i^{-\Phi_i} A_i^{-1} \prod_{m=1}^{n} \left(\frac{p_m}{\alpha_{mi}}\right)^{\alpha_{mi}} \right]^{\frac{1}{\Phi_i - 1}}.    (12)

Taking logs gives equation 13:

    \ln Y_i = \sum_{m=1}^{n} \frac{\alpha_{mi}}{\Phi_i - 1} \ln p_m + \frac{\Phi_i \ln P_i + \ln A_i + \sum_{m=1}^{n} \alpha_{mi} \ln \alpha_{mi}}{1 - \Phi_i}.    (13)

Now, let \omega_m \equiv \frac{\alpha_{mi}}{\Phi_i - 1}, let \omega be the vector of all \omega_m, let l \equiv \ln p calculated elementwise (that is, the vector of log input prices), let z_i = \frac{\Phi_i \ln P_i + \ln A_i + \sum_{m=1}^{n} \alpha_{mi} \ln \alpha_{mi}}{1 - \Phi_i}, and let the log price of output be L_i \equiv \ln \rho(Y_i) = \ln\left(\frac{1}{1 + Y_i}\right) by (7). Then \ln Y_i = l_k \cdot \omega_k + z_i and a(\ln Y_i) = L_i by (6), and the price of the good output by a Cobb-Douglas producer can be modelled in logs with equation 14:

    L_i = a(l_k \cdot \omega_k + z_i).    (14)

This is the standard ANN neuron as expressed in (4).
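The equivalence can be checked numerically: computing the log output price via the producer's closed-form solution (12) and via the neuron form (14) should give identical values. A verification sketch with arbitrary illustrative parameters (the function names and sample values are our own):

```python
import math

def neuron_log_price(log_p, alpha, A, P):
    """Equation (14): L_i = a(l . w + z_i) with the inverted Softplus."""
    Phi = sum(alpha)
    w = [a_m / (Phi - 1) for a_m in alpha]                  # w_m = alpha_m/(Phi-1)
    z = (Phi * math.log(P) + math.log(A)
         + sum(a_m * math.log(a_m) for a_m in alpha)) / (1 - Phi)
    lnY = sum(wi * li for wi, li in zip(w, log_p)) + z
    return math.log(1.0 / (1.0 + math.exp(lnY)))            # a(ln Y_i)

def direct_log_price(p, alpha, A, P):
    """Equation (12) for Y_i, then the log of rho(Y_i) = 1/(1 + Y_i)."""
    Phi = sum(alpha)
    inner = P**(-Phi) / A * math.prod((pm / am)**am for pm, am in zip(p, alpha))
    Y = inner**(1.0 / (Phi - 1))
    return math.log(1.0 / (1.0 + Y))
```

For any DRS parameterization (each \alpha_m > 0, \Phi < 1) the two routes agree to machine precision, which is exactly the content of equations (12) through (14).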
In the ENN context, the market prices of the producers' inputs are equivalent to neuronal input values, and the price of the producer's output is equivalent to the neuronal output value. Although the mathematical model we show specifically invokes the Cobb-Douglas producer, the equivalence between an economic agent (whether goods producer or labor-producing consumer) and an ANN neuron is general and robust to production, utility, and inverse demand function specifications. As long as the price of an output good cannot be expressed as a linear combination of the prices of input goods, the production of that output good within a network of similar production activities creates a functionally complete network.

The hot dog vendor's final neuron, interpreted as the ratio of labor output to leisure output, gave the hot dog network an important property that Islandia did not have: functional completeness. As we mentioned above, the Cobb-Douglas has constraints that do not bind the ANN neuron. These constraints are the source of some of the learning limitations we discussed in Islandia's economy. The exponents of the production function, $\alpha_{mi}$, must be greater than zero, and the function we specified must be DRS; that is, $\Phi_i$ must be less than one (DRS is required so that the profit maximization process we use is valid). Therefore, the weights $\omega_{mi}$ are strictly less than zero. This limits the ENN's learning potential compared to an ANN, which can have both positive and negative weights. As our computer simulation shows, this constraint does not fully remove the ENN's ability to learn. Nor is this constraint evident in real-world production functions. It is an artifact of the Cobb-Douglas, which we have chosen for its prominence as a theoretical standard rather than for empirical reasons.

Our choice of the suboptimal Cobb-Douglas specification serves as both a robustness test of ENN behavior and a nod to canonical economic theory. We use it exclusively in our simulations for those reasons.
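The sign restriction can be verified mechanically: for any admissible exponents ($\alpha_{mi} > 0$, $\Phi_i < 1$), the implied weights $\omega_{mi} = \alpha_{mi}/(\Phi_i - 1)$ are strictly negative. A quick sketch over randomly drawn, hypothetical producers:

```python
import random

random.seed(0)
# Draw many hypothetical DRS Cobb-Douglas producers and check their weights.
for _ in range(1000):
    n = random.randint(2, 10)                      # number of inputs
    raw = [random.uniform(0.01, 1.0) for _ in range(n)]
    scale = random.uniform(0.1, 0.99) / sum(raw)   # rescale so that Phi < 1
    alpha = [a * scale for a in raw]               # all alpha_mi > 0
    Phi = sum(alpha)
    omega = [a / (Phi - 1) for a in alpha]         # weights from equation (14)
    assert 0 < Phi < 1
    assert all(w < 0 for w in omega)               # strictly negative weights
print("all sampled producers have strictly negative weights")
```

The denominator $\Phi_i - 1$ is negative under DRS while every numerator $\alpha_{mi}$ is positive, so no draw can escape the restriction; this is the half of weight space the CDPN cannot reach.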
A more realistic ENN model, however, would include heterogeneous household utility functions and firm production functions that inherently overcome the limitations imposed by homogeneous Cobb-Douglas producers. To demonstrate how easily a practical ENN specification can emulate the full range of ANN behavior, we consider the NAND gate.

The NAND gate is functionally complete, and since a simple single-hidden-layer NN can emulate the NAND gate, Neural Networks are also functionally complete by extension. This means that all possible logical functions can be approximated to an arbitrary degree of precision by an NN (AbuMostafa, 1986; Leshno et al., 1993; Ritter and Sussner, 1996). This argument is most intuitive when using neurons with Heaviside step activation functions, but it is well understood in the field that functional completeness extends naturally to Standard Artificial Neurons with differentiable activation functions as specified above. The SAN does not output two discrete values, but its values can be interpreted as probabilities with an actionable probability threshold. If a neuron outputs a value greater than or equal to the threshold, it can be interpreted as the binary 1 (or electronic HIGH, or logical TRUE); if below the threshold, it can be interpreted as the binary 0 (or electronic LOW, or logical FALSE).

NAND Truth Table
Input 1  Input 2  Output
0        0        1
0        1        1
1        0        1
1        1        0

NAND gates can be physical objects implemented in electronic circuits or logical operators that input two Boolean variables.

Because the synaptic weights of the Cobb-Douglas Producer Neuron (CDPN) in equation (14) are strictly negative, it cannot form a NAND gate. Neither can it form a NOT gate, which could be used in series with an AND or OR gate to build a functionally complete network. In order to establish the feasibility of a functionally complete ENN, we will specify parameters of a CDPN to form an AND gate and then specify a Cobb-Douglas-like production function and parameters to form a NOT gate. The NOT gate can then be joined in series with the AND to form the functionally complete NAND gate.

One example set of parameters with which a two-input CDPN can emulate an AND gate is as follows (here using a sigmoid activation function, $a(x) = \ln\left(\frac{1}{1+e^{x}}\right)$): $\alpha_1 = \alpha_2 = \ldots$, $P_i = 1$, $A = e^{\ldots}$, so that $\omega_1 = \omega_2 = -\ldots$ and $z = 30$. Let this CDPN be called producer (or agent, or neuron) X. Let the producer emulating the NOT gate be called the Ψ neuron. Let Ψ have the capability of producing one of two products, $Y_3$ or $Y_4$, each using Cobb-Douglas functions with a single input. But due to the production technology employed, Ψ cannot produce both $Y_3$ and $Y_4$. Optimal output of each product is calculated by Ψ, and she then chooses to produce the product with the greater profit at that optimal level, thereby forgoing any production of the less profitable product. This production function satisfies the following expressions:

$$(Y_3 = 0) \lor (Y_4 = 0) \quad (15)$$

and

$$\left(Y_j = A_j x_j^{\alpha_j}\right) \lor \left(Y_j = 0\right) \quad \text{for } j = 3, 4. \quad (16)$$

Using the same process that transformed (8) into (14) above, (16) can be re-expressed in terms of output prices, in logs:

$$\left(L_j = a(\omega_j \ln p_j + z_j)\right) \lor \left(L_j = 0\right) \quad \text{for } j = 3, 4, \quad (17)$$

where $L_j \equiv \ln \rho(Y_j) = a(\ln Y_j)$. Producer Ψ, as expressed in (17), has two outputs, $L_3$ and $L_4$, and two inputs, $p_3$ and $p_4$.
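The construction can be sketched in code: one thresholded neuron emulates AND, and a two-product producer that switches to whichever product is more profitable, as in (15)–(17), emulates NOT; composed in series they reproduce the NAND truth table. Everything below (the logistic AND stage, the weights, the thresholds, and the profit numbers) is an illustrative stand-in and not the paper's parameterization of X and Ψ:

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

THRESH = 0.5  # outputs >= THRESH read as logical 1, below as logical 0

def and_neuron(a, b):
    # Stand-in weights and bias for an AND gate: the neuron fires only
    # when both logical inputs are 1.
    return sigmoid(20 * a + 20 * b - 30) >= THRESH

def not_producer(a):
    # A producer choosing between products Y3 and Y4, cf. (15)-(16):
    # when the price of its input for Y3 is high, producing Y3 is
    # unprofitable, so it switches to Y4; more Y4 output then drives
    # Y4's price down. All profit figures are stand-in numbers.
    profit_Y3 = 1.0 - a          # high input price -> low profit from Y3
    profit_Y4 = 0.5              # constant alternative (p4 held fixed)
    Y4 = 0.0 if profit_Y3 > profit_Y4 else 2.0
    price_Y4 = 1 / (1 + Y4)      # inverse demand, equation (7)
    return price_Y4 >= THRESH    # high price of Y4 reads as logical 1

def nand(a, b):
    # Output of the AND stage feeds the switching producer's input price.
    return not_producer(1.0 if and_neuron(a, b) else 0.0)

for a in (0, 1):
    for b in (0, 1):
        assert nand(a, b) == (not (a and b))
print("composed AND + switching NOT reproduces the NAND truth table")
```

The switching step is what supplies the sign change that strictly negative CDPN weights cannot: an increase in an input price induces an increase in the price of the untouched product, which is exactly the NOT behavior the NAND construction needs.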
If input $p_4$ is constant (or varies within a range, or has less volatility than $p_3$), the parameters of Ψ can be set so that its output $L_4$ and input $p_3$ form a NOT gate. One set of parameters with which Ψ can emulate a NOT gate (when $\ln p_4 \approx$
$0$) is as follows: $P_3, P_4 = 1$, $\alpha_3 = \ldots$, $A_3 = 3$, $\alpha_4 = 1 - e^{-\ldots}$, $A_4 = \ldots$

Combining agents X and Ψ with the parameters listed above into a simple two-neuron network yields a functioning NAND gate. Let the output of X be $L_X \equiv \ln p_3$; that is, the output product of X is the first input of Ψ. The output of this network is shown below in truth-table format. Since the NAND gate is functionally complete, any ENN which incorporates producers who can shift production between two or more products may also be functionally complete. Of course, other conceivable production functions can yield the same results. The only requirement is that some of the producers in the network experience, within some range of conditions, circumstances in which an increase in the price of one or more of their inputs induces an increase in the output of one or more of their products.

NAND gate ENN (with agents X and Ψ)

$\ln p_1$           $\ln p_2$           $\ln p_3$                      $\ln p_4$          $L_4$
(logical input 1)   (logical input 2)   $(\ln p_1 \land \ln p_2)$      (assumed const.)   $\lnot(\ln p_1 \land \ln p_2)$

X's inputs are priced $p_1$ and $p_2$ and its output is priced $p_3$; Ψ's inputs are priced $p_3$ and $p_4$, and its outputs are priced, in logs, $L_3$ (which is unused here) and $L_4$.

Our macroeconomic understanding, until now, has been rooted in smooth functions of aggregations of microeconomic agents. The possibility of emergent economic phenomena has been implicitly recognized for some time, but our assumptions may have inadvertently suppressed their study by offering no mechanism for emergent behaviors to arise. We have analyzed two critical examples of this. The endogeneity of apparently exogenous shocks can easily be masked by the complexity of the ENN's network interactions. Furthermore, because an ENN is capable of learning en masse, economic structures (e.g., traditions, norms, or institutions) may evolve which provide benefit to an aggregation without apparently providing utility to the individual agents involved.
Further study of these interactions may yield novel insights into market failures like the Tragedy of the Commons and unfavorable game-theoretic equilibria.

Our studies of the ENN model are just beginning. Perhaps the most promising implication of the ENN model is its use in government and NGO policy formation. The ENN's learning behavior implies that policies might employ active training techniques to effect change. We cannot say with confidence that this is possible – but the prospect is tantalizing enough to motivate further study. If it is possible to identify a training 'handle' – by which we mean an input through which to deliver rewards and punishments – and a low-latency data source to observe an economic response to the varying economic landscape, then almost any desired economic reform could be made.

For example, imagine an NGO targets inner-city Detroit for economic redevelopment. The NGO recruits a representative sample of Detroit businesses to submit weekly sales figures. These are the data source. The NGO also identifies a training handle (say, a weekly cash subsidy given to a sample of local households). When the sales figures are below a threshold, a punishment is delivered through the handle (perhaps the subsidy is lower than expected), and when the sales figures are higher, the reward is delivered (likely the inverse of the punishment). The mathematical structure of the ENN as well as our Islandia computer model imply the Detroit businesses will almost certainly increase their sales if they are connected downstream of the households. But the reality of the Detroit economy is not simple. The architecture of the ENN is not feed-forward, like Islandia's, but is instead recurrent. Furthermore, identifying the training handle – indeed, identifying whether or not such a training handle can even exist in a recurrent ENN – is not trivial. Hence the need for further study.

We do not assert the ENN model supplants the canonical smooth models in any way.
These two models are complementary and coexist without conflict. Here we appeal to the mathematical analogy between biological brains and ANNs, which is quite robust. Medical science has learned much about the biological brain, has employed powerful medical imaging, and has used effective surgical and pharmacological techniques based solely on aggregate approaches. Virtually no medical technology currently relies on the precise identification of the synaptic weights of individual neurons. Aggregate measures for diagnosis and prediction – in both the biological brain and the ENN – will likely always be superior. Measurement precision and computing power are finite. But the complexity of a Neural Network is vast, and perfect precision – an impossibility – would be required to accurately predict its output. A simplified model cannot predict the output of a Neural Network. Therefore, models of the ENN can only illuminate certain aspects of its behavior. These are topics for further research.
References
AbuMostafa, Yaser S. (1986) "Neural networks for computing?" AIP Conference Proceedings, 151 (1), 1–6.

Acemoglu, Daron, Vasco M. Carvalho, Asuman Ozdaglar, and Alireza Tahbaz-Salehi (2012) "The Network Origins of Aggregate Fluctuations," Econometrica, 80 (5), 1977–2016, 10.3982/ecta9623.

Acemoglu, Daron, Asuman Ozdaglar, and Alireza Tahbaz-Salehi (2015) "Networks, Shocks, and Systemic Risk," Technical report, 10.3386/w20931.

Carvalho, Vasco M. (2014) "From Micro to Macro via Production Networks," Journal of Economic Perspectives, 28 (4), 23–48, 10.1257/jep.28.4.23.

Cybenko, G. (1989) "Approximation by superpositions of a sigmoidal function," Mathematics of Control, Signals, and Systems, 2 (4), 303–314, 10.1007/bf02551274.

Glorot, Xavier, Antoine Bordes, and Yoshua Bengio (2011) "Deep Sparse Rectifier Neural Networks," in Gordon, Geoffrey, David Dunson, and Miroslav Dudík eds. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 15 of Proceedings of Machine Learning Research, 315–323, Fort Lauderdale, FL, USA: PMLR, 11–13 Apr, http://proceedings.mlr.press/v15/glorot11a.html.

Goyal, S. (2015) "Networks in Economics: A Perspective on the Literature," 10.17863/CAM.5774.

Leshno, Moshe, Vladimir Ya. Lin, Allan Pinkus, and Shimon Schocken (1993) "Multilayer feedforward networks with a nonpolynomial activation function can approximate any function," Neural Networks, 6 (6), 861–867, 10.1016/s0893-6080(05)80131-5.

Minsky, Marvin, Seymour Papert, and Léon Bottou (2017) Perceptrons: An Introduction to Computational Geometry: The MIT Press.

Misra, Janardan and Indranil Saha (2010) "Artificial neural networks in hardware: A survey of two decades of progress," Neurocomputing, 74 (1-3), 239–255, 10.1016/j.neucom.2010.03.021.

Ritter, G.X. and P. Sussner (1996) "An introduction to morphological neural networks," in Proceedings of 13th International Conference on Pattern Recognition: IEEE, 10.1109/icpr.1996.547657.

Rosenblatt, Frank (1961) Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms: Defense Technical Information Center.

Ruder, Sebastian (2016) "An overview of gradient descent optimization algorithms."

Svozil, Daniel, Vladimír Kvasnicka, and Jirí Pospichal (1997) "Introduction to multi-layer feed-forward neural networks,"