Computational Models of Human Decision-Making with Application to the Internet of Everything
Setareh Maghsudi and Max Davy

S. Maghsudi is with the Department of Electrical Engineering and Computer Science, Technical University of Berlin, 10623 Berlin, Germany (email: [email protected]). M. Davy is with the School of Mathematics and Statistics, University of Sydney, Camperdown NSW 2006, Australia (email: [email protected]).
Abstract — The concept of the Internet of Things (IoT) first appeared a few decades ago. Today, driven by ubiquitous wireless connectivity, the boost of machine learning and artificial intelligence, and the advances in big data analytics, it is safe to say that IoT has evolved into a new concept called the Internet of Everything (IoE) or the Internet of All. IoE has four pillars: things, humans, data, and processes, which render it an inhomogeneous large-scale network. A crucial challenge of such a network is to develop management, analysis, and optimization policies that, besides utility-maximizing machines, also take irrational humans into account. We discuss several networking applications in which appropriate modeling of human decision-making is vital. We then provide a brief review of computational models of human decision-making. Based on one such model, we develop a solution for a task offloading problem in fog computing, and we analyze the implications of including humans in the loop.
Keywords: Cognitive hierarchy, Decision-making, Human agent, IoE, Prospect theory, Social preference
I. Introduction
Integrating the human element in digital technology is a crucial aspect of digitalization that blurs the boundaries between the cyber and physical worlds. Many emerging application domains of IoE, such as intelligent transportation systems and autonomous driving, include humans in addition to machines. One illustration of the potential of the human-in-the-loop rests on the following assumption: given the wide dissemination of smartphones, owing to their vast functionality and wide price range, near-future human-in-the-loop systems will be heavily based on smartphone technology. Consequently, their powerful computation and sensing capabilities can be utilized to address networking challenges such as scarcity of resources.

While combining humans and machines to achieve sustainable resource allocation and efficient service provision in networks is promising, integrating humans and machines in a single network is challenging for the following reasons: (i) humans make mistakes, often due to inaccurate beliefs and imprecise predictions; (ii) humans often act irrationally and based on heuristics; (iii) humans think and act in different manners as a result of their unique backgrounds, including personality and experiences. These characteristics stand in contrast to the most common assumptions of rational decision-making and strategic behavior. As a result, human behavior cannot be formalized using conventional models such as simple concave utility functions, and new models need to be developed. For example, the well-being of humans depends strongly on their utility compared to that of others, rather than on the absolute magnitude of the utility on its own. Another example is the herding phenomenon observed in humans, which describes how individuals in a group can act collectively without centralized direction. A further example is information-avoidance behavior, where, by consciously ignoring the available information, individuals perform some tasks that oppose their values [1]. For the analysis, management, and optimization of networks with both humans and machines, appropriate modeling of this irrational human behavior is crucial. Moreover, the presence of the human often harms the reliability of the data based on which the network is controlled. Often distorted by norm-related conflicts, information externalities, and emotions, self-reported human data is severely biased. Therefore, learning or predicting the human's intent is highly complicated.

In Section II, we show that considering human-introduced characteristics in network management by using appropriate models is needed, and we review some research papers that take the human into account when addressing the optimization problem. In Section III, we provide a brief survey of models for human behavior. In Section IV, we describe some important issues in developing models. In Section V, we provide an exemplary application of such modeling in edge computing. Section VI concludes the paper.
II. State-of-the-Art and Applications
In this section, we first briefly review some relevant literature. Afterward, we describe some applications of agent-based modeling of human behavior.
A. State-of-the-Art
Human-in-the-loop systems can be discussed from several perspectives. Article [2] is a forward-looking survey on human-in-the-loop control in cyber-physical systems, with an emphasis on robotics and human-machine interaction. A theoretical framework for shared control between humans and autonomous agents, based on nonlinear optimization, is proposed in [3]. Architectural issues are also a concern in designing socially intelligent systems. For example, [4] is an attempt to include wireless sensor networks in the human-in-the-loop cyber-physical system. In this work, the network is designed to support user-driven applications through peer-to-peer in-network queries between resource-constrained devices. Another important topic, which is the focus of this paper, is formally modeling human behavior, which finds application in a wide range of scenarios. In [5], the authors propose an algorithm that maintains high levels of cooperation in a peer-to-peer network consisting of entities that perform the collective task of file sharing. The algorithm is adapted from tag models of cooperation in social science, which do not rely on explicit reciprocity, reputation, or trust mechanisms. Moreover, it does not require any central controller. In [6], the authors present an optimization-based method for approximating stochastic reachable sets, for informative prediction in human-in-the-loop systems. Reference [7] proposes a human-in-the-loop control mechanism based on human data analysis, which reduces energy waste by detecting distractions. In [8], the authors consider the repeated task management problem. To include the human in the system, they utilize a model based on prospect theory. Reference [9] investigates the crowdsourcing of multimedia files, where the cognitive hierarchy concept is applied to formalize the decision-making of human agents in saving and sharing files. Finally, in [10], the authors consider the cyber security problem by formulating the attacker-defender interactions as an evolutionary game. They model the behavior of real-world players using quantal best-response dynamics. In Table I, we summarize some of the research papers that involve human agents and use specific computational models for human decision-making.

TABLE I: Some Research Works with Human-in-the-Loop

| Reference | Challenge | Approach |
| --- | --- | --- |
| [5] | File sharing | Modeling the human based on tag models of cooperation |
| [6] | Informative prediction of human action | Optimization |
| [7] | Energy efficiency | Modeling the human based on data analysis |
| [8] | Repeated task allocation | Modeling the human utilizing prospect theory |
| [9] | Multimedia crowdsourcing | Modeling the human by applying cognitive hierarchy |
| [10] | Cyber security defense | Modeling real-world players using quantal response dynamics |
B. Applications
The decision-making of humans affects network planning in a variety of applications related to the IoE, especially those involving cooperation or competition. In the following, we provide a few examples.
Fog Computing:
IoE necessitates energy-efficient and low-latency computation, which is impossible to achieve if the central cloud is the sole option for computation offloading. To address this challenge, fog computing employs smart devices located in the proximity of users for intermediate computation and storage of small- or medium-scale tasks, while the computationally expensive tasks are still forwarded to the cloud. Given the rapidly increasing number of smart handheld devices, motivating users to participate in fog computation using their idle devices would result in a significant reduction in deployment, transmission, and computation cost. Moreover, it is possible to persuade users to select fog nodes over the central cloud, or vice versa. Therefore, to realize the full potential of fog computing, it is essential to develop appropriate models for human decision-making.
Device-to-Device (D2D) Caching:
The mobile data traffic caused by on-demand multimedia transmission exhibits the asynchronous content reuse property; that is, the users request a few popular files at different times, but at relatively small time intervals. Wireless caching takes advantage of this property: instead of being fetched from the core network frequently in small time intervals, the popular files are stored and re-transmitted whenever requested by new users. Among different wireless caching strategies, D2D caching involves user devices that build a virtual library by storing different files privately and provide each other with the files on demand. Therefore, similar to fog computing, it is essential to provide human users with incentives to collect and distribute multimedia content.
Opportunistic Cooperative Communications:
The basic idea of cooperative communications is that multiple relays cooperate to help a source forward its message to the destination, thereby enhancing the throughput and extending the coverage. In such a scheme, the relays are not pre-determined and pre-deployed. Instead, nearby devices simply act as relays to forward the data. Naturally, acting as a relay necessitates an expenditure of radio resources such as spectrum and power. Therefore, unless a user receives an attractive pay-off, it would not allow its device to cooperate in transmission. Developing suitable reimbursement schemes, however, requires modeling the decision-making behavior of the human.
Wireless Energy Transfer:
Wireless energy transfer between devices is a promising idea for charging wireless sensor networks, small mobile equipment, and on-body medical devices. As the energy waste during wireless transfer grows with increasing distance, it pairs well with energy harvesting. While each device obtains part of its required energy through energy harvesting, the rest can be acquired from nearby devices that do not urgently need energy. Because there is a high density of human-driven devices in many geographical areas, they are considered suitable candidates to participate in such energy transfers. However, reimbursement is required to justify such a transfer. To set prices, the network manager and automated participants need to model the decision-makers optimally to allocate their budget efficiently and maximize their revenue or minimize their costs.

The performance of every technological network (such as a wireless network) depends on several, often conflicting, variables. Therefore, given the system's characteristics, the network manager designs a specific optimization objective function for each particular application. The optimization sometimes involves model development as well (e.g., using game theory). The model development strongly relies on the problem under investigation, while it is essential to also consider general performance metrics such as complexity and stability. In this regard, human-in-the-loop IoT applications are similar to their counterparts that do not involve human intervention. The difference often arises in formalizing the model to take humans' irrational behavior into account. Some scenarios, such as fog computing, wireless energy transfer, and cooperative routing, resemble a market that provides commodities and services. Consequently, the agents require incentives such as monetary compensation. Therefore, to improve efficiency, it is essential to acquire a good understanding of the utility and the behavior of the involved human decision-makers, for example, by suitably adapting the utility functions; otherwise, appropriate mechanism design is implausible. Some other scenarios, such as D2D caching or autonomous driving, require coordination. In such cases, it is crucial to predict the actions of human decision-makers, utilizing appropriate models or possibly using their features, similarity measures, and historical actions. In summary, the unrealistic assumption that all decision-makers are fully rational is a pitfall of traditional models that shall be addressed.
III. Agent-Based Models of Human Behavior
Here we describe the most important models of human behavior. These models aim to predict human behavior in scenarios that can be modeled within a game-theoretic framework. Contrary to typical machine learning methods, they can operate with no historical data about human behavior in games identical to the current game, so long as they have access to data about similar agents' choices in similar games. This data is used to estimate the parameters of the behavioral models. The models aim at adapting classical game theory to capture real human behavior. To achieve this goal, they combine the standard results with insights from cognitive and behavioral psychology and microeconomics. These models seek to predict behavior in one-shot, simultaneous games where participants do not learn (whether due to one-off participation or because the dynamics of the game change sufficiently fast to render learning useless). Indeed, almost all of the theoretical justification from behavioral psychology for these models is based on one-off interactions, as further interactions may be complicated by the relationship between agents. This means that if the available strategies in a system are not changing over time, then, as data is accumulated from observing agents, a learning approach can be used to improve or adjust the original general model of human behavior. The usefulness of the models below, in such a static system, would then be either to 'start off' such a prediction model, or to serve as a baseline for performance evaluation.
A Brief Background in Game Theory

Normal-form games can be expressed as a matrix of payoffs for the players, dependent on the combination of choices made by each player, or, in other words, the joint action profile. Each possible action of a player is referred to as a pure strategy. A mixed strategy is then a probability distribution over pure strategies. The expected utility of a pure strategy for a player is the expectation of the payoff of that action, given the belief of the player about the opponents' joint action profile.

In classical game theory, any rational agent seeks to maximize its expected utility. Human agents, however, often act irrationally due to emotions, social norms, peer pressure, and the like. The experiments leading to the development of human models for such games usually involve showing such a normal-form depiction to experiment participants and asking them to make a choice. In this paper, we describe computational models for bounded-rational or irrational actors, who we assume are in general selfish, but, due to emotions, biases, or limitations of human cognitive ability, are not optimal decision-makers. Note that although mixed strategies play a crucial role in classical game theory, they will not appear often in these models. The probability distributions over actions that appear here reflect the varied levels of human cognition that our models seek to capture, rather than being a result of players' strategic thinking.
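As a concrete illustration of these definitions, the following minimal sketch (NumPy assumed; the payoff numbers are purely hypothetical) computes the expected utility of each pure strategy of the row player in a two-player normal-form game, given a belief about the column player's mixed strategy:

```python
import numpy as np

# Row player's payoff matrix: entry [i, j] is the payoff when the row
# player picks pure strategy i and the column player picks j.
U_row = np.array([[3.0, 0.0],
                  [1.0, 2.0]])

# Belief of the row player about the column player's (mixed) strategy.
belief = np.array([0.6, 0.4])

# Expected utility of each pure strategy of the row player.
expected_utility = U_row @ belief
print(expected_utility)  # [1.8, 1.4] -> the best response is strategy 0
```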
A. Iterative Strategic Thinking
The concept of Nash equilibrium is based on the assumption that agents reason their way to an equilibrium preemptively, before taking any actions, via fixed-point reasoning in which the agents find some equilibrium from which a unilateral deviation does not improve the utility. A natural alternative is to learn and develop towards such an equilibrium, where the agents have the opportunity to refine their strategy over time. In the absence of such an opportunity for learning, we are left with iterative strategic thinking, in which an agent reasons along the lines of 'she thinks that I think that she thinks' before making any conclusion. A natural example is a financial marketplace, in which participants price a stock by guessing how other participants will value it in the future.
1) Level-k Reasoning
Human agents participate in a limited number of iterations of strategic reasoning. A level-k agent engages in k iterations, recursively determining the behavior of level-(k-1) agents and responding to it by selecting the action that maximizes its expected utility [11]. The simplest model for the level-0 agents is uniform randomization over the available actions; more complex specifications are discussed below. The final model is a linear combination of K agent levels, parameterized by K variables.
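A minimal sketch of level-k reasoning for a two-player matrix game is given below (NumPy assumed; `level_k_strategy` is an illustrative helper and the payoffs are hypothetical). Level-0 agents randomize uniformly, and each level-k agent best-responds to a level-(k-1) opponent:

```python
import numpy as np

def level_k_strategy(k, U_self, U_opp):
    """Strategy of a level-k agent, returned as a distribution over actions.

    U_self[i, j]: payoff to this agent for (own action i, opponent action j).
    U_opp[i, j]:  payoff to the opponent for (opponent action i, own action j).
    """
    n = U_self.shape[0]
    if k == 0:
        # Level-0: uniform randomization over own actions.
        return np.full(n, 1.0 / n)
    # A level-k agent believes the opponent plays the level-(k-1) strategy.
    opp = level_k_strategy(k - 1, U_opp, U_self)
    expected = U_self @ opp        # expected utility of each own action
    strategy = np.zeros(n)
    strategy[np.argmax(expected)] = 1.0  # best response (one-hot)
    return strategy

# Illustrative symmetric 2x2 game (hypothetical payoffs).
U = np.array([[3.0, 0.0],
              [2.0, 2.0]])
for k in range(4):
    print(k, level_k_strategy(k, U, U))
```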
2) Cognitive Hierarchy
A subtle feature of the level-k model is that each agent considers all of the other agents to be exactly one level below itself in the cognitive hierarchy. Alternatively, in the cognitive hierarchy model, agents can respond to a linear combination of the strategies of all lower-level agents [11]. This can be specified in several ways. Each level can have its own belief about the proportion of lower-level agents (introducing several parameters), or these distributions can be the actual distribution of agent levels of the model, normalized appropriately. For an agent at level K, a popular way to specify the distribution of agents is to assume that the number of agents at each level k = 1, ..., K-1 follows a Poisson distribution, truncated and normalized appropriately.
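Concretely, under this Poisson specification (a common assumption in the cognitive hierarchy literature, sketched here with rate parameter τ), the population frequency of level-k agents and the belief of a level-K agent about the lower levels are

$$f(k) = \frac{e^{-\tau}\,\tau^{k}}{k!}, \qquad g_K(h) = \frac{f(h)}{\sum_{\ell=0}^{K-1} f(\ell)}, \quad h = 0, \ldots, K-1,$$

so that a single parameter τ governs the entire distribution of cognitive levels.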
3) Specifying Level-0 Behavior
The distribution of level types in the level-k or cognitive hierarchy model significantly affects the proper specification of level-0 behavior. If there is no way to capture the errors made by level-k players (see also Section III-B2), then the level-0 specification captures such errors via a simple behavior following uniform randomization, as previously described. Level-0 behavior can also capture the salient features of a particular setting or commonly observed heuristics used by strategically naive players. See [12], [13] for a summary of how modifications to level-k and cognitive hierarchy models improved models for games with asymmetric information, market entry and other coordination games, games in which salient features affect coordination efforts, and games of strategic communication.

B. Alternatives to Best-Response
In a perfect-rationality setting, best-response dynamics, which return Nash equilibria if they exist, are unrealistically deterministic in their output. Part of the heterogeneity of human responses to situations can be explained by their different cognitive abilities and methods of problem-solving, as with the level-k and cognitive hierarchy models above. Another part can be explained by people simply making mistakes, even when they are attempting to act strategically. The models below incorporate this imprecision into a noisy best-response.

1) ε-Nash

The simplest such noisy best-response function is to return a pure Nash equilibrium (if any exists) with some probability 1-ε, and with probability ε, to uniformly randomize over the remaining actions [11]. However, this approach has been shown to model individuals' behaviour poorly, mirroring the observed fact that people make systematically rather than randomly irrational decisions. We use ε-Nash in the example in Section V simply to demonstrate the effect of a small amount of noise on the behaviour of agents. If we wanted to model agents' behaviour accurately, then we would use more sophisticated models, such as the quantal best-response equilibrium (QBE), described in the next section.
2) Quantal Best-Response
Quantal best-response (QBR) returns a probability distribution over actions, satisfying two observations from behavioral psychology: (i) an agent is more likely to choose an action with greater expected value than an action with less expected value; (ii) the likelihood of making the optimal choice decreases as the importance of the decision decreases, whether because the difference between actions decreases, or because the payoffs decrease in value [14]. The most common variation is the logit-QBR function, which selects an action s_i with higher probability if, under the belief B_i, it has a higher expected utility U_i(s_i, B_i):

$$\mathrm{QBR}_i(s_i, B_i, \lambda) = \frac{\exp\big(\lambda U_i(s_i, B_i)\big)}{\sum_{s'_i \in S_i} \exp\big(\lambda U_i(s'_i, B_i)\big)}.$$

The precision parameter λ corresponds to the rationality of agents. As λ → ∞, QBR approximates best-response, and as λ → 0, QBR approaches uniform randomization. This can be used on its own, predicting the quantal best-response equilibrium via fixed-point reasoning similar to that involved in the calculation of Nash equilibria. However, this reasoning is similarly (or more) difficult than even the calculation of Nash equilibria. QBR can nonetheless be incorporated into level-k and other models, replacing the utility-maximization response used previously. Incorporating it usually improves predictive performance at the cost of introducing only one more parameter, although variants are possible, such as different precision levels for different agent types [11].
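A direct implementation of the logit-QBR function is a softmax over expected utilities; the sketch below (NumPy assumed, with illustrative utility values) uses the usual max-subtraction trick for numerical stability:

```python
import numpy as np

def logit_qbr(expected_utilities, lam):
    """Logit quantal best-response: a softmax over expected utilities.

    expected_utilities: array of U_i(s_i, B_i) for each pure strategy s_i.
    lam: precision parameter (lam -> inf: best response; lam -> 0: uniform).
    """
    z = lam * np.asarray(expected_utilities, dtype=float)
    z -= z.max()          # numerical stability; leaves the distribution unchanged
    p = np.exp(z)
    return p / p.sum()

# Illustrative expected utilities for three actions (hypothetical numbers).
u = [1.0, 2.0, 0.5]
print(logit_qbr(u, lam=0.1))   # nearly uniform randomization
print(logit_qbr(u, lam=10.0))  # nearly deterministic best response
```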
3) Noisy Introspection
An alternative way to incorporate QBR is noisy introspection, in which agents have no limit on the number of strategic iterations they can perform, but their reasoning becomes noisier as they perform higher levels of introspection, with a corresponding exponential increase in the noise parameter [15].
C. Adjustments to the Utility Function
The first group of models, iterative strategic thinking, offers alternatives to the fixed-point reasoning used in classical game theory. The second group introduces systematic models of noisiness in decision-making. The third group, described here, corresponds to changing the underlying utility function that produces the final payoffs upon which agents' decisions are based.
1) Prospect Theory
A highly popular explanation for many non-rational behaviors commonly exhibited by human decision-makers is prospect theory. It is an alternative to conventional utility maximization in decision theory, emphasizing people's biased perceptions of value and probability. As such, there are two main areas of adjustment: how the agent perceives the values of outcomes (value function), and how the agent perceives the probabilities of outcomes (weighting function).

a) Value Function

In prospect theory, there are three key changes to the way outcomes are valued. Firstly, a possible outcome of an action, rather than having a fixed 'utility' value, is seen as a 'gain' or 'loss' relative to the agent's reference state, or current state. Secondly, the base gain or loss of an outcome is adjusted by a value function, following the notion of 'diminishing returns' and 'diminishing losses' as the magnitude of the shift increases. Thirdly, the value function for gains is scaled down relative to that for losses, so an agent perceives the absolute value of a loss as larger than that of a gain, reflecting people's psychological aversion to loss.

b) Weighting Function

Events' probabilities are also transformed via a weighting function. In the presence of uncertainty, each outcome is associated with a relative probability of occurrence p_i. The weighting function maps the probability p_i of an outcome x_i to a point in [0, 1], typically overweighting small probabilities and underweighting moderate and large ones [16]. In principle, prospect theory could be combined with iterative models such as level-k. However, this would create tension between the assumptions of iterative strategic thinking and those of prospect theory. Namely, an agent in iterative strategic thinking models others using the same utility function which they themselves use, whereas prospect theory is based on psychological research suggesting that agents are unaware of the biases present in their own and others' reasoning. There is no obvious satisfactory resolution of this tension, and exploring the applicability of prospect theory to game-theoretic situations is an interesting direction for research.
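For concreteness, the sketch below implements one commonly used parametrization of the value and weighting functions, due to Tversky and Kahneman [16]; the parameter values shown (α = β = 0.88, λ = 2.25, γ = 0.61) are the estimates reported there, and the choice of this particular functional form is an illustrative assumption rather than part of the models above:

```python
import numpy as np

def value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Prospect-theoretic value of a gain/loss x relative to the reference point.

    Concave for gains, convex for losses, and steeper for losses (loss aversion).
    """
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, np.abs(x) ** alpha, -lam * np.abs(x) ** beta)

def weight(p, gamma=0.61):
    """Probability weighting: overweights small p, underweights large p."""
    p = np.asarray(p, dtype=float)
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

# A gamble losing 100 or gaining 100 with equal probability has zero expected
# value, but a negative prospect-theoretic value because of loss aversion.
outcomes, probs = np.array([100.0, -100.0]), np.array([0.5, 0.5])
print(np.sum(weight(probs) * value(outcomes)))  # < 0
```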
2) Social Preference
A large class of models seeks to explain non-rational behaviour via social preferences [17]. Such models formalize notions such as aversion to inequity and reciprocity. The formalization is then utilized to predict human behavior in situations where conventional game theory incorrectly predicts strongly selfish behavior. In these models, rather than modelling agents as noisily deviating from best-response, the utility functions of agents are given by a linear combination of their normal or base 'selfish' utility functions and terms that incorporate notions of social utility. By altering the weighting of the different components of the utility function, either selfish behaviours, or altruistic or other 'non-rational' behaviours, can be emphasised.

Many possible behaviours can then be modelled via such a utility function. One example is altruism, in which the utility function increases with global base utility. Another example is inequity aversion, in which the utility function has a term that decreases with inequity. Inequity can be measured in a number of ways, one of which is simply the difference between the largest and smallest base utilities. Another possibility is to compare the utility of each individual with the average utility of the society. We can also model 'negative' behaviours. An example is envy, modelled via a term in the utility function that decreases with the difference between an agent's base utility and that of the highest-utility agent. Notably, this does not require any complex iterative procedures. A clear difference between this method and the previous ones is that the 'noisiness' is not a result of players making mistakes, but rather of playing best-responses according to notions of what is fair or unfair for themselves or society. As such, these models capture significantly different aspects of agents' behaviour. This difference should be taken into account when modelling.
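As one concrete instance, the sketch below implements the inequity-aversion utility of Fehr and Schmidt [17], in which disadvantageous inequity (weighted by α, capturing envy) and advantageous inequity (weighted by β, capturing guilt) are subtracted from an agent's base payoff; the payoff numbers are illustrative:

```python
import numpy as np

def fehr_schmidt_utility(payoffs, i, alpha=0.8, beta=0.4):
    """Fehr-Schmidt inequity-aversion utility of agent i [17].

    payoffs: base ('selfish') payoffs of all n agents.
    alpha:   weight on disadvantageous inequity (others earn more: envy).
    beta:    weight on advantageous inequity (agent earns more: guilt).
    """
    x = np.asarray(payoffs, dtype=float)
    n = len(x)
    envy = np.maximum(x - x[i], 0.0).sum() / (n - 1)
    guilt = np.maximum(x[i] - x, 0.0).sum() / (n - 1)
    return x[i] - alpha * envy - beta * guilt

# Agent 0 earns 10 while the others earn 4 and 6 (hypothetical payoffs):
print(fehr_schmidt_utility([10.0, 4.0, 6.0], i=0))  # 10 - 0.4*(6+4)/2 = 8.0
```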
In Table II, we summarize the well-known models of human behavior together with some of their application areas in IoE.

TABLE II: Well-known Models of Human Decision-Making with IoT Application Areas

| Model | Potential Application Area | Brief Description |
| --- | --- | --- |
| Iterative strategic thinking | Autonomous driving, e-commerce, D2D caching | Making decisions based on beliefs about how others will decide, rather than the hypothetical reasoning of game theory. |
| Alternatives to best-response | Cooperative communications, fog computing, smart grid | Making decisions to optimize an objective function, such as when bidding, but suffering from mistakes. |
| Adjustments to the utility function | Sharing economy, crowdsourcing, wireless energy transfer, cooperative mining | Incorporating factors beyond rational self-interest, such as loss aversion (prospect theory) or a desire for fairness (social preferences). |
IV. Developing a Model
As observed in Section III, most of the models that formalize the human's strategic behavior include some parameters. Often, such parameters are optimally tuned by training the model on the available data. Indeed, one of the key features of a behavioral model is its data-driven methodology: rather than normatively defining mathematical features of games, it seeks to predict decision-making behavior based on past evidence of such behavior. In the following, we provide a brief explanation without going into details.

One conventional approach for training a model, i.e., for finding the parameter values, is maximum likelihood estimation (MLE). The 'likelihood' describes the plausibility of a set of parameter values of a statistical model describing a set of observations, given those observations. In an MLE model, there is a likelihood function associated with a statistical model. The function takes in the vector of parameter values of that model and a data set; the output is the likelihood of the specified parameter values given the data. More precisely, initially, a model is created using the given vector of parameter values. This model returns a probability distribution over actions. Then, for every element in the training data, it returns the probability of observing the selected action, and multiplies these probabilities together. The resulting value is the likelihood that those parameter values describe the observed data. The final parameter values for a given training set are those which maximize the likelihood function. The data required for such model training is usually collected by performing experiments; however, the usefulness of experimental data as generalizable training data is often obscured by the experiment's design, which aims to answer specific questions about applications of behavioral game theory to a particular area. Moreover, the quality of the data plays a critical role. For example, self-reported data is frequently biased and inaccurate.

Finally, it is essential to evaluate the developed human decision-making models. Here, the predictive ability of the model is of specific importance. Therefore, such a model should be tested on data separate from that on which it was trained, a concept known as cross-validation. Other important factors are low complexity, interpretability, and accuracy. Often, there is a trade-off between these factors. For example, using the likelihood as a measure of accuracy, despite being simple, lacks interpretability to some extent, as it does not give much information about the importance of different model parameters.
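As a small illustration of this procedure, the sketch below fits the precision parameter λ of a logit-QBR model by maximizing the log-likelihood of observed action choices (NumPy and SciPy assumed; the data set and expected utilities are hypothetical):

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical training data: each row gives the expected utilities of three
# actions in an observed game; `chosen` is the action the human picked.
utilities = np.array([[1.0, 2.0, 0.5],
                      [0.0, 1.0, 3.0],
                      [2.0, 2.5, 1.0]])
chosen = np.array([1, 2, 0])

def neg_log_likelihood(lam):
    """Negative log-likelihood of the observations under logit-QBR(lam)."""
    z = lam * utilities
    z -= z.max(axis=1, keepdims=True)                  # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(chosen)), chosen].sum()

# Maximize the likelihood over the precision parameter lam >= 0.
result = minimize_scalar(neg_log_likelihood, bounds=(0.0, 50.0), method="bounded")
print("fitted lambda:", result.x)
```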
V. Example Application: Computation Offloading in Fog Computing
Here we provide an example concerning fog computation offloading to show the effect of humans' irrational or imprecise behavior on the system's performance.
A. System Model and Problem Formulation
We consider a scenario with an offloading user that divides some divisible tasks among a set of computing fog nodes indexed by $m \in \mathcal{M} = \{1, 2, \ldots, M\}$. Each fog node demands a price $c_m$ to perform each unit of the task. We denote the share of each fog node $m \in \mathcal{M}$ by $r_m \in \mathbb{R}_{+}$. The user has a limited budget, so that it should satisfy $\sum_{m \in \mathcal{M}} r_m c_m \le B$, with $B > 0$. The user and the fog nodes interact through the iterative supply-demand procedure summarized in Alg. 1.

Algorithm 1: Task Offloading to Fog Nodes
  Each fog node m ∈ M announces the initial price c_m;
  repeat
    The user determines the demand r_m, m ∈ M;
    Each fog node m ∈ M adapts the price c_m;
  until convergence

The performance of the offloading user depends on several crucial factors, including (i) transmission delay and service delay; (ii) power consumption for data transmission; and (iii) the service price paid to the fog nodes. We define the utility function of the offloading user as

$$g_o(r_1, \ldots, r_M) = \sum_{m \in \mathcal{M}} \big(\alpha_m \log(\beta_m r_m) - c_m r_m\big). \tag{1}$$

In (1), the first term is the utility of the user from offloading the task. We model the utility by an increasing concave function (e.g., a logarithmic function) for the following reason: although offloading is in general beneficial for the user, the marginal utility of offloading decreases, since heavier offloading results in longer delays and higher energy consumption. The parameters $\alpha_m$ and $\beta_m$ depend on the characteristics of the fog node that affect the user's quality of experience, for example, the distance to the offloading user, the queue length, and the like. A detailed description of such dependencies is out of the scope of this paper. The second term on the right-hand side of (1) is the service cost paid to the fog nodes $m \in \mathcal{M}$. The gain is then the utility minus the cost.

Similar to the offloading user, each fog node $m \in \mathcal{M}$ has some measure of its gain. The utility of fog node $m$ equals the price paid by the offloading user. The cost is then determined by several factors such as the consumed energy. We model the cost of a fog node $m$ by a linear function with parameter $c^{(\mathrm{L})}_m$. Formally, the gain of a fog node $m \in \mathcal{M}$ yields

$$g_m(c_m) = \kappa_m \big(c_m r_m - c^{(\mathrm{L})}_m r_m\big), \tag{2}$$

where $\kappa_m \in (0, 1]$ is a price-regulatory constant, which is equal to one in the most conventional form. Naturally, $c_m \ge c^{(\mathrm{L})}_m$. The traditional solution of the supply-demand procedure described in Alg. 1 is the Nash equilibrium, as determined in the following section.
B. Nash Solution
At Nash equilibrium, the players act by best-responding to each other. Indeed, the Nash equilibrium is the intersection of the best-response dynamics of the players. Therefore, no player benefits from a unilateral deviation from the equilibrium strategy, implying a steady state.

In the developed offloading game, the offloading user, if fully rational, optimizes its performance by maximizing the utility function (1) subject to the constraint $\sum_{m \in \mathcal{M}} c_m r_m = B$. This is an equality-constrained convex optimization problem with a differentiable objective for which the Karush-Kuhn-Tucker (KKT) conditions hold. It requires only a few calculus steps to show that

$$r^{*}_m = \frac{B}{c_m \left(1 + \sum_{i \in \mathcal{M}, i \neq m} \frac{\alpha_i}{\alpha_m}\right)}.$$

Similarly, every fully rational fog node $m \in \mathcal{M}$ determines the service price by maximizing the utility function given by (2), which corresponds to its best-response. As the equilibrium is the intersection of the two best-response dynamics, the optimal price is the solution of

$$r^{*}_m + \kappa_m \big(c_m - c^{(\mathrm{L})}_m\big) \frac{\partial r^{*}_m}{\partial c_m} = 0.$$

In the distributed implementation, the fog nodes start from some minimum price and adapt the prices continuously until convergence, i.e., until the price and demand settle at some point. Based on an approach similar to the one in [18], one can establish that, with the given price and demand functions, the procedure described in Alg. 1 indeed converges to a stable point that is the Nash equilibrium of the formulated game.

For simulation, we consider a system with one offloading user and four fog nodes. Fig. 1a depicts the iterative interaction between the fog nodes and the user. It is evident that the price, and thus the demand, settle at some point. The characteristics of this settlement point mainly depend on the preferences of the user. These, in turn, are determined by the properties of the fog nodes, including their requested service prices.
C. The Effect of Human Agents with Noisy Best-Response
In fog computing, similar to other crowdsourcing schemes, the set of contributors might include small handheld devices, where human agents take the role of decision-makers. To model the human agents in the aforementioned fog computing scenario, we use a noisy best-response model. We model the imprecision of decision-making by a uniform noise that is added to the best-response as a perturbation. We perform the experiment with two small noise levels, namely a maximum of 5% and 10% of the best-response, for all fog nodes and the offloading user. Fig. 1b and Fig. 1c depict the results, respectively. From the figures, it is evident that even a small deviation from the fully rational response can result in dramatic instability in the system, as well as a prolonged convergence time; as such, it should be taken into account for network optimization.
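To make the setup reproducible, the sketch below simulates the price-demand loop of Alg. 1 under the model above (NumPy assumed; all constants are hypothetical). The user's demand follows the closed-form best response of Section V-B; the fog nodes' price adaptation is a stylized gradient step along the stationarity condition with a diminishing step size, an illustrative stand-in rather than the paper's exact update; uniform noise perturbs both sides, and the averaging remedy of the next subsection is included as an option:

```python
import numpy as np

rng = np.random.default_rng(0)

M, B = 4, 10.0                            # number of fog nodes, user budget
alpha = np.array([1.0, 1.5, 2.0, 2.5])    # QoE weights alpha_m (hypothetical)
c_low = np.array([0.2, 0.3, 0.25, 0.35])  # cost parameters c^(L)_m (hypothetical)
kappa = 1.0                               # price-regulatory constant

def user_demand(c):
    """User's closed-form best response: r*_m = B*alpha_m / (c_m * sum(alpha))."""
    return B * alpha / (c * alpha.sum())

def simulate(noise=0.0, smooth=False, rounds=100):
    """Price-demand loop of Alg. 1 with optional noisy responses and averaging."""
    c = c_low.copy()                      # nodes start from their minimum price
    history = []
    for t in range(rounds):
        r = user_demand(c) * (1 + rng.uniform(-noise, noise, M))
        # Stylized price adaptation along the stationarity condition of Sec. V-B,
        # with a diminishing step so that prices and demands settle.
        grad = r + kappa * (c - c_low) * (-r / c)
        c = np.maximum(c + grad / (10 * (t + 1)), c_low)
        c *= 1 + rng.uniform(-noise, noise, M)       # nodes' imprecise response
        history.append(c.copy())
        if smooth:
            # Signal averaging (Sec. V-D): act on the running mean of past prices.
            c = np.mean(history, axis=0)
    return np.array(history)

for noise in (0.0, 0.05, 0.10):
    print(f"noise {noise:.0%}: final prices {np.round(simulate(noise)[-1], 3)}")
print("10% noise + averaging:", np.round(simulate(0.10, smooth=True)[-1], 3))
```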
D. Potential Solution to Resolve the Instability
To resolve the instability of the system induced by the noisy best-response, we use a simple, yet effective, signal processing approach, namely signal averaging. Utilizing this method relies on the assumption that every player is aware of the possibility of making mistakes; therefore, at every round, each player simply acts by the moving average of its best-responses in all the pricing-demand rounds so far. As is evident from Fig. 1d, this simple method reduces the noise significantly and resolves the instability to a large extent.

[Fig. 1: Price-demand negotiation. Four panels (Fog Nodes 1-4), each plotting share and price against the iteration rounds: (a) fully rational agents; (b) bounded-rational agents with 5% noisy best-response; (c) bounded-rational agents with 10% noisy best-response; (d) approximate resolution of instability. Prices are normalized by 10. The resolution of instability results from signal averaging over the 10% noisy best-response.]

Finally, Fig. 2 shows the utility of the offloading user (as the customer) and the aggregated utility of the fog nodes (as the suppliers) in different scenarios. It can be concluded that, in addition to harming system stability, impreciseness reduces the system's efficiency.

[Fig. 2: Utility of the offloading user ('User') and the aggregated utility of the fog nodes ('FNs') versus iteration, at full rationality, at the 5% and 10% noise levels, and when using the signal averaging method.]
VI. Outlook and Conclusion
Humans play a vital role in future IoT networks. As such, appropriate modeling of human behavior is significantly important for network optimization. We elaborated on formalizing human behavior using simple and efficient, yet effective, models. Based on a toy example in the fog computing framework, we showed that neglecting human-specific characteristics endangers the system's stability. Therefore, in future-looking applications, the models of wireless communication networks based on multi-agent systems with full rationality shall be reconsidered to include the imperfections of human behavior.

The complexity of humans' character and behavior, however, renders such enhancement notoriously difficult. While such difficulties have been studied in several domains such as marketing, the available solutions are not directly generalizable to technological networks such as IoT, and further study is imperative. Some challenges are as follows:

• Generally, one uses the available data to tune the parameters of the computational models; however, IoT applications are widely diverse. Moreover, gathering a large amount of data might not be feasible. As such, it is necessary to develop methods to manually adapt and re-use the existing data for different scenarios, or to create synthetic data.
• It is important to adapt the standard human decision-making models to IoT applications, as such applications are influenced by various physical constraints, e.g., the transmission medium over which the entities communicate with each other.
• Centralized learning can be excessively time- and energy-consuming; hence, it is essential to develop distributed learning algorithms to obtain accurate models and parameters efficiently and securely.
• Another challenge is the implementation of human-in-the-loop solutions in existing IoT systems. Indeed, the efficiency and feasibility of such implementations is itself a line of research, as it might require large computing capability and availability of power resources.
• Investigating the effect of integrating humans in the networks on addressing the trade-offs in IoT applications, as well as on network optimization and management, remains open.

Beyond IoE applications and in the general setting, there are several open challenges to address. These include:

• As previously discussed, there are several models to computationally formalize human decision-makers. Therefore, it is necessary to develop accurate and application-specific indicators for the complexity and permissiveness or usefulness of probabilistic models.
• Experiment design is a classical research direction in behavioral studies. In this regard, it is important to design diverse tests that reflect the usability of combinations of two or more models.
• Although an entity might change its behavior over time, such dynamics are largely neglected in the current research. Therefore, it is imperative to study models that include the time-variations in the behavior of a human agent.
• Irrespective of the nature of the players, efficiency and stability are the most important factors in autonomous systems. Therefore, developing methods to eliminate the adverse effects of human agents on the system's stability is a significant line of research.
• Tailoring deep learning frameworks for an accurate estimation of the models' parameters is another potential research direction.
References

[2] G. Schirner, D. Erdogmus, K. Chowdhury, and T. Padir, "The future of human-in-the-loop cyber-physical systems," Computer, vol. 46, no. 1, pp. 36–45, Jan 2013.
[3] D. Gopinath, S. Jain, and B. D. Argall, "Human-in-the-loop optimization of shared autonomy in assistive robotics," IEEE Robotics and Automation Letters, vol. 2, no. 1, pp. 247–254, Jan 2017.
[4] A. D. Wood, L. Selavo, and J. A. Stankovic, "SenQ: An embedded query system for streaming data in heterogeneous interactive wireless sensor networks," in Distributed Computing in Sensor Systems. Springer Berlin Heidelberg, 2008, pp. 531–543.
[5] D. Hales and B. Edmonds, "Applying a socially inspired technique (tags) to improve cooperation in P2P networks," IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 35, no. 3, pp. 385–395, May 2005.
[6] K. Driggs-Campbell, R. Dong, and R. Bajcsy, "Robust, informative human-in-the-loop predictions via empirical reachable sets," IEEE Transactions on Intelligent Vehicles, pp. 1–1, 2018, early access.
[7] S. Munir, J. A. Stankovic, C. J. M. Liang, and S. Lin, "Reducing energy waste for computers by human-in-the-loop control," IEEE Transactions on Emerging Topics in Computing, vol. 2, no. 4, pp. 448–460, Dec 2014.
[8] Q. C. Ye and Y. Zhang, "Participation behavior and social welfare in repeated task allocations," in IEEE International Conference on Agents, 2016, pp. 94–97.
[9] Q. Shao, M. H. Cheung, and J. Huang, "Multimedia crowdsourcing with bounded rationality: A cognitive hierarchy perspective," IEEE Journal on Selected Areas in Communications, vol. 37, no. 7, pp. 1478–1488, 2019.
[10] H. Hu, Y. Liu, C. Chen, H. Zhang, and Y. Liu, "Optimal decision making approach for cyber security defense using evolutionary game," IEEE Transactions on Network and Service Management, pp. 1–1, 2020.
[11] Y. In and J. Wright, "Signaling private choices," The Review of Economic Studies, vol. 85, pp. 558–580, Jan 2018.
[12] V. P. Crawford, M. A. Costa-Gomes, and N. Iriberri, "Structural models of nonequilibrium strategic thinking: Theory, evidence, and applications," Journal of Economic Literature, vol. 51, no. 1, pp. 5–62, Mar 2013.
[13] J. R. Wright and K. Leyton-Brown, "Level-0 meta-models for predicting human behavior in games," in Proceedings of the ACM Conference on Economics and Computation. ACM Press, 2014.
[14] R. D. McKelvey and R. T. Palfrey, "Quantal response equilibria for normal form games," Games and Economic Behavior, vol. 10, no. 1, pp. 6–38, 1995.
[15] J. R. Wright and K. Leyton-Brown, "Predicting human behavior in unrepeated, simultaneous-move games," Games and Economic Behavior, vol. 106, pp. 16–37, 2017.
[16] A. Tversky and D. Kahneman, "Advances in prospect theory: Cumulative representation of uncertainty," Journal of Risk and Uncertainty, vol. 5, no. 4, pp. 297–323, 1992.
[17] E. Fehr and K. M. Schmidt, "A theory of fairness, competition, and cooperation," The Quarterly Journal of Economics, vol. 114, no. 3, pp. 817–868, 1999.
[18] S. Maghsudi and S. Stanczak, "A hybrid centralized-decentralized resource allocation scheme for two-hop transmission," in