Bayesian-Assisted Inference from Visualized Data
Yea-Seul Kim, Paula Kayongo, Madeleine Grunde-McLaughlin, Jessica Hullman
Yea-Seul Kim is with the University of Washington. Paula Kayongo is with Northwestern University. Madeleine Grunde-McLaughlin is with the University of Pennsylvania. Jessica Hullman is with Northwestern University.
Fig. 1. Using Bayesian inference to assist how data is shown to improve belief updating. (a): The viewer holds prior beliefs about a parameter such as a disease rate in the population, which are elicited in the form of a probability distribution. (b): The user is presented with an observed dataset estimating the rate, which conveys information about the likelihood function. (c): The observed data is accompanied by Bayesian assistance techniques in the form of an uncertainty analogy or visualization of Bayesian posterior predictions derived from their prior beliefs and a normative Bayesian model. (d): In our experiment, we elicit posterior beliefs and use the deviation between these beliefs and (e) the normative beliefs to evaluate the two types of Bayesian assistance. The goal of Bayesian assistance is to bring the user's updated beliefs closer to (e).
Abstract — A Bayesian view of data interpretation suggests that a visualization user should update their existing beliefs about a parameter's value in accordance with the amount of information about the parameter value captured by the new observations. Extending recent work applying Bayesian models to understand and evaluate belief updating from visualizations, we show how the predictions of Bayesian inference can be used to guide more rational belief updating. We design a Bayesian inference-assisted uncertainty analogy that numerically relates uncertainty in observed data to the user's subjective uncertainty, and a posterior visualization that prescribes how a user should update their beliefs given their prior beliefs and the observed data. In a pre-registered experiment on 4,800 people, we find that when a newly observed data sample is relatively small (N=158), both techniques reliably improve people's Bayesian updating on average compared to the current best practice of visualizing uncertainty in the observed data. For large data samples (N=5208), where people's updated beliefs tend to deviate more strongly from the prescriptions of a Bayesian model, we find evidence that the effectiveness of the two forms of Bayesian assistance may depend on people's proclivity toward trusting the source of the data. We discuss how our results provide insight into individual processes of belief updating and subjective uncertainty, and how understanding these aspects of interpretation paves the way for more sophisticated interactive visualizations for analysis and communication.
Index Terms — Bayesian cognition, Belief updating, Uncertainty visualization, Adaptive visualization.
1 INTRODUCTION
People look to data visualizations in the media, government, and science to help them form beliefs about the world around them. However, abundant research indicates that people often struggle to properly account for uncertainty in making judgments from data. For example, many people overinterpret small samples [32, 57]. In other cases they may underreact to data, misjudging how informative large samples are [2] or failing to update their beliefs when a sample conflicts with their pre-existing beliefs [16].

Cognitive errors like under- and overreaction to data can be defined by comparing human judgments to Bayesian inference, a statistical method that prescribes how to update probabilistic beliefs given new evidence. Imagine you are interested in a political candidate A's chance of winning an election, and you have some expectations about that chance, based on, for example, seeing early results from a small poll of registered voters, and your experiences talking to others in your social circle. If asked to describe your beliefs, you'd say your best guess of the candidate's chance of winning the election is 51%, with a 95% chance that the value will be between 47% and 55%. In a Bayesian framework, these beliefs are called your prior beliefs.

One day you encounter a visualization of new poll results. The data indicates that A has a 60% chance of winning, based on responses from around 1000 people, with the chance of winning falling between 57% and 63% with high confidence (e.g., 95%). What should you believe after encountering the second poll? The laws of Bayesian belief updating prescribe an "optimal" way of combining prior and new information. Assuming that you have no reason to distrust the new evidence, you should update your beliefs proportional to the amount of new information that the poll provides over what you already believed. Bayesian inference formalizes this intuition through Bayes rule, which states that your posterior beliefs about a parameter after observing new data are proportional to your prior beliefs about the parameter multiplied by the information contained in the new evidence about the parameter. In this case, your new beliefs about A's chance of winning should be around 57%, with a 95% interval between 54% and 59%.

Recent work shows how a Bayesian cognition perspective can deepen understanding of visualization interpretation [43, 65] and contribute to more rigorous evaluation, in which deviation from Bayesian updating is used as a proxy for understanding which visualizations best support accurate perception of how informative data is [43]. We extend this work by considering the generative potential of predictions from models of Bayesian inference to guide belief updating from visualized data. We propose two Bayesian assistance techniques that use the mathematical intuitions of Bayesian theory to guide a user's belief formation process as they interact with visualized data.
Both techniques treat the user's subjective uncertainty about a parameter value before seeing newly observed data (i.e., their prior distribution) as a reference point against which the uncertainty in the observed data can be compared (Fig. 1b2). An uncertainty analogy relates uncertainty in observed data to uncertainty in the user's prior. A posterior visualization depicts the posterior beliefs predicted by Bayesian inference given the user's prior beliefs.

How does Bayesian assistance change users' beliefs as they interact with a visualization? We present a preregistered experiment with 4,800 participants in which we compare users' belief updating under Bayesian uncertainty analogies and posterior visualizations to beliefs based on common presentations of uncertain estimates like point estimates with reported sample size or a shaded interval displaying probability density. We find that:
• For small datasets (N=158), both techniques bring the average user's belief updating closer to normative Bayesian inference.
• Eliciting a prior from a user itself can encourage more Bayesian updating, as evidenced through an aggregate analysis of people's updating with and without elicitation.
We conclude by discussing the implications of our results as well as the adoption of Bayesian inference to guide visualization design and evaluation.
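To make the arithmetic of the election example above concrete, the following minimal sketch (our illustration, not code from the paper) moment-matches a Beta distribution to the stated prior and applies the standard conjugate update; treating the poll as exactly 1,000 respondents and the prior standard deviation as roughly a quarter of the 95% interval width are our assumptions.

```python
from scipy import stats

# Prior beliefs: best guess 51%, 95% interval roughly [47%, 55%].
# Moment-match a Beta(a, b): with mean m and sd s, kappa = m(1-m)/s^2 - 1.
m, s = 0.51, 0.02                       # sd taken as ~1/4 of the interval width
kappa = m * (1 - m) / s**2 - 1
a0, b0 = m * kappa, (1 - m) * kappa

# New poll: 60% of ~1,000 respondents -> 600 successes, 400 failures.
posterior = stats.beta(a0 + 600, b0 + 400)
print(round(posterior.mean(), 2))              # ~0.57
print(posterior.ppf([0.025, 0.975]).round(2))  # ~[0.54, 0.59]
```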
2 RELATED WORK
Research in judgment and decision-making demonstrates how human judgments under uncertainty can diverge from statistical accounts. For example, belief in the law of small numbers describes how many people are too confident in the representativeness of small samples [63]. More recent work describes how a related bias called non-belief in the law of large numbers, in which a person simply believes that proportions in any given sample might be determined by a rate different than the true rate (i.e., misunderstands the relation between sample size and error), is compatible with the earlier work on small samples by Tversky, Kahneman, and many others [12].

Some interventions can reduce biases in interpreting uncertainty. Research in uncertainty visualization has proposed many techniques for visually representing a quantified uncertainty distribution to improve judgments or decisions, from boxplots (e.g., [55]) to visualizations of probability density as area, shading, or other visual properties (e.g., [22, 27]) to frequency-based representations of probability like quantile dotplots and probabilistic animations such as hypothetical outcome plots [11, 26, 28, 35, 38, 39, 40]. We compare how well users update their beliefs using two Bayesian assistance techniques relative to a conventional interval and shaded density representation of a dataset.
Empirical research in economics and mathematical cognition demonstrates the role of beliefs in numerous judgments and decisions. Manski [46] argues against a long standing bias in economics toward inferring beliefs from choice, noting that eliciting probabilistic beliefs provides useful and predictive insight into behavior. He surveys economic literature on how people form beliefs and how these beliefs influence their financial decision making [9, 4] or other consumption [8, 25, 56, 3, 13, 18, 20, 30, 29, 66]. Camerer [17] and Schotter and Trevino [58] summarize the value of studying beliefs from laboratory findings, while Abeler et al. [1] use quantitative meta-analysis to show that experiment subjects can generally be trusted to report honest beliefs in economics experiments.

Mathematical psychologists have shown how Bayesian models of cognition help explain a range of perceptual and cognitive phenomena, such as inferring causal relationships [60, 59] or inductive learning [61, 34]. For example, Griffiths and Tenenbaum [34] demonstrate that the aggregate posterior belief distribution across people approximates the normative Bayesian posterior over various "everyday quantities" such as cake baking times and human lifespans.

Though the authors explicitly suggest that a mathematical account would not be feasible, McCurdy et al.'s [47] suggestion that implicit error captures how users "mentally adjust" data-driven estimates in interpretation resembles the Bayesian ideal that prior beliefs influence inferences drawn from new data. In contrast to their assertion that mathematical frameworks are not possible, we demonstrate how Bayesian modeling can combine subjective beliefs with observed data to reduce integration errors that may arise in mental approximation.

Until recently, research on the role of visualizations in promoting Bayesian reasoning was limited to studying how visualizations affect performance on classic conditional probability tasks like the mammography problem [51, 54, 31, 62, 53, 33, 21]. However, several recent visualization studies apply Bayesian modeling to visualization interpretation [43, 65]. In the closest prior work, Kim et al. [43] presented people with survey estimates of several proportions, finding that at an individual level, people's posterior beliefs diverged considerably from normative Bayesian predictions. In aggregate, however, people's posterior beliefs closely approximated the predictions of normative Bayesian inference for estimates based on small samples (N=158), but not for those based on very large samples (N=750k). Kim et al. show how the deviation between a person's posterior beliefs and the Bayesian normative posterior beliefs can be used as a proxy for a user's uncertainty comprehension. Our work extends this inquiry by considering whether Bayesian inference can also be used to generate personalized data presentations based on a user's prior beliefs.
3 MOTIVATING BAYESIAN ASSISTANCE
We introduce the assumptions behind applying a Bayesian perspective to visualization interpretation, then the specific components of our Bayesian modeling approach in the context of a belief updating scenario.
To apply Bayesian inference to visualization, we assume that prior to interacting with a visualization, a user has some state of prior beliefs about a parameter which the data provides an estimate of (e.g., a rate). We assume that any user's prior beliefs can be elicited through an interactive interface, and represented by a probability distribution. We can think of how tightly concentrated this distribution is as the strength of the user's beliefs, capturing how confident they are in their knowledge about the parameter value. The user's prior beliefs about a parameter can range from no relevant knowledge about the parameter value (e.g., a uniform distribution in which all values of the parameter are thought to be equally likely) to near complete certainty (e.g., high confidence that the value is within a very small interval).

We assume that the user will update their prior beliefs about the parameter upon viewing new information in a visualization. We assume that the closer the user's belief update is to optimally combining the information in their prior with the new visualized data (as defined by a standard Bayesian model of updating a sample proportion), the more rationally they have updated their beliefs.

For example, if one has no reason to believe that any particular value of the parameter is more likely than any other, their posterior beliefs should be equal to the evidence that the visualized data provides about the parameter value. If they had very strong prior beliefs about the parameter, and saw a relatively small amount of evidence in the visualization, their posterior beliefs should remain close to or even identical to their prior beliefs.

To model this process we use mathematical formulations standard in Bayesian statistics, including to fit the elicited beliefs to a statistical (prior) distribution, to represent the information about the parameter implied by the dataset (likelihood), and to calculate the Bayesian posterior beliefs. We provide further mathematical details below.

Finally, note that Bayesian inference in cognition is typically assumed to be an implicit process; our work explores whether making predictions from normative Bayesian updating explicit can be beneficial to users. Further, unless possible bias is intentionally modeled, a Bayesian model of updating will assume that prior beliefs and observed data are equally credible sources of information. Our work demonstrates how people's self-reported trust in data's credibility helps predict where this assumption may not hold.
Consider a scenario in which a user will be presented with a visualized estimate of a parameter θ. Imagine that the parameter is the proportion of residents of U.S. assisted living centers who have Alzheimer's. As a proportion, θ can theoretically take any value from 0 to 1. Before the user views observed data, they articulate their prior beliefs by assigning probability over plausible values of θ using an interactive interface (Fig. 1a).

In Bayesian inference, beliefs take the form of a probability distribution. For a proportion parameter θ, a Beta distribution is a convenient distribution to capture beliefs. Two parameters sufficiently define a unique Beta distribution: Beta(α, β). We can think of α − 1 as the number of hypothetical successful events and β − 1 as the number of hypothetical failure events implied by the prior; for example, a prior corresponding to 4 successful and 6 failure events is Beta(5, 7). The sum of the successful and failure events (i.e., 10) represents the amount of information (or conversely uncertainty) contained in the user's prior distribution.

Imagine that the user is next presented with a visualization of an estimate captured by observed data (Fig. 1 (b1)), such as the proportion of assisted living center residents with dementia according to records for a chain of centers with locations across the country. Out of 1,000 residents of these chains, 420 have dementia. We model the data generating process as a binomial process in which any individual independently has the disease with a certain (identical) probability θ.

We represent the observed data as a likelihood function capturing the probability of different values of θ given the observed data. Conveying a sense of likelihood is the goal of most approaches to communicating uncertainty in estimates. The likelihood encodes the relative number of ways that different values of θ could produce the observed proportion given our assumptions about the data generating process and the size of the observed sample. The likelihood function for a sample proportion, 42%, of 1,000 total residents can be represented by Binomial(1000, 0.42), implying an expected 420 successful events and 580 failure events but with some uncertainty due to sampling error.

\[
\begin{aligned}
\#\text{successes}_{\text{posterior}} &= \#\text{successes}_{\text{prior}} + \#\text{successes}_{\text{data}} \\
\#\text{failures}_{\text{posterior}} &= \#\text{failures}_{\text{prior}} + \#\text{failures}_{\text{data}}
\end{aligned}
\tag{1}
\]

The normative posterior distribution (Fig. 1e) that predicts rational updating is calculated by using Bayes rule to update the probability of θ in the prior with the information about θ implied by the likelihood function. Equation 1 results from using Bayes rule to estimate the number of successful and failure events in the posterior beliefs as a function of the estimates implied by the observed data and prior. The number of successful and failure events in the posterior beliefs is equivalent to a Beta distribution: Beta(α_prior + 420, β_prior + 580). Intuitively, under Bayesian inference the user's belief distribution after encountering the observed data shifts proportionally to the amount of information contained in the two distributions.
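In code, Equation 1 is pseudo-count addition. A minimal sketch (our illustration; the Beta(5, 7) prior echoes the hypothetical example above):

```python
from scipy import stats

def normative_posterior(a_prior, b_prior, successes, failures):
    """Conjugate Beta-Binomial update from Equation 1:
    posterior pseudo-counts = prior pseudo-counts + observed counts."""
    return stats.beta(a_prior + successes, b_prior + failures)

# 420 of 1,000 residents have dementia; weak Beta(5, 7) prior.
post = normative_posterior(5, 7, successes=420, failures=580)
print(round(post.mean(), 3))   # ~0.42: the weak prior barely moves the estimate
```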
We propose two Bayesian assistance techniques that exploit the user's prior beliefs. An uncertainty analogy relates uncertainty in observed data to uncertainty in the user's prior, and a posterior visualization depicts the posterior beliefs predicted by Bayesian inference, given the user's prior beliefs.
The user’s prior distribution captures their uncertainty about the pa-rameter value before seeing the observed data. We can treat this sub-jective uncertainty as a personally meaningful reference against whichuncertainty in the observed data can be compared. Imagine you arepresented with a visualization and text telling you how much informa-tion the visualized data contains relative to how informed you wereabout the topic already: “Your prior beliefs have 2 times more infor-mation than the data.”To generate the multiplicative factor, we compare κ (a proxy forsample size defined as α + β ) in the prior distribution ( κ prior ) to thesample size of the observed data ( κ data ). To avoid multipliers less thanone, we always chose the distribution (Beta corresponding to likeli-hood or participant’s prior) for which κ was lower as the reference dis-tribution. For example, if κ data is greater than κ prior , we calculated themultiplier as κ data / κ prior (e.g., Your prior beliefs have 2 times moreinformation than the data ), calculating the multiplier as κ prior / κ data in the case where κ prior was greater. An even more direct way to guide a user toward Bayesian inference isto present them with the normative belief distribution calculated usingtheir prior beliefs and the likelihood. Imagine that in addition to an ob-served dataset, you are presented with a visualization suggesting howyou should update your beliefs, in the form of the normative posteriorcalculated using your prior distribution, along with a brief explanationof how it was derived (i.e., by combining the information in their priorbeliefs with that in the observed data).
4 EXPERIMENT: BAYESIAN ASSISTANCE
We designed and preregistered a large crowdsourced between-subjects experiment to evaluate how participants appear to update their beliefs under Bayesian assistance versus more conventional depictions of proportion estimates.
[Figure 2 summarizes the study design: prior elicitation conditions (Point Estimate, Uncertainty Visualization, Uncertainty Analogy, Posterior Visualization) and no-elicitation conditions (Point Estimate, Uncertainty Visualization), crossed with four datasets: the more trustworthy dementia dataset and the less trustworthy abortion dataset, each with a small and a large sample size.]
Fig. 2. The study conditions and datasets.
We tested four approaches to conveying uncertainty (Fig. 2).
• Point Estimate (with sample size): Participants view a point estimate of the observed proportion with the size of the sample in text.
• Uncertainty Visualization: Participants view a point estimate of the observed proportion along with a probability density shaded interval in which the estimate is expected to fall with high probability (95%).
• Uncertainty Analogy: Participants view the uncertainty visualization alongside an uncertainty analogy. A brief explanation of how the analogy was generated (e.g., "We directly compared the sample size of the study to the sample size implied by your prior beliefs.") is also presented.
• Posterior Visualization: Participants view the uncertainty visualization alongside a visualization of the normative posterior distribution. A brief explanation of how the posterior was arrived at (including an analogy expression comparing the uncertainty in the participant's prior beliefs to that of the data as above) is presented.
Fig. 3. Illustration of how normative posterior beliefs (dashed) are influenced by the sample size of the observed data (represented by the likelihood in gray) given a prior distribution (solid). Assuming a relatively weak prior, when the sample size is small, the normative posterior distribution is located between the likelihood and the prior. Assuming the same prior and a large sample observed dataset, the normative posterior distribution is nearly identical to the likelihood function.
As Fig. 3 left shows, a weak prior belief distribution still has a demonstrable impact on the normative posterior beliefs when the observed data is relatively small (N=158). For a larger sample (N=5208) the normative posterior distribution is nearly identical to the observed data (Fig. 3 right). By varying sample size, we use our experiment to investigate whether a tendency for people's posterior beliefs to deviate more substantially from the normative posterior distribution for large samples found in prior work [43] holds for our participants as well. We chose 158 (after Kim et al. [43]) and 5,208 because samples in the low thousands are common in presentations of poll or survey results that people encounter in everyday life.
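The pattern in Fig. 3 follows directly from the conjugate update; a minimal sketch (our illustration, with a hypothetical weak prior of our choosing):

```python
from scipy import stats

a0, b0 = 5, 15                       # hypothetical weak prior with mean 0.25
for n in (158, 5208):
    successes = round(0.42 * n)      # both sample sizes show a 42% proportion
    post = stats.beta(a0 + successes, b0 + (n - successes))
    print(n, round(post.mean(), 2))  # 158 -> ~0.40 (pulled toward prior); 5208 -> 0.42
```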
Besides misunderstanding uncertainty, not trusting that a dataset is a faithful depiction of reality is another possible reason for deviation between one's posterior beliefs and the normative Bayesian posterior.

To investigate the impact of the perceived "controversialness" of data on the effects of Bayesian assistance, we identified two datasets that vary in how likely they are to be perceived as having been manipulated. We recruited 200 Mechanical Turk workers in the U.S. with approval ratings of 97% and above. Participants viewed pairwise combinations of six datasets: the proportion of 1) residents of U.S. assisted living centers who have Alzheimer's or other dementia, 2) corn production relative to other grain production in the U.S., 3) patients in the U.S. who misuse opioids prescribed for chronic pain, 4) foreign-born residents in the U.S., 5) adults in the U.S. who think third trimester abortion should be illegal regardless of circumstances, and 6) adults in the U.S. who support the death penalty.

In a first session, on each trial the participant saw a pair of dataset descriptions (i.e., a summary of the variable) side by side. Participants were asked to choose the dataset that "seems more likely to be tampered with or manipulated to persuade" using a radio button. Participants viewed a total of 15 pairs (trials). In the second session, participants viewed the same 15 pairs, but with the original proportion from the source presented with a 95% highest density interval calculated for an assumed sample size of 158. We randomized the order of pairs in both sessions.

We ranked the datasets by perceived manipulation using the sum of participants' votes per dataset. The proportion of U.S. assisted living center residents who have Alzheimer's obtained the fewest votes across both questions, while the proportion of U.S. adults who believe third trimester abortion should be illegal regardless of circumstances obtained the most.
It is possible that prior elicitation itself may affect how "Bayesian" a person appears to be, for example if it encourages the user to be more sensitive to uncertainty in the data. We include two conditions for which we do not elicit prior beliefs, No Elicitation-Point Estimate and No Elicitation-Uncertainty Visualization, and use them to evaluate the impact of elicitation on deviation from normative Bayesian belief updating. Though individual-level updating with and without elicitation cannot be directly compared without eliciting the individual's prior, an aggregate-level analysis, in which we assign No Elicitation conditions a common prior learned from many participants, allows us to observe how elicitation appears to change updating at an aggregate level.
We ran our experiment as a between-subjects study. Participants were randomly assigned to one of the six elicitation and visualization conditions and one of four datasets (small or large dementia dataset, or small or large abortion dataset) (Fig. 2). We pre-registered our conditions, sample sizes, and analysis. An introductory page described the dementia datasets (originally from the U.S. National Center for Health Statistics [14]) as having been collected by a national health agency, and the abortion datasets (originally from FOX News [10]) as having been collected by a media outlet.
[Figure 4 shows the elicitation interface, with prompts such as "Before we show you the study data, please tell us your best estimate of what percentage of assisted-living center residents in the US have Alzheimer's or dementia" and "Next, consider how uncertain you are about your estimate. Drag either gray end of the uncertainty range around the value that you just entered, until the uncertainty it displays aligns with how uncertain you are about the true percentage."]
Fig. 4. The elicitation interface. First, the participant enters a point estimate (top), then they specify how certain they are about their estimate by dragging either end of the interval (bottom). When the participant interacts with either handle, the other handle updates to accommodate the updated Beta distribution.
Participants assigned to elicitation conditions first provided their prior beliefs (Fig. 4 top). We designed an interface that prompted the participant to enter their best estimate of the parameter of interest (e.g., the percentage of assisted-living center residents in the US who have Alzheimer's or dementia), following prior research on eliciting priors over proportions from experts [64]. A two-handled slider then appeared, representing an interval around the value they provided as their estimate, with endpoints at 0 and 100%. Participants were asked to specify a range around the value by dragging the ends of the interval until its width aligned with how uncertain they felt about the true rate (Fig. 4 bottom). Participants were explicitly told that if their estimate represented a truly random guess, then their interval should span from 0 to 100%; otherwise they should adjust the ends of the interval to make it smaller. When the participant interacted with either handle, we updated the concentration parameter (κ) based on the handle's value and the mode, then calculated the other handle's location to reflect the 95% interval of the new Beta distribution. Specifically, κ is inversely proportional to the width of the elicited interval. Text above the slider reflected the specified prior (e.g., You think the percentage is almost certainly no less than 15% and no more than 33% and it's most likely around 23%, Fig. 4c).
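A minimal sketch (our reconstruction, not the study's interface code) of fitting a Beta distribution to an elicited mode and 95% interval width using the mode/concentration (ω, κ) parameterization referenced above; the numeric solver choice is ours:

```python
from scipy import stats, optimize

def fit_beta(mode: float, width: float):
    """Find kappa so the Beta's central 95% interval has the elicited width;
    alpha = mode*(kappa-2)+1, beta = (1-mode)*(kappa-2)+1."""
    def width_gap(kappa):
        a = mode * (kappa - 2) + 1
        b = (1 - mode) * (kappa - 2) + 1
        lo, hi = stats.beta(a, b).ppf([0.025, 0.975])
        return (hi - lo) - width       # wider interval -> smaller kappa
    kappa = optimize.brentq(width_gap, 2.01, 1e6)
    return mode * (kappa - 2) + 1, (1 - mode) * (kappa - 2) + 1

# "no less than 15% and no more than 33%, most likely around 23%"
a, b = fit_beta(mode=0.23, width=0.33 - 0.15)
```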
After prior elicitation, all participants examined the observed data. To create the visualization stimuli, we used the proportions from the original source of the datasets (dementia dataset: 42%, abortion dataset: 37%) and varied the sample size that a participant was assigned (small: 158, large: 5208). Participants in the Point Estimate conditions saw the point estimate of the proportion plotted, with the number of successes and sample size in text only (Fig. 5a). Participants in the Uncertainty Visualization and Bayesian assistance conditions saw the point estimate plotted with an interval depicting the lower and upper bound of the corresponding Beta distribution for the Binomial likelihood function, with shading proportional to probability density (Fig. 5b).
[Figure 5 shows the condition stimuli: (a) Point Estimate, (b) Uncertainty Visualization only, (c) Uncertainty Analogy, and (d) Posterior Visualization. The analogy and posterior conditions included a "How was this calculated?" explanation, e.g.: "We directly compared the sample size of the study by the national health agency to the sample size of the study implied by your prior beliefs. According to our computation, the study data has 2 times more information than your prior beliefs, so we weighted the study data more when we merged the two information."]
Fig. 5. Conditions in our experiment, including visualizing observed data as a point estimate with sample size, using a high probability interval with shading to visualize uncertainty in the observed data only, providing an uncertainty analogy based on the participant's prior, and providing a predicted posterior visualization based on the user's prior.
After viewing the data and prior visualization, participants in the assistance conditions then clicked for the Bayesian assistance, which appeared below the visualization of the observed data. For participants in the Analogy condition, we presented an analogy in text (Fig. 5c). For participants in the Posterior Visualization condition, we presented a visualization like our uncertainty visualization of the observed data, but where the distribution shown is the Beta distribution corresponding to the predicted posterior from our Bayesian model (Fig. 5d).
All participants then submitted their posterior beliefs on the next screen. On a final screen, participants were asked demographic questions (gender, education level, and age), and how likely they thought it was that the data was manipulated, on a five-point Likert scale with endpoints labeled Not at all likely (1) and Extremely likely (5). The final screen also asked participants what proportion corresponded to the observed data they had been shown via multiple choice (below 30%, between 30% and 60%, above 60%) as a preregistered exclusion criterion to filter participants who were not paying attention from the analysis.
We recruited participants on Amazon Mechanical Turk, removing those who failed the preregistered exclusion criterion (182 in total), and recruiting more until each condition had 200 participants (4,800 in total). We made the HIT available to U.S. workers with an approval rating of 97% or more. The HIT carried a reward of $0.80, which we calculated based on pilot study completion times to ensure that the majority of workers would receive the U.S. minimum wage.
5 RESULTS
The average task completion time was 3.6 min (SD: 6.6). To analyze participants' responses, we fit the elicited beliefs to a Beta distribution. We treat the elicited point estimate as the mode of a Beta distribution (ω) and use the width of the interval to derive the concentration parameter (κ), fitting a distribution using optimization as suggested by prior work [64]. To compute each participant's normative posterior distribution, we used the relationship between the posterior Beta parameters and those of the prior and likelihood deriving from Bayes' rule (Eq. 1). We treat the deviation between the participant's actual posterior beliefs and the normative posterior beliefs as a proxy for how well the participant appears to have interpreted the information contained in the observed data and combined it with the knowledge they already had. We analyzed the deviation in two ways. First, to provide intuition for how participants updated in terms of the familiar notions of a distribution's location and variance, we compared the location (i.e., mean) and the variance of each participant's posterior distribution to those of the normative posterior distribution. Second, we pre-registered an analysis using KL Divergence (KLD) to measure the difference between a participant's stated posterior beliefs and the normative posterior distribution from our Bayesian models. KLD captures the information loss when representing a target distribution p with a second distribution q [45].
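For two Beta distributions, KLD has a closed form; a minimal sketch (our derivation from the standard formula, reading the normative posterior as the target p):

```python
from scipy.special import betaln, digamma

def kld_beta(a_p, b_p, a_q, b_q):
    """KL(p || q) in nats for p = Beta(a_p, b_p), q = Beta(a_q, b_q):
    information lost when q (stated beliefs) stands in for p (normative)."""
    return (betaln(a_q, b_q) - betaln(a_p, b_p)
            + (a_p - a_q) * digamma(a_p)
            + (b_p - b_q) * digamma(b_p)
            + (a_q - a_p + b_q - b_p) * digamma(a_p + b_p))

print(kld_beta(425, 587, 425, 587))   # 0.0: identical distributions
```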
We analyzed qualitative differences in how participants updated their beliefs across datasets and visualization conditions.

We categorize participants into five "update types" based on the location (i.e., mean) of their posterior distribution relative to their prior distribution, the normative posterior for that participant, and the likelihood (Fig. 6). We use near normative when the location of the participant's posterior is within a relatively small window of the normative posterior (i.e., +/- 2%). We use overweight prior for cases where a participant overweighted their prior distribution relative to the predictions of normative Bayesian updating, and overweight data for cases where the participant's posterior fell between the prior and likelihood but was closer to the likelihood than predicted by normative Bayesian updating. While most participants' posterior distributions fell, as we might expect, somewhere between their prior distribution and the likelihood, we use updated away from data for cases where the participant's posterior moved in the opposite direction from both the likelihood and their prior. We use overshoot data for cases where the location of the participant's posterior surpassed or "overshot" the observed data.

Figure 6 characterizes participants' updating behavior by dataset and visualization condition according to these categories. Overall, the near normative type was the most frequent across datasets and conditions, suggesting that people approximate Bayesian updating in terms of the location of their distributions. Participants in the Point Estimate conditions (first column in Fig. 6) were the least likely to fall in the near normative category, and those in the Posterior Visualization conditions (last column) were the most likely to.

Overweighting one's prior was, however, more common in two conditions: the Point Estimate for the large abortion dataset and the Uncertainty Visualization for the small abortion dataset. The greater tendency among participants to perceive the abortion dataset as having been manipulated may have led participants to adhere more strongly to their prior beliefs. Similarly, when comparing the ratio of the overweight prior type between the dementia datasets (rows a and b) and the abortion datasets (rows c and d), more participants overweighted their priors when they examined the abortion datasets.
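A minimal sketch of the categorization defined above (our formalization; tie-breaking at the window boundary is an assumption), operating on distribution means:

```python
def update_type(prior_m, norm_m, lik_m, post_m, window=0.02):
    """Classify a posterior mean relative to prior, normative, likelihood."""
    direction = 1 if lik_m >= prior_m else -1     # which way the data pulls
    if abs(post_m - norm_m) <= window:
        return "near normative"
    if (post_m - prior_m) * direction < 0:
        return "updated away from data"
    if (post_m - lik_m) * direction > 0:
        return "overshoot data"
    # Otherwise the posterior lies between prior and likelihood:
    if (norm_m - post_m) * direction > 0:
        return "overweight prior"
    return "overweight data"

print(update_type(prior_m=0.25, norm_m=0.40, lik_m=0.42, post_m=0.30))
# -> "overweight prior"
```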
[Figure 6 shows the Location (top) and Variance (bottom) of participants' updated belief distributions by condition (Point Estimate, Uncertainty Visualization, Analogy, Posterior Visualization) for (a) Dementia (N=158), (b) Dementia (N=5208), (c) Abortion (N=158), and (d) Abortion (N=5208). Location categories: updated away from data, overweight prior, near normative, overweight data, overshoot data. Variance categories: more than 50% smaller than the normative variance, 10-50% smaller, close, 10-50% larger, more than 50% larger.]
Fig. 6. Categorization of the location and variance of participants' updates relative to the predictions of normative Bayesian inference for that participant. Top: Each participant was categorized according to the relationship between the mean of their posterior distribution relative to that of their prior distribution, the normative posterior distribution, and the likelihood function. The legend shows a hypothetical participant for which the mean of their prior distribution was smaller than that of the likelihood; our analysis also includes the opposite case (i.e., the mean of the participant's prior was greater than the mean of the likelihood). Bottom: Each participant was categorized according to the relationship between the variance of their posterior distribution relative to that of the normative posterior distribution.
Figure 6 also indicates that the analogy conditions resulted in the highest ratio of people who overshot the likelihood across datasets. The vast majority (roughly 95%) of our participants had more uncertain priors compared to the likelihood, leading to multipliers greater than one. It is possible that imprecise mental calculations led analogy participants to overcorrect.
To contextualize how the amount of uncertainty implied by participants' posterior beliefs compared to the amount predicted by normative inference, we categorized patterns in variance updates (Fig. 6). Because the deviation in elicited posterior versus normative posterior variance was considerably larger than that for means, we categorized participants as close to normative if the participant's posterior variance was within 10% of the variance of the normative posterior. We similarly categorized participants whose posterior variance was more than 50% smaller than the variance of the normative posterior, as well as 10-50% smaller, 10-50% larger, or more than 50% larger.

Comparing the distribution across categories in Figure 6 Location (top) to that in Figure 6 Variance (bottom), it is clear that participants' deviations from normative inference are driven primarily by non-Bayesian updating of the variance of their beliefs. Additionally, in contrast to the results on location updating, we see no clear advantages of the two types of Bayesian assistance in reducing errors in variance updating. Regardless of the specific dataset, most participants provided posterior beliefs whose variance was 10%-50% higher than the variance of the normative posterior. Hence, participants generally remained more uncertain about the parameter value than they should have. Possible drivers of this pattern include unmodeled predictors (e.g., a person's trust in the data relative to a Bayesian's), error in elicitation, or non-Bayesian updating.

Variance results are somewhat different between the small (rows a and c) and large datasets (rows b and d). Specifically, around 30% of participants who saw small datasets were more certain than the normative posterior (summing the first two bars). However, for those who saw large datasets, this number dropped to less than 17% of participants. Overall, participants were less certain of their updated beliefs than the normative posterior, but those who saw the small datasets were overconfident more frequently than those who saw the large datasets.
Per our pre-registration, we specified four Bayesian linear regressions, one for each dataset we presented to participants (dementia N=158, dementia N=5208, abortion N=158, abortion N=5208). These regressions estimate differences in the distributions of KLD, a single measure of deviation between each participant's updating and normative Bayesian updating, by condition.

\[
\begin{aligned}
kld &\sim \mathrm{dlnorm}(\mu, \sigma) \\
\mu &= \mu_{int} + \mu_{post} \cdot Post + \mu_{anlg} \cdot Analogy + \mu_{pointEst} \cdot PointEst \\
\log(\sigma) &= \sigma_{int} + \sigma_{post} \cdot Post + \sigma_{anlg} \cdot Analogy + \sigma_{pointEst} \cdot PointEst \\
\mu_{int}, \mu_{post}, \mu_{anlg}, \mu_{pointEst} &\sim \mathrm{dnorm}(0, \ldots) \\
\sigma_{int}, \sigma_{post}, \sigma_{anlg}, \sigma_{pointEst} &\sim \mathrm{dnorm}(0, \ldots)
\end{aligned}
\]

Each model consisted of two submodels. The first submodel predicted bias (mean error) in log KLD, capturing how closely participants' response distributions aligned with the normative Bayesian prediction by condition. We use log KLD in our analysis (reporting non-log error results in Supplemental Material) to reduce the impact of outliers we observed across conditions on our estimates, as KLD grows rapidly as the two distributions diverge.

Fig. 7. Posterior estimates of bias (mean error) and dispersion (standard deviation) of log KLD with 95% credible intervals by condition. Results for the dementia datasets are presented in the top row, and for the abortion datasets in the bottom row. Annotations describe effects relative to visualizing uncertainty in observed data (Uncertainty Vis).
The second submodel regressed dispersion (variance) in log KLD in log space on the same variables, capturing how much variation there was between participants' deviations from normative inference in a condition. In addition to lower bias, lower dispersion (i.e., more consistent estimates of log KLD) means a technique reduces noise.

We implemented each model in R's rethinking package [48], using weakly-informed Gaussian prior distributions centered around 0 for bias and dispersion. We used dummy variables to indicate whether the participant was shown an uncertainty visualization, an analogy, or a posterior visualization.

We report the result for each condition and dataset relative to a participant in the Uncertainty Visualization condition, as visualizing uncertainty is arguably the best choice a designer could make outside of personalization. We provide coefficients for both submodels in Figure 7, left. For readers familiar with statistical significance, we say that a condition has a reliable effect over uncertainty visualization when its 95% Percentile Interval (PI) (reported in text) does not overlap with 0 (which would indicate the possibility of no effect). We visualize posterior estimates of expected bias and dispersion in log KLD by condition (Fig. 7, right). Model specifications are in Supplemental Material.

To further contextualize the size of the effects on bias and dispersion, we also report Cohen's d [19] and Common Language Effect Size (CLES [50]), measures of standardized effect size, using our model results. Cohen's d captures the number of standard deviations by which two means differ, while CLES describes what percentage of the time a randomly drawn sample from one distribution would have a higher value than a randomly drawn sample from the second distribution. To calculate effect size on our model estimates, we first constructed an aggregated posterior distribution for each condition, using the bias posterior estimates from the bias submodel and dispersion posterior estimates from the dispersion submodel. We compute effect size by comparing the distribution of the assistance conditions with that of the Uncertainty Visualization condition.
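For readers who want to reproduce the model structure outside R's rethinking package, the sketch below re-expresses the two submodels in PyMC with synthetic data; the prior standard deviations (1 and 0.5) are placeholder assumptions, not the pre-registered values.

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(1)
n = 200
cond = rng.integers(0, 4, size=n)   # 0=UncertaintyVis, 1=Posterior, 2=Analogy, 3=PointEst
X = np.column_stack([np.ones(n)] + [(cond == k).astype(float) for k in (1, 2, 3)])
kld = rng.lognormal(mean=1.0, sigma=0.5, size=n)   # synthetic stand-in for observed KLD

with pm.Model():
    b = pm.Normal("bias_coefs", mu=0, sigma=1, shape=4)          # mu submodel
    s = pm.Normal("dispersion_coefs", mu=0, sigma=0.5, shape=4)  # log-sigma submodel
    pm.LogNormal("kld", mu=pm.math.dot(X, b),
                 sigma=pm.math.exp(pm.math.dot(X, s)), observed=kld)
    idata = pm.sample()
```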
Dementia dataset. Small sample (N=158):
Relative to the Uncertainty Visualization condition, both Bayesian assistance techniques reliably decreased bias in log KLD by similar amounts (-0.19 and -0.17, respectively; Fig. 7a). Viewing a Point Estimate was not distinguishable in log KLD from viewing an Uncertainty Visualization.

Our characterization of updating by location and variance (Sec. 5.3) suggested that the Posterior Visualization helped participants correctly update the location of their beliefs. Hence, the bias reduction in log KLD may be driven by better location updating among Posterior Visualization participants. On the other hand, our earlier analysis (Fig. 6) indicates that the location updating of participants in the Analogy condition and the Uncertainty Visualization condition for the small dementia dataset are similar. Hence the reliable improvement in updating we observe for the Analogy condition may be driven more by better variance updates than better location updating.

Our dispersion submodel indicates that the Posterior Visualization led to more consistent values of log KLD among participants compared to Uncertainty Visualization, with an estimated reduction in dispersion of 0.39 (Fig. 7e). Seeing an Analogy did not noticeably affect dispersion compared to the Uncertainty Visualization. However, viewing a Point Estimate increased dispersion in log KLD relative to Uncertainty Visualization.

Cohen's d for the Posterior Visualization was 0.33, equivalent to a CLES of 59%. Hence, a participant from the Posterior Visualization conditions will have lower log KLD than a participant from the Uncertainty Visualization condition 59 out of 100 times when we randomly select a participant from each condition. Cohen's d for the Analogy assistance was 0.27, equivalent to a CLES of 57%.
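Under a normal approximation to the two log KLD distributions, Cohen's d converts to CLES via Φ(d/√2); a minimal sketch (our illustration of the standard conversion, not necessarily the paper's exact computation over posterior draws):

```python
from math import sqrt
from scipy.stats import norm

def cles(cohens_d: float) -> float:
    """P(random draw from one group exceeds a random draw from the other)."""
    return norm.cdf(cohens_d / sqrt(2))

print(round(cles(0.33), 2))   # ~0.59, matching the reported 59% for Posterior Vis
```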
Large sample (N=5208): Relative to the Uncertainty Visualization condition, viewing a Posterior Visualization reliably reduced bias in log KLD, but viewing an Analogy or Point Estimate had no observable effect (Fig. 7b). While highly variable, the distribution of bias in log KLD for the Posterior Visualization condition does not overlap with the distributions of expected bias for the non-Bayesian conditions (Fig. 7b right). However, the distribution of expected bias for the Analogy condition is not distinguishable from the Point Estimate and Uncertainty Visualization conditions. Again, our earlier analysis of location and variance updates (Fig. 6) suggests that participants in the Posterior Visualization conditions were better at updating the location of their posterior.

All conditions reliably increased dispersion in log KLD relative to Uncertainty Visualization (Fig. 7f). Cohen's d for the Posterior Visualization was 0.21, equivalent to a CLES of 56%.
Abortion dataset. Small sample (N=158): Similar to the small dementia dataset, the Analogy and Posterior Visualization both reliably reduced bias in log KLD relative to the Uncertainty Visualization (Fig. 7c), while the Point Estimate condition was not reliably different. Compared to the small sample dementia dataset, being in the Posterior Visualization condition resulted in higher estimated dispersion in log KLD (Fig. 7g). Cohen's d for the Analogy and Posterior Visualization were both 0.35 (CLES 59%).
Large sample (N=5208): In contrast to the large dementia dataset, neither the Posterior Visualization nor the Analogy condition reliably reduced bias in log KLD for the large abortion dataset (Fig. 7d). A Point Estimate also did not reliably differ from Uncertainty Visualization. We suspect that any effects of Bayesian assistance were too small to observe in light of the rather large discrepancies we observed between participants' posterior beliefs and the predictions of normative Bayesian inference with regard to variance (Fig. 6). We see slightly different patterns compared to the large sample dementia dataset when it comes to effects on dispersion in log KLD. Viewing an Analogy slightly decreased dispersion in log KLD while viewing a Point Estimate had a stronger decreasing effect (Fig. 7h).

Our results conceptually replicate a difference in how closely the updates of untrained participants resemble Bayesian updating when shown a small versus a large dataset, observed in behavioral economics [2, 12] and visual data interpretation [43]. While participants assigned large datasets appear to update closer to normative Bayesian inference when we look at the location of posterior beliefs (e.g., compare rows a and b, and rows c and d in Fig. 6), the opposite is true when we look at the variance of their posterior beliefs, where deviation from normative Bayesian inference is substantial. The average bias in log KLD across participants was 0.90 (median: 0.93, IQR: 0.23, KLD: 11.24) for small datasets, and much higher for large datasets (mean: 1.67, median: 1.68, IQR: 0.04, KLD: 49.7), similar to Kim et al.'s [43] observations for a small sample (n=158) and much larger (n=750k) sample.

Conceptual models of bias like belief in the law of small numbers [63] attempt to explain diverse experimental evidence on belief updating. Our results and those of Kim et al. [43] are congruent with a model of non-belief in the law of large numbers [12] suggesting that while a Bayesian expects an estimate to eventually converge to the true rate, people update their beliefs as though they expect error in the estimate to remain relatively high and constant as sample size increases.
Fig. 8. Comparing whether users from whom we elicited priors updated closer to Bayesian in aggregate than those who did not provide priors. Elicitation conditions yielded lower log KLD, implying prior elicitation alone may improve updating.
Our results show that conditional on a user specifying their prior, Posterior Visualization and sometimes Uncertainty Analogy better promote Bayesian updating than simply visualizing uncertainty in the observed data. However, given that the status quo in most interactive visualization is not to elicit a prior, one might ask how the act of prior elicitation itself impacts updating. Do users become more sensitive to uncertainty in observed data when they explicitly consider their subjective uncertainty about a parameter value?

Comparing an individual's posterior beliefs to a normative Bayesian posterior with and without elicitation is not possible, as without a prior we would have no way of computing the normative posterior. We instead use an aggregate analysis approach similar to that used in prior work on Bayesian cognition [34, 43] and to our approach to computing effect size using CLES (full details reported in Supplemental Material). Across the board, elicitation conditions yielded lower log KLD, suggesting prior elicitation alone may improve updating (Fig. 8).
6 BAYESIAN COGNITION AS VISUALIZATION FRAMEWORK
We reflect on the potential for using Bayesian assistance and Bayesianmodeling to improve visualization.
Our work adds to growing evidence that a Bayesian cognition approach can deepen insight into belief formation from visualization and give rise to new design and evaluation techniques for visualization research and practice.

Our results first provide evidence of tendencies in how untrained users form beliefs from data. Comparing our analysis of location updates to that of variance updates as a whole (Sec. 5.3), it is clear that people are much better at providing posterior beliefs that are located (i.e., have a mean) approximately near the location of the normative Bayesian posterior beliefs than they are at providing posterior beliefs that are appropriately certain. Specifically, study participants remained considerably less certain than the information-pooling Bayesian would be, aligning with recent empirically-based models of belief updating from behavioral economics [12] as well as the large sample results of Kim et al. [43].

When visualizations present estimates based on small samples for inference, generating Bayesian assistance from users' priors in the context of a simple Bayesian model can improve untrained users' sensitivity to how informative new data are. Compared to visualizing uncertainty in an estimate, Bayesian assistance resulted in a small to moderate reduction in bias in updating for estimates based on small samples, even when data were perceived as moderately likely to have been manipulated. When the Bayesian assistance techniques were compared to point estimates, which remain the default approach to presenting estimates in many venues [36], the Bayesian assistance techniques were slightly more effective (CLES from 55% to 61%). Using prior beliefs as an entry point into communicating uncertainty via Bayesian assistance may therefore be helpful in common small sample scenarios like presentations of poll results, where people's misinterpretations of uncertainty in data often have implications for their decisions. It can also reduce heterogeneity in updating behavior, especially if the alternative presentation is a point estimate with sample size.

The benefits of Bayesian assistance for large sample scenarios are less clear-cut. For the dementia dataset, visualizing a predicted Bayesian posterior better aligned participants' posterior beliefs on average with Bayesian inference. This effect, similar to the effects of posterior visualization that we observed for small samples, appears to be driven mostly by the Bayesian assistance helping people more accurately update the location of their beliefs. We note, however, that the effect of posterior visualization for the large dementia dataset may be too small to be of practical significance, as in a large data case KLD can be sensitive (e.g., even if two highly concentrated distributions are quite close in location, KLD can yield a high value).

The Analogy condition did not reliably improve inference for the large dementia dataset. It is possible that people struggled to use large multipliers to arrive at the normative posterior implied by the analogy, as larger numbers are associated with less precise mental representations and more error in mental calculation [23].

For the large abortion dataset, which participants rated as slightly more likely to be subject to manipulation, neither of the Bayesian assistance techniques improved inferences. This may be due to participants discounting the informativeness of the data based on their perceptions that it might have been manipulated.
We present an analysis in Supplemental Material that provides partial support for this explanation.
The benefits of eliciting data-oriented predictions from visualization users have been demonstrated in prior work by Kim, Hullman, and colleagues [41, 42, 37]. Our work extends these findings using a formal Bayesian evaluative framework. One possible explanation, congruent with Hullman et al.'s [37] finding that eliciting probabilistic predictions improves uncertainty comprehension, is that interacting with the prior elicitation interface better prepared participants to reason about uncertainty in the observed data. Researchers and authors who want to engage visualization users to think more deeply about estimates should consider eliciting subjective uncertainty as an alternative or complement to visualizing uncertainty in estimates.
Given the potential utility of Bayesian models of cognition to visualization, as demonstrated by our work and prior work [43, 65], it is worth considering the importance of the assumptions of these models and the design requirements of using such approaches.
Using Bayesian models of cognition in visualization assumes that users have prior beliefs, that they can articulate them when guided to do so, and that greater alignment between how they update their beliefs and how a Bayesian would is desirable (Sec. 3.1). A common question might be, can I trust the prior beliefs that a participant provides? We refer the reader to literature in economics and psychology for detailed evidence suggesting that people can provide priors unincentivized, and that elicited or inferred representations of people's prior beliefs have predictive value for their later behavior (Sec. 2.1).

When it comes to applications of Bayesian cognition to visualization design and evaluation, even though it is reasonable to believe that elicited priors are not a perfect representation of a user's prior beliefs, we find evidence that they can still be useful to consider in interaction. Prior elicitation itself may be beneficial for prompting a more uncertainty-aware mindset on the part of a visualization user. Moreover, when multiple belief updates by the same person can be observed, as might be the case in visual analytics scenarios, a Bayesian framework can enable detecting patterns of irrational movement or uncertainty reduction in beliefs even if users are far from the predicted Bayesian posterior, due to noise in eliciting prior beliefs or approximate Bayesian behavior [6]. For example, regardless of the distance between their posterior beliefs and normative Bayesian posterior beliefs, if a person increasingly shifts their beliefs without becoming more certain over time, or becomes much more certain without any shifts in beliefs, it is relatively obvious that their belief formation is not responding appropriately to data. It may be worth exploring how prior elicitation could be avoided while still gaining the benefit of Bayesian models for bias detection in visual analytics settings where it's reasonable to infer a prior based on data that the system has observed the analyst examining in the past.

By explicitly suggesting to a user how they should update their beliefs in light of new data, Bayesian assistance poses interesting questions about when Bayesian inference is the most appropriate normative standard. For example, under what conditions should a user who is distrustful of a data source be guided to integrate the new information into their prior beliefs? While this question is beyond the scope of our work, we believe that there are a number of cases where valid data is rejected irrationally by users, such as when distrust in the source of a media report (e.g., a Conservative-leaning publication) leads a Democrat to reject new information that is in fact trustworthy.

In cases where a simple Bayesian model that assumes a user takes data at "face value" seems clearly inappropriate, such as when a data source is well known to not be trustworthy, Bayesian modeling can help visualization researchers arrive at a more precise understanding of influences external to the data.
By explicitly suggesting to a user how they should update their beliefs in light of new data, Bayesian assistance poses interesting questions about when Bayesian inference is the most appropriate normative standard. For example, under what conditions should a user who is distrustful of a data source be guided to integrate the new information into their prior beliefs? While this question is beyond the scope of our work, we believe there are a number of cases where valid data is rejected irrationally by users, such as when distrust in the source of a media report (e.g., a Conservative-leaning publication) leads a Democrat to reject new information that is in fact trustworthy.

In cases where a simple Bayesian model that assumes a user takes data at "face value" seems clearly inappropriate, such as when a data source is well known to not be trustworthy, Bayesian modeling can help visualization researchers arrive at a more precise understanding of influences external to the data. Factors that shape data reception are all fair game for inclusion in more sophisticated Bayesian models in the form of "hyperpriors" (distributions over parameters of the priors): the influence of one's a priori trust in the data source, the interaction between the specific parameter estimate and one's beliefs about the source [7, 15], the tendency to reject one's beliefs entirely upon realizing one was misinformed, or the tendency for people to diverge from a Bayesian's tendency to form posterior beliefs with less variance than their prior or the likelihood, even in cases where the prior and likelihood seem disparate. We believe such "pseudo-Bayesian" models could provide the basis for understanding a large class of cognitive biases that affect judgments from visualizations.
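As one simple illustration of how source trust could enter such a model, the sketch below tempers the likelihood with a fixed trust weight (a power likelihood). This is our simplification for exposition, not a technique from our experiment: a full pseudo-Bayesian model would place a hyperprior distribution over such a weight rather than fixing it, and the parameter values here are hypothetical. With a Beta prior and Binomial data, tempering amounts to scaling the observed counts.

    def tempered_beta_update(a, b, k, n, trust):
        # Standard Bayes for a Beta(a, b) prior and k successes in n trials
        # yields Beta(a + k, b + n - k). A trust weight in [0, 1] tempers
        # the likelihood (posterior proportional to prior * likelihood^trust),
        # which here simply scales the counts; trust = 1 recovers standard
        # updating and trust = 0 ignores the data entirely.
        return a + trust * k, b + trust * (n - k)

    # A skeptical viewer (trust = 0.3) moves less from a prior mean of 0.10
    # toward the sample proportion (22/158, about 0.14) than a trusting one:
    for trust in (1.0, 0.3):
        a_post, b_post = tempered_beta_update(a=4, b=36, k=22, n=158, trust=trust)
        print(trust, round(a_post / (a_post + b_post), 3))  # 0.131, then 0.121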
How to use Bayesian cognition for understanding or improving belief updating from visualizations may at first seem complicated. We suggest that a natural starting place is to determine what parameter(s) a visualization supports estimating. The parameter(s) should correspond to statistics of the observed data that the author believes are most important to the user and inference task: a population-level proportion (rate), a bivariate relationship (with parameters, e.g., a slope and intercept), or an average.

A Bayesian model can then be specified to estimate the posterior probability of the parameter(s) given a prior distribution and a likelihood function assumed to characterize data generation. As our experiment demonstrates, even a simple model may suffice to drive improved inferences. While Bayesian modeling is flexible to varying forms of prior and posterior distributions, model specification is often simplified by looking to the family of distributions associated with a type of parameter and likelihood to identify the conjugate prior (e.g., a Beta distribution for a probability, a truncated Gaussian for a positive-valued random variable, a Gamma for a duration, etc.). Textbooks aimed at readers new to Bayesian modeling provide accessible explanations and examples of common model formats [44, 49]. The Bayesian model we employed for a Binomial likelihood function to generate Bayesian assistance has just a single parameter. However, the general intuition behind Bayesian assistance applies to other data-generating processes like Gaussians, where the mean of the normative posterior is the average of the prior mean and the observed data mean, weighted by the amount of information in each distribution (a minimal sketch of both cases appears at the end of this section). More detail on how to calculate posterior parameters when the likelihood function follows other distributions (e.g., a Normal distribution) is in the Supplemental Material.

We believe that the potential for Bayesian assistance to be used as a design strategy in visualization analysis and communication settings extends far beyond the demonstration we presented here. For example, while we use an individual's prior from a single belief update to drive the two forms of Bayesian assistance, recent work from economics suggests that how a person updates their beliefs in light of new data is a stable individual trait [5, 6, 24, 52]. Personalizing data representations based on an individual's "update type" (e.g., a tendency to overweight vs. underweight their prior or the data) may be beneficial in visual analytics or communication settings.
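To make the recipe above concrete, the following sketch computes normative posterior parameters for the two cases just discussed. The prior values are hypothetical stand-ins for parameters that would, in practice, be fit to a user's elicited beliefs; the Gaussian numbers loosely echo the election example from the introduction.

    # Case 1: proportion parameter, conjugate Beta prior + Binomial data.
    # A Beta(a, b) prior with k successes in n trials updates to
    # Beta(a + k, b + n - k).
    a, b, k, n = 4, 36, 22, 158        # prior mean 0.10; sample proportion ~0.14
    a_post, b_post = a + k, b + (n - k)
    print(a_post / (a_post + b_post))   # posterior mean ~0.13

    # Case 2: Gaussian likelihood with known variance. The posterior mean
    # is the precision-weighted average of the prior mean and the observed
    # mean, so each source counts in proportion to the information it carries.
    def gaussian_posterior(mu_prior, sd_prior, mean_obs, se_obs):
        prec_prior, prec_obs = 1 / sd_prior ** 2, 1 / se_obs ** 2
        mu_post = (prec_prior * mu_prior + prec_obs * mean_obs) / (prec_prior + prec_obs)
        sd_post = (prec_prior + prec_obs) ** -0.5
        return mu_post, sd_post

    # Prior: 51% with sd 2; observed poll: 60% with standard error 1.5.
    print(gaussian_posterior(51.0, 2.0, 60.0, 1.5))  # mean ~56.8, sd ~1.2

Because the observed poll carries more information (a smaller standard error) than the prior, the posterior mean lands closer to the data than to the prior, which is exactly the weighting intuition the Uncertainty Analogy is designed to convey.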
CONCLUSION
We showed how personalizing the presentation of visualized data using Bayesian inference can assist untrained visualization users in updating their beliefs more like Bayesians. Through a large experiment (N=4,800), we found that presenting an Uncertainty Analogy or Posterior Visualization improved belief updating for proportion estimates compared to typical presentations of uncertainty for small datasets, and, in some cases, for large datasets, for which people tend to deviate more from normative inference. By comparing to visualizing uncertainty in the data via a shaded interval, we show that responding better to the new information captured by data may require more sophisticated, theoretically driven approaches like Bayesian cognition. Further, an aggregate-level analysis of updating suggested that prior elicitation alone may improve Bayesian reasoning. Our Bayesian framework can be applied to gain insight into belief formation, better define "normative" consumption of data visualizations, and guide interactions with data in a range of contexts.

ACKNOWLEDGEMENTS
This work was supported by NSF award

REFERENCES
[1] J. Abeler, D. Nosenzo, and C. Raymond. Preferences for truth-telling. Econometrica, 87(4):1115–1153, 2019.
[2] S. Ambuehl and S. Li. Belief updating and the demand for information. Games and Economic Behavior, 109:21–39, 2018.
[3] O. Armantier, G. Topa, W. Van der Klaauw, and B. Zafar. An overview of the survey of consumer expectations. Economic Policy Review, (23-2):51–72, 2017.
[4] L. Armona, A. Fuster, and B. Zafar. Home price expectations and behavior: Evidence from a randomized information experiment. Review of Economic Studies, forthcoming, 2017.
[5] P. Atanasov, J. Witkowski, L. Ungar, B. Mellers, and P. Tetlock. Small steps to accuracy: Incremental belief updaters are better forecasters. Organizational Behavior and Human Decision Processes, 160:19–35, 2020.
[6] N. Augenblick and M. Rabin. Belief movement, uncertainty reduction, and rational updating. UC Berkeley-Haas and Harvard University Mimeo, 2018.
[7] E. W. Austin and Q. Dong. Source v. content effects on judgments of news believability. Journalism Quarterly, 71(4):973–983, 1994.
[8] R. Bachmann, T. O. Berg, and E. R. Sims. Inflation expectations and readiness to spend: Cross-sectional evidence. American Economic Journal: Economic Policy, 7(1):1–35, 2015.
[9] M. Bailey, E. Dávila, T. Kuchler, and J. Stroebel. House price beliefs and mortgage leverage choice. The Review of Economic Studies, 86(6):2403–2452, 2018.
[10] V. Balara. Fox News poll: Voters split on abortion, but majority wants Roe v. Wade to endure, 2019.
[11] L. Bastin, P. F. Fisher, and J. Wood. Visualizing uncertainty in multi-spectral remotely sensed imagery. Computers & Geosciences, 28(3):337–350, 2002.
[12] D. J. Benjamin, M. Rabin, and C. Raymond. A model of nonbelief in the law of large numbers. Journal of the European Economic Association, 14(2):515–544, 2016.
[13] C. Binder and A. Rodrigue. Household informedness and long-run inflation expectations: Experimental evidence. Southern Economic Journal, 85(2):580–598, 2018.
[14] M. Bloch and H. Fairfield. For the elderly, diseases that overlap. The New York Times, Apr 15, 2013.
[15] R. Blom. Believing false political headlines and discrediting truthful political headlines: The interaction between news source trust and news content expectancy. Journalism, 0(0):1464884918765316, 0.
[16] T. Callaghan. Point of view: Why vaccine opponents think they know more than medical experts. Vital Record, News from Texas A&M Health, 2019.
[17] C. Camerer. Individual decision making. Handbook of Experimental Economics, 1995.
[18] A. Cavallo, G. Cruces, and R. Perez-Truglia. Inflation expectations, learning, and supermarket prices: Evidence from survey experiments. American Economic Journal: Macroeconomics, 9(3):1–35, 2017.
[19] J. Cohen. Statistical Power Analysis for the Behavioral Sciences. Routledge, 2013.
[20] O. Coibion, Y. Gorodnichenko, and S. Kumar. How do firms form their expectations? New survey evidence. American Economic Review, 108(9):2671–2713, 2018.
[21] W. G. Cole and J. E. Davidson. Graphic representation can lead to fast and accurate Bayesian reasoning. In Proceedings, Symposium on Computer Applications in Medical Care, pages 227–231. American Medical Informatics Association, 1989.
[22] M. Correll and M. Gleicher. Error bars considered harmful: Exploring alternate encodings for mean and error. IEEE Transactions on Visualization and Computer Graphics, 20(12):2142–2151, 2014.
[23] S. Dehaene. The Number Sense: How the Mind Creates Mathematics. OUP USA, 2011.
[24] J. Dominitz and C. F. Manski. Measuring and interpreting expectations of equity returns. Journal of Applied Econometrics, 26(3):352–370, 2011.
[25] F. D'Acunto, D. Hoang, and M. Weber. The effect of unconventional fiscal policy on consumption expenditure. Technical report, National Bureau of Economic Research, 2016.
[26] C. R. Ehlschlaeger, A. M. Shortridge, and M. F. Goodchild. Visualizing spatial data uncertainty using animation. Computers & Geosciences, 23(4):387–395, 1997.
[27] D. Feng, L. Kwock, Y. Lee, and R. Taylor. Matching visual saliency to confidence in plots of uncertain data. IEEE Transactions on Visualization and Computer Graphics, 16(6):980–989, 2010.
[28] M. Fernandes, L. Walls, S. Munson, J. Hullman, and M. Kay. Uncertainty displays using quantile dotplots or CDFs improve transit decision-making. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, page 144. ACM, 2018.
[29] W. R. Ferrell and P. J. McGoey. A model of calibration for subjective probabilities. Organizational Behavior and Human Performance, 26(1):32–53, 1980.
[30] A. Fuster, B. Hebert, and D. Laibson. Natural expectations, macroeconomic dynamics, and asset pricing. NBER Macroeconomics Annual, 26(1):1–48, 2012.
[31] R. Garcia-Retamero and U. Hoffrage. Visual representation of statistical information improves diagnostic inferences in doctors and their patients. Social Science & Medicine, 83:27–33, 2013.
[32] J. Geraghty. A small Biden slump? The National Review, 2019.
[33] G. Gigerenzer and U. Hoffrage. How to improve Bayesian reasoning without instruction: Frequency formats. Psychological Review, 102(4):684, 1995.
[34] T. L. Griffiths and J. B. Tenenbaum. Optimal predictions in everyday cognition. Psychological Science, 17(9):767–773, 2006.
[35] J. M. Hofman, D. G. Goldstein, and J. Hullman. How visualizing inferential uncertainty can mislead readers about treatment effects in scientific results. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems. ACM, 2020.
[36] J. Hullman. Why authors don't visualize uncertainty. IEEE Transactions on Visualization and Computer Graphics, 2019.
[37] J. Hullman, M. Kay, Y.-S. Kim, and S. Shrestha. Imagining replications: Graphical prediction & discrete visualizations improve recall & estimation of effect uncertainty. IEEE Transactions on Visualization and Computer Graphics, 24(1):446–456, 2018.
[38] J. Hullman, P. Resnick, and E. Adar. Hypothetical outcome plots outperform error bars and violin plots for inferences about reliability of variable ordering. PLOS ONE, 10(11):e0142444, 2015.
[39] A. Kale, F. Nguyen, M. Kay, and J. Hullman. Hypothetical outcome plots help untrained observers judge trends in ambiguous data. IEEE Transactions on Visualization and Computer Graphics, 2018.
[40] M. Kay, T. Kola, J. R. Hullman, and S. A. Munson. When (ish) is my bus? User-centered visualizations of uncertainty in everyday, mobile predictive systems. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pages 5092–5103. ACM, 2016.
[41] Y.-S. Kim, J. Hullman, and M. Agrawala. Generating personalized spatial analogies for distances and areas. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pages 38–48. ACM, 2016.
[42] Y.-S. Kim, K. Reinecke, and J. Hullman. Explaining the gap: Visualizing one's predictions improves recall and comprehension of data. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, pages 1375–1386. ACM, 2017.
[43] Y.-S. Kim, L. Walls, P. Krafft, and J. Hullman. A Bayesian cognition approach to improve data visualization. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. ACM, 2019.
[44] J. Kruschke. Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan. Academic Press, 2014.
[45] S. Kullback and R. A. Leibler. On information and sufficiency. The Annals of Mathematical Statistics, 22(1):79–86, 1951.
[46] C. F. Manski. Survey measurement of probabilistic macroeconomic expectations: Progress and promise. NBER Macroeconomics Annual, 32(1):411–471, 2018.
[47] N. McCurdy, J. Gerdes, and M. Meyer. A framework for externalizing implicit error using visualization. IEEE Transactions on Visualization and Computer Graphics, 25(1):925–935, 2018.
[48] R. McElreath. rethinking: An R package for fitting and manipulating Bayesian models, version 1.56, 2016.
[49] R. McElreath. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. CRC Press, 2016.
[50] K. O. McGraw and S. Wong. A common language effect size statistic. Psychological Bulletin, 111(2):361, 1992.
[51] L. Micallef, P. Dragicevic, and J.-D. Fekete. Assessing the effect of visualizations on Bayesian reasoning through crowdsourcing. IEEE Transactions on Visualization and Computer Graphics, 18(12):2536–2545, 2012.
[52] M. M. Moebius, M. Niederle, P. Niehaus, and T. S. Rosenblat. Managing self-confidence: Theory and experimental evidence. Technical report, National Bureau of Economic Research, 2011.
[53] A. Ottley, B. Metevier, P. Han, and R. Chang. Visually communicating Bayesian statistics to laypersons. Technical report, Tufts University, 2012.
[54] A. Ottley, E. M. Peck, L. T. Harrison, D. Afergan, C. Ziemkiewicz, H. A. Taylor, P. K. Han, and R. Chang. Improving Bayesian reasoning: The effects of phrasing, visualization, and spatial ability. IEEE Transactions on Visualization and Computer Graphics, 22(1):529–538, 2015.
[55] K. Potter, M. Kirby, D. Xiu, and C. R. Johnson. Interactive visualization of probability and cumulative density functions. International Journal for Uncertainty Quantification, 2(4), 2012.
[56] C. Roth and J. Wohlfart. How do expectations about the macroeconomy affect personal expectations and behavior? 2018.
[57] M. Scherer. Biden falls in new Democratic primary poll, as Warren and Sanders make slight gains. The Washington Post, 2019.
[58] A. Schotter and I. Trevino. Belief elicitation in the laboratory. Annual Review of Economics, 6(1):103–128, 2014.
[59] D. M. Sobel, J. B. Tenenbaum, and A. Gopnik. Children's causal inferences from indirect evidence: Backwards blocking and Bayesian reasoning in preschoolers. Cognitive Science, 28(3):303–333, 2004.
[60] M. Steyvers, J. B. Tenenbaum, E.-J. Wagenmakers, and B. Blum. Inferring causal networks from observations and interventions. Cognitive Science, 27(3):453–489, 2003.
[61] J. B. Tenenbaum, T. L. Griffiths, and C. Kemp. Theory-based Bayesian models of inductive learning and reasoning. Trends in Cognitive Sciences, 10(7):309–318, 2006.
[62] J. Tsai, S. Miller, and A. Kirlik. Interactive visualizations to improve Bayesian reasoning. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, volume 55, pages 385–389. SAGE Publications, 2011.
[63] A. Tversky and D. Kahneman. Belief in the law of small numbers. Psychological Bulletin, 76(2):105, 1971.
[64] Y. Wu, W. J. Shih, and D. F. Moore. Elicitation of a beta prior for Bayesian inference in clinical trials. Biometrical Journal, 50(2):212–223, 2008.
[65] Y. Wu, L. Xu, R. Chang, and E. Wu. Towards a Bayesian model of data visualization cognition, 2017.
[66] A. C. Zimmer. Verbal vs. numerical processing of subjective probabilities. In