Publication


Featured research published by Stefan M. Herzog.


Proceedings of the National Academy of Sciences of the United States of America | 2016

Boosting medical diagnostics by pooling independent judgments

Ralf H. J. M. Kurvers; Stefan M. Herzog; Ralph Hertwig; Jens Krause; Patricia A. Carney; Andy Bogart; Giuseppe Argenziano; Iris Zalaudek; Max Wolf

Significance Collective intelligence is considered to be one of the most promising approaches to improve decision making. However, up to now, little is known about the conditions underlying the emergence of collective intelligence in real-world contexts. Focusing on two key areas of medical diagnostics (breast and skin cancer detection), we here show that similarity in doctors’ accuracy is a key factor underlying the emergence of collective intelligence in these contexts. This result paves the way for innovative and more effective approaches to decision making in medical diagnostics and beyond, and to the scientific analyses of those approaches. Collective intelligence refers to the ability of groups to outperform individual decision makers when solving complex cognitive problems. Despite its potential to revolutionize decision making in a wide range of domains, including medical, economic, and political decision making, at present, little is known about the conditions underlying collective intelligence in real-world contexts. We here focus on two key areas of medical diagnostics, breast and skin cancer detection. Using a simulation study that draws on large real-world datasets, involving more than 140 doctors making more than 20,000 diagnoses, we investigate when combining the independent judgments of multiple doctors outperforms the best doctor in a group. We find that similarity in diagnostic accuracy is a key condition for collective intelligence: Aggregating the independent judgments of doctors outperforms the best doctor in a group whenever the diagnostic accuracy of doctors is relatively similar, but not when doctors’ diagnostic accuracy differs too much. This intriguingly simple result is highly robust and holds across different group sizes, performance levels of the best doctor, and collective intelligence rules. 
The enabling role of similarity, in turn, is explained by its systematic effects on the number of correct and incorrect decisions of the best doctor that are overruled by the collective. By identifying a key factor underlying collective intelligence in two important real-world contexts, our findings pave the way for innovative and more effective approaches to complex real-world decision making, and to the scientific analyses of those approaches.
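The pooling logic the abstract describes can be sketched in a toy simulation. This is not the study's code: the accuracy values, the group sizes, and the simple majority rule below are illustrative assumptions, chosen only to show why similar accuracies favor the collective while dissimilar accuracies favor the best individual.

```python
import random

def simulate_pooling(accuracies, n_cases=10_000, seed=0):
    """Toy simulation in the spirit of the study: each doctor independently
    makes a correct call with probability equal to her own accuracy; the
    group answer is the majority vote (ties broken at random).
    Returns (group_accuracy, best_individual_accuracy)."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_cases):
        votes = [rng.random() < a for a in accuracies]
        n_right = sum(votes)
        if n_right * 2 > len(votes):
            correct += 1
        elif n_right * 2 == len(votes):
            correct += rng.random() < 0.5
    return correct / n_cases, max(accuracies)

# Similar accuracies: pooling tends to beat the best doctor.
similar = simulate_pooling([0.72, 0.70, 0.68])
# Very dissimilar accuracies: pooling can drag the best doctor down.
dissimilar = simulate_pooling([0.90, 0.55, 0.52])
```

With similar accuracies the majority vote outperforms the best member (analytically about 0.78 vs. 0.72 here); with one far-superior member, the two weaker voters overrule the best doctor too often.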


Trends in Cognitive Sciences | 2014

Harnessing the wisdom of the inner crowd

Stefan M. Herzog; Ralph Hertwig

Ever since Galton's classic demonstration of the wisdom of crowds in estimating the weight of a slaughtered ox, scholars of the mind and the public alike have been fascinated by the counterintuitive accuracy achieved by simply averaging a number of people's estimates. Surprisingly, individuals can, to some extent, harness the wisdom of crowds within the confines of their own mind by averaging self-generated, nonredundant estimates.


Experimental Psychology | 2014

Haunted by a doppelgänger: irrelevant facial similarity affects rule-based judgments.

Bettina von Helversen; Stefan M. Herzog; Jörg Rieskamp

Judging other people is a common and important task. Every day professionals make decisions that affect the lives of other people when they diagnose medical conditions, grant parole, or hire new employees. To prevent discrimination, professional standards require that decision makers render accurate and unbiased judgments solely based on relevant information. Facial similarity to previously encountered persons can be a potential source of bias. Psychological research suggests that people only rely on similarity-based judgment strategies if the provided information does not allow them to make accurate rule-based judgments. Our study shows, however, that facial similarity to previously encountered persons influences judgment even in situations in which relevant information is available for making accurate rule-based judgments and where similarity is irrelevant for the task and relying on similarity is detrimental. In two experiments in an employment context we show that applicants who looked similar to high-performing former employees were judged as more suitable than applicants who looked similar to low-performing former employees. This similarity effect was found despite the fact that the participants used the relevant résumé information about the applicants by following a rule-based judgment strategy. These findings suggest that similarity-based and rule-based processes simultaneously underlie human judgment.


Medical Decision Making | 2014

Surrogate Decision Making: Do We Have to Trade Off Accuracy and Procedural Satisfaction?

Renato Frey; Ralph Hertwig; Stefan M. Herzog

Objective. Making surrogate decisions on behalf of incapacitated patients can raise difficult questions for relatives, physicians, and society. Previous research has focused on the accuracy of surrogate decisions (i.e., the proportion of correctly inferred preferences). Less attention has been paid to the procedural satisfaction that patients’ surrogates and patients attribute to specific approaches to making surrogate decisions. The objective was to investigate hypothetical patients’ and surrogates’ procedural satisfaction with specific approaches to making surrogate decisions and whether implementing these preferences would lead to tradeoffs between procedural satisfaction and accuracy. Methods. Study 1 investigated procedural satisfaction by assigning participants (618 in a mixed-age but relatively young online sample and 50 in an older offline sample) to the roles of hypothetical surrogates or patients. Study 2 (involving 64 real multigenerational families with a total of 253 participants) investigated accuracy using 24 medical scenarios. Results. Hypothetical patients and surrogates had closely aligned preferences: Procedural satisfaction was highest with a patient-designated surrogate, followed by shared surrogate decision-making approaches and legally assigned surrogates. These approaches did not differ substantially in accuracy. Limitations are that participants’ preferences regarding existing and novel approaches to making surrogate decisions can only be elicited under hypothetical conditions. Conclusions. Next to decision making by patient-designated surrogates, shared surrogate decision making is the preferred approach among patients and surrogates alike. This approach appears to impose no tradeoff between procedural satisfaction and accuracy. 
Therefore, shared decision making should be further studied in representative samples of the general population, and if people’s preferences prove to be robust, they deserve to be weighted more strongly in legal frameworks in addition to patient-designated surrogates.


Medical Decision Making | 2017

The Potential of Collective Intelligence in Emergency Medicine: Pooling Medical Students’ Independent Decisions Improves Diagnostic Performance

Juliane E. Kämmer; Wolf E. Hautz; Stefan M. Herzog; Olga Kunina-Habenicht; Ralf H. J. M. Kurvers

Background. Evidence suggests that pooling multiple independent diagnoses can improve diagnostic accuracy in well-defined tasks. We investigated whether this is also the case for diagnostics in emergency medicine, an ill-defined task environment where diagnostic errors are rife. Methods. A computer simulation study was conducted based on empirical data from 2 published experimental studies. In the computer experiments, 285 medical students independently diagnosed 6 simulated patients arriving at the emergency room with dyspnea. Participants’ diagnoses (n = 1,710), confidence ratings, and expertise levels were entered into a computer simulation. Virtual groups of different sizes were randomly created, and 3 collective intelligence rules (follow-the-plurality rule, follow-the-most-confident rule, and follow-the-most-senior rule) were applied to combine the independent decisions into a final diagnosis. For different group sizes, the performance levels (i.e., percentage of correct diagnoses) of the 3 collective intelligence rules were compared with each other and against the average individual accuracy. Results. For all collective intelligence rules, combining independent decisions substantially increased performance relative to average individual performance. For groups of 4 or fewer, the follow-the-most-confident rule outperformed the other rules; for larger groups, the follow-the-plurality rule performed best. For example, combining 5 independent decisions using the follow-the-plurality rule increased diagnostic accuracy by 22 percentage points. These results were robust across case difficulty and expertise level. Limitations of the study include the use of simulated patients diagnosed by medical students. Whether results generalize to clinical practice is currently unknown. Conclusion. Combining independent decisions may substantially improve the quality of diagnoses in emergency medicine and may thus enhance patient safety.
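The three collective intelligence rules named in the abstract can be sketched as follows. The diagnoses, confidence ratings, and seniority values below are made up for illustration; they are not taken from the study's data.

```python
from collections import Counter
import random

def follow_the_plurality(diagnoses, rng=random):
    """Pick the most frequent diagnosis; break ties at random."""
    counts = Counter(diagnoses)
    top = max(counts.values())
    return rng.choice([d for d, c in counts.items() if c == top])

def follow_the_most_confident(diagnoses, confidences):
    """Adopt the diagnosis of the member with the highest confidence."""
    return diagnoses[max(range(len(diagnoses)), key=lambda i: confidences[i])]

def follow_the_most_senior(diagnoses, seniority):
    """Adopt the diagnosis of the most senior member."""
    return diagnoses[max(range(len(diagnoses)), key=lambda i: seniority[i])]

# One virtual group of five students (illustrative values only):
dx = ["pneumonia", "heart failure", "pneumonia", "pneumonia", "COPD"]
conf = [0.6, 0.9, 0.7, 0.8, 0.5]   # self-rated confidence
sen = [2, 5, 1, 3, 4]              # e.g., semesters of training

plurality = follow_the_plurality(dx)            # "pneumonia" (3 of 5 votes)
confident = follow_the_most_confident(dx, conf) # "heart failure"
senior = follow_the_most_senior(dx, sen)        # "heart failure"
```

The study's finding maps directly onto these functions: for small virtual groups the confidence rule won; from about five members upward the plurality rule took over.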


Psychological Science | 2013

The Crowd Within and the Benefits of Dialectical Bootstrapping: A Reply to White and Antonakis (2013)

Stefan M. Herzog; Ralph Hertwig

Can the “wisdom of crowds” (Surowiecki, 2004) be exploited within a single mind? Yes, one can increase accuracy by averaging multiple estimates from the same person (Herzog & Hertwig, 2009; Hourihan & Benjamin, 2010; Müller-Trede, 2011; Rauhut & Lorenz, 2011; Stroop, 1932; Vul & Pashler, 2008; White & Antonakis, 2013; Winkler & Clemen, 2004). We proposed boosting this crowd-within effect with what we called dialectical bootstrapping (Herzog & Hertwig, 2009; hereafter, H&H): averaging a person’s first estimate with his or her second, “dialectical” estimate, derived from knowledge and assumptions different from those motivating the first estimate. A dialectical estimate ideally has an error with a different sign relative to the first estimate—which fosters the chance of error cancellation. There are different ways to elicit a dialectical estimate. We tested one, the consider-the-opposite strategy (Lord, Lepper, & Preston, 1984), and found that averaging first and dialectical estimates improved accuracy more than simply asking people to make an estimate anew and averaging the two estimates (i.e., reliability condition). White and Antonakis (2013; hereafter, W&A) reanalyzed our data using a different accuracy measure, concluding that “dialectical instructions are not needed to achieve the wisdom of many in one mind” (p. 116). Here, we delineate where we agree and disagree with W&A. We concur with W&A that the crowd within works. W&A observed (as have we and other researchers) that averaging two estimates from the same person improves accuracy. Moreover, they obtained this result across different measures of accuracy. We also agree with W&A that “dialectical instructions are not needed to achieve the wisdom of many in one mind” (p. 116); in our previous article, we pointed out (H&H, p. 236) that passage of time appears to be enough to boost the gains obtained by averaging (Vul & Pashler, 2008). 
Additionally, we highlighted that “accuracy in [our] reliability condition increased as a result of aggregation” (p. 234). Our disagreement with W&A concerns the following question: Can dialectical bootstrapping boost the crowd-within effect beyond the gains observed in the reliability condition (i.e., gains expected to occur when averaging any noisy estimates)?

Dialectical Bootstrapping: Does It Have Surplus Value?
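The core averaging step of dialectical bootstrapping is simple enough to sketch. The numbers below are illustrative, not from the paper: they merely show the error-cancellation argument when the two estimates err in opposite directions.

```python
def dialectical_average(first_estimate, dialectical_estimate):
    """Average a person's first estimate with a second, 'dialectical'
    estimate generated from different knowledge and assumptions.
    If the two errors have opposite signs, they partly cancel."""
    return (first_estimate + dialectical_estimate) / 2

# Illustrative numbers: true value 100, first estimate overshoots,
# the dialectical estimate undershoots.
truth = 100
first, dialectical = 120, 90
avg = dialectical_average(first, dialectical)   # 105.0
# The averaged error (5) is smaller than the first estimate's error (20).
```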


JAMA Dermatology | 2017

Sun Protection Factor Communication of Sunscreen Effectiveness: A Web-Based Study of Perception of Effectiveness by Dermatologists

Stefan M. Herzog; Henry W. Lim; Melissa S. Williams; Isa D. de Maddalena; Uli Osterwalder; Christian Surber

The sun protection factor (SPF) is commonly used to convey a sunscreen’s effectiveness in protecting against UV radiation that causes sunburn (ie, erythema-inducing radiation [EIR]).1 Importantly, the EIR burden depends on the proportion of EIR actually transmitted through the sunscreen to the skin (% EIR transmitted) and not on the proportion of EIR absorbed by the sunscreen (% EIR absorbed). Doubling SPF from, say, 30 to 60 halves % EIR transmitted from 3.3% to 1.7%, thus doubling protection2 (Figure, A). Unfortunately, however, media and health professionals often incorrectly state that SPFs beyond 30 offer only minor improvements in sun protection, arguing that the increase in % EIR absorbed by the sunscreen is less pronounced than the corresponding increase in SPF values3,4 (eg, 96.7% vs 98.3% for SPF 30 vs SPF 60). However, only changes in % EIR transmitted directly relate to changes in SPF; changes in % EIR absorbed do not. In this study, we evaluated whether dermatology experts are able to adequately assess improvements in sunscreen effectiveness based on the following information formats: SPF vs % EIR absorbed vs % EIR transmitted.
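The arithmetic behind the abstract's figures follows from the definition of SPF: under idealized, standardized conditions the transmitted fraction of EIR is roughly 1/SPF. A minimal sketch:

```python
def eir_transmitted(spf):
    """Fraction of erythema-inducing radiation (EIR) reaching the skin:
    by the definition of SPF, roughly 1/SPF under idealized conditions."""
    return 1.0 / spf

def eir_absorbed(spf):
    """Fraction of EIR absorbed by the sunscreen: 1 - 1/SPF."""
    return 1.0 - 1.0 / spf

# Doubling SPF from 30 to 60 halves the transmitted EIR (3.3% -> 1.7%),
# even though the absorbed fraction barely moves (96.7% -> 98.3%).
t30, t60 = eir_transmitted(30), eir_transmitted(60)
a30, a60 = eir_absorbed(30), eir_absorbed(60)
```

This is why reasoning from % EIR absorbed is misleading: the absorbed fraction saturates near 100%, while the transmitted fraction, which is what actually burns, keeps halving with every doubling of SPF.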


Proceedings of the National Academy of Sciences of the United States of America | 2017

Reach and speed of judgment propagation in the laboratory

Mehdi Moussaïd; Stefan M. Herzog; Juliane E. Kämmer; Ralph Hertwig

Significance Individual judgments, feelings, and behaviors can spread from person to person in social networks, similarly to the propagation of infectious diseases. Despite major implications for many social phenomena, the underlying social-contagion processes are poorly understood. We examined how participants’ perceptual judgments spread from one person to another and across diffusion chains. We gauged the speed, reach, and scale of social contagion. Judgment propagation tended to slow down with increasing social distance from the source. Crucially, it vanished beyond a social horizon of three to four people. These results advance the understanding of some of the mechanisms underlying social-contagion phenomena as well as their scope across domains as diverse as political mobilization, health practices, and emotions. In recent years, a large body of research has demonstrated that judgments and behaviors can propagate from person to person. Phenomena as diverse as political mobilization, health practices, altruism, and emotional states exhibit similar dynamics of social contagion. The precise mechanisms of judgment propagation are not well understood, however, because it is difficult to control for confounding factors such as homophily or dynamic network structures. We introduce an experimental design that renders possible the stringent study of judgment propagation. In this design, experimental chains of individuals can revise their initial judgment in a visual perception task after observing a predecessor’s judgment. The positioning of a very good performer at the top of a chain created a performance gap, which triggered waves of judgment propagation down the chain. We evaluated the dynamics of judgment propagation experimentally. Despite strong social influence within pairs of individuals, the reach of judgment propagation across a chain rarely exceeded a social distance of three to four degrees of separation. 
Furthermore, computer simulations showed that the speed of judgment propagation decayed exponentially with the social distance from the source. We show that information distortion and the overweighting of other people’s errors are two individual-level mechanisms hindering judgment propagation at the scale of the chain. Our results contribute to the understanding of social-contagion processes, and our experimental method offers numerous new opportunities to study judgment propagation in the laboratory.


Nature Human Behaviour | 2018

Publisher Correction: Social learning strategies for matters of taste

Pantelis P. Analytis; Daniel Barkoczi; Stefan M. Herzog

The version of the Supplementary Information file that was originally published with this Article was not the latest version provided by the authors. In the captions of Supplementary Figs. 2 and 8, the median standard error values were reported to be 0.0028 in both cases; instead, in both instances, the values should have been 0.0015. These have now been updated and the Supplementary Information file replaced.


Nature Human Behaviour | 2018

Social learning strategies for matters of taste

Pantelis P. Analytis; Daniel Barkoczi; Stefan M. Herzog

Most choices people make are about ‘matters of taste’, on which there is no universal, objective truth. Nevertheless, people can learn from the experiences of individuals with similar tastes who have already evaluated the available options—a potential harnessed by recommender systems. We mapped recommender system algorithms to models of human judgement and decision-making about ‘matters of fact’ and recast the latter as social learning strategies for matters of taste. Using computer simulations on a large-scale, empirical dataset, we studied how people could leverage the experiences of others to make better decisions. Our simulations showed that experienced individuals can benefit from relying mostly on the opinions of seemingly similar people; by contrast, inexperienced individuals cannot reliably estimate similarity and are better off picking the mainstream option despite differences in taste. Crucially, the level of experience beyond which people should switch to similarity-heavy strategies varies substantially across individuals and depends on how mainstream (or alternative) an individual’s tastes are and on the level of dispersion in taste similarity with the other people in the group.

Analytis et al. study social learning strategies for matters of taste and test their performance on a large-scale dataset. They show why a strategy’s success depends both on people’s level of experience and on how their tastes relate to those of others.
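A similarity-weighted strategy of the kind described above can be sketched as follows. The similarity measure, the 1–5 rating scale, and the fallback rule are simplified illustrative assumptions, not the authors' algorithm: the point is only the contrast between an experienced rater, who can weight similar peers, and a novice, who has no rating history and effectively must take the plain average (the "mainstream option").

```python
def taste_similarity(my_ratings, their_ratings):
    """Crude taste similarity: 1 minus mean absolute disagreement on
    items both have rated (1-5 scale); 0 if there is no overlap."""
    shared = [i for i in my_ratings if i in their_ratings]
    if not shared:
        return 0.0
    gap = sum(abs(my_ratings[i] - their_ratings[i]) for i in shared) / len(shared)
    return max(0.0, 1.0 - gap / 4.0)   # 4 = largest possible gap on a 1-5 scale

def predict_rating(item, my_ratings, others):
    """Similarity-weighted average of others' ratings of `item`; falls
    back to the plain average when all similarities are zero."""
    rated = [(taste_similarity(my_ratings, o), o[item]) for o in others if item in o]
    total_sim = sum(s for s, _ in rated)
    if total_sim == 0:
        return sum(r for _, r in rated) / len(rated)
    return sum(s * r for s, r in rated) / total_sim

me = {"a": 5, "b": 1}
peers = [{"a": 5, "b": 1, "c": 5},   # agrees with me on a and b
         {"a": 1, "b": 5, "c": 1}]   # disagrees with me on both

experienced = predict_rating("c", me, peers)  # similar peer dominates: 5.0
novice = predict_rating("c", {}, peers)       # plain average fallback: 3.0
```

With a rating history, the prediction follows the like-minded peer; without one, similarity cannot be estimated and the mainstream average is all that remains.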
