Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Todd Kulesza is active.

Publication


Featured research published by Todd Kulesza.


intelligent user interfaces | 2009

Fixing the program my computer learned: barriers for end users, challenges for the machine

Todd Kulesza; Weng-Keen Wong; Simone Stumpf; Stephen Perona; Rachel White; Margaret M. Burnett; Ian Oberst; Andrew J. Ko

The result of machine learning from user behavior can be thought of as a program, and like all programs, it may need to be debugged. Providing ways for the user to debug it matters, because without the ability to fix errors, users may find that the learned program's errors are too damaging for them to be able to trust such programs. We present a new approach to enable end users to debug a learned program. We then use an early prototype of our new approach to conduct a formative study to determine where and when debugging issues arise, both in general and separately for males and females. The results suggest opportunities to make machine-learned programs more effective tools.


ACM Transactions on Interactive Intelligent Systems | 2011

Why-oriented end-user debugging of naive Bayes text classification

Todd Kulesza; Simone Stumpf; Weng-Keen Wong; Margaret M. Burnett; Stephen Perona; Andrew J. Ko; Ian Oberst

Machine learning techniques are increasingly used in intelligent assistants, that is, software targeted at and continuously adapting to assist end users with email, shopping, and other tasks. Examples include desktop spam filters, recommender systems, and handwriting recognition. Fixing such intelligent assistants when they learn incorrect behavior, however, has received only limited attention. To directly support end-user “debugging” of assistant behaviors learned via statistical machine learning, we present a Why-oriented approach which allows users to ask questions about how the assistant made its predictions, provides answers to these “why” questions, and allows users to interactively change these answers to debug the assistant's current and future predictions. To understand the strengths and weaknesses of this approach, we then conducted an exploratory study to investigate barriers that participants could encounter when debugging an intelligent assistant using our approach, and the information those participants requested to overcome these barriers. To help ensure the inclusiveness of our approach, we also explored how gender differences played a role in understanding barriers and information needs. We then used these results to consider opportunities for Why-oriented approaches to address user barriers and information needs.
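To make the “why” answers concrete, the sketch below shows how a multinomial naive Bayes text classifier's per-word log-likelihoods can be surfaced as evidence a user could inspect when asking why a message was classified a certain way. This is a minimal illustration, not the paper's implementation; the tiny corpus, the Laplace smoothing, and the function names are assumptions for the example.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (word_list, label). Returns class log-priors and per-word log-likelihoods."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for words, label in docs:
        word_counts[label].update(words)
    vocab = {w for words, _ in docs for w in words}
    log_prior = {c: math.log(n / len(docs)) for c, n in class_counts.items()}
    log_like = {}
    for c in class_counts:
        total = sum(word_counts[c].values()) + len(vocab)  # Laplace smoothing
        log_like[c] = {w: math.log((word_counts[c][w] + 1) / total) for w in vocab}
    return log_prior, log_like

def why(message_words, log_prior, log_like):
    """Per-word evidence toward each class plus overall scores: raw material for a 'why' answer."""
    vocab = next(iter(log_like.values()))
    evidence = [(w, {c: log_like[c][w] for c in log_like}) for w in message_words if w in vocab]
    scores = {c: log_prior[c] + sum(e[1][c] for e in evidence) for c in log_prior}
    return scores, evidence

docs = [("cheap pills now".split(), "spam"),
        ("meeting agenda attached".split(), "ham"),
        ("cheap pills meeting".split(), "spam")]
log_prior, log_like = train_nb(docs)
scores, evidence = why("cheap meeting".split(), log_prior, log_like)
print(scores)                      # overall log-score per class
for word, contribution in evidence:
    print(word, contribution)      # the per-word answer to "why spam?"
```

Exposing, and then letting users adjust, exactly this kind of per-word evidence is the sort of interactive answer a Why-oriented approach targets.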


symposium on visual languages and human-centric computing | 2008

Can feature design reduce the gender gap in end-user software development environments?

Valentina Grigoreanu; Jill Cao; Todd Kulesza; Christopher Bogart; Kyle Rector; Margaret M. Burnett; Susan Wiedenbeck

Recent research has begun to report that female end-user programmers are often more reluctant than males to employ features that are useful for testing and debugging. These earlier findings suggest that, unless such features can be changed in some appropriate way, there are likely to be important gender differences in end-user programmers' benefits from these features. In this paper, we compare end-user programmers' feature usage in an environment that supports end-user debugging, against an extension of the same environment with two features designed to help ameliorate the effects of low self-efficacy. Our results show ways in which these features affect female versus male end-user programmers' self-efficacy, attitudes, usage of testing and debugging features, and performance.


human factors in computing systems | 2014

Structured labeling for facilitating concept evolution in machine learning

Todd Kulesza; Saleema Amershi; Rich Caruana; Danyel Fisher; Denis X. Charles

Labeling data is a seemingly simple task required for training many machine learning systems, but is actually fraught with problems. This paper introduces the notion of concept evolution, the changing nature of a person's underlying concept (the abstract notion of the target class a person is labeling for, e.g., spam email or travel-related web pages), which can result in inconsistent labels and thus be detrimental to machine learning. We introduce two structured labeling solutions, novel techniques we propose for helping people define and refine their concept in a consistent manner as they label. Through a series of five experiments, including a controlled lab study, we illustrate the impact and dynamics of concept evolution in practice and show that structured labeling helps people label more consistently in the presence of concept evolution than traditional labeling.
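As a rough illustration of the idea, and not the tool from the paper, structured labeling can be thought of as letting people park items in named groups and attach one label per group, so that when the concept evolves, relabeling the group keeps every item in it consistent. The class and method names below are assumptions for the sketch.

```python
from dataclasses import dataclass, field

@dataclass
class Group:
    name: str
    label: str              # e.g. "spam", "not spam", or "not sure"
    items: list = field(default_factory=list)

class StructuredLabeler:
    def __init__(self):
        self.groups = {}

    def add_group(self, name, label):
        self.groups[name] = Group(name, label)

    def assign(self, item, group_name):
        self.groups[group_name].items.append(item)

    def relabel_group(self, group_name, new_label):
        # Concept evolution: the user decides "travel deals" now count as spam.
        self.groups[group_name].label = new_label

    def labels(self):
        # Flatten groups into (item, label) pairs handed to the learner.
        return [(item, g.label) for g in self.groups.values() for item in g.items]

labeler = StructuredLabeler()
labeler.add_group("travel deals", "not sure")
labeler.assign("50% off flights to Rome", "travel deals")
labeler.relabel_group("travel deals", "spam")
print(labeler.labels())   # [('50% off flights to Rome', 'spam')]
```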


human factors in computing systems | 2012

Tell me more?: the effects of mental model soundness on personalizing an intelligent agent

Todd Kulesza; Simone Stumpf; Margaret M. Burnett; Irwin Kwan

What does a user need to know to productively work with an intelligent agent? Intelligent agents and recommender systems are gaining widespread use, potentially creating a need for end users to understand how these systems operate in order to fix their agent's personalized behavior. This paper explores the effects of mental model soundness on such personalization by providing structural knowledge of a music recommender system in an empirical study. Our findings show that participants were able to quickly build sound mental models of the recommender system's reasoning, and that participants who most improved their mental models during the study were significantly more likely to make the recommender operate to their satisfaction. These results suggest that by helping end users understand a system's reasoning, intelligent agents may elicit more and better feedback, thus more closely aligning their output with each user's intentions.


symposium on visual languages and human-centric computing | 2010

Explanatory Debugging: Supporting End-User Debugging of Machine-Learned Programs

Todd Kulesza; Simone Stumpf; Margaret M. Burnett; Weng-Keen Wong; Yann Riche; Travis Moore; Ian Oberst; Amber Shinsel; Kevin McIntosh

Many machine-learning algorithms learn rules of behavior from individual end users, such as task-oriented desktop organizers and handwriting recognizers. These rules form a “program” that tells the computer what to do when future inputs arrive. Little research has explored how an end user can debug these programs when they make mistakes. We present our progress toward enabling end users to debug these learned programs via a Natural Programming methodology. We began with a formative study exploring how users reason about and correct a text-classification program. From the results, we derived and prototyped a concept based on “explanatory debugging”, then empirically evaluated it. Our results contribute methods for exposing a learned program’s logic to end users and for eliciting user corrections to improve the program’s predictions.
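A minimal sketch of the correction half of that loop, with an assumed weight table and adjustment rule rather than the prototype's actual model: the learned program's per-feature weights are shown to the user, and the user's tweak is folded back into future predictions.

```python
# Toy learned "program": word weights pushing a message toward "work" vs. "personal".
weights = {
    "meeting": +2.0,
    "invoice": +1.5,
    "party":   -1.0,
}

def predict(message_words):
    # Sum the visible per-word weights; positive means "work".
    score = sum(weights.get(w, 0.0) for w in message_words)
    return ("work" if score > 0 else "personal"), score

def user_correction(word, delta):
    """The end user says, e.g., 'party should count more toward personal'."""
    weights[word] = weights.get(word, 0.0) + delta

print(predict(["party", "invoice"]))   # ('work', 0.5) before the correction
user_correction("party", -1.0)
print(predict(["party", "invoice"]))   # ('personal', -0.5) after the correction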


IEEE Transactions on Software Engineering | 2014

You Are the Only Possible Oracle: Effective Test Selection for End Users of Interactive Machine Learning Systems

Alex Groce; Todd Kulesza; Chaoqiang Zhang; Shalini Shamasunder; Margaret M. Burnett; Weng-Keen Wong; Simone Stumpf; Shubhomoy Das; Amber Shinsel; Forrest Bice; Kevin McIntosh

How do you test a program when only a single user, with no expertise in software testing, is able to determine if the program is performing correctly? Such programs are common today in the form of machine-learned classifiers. We consider the problem of testing this common kind of machine-generated program when the only oracle is an end user: e.g., only you can determine if your email is properly filed. We present test selection methods that provide very good failure-detection rates even for small test suites, and show that these methods work both in large-scale random experiments using a “gold standard” and in studies with real users. Our methods are inexpensive and largely algorithm-independent. Key to our methods is an exploitation of properties of classifiers that is not possible in traditional software testing. Our results suggest that it is plausible for time-pressured end users to interactively detect failures, even very hard-to-find failures, without wading through a large number of successful (and thus less useful) tests. We additionally show that some methods are able to find the arguably most difficult-to-detect faults of classifiers: cases where machine learning algorithms have high confidence in an incorrect result.
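One strategy in the spirit of the abstract, sketched here with assumed data and function names rather than the paper's exact methods, is to exploit the classifier's own confidence scores and hand the end user the least-confident predictions to check first.

```python
def least_confident_first(items, predict_proba, budget):
    """Return up to `budget` items the classifier is least sure about.

    predict_proba(item) -> dict mapping class name to probability.
    """
    def confidence(item):
        return max(predict_proba(item).values())
    return sorted(items, key=confidence)[:budget]

# Toy classifier output: spam probability per email subject line.
fake_proba = {
    "cheap pills now":       {"spam": 0.97, "ham": 0.03},
    "lunch on friday?":      {"spam": 0.10, "ham": 0.90},
    "re: cheap lunch offer": {"spam": 0.55, "ham": 0.45},
}

to_review = least_confident_first(fake_proba.keys(),
                                  lambda s: fake_proba[s],
                                  budget=1)
print(to_review)   # ['re: cheap lunch offer'] -- the borderline case goes to the user
```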


human factors in computing systems | 2016

Human-Centred Machine Learning

Marco Gillies; Rebecca Fiebrink; Atau Tanaka; Jérémie Garcia; Frédéric Bevilacqua; Alexis Heloir; Fabrizio Nunnari; Wendy E. Mackay; Saleema Amershi; Bongshin Lee; Nicolas D'Alessandro; Joëlle Tilmanne; Todd Kulesza; Baptiste Caramiaux

Machine learning is one of the most important and successful techniques in contemporary computer science. It involves the statistical inference of models (such as classifiers) from data. It is often conceived in a very impersonal way, with algorithms working autonomously on passively collected data. However, this viewpoint hides considerable human work of tuning the algorithms, gathering the data, and even deciding what should be modeled in the first place. Examining machine learning from a human-centered perspective includes explicitly recognising this human work, as well as reframing machine learning workflows based on situated human working practices, and exploring the co-adaptation of humans and systems. A human-centered understanding of machine learning in human context can lead not only to more usable machine learning tools, but to new ways of framing learning computationally. This workshop will bring together researchers to discuss these issues and suggest future research questions aimed at creating a human-centered approach to machine learning.


intelligent user interfaces | 2012

Towards recognizing "cool": can end users help computer vision recognize subjective attributes of objects in images?

William Curran; Travis Moore; Todd Kulesza; Weng-Keen Wong; Sinisa Todorovic; Simone Stumpf; Rachel White; Margaret M. Burnett

Recent computer vision approaches are aimed at richer image interpretations that extend the standard recognition of objects in images (e.g., cars) to also recognize object attributes (e.g., cylindrical, has-stripes, wet). However, the more idiosyncratic and abstract the notion of an object attribute (e.g., cool car), the more challenging the task of attribute recognition. This paper considers whether end users can help vision algorithms recognize highly idiosyncratic attributes, referred to here as subjective attributes. We empirically investigated how end users recognized three subjective attributes of cars: cool, cute, and classic. Our results suggest the feasibility of vision algorithms recognizing subjective attributes of objects, but an interactive approach beyond standard supervised learning from labeled training examples is needed.


Proceedings of the 2009 ICSE Workshop on Software Engineering Foundations for End User Programming | 2009

End-user software engineering and distributed cognition

Margaret M. Burnett; Christopher Bogart; Jill Cao; Valentina Grigoreanu; Todd Kulesza; Joseph Lawrance

End-user programmers may not be aware of many software engineering practices that would add greater discipline to their efforts, and even if they are aware of them, these practices may seem too costly (in terms of time) to use. Without taking advantage of at least some of these practices, the software these end users create seems likely to continue to be less reliable than it could be. We are working on several ways of lowering both the perceived and actual costs of systematic software engineering practices, and on making their benefits more visible and immediate. Our approach is to leverage the user's cognitive effort through the use of distributed cognition, in which the system and user collaboratively work to reason systematically about the program the end user is creating. This paper demonstrates this concept with a few of our past efforts, and then presents three of our current efforts in this direction.

Collaboration


Dive into Todd Kulesza's collaborations.

Top Co-Authors

Ian Oberst, Oregon State University

Alex Groce, Oregon State University

Forrest Bice, Oregon State University

Jill Cao, Oregon State University