Ethan Fast | Researchain

Publication

Featured research published by Ethan Fast.

genetic and evolutionary computation conference | 2010

Designing better fitness functions for automated program repair

Ethan Fast; Claire Le Goues; Stephanie Forrest; Westley Weimer

Evolutionary methods have been used to repair programs automatically, with promising results. However, the fitness function used to achieve these results was based on a few simple test cases and is likely too simplistic for larger programs and more complex bugs. We focus here on two aspects of fitness evaluation: efficiency and precision. Efficiency is an issue because many programs have hundreds of test cases, and it is costly to run each test on every individual in the population. Moreover, the precision of fitness functions based on test cases is limited by the fact that a program either passes a test case, or does not, which leads to a fitness function that can take on only a few distinct values. This paper investigates two approaches to enhancing fitness functions for program repair, incorporating (1) test suite selection to improve efficiency and (2) formal specifications to improve precision. We evaluate test suite selection on 10 programs, improving running time for automated repair by 81%. We evaluate program invariants using the Fitness Distance Correlation (FDC) metric, demonstrating significant improvements and smoother evolution of repairs.
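The efficiency idea in this abstract, scoring each candidate on only part of the test suite, can be illustrated with a minimal sketch. The abstract does not say how tests are selected, so uniform random sampling is used here purely as an assumption, and the names `sampled_fitness`, `Test`, and `sample_size` are hypothetical rather than taken from the paper.

```python
import random
from typing import Callable, Sequence

# Assumption: a test is a function that takes a candidate program
# and returns True (pass) or False (fail).
Test = Callable[[object], bool]


def sampled_fitness(candidate: object, tests: Sequence[Test],
                    sample_size: int, rng: random.Random) -> float:
    """Return the fraction of a random subset of tests that the candidate passes.

    Running only `sample_size` tests per candidate trades some accuracy
    for a lower evaluation cost than running the full suite.
    """
    k = min(sample_size, len(tests))
    if k == 0:
        return 0.0
    subset = rng.sample(list(tests), k)  # uniform sampling without replacement
    passed = sum(1 for test in subset if test(candidate))
    return passed / k
```

When `sample_size` equals the suite size, the score is the ordinary pass fraction; smaller values give a cheaper estimate of it.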

human factors in computing systems | 2014

Emergent, crowd-scale programming practice in the IDE

Ethan Fast; Daniel Steffee; Lucy Wang; Joel Brandt; Michael S. Bernstein

While emergent behaviors are uncodified across many domains such as programming and writing, interfaces need explicit rules to support users. We hypothesize that by codifying emergent programming behavior, software engineering interfaces can support a far broader set of developer needs. To explore this idea, we built Codex, a knowledge base that records common practice for the Ruby programming language by indexing over three million lines of popular code. Codex enables new data-driven interfaces for programming systems: statistical linting, identifying code that is unlikely to occur in practice and may constitute a bug; pattern annotation, automatically discovering common programming idioms and annotating them with metadata using expert crowdsourcing; and library generation, constructing a utility package that encapsulates and reflects emergent software practice. We evaluate these applications to find Codex captures a broad swath of programming practice, statistical linting detects problematic code snippets, and pattern annotation discovers nontrivial idioms such as basic HTTP authentication and database migration templates. Our work suggests that operationalizing practice-driven knowledge in structured domains such as programming can enable a new class of user interfaces.

user interface software and technology | 2013

Crowd-scale interactive formal reasoning and analytics

Ethan Fast; Colleen Lee; Alex Aiken; Michael S. Bernstein; Daphne Koller; Eric Smith

Large online courses often assign problems that are easy to grade because they have a fixed set of solutions (such as multiple choice), but grading and guiding students is more difficult in problem domains that have an unbounded number of correct answers. One such domain is derivations: sequences of logical steps commonly used in assignments for technical, mathematical and scientific subjects. We present DeduceIt, a system for creating, grading, and analyzing derivation assignments in any formal domain. DeduceIt supports assignments in any logical formalism, provides students with incremental feedback, and aggregates student paths through each proof to produce instructor analytics. DeduceIt benefits from checking thousands of derivations on the web: it introduces a proof cache, a novel data structure which leverages a crowd of students to decrease the cost of checking derivations and providing real-time, constructive feedback. We evaluate DeduceIt with 990 students in an online compilers course, finding students take advantage of its incremental feedback and instructors benefit from its structured insights into course topics. Our work suggests that automated reasoning can extend online assignments and large-scale education to many new domains.
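The abstract describes the proof cache only at a high level. One plausible reading is a shared memo table: once any student's derivation step has been checked, the verdict is reused for every later student who submits the same step. The sketch below follows that reading. The `ProofCache` class, the (premises, conclusion) representation of a step, and the toy `modus_ponens_checker` are all assumptions made for illustration, not DeduceIt's actual design.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Tuple

# Assumption: a derivation step is a set of premise strings plus a conclusion string.
Step = Tuple[FrozenSet[str], str]


class ProofCache:
    """Memoize checker verdicts so each distinct step is checked only once."""

    def __init__(self, checker: Callable[[FrozenSet[str], str], bool]) -> None:
        self._checker = checker
        self._verdicts: Dict[Step, bool] = {}
        self.checker_calls = 0  # counts how often the expensive checker runs

    def check(self, premises: Iterable[str], conclusion: str) -> bool:
        # frozenset makes the key independent of the order premises were given in
        key: Step = (frozenset(premises), conclusion)
        if key not in self._verdicts:
            self.checker_calls += 1
            self._verdicts[key] = self._checker(key[0], conclusion)
        return self._verdicts[key]


def modus_ponens_checker(premises: FrozenSet[str], conclusion: str) -> bool:
    """Toy stand-in for a real checker: accept q if both p and 'p -> q' are premises."""
    return any(f"{p} -> {conclusion}" in premises for p in premises)
```

In this sketch, two students who submit the same step with the premises in a different order hit the same cache entry, so the checker runs once for both.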

human factors in computing systems | 2016

Augur: Mining Human Behaviors from Fiction to Power Interactive Systems

Ethan Fast; William McGrath; Pranav Rajpurkar; Michael S. Bernstein

From smart homes that prepare coffee when we wake, to phones that know not to interrupt us during important conversations, our collective visions of HCI imagine a future in which computers understand a broad range of human behaviors. Today our systems fall short of these visions, however, because this range of behaviors is too large for designers or programmers to capture manually. In this paper, we instead demonstrate it is possible to mine a broad knowledge base of human behavior by analyzing more than one billion words of modern fiction. Our resulting knowledge base, Augur, trains vector models that can predict many thousands of user activities from surrounding objects in modern contexts: for example, whether a user may be eating food, meeting with a friend, or taking a selfie. Augur uses these predictions to identify actions that people commonly take on objects in the world and estimate a user's future activities given their current situation. We demonstrate Augur-powered, activity-based systems such as a phone that silences itself when the odds of you answering it are low, and a dynamic music player that adjusts to your present activity. A field deployment of an Augur-powered wearable camera resulted in 96% recall and 71% precision on its unsupervised predictions of common daily activities. A second evaluation where human judges rated the system's predictions over a broad set of input images found that 94% were rated sensible.

user interface software and technology | 2016

Meta: Enabling Programming Languages to Learn from the Crowd

Ethan Fast; Michael S. Bernstein

Collectively authored programming resources such as Q&A sites and open-source libraries provide a limited window into how programs are constructed, debugged, and run. To address these limitations, we introduce Meta: a language extension for Python that allows programmers to share functions and track how they are used by a crowd of other programmers. Meta functions are shareable via URL and instrumented to record runtime data. Combining thousands of Meta functions with their collective runtime data, we demonstrate tools including an optimizer that replaces your function with a more efficient version written by someone else, an auto-patcher that saves your program from crashing by finding equivalent functions in the community, and a proactive linter that warns you when a function fails elsewhere in the community. We find that professional programmers are able to use Meta for complex tasks (creating new Meta functions that, for example, cross-validate a logistic regression), and that Meta is able to find 44 optimizations (for a 1.45 times average speedup) and 5 bug fixes across the crowd.

empirical methods in natural language processing | 2016

Identifying Dogmatism in Social Media: Signals and Models

Ethan Fast; Eric Horvitz

We explore linguistic and behavioral features of dogmatism in social media and construct statistical models that can identify dogmatic comments. Our model is based on a corpus of Reddit posts, collected across a diverse set of conversational topics and annotated via paid crowdsourcing. We operationalize key aspects of dogmatism described by existing psychology theories (such as over-confidence), finding they have predictive power. We also find evidence for new signals of dogmatism, such as the tendency of dogmatic posts to refrain from signaling cognitive processes. When we use our predictive model to analyze millions of other Reddit posts, we find evidence that suggests dogmatism is a deeper personality trait, present for dogmatic users across many different domains, and that users who engage on dogmatic comments tend to show increases in dogmatic posts themselves.

human factors in computing systems | 2018

Iris: A Conversational Agent for Complex Tasks

Ethan Fast; Binbin Chen; Julia Mendelsohn; Jonathan Bassen; Michael S. Bernstein

Today, most conversational agents are limited to simple tasks supported by standalone commands, such as getting directions or scheduling an appointment. To support more complex tasks, agents must be able to generalize from and combine the commands they already understand. This paper presents a new approach to designing conversational agents inspired by linguistic theory, where agents can execute complex requests interactively by combining commands through nested conversations. We demonstrate this approach in Iris, an agent that can perform open-ended data science tasks such as lexical analysis and predictive modeling. To power Iris, we have created a domain-specific language that transforms Python functions into combinable automata and regulates their combinations through a type system. Running a user study to examine the strengths and limitations of our approach, we find that data scientists completed a modeling task 2.6 times faster with Iris than with Jupyter Notebook.

learning at scale | 2018

OARS: exploring instructor analytics for online learning

Jonathan Bassen; Iris Howley; Ethan Fast; John C. Mitchell; Candace Thille

Learning analytics systems have the potential to bring enormous value to online education. Unfortunately, many instructors and platforms do not adequately leverage learning analytics in their courses today. In this paper, we report on the value of these systems from the perspective of course instructors. We study these ideas through OARS, a modular and real-time learning analytics system that we deployed across more than ten online courses with tens of thousands of learners. We leverage this system as a starting point for semi-structured interviews with a diverse set of instructors. Our study suggests new design goals for learning analytics systems, the importance of real-time analytics to many instructors, and the value of flexibility in data selection and aggregation for an instructor when working with an analytics system.

international joint conference on artificial intelligence | 2017

Lexicons on Demand: Neural Word Embeddings for Large-Scale Text Analysis

Ethan Fast; Binbin Chen; Michael S. Bernstein

Human language is colored by a broad range of topics, but existing text analysis tools only focus on a small number of them. We present Empath, a tool that can generate and validate new lexical categories on demand from a small set of seed terms (like “bleed” and “punch” to generate the category violence). Empath draws connotations between words and phrases by learning a neural embedding across billions of words on the web. Given a small set of seed words that characterize a category, Empath uses its neural embedding to discover new related terms, then validates the category with a crowd-powered filter. Empath also analyzes text across 200 built-in, pre-validated categories we have generated such as neglect, government, and social media. We show that Empath’s data-driven, human-validated categories are highly correlated (r=0.906) with similar categories in LIWC.
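The seed-expansion step described here can be sketched as a nearest-neighbor search around the seed words in embedding space. This is only an illustration under stated assumptions: the abstract does not specify the similarity measure, so cosine similarity to the seed centroid is assumed, the function names are hypothetical, and the crowd-powered validation step is omitted.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_category(seeds: List[str], embedding: Dict[str, Vector],
                    top_k: int) -> List[str]:
    """Return the top_k non-seed words closest to the centroid of the seed vectors."""
    known = [embedding[s] for s in seeds if s in embedding]
    if not known:
        return []
    dim = len(known[0])
    centroid = [sum(vec[i] for vec in known) / len(known) for i in range(dim)]
    candidates = [w for w in embedding if w not in seeds]
    ranked = sorted(candidates,
                    key=lambda w: cosine(embedding[w], centroid),
                    reverse=True)
    return ranked[:top_k]
```

A real embedding would have hundreds of dimensions learned from a large corpus; the function works unchanged on any word-to-vector mapping whose vectors share one dimensionality.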

human factors in computing systems | 2015

Text Mining Emergent Human Behaviors for Interactive Systems

Ethan Fast; Pranav Rajpurkar; Michael S. Bernstein

People engage with thousands of situations, activities, and objects on a daily basis. Hand-coding this knowledge into interactive systems is prohibitively labor-intensive, but fiction captures a vast number of human lives in moment-to-moment detail. In this paper, we bootstrap a knowledge graph of human activities by text mining a large dataset of modern fiction on the web. Our knowledge graph, Augur, describes human actions over time as conditioned by nearby locations, people, and objects. Applications can use this graph to react to human behavior in a data-driven way. We demonstrate an Augur-enhanced video game world in which non-player characters follow realistic patterns of behavior, interact with their environment and each other, and respond to the user's behavior.
