Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Walter S. Lasecki is active.

Publication


Featured research published by Walter S. Lasecki.


user interface software and technology | 2012

Real-time captioning by groups of non-experts

Walter S. Lasecki; Christopher D. Miller; Adam Sadilek; Andrew Abumoussa; Donato Borrello; Raja S. Kushalnagar; Jeffrey P. Bigham

Real-time captioning provides deaf and hard of hearing people with immediate access to spoken language and enables participation in dialogue with others. Low latency is critical because it allows speech to be paired with relevant visual cues. Currently, the only reliable source of real-time captions is expensive stenographers, who must be recruited in advance and are trained to use specialized keyboards. Automatic speech recognition (ASR) is less expensive and available on-demand, but its low accuracy, high noise sensitivity, and need for training beforehand render it unusable in real-world situations. In this paper, we introduce a new approach in which groups of non-expert captionists (people who can hear and type) collectively caption speech in real-time on-demand. We present Legion:Scribe, an end-to-end system that allows deaf people to request captions at any time. We introduce an algorithm for merging partial captions into a single output stream in real-time, and a captioning interface designed to encourage coverage of the entire audio stream. Evaluation with 20 local participants and 18 crowd workers shows that non-experts can provide an effective solution for captioning, accurately covering an average of 93.2% of an audio stream with only 10 workers and an average per-word latency of 2.9 seconds. More generally, our model in which multiple workers contribute partial inputs that are automatically merged in real-time may be extended to allow dynamic groups to surpass constituent individuals (even experts) on a variety of human performance tasks.
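The abstract refers to an algorithm for merging partial captions into one stream. The sketch below illustrates one simple way such a merge could work, ordering words by timestamp and collapsing near-duplicate words across workers; the class, function name, and deduplication window are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str       # word typed by a crowd worker
    time: float     # seconds since start of audio when it was heard
    worker: str     # id of the worker who submitted it

def merge_partial_captions(streams, window=1.0):
    """Merge per-worker partial captions into one ordered stream.

    Words from all workers are sorted by timestamp; a word is dropped
    if the same text already appeared within `window` seconds, which
    collapses overlap between workers while keeping unique coverage.
    (Simplified sketch; Legion:Scribe's actual merge is more involved.)
    """
    merged = []
    for word in sorted((w for s in streams for w in s), key=lambda w: w.time):
        duplicate = any(
            m.text.lower() == word.text.lower() and word.time - m.time <= window
            for m in merged[-5:]  # only recent words can be overlaps
        )
        if not duplicate:
            merged.append(word)
    return " ".join(w.text for w in merged)
```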


user interface software and technology | 2011

Real-time crowd control of existing interfaces

Walter S. Lasecki; Kyle I. Murray; Samuel White; Robert C. Miller; Jeffrey P. Bigham

Crowdsourcing has been shown to be an effective approach for solving difficult problems, but current crowdsourcing systems suffer two main limitations: (i) tasks must be repackaged for proper display to crowd workers, which generally requires substantial one-off programming effort and support infrastructure, and (ii) crowd workers generally lack a tight feedback loop with their task. In this paper, we introduce Legion, a system that allows end users to easily capture existing GUIs and outsource them for collaborative, real-time control by the crowd. We present mediation strategies for integrating the input of multiple crowd workers in real-time, evaluate these mediation strategies across several applications, and further validate Legion by exploring the space of novel applications that it enables.
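The abstract mentions mediation strategies for integrating real-time input from multiple workers. A minimal sketch of one such strategy, a majority vote over the inputs received in a short time slice, is shown below; the function name and slicing scheme are assumptions for illustration only.

```python
from collections import Counter

def mediate_inputs(worker_inputs):
    """Pick a single control input to forward to the GUI.

    `worker_inputs` maps worker id -> the key/command they sent during
    the current time slice. A simple mediation strategy forwards the
    most common input (majority vote); Legion also explored other
    strategies, such as electing a temporary leader whose input is
    trusted directly. This sketch shows only the voting variant.
    """
    if not worker_inputs:
        return None
    command, _count = Counter(worker_inputs.values()).most_common(1)[0]
    return command

# Example: three workers press keys during one short time slice.
print(mediate_inputs({"w1": "left", "w2": "left", "w3": "up"}))  # -> "left"
```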


user interface software and technology | 2013

Chorus: a crowd-powered conversational assistant

Walter S. Lasecki; Rachel Wesley; Jeffrey Nichols; Anand Kulkarni; James F. Allen; Jeffrey P. Bigham

Despite decades of research attempting to establish conversational interaction between humans and computers, the capabilities of automated conversational systems are still limited. In this paper, we introduce Chorus, a crowd-powered conversational assistant. When using Chorus, end users converse continuously with what appears to be a single conversational partner. Behind the scenes, Chorus leverages multiple crowd workers to propose and vote on responses. A shared memory space helps the dynamic crowd workforce maintain consistency, and a game-theoretic incentive mechanism helps to balance their efforts between proposing and voting. Studies with 12 end users and 100 crowd workers demonstrate that Chorus can provide accurate, topical responses, answering nearly 93% of user queries appropriately, and staying on-topic in over 95% of responses. We also observed that Chorus has advantages over pairing an end user with a single crowd worker and end users completing their own tasks in terms of speed, quality, and breadth of assistance. Chorus demonstrates a new future in which conversational assistants are made usable in the real world by combining human and machine intelligence, and may enable a useful new way of interacting with the crowds powering other systems.
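Chorus is described as having workers propose and vote on responses. The sketch below shows a plausible selection rule in which a candidate reply is released once it gathers a sufficient share of votes; the quorum threshold and function signature are assumptions, and the paper's game-theoretic incentive mechanism for balancing proposing and voting is not modeled here.

```python
def select_response(votes, quorum=0.4):
    """Choose which proposed reply (if any) is shown to the end user.

    `votes` maps a candidate reply (proposed by some worker) -> number
    of upvotes it has received from other workers. A reply is released
    once its share of all votes cast so far reaches `quorum`.
    """
    total = sum(votes.values())
    if total == 0:
        return None
    best = max(votes, key=votes.get)
    return best if votes[best] / total >= quorum else None

# Example: one reply has 3 of 4 votes, so it clears the quorum.
print(select_response({"Sure, the library opens at 9am.": 3, "I think 10am?": 1}))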


conference on computer supported cooperative work | 2013

Real-time crowd labeling for deployable activity recognition

Walter S. Lasecki; Young Chol Song; Henry A. Kautz; Jeffrey P. Bigham

Systems that automatically recognize human activities offer the potential of timely, task-relevant information and support. For example, prompting systems can help keep people with cognitive disabilities on track and surveillance systems can warn of activities of concern. Current automatic systems are difficult to deploy because they cannot identify novel activities, and, instead, must be trained in advance to recognize important activities. Identifying and labeling these events is time consuming and thus not suitable for real-time support of already-deployed activity recognition systems. In this paper, we introduce Legion:AR, a system that provides robust, deployable activity recognition by supplementing existing recognition systems with on-demand, real-time activity identification using input from the crowd. Legion:AR uses activity labels collected from crowd workers to train an automatic activity recognition system online to automatically recognize future occurrences. To enable the crowd to keep up with real-time activities, Legion:AR intelligently merges input from multiple workers into a single ordered label set. We validate Legion:AR across multiple domains and crowds and discuss features that allow appropriate privacy and accuracy tradeoffs.
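Legion:AR is said to merge input from multiple workers into a single ordered label set. The sketch below illustrates one simplified agreement-based merge; the agreement rule, segmenting, and names are stand-ins for illustration, not the system's actual algorithm, and the accepted labels would then serve as online training data for the recognizer.

```python
from collections import defaultdict

def merge_activity_labels(worker_labels, min_agreement=2):
    """Combine per-worker (timestamp, label) pairs into one ordered label set.

    A label is accepted when at least `min_agreement` workers produced
    it; accepted labels are ordered by their earliest timestamp. This
    is a simplified stand-in for Legion:AR's merging step.
    """
    by_label = defaultdict(list)
    for labels in worker_labels:                 # one list per worker
        for timestamp, label in labels:
            by_label[label].append(timestamp)

    accepted = []
    for label, times in by_label.items():
        if len(times) >= min_agreement:          # enough workers agree
            accepted.append((min(times), label)) # anchor at earliest time
    return [label for _, label in sorted(accepted)]
```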


user interface software and technology | 2014

Glance: rapidly coding behavioral video with the crowd

Walter S. Lasecki; Mitchell Gordon; Danai Koutra; Malte F. Jung; Steven P. Dow; Jeffrey P. Bigham

Behavioral researchers spend a considerable amount of time coding video data to systematically extract meaning from subtle human actions and emotions. In this paper, we present Glance, a tool that allows researchers to rapidly query, sample, and analyze large video datasets for behavioral events that are hard to detect automatically. Glance takes advantage of the parallelism available in paid online crowds to interpret natural language queries and then aggregates responses in a summary view of the video data. Glance provides analysts with rapid responses when initially exploring a dataset, and reliable codings when refining an analysis. Our experiments show that Glance can code nearly 50 minutes of video in 5 minutes by recruiting over 60 workers simultaneously, and can get initial feedback to analysts in under 10 seconds for most clips. We present and compare new methods for accurately aggregating the input of multiple workers marking the spans of events in video data, and for measuring the quality of their coding in real-time before a baseline is established by measuring the variance between workers. Glance's rapid responses to natural language queries, feedback regarding question ambiguity and anomalies in the data, and ability to build on prior context in follow-up queries allow users to have a conversation-like interaction with their data - opening up new possibilities for naturally exploring video data.
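The abstract describes aggregating the event spans marked by multiple workers and using inter-worker variance as an early quality signal. The sketch below shows one simplified, per-second voting approach to span fusion; the threshold, data layout, and function name are assumptions and not the paper's exact aggregation method.

```python
def aggregate_event_spans(spans, num_workers, threshold=0.5):
    """Fuse per-worker event spans into consensus spans.

    `spans` is a list of (start, end) second pairs pooled across
    workers. Each second of video counts as "in an event" when at
    least `threshold` of workers marked it, and adjacent agreed
    seconds are joined back into spans. Disagreement between workers
    can double as a rough quality signal before any ground-truth
    baseline exists.
    """
    if not spans:
        return []
    horizon = int(max(end for _, end in spans)) + 1
    votes = [0] * horizon
    for start, end in spans:
        for second in range(int(start), int(end)):
            votes[second] += 1

    agreed = [v / num_workers >= threshold for v in votes]
    consensus, open_start = [], None
    for second, inside in enumerate(agreed):
        if inside and open_start is None:
            open_start = second
        elif not inside and open_start is not None:
            consensus.append((open_start, second))
            open_start = None
    if open_start is not None:
        consensus.append((open_start, horizon))
    return consensus
```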


Archive | 2013

Mechanical Turk is Not Anonymous

Matthew Lease; Jessica Hullman; Jeffrey P. Bigham; Michael S. Bernstein; Juho Kim; Walter S. Lasecki; Saeideh Bakhshi; Tanushree Mitra; Robert C. Miller

While Amazon’s Mechanical Turk (AMT) online workforce has been characterized by many people as being anonymous, we expose an aspect of AMT’s system design that can be exploited to reveal a surprising amount of information about many AMT Workers, which may include personally identifying information (PII). This risk of PII exposure may surprise many Workers and Requesters today, as well as impact current institutional review board (IRB) oversight of human subjects research involving AMT Workers as participants. We assess the potential multi-faceted impact of such PII exposure for each stakeholder group: Workers, Requesters, and AMT itself. We discuss potential remedies each group may explore, as well as the responsibility of each group with regard to privacy protection. This discussion leads us to further situate issues of crowd worker privacy amidst broader ethical, economic, and regulatory issues, and we conclude by offering a set of recommendations to each stakeholder group.


human factors in computing systems | 2015

Apparition: Crowdsourced User Interfaces that Come to Life as You Sketch Them

Walter S. Lasecki; Juho Kim; Nicholas Rafter; Onkur Sen; Jeffrey P. Bigham; Michael S. Bernstein

Prototyping allows designers to quickly iterate and gather feedback, but the time it takes to create even a Wizard-of-Oz prototype reduces the utility of the process. In this paper, we introduce crowdsourcing techniques and tools for prototyping interactive systems in the time it takes to describe the idea. Our Apparition system uses paid microtask crowds to make even hard-to-automate functions work immediately, allowing more fluid prototyping of interfaces that contain interactive elements and complex behaviors. As users sketch their interface and describe it aloud in natural language, crowd workers and sketch recognition algorithms translate the input into user interface elements, add animations, and provide Wizard-of-Oz functionality. We discuss how design teams can use our approach to reflect on prototypes or begin user studies within seconds, and how, over time, Apparition prototypes can become fully-implemented versions of the systems they simulate. Powering Apparition is the first self-coordinated, real-time crowdsourcing infrastructure. We anchor this infrastructure on a new, lightweight write-locking mechanism that workers can use to signal their intentions to each other.
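The abstract highlights a lightweight write-locking mechanism that workers use to signal their intentions to each other. The sketch below shows one plausible shape for such a lock, with an expiry so an inattentive worker cannot block others; the class, method names, and TTL rule are assumptions rather than Apparition's actual infrastructure.

```python
import time

class ElementLocks:
    """Lightweight write locks for collaboratively edited UI elements.

    A worker requests a lock on an element before editing it; the lock
    expires after `ttl` seconds so a stalled worker cannot hold it
    indefinitely. Illustrative sketch of signaling intent only.
    """

    def __init__(self, ttl=10.0):
        self.ttl = ttl
        self._locks = {}   # element id -> (worker id, acquisition time)

    def try_acquire(self, element_id, worker_id):
        holder = self._locks.get(element_id)
        if holder and time.time() - holder[1] < self.ttl and holder[0] != worker_id:
            return False                      # someone else holds a live lock
        self._locks[element_id] = (worker_id, time.time())
        return True

    def release(self, element_id, worker_id):
        if self._locks.get(element_id, (None,))[0] == worker_id:
            del self._locks[element_id]
```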


human factors in computing systems | 2014

Selfsourcing personal tasks

Jaime Teevan; Daniel J. Liebling; Walter S. Lasecki

Large tasks can be overwhelming. For example, many people have thousands of digital photographs that languish in unorganized archives because it is difficult and time consuming to gather them into meaningful collections. Such tasks are hard to start because they seem to require long uninterrupted periods of effort to make meaningful progress. We propose the idea of selfsourcing as a way to help people to perform large personal information tasks by breaking them into manageable microtasks. Using ideas from crowdsourcing and task management, selfsourcing can help people take advantage of existing gaps in time and recover quickly from interruptions. We present several achievable selfsourcing scenarios and explore how they can facilitate information work in interruption-driven environments.


ACM Transactions on Accessible Computing | 2014

Accessibility Evaluation of Classroom Captions

Raja S. Kushalnagar; Walter S. Lasecki; Jeffrey P. Bigham

Real-time captioning enables deaf and hard of hearing (DHH) people to follow classroom lectures and other aural speech by converting it into visual text with less than a five second delay. Keeping the delay short allows end-users to follow and participate in conversations. This article focuses on the fundamental problem that makes real-time captioning difficult: sequential keyboard typing is much slower than speaking. We first surveyed the audio characteristics of 240 one-hour-long captioned lectures on YouTube, such as speed and duration of speaking bursts. We then analyzed how these characteristics impact caption generation and readability, considering specifically our human-powered collaborative captioning approach. We note that most of these characteristics are also present in more general domains. For our caption comparison evaluation, we transcribed a classroom lecture in real-time using all three captioning approaches. We recruited 48 participants (24 DHH) to watch these classroom transcripts in an eye-tracking laboratory. We presented these captions in a randomized, balanced order. We show that both hearing and DHH participants preferred and followed collaborative captions better than those generated by automatic speech recognition (ASR) or professionals due to the more consistent flow of the resulting captions. These results show the potential to reliably capture speech even during sudden bursts of speed, as well as for generating “enhanced” captions, unlike other human-powered captioning approaches.


human factors in computing systems | 2015

RegionSpeak: Quick Comprehensive Spatial Descriptions of Complex Images for Blind Users

Yu Zhong; Walter S. Lasecki; Erin L. Brady; Jeffrey P. Bigham

Blind people often seek answers to their visual questions from remote sources; however, the commonly adopted single-image, single-response model does not always guarantee enough bandwidth between users and sources. This is especially true when questions concern large sets of information or spatial layout, e.g., where is there to sit in this area, what tools are on this workbench, or what do the buttons on this machine do? Our RegionSpeak system addresses this problem by providing an accessible way for blind users to (i) combine visual information across multiple photographs via image stitching, (ii) quickly collect labels from the crowd for all relevant objects contained within the resulting large visual area in parallel, and (iii) then interactively explore the spatial layout of the objects that were labeled. The regions and descriptions are displayed on an accessible touchscreen interface, which allows blind users to interactively explore their spatial layout. We demonstrate that workers from Amazon Mechanical Turk are able to quickly and accurately identify relevant regions, and that asking them to describe only one region at a time results in more comprehensive descriptions of complex images. RegionSpeak can be used to explore the spatial layout of the regions identified. It also demonstrates broad potential for helping blind users to answer difficult spatial layout questions.
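After photographs are stitched into one large image and each region has been labeled by a crowd worker, touch exploration reduces to a hit test against the labeled regions. The sketch below illustrates that step; the data layout and names are assumptions for illustration, not RegionSpeak's implementation.

```python
from dataclasses import dataclass

@dataclass
class LabeledRegion:
    x: int            # top-left corner in stitched-image coordinates
    y: int
    width: int
    height: int
    description: str  # label collected from a crowd worker

def region_at(regions, touch_x, touch_y):
    """Return the description under the user's finger, if any."""
    for r in regions:
        if r.x <= touch_x < r.x + r.width and r.y <= touch_y < r.y + r.height:
            return r.description
    return None
```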

Collaboration


Dive into Walter S. Lasecki's collaboration.

Top Co-Authors

Jeffrey P. Bigham

Carnegie Mellon University

Raja S. Kushalnagar

Rochester Institute of Technology

Steven P. Dow

University of California

Yi Wei Yang

University of Michigan