
Publication


Featured research published by Brandyn White.


user interface software and technology | 2010

VizWiz: nearly real-time answers to visual questions

Jeffrey P. Bigham; Chandrika Jayant; Hanjie Ji; Greg Little; Andrew Miller; Robert C. Miller; Robin Miller; Aubrey Tatarowicz; Brandyn White; Samuel White; Tom Yeh

The lack of access to visual information like text labels, icons, and colors can cause frustration and decrease independence for blind people. Current access technology uses automatic approaches to address some problems in this space, but the technology is error-prone, limited in scope, and quite expensive. In this paper, we introduce VizWiz, a talking application for mobile phones that offers a new alternative to answering visual questions in nearly real-time - asking multiple people on the web. To support answering questions quickly, we introduce a general approach for intelligently recruiting human workers in advance called quikTurkit so that workers are available when new questions arrive. A field deployment with 11 blind participants illustrates that blind people can effectively use VizWiz to cheaply answer questions in their everyday lives, highlighting issues that automatic approaches will need to address to be useful. Finally, we illustrate the potential of using VizWiz as part of the participatory design of advanced tools by using it to build and evaluate VizWiz::LocateIt, an interactive mobile tool that helps blind people solve general visual search problems.
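A minimal sketch of the quikTurkit recruitment idea in Python: keep a target number of workers on standby so a new question never waits on recruitment. The class and names below are illustrative, not the paper's implementation, which keeps tasks posted on Amazon Mechanical Turk.

    import queue
    import time

    class WorkerPool:
        """Toy model of quikTurkit-style eager recruitment."""

        def __init__(self, target_standby=3):
            self.target_standby = target_standby
            self.standby = queue.Queue()  # workers waiting for a question

        def recruit_if_needed(self):
            # The real system keeps tasks posted so workers stay available;
            # here we simply fabricate standby workers.
            while self.standby.qsize() < self.target_standby:
                self.standby.put(f"worker-{time.time_ns()}")

        def ask(self, question):
            self.recruit_if_needed()
            worker = self.standby.get()  # no recruitment latency at ask time
            return worker, question

    pool = WorkerPool()
    print(pool.ask("What denomination is this bill?"))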


Proceedings of the Tenth International Workshop on Multimedia Data Mining | 2010

Web-scale computer vision using MapReduce for multimedia data mining

Brandyn White; Tom Yeh; Jimmy J. Lin; Larry S. Davis

This work explores computer vision applications of the MapReduce framework that are relevant to the data mining community. An overview of MapReduce and common design patterns is provided for those with limited MapReduce background. We discuss both the high-level theory and the low-level implementation of several computer vision algorithms: classifier training, sliding windows, clustering, bag-of-features, background subtraction, and image registration. Experimental results for the k-means clustering and single Gaussian background subtraction algorithms are reported on a 410-node Hadoop cluster.
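To make the k-means design pattern concrete, here is one Hadoop-style iteration simulated in plain Python: the mapper emits each point keyed by its nearest centroid, and the reducer averages each group. The paper's version runs as actual MapReduce jobs on the cluster; all names here are illustrative.

    from collections import defaultdict
    import math

    def mapper(point, centroids):
        # Map phase: emit (nearest-centroid-id, point), one call per record.
        dists = [math.dist(point, c) for c in centroids]
        yield dists.index(min(dists)), point

    def reducer(centroid_id, points):
        # Reduce phase: average all points assigned to one centroid.
        n = len(points)
        return [sum(xs) / n for xs in zip(*points)]

    points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.1), (4.9, 5.0)]
    centroids = [(0.0, 0.0), (5.0, 5.0)]

    groups = defaultdict(list)
    for p in points:                                   # "shuffle" by key
        for cid, pt in mapper(p, centroids):
            groups[cid].append(pt)
    centroids = [reducer(cid, pts) for cid, pts in sorted(groups.items())]
    print(centroids)   # one iteration: [[0.05, 0.1], [4.95, 5.05]]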


international conference on multimedia and expo | 2007

Automatically Tuning Background Subtraction Parameters using Particle Swarm Optimization

Brandyn White; Mubarak Shah

A common trait of background subtraction algorithms is that they have learning rates, thresholds, and initial values that are hand-tuned for a scenario in order to produce the desired subtraction result; however, the need to tune these parameters makes it difficult to use state-of-the-art methods, fuse multiple methods, and choose an algorithm based on the current application, as it requires the end user to become proficient in tuning a new parameter set. The proposed solution is to automate this task by using a particle swarm optimization (PSO) algorithm to maximize a fitness function that compares the subtraction result to provided ground-truth images. The fitness function used is the F-measure, the harmonic mean of recall and precision. This method reduces the total pixel error of the Mixture of Gaussians background subtraction algorithm by more than 50% on the diverse Wallflower dataset.
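The fitness function is fully specified by the abstract: with per-pixel true positives, false positives, and false negatives against the ground truth, F = 2PR / (P + R). A minimal PSO loop over a generic parameter vector might look like the sketch below; the inertia and acceleration constants are typical textbook values, not the paper's.

    import random

    def f_measure(pred, truth):
        # pred/truth: flat binary foreground masks (1 = foreground pixel).
        tp = sum(1 for p, t in zip(pred, truth) if p and t)
        fp = sum(1 for p, t in zip(pred, truth) if p and not t)
        fn = sum(1 for p, t in zip(pred, truth) if t and not p)
        if tp == 0:
            return 0.0
        precision, recall = tp / (tp + fp), tp / (tp + fn)
        return 2 * precision * recall / (precision + recall)

    def pso(fitness, dim, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5):
        pos = [[random.random() for _ in range(dim)] for _ in range(n_particles)]
        vel = [[0.0] * dim for _ in range(n_particles)]
        pbest = [p[:] for p in pos]
        gbest = max(pbest, key=fitness)
        for _ in range(iters):
            for i, p in enumerate(pos):
                for d in range(dim):
                    r1, r2 = random.random(), random.random()
                    vel[i][d] = (w * vel[i][d]
                                 + c1 * r1 * (pbest[i][d] - p[d])
                                 + c2 * r2 * (gbest[d] - p[d]))
                    p[d] += vel[i][d]
                if fitness(p) > fitness(pbest[i]):
                    pbest[i] = p[:]
            gbest = max(pbest, key=fitness)
        return gbest

    # In the paper's setting, fitness(params) would run the background
    # subtractor with those parameters and score its mask via f_measure.
    print(pso(lambda v: -abs(v[0] - 0.5), dim=3))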


computer vision and pattern recognition | 2010

VizWiz::LocateIt - enabling blind people to locate objects in their environment

Jeffrey P. Bigham; Chandrika Jayant; Andrew Miller; Brandyn White; Tom Yeh

Blind people face a number of challenges when interacting with their environments because so much information is encoded visually. Text is pervasively used to label objects, colors carry special significance, and items can easily become lost in surroundings that cannot be quickly scanned. Many tools seek to help blind people solve these problems by enabling them to query for additional information, such as color or text shown on the object. In this paper we argue that many useful problems may be better solved by directly modeling them as search problems, and present a solution called VizWiz::LocateIt that directly supports this type of interaction. VizWiz::LocateIt enables blind people to take a picture and ask for assistance in finding a specific object. The request is first forwarded to remote workers who outline the object, enabling efficient and accurate automatic computer vision on the user's existing cellphone. A two-stage algorithm is presented that uses this information to interactively guide users to the appropriate object from their phone.
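The abstract leaves the two-stage algorithm unspecified; as one plausible illustration of the second stage, the worker-outlined region could be cropped as a template and matched against live camera frames to drive "hotter/colder" feedback. The OpenCV sketch below is an assumption, not the paper's method.

    import cv2

    def locate(template_bgr, frame_bgr):
        """Find the worker-outlined object in a new camera frame via
        normalized cross-correlation; returns best-match center and score."""
        t = cv2.cvtColor(template_bgr, cv2.COLOR_BGR2GRAY)
        f = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        scores = cv2.matchTemplate(f, t, cv2.TM_CCOEFF_NORMED)
        _, score, _, top_left = cv2.minMaxLoc(scores)
        h, w = t.shape
        center = (top_left[0] + w // 2, top_left[1] + h // 2)
        return center, score  # e.g. drive "hotter/colder" audio feedback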


IEEE Transactions on Visualization and Computer Graphics | 2012

Interactive 3D Model Acquisition and Tracking of Building Block Structures

Andrew Miller; Brandyn White; Emiko Charbonneau; Zach Kanzler; Joseph J. LaViola

We present a prototype system for interactive construction and modification of 3D physical models using building blocks. Our system uses a depth-sensing camera and a novel algorithm for acquiring and tracking the physical models. The algorithm, Lattice-First, is based on the fact that building block structures can be arranged in a 3D point lattice where the smallest block unit is a basis from which to derive all the pieces of the model. The algorithm also makes it possible for users to interact naturally with the physical model as it is acquired, using their bare hands to add and remove pieces. We present the details of our algorithm, along with examples of the models we can acquire using the interactive system. We also show the results of an experiment where participants modify a block structure in the absence of visual feedback. Finally, we discuss two proof-of-concept applications: a collaborative guided assembly system where one user is interactively guided to build a structure based on another user's design, and a game where the player must build a structure that matches an on-screen silhouette.
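The lattice constraint can be made concrete: every block must sit on a 3D grid whose pitch is the smallest block unit, so depth points can be quantized to lattice cells, and cells with enough supporting points are declared occupied. A minimal sketch, with a hypothetical block pitch:

    from collections import Counter

    BLOCK = (16.0, 16.0, 19.2)  # hypothetical block pitch in mm (x, y, z)

    def snap_to_lattice(points, min_hits=10):
        """Quantize 3D points to integer lattice coordinates and keep
        cells with enough support to be called an occupied block."""
        cells = Counter(
            tuple(round(coord / pitch) for coord, pitch in zip(p, BLOCK))
            for p in points
        )
        return {cell for cell, hits in cells.items() if hits >= min_hits}

    # Points sampled by the depth camera near one block's surface:
    pts = [(15.8, 16.1, 19.0)] * 12
    print(snap_to_lattice(pts))  # -> {(1, 1, 1)}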


international conference on multimedia retrieval | 2014

Multi-Modal Image Retrieval for Complex Queries using Small Codes

Behjat Siddiquie; Brandyn White; Abhishek Sharma; Larry S. Davis

We propose a unified framework for image retrieval capable of handling complex and descriptive queries of multiple modalities in a scalable manner. A novel aspect of our approach is that it supports query specification in terms of objects, attributes and spatial relationships, thereby allowing for substantially more complex and descriptive queries. We allow these complex queries to be specified in three different modalities - images, sketches and structured textual descriptions. Furthermore, we propose a unique multi-modal hashing algorithm capable of mapping queries of different modalities to the same binary representation, enabling efficient and scalable image retrieval based on multi-modal queries. Extensive experimental evaluation shows that our approach outperforms the state-of-the-art image retrieval and hashing techniques on the MSRC and SUN09 datasets by about 100%, while the performance on a dataset of 1M images, from Flickr, demonstrates its scalability.
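One way to picture the multi-modal hashing step (a simplified stand-in for the paper's learned mapping): project each modality's feature vector into a shared space with a per-modality matrix, binarize by sign, and retrieve by Hamming distance. The dimensions and random projections below are placeholders for learned ones.

    import numpy as np

    rng = np.random.default_rng(0)
    BITS = 32

    # One projection per modality into a shared code space (learned in
    # the paper; random here for illustration).
    proj = {"image": rng.standard_normal((BITS, 512)),
            "text": rng.standard_normal((BITS, 300)),
            "sketch": rng.standard_normal((BITS, 128))}

    def encode(modality, feature_vec):
        # Same binary representation regardless of input modality.
        return (proj[modality] @ feature_vec > 0).astype(np.uint8)

    def hamming(a, b):
        return int(np.count_nonzero(a != b))

    db = [encode("image", rng.standard_normal(512)) for _ in range(1000)]
    q = encode("text", rng.standard_normal(300))   # cross-modal query
    best = min(range(len(db)), key=lambda i: hamming(q, db[i]))
    print("nearest image:", best)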


international world wide web conferences | 2011

A case for query by image and text content: searching computer help using screenshots and keywords

Tom Yeh; Brandyn White; Jose San Pedro; Boris Katz; Larry S. Davis

The multimedia information retrieval community has dedicated extensive research effort to the problem of content-based image retrieval (CBIR). However, these systems find their main limitation in the difficulty of creating pictorial queries. As a result, few systems offer the option of querying by visual examples, and rely on automatic concept detection and tagging techniques to provide support for searching visual content using textual queries. This paper proposes and studies a practical multimodal web search scenario, where CBIR fits intuitively to improve the retrieval of rich information queries. Many online articles contain useful know-how knowledge about computer applications. These articles tend to be richly illustrated by screenshots. We present a system to search for such software know-how articles that leverages the visual correspondences between screenshots. Users can naturally create pictorial queries simply by taking a screenshot of the application to retrieve a list of articles containing a matching screenshot. We build a prototype comprising 150k articles that are classified into walkthrough, book, gallery, and general categories, and provide a comprehensive evaluation of this system, focusing on technical (accuracy of CBIR techniques) and usability (perceived system usefulness) aspects. We also consider the study of added value features of such a visual-supported search, including the ability to perform cross-lingual queries. We find that the system is able to retrieve matching screenshots for a wide variety of programs, across language boundaries, and provide subjectively more useful results than keyword-based web and image search engines.
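The matching core, scoring a query screenshot against screenshots embedded in articles, can be sketched with off-the-shelf local features; the paper's actual CBIR pipeline may differ, and the match-distance cutoff below is arbitrary.

    import cv2

    orb = cv2.ORB_create(nfeatures=1000)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

    def screenshot_similarity(query_path, article_shot_path):
        """Count strong local-feature matches between two screenshots;
        candidate articles are ranked by this score."""
        q = cv2.imread(query_path, cv2.IMREAD_GRAYSCALE)
        a = cv2.imread(article_shot_path, cv2.IMREAD_GRAYSCALE)
        _, q_desc = orb.detectAndCompute(q, None)
        _, a_desc = orb.detectAndCompute(a, None)
        if q_desc is None or a_desc is None:
            return 0
        matches = matcher.match(q_desc, a_desc)
        return sum(1 for m in matches if m.distance < 40)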


Multimodal Technologies for Perception of Humans | 2008

Person and Vehicle Tracking in Surveillance Video

Andrew Miller; Arslan Basharat; Brandyn White; Jingen Liu; Mubarak Shah

This evaluation for person and vehicle tracking in surveillance presented some new challenges. The dataset was large and very high-quality, but with difficult scene properties involving illumination changes, unusual lighting conditions, and complicated occlusion of objects. Since this is a well-researched scenario [1], our submission was based primarily on our existing projects for automated object detection and tracking in surveillance. We also added several new features that are practical improvements for handling the difficulties of this dataset.


CLEaR | 2006

Multiple vehicle tracking in surveillance videos

Yun Zhai; Phillip Berkowitz; Andrew Miller; Khurram Shafique; Aniket A. Vartak; Brandyn White; Mubarak Shah

In this paper, we present KNIGHT, a stand-alone object detection, tracking, and classification software system built upon Microsoft Windows technologies. The object detection component assumes a stationary background and models background pixel values using a Mixture of Gaussians. Gradient-based background subtraction is used to handle sudden illumination changes. A connected-component algorithm is applied to the detected foreground pixels to find object-level moving blobs. The foreground objects are further tracked using a pixel-voting technique with occlusion and entry/exit reasoning. Motion correspondences are established using the color, size, spatial, and motion information of objects. We propose a texture-based descriptor to classify moving objects into two groups: vehicles and persons. In this component, feature descriptors are computed from image patches partitioned by concentric squares, and an SVM is used to build the object classifier. The system has been used in the VACE-CLEAR evaluation forum for the vehicle tracking task, and the corresponding system performance is presented in this paper.
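The detection front end described here, Mixture-of-Gaussians background modeling followed by connected components over the foreground mask, maps onto standard OpenCV primitives. A sketch of the pattern, not the KNIGHT code itself; the history, threshold, and area values are typical hand-tuned defaults.

    import cv2

    # Mixture-of-Gaussians background model.
    mog = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

    def detect_blobs(frame, min_area=150):
        """Foreground mask -> connected components -> object-level blobs."""
        fg = mog.apply(frame)
        fg = cv2.medianBlur(fg, 5)  # suppress speckle noise
        n, labels, stats, centroids = cv2.connectedComponentsWithStats(
            (fg == 255).astype("uint8"))  # 255 = foreground (127 = shadow)
        return [tuple(stats[i][:4])       # (x, y, w, h) per moving blob
                for i in range(1, n)      # label 0 is the background
                if stats[i][cv2.CC_STAT_AREA] >= min_area]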


Archive | 2011

BBN VISER TRECVID 2011 Multimedia Event Detection System

Pradeep Natarajan; Prem Natarajan; Vasant Manohar; Shuang Wu; Stavros Tsakalidis; Shiv Naga Prasad Vitaladevuni; Xiaodan Zhuang; Rohit Prasad; Guangnan Ye; Dong Liu; I-Hong Jhuo; Shih-Fu Chang; Hamid Izadinia; Imran Saleemi; Mubarak Shah; Brandyn White; Tom Yeh; Larry S. Davis

Collaboration


Brandyn White's top co-authors:

Mubarak Shah, University of Central Florida

Tom Yeh, University of Colorado Boulder

Imran Saleemi, University of Central Florida

Jeffrey P. Bigham, Carnegie Mellon University

Ladislau Bölöni, University of Central Florida