Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Hsueh-Cheng Wang is active.

Publication


Featured research published by Hsueh-Cheng Wang.


Journal of Vision | 2012

The attraction of visual attention to texts in real-world scenes

Hsueh-Cheng Wang; Marc Pomplun

When we look at real-world scenes, attention seems disproportionately attracted by texts that are embedded in these scenes, for instance, on signs or billboards. The present study was aimed at verifying the existence of this bias and investigating its underlying factors. For this purpose, data from a previous experiment were reanalyzed and four new experiments measuring eye movements during the viewing of real-world scenes were conducted. By pairing text objects with matching control objects and regions, the following main results were obtained: (a) Greater fixation probability and shorter minimum fixation distance of texts confirmed the higher attractiveness of texts; (b) the locations where texts are typically placed contribute partially to this effect; (c) specific visual features of texts, rather than typically salient features (e.g., color, orientation, and contrast), are the main attractors of attention; (d) the meaningfulness of texts does not add to their attentional capture; and (e) the attraction of attention depends to some extent on the observers' familiarity with the writing system and language of a given text.


Quarterly Journal of Experimental Psychology | 2010

Estimating the Effect of Word Predictability on Eye Movements in Chinese Reading Using Latent Semantic Analysis and Transitional Probability

Hsueh-Cheng Wang; Marc Pomplun; Minglei Chen; Hwa-Wei Ko; Keith Rayner

Latent semantic analysis (LSA) and transitional probability (TP), two computational methods used to reflect lexical semantic representation from large text corpora, were employed to examine the effects of word predictability on Chinese reading. Participants’ eye movements were monitored, and the influence of word complexity (number of strokes), word frequency, and word predictability on different eye movement measures (first-fixation duration, gaze duration, and total time) were examined. We found influences of TP on first-fixation duration and gaze duration and of LSA on total time. The results suggest that TP reflects an early stage of lexical processing while LSA reflects a later stage.
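
The transitional-probability measure can be estimated directly from bigram counts in a segmented corpus. The sketch below is a minimal illustration with a toy corpus, not the materials or code used in the study.

```python
# A minimal sketch (toy corpus, not the study's materials or code) of how the
# transitional probability (TP) of a word can be estimated from bigram counts
# in a segmented corpus: TP = P(word | previous word).
from collections import Counter

corpus = ["他 閱讀 報紙", "他 閱讀 小說", "我 閱讀 報紙"]  # toy segmented sentences

unigrams, bigrams = Counter(), Counter()
for sentence in corpus:
    words = sentence.split()
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

def transitional_probability(prev_word, word):
    """Estimate P(word | prev_word) from corpus counts."""
    if unigrams[prev_word] == 0:
        return 0.0
    return bigrams[(prev_word, word)] / unigrams[prev_word]

print(transitional_probability("閱讀", "報紙"))  # 2/3 in this toy corpus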


Vision Research | 2014

The roles of scene gist and spatial dependency among objects in the semantic guidance of attention in real-world scenes

Chia-Chien Wu; Hsueh-Cheng Wang; Marc Pomplun

A previous study (Vision Research 51 (2011) 1192-1205) found evidence for semantic guidance of visual attention during the inspection of real-world scenes, i.e., an influence of semantic relationships among scene objects on overt shifts of attention. In particular, the results revealed an observer bias toward gaze transitions between semantically similar objects. However, this effect is not necessarily indicative of semantic processing of individual objects but may be mediated by knowledge of the scene gist, which does not require object recognition, or by known spatial dependency among objects. To examine the mechanisms underlying semantic guidance, in the present study, participants were asked to view a series of displays with the scene gist excluded and spatial dependency varied. Our results show that spatial dependency among objects seems to be sufficient to induce semantic guidance. Scene gist, on the other hand, does not seem to affect how observers use semantic information to guide attention while viewing natural scenes. Extracting semantic information mainly based on spatial dependency may be an efficient strategy of the visual system that only adds little cognitive load to the viewing task.


International Conference on Robotics and Automation | 2017

Enabling independent navigation for visually impaired people through a wearable vision-based feedback system

Hsueh-Cheng Wang; Robert K. Katzschmann; Santani Teng; Brandon Araki; Laura Giarré; Daniela Rus

This work introduces a wearable system to provide situational awareness for blind and visually impaired people. The system includes a camera, an embedded computer and a haptic device to provide feedback when an obstacle is detected. The system uses techniques from computer vision and motion planning to (1) identify walkable space; (2) plan step-by-step a safe motion trajectory in the space, and (3) recognize and locate certain types of objects, for example the location of an empty chair. These descriptions are communicated to the person wearing the device through vibrations. We present results from user studies with low- and high-level tasks, including walking through a maze without collisions, locating a chair, and walking through a crowded environment while avoiding people.
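
The sense-plan-feedback loop described above can be illustrated with a small sketch. The code below is a simplified interpretation, not the published system: it thresholds a depth image into walkable space, picks the most open direction, and maps obstacle proximity to a vibration intensity; the camera input is simulated with random data.

```python
# A minimal sketch (assumptions, not the published system) of a wearable
# navigation loop: segment walkable space from depth, choose a heading, and
# convert obstacle proximity into a haptic vibration intensity.
import numpy as np

def segment_walkable(depth, max_range=3.0):
    """Cells farther than max_range (meters) are treated as walkable."""
    return depth > max_range

def choose_heading(walkable):
    """Pick the image column band with the most walkable cells."""
    column_scores = walkable.sum(axis=0)          # free cells per column
    best_col = int(np.argmax(column_scores))
    width = walkable.shape[1]
    return (best_col - width / 2) / (width / 2)   # -1 (left) .. +1 (right)

def vibration_intensity(depth, danger_range=1.0):
    """Stronger vibration the closer the nearest obstacle is."""
    nearest = float(depth.min())
    return float(np.clip(1.0 - nearest / danger_range, 0.0, 1.0))

# Simulated 0.5 m .. 5 m depth frame standing in for the wearable camera.
depth_frame = np.random.uniform(0.5, 5.0, size=(120, 160))
walkable = segment_walkable(depth_frame)
print("heading:", choose_heading(walkable))
print("vibration:", vibration_intensity(depth_frame))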


Journal of Vision | 2010

Semantic guidance of eye movements during real-world scene inspection

Alex D. Hwang; Hsueh-Cheng Wang; Marc Pomplun

This is the first study to measure semantic guidance during scene inspection, building on the efforts of two other research groups, namely the development of the LabelMe object-annotated image database and the LSA@CU text/word latent semantic analysis tool, which computes the conceptual distance between two terms. Our analysis reveals the existence of semantic guidance during scene inspection, that is, eye movements during scene inspection being guided by a semantic factor reflecting the conceptual relation between the currently fixated object and the target object of the following saccade. This guidance may facilitate memorization of the scene for later recall by viewing semantically related objects consecutively.

Keywords: semantic guidance; contextual guidance; eye tracking; eye movements; scene inspection.

Real-world scenes are filled with objects representing not only visual information but also meanings and semantic relations with other objects in the scene. The guidance of eye movements based on visual appearance (low-level visual features) has been well studied in terms of both bottom-up (e.g., Bruce & Tsotsos, 2006; Henderson, 2003; Itti & Koch, 2001; Parkhurst, Law & Niebur, 2002) and top-down control of visual attention (e.g., Henderson, Brockmole, Castelhano & Mack, 2007; Hwang, Higgins & Pomplun, 2009; Peters & Itti, 2007; Pomplun, 2006; Zelinsky, 2008; Zelinsky, Zhang, Yu, Chen & Samaras, 2006), as well as its neurological aspects (e.g., Corbetta & Shulman, 2002; Egner, Monti, Trittschuh, Wienecke, Hirsch & Mesulam, 2008). Although there has been research on high-level contextual effects on visual search using global features (e.g., Neider & Zelinsky, 2006; Torralba, Oliva, Castelhano & Henderson, 2006) and on primitive semantic effects based on the co-occurrence of objects in terms of implicit learning (e.g., Chun & Jiang, 1998; Chun & Phelps, 1999; Manginelli & Pollmann, 2008), the effects of object meaning and object relations on eye movements (semantic guidance) have not been studied because of a few hurdles that make such studies more complicated: (1) object segmentation is difficult, (2) semantic relations among objects are hard to define, and (3) a quantitative measure of semantic guidance has to be developed.

Automated segmentation and labeling of images is one of the crucial steps toward further understanding of image context, and there have been numerous attempts to solve this problem, ranging from global classification of scenes to individual region labeling (Athanasiadis, Mylonas, Avrithis & Kollias, 2007; Boutell, Luo, Shen & Brown, 2004; Le Saux & Amato, 2004; Luo & Savakis, 2001), but results were rather disappointing compared to human performance. Thanks to the LabelMe object-annotated image database (Russell, Torralba, Murphy & Freeman, 2008) developed by the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), various scenes with annotated objects are available to the public, which helps to bypass the first hurdle.

In order to convincingly compute semantic or conceptual relations between objects, the co-occurrence of objects would have to be analyzed in a large number of scene images. Unfortunately, collecting and analyzing a sufficient amount of annotated scenes is unfeasible with the currently available data sources. Since semantic relations are formed at the conceptual rather than the visual level, relations do not have to be derived from image databases: any database that can generate a collection of contexts or knowledge can be used to represent the semantic meaning of objects. A useful mathematical method for such representation in computer modeling and simulation is latent semantic analysis (LSA), which is based on the analysis of representative corpora of natural text. It transforms the occurrence matrix from large corpora into a relation between terms/concepts and a relation between those concepts and the documents (Landauer, Foltz & Laham, 1998). Since the annotated data in LabelMe are text descriptions of objects, their semantic or conceptual relations can be processed with LSA. In this study, the LSA@CU text/word latent semantic analysis tool is used to clear the second hurdle.

Equipped with the above tools, we computed a series of semantic salience maps for each labeled object in a subject's visual scan path. These salience maps approximated the transition probabilities for the following saccade to the other labeled objects in the scene, assuming that eye movements were entirely guided by the semantic relations between objects. Under this assumption, the probability of a gaze transition between two objects is proportionate to the strength of their semantic relation. Subsequently, the amount of semantic guidance was measured by the receiver operating characteristic (ROC), which computes the extent
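
The salience-map and ROC ideas described here can be made concrete with a small sketch. The code below is an interpretation, not the authors' implementation: LSA similarities from the currently fixated object to all other labeled objects are normalized into a semantic salience map, and a one-positive-sample AUC scores how well that map ranks the object actually fixated next.

```python
# A simplified sketch (an interpretation, not the authors' code) of measuring
# semantic guidance: normalize LSA similarities into transition probabilities
# and score them against the observed next fixation with an ROC-style AUC.
import numpy as np

def semantic_salience(similarities):
    """Normalize LSA similarities into transition probabilities."""
    sims = np.asarray(similarities, dtype=float)
    return sims / sims.sum()

def roc_auc(salience, chosen_index):
    """Fraction of competitor objects ranked below the chosen object
    (ties count half), i.e., a one-positive-sample AUC."""
    chosen = salience[chosen_index]
    others = np.delete(salience, chosen_index)
    return float((np.sum(others < chosen) + 0.5 * np.sum(others == chosen))
                 / len(others))

# Toy example: similarities from the fixated object to four other objects,
# where object 2 is the one the observer actually fixated next.
sims = [0.10, 0.30, 0.55, 0.05]
salience = semantic_salience(sims)
print(salience, roc_auc(salience, chosen_index=2))  # high AUC -> guidance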


International Conference on Robotics and Automation | 2017

Duckietown: An open, inexpensive and flexible platform for autonomy education and research

Liam Paull; Jacopo Tani; Heejin Ahn; Javier Alonso-Mora; Luca Carlone; Michal Čáp; Yu Fan Chen; Changhyun Choi; Jeff Dusek; Yajun Fang; Daniel Hoehener; Shih-Yuan Liu; Michael Novitzky; Igor Franzoni Okuyama; Jason Pazis; Guy Rosman; Valerio Varricchio; Hsueh-Cheng Wang; Dmitry S. Yershov; Hang Zhao; Michael R. Benjamin; Christopher E. Carr; Maria T. Zuber; Sertac Karaman; Emilio Frazzoli; Domitilla Del Vecchio; Daniela Rus; Jonathan P. How; John J. Leonard; Andrea Censi

Duckietown is an open, inexpensive and flexible platform for autonomy education and research. The platform comprises small autonomous vehicles (“Duckiebots”) built from off-the-shelf components, and cities (“Duckietowns”) complete with roads, signage, traffic lights, obstacles, and citizens (duckies) in need of transportation. The Duckietown platform offers a wide range of functionalities at a low cost. Duckiebots sense the world with only one monocular camera and perform all processing onboard with a Raspberry Pi 2, yet are able to: follow lanes while avoiding obstacles, pedestrians (duckies) and other Duckiebots, localize within a global map, navigate a city, and coordinate with other Duckiebots to avoid collisions. Duckietown is a useful tool since educators and researchers can save money and time by not having to develop all of the necessary supporting infrastructure and capabilities. All materials are available as open source, and the hope is that others in the community will adopt the platform for education and research.
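
The lane-following capability mentioned above rests on detecting lane markings from the single onboard camera. The sketch below is only an illustration of that idea (color thresholding plus line extraction on a synthetic frame), not the actual Duckietown software stack.

```python
# A minimal illustration (not the Duckietown codebase) of monocular lane
# detection: isolate white and yellow lane markings by color, then extract
# line segments. The input is a synthetic frame instead of a camera image.
import cv2
import numpy as np

def detect_lane_segments(bgr_frame):
    hsv = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2HSV)
    white = cv2.inRange(hsv, (0, 0, 180), (180, 40, 255))    # white lane edge
    yellow = cv2.inRange(hsv, (20, 80, 80), (35, 255, 255))  # yellow centerline
    mask = cv2.bitwise_or(white, yellow)
    edges = cv2.Canny(mask, 50, 150)
    segments = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                               threshold=20, minLineLength=15, maxLineGap=5)
    return [] if segments is None else segments.reshape(-1, 4)

# Synthetic 120x160 road frame: white line on the right, yellow on the left.
frame = np.zeros((120, 160, 3), dtype=np.uint8)
cv2.line(frame, (120, 119), (150, 0), (255, 255, 255), 3)  # white (BGR)
cv2.line(frame, (40, 119), (10, 0), (0, 255, 255), 3)      # yellow (BGR)
print(len(detect_lane_segments(frame)), "lane segments found")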


Behavior Research Methods | 2014

Predicting raters’ transparency judgments of English and Chinese morphological constituents using latent semantic analysis

Hsueh-Cheng Wang; Li-Chuan Hsu; Yi-Min Tien; Marc Pomplun

The morphological constituents of English compounds (e.g., “butter” and “fly” for “butterfly”) and two-character Chinese compounds may differ in meaning from the whole word. Subjective differences and ambiguity of transparency make judgments difficult, and a computational alternative based on a general model might be a way to average across subjective differences. In the present study, we propose two approaches based on latent semantic analysis (Landauer & Dumais in Psychological Review 104:211–240, 1997): Model 1 compares the semantic similarity between a compound word and each of its constituents, and Model 2 derives the dominant meaning of a constituent from a clustering analysis of morphological family members (e.g., “butterfingers” or “buttermilk” for “butter”). The proposed models successfully predicted participants’ transparency ratings, and we recommend that experimenters use Model 1 for English compounds and Model 2 for Chinese compounds, on the basis of differences in raters’ morphological processing in the different writing systems. The dominance of lexical meaning, semantic transparency, and the average similarity between all pairs within a morphological family are provided, and practical applications for future studies are discussed.
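
Model 1 described above reduces to comparing a compound's LSA vector with each constituent's vector. The sketch below illustrates this with hypothetical low-dimensional vectors; the real models operate on vectors derived from large corpora.

```python
# A sketch of Model 1 as described above, assuming pre-computed LSA vectors
# (the vectors here are toy placeholders): the transparency of a constituent
# is taken as its cosine similarity to the whole compound.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical low-dimensional LSA vectors for illustration only.
lsa = {
    "butterfly": np.array([0.1, 0.8, 0.3]),
    "butter":    np.array([0.7, 0.1, 0.2]),
    "fly":       np.array([0.1, 0.6, 0.4]),
}

compound = "butterfly"
for constituent in ("butter", "fly"):
    print(constituent, round(cosine(lsa[compound], lsa[constituent]), 3))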


International Solid-State Circuits Conference | 2016

24.1 A 0.6V 8mW 3D vision processor for a navigation device for the visually impaired

Dongsuk Jeon; Nathan Ickes; Priyanka Raina; Hsueh-Cheng Wang; Daniela Rus; Anantha P. Chandrakasan

3D imaging devices, such as stereo and time-of-flight (ToF) cameras, measure distances to the observed points and generate a depth image where each pixel represents a distance to the corresponding location. The depth image can be converted into a 3D point cloud using simple linear operations. This spatial information provides detailed understanding of the environment and is currently employed in a wide range of applications such as human motion capture [1]. However, its distinct characteristics from conventional color images necessitate different approaches to efficiently extract useful information. This paper describes a low-power vision processor for processing such 3D image data. The processor achieves high energy-efficiency through a parallelized reconfigurable architecture and hardware-oriented algorithmic optimizations. The processor will be used as a part of a navigation device for the visually impaired (Fig. 24.1.1). This handheld or body-worn device is designed to detect safe areas and obstacles and provide feedback to a user. We employ a ToF camera as the main sensor in this system since it has a small form factor and requires relatively low computational complexity [2].
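
The "simple linear operations" that turn a depth image into a point cloud follow the pinhole camera model. The sketch below back-projects each pixel using illustrative intrinsic parameters, not those of the ToF camera used in the paper.

```python
# A short sketch of depth-to-point-cloud conversion with a pinhole camera
# model. The intrinsics (fx, fy, cx, cy) are illustrative values only.
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """depth: HxW array of distances along the optical axis (meters)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)   # HxWx3 points in camera frame

depth = np.full((240, 320), 2.0)              # flat wall 2 m away (toy input)
cloud = depth_to_point_cloud(depth, fx=250.0, fy=250.0, cx=160.0, cy=120.0)
print(cloud.shape, cloud[120, 160])           # center pixel -> (0, 0, 2)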


Vision Research | 2011

Semantic guidance of eye movements in real-world scenes

Alex D. Hwang; Hsueh-Cheng Wang; Marc Pomplun


Journal of Eye Movement Research | 2010

Object Frequency and Predictability Effects on Eye Fixation Durations in Real-World Scene Viewing

Hsueh-Cheng Wang; Alex D. Hwang; Marc Pomplun

Collaboration


Dive into Hsueh-Cheng Wang's collaborations.

Top Co-Authors

Marc Pomplun, University of Massachusetts Boston
Daniela Rus, Massachusetts Institute of Technology
Alex D. Hwang, University of Massachusetts Boston
Chia-Chien Wu, University of Massachusetts Boston
Keith Rayner, University of California
Anantha P. Chandrakasan, Massachusetts Institute of Technology
John J. Leonard, Massachusetts Institute of Technology
Liam Paull, Massachusetts Institute of Technology
Nathan Ickes, Massachusetts Institute of Technology