
Publication


Featured research published by Roxanne L. Canosa.


ACM Transactions on Applied Perception | 2009

Real-world vision: Selective perception and task

Roxanne L. Canosa

Visual perception is an inherently selective process. To understand when and why a particular region of a scene is selected, it is imperative to observe and describe the eye movements of individuals as they go about performing specific tasks. In this sense, vision is an active process that integrates scene properties with specific, goal-oriented oculomotor behavior. This study is an investigation of how task influences the visual selection of stimuli from a scene. Four eye-tracking experiments were designed and conducted to determine how everyday tasks affect oculomotor behavior. A portable eye tracker was created for the specific purpose of bringing the experiments out of the laboratory and into the real world, where natural behavior is most likely to occur. The experiments provide evidence that the human visual system is not a passive collector of salient environmental stimuli, nor is vision general-purpose. Rather, vision is active and specific, tightly coupled to the requirements of a task and a plan of action. The experiments support the hypothesis that the purpose of selective attention is to maximize task efficiency by fixating relevant objects in the scene. A computational model of visual attention is presented that imposes a high-level constraint on the bottom-up salient properties of a scene for the purpose of locating regions that are likely to correspond to foreground objects rather than background or other salient nonobject stimuli. In addition to improving the correlation to human subject fixation densities over a strictly bottom-up model [Itti et al. 1998; Parkhurst et al. 2002], this model predicts a central fixation tendency when that tendency is warranted, and not as an artificially primed location bias.
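The correlation to human fixation densities mentioned above is a standard way to score a saliency model. A minimal sketch of that comparison, assuming NumPy/SciPy and hypothetical inputs (a list of fixation coordinates and a model saliency map of the same size); this is illustrative, not the author's evaluation code:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fixation_density_map(fixations, shape, sigma=25.0):
    """Build a smoothed fixation-density map from (row, col) fixation points.

    `fixations`, `shape`, and `sigma` are illustrative parameters, not values
    taken from the paper.
    """
    density = np.zeros(shape, dtype=float)
    for r, c in fixations:
        density[int(r), int(c)] += 1.0
    return gaussian_filter(density, sigma)

def saliency_fixation_correlation(saliency, density):
    """Pearson correlation between a model saliency map and a fixation-density map."""
    s = (saliency - saliency.mean()) / (saliency.std() + 1e-12)
    d = (density - density.mean()) / (density.std() + 1e-12)
    return float((s * d).mean())
```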


Human Vision and Electronic Imaging Conference | 2003

High-level aspects of oculomotor control during viewing of natural-task images

Roxanne L. Canosa; Jeff B. Pelz; Neil Mennie; Joseph Peak

Eye movements are an external manifestation of selective attention and can play an important role in indicating which attributes of a scene carry the most pertinent information. Models that predict gaze distribution often define a local conspicuity value that relies on low-level image features to indicate the perceived salience of an image region. While such bottom-up models have some success in predicting fixation densities in simple 2D images, success with natural scenes requires an understanding of the goals of the observer, including the perceived usefulness of an object in the context of an explicit or implicit task. In the present study, observers viewed natural images while their eye movements were recorded. Eye movement patterns revealed that subjects preferentially fixated objects relevant for potential actions implied by the gist of the scene, rather than selecting targets based purely on image features. A proto-object map is constructed that is based on highly textured regions of the image that predict the location of potential objects. This map is used as a mask to inhibit the unimportant low-level features and enhance the important features to constrain the regions of potential interest. The resulting importance map correlates well to subject fixations on natural-task images.
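As a rough illustration of the masking idea described above (a texture-based proto-object map that suppresses bottom-up features outside likely object regions), here is a hedged NumPy sketch; the local-variance texture measure, window size, and threshold are assumptions, not the paper's construction:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def texture_energy(gray, win=15):
    """Local variance as a simple stand-in for 'highly textured' regions."""
    gray = np.asarray(gray, dtype=float)
    mean = uniform_filter(gray, win)
    mean_sq = uniform_filter(gray * gray, win)
    return np.clip(mean_sq - mean * mean, 0, None)

def importance_map(feature_map, gray, keep=0.25):
    """Mask a bottom-up feature map with a proto-object mask.

    The top `keep` fraction of texture-energy pixels is treated as likely
    proto-object territory (an assumed threshold, not the paper's).
    """
    energy = texture_energy(gray)
    thresh = np.quantile(energy, 1.0 - keep)
    mask = energy >= thresh
    return feature_map * mask  # inhibit features outside proto-object regions
```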


Technical Symposium on Computer Science Education | 2008

Mock trials and role-playing in computer ethics courses

Roxanne L. Canosa; Joan M. Lucas

Mock trials are an effective and fun way of eliciting thoughtful dialogue from students, and encouraging them to produce incisive analyses of current ethical dilemmas related to computers and society. This paper describes our experience using mock trials in two computer ethics courses. Each trial was centered on a specific controversial and ethically or legally ambiguous topic related to current computer usage in society. Students participated in a series of mock trials during the term, alternating their role in each trial between jury, proponent, and opponent. Class participation was nearly 100% for every trial, with many students electing to define their own sub-role within their assigned major role. The logistics of the trials were initially difficult to administer and monitor; however, they quickly became manageable as we gained more experience with the opportunities and pitfalls associated with the mock-trial system, and as students volunteered suggestions for improvements.


Proceedings of SPIE | 2014

A comparison of histogram distance metrics for content-based image retrieval

Qianwen Zhang; Roxanne L. Canosa

The type of histogram distance metric selected for a CBIR query varies greatly and will affect the accuracy of the retrieval results. This paper compares the retrieval results of a variety of commonly used CBIR distance metrics: the Euclidean distance, the Manhattan distance, the vector cosine angle distance, the histogram intersection distance, the χ² distance, the Jensen-Shannon divergence, and the Earth Mover’s distance. A training set of ground-truth labeled images is used to build a classifier for the CBIR system, where the images were obtained from three commonly used benchmarking datasets: the WANG dataset (http://savvash.blogspot.com/2008/12/benchmark-databases-for-cbir.html), the Corel Subset dataset (http://vision.stanford.edu/resources_links.html), and the CalTech dataset (http://www.vision.caltech.edu/htmlfiles/). To implement the CBIR system, we use the Tamura texture features of coarseness, contrast, and directionality. We create texture histograms of the training set and the query images, and then measure the difference between a randomly selected query and the corresponding retrieved image using a k-nearest-neighbors approach. Precision and recall are used to evaluate the retrieval performance of the system, given a particular distance metric. Then, given the same query image, the distance metric is changed and the performance of the system is evaluated once again.
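The metrics named above are standard and easy to express over normalized texture histograms. A brief NumPy sketch of several of them, assuming the histograms are nonnegative and sum to 1 (the Earth Mover's distance is omitted because it requires solving a transportation problem); this is illustrative, not the paper's implementation:

```python
import numpy as np

def euclidean(h1, h2):
    return np.sqrt(np.sum((h1 - h2) ** 2))

def manhattan(h1, h2):
    return np.sum(np.abs(h1 - h2))

def cosine_angle(h1, h2):
    # 1 - cosine similarity, so that identical histograms give distance 0
    return 1.0 - np.dot(h1, h2) / (np.linalg.norm(h1) * np.linalg.norm(h2) + 1e-12)

def intersection(h1, h2):
    # Histogram intersection turned into a distance (assumes histograms sum to 1)
    return 1.0 - np.sum(np.minimum(h1, h2))

def chi_squared(h1, h2):
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + 1e-12))

def jensen_shannon(h1, h2):
    m = 0.5 * (h1 + h2)
    kl = lambda p, q: np.sum(np.where(p > 0, p * np.log(p / (q + 1e-12)), 0.0))
    return 0.5 * kl(h1, m) + 0.5 * kl(h2, m)
```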


Technical Symposium on Computer Science Education | 2006

Image understanding as a second course in AI: preparing students for research

Roxanne L. Canosa

This paper describes the development and structure of a second course in artificial intelligence that was developed to meet the needs of upper-division undergraduate and graduate computer science and computer engineering students. These students already have a background in either computer vision or artificial intelligence, and desire to apply that knowledge to the design of algorithms that are able to automate the process of extracting semantic content from either static or dynamic imagery. Theory and methodology from diverse areas were incorporated into the course, including techniques from image processing, statistical pattern recognition, knowledge representation, multivariate analysis, cognitive modeling, and probabilistic inference. Students read selected current literature from the field, took turns presenting the selected literature to the class, and participated in discussions about the literature. Programming projects were required of all students, and in addition, graduate students were required to propose, design, implement, and defend an image understanding project of their own choosing. The course served as preparation for and an incubator of an active research group.


Proceedings of SPIE | 2009

Modeling decision-making in single- and multi-modal medical images

Roxanne L. Canosa

This research introduces a mode-specific model of visual saliency that can be used to highlight likely lesion locations and potential errors (false positives and false negatives) in single-mode PET and MRI images and multi-modal fused PET/MRI images. Fused-modality digital images are a relatively recent technological improvement in medical imaging; therefore, a novel component of this research is to characterize the perceptual response to these fused images. Three different fusion techniques were compared to single-mode displays in terms of observer error rates using synthetic human brain images generated from an anthropomorphic phantom. An eye-tracking experiment was performed with naïve (non-radiologist) observers who viewed the single- and multi-modal images. The eye-tracking data allowed the errors to be classified into four categories: false positives, search errors (false negatives never fixated), recognition errors (false negatives fixated for less than 350 milliseconds), and decision errors (false negatives fixated for more than 350 milliseconds). A saliency model consisting of a set of differentially weighted low-level feature maps is derived from the known error and ground-truth locations extracted from a subset of the test images for each modality. The saliency model shows that lesion and error locations attract visual attention according to low-level image features such as color, luminance, and texture.
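The four-way error taxonomy above follows directly from whether a missed lesion was fixated and for how long. A small illustrative sketch, assuming hypothetical data structures (sets of lesion ids and a per-lesion dwell-time map) rather than the study's actual eye-tracking records; the 350 ms threshold comes from the abstract:

```python
def classify_errors(lesions, marked, fixation_ms):
    """Classify observer errors for one image.

    lesions      -- set of ground-truth lesion ids
    marked       -- set of locations the observer reported as lesions
    fixation_ms  -- dict mapping lesion id -> total fixation time in milliseconds
    """
    errors = {"false_positive": [], "search": [], "recognition": [], "decision": []}
    errors["false_positive"] = [m for m in marked if m not in lesions]
    for lesion in lesions - marked:                  # false negatives
        dwell = fixation_ms.get(lesion, 0.0)
        if dwell == 0.0:
            errors["search"].append(lesion)          # never fixated
        elif dwell < 350.0:
            errors["recognition"].append(lesion)     # fixated for < 350 ms
        else:
            errors["decision"].append(lesion)        # fixated for >= 350 ms
    return errors
```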


Document Recognition and Retrieval | 2007

Adding contextual information to improve character recognition on the Archimedes Palimpsest

Derek J. Walvoord; Roger L. Easton; Roxanne L. Canosa

The objective of the character recognition effort for the Archimedes Palimpsest is to provide a tool that allows scholars of ancient Greek mathematics to retrieve as much information as possible from the remaining degraded text. With this in mind, the current pattern recognition system does not output a single classification decision, as in typical target detection problems, but has been designed to provide intermediate results that allow the user to apply his or her own decisions (or evidence) to arrive at a conclusion. To achieve this result, a probabilistic network has been incorporated into our previous recognition system, which was based primarily on spatial correlation techniques. This paper reports on the revised tool and its recent success in the transcription process.
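The abstract describes combining correlation-based evidence with scholar-supplied evidence rather than forcing a hard decision. As a loose, hypothetical illustration of that idea (a single Bayes update over candidate glyphs, not the paper's probabilistic network), assuming correlation scores and an optional context prior:

```python
import numpy as np

def character_posterior(correlation_scores, prior=None, temperature=0.1):
    """Turn spatial-correlation scores into a posterior over candidate characters.

    correlation_scores -- dict: candidate glyph -> correlation with the degraded image
    prior              -- optional dict of prior probabilities (e.g., from context);
                          uniform if omitted
    temperature        -- softmax sharpness (an assumed parameter)
    """
    glyphs = list(correlation_scores)
    scores = np.array([correlation_scores[g] for g in glyphs])
    likelihood = np.exp(scores / temperature)
    likelihood /= likelihood.sum()
    p = np.array([(prior or {}).get(g, 1.0 / len(glyphs)) for g in glyphs])
    posterior = likelihood * p
    posterior /= posterior.sum()
    return dict(zip(glyphs, posterior))
```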


Proceedings of SPIE | 2014

Agglomerative clustering using hybrid features for image categorization

Karina Damico; Roxanne L. Canosa

This research project describes an agglomerative image clustering technique that is used for the purpose of automating image categorization. The system is implemented in two stages: feature vector formation, and feature space clustering. The features that we selected are based on texture salience (Gabor filters and a binary pattern descriptor). Global properties are encoded via a hierarchical spatial pyramid and local structure is encoded as a bit string, retained via a set of histograms. The transform can be computed efficiently – it involves only 16 operations (8 comparisons and 8 additions) per 3x3 region. A disadvantage is that it is not invariant to rotation or scale changes; however, the spatial pyramid representing global structure helps to ameliorate this problem. An agglomerative clustering technique is implemented and evaluated based on ground-truth values and a human subjective rating.
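The "8 comparisons and 8 additions per 3x3 region" cost profile above matches a local-binary-pattern style descriptor. A hedged NumPy sketch of such a bit-string descriptor, offered as an illustration rather than the paper's exact feature:

```python
import numpy as np

def binary_pattern_histogram(gray):
    """Compare each pixel's 8 neighbours to the centre pixel, accumulate an 8-bit
    code (8 comparisons and 8 additions per 3x3 region), and return a normalized
    256-bin histogram of the codes. Border pixels are ignored for simplicity.
    """
    gray = np.asarray(gray, dtype=float)
    c = gray[1:-1, 1:-1]
    codes = np.zeros_like(c, dtype=np.int32)
    # Offsets of the 8 neighbours of the centre pixel, each assigned one bit
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dr, dc) in enumerate(offsets):
        nb = gray[1 + dr : gray.shape[0] - 1 + dr, 1 + dc : gray.shape[1] - 1 + dc]
        codes += (nb >= c).astype(np.int32) << bit   # one comparison + one addition
    hist, _ = np.histogram(codes, bins=256, range=(0, 256))
    return hist / max(hist.sum(), 1)
```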


Proceedings of SPIE | 2013

Body-part estimation from Lucas-Kanade tracked Harris points

Vladimir Pribula; Roxanne L. Canosa

Skeleton estimation from single-camera grayscale images is generally accomplished using model-based techniques. Multiple cameras are sometimes used; however, skeletal points extracted from a single subject using multiple images are usually too sparse to be helpful for localizing body parts. For this project, we use a single viewpoint without any model-based assumptions to identify a central source of motion, the body, and its associated extremities. Harris points are tracked using Lucas-Kanade refinement with a weighted kernel found from expectation maximization. The algorithm tracks key image points and trajectories and re-represents them as complex vectors describing the motion of a specific body part. Normalized correlation is calculated from these vectors to form a matrix of graph edge weights, which is subsequently partitioned using a graph-cut algorithm to identify dependent trajectories. The resulting Harris points are clustered into rigid component centroids using mean shift, and the extremity centroids are connected to their nearest body centroid to complete the body-part estimation. We collected ground-truth body-part labels from seven participants and compared them to the clusters produced by our algorithm.
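The detection-and-tracking front end described above can be sketched with OpenCV's Harris corner detector and pyramidal Lucas-Kanade tracker; the expectation-maximization kernel weighting, graph-cut partitioning, and mean-shift clustering stages are omitted, and the parameter values below are illustrative assumptions:

```python
import cv2
import numpy as np

def track_harris_points(frames, max_points=200):
    """Detect Harris corners in the first grayscale frame and track them with
    pyramidal Lucas-Kanade across the remaining frames, returning one complex
    trajectory (x + iy per frame) for each point.
    """
    prev = frames[0]
    pts = cv2.goodFeaturesToTrack(prev, maxCorners=max_points, qualityLevel=0.01,
                                  minDistance=7, useHarrisDetector=True, k=0.04)
    trajectories = [[p.ravel()] for p in pts]
    for frame in frames[1:]:
        nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev, frame, pts, None,
                                                  winSize=(15, 15), maxLevel=2)
        for traj, p, ok in zip(trajectories, nxt, status.ravel()):
            if ok:
                traj.append(p.ravel())
        pts, prev = nxt, frame
    # Re-represent each trajectory as a complex vector, as described in the abstract
    return [np.array([complex(x, y) for x, y in t]) for t in trajectories]
```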


Proceedings of SPIE | 2013

Smarter compositing with the Kinect

Alex Karantza; Roxanne L. Canosa

An image processing pipeline is presented that applies principles from the computer graphics technique of deferred shading to composite rendered objects into a live scene viewed by a Kinect. Issues involving the presentation of the Kinect's output are addressed, and algorithms for improving the believability and aesthetic matching of the rendered scene against the real scene are proposed. An implementation of this pipeline using GLSL shaders that runs at interactive frame rates is given. Results of experiments with this program are provided, showing promise that the approaches evaluated here can be applied to improve other implementations.
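The core of such a pipeline is a per-pixel depth test between the Kinect's depth frame and the rendered object's depth buffer, in the spirit of deferred shading. A CPU-side NumPy sketch of that compositing step (the paper's implementation runs as GLSL shaders on the GPU; buffer names and units here are assumptions):

```python
import numpy as np

def composite(kinect_rgb, kinect_depth, render_rgb, render_depth, render_alpha):
    """Per-pixel depth-test compositing of a rendered object over a Kinect frame.

    kinect_rgb   -- HxWx3 colour frame from the Kinect
    kinect_depth -- HxW depth frame (same units as render_depth; an assumption)
    render_rgb   -- HxWx3 rendered-object colour buffer
    render_depth -- HxW rendered-object depth buffer (inf where nothing was drawn)
    render_alpha -- HxW coverage of the rendered object in [0, 1]
    """
    # The rendered object wins only where it is closer to the camera than the real scene
    visible = (render_depth < kinect_depth) & (render_alpha > 0)
    a = np.where(visible, render_alpha, 0.0)[..., None]
    return (a * render_rgb + (1.0 - a) * kinect_rgb).astype(kinect_rgb.dtype)
```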

Collaboration


Roxanne L. Canosa's top co-authors and their affiliations.

Top Co-Authors

Jeff B. Pelz (Rochester Institute of Technology)
Jason S. Babcock (Rochester Institute of Technology)
Amy Silver (Rochester Institute of Technology)
Daisei Konno (Rochester Institute of Technology)
Diane Kucharczyk (Rochester Institute of Technology)
Vladimir Pribula (Rochester Institute of Technology)
Alex Karantza (Rochester Institute of Technology)
Colin P. Bellmore (Rochester Institute of Technology)
Derek J. Walvoord (Rochester Institute of Technology)
Dianne P. Bills (Rochester Institute of Technology)