Richard Zanibbi
Rochester Institute of Technology
Publication
Featured research published by Richard Zanibbi.
International Journal on Document Analysis and Recognition | 2004
Richard Zanibbi; Dorothea Blostein; James R. Cordy
Table characteristics vary widely. Consequently, a great variety of computational approaches have been applied to table recognition. In this survey, the table recognition literature is presented as an interaction of table models, observations, transformations, and inferences. A table model defines the physical and logical structure of tables; the model is used to detect tables and to analyze and decompose the detected tables. Observations perform feature measurements and data lookup, transformations alter or restructure data, and inferences generate and test hypotheses. This presentation clarifies both the decisions made by a table recognizer and the assumptions and inferencing techniques that underlie these decisions.
International Journal on Document Analysis and Recognition | 2012
Richard Zanibbi; Dorothea Blostein
Document recognition and retrieval technologies complement one another, providing improved access to increasingly large document collections. While recognition and retrieval of textual information is fairly mature, with wide-spread availability of optical character recognition and text-based search engines, recognition and retrieval of graphics such as images, figures, tables, diagrams, and mathematical expressions are in comparatively early stages of research. This paper surveys the state of the art in recognition and retrieval of mathematical expressions, organized around four key problems in math retrieval (query construction, normalization, indexing, and relevance feedback), and four key problems in math recognition (detecting expressions, detecting and classifying symbols, analyzing symbol layout, and constructing a representation of meaning). Of special interest is the machine learning problem of jointly optimizing the component algorithms in a math recognition system, and developing effective indexing, retrieval and relevance feedback algorithms for math retrieval. Another important open problem is developing user interfaces that seamlessly integrate recognition and retrieval. Activity in these important research areas is increasing, in part because math notation provides an excellent domain for studying problems common to many document and graphics recognition and retrieval applications, and also because mature applications will likely provide substantial benefits for education, research, and mathematical literacy.
symposium on usable privacy and security | 2009
Kurt Alfred Kluever; Richard Zanibbi
We present a technique for using content-based video labeling as a CAPTCHA task. Our CAPTCHAs are generated from YouTube videos, which contain labels (tags) supplied by the person who uploaded the video. They are graded using a video's tags, as well as tags from related videos. In a user study involving 184 participants, we were able to increase the human success rate on our video CAPTCHA from roughly 70% to 90%, while keeping the success rate of a tag frequency-based attack fixed at around 13%. Through a different parameterization of the challenge generation and grading algorithms, we were able to reduce the success rate of the same attack to 2%, while still increasing the human success rate from 70% to 75%. The usability and security of our video CAPTCHA appear to be comparable to existing CAPTCHAs, and a majority of participants (60%) indicated that they found the video CAPTCHAs more enjoyable than traditional CAPTCHAs in which distorted text must be transcribed.
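The grading step described in this abstract, accepting a response when it matches a tag on the challenge video or on a related video, might look roughly like the sketch below. It is only an illustration under stated assumptions, not the authors' implementation: the names `normalize` and `grade_response`, the case-insensitive matching, and the idea that a response is a list of tags are all hypothetical.

```python
def normalize(tag: str) -> str:
    """Lowercase and strip whitespace so trivially different tags compare equal."""
    return tag.strip().lower()


def grade_response(response_tags, video_tags, related_video_tags):
    """Return True if any submitted tag matches an accepted tag.

    Accepted tags are the union of the challenge video's own tags and the
    tags of related videos. All three arguments are hypothetical inputs;
    how related videos are selected is outside this sketch.
    """
    accepted = {normalize(t) for t in video_tags}
    for tags in related_video_tags:
        accepted.update(normalize(t) for t in tags)
    return any(normalize(t) in accepted for t in response_tags)


# Usage with made-up tags.
video = ["cat", "funny"]
related = [["kitten", "pets"], ["animals"]]
passed = grade_response(["Kitten"], video, related)
failed = grade_response(["dog"], video, related)
```

Enlarging the accepted set with tags from related videos makes the challenge easier for humans, and presumably for attackers too, which is the trade-off the abstract quantifies.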
international conference on pattern recognition | 2002
Dorothea Blostein; James R. Cordy; Richard Zanibbi
Compiler techniques are effective and efficient in processing textual programming languages. These techniques can be adapted to recognition and processing of two-dimensional languages (diagrams). Already, grammars and parsers have been used in a variety of diagram-recognition and diagram-processing tasks. Here we explore the use of two other compiler techniques in pattern recognition systems. The first is compiler-style use of trees and tree transformation. The second is a multi-pass control structure, with a clear separation between layout, lexical, syntactic, and semantic analysis. Our proposal is illustrated on a case study involving recognition of hand-drawn mathematics notation.
computer vision and pattern recognition | 2016
Siyu Zhu; Richard Zanibbi
We propose a system that finds text in natural scenes using a variety of cues. Our novel data-driven method incorporates coarse-to-fine detection of character pixels using convolutional features (Text-Conv), followed by extracting connected components (CCs) from characters using edge and color features, and finally performing a graph-based segmentation of CCs into words (Word-Graph). For Text-Conv, the initial detection is based on convolutional feature maps similar to those used in Convolutional Neural Networks (CNNs), but learned using Convolutional k-means. Convolution masks defined by local and neighboring patch features are used to improve detection accuracy. The Word-Graph algorithm uses contextual information to both improve word segmentation and prune false character/word detections. Different definitions for foreground (text) regions are used to train the detection stages, some based on bounding box intersection, and others on bounding box and pixel intersection. Our system obtains pixel, character, and word detection f-measures of 93.14%, 90.26%, and 86.77% respectively for the ICDAR 2015 Robust Reading Focused Scene Text dataset, outperforming state-of-the-art systems. This approach may work for other detection targets with homogeneous color in natural scenes.
document engineering | 2013
Francisco Álvaro; Richard Zanibbi
We consider the difficult problem of classifying spatial relationships between symbols and subexpressions in handwritten mathematical expressions. We first improve existing geometric features based on bounding boxes and center points, normalizing them using the distance between the centers of the two symbols or subexpressions in question. We then propose a novel feature set for layout classification, using polar histograms computed over points in handwritten strokes. A series of experiments is presented in which a Support Vector Machine is used with these new features to classify spatial relationships of five types in the MathBrush corpus (horizontal, superscript, subscript, below, and inside, e.g. in a square root). The normalized geometric features provide an improvement over previously published results, while the shape-based features provide a natural representation with results comparable to those for the geometric features. Combining the features produced a very small improvement in accuracy.
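The two feature families mentioned in this abstract can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the particular offsets chosen, the `(xmin, ymin, xmax, ymax)` box layout with y growing downward, the bin counts, and the function names are hypothetical, and the paper's actual feature set may differ.

```python
import math


def center(box):
    """Center point of a bounding box given as (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    return ((xmin + xmax) / 2.0, (ymin + ymax) / 2.0)


def normalized_geometric_features(box_a, box_b):
    """Bounding-box offsets divided by the distance between the two centers."""
    (ax, ay), (bx, by) = center(box_a), center(box_b)
    dist = math.hypot(bx - ax, by - ay)
    if dist == 0:
        raise ValueError("centers coincide; normalization is undefined")
    raw = [
        bx - ax,              # horizontal offset between centers
        by - ay,              # vertical offset between centers
        box_b[1] - box_a[1],  # offset between top edges (image coordinates assumed)
        box_b[3] - box_a[3],  # offset between bottom edges
    ]
    return [v / dist for v in raw]


def polar_histogram(points, origin, n_angle_bins=8, n_radius_bins=3):
    """Fraction of stroke points falling in each (radius, angle) bin around origin."""
    ox, oy = origin
    polar = [(math.hypot(x - ox, y - oy), math.atan2(y - oy, x - ox))
             for x, y in points]
    max_r = max((r for r, _ in polar), default=0.0) or 1.0
    hist = [[0.0] * n_angle_bins for _ in range(n_radius_bins)]
    for r, theta in polar:
        r_bin = min(int(r / max_r * n_radius_bins), n_radius_bins - 1)
        a_bin = min(int((theta + math.pi) / (2 * math.pi) * n_angle_bins),
                    n_angle_bins - 1)
        hist[r_bin][a_bin] += 1.0 / len(polar)
    return hist


# Made-up example: a base symbol and a smaller symbol up and to the right.
base, raised = (0, 10, 20, 30), (22, 0, 30, 8)
features = normalized_geometric_features(base, raised)
hist = polar_histogram([(1, 0), (0, 1), (-1, 0), (0, -1)], origin=(0, 0))
```

Dividing by the center distance makes the geometric features independent of writing size, so the same layout drawn at two scales yields the same vector.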
international conference on document analysis and recognition | 2001
Richard Zanibbi; Dorothea Blostein; James R. Cordy
The structure of mathematics notation is particularly difficult to recognize in handwritten notation because irregular symbol placements are common. We present an efficient and robust method of parsing handwritten and typeset mathematics notation without backtracking. The system is designed to be easily adaptable to various dialects of mathematics notation. The following strategies are used: (1) separate the analysis of layout, syntax, and semantics, (2) recursively apply search functions and image partitioning to recognize dominant and nested baselines, and (3) use tree transformations to express computations in a compact, efficiently executable form.
graphics recognition | 2001
Dorothea Blostein; Edward Lank; Arlis Rose; Richard Zanibbi
The user interface is critical to the success of a diagram recognition system. It is difficult to define precise goals for a user interface, and even more difficult to quantify performance of a user interface. In this paper, we discuss some of the many research questions related to user interfaces in diagram recognition systems. We relate experiences we have gathered during the construction of two on-line diagram recognition systems, one for UML (Unified Modeling Language) notation and the other for mathematical notation. The goal of this paper is to encourage discussion. The graphics recognition community needs strategies and criteria for designing, implementing, and evaluating user interfaces.
international conference on document analysis and recognition | 2011
Richard Zanibbi; Amit Pillay; Harold Mouchère; Christian Viard-Gaudin; Dorothea Blostein
Evaluating mathematical expression recognition involves a complex interaction of input primitives (e.g. pen/finger strokes), recognized symbols, and recognized spatial structure. Existing performance metrics simplify this problem by separating the assessment of spatial structure from the assessment of symbol segmentation and classification. These metrics do not characterize the overall accuracy of a pen-based mathematics recognition system, making it difficult to compare math recognition algorithms, and preventing the use of machine learning algorithms requiring a criterion function characterizing overall system performance. To address this problem, we introduce performance metrics that bridge the gap from handwritten strokes to spatial structure. Our metrics are computed using bipartite graphs that represent classification, segmentation and spatial structure at the stroke level. Overall correctness of an expression is measured by counting the number of relabelings of nodes and edges needed to make the bipartite graph for a recognition result match the bipartite graph for ground truth. This metric may also be used with other primitive types (e.g. image pixels).
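The counting idea in this abstract can be sketched as a label comparison over strokes and ordered stroke pairs. This is a simplified illustration, not the paper's definition: the dictionary representation, the `NO_RELATION` placeholder, the `"*"` label for strokes belonging to the same symbol, and the function name are all assumptions.

```python
from itertools import permutations

NO_RELATION = "_"  # assumed placeholder for stroke pairs with no labeled relationship


def label_graph_distance(nodes_out, edges_out, nodes_gt, edges_gt):
    """Count node and edge relabelings needed to turn the output into ground truth.

    nodes_*: dict mapping stroke id -> symbol class label.
    edges_*: dict mapping (stroke_i, stroke_j) -> relationship label; pairs
             that are absent are treated as NO_RELATION.
    Returns (node_errors, edge_errors).
    """
    if set(nodes_out) != set(nodes_gt):
        raise ValueError("both graphs must be defined over the same strokes")
    node_errors = sum(1 for s in nodes_gt if nodes_out[s] != nodes_gt[s])
    edge_errors = sum(
        1
        for pair in permutations(nodes_gt, 2)
        if edges_out.get(pair, NO_RELATION) != edges_gt.get(pair, NO_RELATION)
    )
    return node_errors, edge_errors


# Made-up example: "x" written with two strokes (s1, s2), superscript "2" as s3.
gt_nodes = {"s1": "x", "s2": "x", "s3": "2"}
gt_edges = {("s1", "s2"): "*", ("s2", "s1"): "*",
            ("s1", "s3"): "Sup", ("s2", "s3"): "Sup"}

# Hypothetical recognizer output: "2" misread as "z" and placed to the right.
out_nodes = {"s1": "x", "s2": "x", "s3": "z"}
out_edges = {("s1", "s2"): "*", ("s2", "s1"): "*",
             ("s1", "s3"): "Right", ("s2", "s3"): "Right"}

errors = label_graph_distance(out_nodes, out_edges, gt_nodes, gt_edges)
```

Because both graphs are defined over the same strokes, classification, segmentation, and layout errors all reduce to label disagreements that can be summed into a single count.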
international conference on frontiers in handwriting recognition | 2014
Harold Mouchère; Christian Viard-Gaudin; Richard Zanibbi; Utpal Garain
We present the outcome of the latest edition of the CROHME competition, dedicated to on-line handwritten mathematical expression recognition. In addition to the standard full expression recognition task from previous competitions, CROHME 2014 features two new tasks. The first is dedicated to isolated symbol recognition including a reject option for invalid symbol hypotheses, and the second concerns recognizing expressions that contain matrices. System performance is improving relative to previous competitions. Data and evaluation tools used for the competition are publicly available.