João Paulo Costeira
Instituto Superior Técnico
Publication
Featured research published by João Paulo Costeira.
International Journal of Computer Vision | 1998
João Paulo Costeira; Takeo Kanade
The structure-from-motion problem has been extensively studied in the field of computer vision. Yet, the bulk of the existing work assumes that the scene contains only a single moving object. The more realistic case where an unknown number of objects move in the scene has received little attention, especially for its theoretical treatment. In this paper we present a new method for separating and recovering the motion and shape of multiple independently moving objects in a sequence of images. The method does not require prior knowledge of the number of objects, nor is dependent on any grouping of features into an object at the image level. For this purpose, we introduce a mathematical construct of object shapes, called the shape interaction matrix, which is invariant to both the object motions and the selection of coordinate systems. This invariant structure is computable solely from the observed trajectories of image features without grouping them into individual objects. Once the matrix is computed, it allows for segmenting features into objects by the process of transforming it into a canonical form, as well as recovering the shape and motion of each object. The theory works under a broad set of projection models (scaled orthography, paraperspective and affine) but they must be linear, so it excludes projective “cameras”.
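As an illustration of the shape interaction matrix described above, the following is a minimal sketch, not the authors' implementation. It assumes noise-free affine tracks and a known rank for the trajectory matrix (the paper does not require the number of objects in advance), and the helper names `shape_interaction_matrix` and `synthetic_tracks` are hypothetical. It computes Q = V_r V_r^T from the right singular vectors of the trajectory matrix and checks that entries linking features of different objects vanish.

```python
import numpy as np

def shape_interaction_matrix(W, rank):
    """Return Q = V_r V_r^T from the rank-r SVD of the 2F x N trajectory matrix W."""
    _, _, Vt = np.linalg.svd(W, full_matrices=False)
    Vr = Vt[:rank].T  # N x rank basis of the row space of W
    return Vr @ Vr.T

def synthetic_tracks(n_frames, n_points, rng):
    """Noise-free affine tracks of one rigid object (hypothetical test data)."""
    # 4 x N homogeneous shape and 2F x 4 affine motion, both random.
    shape = np.vstack([rng.standard_normal((3, n_points)), np.ones((1, n_points))])
    motion = rng.standard_normal((2 * n_frames, 4))
    return motion @ shape

rng = np.random.default_rng(0)
# Two independently moving objects with 6 and 7 tracked features over 10 frames.
W = np.hstack([synthetic_tracks(10, 6, rng), synthetic_tracks(10, 7, rng)])

# Assumption: rank 8 is known here (4 per object under the affine model).
Q = shape_interaction_matrix(W, rank=8)

# Largest entry coupling a feature of object 1 with a feature of object 2.
cross_block = np.abs(Q[:6, 6:]).max()
print(f"largest cross-object entry: {cross_block:.2e}")
```

In this sketch the columns are already sorted by object, so Q is visibly block diagonal. With unsorted features, segmentation amounts to finding the column permutation that reveals those blocks, which is the canonical-form step mentioned in the abstract.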
International Conference on Computer Vision | 1995
João Paulo Costeira; Takeo Kanade
The structure from motion problem has been extensively studied in the field of computer vision. Yet, the bulk of the existing work assumes that the scene contains only a single moving object. The more realistic case where an unknown number of objects move in the scene has received little attention, especially for its theoretical treatment. We present a new method for separating and recovering the motion and shape of multiple independently moving objects in a sequence of images. The method does not require prior knowledge of the number of objects, nor is dependent on any grouping of features into an object at the image level. For this purpose, we introduce a mathematical construct of object shapes, called the shape interaction matrix, which is invariant to both the object motions and the selection of coordinate systems. This invariant structure is computable solely from the observed trajectories of image features without grouping them into individual objects. Once the structure is computed, it allows for segmenting features into objects by the process of transforming it into a canonical form, as well as recovering the shape and motion of each object.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2003
João Maciel; João Paulo Costeira
We propose a new methodology for reliably solving the correspondence problem between sparse sets of points of two or more images. This is a key step in most problems of computer vision and, so far, no general method exists to solve it. Our methodology is able to handle most of the commonly used assumptions in a unique formulation, independent of the domain of application and type of features. It performs correspondence and outlier rejection in a single step and achieves global optimality with feasible computation. Feature selection and correspondence are first formulated as an integer optimization problem. This is a blunt formulation, which considers the whole combinatorial space of possible point selections and correspondences. To find its global optimal solution, we build a concave objective function and relax the search domain into its convex hull. The special structure of this extended problem assures its equivalence to the original one, but it can be optimally solved by efficient algorithms that avoid combinatorial search. This methodology can use any criterion provided it can be translated into cost functions with continuous second derivatives.
International Conference on Computer Vision | 2013
Ricardo Silveira Cabral; Fernando De la Torre; João Paulo Costeira; Alexandre Bernardino
Low rank models have been widely used for the representation of shape, appearance or motion in computer vision problems. Traditional approaches to fit low rank models make use of an explicit bilinear factorization. These approaches benefit from fast numerical methods for optimization and easy kernelization. However, they suffer from serious local minima problems depending on the loss function and the amount/type of missing data. Recently, these low-rank models have alternatively been formulated as convex problems using the nuclear norm regularizer; unlike factorization methods, their numerical solvers are slow and it is unclear how to kernelize them or to impose a rank a priori. This paper proposes a unified approach to bilinear factorization and nuclear norm regularization that inherits the benefits of both. We analyze the conditions under which these approaches are equivalent. Moreover, based on this analysis, we propose a new optimization algorithm and a rank continuation strategy that outperform state-of-the-art approaches for Robust PCA, Structure from Motion and Photometric Stereo with outliers and missing data.
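One well-known link between the two formulations is that the nuclear norm of Z equals the minimum of (||A||_F^2 + ||B||_F^2) / 2 over all factorizations Z = A B^T. The sketch below numerically checks that the balanced SVD factors attain this value. It is an illustration of that standard identity only, not the paper's algorithm, and whether the paper relies on this exact form is an assumption. The helper names `nuclear_norm` and `balanced_factors` are hypothetical.

```python
import numpy as np

def nuclear_norm(Z):
    """Sum of the singular values of Z."""
    return np.linalg.svd(Z, compute_uv=False).sum()

def balanced_factors(Z):
    """Factors A, B with Z = A @ B.T, built by splitting each singular value evenly."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    root = np.sqrt(s)
    return U * root, Vt.T * root  # scales the columns of U and V by sqrt(sigma_i)

rng = np.random.default_rng(1)
# A random 8 x 6 matrix of rank 3 (hypothetical test data).
Z = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 6))

A, B = balanced_factors(Z)
# Frobenius penalty commonly used in bilinear factorization.
penalty = 0.5 * (np.linalg.norm(A, "fro") ** 2 + np.linalg.norm(B, "fro") ** 2)

print(f"nuclear norm: {nuclear_norm(Z):.6f}, factor penalty: {penalty:.6f}")
```

Any other factorization of the same Z gives a penalty at least as large, which is why a Frobenius-regularized bilinear model can stand in for the nuclear norm when the factors have enough columns.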
Computer Vision and Image Understanding | 2009
Manuel Marques; João Paulo Costeira
Reconstructing a 3D scene from a moving camera is one of the most important issues in the field of computer vision. In this scenario, not all points are known in all images (e.g. due to occlusion), thus generating missing data. On the other hand, successful 3D reconstruction algorithms like Tomasi and Kanade's factorization method require an orthographic model for the data, which is adequate in close-up views. The state-of-the-art handles the missing points in this context by enforcing rank constraints on the point track matrix. However, quite frequently, close-up views tend to capture planar surfaces producing degenerate data. Estimating missing data using the rank constraint requires that all known measurements are full rank in all images of the sequence. If one single frame is degenerate, the whole sequence will produce high errors on the reconstructed shape, even though the observation matrix verifies the rank 4 constraint. In this paper, we propose to solve the structure from motion problem with degenerate data, introducing a new factorization algorithm that imposes the full scaled-orthographic model in one single optimization procedure. By imposing all model constraints, a unique (correct) 3D shape is estimated regardless of the data degeneracies. Experiments show that remarkably good reconstructions are obtained with an approximate model such as orthography.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2015
Ricardo Silveira Cabral; Fernando De la Torre; João Paulo Costeira; Alexandre Bernardino
In the last few years, image classification has become an incredibly active research topic, with widespread applications. Most methods for visual recognition are fully supervised, as they make use of bounding boxes or pixelwise segmentations to locate objects of interest. However, this type of manual labeling is time consuming, error prone and it has been shown that manual segmentations are not necessarily the optimal spatial enclosure for object classifiers. This paper proposes a weakly-supervised system for multi-label image classification. In this setting, training images are annotated with a set of keywords describing their contents, but the visual concepts are not explicitly segmented in the images. We formulate the weakly-supervised image classification as a low-rank matrix completion problem. Compared to previous work, our proposed framework has three advantages: (1) Unlike existing solutions based on multiple-instance learning methods, our model is convex. We propose two alternative algorithms for matrix completion specifically tailored to visual data, and prove their convergence. (2) Unlike existing discriminative methods, our algorithm is robust to labeling errors, background noise and partial occlusions. (3) Our method can potentially be used for semantic segmentation. Experimental validation on several data sets shows that our method outperforms state-of-the-art classification algorithms, while effectively capturing each class appearance.
International Conference on Image Analysis and Processing | 1999
Luis Jordao; Matteo Perrone; João Paulo Costeira; José Santos-Victor
This paper describes a method for the detection and tracking of human face and facial features. Skin segmentation is learnt from samples of an image. After detecting a moving object, the corresponding area is searched for clusters of pixels with a known distribution. Since we only use the hue (color) component this process is quite insensitive to illumination changes. The face localization procedure looks for areas in the segmented area which resemble a head. Using simple heuristics, the located head is searched and its centroid is fed back to a camera motion control algorithm which tries to keep the face centered in the image using a pan-tilt camera unit. Furthermore the system is capable of tracking, in every frame, the three main features of a human face. Since precise eye location is computationally intensive, an eye and mouth locator using fast morphological and linear filters is developed. This allows for frame-by-frame checking, which reduces the probability of tracking a non-basis feature, yielding a higher success ratio. Velocity and robustness are the main advantages of this fast facial feature detector.
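The hue-only skin segmentation step can be sketched as follows. This is a simplified illustration and not the authors' system: it assumes a single Gaussian-like hue model learnt from sample pixels, ignores the wrap-around of hue near red, and uses hypothetical helper names (`hue_image`, `fit_hue_model`, `skin_mask`) and a hypothetical minimum spread of 0.02.

```python
import colorsys
import numpy as np

def hue_image(rgb):
    """Per-pixel hue in [0, 1) for an H x W x 3 float RGB image with values in [0, 1]."""
    flat = rgb.reshape(-1, 3)
    hues = np.array([colorsys.rgb_to_hsv(r, g, b)[0] for r, g, b in flat])
    return hues.reshape(rgb.shape[:2])

def fit_hue_model(sample_rgb):
    """Learn the mean and spread of hue from sample skin pixels."""
    h = hue_image(sample_rgb)
    # Assumption: floor the spread so a very uniform sample still tolerates small deviations.
    return h.mean(), max(h.std(), 0.02)

def skin_mask(rgb, mean, std, k=2.5):
    """Mark pixels whose hue lies within k standard deviations of the learnt mean."""
    return np.abs(hue_image(rgb) - mean) <= k * std

# Hypothetical sample patch of skin-like colours.
sample = np.array([[[0.8, 0.6, 0.5], [0.7, 0.5, 0.4]]])
mean, std = fit_hue_model(sample)

# Test image: a bright skin pixel, the same colour at half brightness, and a blue pixel.
image = np.array([[[0.8, 0.6, 0.5], [0.4, 0.3, 0.25], [0.1, 0.2, 0.9]]])
mask = skin_mask(image, mean, std)
print(mask)
```

Halving the brightness of a pixel leaves its hue unchanged, which is the sense in which a hue-only model is insensitive to illumination changes.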
Computer Vision and Pattern Recognition | 2008
N.P. da Silva; João Paulo Costeira
Segmenting arbitrary unions of linear subspaces is an important tool for computer vision tasks such as motion and image segmentation, SfM or object recognition. We segment subspaces by searching for the orthogonal complement of the subspace supported by the majority of the observations, i.e., the maximum consensus subspace. It is formulated as a Grassmannian optimization problem: a smooth, constrained but nonconvex program is immersed into the Grassmann manifold, resulting in a low dimensional and unconstrained program solved with an efficient optimization algorithm. Nonconvexity implies that global optimality depends on the initialization. However, by finding the maximum consensus subspace, outlier rejection becomes an inherent property of the method. Besides robustness, it does not rely on prior global detection procedures (e.g., rank of data matrices), which is the case of most current works. We test our algorithm in both synthetic and real data, where no outlier was ever classified as inlier.
Image and Vision Computing | 2002
João Maciel; João Paulo Costeira
We propose a new methodology for reliably solving the correspondence problem between points of two or more images. This is a key step in most problems of Computer Vision and, so far, no general method exists to solve it. Our methodology is able to handle most of the commonly used assumptions in a unique formulation, independent of the domain of application and type of features. It performs correspondence and outlier rejection in a single step, and achieves global optimality with feasible computation. Feature selection and correspondence are first formulated as an integer optimization problem. This is a blunt formulation, which considers the whole combinatorial space of possible point selections and correspondences. To find its global optimal solution we build a concave objective function and relax the search domain into its convex-hull. The special structure of this extended problem assures its equivalence to the original one, but it can be optimally solved by efficient algorithms that avoid combinatorial search.
European Conference on Computer Vision | 2012
Gustavo Carneiro; Nuno Silva; Alessio Del Bue; João Paulo Costeira
Artistic image understanding is an interdisciplinary research field of increasing importance for the computer vision and the art history communities. For computer vision scientists, this problem offers challenges where new techniques can be developed; and for the art history community new automatic art analysis tools can be developed. On the positive side, artistic images are generally constrained by compositional rules and artistic themes. However, the low-level texture and color features exploited for photographic image analysis are not as effective because of inconsistent color and texture patterns describing the visual classes in artistic images. In this work, we present a new database of monochromatic artistic images containing 988 images with a global semantic annotation, a local compositional annotation, and a pose annotation of human subjects and animal types. In total, 75 visual classes are annotated, from which 27 are related to the theme of the art image, and 48 are visual classes that can be localized in the image with bounding boxes. Out of these 48 classes, 40 have pose annotation, with 37 denoting human subjects and 3 representing animal types. We also provide a complete evaluation of several algorithms recently proposed for image annotation and retrieval. We then present an algorithm achieving remarkable performance over the most successful algorithm hitherto proposed for this problem. Our main goal with this paper is to make this database, the evaluation process, and the benchmark results available for the computer vision community.