Ali Osman Ulusoy
Brown University
Publications
Featured research published by Ali Osman Ulusoy.
computer vision and pattern recognition | 2017
Gernot Riegler; Ali Osman Ulusoy; Andreas Geiger
We present OctNet, a representation for deep learning with sparse 3D data. In contrast to existing models, our representation enables 3D convolutional networks that are both deep and high resolution. Towards this goal, we exploit the sparsity in the input data to hierarchically partition the space using a set of unbalanced octrees in which each leaf node stores a pooled feature representation. This allows memory allocation and computation to be focused on the relevant dense regions and enables deeper networks without compromising resolution. We demonstrate the utility of our OctNet representation by analyzing the impact of resolution on several 3D tasks, including 3D object classification, orientation estimation, and point cloud labeling.
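To make the partitioning idea concrete, here is a minimal Python sketch of an unbalanced octree whose leaves store one pooled feature each; the function name `build_octree` and the mean-pooling rule are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def build_octree(vol, x0, y0, z0, size, min_size=2):
    """Recursively partition a cubic region; empty (sparse) regions stay single leaves."""
    block = vol[x0:x0 + size, y0:y0 + size, z0:z0 + size]
    if size <= min_size or not block.any():
        return ("leaf", float(block.mean()))   # leaf stores one pooled feature
    h = size // 2
    children = [build_octree(vol, x0 + dx, y0 + dy, z0 + dz, h, min_size)
                for dx in (0, h) for dy in (0, h) for dz in (0, h)]
    return ("node", children)

vol = np.zeros((8, 8, 8), dtype=np.float32)    # mostly empty occupancy grid
vol[0, 0, 0] = 1.0
tree = build_octree(vol, 0, 0, 0, 8)           # collapses to a handful of leaves
```

Because empty octants terminate immediately, memory and compute concentrate on the occupied regions, which is the property the abstract highlights.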
international conference on computer vision | 2009
Ali Osman Ulusoy; Fatih Calakli; Gabriel Taubin
In this paper we present a new “one-shot” method to reconstruct the shape of dynamic 3D objects and scenes based on active illumination. In common with other related prior-art methods, a static grid pattern is projected onto the scene, a video sequence of the illuminated scene is captured, a shape estimate is produced independently for each video frame, and the one-shot property is realized at the expense of spatial resolution. The main challenge in grid-based one-shot methods is to engineer the pattern and algorithms so that the correspondence between pattern grid points and their images can be established very fast and without uncertainty. We present an efficient one-shot method which exploits simple geometric constraints to solve the correspondence problem. We also introduce De Bruijn spaced grids, a novel grid pattern, and show with strong empirical data that the resulting scheme is much more robust than schemes based on uniformly spaced grids.
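The property that makes De Bruijn spacing useful is that every window of n consecutive symbols occurs exactly once, so a short local observation of spacings identifies its position in the pattern. Below is a hedged sketch using the standard FKM construction of a De Bruijn sequence; the paper's actual pattern parameters and decoding are not reproduced here.

```python
def de_bruijn(k, n):
    """De Bruijn sequence over alphabet {0..k-1}: every length-n word appears once (cyclically)."""
    a = [0] * (k * n)
    seq = []
    def db(t, p):
        if t > n:
            if n % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)
    db(1, 1)
    return seq

spacings = de_bruijn(3, 3)                       # 27 symbols; any 3 consecutive spacings are unique
positions = [0]
for s in spacings:
    positions.append(positions[-1] + 2 + s)      # grid-line positions: base gap plus coded offset
```

Observing any three consecutive gaps in the camera image then pins down which grid lines were seen, which is the disambiguation the uniform grid lacks.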
computer vision and pattern recognition | 2010
Ali Osman Ulusoy; Fatih Calakli; Gabriel Taubin
A structured-light technique can greatly simplify the problem of shape recovery from images. There are currently two main research challenges in the design of such techniques. One is handling complicated scenes involving texture, occlusions, shadows, sharp discontinuities, and in some cases even dynamic change; the other is speeding up the acquisition process by requiring a small number of images and computationally less demanding algorithms. This paper presents a “one-shot” variant of such techniques to tackle the aforementioned challenges. It works by projecting a static grid pattern onto the scene and identifying the correspondence between grid stripes and the camera image. The correspondence problem is formulated using a novel graphical model and solved efficiently using loopy belief propagation. Unlike prior approaches, the proposed approach uses non-deterministic geometric constraints and can thereby handle spurious connections of stripe images. The effectiveness of the proposed approach is verified on a variety of complicated real scenes.
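Since the method hinges on loopy belief propagation, the following generic sum-product sketch on a small pairwise model may help; the nodes, potentials, and flooding schedule are assumptions for illustration and are not the stripe-correspondence model from the paper.

```python
import numpy as np

def loopy_bp(unary, edges, pairwise, iters=50):
    """unary: {node: np.array(K)}; edges: list of (i, j);
    pairwise: {(i, j): np.array(K_i, K_j)}. Returns normalized per-node beliefs."""
    msgs = {}
    for (i, j) in edges:
        msgs[(i, j)] = np.ones(len(unary[j]))
        msgs[(j, i)] = np.ones(len(unary[i]))
    for _ in range(iters):
        for (src, dst) in list(msgs):
            # Product of src's unary and all incoming messages except dst's.
            prod = unary[src].copy()
            for (a, b) in msgs:
                if b == src and a != dst:
                    prod = prod * msgs[(a, b)]
            pot = pairwise.get((src, dst))
            if pot is None:
                pot = pairwise[(dst, src)].T    # reuse potential for reverse direction
            m = pot.T @ prod                    # marginalize out src's label
            msgs[(src, dst)] = m / m.sum()
    beliefs = {}
    for node in unary:
        b = unary[node].copy()
        for (a, c) in msgs:
            if c == node:
                b = b * msgs[(a, c)]
        beliefs[node] = b / b.sum()
    return beliefs

# Toy usage: two binary nodes that prefer to agree.
unary = {0: np.array([0.9, 0.1]), 1: np.array([0.2, 0.8])}
pairwise = {(0, 1): np.array([[0.8, 0.2], [0.2, 0.8]])}
print(loopy_bp(unary, [(0, 1)], pairwise))
```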
international conference on 3d vision | 2015
Ali Osman Ulusoy; Andreas Geiger; Michael J. Black
This paper presents a novel probabilistic foundation for volumetric 3D reconstruction. We formulate the problem as inference in a Markov random field, which accurately captures the dependencies between the occupancy and appearance of each voxel, given all input images. Our main contribution is an approximate, highly parallelized discrete-continuous inference algorithm to compute the marginal distributions of each voxel's occupancy and appearance. In contrast to the MAP solution, marginals encode the underlying uncertainty and ambiguity in the reconstruction. Moreover, the proposed algorithm allows for a Bayes optimal prediction with respect to a natural reconstruction loss. We compare our method to two state-of-the-art volumetric reconstruction algorithms on three challenging aerial datasets with LIDAR ground truth. Our experiments demonstrate that the proposed algorithm compares favorably in terms of reconstruction accuracy and the ability to expose reconstruction uncertainty.
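One thing marginals buy, as the abstract notes, is Bayes-optimal prediction under a chosen loss. A toy Python illustration, assuming a simple asymmetric per-voxel loss (the paper's reconstruction loss is a different, task-specific choice):

```python
import numpy as np

def bayes_optimal_occupancy(p_occ, loss_fp=1.0, loss_fn=4.0):
    """Label a voxel occupied when its expected loss is lower:
    (1 - p) * loss_fp < p * loss_fn, i.e. p >= loss_fp / (loss_fp + loss_fn)."""
    return p_occ >= loss_fp / (loss_fp + loss_fn)

marginals = np.array([0.1, 0.25, 0.6, 0.9])     # per-voxel occupancy marginals
print(bayes_optimal_occupancy(marginals))       # [False  True  True  True]
```

A MAP reconstruction offers no such knob: without per-voxel marginals there is no principled way to trade missed surface against false structure.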
IEEE Journal of Selected Topics in Signal Processing | 2012
Maria I. Restrepo; Brandon A. Mayer; Ali Osman Ulusoy; Joseph L. Mundy
This paper presents a new volumetric representation for categorizing objects in large-scale 3-D scenes reconstructed from image sequences. This work uses a probabilistic volumetric model (PVM) that combines the ideas of background modeling and volumetric multi-view reconstruction to handle the uncertainty inherent in the problem of reconstructing 3-D structures from 2-D images. The advantages of probabilistic modeling have been demonstrated by recent application of the PVM representation to video image registration, change detection, and classification of changes based on PVM context. The applications just mentioned operate on 2-D projections of the PVM. This paper presents the first work to characterize and use the local 3-D information in the scenes. Two approaches to local feature description are proposed and compared: 1) features derived from a principal component analysis (PCA) of model neighborhoods; and 2) features derived from the coefficients of a 3-D Taylor series expansion within each neighborhood. The resulting description is used in a bag-of-features approach to classify buildings, houses, cars, planes, and parking lots learned from aerial imagery collected over Providence, RI. It is shown that both feature descriptions explain the data with similar accuracy, and their effectiveness for dense-feature categorization is compared for the different classes. Finally, 3-D extensions of the Harris corner detector and a Hessian-based detector are used to detect salient features. Both types of salient features are evaluated through object categorization experiments, where only features with maximal response are retained. For most saliency criteria tested, features based on the determinant of the Hessian achieved higher classification accuracy than Harris-based features.
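As a rough illustration of the first descriptor family, one can summarize a voxel neighborhood by the eigenvalues of its covariance; the neighborhood radius, occupancy threshold, and normalization below are assumptions, not the paper's exact features.

```python
import numpy as np

def pca_neighborhood_descriptor(occ, center, radius=3):
    """Eigenvalues of the covariance of occupied-voxel offsets around center
    (center is assumed at least `radius` voxels from the volume border)."""
    x, y, z = center
    idx = np.argwhere(occ[x - radius:x + radius + 1,
                          y - radius:y + radius + 1,
                          z - radius:z + radius + 1] > 0.5)
    if len(idx) < 4:
        return np.zeros(3)                        # too little support for a covariance
    cov = np.cov((idx - radius).T)                # 3x3 covariance of voxel offsets
    evals = np.sort(np.linalg.eigvalsh(cov))[::-1]
    return evals / (evals.sum() + 1e-12)          # scale-normalized shape signature
```

The eigenvalue profile separates local geometry types: a planar roof patch has one near-zero eigenvalue, a linear structure two, and a blob-like region three comparable ones.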
international conference on 3d imaging, modeling, processing, visualization & transmission | 2012
Fatih Calakli; Ali Osman Ulusoy; Maria I. Restrepo; Gabriel Taubin; Joseph L. Mundy
This paper presents a novel framework for surface reconstruction from multi-view aerial imagery of large-scale urban scenes, which combines probabilistic volumetric modeling with smooth signed distance surface estimation to produce very detailed and accurate surfaces. Using a continuous probabilistic volumetric model which allows for explicit representation of ambiguities caused by moving objects, reflective surfaces, areas of constant appearance, and self-occlusions, the algorithm learns the geometry and appearance of a scene from a calibrated image sequence. An online implementation of the Bayesian learning process on GPUs significantly reduces the time required to process a large number of images. The probabilistic volumetric model of occupancy is subsequently used to estimate a smooth approximation of the signed distance function to the surface. This step, which reduces to the solution of a sparse linear system, is very efficient and scalable to large data sets. The proposed algorithm is shown to produce high-quality surfaces in challenging aerial scenes where previous methods make large errors in surface localization. The general applicability of the algorithm beyond aerial imagery is confirmed against the Middlebury benchmark.
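A 1-D toy of the "one sparse linear solve" step: fit a smooth signed-distance-like function to occupancy probabilities with a curvature penalty. The real method is 3-D and estimates a genuine smooth signed distance function; the data term and weights here are simplifying assumptions.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

p = np.linspace(0.05, 0.95, 64)                  # occupancy probabilities along a line
target = 0.5 - p                                 # crude signed pseudo-distance data term
n = len(p)
I = sp.identity(n, format="csc")
D2 = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n), format="csc")
lam = 10.0                                       # smoothness weight
f = spla.spsolve(I + lam * (D2.T @ D2), target)  # (I + lam * D2'D2) f = target
surface = np.argmin(np.abs(f))                   # zero crossing ~= surface location
```

Because the system matrix stays sparse and banded, the solve scales to very large grids, which is the scalability claim in the abstract.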
european conference on computer vision | 2014
Ali Osman Ulusoy; Joseph L. Mundy
This paper describes an approach to reconstruct the complete history of a 3-d scene over time from imagery. The proposed approach avoids rebuilding 3-d models of the scene at each time instant. Instead, the approach employs an initial 3-d model which is continuously updated with changes in the environment to form a full 4-d representation. This updating scheme is enabled by a novel algorithm that infers 3-d changes with respect to the model at one time step from images taken at a subsequent time step. This algorithm can effectively detect changes even when the illumination conditions between image collections are significantly different. The performance of the proposed framework is demonstrated on four challenging datasets in terms of 4-d modeling accuracy as well as quantitative evaluation of 3-d change detection.
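The core probabilistic move, deciding whether a new image is explained by the stored appearance or by a change, can be illustrated for a single pixel. The Gaussian appearance model and uniform "changed" hypothesis below are assumptions; the paper's inference additionally reasons about visibility along rays in the 3-d model, which is what gives it illumination robustness.

```python
import numpy as np

def change_posterior(obs, mean, std, p_change=0.1):
    """Posterior probability that the surface changed, given a new pixel intensity."""
    # Likelihood under the stored (unchanged) Gaussian appearance model.
    like_same = np.exp(-0.5 * ((obs - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))
    like_changed = 1.0               # uniform density over normalized intensity [0, 1]
    num = p_change * like_changed
    return num / (num + (1 - p_change) * like_same)

print(change_posterior(obs=0.90, mean=0.3, std=0.05))  # ~1.0: strong evidence of change
print(change_posterior(obs=0.31, mean=0.3, std=0.05))  # small: consistent with the model
```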
international conference on computer vision | 2013
Ali Osman Ulusoy; Octavian Biris; Joseph L. Mundy
This paper presents a probabilistic volumetric framework for image-based modeling of general dynamic 3-d scenes. The framework is targeted towards high-quality modeling of complex scenes evolving over thousands of frames. Extensive storage and computational resources are required to process large-scale space-time (4-d) data. Existing methods typically store separate 3-d models at each time step and do not address such limitations. A novel 4-d representation is proposed that adaptively subdivides in space and time to explain the appearance of 3-d dynamic surfaces. This representation is shown to achieve compression of 4-d data and provide efficient spatio-temporal processing. The advantages of the proposed framework are demonstrated on standard datasets using free-viewpoint video and 3-d tracking applications.
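A time-axis-only toy of the adaptive subdivision idea: an interval is stored as one node while its appearance is nearly constant and split otherwise. The actual representation subdivides jointly in space and time; the variance test and threshold here are assumptions.

```python
import numpy as np

def subdivide_time(appearance, t0, t1, tol=0.01):
    """appearance: 1-D array of per-frame values for one spatial cell."""
    seg = appearance[t0:t1]
    if t1 - t0 <= 1 or seg.var() < tol:
        return ("leaf", t0, t1, float(seg.mean()))   # one node covers the whole span
    mid = (t0 + t1) // 2
    return ("node",
            subdivide_time(appearance, t0, mid, tol),
            subdivide_time(appearance, mid, t1, tol))

frames = np.concatenate([np.full(500, 0.2), np.full(500, 0.7)])
tree = subdivide_time(frames, 0, len(frames))        # 1000 frames -> two leaves
```

Static stretches collapse into single nodes, which is how the representation compresses 4-d data relative to storing a 3-d model per frame.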
computer vision and pattern recognition | 2016
Ali Osman Ulusoy; Michael J. Black; Andreas Geiger
In this paper, we propose a non-local structured prior for volumetric multi-view 3D reconstruction. Towards this goal, we present a novel Markov random field model based on ray potentials in which assumptions about large 3D surface patches such as planarity or Manhattan world constraints can be efficiently encoded as probabilistic priors. We further derive an inference algorithm that reasons jointly about voxels, pixels and image segments, and estimates marginal distributions of appearance, occupancy, depth, normals and planarity. Key to tractable inference is a novel hybrid representation that spans both voxel and pixel space and that integrates non-local information from 2D image segmentations in a principled way. We compare our non-local prior to commonly employed local smoothness assumptions and a variety of state-of-the-art volumetric reconstruction baselines on challenging outdoor scenes with textureless and reflective surfaces. Our experiments indicate that regularizing over larger distances has the potential to resolve ambiguities where local regularizers fail.
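A useful building block behind ray potentials is the distribution over which voxel along a ray is the first occupied one; under independent per-voxel occupancy marginals o_i this has the closed form o_i * prod_{j<i}(1 - o_j). A minimal sketch (the paper's joint inference over voxels, pixels, and segments goes well beyond this):

```python
import numpy as np

def first_hit_distribution(occ):
    """occ: occupancy probabilities of voxels ordered front-to-back along one ray."""
    free_before = np.concatenate([[1.0], np.cumprod(1.0 - occ)[:-1]])
    p_hit = occ * free_before              # voxel i is hit iff occupied and all before are free
    return p_hit, 1.0 - p_hit.sum()        # per-voxel hit probability, probability ray escapes

occ = np.array([0.1, 0.2, 0.8, 0.5])
p_hit, p_escape = first_hit_distribution(occ)   # p_hit sums with p_escape to 1
```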
computer vision and pattern recognition | 2017
Ali Osman Ulusoy; Michael J. Black; Andreas Geiger
Dense 3D reconstruction from RGB images is a highly ill-posed problem due to occlusions, textureless or reflective surfaces, as well as other challenges. We propose object-level shape priors to address these ambiguities. Towards this goal, we formulate a probabilistic model that integrates multi-view image evidence with 3D shape information from multiple objects. Inference in this model yields a dense 3D reconstruction of the scene as well as the existence and precise 3D pose of the objects in it. Our approach is able to recover fine details not captured in the input shapes while defaulting to the input models in occluded regions where image evidence is weak. Due to its probabilistic nature, the approach is able to cope with the approximate geometry of the 3D models as well as input shapes that are not present in the scene. We evaluate the approach quantitatively on several challenging indoor and outdoor datasets.
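A deliberately simple sketch of the "default to the model where evidence is weak" behavior: fuse a per-voxel image-evidence estimate with a shape-prior estimate, weighted by evidence confidence. The paper integrates these in a joint probabilistic model over occupancy, object existence, and pose; this convex combination is only an assumption-laden stand-in for that model.

```python
import numpy as np

def fuse(p_image, confidence, p_shape):
    """Convex combination weighted by per-voxel image-evidence confidence."""
    return confidence * p_image + (1.0 - confidence) * p_shape

p_image = np.array([0.9, 0.5, 0.5])      # last two voxels are poorly observed (occluded)
confidence = np.array([0.95, 0.1, 0.1])  # hypothetical per-voxel evidence weights
p_shape = np.array([1.0, 1.0, 0.0])      # the fitted object model: occupied, occupied, free
print(fuse(p_image, confidence, p_shape))  # well-observed voxel follows the images;
                                           # occluded voxels fall back to the shape prior
```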