Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Sudipta N. Sinha is active.

Publication


Featured research published by Sudipta N. Sinha.


International Journal of Computer Vision | 2008

Detailed Real-Time Urban 3D Reconstruction from Video

Marc Pollefeys; David Nistér; Jan Michael Frahm; Amir Akbarzadeh; Philippos Mordohai; Brian Clipp; Chris Engels; David Gallup; Seon Joo Kim; Paul Merrell; C. Salmi; Sudipta N. Sinha; B. Talton; Liang Wang; Qingxiong Yang; Henrik Stewenius; Ruigang Yang; Greg Welch; Herman Towles

The paper presents a system for automatic, geo-registered, real-time 3D reconstruction from video of urban scenes. The system collects video streams, as well as GPS and inertial measurements, in order to place the reconstructed models in geo-registered coordinates. It is designed using state-of-the-art real-time modules for all processing steps. It employs commodity graphics hardware and standard CPUs to achieve real-time performance. We present the main considerations in designing the system and the steps of the processing pipeline. Our system extends existing algorithms to meet the robustness and variability necessary to operate out of the lab. To account for the large dynamic range of outdoor videos, the processing pipeline estimates global camera gain changes in the feature tracking stage and efficiently compensates for these in stereo estimation without impacting the real-time performance. The required accuracy for many applications is achieved with a two-step stereo reconstruction process exploiting the redundancy across frames. We show results on real video sequences comprising hundreds of thousands of frames.
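
To picture the gain handling described above, here is a minimal sketch (our own illustration, not the paper's code) that estimates a single global gain ratio between consecutive frames from the intensities of tracked feature points and compensates for it before stereo matching:

```python
# Hypothetical sketch: estimate a global gain ratio between two frames
# from intensities sampled at the same tracked feature locations.
import numpy as np

def estimate_gain_ratio(prev_intensities, curr_intensities):
    """Least-squares estimate of g such that curr ≈ g * prev."""
    prev = np.asarray(prev_intensities, dtype=np.float64)
    curr = np.asarray(curr_intensities, dtype=np.float64)
    # Closed-form 1-parameter least squares: g = <prev, curr> / <prev, prev>
    return float(prev @ curr / (prev @ prev))

def compensate(frame, gain):
    # Undo the estimated gain before stereo matching.
    return np.clip(frame / gain, 0.0, 255.0)

# Toy usage: a frame that brightened by 20% yields g ≈ 1.2.
prev = np.array([50.0, 120.0, 200.0])
curr = 1.2 * prev
print(estimate_gain_ratio(prev, curr))  # ~1.2
```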


International Conference on Computer Graphics and Interactive Techniques | 2008

Interactive 3D architectural modeling from unordered photo collections

Sudipta N. Sinha; Drew Steedly; Richard Szeliski; Maneesh Agrawala; Marc Pollefeys

We present an interactive system for generating photorealistic, textured, piecewise-planar 3D models of architectural structures and urban scenes from unordered sets of photographs. To reconstruct 3D geometry in our system, the user draws outlines overlaid on 2D photographs. The 3D structure is then automatically computed by combining the 2D interaction with the multi-view geometric information recovered by performing structure from motion analysis on the input photographs. We utilize vanishing point constraints at multiple stages during the reconstruction, which is particularly useful for architectural scenes where parallel lines are abundant. Our approach enables us to accurately model polygonal faces from 2D interactions in a single image. Our system also supports useful operations such as edge snapping and extrusions. Seamless texture maps are automatically generated by combining multiple input photographs using graph cut optimization and Poisson blending. The user can add brush strokes as hints during the texture generation stage to remove artifacts caused by unmodeled geometric structures. We build models for a variety of architectural scenes from collections of up to about a hundred photographs.
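
As a hedged illustration of the vanishing point constraints the abstract mentions (a standard technique, not necessarily the paper's exact formulation), a vanishing point can be estimated as the least-squares intersection of roughly parallel line segments via a homogeneous SVD solve:

```python
# Illustrative sketch: estimate a vanishing point from image line segments.
import numpy as np

def vanishing_point(segments):
    """segments: list of ((x1, y1), (x2, y2)) image line segments.

    Each segment's homogeneous line is l = p1 x p2; the vanishing point v
    minimizes |L v| with |v| = 1, i.e. the smallest right singular vector of L.
    """
    lines = []
    for (x1, y1), (x2, y2) in segments:
        l = np.cross([x1, y1, 1.0], [x2, y2, 1.0])
        lines.append(l / np.linalg.norm(l))
    _, _, vt = np.linalg.svd(np.array(lines))
    v = vt[-1]
    return v[:2] / v[2] if abs(v[2]) > 1e-12 else v[:2]  # finite or ideal point

# Toy usage: two segments converging toward (100, 0).
segs = [((0.0, 50.0), (50.0, 25.0)), ((0.0, -50.0), (50.0, -25.0))]
print(vanishing_point(segs))  # ~ (100, 0)
```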


International Symposium on 3D Data Processing, Visualization and Transmission | 2006

Towards Urban 3D Reconstruction from Video

Amir Akbarzadeh; Jan Michael Frahm; Philippos Mordohai; Brian Clipp; Chris Engels; David Gallup; Paul Merrell; M. Phelps; Sudipta N. Sinha; B. Talton; Liang Wang; Qingxiong Yang; Henrik Stewenius; Ruigang Yang; Greg Welch; Herman Towles; David Nistér; Marc Pollefeys

The paper introduces a data collection system and a processing pipeline for automatic geo-registered 3D reconstruction of urban scenes from video. The system collects multiple video streams, as well as GPS and INS measurements, in order to place the reconstructed models in geo-registered coordinates. Besides high quality in terms of both geometry and appearance, we aim at real-time performance. Even though our processing pipeline is currently far from real-time, we select techniques and design processing modules that can achieve fast performance on multiple CPUs and GPUs, aiming at real-time performance in the near future. We present the main considerations in designing the system and the steps of the processing pipeline. We show results on real video sequences captured by our system.
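
One way to realize the geo-registration step mentioned above is to fit a similarity transform mapping reconstructed camera centers onto GPS positions. The sketch below uses the standard Umeyama/Procrustes alignment under our own assumptions, not necessarily the paper's method:

```python
# Minimal sketch: align a local reconstruction to geo-registered coordinates.
import numpy as np

def similarity_align(local_pts, geo_pts):
    """Return (s, R, t) with geo ≈ s * R @ local + t. Points are (N, 3)."""
    X, Y = np.asarray(local_pts), np.asarray(geo_pts)
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    U, S, Vt = np.linalg.svd(Yc.T @ Xc)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # avoid reflection
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / (Xc ** 2).sum()
    t = my - s * R @ mx
    return s, R, t

# Toy usage: recover a pure translation with unit scale.
local = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1.0]])
s, R, t = similarity_align(local, local + [10.0, 20.0, 30.0])
print(s, t)  # s ≈ 1, t ≈ (10, 20, 30)
```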


International Conference on Computer Vision | 2009

Piecewise planar stereo for image-based rendering

Sudipta N. Sinha; Drew Steedly; Richard Szeliski

We present a novel multi-view stereo method designed for image-based rendering that generates piecewise planar depth maps from an unordered collection of photographs.
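
A basic building block for piecewise-planar depth maps is fitting dominant planes to 3D points with RANSAC. The sketch below is illustrative only; names and thresholds are our own assumptions, not the paper's pipeline:

```python
# Hedged sketch: RANSAC plane fit to a point cloud.
import numpy as np

def ransac_plane(points, iters=500, thresh=0.01, rng=None):
    """Fit a plane n . p + d = 0 to (N, 3) points; return (n, d, inlier mask)."""
    rng = np.random.default_rng(rng)
    pts = np.asarray(points)
    best = (None, None, np.zeros(len(pts), dtype=bool))
    for _ in range(iters):
        p0, p1, p2 = pts[rng.choice(len(pts), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-12:
            continue  # degenerate (collinear) sample
        n = n / norm
        d = -n @ p0
        inliers = np.abs(pts @ n + d) < thresh
        if inliers.sum() > best[2].sum():
            best = (n, d, inliers)
    return best
```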


Computer Vision and Pattern Recognition | 2011

Object stereo — Joint stereo matching and object segmentation

Michael Bleyer; Carsten Rother; Pushmeet Kohli; Daniel Scharstein; Sudipta N. Sinha

This paper presents a method for joint stereo matching and object segmentation. In our approach a 3D scene is represented as a collection of visually distinct and spatially coherent objects. Each object is characterized by three different aspects: a color model, a 3D plane that approximates the object's disparity distribution, and a novel 3D connectivity property. Inspired by Markov Random Field models of image segmentation, we employ object-level color models as a soft constraint, which can aid depth estimation in powerful ways. In particular, our method is able to recover the depth of regions that are fully occluded in one input view, which to our knowledge is new for stereo matching. Our model is formulated as an energy function that is optimized via fusion moves. We show high-quality disparity and object segmentation results on challenging image pairs as well as standard benchmarks. We believe our work not only demonstrates a novel synergy between the areas of image segmentation and stereo matching, but may also inspire new work in the domain of automatic and interactive object-level scene manipulation.
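
The sketch below shows the flavor of per-pixel energy such a model combines; the terms and weights are illustrative assumptions on our part, not the paper's exact formulation:

```python
# Illustrative sketch: a per-pixel cost mixing an object color model
# with deviation from the object's 3D disparity plane d(x, y) = a*x + b*y + c.
import numpy as np

def pixel_energy(color, disparity, x, y, obj, w_color=1.0, w_plane=1.0):
    """obj: dict with 'mean_color' (3,) and 'plane' coefficients (a, b, c)."""
    color_cost = np.sum((np.asarray(color) - obj["mean_color"]) ** 2)
    a, b, c = obj["plane"]
    plane_cost = (disparity - (a * x + b * y + c)) ** 2
    return w_color * color_cost + w_plane * plane_cost
```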


International Conference on Computer Vision | 2005

Multi-view reconstruction using photo-consistency and exact silhouette constraints: a maximum-flow formulation

Sudipta N. Sinha; Marc Pollefeys

This paper describes a novel approach for reconstructing a closed continuous surface of an object from multiple calibrated color images and silhouettes. Any accurate reconstruction must satisfy (1) photo-consistency and (2) silhouette consistency constraints. Most existing techniques treat these cues identically in optimization frameworks where silhouette constraints are traded off against photo-consistency and smoothness priors. Our approach strictly enforces silhouette constraints, while optimizing photo-consistency and smoothness in a global graph-cut framework. We transform the reconstruction problem into computing max-flow/min-cut in a geometric graph, where any cut corresponds to a surface satisfying exact silhouette constraints (its silhouettes should exactly coincide with those of the visual hull); a minimum cut is the most photo-consistent surface amongst them. Our graph-cut formulation is based on the rim mesh (the combinatorial arrangement of rims, or contour generators, from many views), which can be computed directly from the silhouettes. Unlike other methods, our approach enforces silhouette constraints without introducing a bias near the visual hull boundary and also recovers the rim curves. Results are presented for synthetic and real datasets.
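
To show the underlying machinery in miniature, here is a min-cut on a tiny capacitated graph via networkx. The paper's graph is built on the rim mesh with photo-consistency costs as capacities; this toy graph (made-up numbers) just demonstrates the solve:

```python
# Toy min-cut/max-flow illustration; edge capacities stand in for
# photo-consistency costs.
import networkx as nx

G = nx.DiGraph()
G.add_edge("source", "a", capacity=3.0)
G.add_edge("source", "b", capacity=2.0)
G.add_edge("a", "b", capacity=1.0)
G.add_edge("a", "sink", capacity=2.0)
G.add_edge("b", "sink", capacity=3.0)

cut_value, (reachable, non_reachable) = nx.minimum_cut(G, "source", "sink")
print(cut_value)  # total cost of the optimal cut
```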


International Conference on Computer Vision | 2007

Multi-View Stereo via Graph Cuts on the Dual of an Adaptive Tetrahedral Mesh

Sudipta N. Sinha; Philippos Mordohai; Marc Pollefeys

We formulate multi-view 3D shape reconstruction as the computation of a minimum cut on the dual graph of a semi-regular, multi-resolution, tetrahedral mesh. Our method does not assume that the surface lies within a finite band around the visual hull or any other base surface. Instead, it uses photo-consistency to guide the adaptive subdivision of a coarse mesh of the bounding volume. This generates a multi-resolution volumetric mesh that is densely tessellated in the parts likely to contain the unknown surface. The graph-cut on the dual graph of this tetrahedral mesh produces a minimum cut corresponding to a triangulated surface that minimizes a global surface cost functional. Our method makes no assumptions about topology and can recover deep concavities when enough cameras observe them. Our formulation also allows silhouette constraints to be enforced during the graph-cut step to counter its inherent bias for producing minimal surfaces. Local shape refinement via surface deformation is used to recover details in the reconstructed surface. Reconstructions of the Multi-View Stereo Evaluation benchmark datasets and other real datasets show the effectiveness of our method.
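
A hedged sketch of the photo-consistency-guided adaptive subdivision: recursively split cells of the bounding volume wherever a (user-supplied) photo-consistency probe suggests the surface may pass through. The octree-style boxes here are a simplification of the paper's semi-regular tetrahedral mesh:

```python
# Illustrative sketch: adaptive subdivision of a bounding volume.
def subdivide(cell, likely_contains_surface, depth, max_depth):
    """cell: (min_corner, max_corner) axis-aligned box as 3-tuples.
    likely_contains_surface: callable cell -> bool (e.g. thresholded
    photo-consistency sampled inside the cell). Returns leaf cells."""
    if depth == max_depth or not likely_contains_surface(cell):
        return [cell]
    (x0, y0, z0), (x1, y1, z1) = cell
    mx, my, mz = (x0 + x1) / 2, (y0 + y1) / 2, (z0 + z1) / 2
    leaves = []
    for cx in ((x0, mx), (mx, x1)):
        for cy in ((y0, my), (my, y1)):
            for cz in ((z0, mz), (mz, z1)):
                child = ((cx[0], cy[0], cz[0]), (cx[1], cy[1], cz[1]))
                leaves += subdivide(child, likely_contains_surface,
                                    depth + 1, max_depth)
    return leaves

# Toy usage: always subdivide to depth 2 -> 64 leaf cells.
cells = subdivide(((0, 0, 0), (1, 1, 1)), lambda c: True, 0, 2)
print(len(cells))  # 64
```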


Computer Vision and Pattern Recognition | 2012

Real-time image-based 6-DOF localization in large-scale environments

Hyon Lim; Sudipta N. Sinha; Michael F. Cohen; Matthew Uyttendaele

We present a real-time approach for image-based localization within large scenes that have been reconstructed offline using structure from motion (SfM). From monocular video, our method continuously computes a precise 6-DOF camera pose, by efficiently tracking natural features and matching them to 3D points in the SfM point cloud. Our main contribution lies in efficiently interleaving a fast keypoint tracker that uses inexpensive binary feature descriptors with a new approach for direct 2D-to-3D matching. The 2D-to-3D matching avoids the need for online extraction of scale-invariant features. Instead, offline we construct an indexed database containing multiple DAISY descriptors per 3D point extracted at multiple scales. The key to the efficiency of our method lies in invoking DAISY descriptor extraction and matching sparingly during localization, and in distributing this computation over a window of successive frames. This enables the algorithm to run in real time, without fluctuations in latency over long durations. We evaluate the method in large indoor and outdoor scenes. Our algorithm runs at over 30 Hz on a laptop and at 12 Hz on a low-power, mobile computer suitable for onboard computation on a quadrotor micro aerial vehicle.
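
The pseudocode-style sketch below conveys the interleaving idea only; all names (tracker, daisy_db, solve_pnp, the per-frame budget) are hypothetical stand-ins, not the paper's API. Cheap binary-descriptor tracking runs every frame, while the expensive descriptor matching is amortized over a window of frames under a fixed budget, keeping latency flat:

```python
# Hypothetical sketch of budgeted 2D-to-3D matching interleaved with tracking.
BUDGET_PER_FRAME = 20  # max expensive descriptor matches per frame (assumed)

def localize_stream(frames, tracker, daisy_db, solve_pnp):
    pending = []  # keypoints awaiting expensive descriptor matching
    for frame in frames:
        tracked = tracker.track(frame)          # cheap binary-descriptor tracking
        pending.extend(k for k in tracked if k.point3d is None)
        # Spend at most BUDGET_PER_FRAME expensive matches this frame.
        for keypoint in pending[:BUDGET_PER_FRAME]:
            keypoint.point3d = daisy_db.match(frame, keypoint)  # may be None
        pending = pending[BUDGET_PER_FRAME:]
        matches_2d3d = [(k.xy, k.point3d) for k in tracked
                        if k.point3d is not None]
        yield solve_pnp(matches_2d3d)           # 6-DOF pose for this frame
```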


European Conference on Computer Vision | 2010

A multi-stage linear approach to structure from motion

Sudipta N. Sinha; Drew Steedly; Richard Szeliski

We present a new structure from motion (SfM) technique based on point and vanishing point (VP) matches in images. First, all global camera rotations are computed from VP matches as well as relative rotation estimates obtained from pairwise image matches. A new multi-stage linear technique is then used to estimate all camera translations and 3D points simultaneously. The proposed method involves first performing pairwise reconstructions, then robustly aligning these in pairs, and finally aligning all of them globally by simultaneously estimating their unknown relative scales and translations. In doing so, measurements inconsistent across three views are efficiently removed. Unlike sequential SfM, the proposed method treats all images equally, is easy to parallelize and does not require intermediate bundle adjustments. There is also a reduction of drift, and significant speedups of up to two orders of magnitude over sequential SfM. We compare our method with a standard SfM pipeline [1] and demonstrate that our linear estimates are accurate on a variety of datasets, and can serve as good initializations for final bundle adjustment. Because we exploit VPs when available, our approach is particularly well-suited to the reconstruction of man-made scenes.
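
To illustrate linear global rotation estimation from relative rotations (one standard technique; the paper's exact formulation may differ), the constraints R_j = R_ij R_i can be stacked as a linear system over matrix entries, with the gauge fixed by R_0 = I, and each estimate projected back onto SO(3) with an SVD:

```python
# Illustrative sketch: linear rotation averaging with SO(3) projection.
import numpy as np

def project_to_so3(M):
    U, _, Vt = np.linalg.svd(M)
    return U @ np.diag([1.0, 1.0, np.linalg.det(U @ Vt)]) @ Vt

def average_rotations(n, rel):
    """n cameras; rel maps (i, j) -> R_ij with R_j ≈ R_ij @ R_i.
    Unknowns: the 9 entries of each R_k, stored column-major."""
    rows, rhs = [], []
    def eq(coeffs, value):        # coeffs: {flat index: coefficient}
        row = np.zeros(9 * n)
        for idx, c in coeffs.items():
            row[idx] = c
        rows.append(row); rhs.append(value)
    for c in range(3):            # gauge fix: R_0 = I
        for r in range(3):
            eq({3 * c + r: 1.0}, 1.0 if r == c else 0.0)
    for (i, j), Rij in rel.items():
        for c in range(3):        # column c of R_j minus Rij @ column c of R_i
            for r in range(3):
                coeffs = {9 * j + 3 * c + r: 1.0}
                for k in range(3):
                    coeffs[9 * i + 3 * c + k] = -Rij[r, k]
                eq(coeffs, 0.0)
    x, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return [project_to_so3(x[9 * k: 9 * k + 9].reshape(3, 3, order="F"))
            for k in range(n)]

# Toy usage: two cameras related by a 90-degree rotation about z.
Rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
rots = average_rotations(2, {(0, 1): Rz})
# rots[0] ≈ I, rots[1] ≈ Rz
```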


Computer Vision and Image Understanding | 2006

Pan-tilt-zoom camera calibration and high-resolution mosaic generation

Sudipta N. Sinha; Marc Pollefeys

In this paper, we discuss the problem of estimating parameters of a calibration model for active pan-tilt-zoom cameras. The variation of the intrinsic parameters of each camera over its full range of zoom settings is estimated through a two-step procedure. We first determine the intrinsic parameters at the camera's lowest zoom setting very accurately by capturing an extended panorama. The camera intrinsics and radial distortion parameters are then determined at discrete steps in a monotonically increasing zoom sequence that spans the full zoom range of the camera. Our model incorporates the variation of radial distortion with camera zoom. Both calibration phases are fully automatic and do not assume any knowledge of the scene structure. High-resolution calibrated panoramic mosaics are also computed during this process. These fully calibrated panoramas are represented as multi-resolution pyramids of cube-maps. We describe a hierarchical approach for building multiple levels of detail in panoramas, by aligning hundreds of images captured within a 1-12× zoom range. Results are shown from datasets captured from two types of pan-tilt-zoom cameras placed in an uncontrolled outdoor environment. The estimated camera intrinsics model along with the cube-maps provides a calibration reference for images captured on the fly by the active pan-tilt-zoom camera under operation, making our approach promising for active camera network calibration.
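
One piece of such a calibration model can be sketched as fitting a smooth curve to focal lengths estimated at discrete zoom steps, then querying it at arbitrary zoom settings. The cubic polynomial and the numbers below are our own illustrative assumptions; radial distortion coefficients could be modeled the same way:

```python
# Hedged sketch: interpolate focal length across zoom settings (made-up data).
import numpy as np

zoom_steps = np.array([1, 2, 3, 4, 6, 8, 10, 12], dtype=float)  # zoom factors
focal_px = np.array([800, 1590, 2410, 3180, 4800, 6350, 7980, 9600],
                    dtype=float)                                  # calibrated f

coeffs = np.polyfit(zoom_steps, focal_px, deg=3)  # cubic fit f(zoom)

def focal_at(zoom):
    return np.polyval(coeffs, zoom)

print(focal_at(5.0))  # interpolated focal length at an uncalibrated zoom setting
```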

Collaboration


Dive into Sudipta N. Sinha's collaborations.

Top Co-Authors

Drew Steedly

Georgia Institute of Technology

Herman Towles

University of North Carolina at Chapel Hill

Jan Michael Frahm

University of North Carolina at Chapel Hill
