Siyu Zhu
Hong Kong University of Science and Technology
Publications
Featured research published by Siyu Zhu.
European Conference on Computer Vision | 2016
Tianwei Shen; Siyu Zhu; Tian Fang; Runze Zhang; Long Quan
Pairwise image matching of unordered image collections greatly affects the efficiency and accuracy of Structure-from-Motion (SfM). Insufficient match pairs may result in disconnected structures or incomplete components, while costly redundant pairs containing erroneous ones may lead to folded and superimposed structures. This paper presents a graph-based image matching method that tackles the issues of completeness, efficiency and consistency in a unified framework. Our approach starts by chaining all but singleton images using a visual-similarity-based minimum spanning tree. Then the minimum spanning tree is incrementally expanded to form locally consistent strong triplets. Finally, a global community-based graph algorithm is introduced to strengthen the global consistency by reinforcing potentially large connected components. We demonstrate the superior performance of our method in terms of accuracy and efficiency on both benchmark and Internet datasets. Our method also performs remarkably well on the challenging datasets of highly ambiguous and duplicated scenes.
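The three stages can be illustrated with a short, hedged sketch (not the authors' code): networkx provides the spanning tree and community detection, while the dense similarity matrix and the exhaustive intra-community matching are simplifying assumptions.

```python
# Illustrative sketch of the three-stage match-graph construction;
# not the paper's exact procedure.
import itertools
import networkx as nx

def build_match_graph(similarity):
    """similarity: symmetric (n, n) array of visual similarity scores."""
    n = similarity.shape[0]
    g = nx.Graph()
    for i, j in itertools.combinations(range(n), 2):
        g.add_edge(i, j, weight=1.0 - similarity[i, j])
    # Stage 1: a minimum spanning tree over dissimilarity chains all
    # non-singleton images with the fewest possible match pairs.
    tree = nx.minimum_spanning_tree(g, weight="weight")
    # Stage 2: expand each tree edge into a locally consistent triplet
    # by adding the most similar third view.
    expanded = tree.copy()
    for i, j in list(tree.edges):
        k = max((v for v in range(n) if v not in (i, j)),
                key=lambda v: similarity[i, v] + similarity[j, v])
        expanded.add_edge(i, k)
        expanded.add_edge(j, k)
    # Stage 3: community detection finds potentially large components,
    # reinforced here by matching all pairs within each community.
    for comm in nx.algorithms.community.greedy_modularity_communities(expanded):
        expanded.add_edges_from(itertools.combinations(comm, 2))
    return expanded
```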
Computer Vision and Pattern Recognition | 2014
Siyu Zhu; Tian Fang; Jianxiong Xiao; Long Quan
Global bundle adjustment usually converges to a non-zero residual and produces sub-optimal camera poses for local areas, which leads to loss of details for high-resolution reconstruction. Instead of trying harder to optimize everything globally, we argue that we should live with the non-zero residual and adapt the camera poses to local areas. To this end, we propose a segment-based approach to readjust the camera poses locally and improve the reconstruction for fine geometry details. The key idea is to partition the globally optimized structure-from-motion points into well-conditioned segments for re-optimization, reconstruct their geometry individually, and fuse everything back into a consistent global model. This significantly reduces severe propagated errors and estimation biases caused by the initial global adjustment. The results on several datasets demonstrate that this approach can significantly improve the reconstruction accuracy, while maintaining the consistency of the 3D structure between segments.
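The partition-refine-fuse pattern can be outlined as follows; this is an illustrative sketch only, with `local_bundle_adjust` as a hypothetical stub standing in for a real segment-level bundle adjustment and KMeans as an assumed partitioning criterion.

```python
# Sketch of the segment-based readjustment pattern; the clustering
# criterion and the fusion step are simplified assumptions.
import numpy as np
from sklearn.cluster import KMeans

def local_bundle_adjust(points, cameras):
    # Hypothetical stand-in: a real implementation would minimize the
    # segment's reprojection error, e.g. with scipy.optimize.least_squares.
    return points, cameras

def readjust_locally(points3d, camera_poses, n_segments=8):
    # Partition the globally optimized SfM points into spatial segments.
    labels = KMeans(n_clusters=n_segments, n_init=10).fit_predict(points3d)
    refined_points, refined_cameras = [], []
    for s in range(n_segments):
        pts, cams = local_bundle_adjust(points3d[labels == s], camera_poses)
        refined_points.append(pts)
        refined_cameras.append(cams)
    # Fuse the locally refined segments back into one global model.
    return np.vstack(refined_points), refined_cameras
```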
IEEE Transactions on Visualization and Computer Graphics | 2016
Jinglu Wang; Tian Fang; Qingkun Su; Siyu Zhu; Jingbo Liu; Shengnan Cai; Chiew-Lan Tai; Long Quan
Reconstructed building models using stereo-based methods inevitably suffer from noise, leading to the lack of regularity which is characterized by straightness of structural linear features and smoothness of homogeneous regions. We leverage the structural linear features embedded in the mesh to construct a novel surface scaffold structure for model regularization. The regularization comprises two iterative stages: (1) the linear features are semi-automatically proposed from images by exploiting photometric and geometric clues jointly; (2) the scaffold topology represented by spatial relations among the linear features is optimized according to data fidelity and topological rules, then the mesh is refined by adjusting itself to the consolidated scaffold. Our method has two advantages. First, the proposed scaffold representation is able to concisely describe semantic building structures. Second, the scaffold structure is embedded in the mesh, which can preserve the mesh connectivity and avoid stitching or intersecting surfaces in challenging cases. We demonstrate that our method can robustly enhance structural characteristics and suppress irregularities in building models on several challenging datasets. Moreover, the regularization can significantly improve the results of general applications such as simplification and non-photorealistic rendering.
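As a toy illustration of the second stage's mesh refinement only (the joint scaffold-topology optimization is omitted), vertices lying near a consolidated scaffold line can be snapped onto it; the function name and tolerance are assumptions.

```python
# Toy sketch: enforce straightness by projecting nearby mesh vertices
# onto a consolidated scaffold line segment. Illustrative only.
import numpy as np

def snap_to_scaffold(vertices, line_a, line_b, tol=0.05):
    """Project vertices within `tol` of segment (line_a, line_b) onto it."""
    d = line_b - line_a
    t = np.clip((vertices - line_a) @ d / (d @ d), 0.0, 1.0)
    foot = line_a + t[:, None] * d                 # closest point on segment
    near = np.linalg.norm(vertices - foot, axis=1) < tol
    out = vertices.copy()
    out[near] = foot[near]                         # snap to the linear feature
    return out
```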
International Conference on Computer Vision | 2015
Runze Zhang; Shiwei Li; Tian Fang; Siyu Zhu; Long Quan
In this paper, we propose an optimal decomposition approach to large-scale multi-view stereo from an initial sparse reconstruction. The success of the approach depends on the introduction of surface-segmentation-based camera clustering rather than sparse-point-based camera clustering, which suffers from the problems of a non-uniform reconstruction coverage ratio and high redundancy. In detail, we introduce three criteria for camera clustering and surface segmentation for reconstruction, and then we formulate these criteria into an energy minimization problem under constraints. To solve this problem, we propose a joint optimization in a hierarchical framework to obtain the final surface segments and corresponding optimal camera clusters. On each level of the hierarchical framework, the camera clustering problem is formulated as a parameter estimation problem of a probability model solved by a General Expectation-Maximization algorithm, and the surface segmentation problem is formulated as a Markov Random Field model based on the probability estimated by the previous camera clustering process. The experiments on several Internet datasets and aerial photo datasets demonstrate that the proposed approach generates a more uniform and complete dense reconstruction with less redundancy, resulting in a more efficient multi-view stereo algorithm.
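A miniature, hedged analogue of the alternating scheme (not the paper's implementation): sklearn's GaussianMixture stands in for the probability model fit by expectation-maximization, and a per-point argmax over cluster responsibilities replaces the MRF segmentation.

```python
# Hedged stand-in for one level of the hierarchy: EM-based camera
# clustering followed by a simplified (non-MRF) surface assignment.
from sklearn.mixture import GaussianMixture

def cluster_and_segment(camera_centers, surface_points, n_clusters=4):
    gmm = GaussianMixture(n_components=n_clusters).fit(camera_centers)
    camera_labels = gmm.predict(camera_centers)
    # A real MRF would add pairwise smoothness between adjacent surface
    # segments; here each point simply takes its most likely cluster.
    surface_labels = gmm.predict_proba(surface_points).argmax(axis=1)
    return camera_labels, surface_labels
```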
Asian Conference on Computer Vision | 2014
Siyu Zhu; Tian Fang; Runze Zhang; Long Quan
For large-scale and highly redundant photo collections, eliminating statistical redundancy in multi-view geometry is of great importance to efficient 3D reconstruction. Our approach takes the full set of images with initial calibration and recovered sparse 3D points as inputs, and obtains a subset of views that preserve the final reconstruction accuracy and completeness well. We first construct an image quality graph, in which each vertex represents an input image, and the problem is then to determine a connected sub-graph guaranteeing a consistent reconstruction and maximizing the accuracy and completeness of the final reconstruction. Unlike previous works, which only address the problem of efficient structure from motion (SfM), our technique is highly applicable to the whole reconstruction pipeline, and solves the problems of efficient bundle adjustment, multi-view stereo (MVS), and subsequent variational refinement.
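A hedged sketch of the subgraph-selection idea: greedily discard the lowest-quality image as long as the match graph stays connected. The scalar quality score is an assumed stand-in for the paper's joint accuracy and completeness criteria.

```python
# Greedy connected-subgraph selection; illustrative, not the paper's solver.
import networkx as nx

def select_views(match_graph, quality, target_size):
    """match_graph: nx.Graph over images; quality: {node: score}."""
    g = match_graph.copy()
    for node in sorted(quality, key=quality.get):  # worst images first
        if g.number_of_nodes() <= target_size:
            break
        trial = g.copy()
        trial.remove_node(node)
        # Keep the reduced set only if it still yields one consistent model.
        if trial.number_of_nodes() > 0 and nx.is_connected(trial):
            g = trial
    return sorted(g.nodes)
```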
Asian Conference on Computer Vision | 2016
Tianwei Shen; Jinglu Wang; Tian Fang; Siyu Zhu; Long Quan
Current texture creation methods for image-based modeling suffer from color discontinuity issues due to drastically varying conditions of illumination, exposure and time during the image capturing process. This paper proposes a novel system that generates consistent textures for triangular meshes. The key to our system is a color correction framework for large-scale unordered image collections. We model the problem as a graph-structured optimization over the overlapping regions of image pairs. After reconstructing the mesh of the scene, we accurately calculate matched image regions by re-projecting images onto the mesh. Then the image collection is robustly adjusted using a non-linear least-squares solver over color histograms in an unsupervised fashion. Finally, a connectivity-preserving edge pruning method is introduced to accelerate the color correction process. This system is evaluated with crowdsourced image collections containing medium-sized scenes and city-scale urban datasets. To the best of our knowledge, this system is the first consistent texturing system for image-based modeling that is capable of handling thousands of input images.
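A minimal sketch of the graph-structured adjustment, assuming a scalar mean intensity per overlap instead of the paper's full color histograms; all names here are hypothetical.

```python
# Solve per-image gains so overlapping pairs agree in mean intensity;
# a simplified stand-in for histogram-based color correction.
import numpy as np
from scipy.optimize import least_squares

def adjust_gains(pairs, means_i, means_j, n_images):
    """pairs: list of (i, j); means_i[k]/means_j[k]: mean intensity of
    overlap k as seen in image i and image j respectively."""
    def residuals(log_gain):
        g = np.exp(log_gain)
        r = [g[i] * mi - g[j] * mj
             for (i, j), mi, mj in zip(pairs, means_i, means_j)]
        r.append(log_gain.sum())   # gauge fix: average gain stays at one
        return np.array(r)
    sol = least_squares(residuals, x0=np.zeros(n_images))
    return np.exp(sol.x)
```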
Asian Conference on Computer Vision | 2014
Runze Zhang; Tian Fang; Siyu Zhu; Long Quan
The fusion of a 3D reconstruction from monocular videos, known only up to a similarity transformation, with the metric positional measurements from GPS usually relies on the alignment of the two coordinate systems. When the positional measurements provided by a low-cost GPS are corrupted by high levels of noise, this approach becomes problematic. In this paper, we introduce a novel framework that uses similarity invariants to form a tetrahedral network of views for the fusion. Such a tetrahedral network decouples the alignment from the fusion to combat the high-level noise. Then, we update the similarity transformation each time a well-conditioned motion of cameras is successfully identified. Moreover, we develop a multi-scale sampling strategy to reduce the computational overhead and to adapt the algorithm to different noise levels. It is important to note that our optimization framework can be applied in both a batch and an incremental manner. Experiments on simulations and real datasets demonstrate the robustness and the efficiency of our method.
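For background only (this is the direct alignment the paper improves on, not the proposed tetrahedral scheme), the similarity transformation between SfM camera centers and GPS positions has the standard Umeyama closed form:

```python
# Closed-form similarity alignment (Umeyama); the baseline that breaks
# down under heavy GPS noise, motivating the paper's decoupled fusion.
import numpy as np

def umeyama_alignment(src, dst):
    """Return scale, rot, t with dst[i] = scale * rot @ src[i] + t."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    cov = xd.T @ xs / len(src)
    u, d, vt = np.linalg.svd(cov)
    s_mat = np.eye(3)
    s_mat[2, 2] = np.sign(np.linalg.det(u) * np.linalg.det(vt))
    rot = u @ s_mat @ vt
    scale = np.trace(np.diag(d) @ s_mat) * len(src) / (xs ** 2).sum()
    t = mu_d - scale * rot @ mu_s
    return scale, rot, t
```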
Computer Vision and Pattern Recognition | 2018
Siyu Zhu; Runze Zhang; Lei Zhou; Tianwei Shen; Tian Fang; Ping Tan; Long Quan
International Conference on Computer Vision | 2017
Lei Zhou; Siyu Zhu; Tianwei Shen; Jinglu Wang; Tian Fang; Long Quan
International Conference on Computer Vision | 2017
Runze Zhang; Siyu Zhu; Tian Fang; Long Quan