Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Zhiwei Li is active.

Publication


Featured research published by Zhiwei Li.


acm multimedia | 2007

Bipartite graph reinforcement model for web image annotation

Xiaoguang Rui; Mingjing Li; Zhiwei Li; Wei-Ying Ma; Nenghai Yu

Automatic image annotation is an effective way to manage and retrieve the abundant images on the Web. In this paper, a bipartite graph reinforcement model (BGRM) is proposed for web image annotation. Given a web image, a set of candidate annotations is extracted from its surrounding text and other textual information in the hosting web page. As this set is often incomplete, it is extended with more potentially relevant annotations obtained by searching and mining a large-scale image database. All candidates are modeled as a bipartite graph, and a reinforcement algorithm is then run on the graph to re-rank them; only the candidates with the highest ranking scores are kept as the final annotations. Experimental results on real web images demonstrate the effectiveness of the proposed model.
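The abstract leaves the propagation step unspecified; below is a minimal Python sketch of one plausible reading of reinforcement re-ranking on a bipartite graph, where the two vertex sets are the text-extracted and database-mined candidates. The damping factor alpha, the normalization, and the toy weights are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def bipartite_reinforce(W, x0, y0, alpha=0.8, iters=50):
    """Re-rank two candidate sets linked by a bipartite graph.

    W     : (m, n) similarity matrix between the two vertex sets
    x0,y0 : initial relevance scores of the two sets
    alpha : damping factor balancing propagation vs. priors (assumed)
    """
    # Row-normalize both propagation matrices so scores stay bounded.
    Wx = W / (W.sum(axis=1, keepdims=True) + 1e-12)
    Wy = W.T / (W.T.sum(axis=1, keepdims=True) + 1e-12)
    x, y = x0.copy(), y0.copy()
    for _ in range(iters):
        x = alpha * Wx @ y + (1 - alpha) * x0   # text-side candidates
        y = alpha * Wy @ x + (1 - alpha) * y0   # mined candidates
    return x, y

# Toy example: 3 text candidates vs. 2 mined candidates.
W = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])
x, y = bipartite_reinforce(W, np.array([1.0, 0.5, 0.2]), np.array([0.3, 0.3]))
print(np.argsort(-x))  # text candidates ordered by final score
```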


european conference on computer vision | 2010

Max-margin dictionary learning for multiclass image categorization

Xiao-Chen Lian; Zhiwei Li; Bao-Liang Lu; Lei Zhang

Visual dictionary learning and base (binary) classifier training are two key components of the currently popular image categorization framework based on bag-of-visual-terms (BOV) models and multiclass SVM classifiers. In this paper, we study new algorithms to improve this framework from both aspects. Typically, SVM classifiers are trained with the dictionary fixed, so the traditional loss function can only be minimized with respect to the hyperplane parameters (w and b). We propose a novel loss function for a binary classifier that links the hinge-loss term with dictionary learning. By doing so, we can further optimize the loss function with respect to the dictionary parameters. This allows the framework to further increase the margins of the binary classifiers and consequently decrease the error bound of the aggregated classifier. On two benchmark datasets, Graz [1] and the fifteen-scene category dataset [2], our method significantly outperforms state-of-the-art approaches.
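The paper's loss function is not reproduced above; as a rough illustration of linking the hinge loss to the dictionary, the sketch below uses soft assignments so that the BOV histogram, and hence the loss, depends smoothly on the codewords as well as on (w, b). The soft-assignment kernel and all constants are assumptions.

```python
import numpy as np

def soft_histogram(X, D, beta=10.0):
    """Soft-assign descriptors X (n, d) to codewords D (k, d) and return
    a normalized histogram that varies smoothly with D (assumed kernel)."""
    d2 = ((X[:, None, :] - D[None, :, :]) ** 2).sum(-1)   # (n, k) squared distances
    A = np.exp(-beta * d2)
    A /= A.sum(axis=1, keepdims=True)                     # per-descriptor assignment
    return A.mean(axis=0)                                 # (k,) image histogram

def hinge_loss(images, labels, D, w, b, lam=1e-2):
    """Binary SVM hinge loss on BOV histograms; minimizing it w.r.t. the
    dictionary D as well as (w, b) is the crux of max-margin dictionary
    learning as described in the abstract."""
    loss = lam * (w @ w)
    for X, y in zip(images, labels):                      # y in {-1, +1}
        h = soft_histogram(X, D)
        loss += max(0.0, 1.0 - y * (w @ h + b))
    return loss / len(images)
```

In practice one would alternate gradient steps on (w, b) and on D against such a loss; the paper's actual optimization scheme may differ.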


international conference on computer vision | 2015

MeshStereo: A Global Stereo Model with Mesh Alignment Regularization for View Interpolation

Chi Zhang; Zhiwei Li; Yanhua Cheng; Rui Cai; Hongyang Chao; Yong Rui

We present a novel global stereo model designed for view interpolation. Unlike existing stereo models, which only output a disparity map, our model outputs a 3D triangular mesh that can be used directly for view interpolation. To this end, we partition the input stereo images into 2D triangles with shared vertices; lifting the 2D triangulation to 3D naturally generates a corresponding mesh. A technical difficulty is to properly split vertices into multiple copies when they lie on depth-discontinuity boundaries. To deal with this problem, we formulate our objective as a two-layer MRF, with the upper layer modeling the splitting properties of the vertices and the lower layer performing region-based stereo matching. Experiments on the Middlebury and Herodion datasets demonstrate that our model is able to synthesize visually coherent novel views with high PSNR, as well as to output high-quality disparity maps that rank first on the challenging new high-resolution Middlebury 3.0 benchmark.
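As a small illustration of the lifting step only (standard rectified-stereo geometry; the two-layer MRF and the vertex-splitting inference are the paper's contribution and are not sketched here), per-vertex disparities can be converted to 3D vertex positions as follows:

```python
import numpy as np

def lift_triangulation(vertices_2d, disparities, focal, baseline, cx, cy):
    """Lift shared 2D triangulation vertices to 3D using per-vertex
    disparities (pinhole rectified stereo: Z = f * B / d).

    vertices_2d : (n, 2) pixel coordinates of the shared vertices
    disparities : (n,) disparity assigned to each vertex
    """
    d = np.maximum(disparities, 1e-6)          # guard against zero disparity
    Z = focal * baseline / d                   # depth from disparity
    X = (vertices_2d[:, 0] - cx) * Z / focal   # back-project x
    Y = (vertices_2d[:, 1] - cy) * Z / focal   # back-project y
    return np.stack([X, Y, Z], axis=1)         # (n, 3) mesh vertices
```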


international world wide web conferences | 2010

MindFinder: image search by interactive sketching and tagging

Changhu Wang; Zhiwei Li; Lei Zhang

In this technical demonstration, we showcase the MindFinder system, a novel image search engine. Unlike existing interactive image search engines, most of which only provide image-level relevance feedback, MindFinder enables users to sketch and tag query images at the object level. Treating the image database as a huge repository, MindFinder helps users present and refine the initial thoughts in their mind, and finally turn those thoughts into a beautiful image. Multiple actions let users flexibly design their queries in a bilateral interactive manner by leveraging the whole image database, including tagging, refining the query by dragging and dropping objects from search results, and editing objects. After each action, the search results are updated in real time to give users up-to-date material with which to further formulate the query. Through this deliberate but easy query design, MindFinder not only lets users present on the query panel whatever they are imagining, but also returns the images most similar to the picture in the user's mind. With the image database scaled up to 10 million images, MindFinder has the potential to reveal whatever is in the user's mind, which is where the name MindFinder comes from.
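The matching machinery behind the demo is not described above; purely as a hypothetical illustration of object-level querying, the sketch below scores a database image by whether each tagged query object finds a same-tag object at a similar position. The (tag, cx, cy) representation and the scoring rule are assumptions, not MindFinder's actual method.

```python
def object_query_score(query_objs, image_objs):
    """Score a database image against an object-level query (sketch).

    Each object is (tag, cx, cy) with center coordinates in [0, 1].
    An image scores well when every query object finds a same-tag
    object nearby; a missing required object zeroes the score.
    """
    total = 0.0
    for tag, qx, qy in query_objs:
        candidates = [(ox, oy) for t, ox, oy in image_objs if t == tag]
        if not candidates:
            return 0.0                       # required object not present
        dist = min(((qx - ox) ** 2 + (qy - oy) ** 2) ** 0.5
                   for ox, oy in candidates)
        total += 1.0 - min(dist, 1.0)        # closer match, higher score
    return total / len(query_objs)
```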


computer vision and pattern recognition | 2012

3D visual phrases for landmark recognition

Qiang Hao; Rui Cai; Zhiwei Li; Lei Zhang; Yanwei Pang; Feng Wu

In this paper, we study the problem of landmark recognition and propose to leverage 3D visual phrases to improve performance. A 3D visual phrase is a triangular facet on the surface of a reconstructed 3D landmark model. In contrast to existing 2D visual phrases, which are mainly based on co-occurrence statistics in 2D image planes, such 3D visual phrases explicitly characterize the spatial structure of a 3D object (landmark) and are highly robust to the projective transformations caused by viewpoint changes. We present an effective solution to discover, describe, and detect 3D visual phrases. Experiments on 10 landmarks achieved promising results, demonstrating that our approach provides a good balance between precision and recall of landmark recognition while reducing the dependence on post-verification to reject false positives.
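As a simplified reading of the detection step, a 3D visual phrase (a triangular facet) could be counted as detected when all three of its model vertices are matched by 2D features in the query image; the sketch below assumes exactly that and is not the paper's full pipeline.

```python
def detect_3d_phrases(facets, matched_points):
    """Count detected 3D visual phrases.

    facets         : iterable of (i, j, k) vertex-index triples, i.e.
                     triangular facets of the reconstructed landmark model
    matched_points : set of 3D point indices matched by 2D query features

    A facet counts as detected when all three of its vertices are matched
    (a simplified reading of the paper's detection step).
    """
    return sum(1 for f in facets if all(v in matched_points for v in f))

# Usage: a higher detection count suggests the query shows the landmark.
score = detect_3d_phrases([(0, 1, 2), (1, 2, 3)], {0, 1, 2})
```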


computer vision and pattern recognition | 2011

Rank-SIFT: Learning to rank repeatable local interest points

Bing Li; Rong Xiao; Zhiwei Li; Rui Cai; Bao-Liang Lu; Lei Zhang

Scale-invariant feature transform (SIFT) has been well studied in recent years. Most related research efforts have focused on designing and learning effective descriptors to characterize a local interest point. However, how to identify stable local interest points remains a very challenging problem. In this paper, we propose a set of differential features and, based on them, adopt a data-driven approach to learn a ranking function that sorts local interest points by their stability across images containing the same visual objects. Compared with the handcrafted rule-based method used by the standard SIFT algorithm, our algorithm substantially improves the stability of detected local interest points on a very challenging benchmark dataset in which images were generated under very different imaging conditions. Experimental results on the Oxford and PASCAL databases further demonstrate the superior performance of the proposed algorithm on both object image retrieval and category recognition.
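The abstract does not specify the ranking model; a common choice for this kind of learning-to-rank problem is a pairwise RankSVM, sketched below with scikit-learn. The pair-sampling scheme and the use of measured repeatability scores as supervision are assumptions for illustration.

```python
import numpy as np
from sklearn.svm import LinearSVC

def train_rank_svm(features, stability, n_pairs_factor=5, seed=0):
    """RankSVM-style sketch: learn w so that w @ x_i > w @ x_j whenever
    interest point i is more stable than point j across images.

    features  : (n, d) differential features of interest points (assumed)
    stability : (n,) measured repeatability scores used as labels
    """
    X, y = [], []
    rng = np.random.default_rng(seed)
    for _ in range(n_pairs_factor * len(features)):   # sample training pairs
        i, j = rng.integers(len(features), size=2)
        if stability[i] == stability[j]:
            continue                                   # skip uninformative ties
        X.append(features[i] - features[j])            # pairwise difference
        y.append(1 if stability[i] > stability[j] else -1)
    clf = LinearSVC(C=1.0).fit(np.asarray(X), np.asarray(y))
    return clf.coef_.ravel()                           # scoring vector w

# Ranking: scores = features @ w; keep the top-k points as stable keypoints.
```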


international world wide web conferences | 2008

Improving relevance judgment of web search results with image excerpts

Zhiwei Li; Shuming Shi; Lei Zhang

Current web search engines return result pages containing mostly text summaries, even though the matched web pages may contain informative pictures. A text excerpt (i.e., snippet) is generated for each returned page by selecting keywords around the matched query terms, providing context for the user's relevance judgment. However, in many scenarios we found that the pictures in web pages, if selected properly, can be added to search result pages and provide richer contextual description, because a picture is worth a thousand words. We name this new kind of summary an image excerpt. Through a well-designed user study, we demonstrate that image excerpts help users make much quicker relevance judgments of search results for a wide range of query types. To implement this idea, we propose a practical approach that automatically generates image excerpts in result pages by considering the dominance of each picture in each web page and the relevance of the picture to the query. We also outline an efficient way to incorporate image excerpts in web search engines: a search engine can adopt our approach by slightly modifying its index and inserting a few low-cost operations into its workflow. Our experiments on a large web dataset indicate that the performance of the proposed approach is very promising.
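As a hypothetical illustration of combining the two signals the abstract names, dominance and query relevance, a candidate picture could be scored as below; the input fields, the alt-text relevance proxy, and the 50/50 weighting are all assumptions, not the paper's features.

```python
def image_excerpt_score(img, query_terms, page_area):
    """Score a page image as a candidate image excerpt (illustrative).

    img         : dict with 'width', 'height', 'alt_text' (assumed fields)
    query_terms : list of lowercase query words
    page_area   : total rendered page area, for the dominance ratio
    """
    # Dominance: fraction of the page the picture occupies.
    dominance = (img["width"] * img["height"]) / page_area
    # Relevance: overlap between query terms and the image's alt text.
    words = set(img["alt_text"].lower().split())
    relevance = len(words & set(query_terms)) / max(len(query_terms), 1)
    return 0.5 * dominance + 0.5 * relevance   # weights are assumptions

# Usage: score every image on a matched page and show the top one
# next to the text snippet in the result page.
```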


computer vision and pattern recognition | 2010

Probabilistic models for supervised dictionary learning

Xiao-Chen Lian; Zhiwei Li; Changhu Wang; Bao-Liang Lu; Lei Zhang

Dictionary generation is a core technique of bag-of-visual-words (BOV) models when applied to image categorization. Most previous approaches generate dictionaries with unsupervised clustering techniques, e.g., k-means. However, the features obtained with such dictionaries may not be optimal for image classification. In this paper, we propose a probabilistic model for supervised dictionary learning (SDLM) that seamlessly combines an unsupervised model (a Gaussian mixture model) and a supervised model (a logistic regression model) in a probabilistic framework. In the model, image category information directly affects the generation of the dictionary. A dictionary obtained by this approach is a trade-off between minimizing the distortion of clusters and maximizing the discriminative power of image-wise representations, i.e., histogram representations of images. We further extend the model to incorporate spatial information during dictionary learning, in a spatial-pyramid-matching-like manner. We extensively evaluated the two models on various benchmark datasets and obtained promising results.
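SDLM couples the two stages into a single probabilistic objective; the decoupled sketch below only shows the two components it unifies (a GMM codebook and a logistic-regression classifier) as a baseline, not the joint model.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.linear_model import LogisticRegression

def bov_pipeline(descriptor_sets, labels, k=64):
    """Decoupled stand-in for SDLM: fit a GMM codebook, build soft
    histograms, then train a logistic-regression classifier. SDLM's
    point is to optimize both stages jointly so that category labels
    influence the codebook; this sketch keeps them separate.

    descriptor_sets : list of (n_i, d) local-descriptor arrays, one per image
    labels          : (n_images,) category labels
    k               : codebook size (must not exceed total descriptor count)
    """
    gmm = GaussianMixture(n_components=k, covariance_type="diag",
                          random_state=0)
    gmm.fit(np.vstack(descriptor_sets))                 # unsupervised codebook
    H = np.array([gmm.predict_proba(X).mean(axis=0)    # soft histogram per image
                  for X in descriptor_sets])
    clf = LogisticRegression(max_iter=1000).fit(H, labels)  # supervised stage
    return gmm, clf
```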


international conference on multimedia and expo | 2007

Image Annotation in a Progressive Way

Bin Wang; Zhiwei Li; Nenghai Yu; Mingjing Li

-


computer vision and pattern recognition | 2013

Efficient 2D-to-3D Correspondence Filtering for Scalable 3D Object Recognition

Qiang Hao; Rui Cai; Zhiwei Li; Lei Zhang; Yanwei Pang; Feng Wu; Yong Rui

-

Collaboration


Dive into Zhiwei Li's collaborations.

Top Co-Authors

Nenghai Yu
University of Science and Technology of China

Bao-Liang Lu
Shanghai Jiao Tong University

Chi Zhang
Sun Yat-sen University

Yanhua Cheng
Chinese Academy of Sciences

Kaiqi Huang
Chinese Academy of Sciences