Publication


Featured research published by Ruofei Zhang.


International Conference on Computer Vision | 2005

A probabilistic semantic model for image annotation and multimodal image retrieval

Ruofei Zhang; Zhongfei Zhang; Mingjing Li; Wei-Ying Ma; Hong-Jiang Zhang

This paper addresses the automatic image annotation problem and its application to multi-modal image retrieval. The contribution of our work is three-fold. (1) We propose a probabilistic semantic model in which the visual features and the textual words are connected via a hidden layer, which constitutes the semantic concepts to be discovered, to explicitly exploit the synergy among the modalities. (2) The association of visual features and textual words is determined in a Bayesian framework such that the confidence of the association can be provided. (3) Extensive evaluation on a large-scale, visually and semantically diverse image collection crawled from the Web is reported to evaluate the prototype system based on the model. In the proposed probabilistic model, a hidden concept layer which connects the visual feature and the word layer is discovered by fitting a generative model to the training images and annotation words through an Expectation-Maximization (EM) based iterative learning procedure. The evaluation of the prototype system on 17,000 images and 7,736 automatically extracted annotation words from crawled Web pages for multi-modal image retrieval has indicated that the proposed semantic model and the developed Bayesian framework are superior to a state-of-the-art peer system in the literature.
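
To make the idea of word-to-image association with a confidence value concrete, here is a minimal sketch, assuming the model has already produced a concept posterior for one image and a word distribution per concept. The function name `annotate`, the toy vocabulary, and all probabilities are hypothetical illustrations, not the paper's implementation or data.

```python
def annotate(p_concept_given_image, p_word_given_concept, vocabulary, top_k=3):
    """Rank words for one image by marginalizing over hidden concepts.

    P(word | image) = sum_z P(word | z) * P(z | image)
    The returned probability doubles as a confidence for each annotation.
    """
    scored = []
    for w_idx, word in enumerate(vocabulary):
        p_word = sum(
            p_z * p_word_given_concept[z][w_idx]
            for z, p_z in enumerate(p_concept_given_image)
        )
        scored.append((word, p_word))
    # Highest probability first; ties are broken alphabetically.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_k]


# Hypothetical toy values: two hidden concepts, three words.
vocabulary = ["beach", "sky", "car"]
p_concept_given_image = [0.9, 0.1]
p_word_given_concept = [
    [0.6, 0.3, 0.1],  # concept 0: outdoor scenery
    [0.1, 0.1, 0.8],  # concept 1: vehicles
]
top_words = annotate(p_concept_given_image, p_word_given_concept, vocabulary, top_k=2)
```

Because each score is a probability, a caller could threshold it to keep only annotations above a chosen confidence level.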


IEEE Transactions on Image Processing | 2007

Effective Image Retrieval Based on Hidden Concept Discovery in Image Database

Ruofei Zhang; Zhongfei Zhang

This paper addresses content-based image retrieval in general, and in particular, focuses on developing a hidden semantic concept discovery methodology to address effective semantics-intensive image retrieval. In our approach, each image in the database is segmented into regions associated with homogeneous color, texture, and shape features. By exploiting regional statistical information in each image and employing a vector quantization method, a uniform and sparse region-based representation is achieved. With this representation, a probabilistic model based on statistical-hidden-class assumptions of the image database is obtained, to which the expectation-maximization technique is applied to analyze semantic concepts hidden in the database. An elaborated retrieval algorithm is designed to support the probabilistic model. The semantic similarity is measured through integrating the posterior probabilities of the transformed query image, as well as a constructed negative example, to the discovered semantic concepts. The proposed approach has a solid statistical foundation; the experimental evaluations on a database of 10,000 general-purpose images demonstrate its promise and effectiveness.
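
The abstract does not spell out the model, so the following is only a sketch of one common hidden-class formulation: a generic aspect model (in the style of probabilistic latent semantic analysis) fitted with EM to an image-by-code-word count matrix. The function names, the random initialization, and the toy counts are assumptions for illustration and are not taken from the paper.

```python
import math
import random


def fit_hidden_concepts(counts, n_concepts, n_iter=50, seed=0):
    """EM for a generic aspect model P(d, w) = sum_z P(z) P(d|z) P(w|z).

    counts[d][w] is how often code word w occurs in image d.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalized(values):
        total = sum(values)
        return [v / total for v in values]

    p_z = [1.0 / n_concepts] * n_concepts
    p_d_z = [normalized([rng.random() + 0.1 for _ in range(n_docs)])
             for _ in range(n_concepts)]
    p_w_z = [normalized([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_concepts)]

    for _ in range(n_iter):
        acc_z = [0.0] * n_concepts
        acc_d = [[0.0] * n_docs for _ in range(n_concepts)]
        acc_w = [[0.0] * n_words for _ in range(n_concepts)]
        # E-step: posterior P(z | d, w), accumulated with the observed counts.
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                joint = [p_z[z] * p_d_z[z][d] * p_w_z[z][w]
                         for z in range(n_concepts)]
                total = sum(joint)
                if total == 0.0:
                    continue
                for z in range(n_concepts):
                    resp = n * joint[z] / total
                    acc_z[z] += resp
                    acc_d[z][d] += resp
                    acc_w[z][w] += resp
        # M-step: re-estimate every distribution from the expected counts.
        grand_total = sum(acc_z)
        for z in range(n_concepts):
            if acc_z[z] > 0.0:
                p_d_z[z] = [v / acc_z[z] for v in acc_d[z]]
                p_w_z[z] = [v / acc_z[z] for v in acc_w[z]]
            p_z[z] = acc_z[z] / grand_total
    return p_z, p_d_z, p_w_z


def log_likelihood(counts, p_z, p_d_z, p_w_z):
    """Log-likelihood of the observed counts under the fitted model."""
    total = 0.0
    for d, row in enumerate(counts):
        for w, n in enumerate(row):
            if n:
                p = sum(p_z[z] * p_d_z[z][d] * p_w_z[z][w]
                        for z in range(len(p_z)))
                total += n * math.log(p)
    return total


# Hypothetical toy data: 4 images over a 4-word visual codebook.
counts = [[5, 5, 0, 0], [4, 6, 0, 0], [0, 0, 5, 5], [0, 0, 6, 4]]
p_z, p_d_z, p_w_z = fit_hidden_concepts(counts, n_concepts=2, n_iter=30)
```

EM never decreases the likelihood between iterations, which gives a simple sanity check when experimenting with the number of concepts.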


Computer Vision and Pattern Recognition | 2004

Hidden semantic concept discovery in region based image retrieval

Ruofei Zhang; Zhongfei Zhang

This work addresses content-based image retrieval (CBIR), focusing on developing a hidden semantic concept discovery methodology to address effective semantics-intensive image retrieval. In our approach, each image in the database is segmented into regions associated with homogeneous color, texture, and shape features. By exploiting regional statistical information in each image and employing a vector quantization method, a uniform and sparse region-based representation is achieved. With this representation, a probabilistic model based on statistical-hidden-class assumptions of the image database is obtained, to which the expectation-maximization (EM) technique is applied to analyze semantic concepts hidden in the database. An elaborated retrieval algorithm is designed to support the probabilistic model. The semantic similarity is measured through integrating the posterior probabilities of the transformed query image, as well as a constructed negative example, to the discovered semantic concepts. The proposed approach has a solid statistical foundation, and the experimental evaluations on a database of 10,000 general-purpose images demonstrate its promise and effectiveness.


Knowledge Discovery and Data Mining | 2010

Exploitation and exploration in a performance based contextual advertising system

Wei Li; Xuerui Wang; Ruofei Zhang; Ying Cui; Jianchang Mao; Rong Jin

The dynamic marketplace in online advertising calls for ranking systems that are optimized to consistently promote and capitalize on better-performing ads. The streaming nature of online data inevitably makes an advertising system choose between maximizing its expected revenue according to its current knowledge in the short term (exploitation) and trying to learn more about the unknown to improve its knowledge (exploration), since the latter might increase its revenue in the future. The exploitation and exploration (EE) tradeoff has been extensively studied in the reinforcement learning community, but it has not received much attention in online advertising until recently. In this paper, we develop two novel EE strategies for online advertising. Specifically, our methods can adaptively balance the two aspects of EE by automatically learning the optimal tradeoff and incorporating confidence metrics of historical performance. Within a deliberately designed offline simulation framework, we apply our algorithms to an industry-leading performance-based contextual advertising system and conduct extensive evaluations with real online event log data. The experimental results and detailed analysis reveal several important findings of EE behaviors in online advertising and demonstrate that our algorithms perform better in terms of ad reach and click-through rate (CTR).
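
The abstract does not describe the two EE strategies themselves. As a point of reference only, the sketch below shows the textbook UCB1 rule, a standard bandit policy that also combines an observed click-through rate with a confidence term that shrinks as an ad gathers impressions. The function name and the example counts are hypothetical.

```python
import math


def ucb1_select(clicks, impressions):
    """Pick an ad index with the standard UCB1 rule.

    score_i = observed CTR_i + sqrt(2 * ln(total impressions) / impressions_i)
    Ads that have never been shown are tried first.
    """
    if not impressions or len(clicks) != len(impressions):
        raise ValueError("clicks and impressions must be non-empty and aligned")
    for ad, shown in enumerate(impressions):
        if shown == 0:
            return ad  # pure exploration: no data for this ad yet
    total = sum(impressions)
    best_ad, best_score = 0, float("-inf")
    for ad, (clicked, shown) in enumerate(zip(clicks, impressions)):
        ctr = clicked / shown                              # exploitation term
        bonus = math.sqrt(2.0 * math.log(total) / shown)   # exploration term
        if ctr + bonus > best_score:
            best_ad, best_score = ad, ctr + bonus
    return best_ad


# Ad 0 has the higher observed CTR, but ad 1 has been shown only 20 times,
# so its wide confidence bonus wins and the policy explores it.
choice = ucb1_select(clicks=[100, 1], impressions=[1000, 20])
```

With equal impression counts the bonus terms cancel and the rule reduces to picking the ad with the highest observed CTR.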


Knowledge Discovery and Data Mining | 2011

Bid landscape forecasting in online ad exchange marketplace

Ying Cui; Ruofei Zhang; Wei Li; Jianchang Mao

Display advertising has been a significant source of revenue for publishers and ad networks in the online advertising ecosystem. One important business model in online display advertising is the ad exchange marketplace, also called non-guaranteed delivery (NGD), in which advertisers buy targeted page views and audiences on a spot market through real-time auctions. In this paper, we describe a bid landscape forecasting system in the NGD marketplace for any advertiser campaign specified by a variety of targeting attributes. In the system, the impressions that satisfy the campaign targeting attributes are partitioned into multiple mutually exclusive samples. Each sample is one unique combination of quantified attribute values. We develop a divide-and-conquer approach that breaks down the campaign-level forecasting problem. First, utilizing a novel star-tree data structure, we forecast the bid for each sample using non-linear regression by gradient boosting decision trees. Then we employ a mixture-of-log-normal model to generate the campaign-level bid distribution based on the sample-level forecasted distributions. The experimental results of a system developed with our approach show that it can accurately forecast the bid distributions for various campaigns running on the world's largest NGD advertising exchange system, outperforming two baseline methods in terms of forecasting errors.
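
The aggregation step can be illustrated with a minimal sketch, assuming each sample contributes one log-normal bid distribution and is weighted by its share of impressions. The weighting scheme, function names, and parameter values are assumptions for illustration; the abstract does not give the paper's exact formulation.

```python
import math


def lognormal_cdf(x, mu, sigma):
    """CDF of a log-normal distribution with log-space mean mu and std sigma."""
    if x <= 0.0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def mixture_cdf(x, components):
    """Campaign-level CDF as a weighted mixture of sample-level log-normals.

    components: list of (weight, mu, sigma); weights need not sum to 1.
    """
    total_weight = sum(weight for weight, _, _ in components)
    return sum(
        weight * lognormal_cdf(x, mu, sigma) for weight, mu, sigma in components
    ) / total_weight


# Hypothetical samples: (impression volume, log-mean bid, log-std of bid).
samples = [(1000, 0.0, 1.0), (3000, 0.0, 0.5)]
share_below_one = mixture_cdf(1.0, samples)
```

Evaluating the mixture CDF at a candidate bid gives the estimated fraction of matching impressions whose forecasted bid falls at or below that value.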


Knowledge Discovery and Data Mining | 2012

Multimedia features for click prediction of new ads in display advertising

Haibin Cheng; Roelof van Zwol; Javad Azimi; Eren Manavoglu; Ruofei Zhang; Yang Zhou; Vidhya Navalpakkam

Non-guaranteed display advertising (NGD) is a multi-billion-dollar business that has been growing rapidly in recent years. Advertisers in NGD sell a large portion of their ad campaigns using performance-dependent pricing models such as cost-per-click (CPC) and cost-per-action (CPA). An accurate prediction of the probability that users click on ads is a crucial task in NGD advertising because this value is required to compute the expected revenue. State-of-the-art prediction algorithms rely heavily on historical information collected for advertisers, users, and publishers. Click prediction of new ads in the system is a challenging task due to the lack of such historical data. The objective of this paper is to mitigate this problem by integrating multimedia features extracted from display ads into the click prediction models. Multimedia features can help us capture the attractiveness of ads with similar content or aesthetics. In this paper, we evaluate the use of numerous multimedia features (in addition to commonly used user, advertiser, and publisher features) for the purpose of improving click prediction for ads with no history. We provide analytical results generated over billions of samples and demonstrate that adding multimedia features can significantly improve the accuracy of click prediction for new ads, compared to a state-of-the-art baseline model.
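
The role of the click probability in the revenue computation can be shown with a small sketch for the CPC case, where the expected revenue of one impression is the predicted click probability times the price paid per click. The function name and the ad data are hypothetical.

```python
def rank_by_expected_revenue(ads):
    """Order CPC ads by expected revenue per impression.

    ads: list of (ad_id, predicted_click_probability, cost_per_click).
    Expected revenue per impression = P(click) * CPC.
    """
    scored = [(ad_id, p_click * cpc) for ad_id, p_click, cpc in ads]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical candidates for a single impression.
candidates = [("a", 0.02, 1.00), ("b", 0.01, 3.00), ("c", 0.05, 0.30)]
ranking = rank_by_expected_revenue(candidates)
```

In this toy case ad "c" has the highest click probability but ranks last, which shows why a biased click estimate for a new ad translates directly into a mispriced impression.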


Multimedia Systems | 2006

A probabilistic semantic model for image annotation and multi-modal image retrieval

Ruofei Zhang; Zhongfei Zhang; Mingjing Li; Wei-Ying Ma; Hong-Jiang Zhang

This paper addresses the automatic image annotation problem and its application to multi-modal image retrieval. The contribution of our work is three-fold. (1) We propose a probabilistic semantic model in which the visual features and the textual words are connected via a hidden layer, which constitutes the semantic concepts to be discovered, to explicitly exploit the synergy among the modalities. (2) The association of visual features and textual words is determined in a Bayesian framework such that the confidence of the association can be provided. (3) Extensive evaluation on a large-scale, visually and semantically diverse image collection crawled from the Web is reported to evaluate the prototype system based on the model. In the proposed probabilistic model, a hidden concept layer which connects the visual feature and the word layer is discovered by fitting a generative model to the training images and annotation words through an Expectation-Maximization (EM) based iterative learning procedure. The evaluation of the prototype system on 17,000 images and 7,736 automatically extracted annotation words from crawled Web pages for multi-modal image retrieval has indicated that the proposed semantic model and the developed Bayesian framework are superior to a state-of-the-art peer system in the literature.


International World Wide Web Conferences | 2011

A stochastic learning-to-rank algorithm and its application to contextual advertising

Maryam Karimzadehgan; Wei Li; Ruofei Zhang; Jianchang Mao

This paper is concerned with the problem of learning a model to rank objects (Web pages, ads, etc.). We propose a framework where the ranking model is both optimized and evaluated using the same information retrieval measures, such as Normalized Discounted Cumulative Gain (NDCG) and Mean Average Precision (MAP). The main difficulty in direct optimization of NDCG and MAP is that these measures depend on the rank of objects and are not differentiable. Most learning-to-rank methods that attempt to optimize NDCG or MAP approximate such measures so that they can be differentiable. In this paper, we propose a simple yet effective stochastic optimization algorithm to directly minimize any loss function, which can be defined on NDCG or MAP for the learning-to-rank problem. The algorithm employs Simulated Annealing along with the Simplex method for its parameter search and finds the globally optimal parameters. Experimental results using the NDCG-Annealing algorithm, an instance of the proposed algorithm, on the LETOR benchmark data sets show that the proposed algorithm is both effective and stable when compared to the baselines provided in LETOR 3.0. In addition, we applied the algorithm to ranking ads in contextual advertising. Our method has been shown to significantly improve relevance in offline evaluation and business metrics in online tests in a real large-scale advertising serving system. To scale our computations, we parallelize the algorithm in a MapReduce framework running on Hadoop.
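
The non-differentiability mentioned in the abstract is easy to see in code. The sketch below computes NDCG with the widely used exponential gain and logarithmic discount; the abstract does not state which NDCG variant the paper uses, so this choice, like the function names and the toy data, is an assumption.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with gain 2^rel - 1 and log2 discount."""
    return sum(
        (2 ** rel - 1) / math.log2(position + 2)
        for position, rel in enumerate(relevances)
    )


def ndcg(scores, labels, k=None):
    """NDCG of the ordering induced by sorting items by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order]
    ideal = sorted(labels, reverse=True)
    if k is not None:
        ranked, ideal = ranked[:k], ideal[:k]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


labels = [2, 1, 0]                        # graded relevance of three items
perfect = ndcg([3.0, 2.0, 1.0], labels)   # scores agree with the labels
nudged = ndcg([3.1, 2.0, 1.0], labels)    # same ordering, different scores
reversed_order = ndcg([1.0, 2.0, 3.0], labels)
```

Only the sort order of the scores matters, so NDCG is piecewise constant in the model parameters: small parameter changes leave it unchanged until two items swap places. Gradient-based training gets no signal from such a function, which is the motivation for search procedures that need only function evaluations.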


Multimedia Information Retrieval | 2003

Addressing CBIR efficiency, effectiveness, and retrieval subjectivity simultaneously

Ruofei Zhang; Zhongfei Zhang

This work is about Content-Based Image Retrieval (CBIR), focusing on developing a Fast And Semantics-Tailored (FAST) image retrieval methodology. Specifically, the contributions of the FAST methodology to the CBIR literature include: (1) development of a new indexing method based on fuzzy logic to incorporate color, texture, and shape information into a region-based approach to improve the retrieval effectiveness and robustness; (2) development of a new hierarchical indexing structure and the corresponding Hierarchical, Elimination-based A* Retrieval algorithm (HEAR) to significantly improve the retrieval efficiency without sacrificing the retrieval effectiveness; it is shown that HEAR is guaranteed to deliver a logarithmic search in the average case; and (3) employment of user relevance feedback to tailor the semantic retrieval to each user's individualized query preference through the novel Indexing Tree Pruning (ITP) and Adaptive Region Weight Updating (ARWU) algorithms. Theoretical analysis and experimental evaluations show that the FAST methodology holds great promise in delivering fast and semantics-tailored image retrieval in CBIR.


Archive | 2008

Multimedia Data Mining: A Systematic Introduction to Concepts and Theory

Zhongfei Zhang; Ruofei Zhang

Collecting the latest developments in the field, Multimedia Data Mining: A Systematic Introduction to Concepts and Theory defines multimedia data mining, its theory, and its applications. Two of the most active researchers in multimedia data mining explore how this young area has rapidly developed in recent years. The book first discusses the theoretical foundations of multimedia data mining, presenting commonly used feature representation, knowledge representation, statistical learning, and soft computing techniques. It then provides application examples that showcase the great potential of multimedia data mining technologies. In this part, the authors show how to develop a semantic repository training method and a concept discovery method in an imagery database. They demonstrate how knowledge discovery helps achieve the goal of imagery annotation. The authors also describe an effective solution to large-scale video search, along with an application of audio data classification and categorization. This novel, self-contained book examines how the merging of multimedia and data mining research can promote the understanding and advance the development of knowledge discovery in multimedia data.

Collaboration


Dive into Ruofei Zhang's collaboration.

Top Co-Authors

Wei Li

University of Massachusetts Amherst

Xuerui Wang

University of Massachusetts Amherst
