Shuhui Jiang | Researchain




Publication

Featured research published by Shuhui Jiang.

International Conference on Multimedia and Expo | 2013

Generating representative images for landmark by discovering high frequency shooting locations from community-contributed photos

Shuhui Jiang; Xueming Qian; Yao Xue; Fan Li; Xingsong Hou

Representative image generation offers comprehensive knowledge of a landmark and has been an active research area in recent years. This paper presents a representative image generation system that discovers high-frequency shooting locations from geo-tagged community-contributed photos. We observe that the views (e.g., far and near, front, back, and side) of photos taken at the same location are usually similar, but differ between shooting locations. Our system works in three steps: 1) a landmark dataset is filtered from social media by combining tags and geo-tags; 2) high-frequency shooting locations are mined by geo-tag clustering; 3) visual features are then used to remove irrelevant images and to rank the remaining ones by intra and inter SIFT matching. This work is the first attempt to generate representative images by mining high-frequency shooting locations. Evaluation on ten landmarks shows its effectiveness.
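As a rough illustration of step 2, the sketch below finds dense shooting spots by bucketing geo-tags into a latitude/longitude grid. The abstract does not say which clustering method the paper uses, so the grid approach, the function name `frequent_shooting_cells`, and the parameters `cell_deg` and `min_photos` are all assumptions made for this example.

```python
from collections import Counter


def frequent_shooting_cells(geotags, cell_deg=0.0005, min_photos=3):
    """Return grid cells that contain at least `min_photos` geo-tags.

    geotags: iterable of (lat, lon) pairs in decimal degrees.
    cell_deg: grid cell size in degrees (hypothetical default).
    Result: list of ((cell_lat, cell_lon), photo_count), densest first.
    """
    # Map every geo-tag to the integer index of its nearest grid point.
    counts = Counter(
        (round(lat / cell_deg), round(lon / cell_deg)) for lat, lon in geotags
    )
    # Keep only cells with enough photos and convert indices back to degrees.
    return [
        ((i * cell_deg, j * cell_deg), n)
        for (i, j), n in counts.most_common()
        if n >= min_photos
    ]
```

A density-based method such as mean shift or DBSCAN would avoid the fixed grid boundaries, at the cost of an extra dependency.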

Conference on Multimedia Modeling | 2015

Travel Recommendation via Author Topic Model Based Collaborative Filtering

Shuhui Jiang; Xueming Qian; Jialie Shen; Tao Mei

While automatic travel recommendation has attracted a lot of attention, existing approaches generally suffer from various weaknesses. For example, the sparsity problem can significantly degrade the performance of traditional collaborative filtering (CF). If a user visits only a few locations, accurately identifying similar users becomes very challenging due to the lack of sufficient information. Motivated by this concern, we propose an Author Topic Collaborative Filtering (ATCF) method to facilitate comprehensive Points of Interest (POIs) recommendation for social media users. In our approach, the topics of user preference (e.g., cultural, cityscape, or landmark) are extracted from the textual descriptions of photos by an author topic model instead of from GPS (geo-tag) data. Consequently, unlike CF-based approaches, similar users can still be identified accurately, even without GPS records, according to the similarity of users’ topic preferences. In addition, ATCF doesn’t pre-define the categories of travel topics. The categories and user topic preferences can be elicited simultaneously. Experimental results on a large test collection demonstrate the advantages of our approach.
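The idea of finding similar users through topic preferences rather than GPS records can be sketched as follows. This is only a minimal illustration: it assumes each user's topic distribution has already been inferred (the author topic model itself is not shown), and the function names, the cosine similarity measure, and the similarity-weighted scoring are choices made for this example, not details taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend_pois(target, topic_prefs, visits, top_k=3):
    """Rank POIs the target user has not visited yet.

    topic_prefs: user -> topic preference vector (assumed precomputed).
    visits: user -> set of visited POI names.
    Each POI is scored by the summed topic similarity of the users who visited it.
    """
    seen = visits.get(target, set())
    scores = {}
    for user, prefs in topic_prefs.items():
        if user == target:
            continue
        sim = cosine(topic_prefs[target], prefs)
        for poi in visits.get(user, set()):
            if poi not in seen:
                scores[poi] = scores.get(poi, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Because similarity is computed on topic vectors, a user with no geo-tagged visits can still receive recommendations as long as their photo descriptions yield a topic profile.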

ACM Multimedia | 2016

Deep Bi-directional Cross-triplet Embedding for Cross-Domain Clothing Retrieval

Shuhui Jiang; Yue Wu; Yun Fu

In this paper, we address two practical problems when shopping online: 1) What will I look like when wearing this clothing on the street? 2) How can I find the exact same or similar clothing that other people are wearing on the street or in a movie? We jointly solve these two problems with one bi-directional shop-to-street and street-to-shop clothing retrieval framework. The cross-domain clothing retrieval task poses three main challenges. First, the discrepancy (e.g., background, pose, illumination) between street-domain and shop-domain clothing must be learned. Second, both intra-domain and cross-domain similarity need to be considered during feature embedding. Third, there is a large imbalance between the numbers of matched and non-matched street and shop pairs. To address these challenges, we propose a deep bi-directional cross-triplet embedding algorithm that extends state-of-the-art triplet embedding to the cross-domain retrieval scenario. Extensive experiments demonstrate the effectiveness of the proposed algorithm.

International Symposium on Circuits and Systems | 2013

Mobile multimedia travelogue generation by exploring geo-locations and image tags

Shuhui Jiang; Xueming Qian; Ke Lan; Lei Zhang; Tao Mei

Sharing travel experiences has become pervasive. In this paper, we present a system that automatically generates a multimedia travelogue for mobile users. The multimedia travelogue shows the user's footprint, with photos placed at the corresponding locations on a map, which conveys the spatial distribution within and between locations. Meanwhile, location overviews with representative tags and images provide comprehensive knowledge of the landmarks. The key challenges are: 1) when mapping the footprint, some photos lack geo-tags; 2) user-contributed photos may be neither clear nor visually pleasing; 3) the landmark actually shown in a photo must be detected in order to offer a location overview. We address these challenges by combining geo-tags and textual tags to estimate location. First, we use the relationship between timestamps and geo-tags within the same trip group to estimate the footprints. Second, we localize the landmark in a photo with a tag-matching method that matches user tags against the representative tags of the landmark. Experimental results on a Flickr image collection of nearly 2 million images from 6,581 users demonstrate the effectiveness of our approach.
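The tag-matching step can be pictured with a small sketch. It assumes the representative tags of each landmark are already available as a dictionary and scores a photo simply by the number of tags it shares with each landmark. The function name `match_landmark`, the overlap-count score, and the case-insensitive comparison are assumptions for illustration; the abstract does not specify the actual matching rule.

```python
def match_landmark(photo_tags, landmark_tags):
    """Return the landmark whose representative tags overlap most with the photo's tags.

    photo_tags: iterable of user-supplied tag strings for one photo.
    landmark_tags: landmark name -> iterable of representative tag strings.
    Returns None when no landmark shares any tag with the photo.
    """
    photo = {tag.lower() for tag in photo_tags}
    best_name, best_score = None, 0
    for name, tags in landmark_tags.items():
        # Overlap count between the photo's tags and this landmark's tags.
        score = len(photo & {tag.lower() for tag in tags})
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```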

International Joint Conference on Artificial Intelligence | 2017

Fashion Style Generator

Shuhui Jiang; Yun Fu

In this paper, we focus on a new problem: applying artificial intelligence to automatically generate fashion style images. Given a basic clothing image and a fashion style image (e.g., leopard print), we generate a clothing image in the given style in real time with a neural fashion style generator. Fashion style generation is related to recent work on artistic style transfer, but has its own challenges. The synthetic image should preserve the design of the basic clothing while blending the new style pattern onto the clothing. Neither existing global nor patch-based neural style transfer methods solve these challenges well. We propose an end-to-end feed-forward neural network that consists of a fashion style generator and a discriminator. The global and patch-based style and content losses calculated by the discriminator are alternately back-propagated to the generator network to optimize it. The global optimization stage preserves the clothing form and design, and the local optimization stage preserves the detailed style pattern. Extensive experiments show that our method outperforms the state of the art.

Multimedia Signal Processing | 2015

Visual summarization for place-of-interest by social-contextual constrained geo-clustering

Yayun Ren; Xueming Qian; Shuhui Jiang

With the rapid development of social networks, more and more users choose to share their own photos with their friends. In particular, users like to share the photos they took while traveling, so a large amount of user-generated content for places of interest (POIs) has emerged. Based on these user-contributed photos, we can summarize each POI by mining locations of interest (LOIs, which represent the attractive viewpoints of a POI) and selecting representative images from them. Such summaries are important for trip planning. In this paper, we propose an effective POI summarization approach based on improved geo-clustering with visual and view verification, which gives a representative and comprehensive impression of a POI. In our approach, we first collect POI-related photos from social media and filter the raw data by combining tags and geo-locations. Second, we mine LOIs for each POI with the improved geo-location clustering method. Finally, we employ visual and view verification to select images from the LOIs to summarize the POI. We conduct a series of experiments on a Flickr dataset. Experimental results demonstrate the effectiveness of the proposed method.

ACM Multimedia | 2017

Deep Low-rank Sparse Collective Factorization for Cross-Domain Recommendation

Shuhui Jiang; Zhengming Ding; Yun Fu

In cross-domain recommendation, data sparsity becomes more and more serious when the ratings are expressed numerically, e.g., as 5-star grades. In this work, we focus on borrowing knowledge from other domains in the form of binary ratings, such as likes and dislikes for certain items. Most existing works assume that multiple domains share some common latent information across users and items. In practice, however, related domains share not only the common latent features of users and items, but also some knowledge of rating patterns. Furthermore, conventional methods do not consider the hierarchical structures (i.e., genre, sub-genre, detailed category) in real-world recommendation systems. In this paper, we propose a novel Deep Low-rank Sparse Collective Factorization (DLSCF) to facilitate cross-domain recommendation. Specifically, low-rank sparse decomposition is adopted to capture the most shared rating patterns with a low-rank constraint while integrating the domain-specific patterns with a group-sparse scheme. Furthermore, we factorize the rating pattern matrix in multiple layers to obtain the user/item latent category affiliation matrices, which indicate the affiliation relation between latent categories and latent sub-categories. Experimental results on the MoviePilot and Netflix datasets demonstrate the effectiveness of the proposed algorithm at various sparsity levels, in comparison with several state-of-the-art approaches.

IEEE Transactions on Pattern Analysis and Machine Intelligence | 2017

Learning Consensus Representation for Weak Style Classification

Shuhui Jiang; Ming Shao; Chengcheng Jia; Yun Fu

Style classification (e.g., Baroque and Gothic architecture styles) is attracting increasing attention in many fields such as fashion, architecture, and manga. Most existing methods focus on extracting discriminative features from local patches or patterns. However, the spread-out phenomenon in style classification has not yet been recognized: visually less representative images in a style class are usually very diverse and easily misclassified. We call them weak style images. Another issue when employing multiple visual features for effective weak style classification is the lack of consensus among different features. That is, the weights for different visual features in the same local patch should be allocated similar values. To address these issues, we propose a Consensus Style Centralizing Auto-Encoder (CSCAE) for learning robust style feature representations, especially for weak style classification. First, we propose a Style Centralizing Auto-Encoder (SCAE), which centralizes weak style features in a progressive way. Then, based on SCAE, we propose both non-linear and linear versions of CSCAE, which adaptively allocate weights for different features during the progressive centralization process. Consensus constraints are added based on the assumption that the weights of different features of the same patch should be similar. Specifically, the proposed linear counterpart of CSCAE, motivated by the “shared weights” idea as well as group sparsity, improves both efficacy and efficiency. For evaluation, we experiment extensively on fashion, manga, and architecture style classification problems. In addition, we collect a new dataset, Online Shopping, for fashion style classification, which will be publicly available for vision-based fashion style research. Experiments demonstrate the effectiveness of SCAE and CSCAE on both public and newly collected datasets when compared with the most recent state-of-the-art works.

ACM Transactions on Multimedia Computing, Communications, and Applications | 2018

Deep Bidirectional Cross-Triplet Embedding for Online Clothing Shopping

Shuhui Jiang; Yue Wu; Yun Fu

In this article, we address the cross-domain (i.e., street and shop) clothing retrieval problem and investigate its real-world applications for online clothing shopping. It is a challenging problem due to the large discrepancy between street and shop domain images. We focus on learning an effective feature-embedding model to generate robust and discriminative feature representations across domains. Existing triplet embedding models achieve promising results by finding an embedding metric in which the distance between negative pairs is larger than the distance between positive pairs plus a margin. However, existing methods do not sufficiently address the challenges of the cross-domain clothing retrieval scenario. First, the intradomain and cross-domain data relationships need to be considered simultaneously. Second, the numbers of matched and nonmatched cross-domain pairs are unbalanced. To address these challenges, we propose a deep cross-triplet embedding algorithm together with a cross-triplet sampling strategy. Extensive experimental evaluations demonstrate the effectiveness of the proposed algorithms. Furthermore, we investigate two novel online shopping applications, clothing try-on and accessory recommendation, based on a unified cross-domain clothing retrieval framework.
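The margin condition that standard triplet embedding models optimize can be written as a hinge loss. The sketch below shows only this generic single-domain form, not the cross-triplet variant the article proposes. The Euclidean distance, the default margin of 0.2, and the function names are assumptions chosen for illustration.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Generic triplet hinge loss.

    The loss is zero once the anchor-negative distance exceeds the
    anchor-positive distance by at least `margin`; otherwise it grows
    linearly with the size of the violation.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

In a cross-domain setting, the anchor might be a street photo embedding while the positive and negative are shop photo embeddings, but how the triplets are formed and sampled is specific to the article and is not reproduced here.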

IEEE Transactions on Big Data | 2017

Deep Geo-constrained Auto-encoder for Non-landmark GPS Estimation

Shuhui Jiang; Yu Kong; Yun Fu

This paper addresses the problem of geotagging images, i.e., assigning GPS coordinates (latitude, longitude) to images using image contents. Due to the huge appearance variability of visual features across the world, the images’ contents and their GPS coordinates may be inconsistent. This means that images captured in geographically close areas may appear visually distinct, and images with visually similar contents may be taken in geographically distant areas. In this paper, we propose a deep Geo-constrained Auto-encoder (DGAE) to solve these inconsistency problems. Using clustered GPS data and visual data, our approach identifies inconsistent (image, GPS) data pairs. We then propose a novel deep learning framework that can learn similar feature representations for geographically close images and distinct feature representations for geographically distant images. We introduce two new constraints to our feature learning networks: the same-area constraint and the easy-confusing constraint. The former penalizes images from the same area that have very distinct visual features, and the latter penalizes images from distant areas that have very similar visual features. A deep architecture is developed to further improve the learning of discriminative features, which can disambiguate different geometric locations. Our approach is extensively evaluated on a newly compiled, large image geotagging dataset of 664,720 large-scale community-contributed images and outperforms the comparison approaches.
