Publication


Featured research published by Yanjie Fu.


Knowledge Discovery and Data Mining | 2013

Learning geographical preferences for point-of-interest recommendation

Bin Liu; Yanjie Fu; Zijun Yao; Hui Xiong

The problem of point of interest (POI) recommendation is to provide personalized recommendations of places of interest, such as restaurants, for mobile users. Due to its connection to location based social networks (LBSNs), the decision process of a user choosing a POI is complex and can be influenced by various factors, such as user preferences, geographical influences, and user mobility behaviors. While there are some studies on POI recommendations, they lack an integrated analysis of the joint effect of multiple factors. To this end, in this paper, we propose a novel geographical probabilistic factor analysis framework which strategically takes various factors into consideration. Specifically, this framework captures the geographical influences on a user's check-in behavior. Also, the user mobility behaviors can be effectively exploited in the recommendation model. Moreover, the recommendation model can effectively make use of user check-in count data as implicit user feedback for modeling user preferences. Finally, experimental results on real-world LBSN data show that the proposed recommendation method outperforms state-of-the-art latent factor models by a significant margin.


IEEE Transactions on Knowledge and Data Engineering | 2015

A General Geographical Probabilistic Factor Model for Point of Interest Recommendation

Bin Liu; Hui Xiong; Spiros Papadimitriou; Yanjie Fu; Zijun Yao

The problem of point of interest (POI) recommendation is to provide personalized recommendations of places, such as restaurants and movie theaters. The increasing prevalence of mobile devices and of location based social networks (LBSNs) poses significant new opportunities as well as challenges, which we address. The decision process for a user to choose a POI is complex and can be influenced by numerous factors, such as personal preferences, geographical considerations, and user mobility behaviors. This is further complicated by the connection between LBSNs and mobile devices. While there are some studies on POI recommendations, they lack an integrated analysis of the joint effect of multiple factors. Meanwhile, although latent factor models have proven effective and are thus widely used for recommendations, adapting them to POI recommendations requires delicate consideration of the unique characteristics of LBSNs. To this end, in this paper, we propose a general geographical probabilistic factor model (Geo-PFM) framework which strategically takes various factors into consideration. Specifically, this framework captures the geographical influences on a user's check-in behavior. Also, user mobility behaviors can be effectively leveraged in the recommendation model. Moreover, based on our Geo-PFM framework, we further develop a Poisson Geo-PFM which provides a more rigorous probabilistic generative process for the entire model and is effective in modeling the skewed user check-in count data as implicit feedback for better POI recommendations. Finally, extensive experimental results on three real-world LBSN datasets (which differ in terms of user mobility, POI geographical distribution, implicit response data skewness, and user-POI observation sparsity) show that the proposed recommendation methods outperform state-of-the-art latent factor models by a significant margin.
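
As a rough illustration of treating check-in counts as Poisson-distributed implicit feedback, the sketch below scores observed counts under a plain Poisson factor model. It is not the Geo-PFM described in the abstract: the geographical and mobility components are omitted, and the names `poisson_log_likelihood`, `user_factors`, and `poi_factors` are hypothetical. It assumes every user-POI rate is strictly positive.

```python
import math

def poisson_log_likelihood(counts, user_factors, poi_factors):
    """Sum of log Poisson(y | rate = u_i . v_j) over observed check-in counts.

    counts maps (user_index, poi_index) -> observed check-in count.
    Assumption: all factor entries are nonnegative and each rate is > 0.
    """
    total = 0.0
    for (i, j), y in counts.items():
        # Poisson rate is the inner product of the two latent factor vectors.
        rate = sum(a * b for a, b in zip(user_factors[i], poi_factors[j]))
        # log Poisson pmf: y*log(rate) - rate - log(y!)
        total += y * math.log(rate) - rate - math.lgamma(y + 1)
    return total
```

A Poisson likelihood suits skewed count data because it is defined on nonnegative integers and lets a few very large counts coexist with many small ones.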


Knowledge Discovery and Data Mining | 2014

Exploiting geographic dependencies for real estate appraisal: a mutual perspective of ranking and clustering

Yanjie Fu; Hui Xiong; Yong Ge; Zijun Yao; Yu Zheng; Zhi-Hua Zhou

It is traditionally a challenge for home buyers to understand, compare and contrast the investment values of real estate properties. While a number of estate appraisal methods have been developed to value real property, the performance of these methods has been limited by the traditional data sources for estate appraisal. However, with the development of new ways of collecting estate-related mobile data, there is a potential to leverage geographic dependencies of estates for enhancing estate appraisal. Indeed, the geographic dependencies of the value of an estate can stem from the characteristics of its own neighborhood (individual), the values of its nearby estates (peer), and the prosperity of the affiliated latent business area (zone). To this end, in this paper, we propose a geographic method, named ClusRanking, for estate appraisal by leveraging the mutual enforcement of ranking and clustering power. ClusRanking is able to exploit geographic individual, peer, and zone dependencies in a probabilistic ranking model. Specifically, we first extract the geographic utility of estates from geography data, estimate the neighborhood popularity of estates by mining taxicab trajectory data, and model the influence of latent business areas via ClusRanking. Also, we use a linear model to fuse these three influential factors and predict estate investment values. Moreover, we simultaneously consider individual, peer and zone dependencies, and derive an estate-specific ranking likelihood as the objective function. Finally, we conduct a comprehensive evaluation with real-world estate-related data, and the experimental results demonstrate the effectiveness of our method.


Knowledge Discovery and Data Mining | 2015

Real Estate Ranking via Mixed Land-use Latent Models

Yanjie Fu; Guannan Liu; Spiros Papadimitriou; Hui Xiong; Yong Ge; Hengshu Zhu; Chen Zhu

Mixed land use refers to the effort of putting residential, commercial and recreational uses in close proximity to one another. This can contribute economic benefits, support viable public transit, and enhance the perceived security of an area. It is naturally promising to investigate how to rank real estate from the viewpoint of diverse mixed land use, which can be reflected by the portfolio of community functions in the observed area. To that end, in this paper, we develop a geographical function ranking method, named FuncDivRank, by incorporating the functional diversity of communities into real estate appraisal. Specifically, we first design a geographic function learning model to jointly capture the correlations among estate neighborhoods, urban functions, temporal effects, and user mobility patterns. In this way we can learn latent community functions and the corresponding portfolios of estates from human mobility data and Point of Interest (POI) data. Then, we learn the estate ranking indicator by simultaneously maximizing ranking consistency and functional diversity, in a unified probabilistic optimization framework. Finally, we conduct a comprehensive evaluation with real-world data. The experimental results demonstrate the enhanced performance of the proposed method for real estate appraisal.


International Conference on Data Mining | 2015

Station Site Optimization in Bike Sharing Systems

Junming Liu; Qiao Li; Meng Qu; Weiwei Chen; Jingyuan Yang; Hui Xiong; Hao Zhong; Yanjie Fu

Bike sharing systems, aiming at providing the missing links in public transportation systems, are becoming popular in cities. In an ideal bike sharing network, the station locations are usually selected so that pick-ups and drop-offs are balanced among stations. This can help avoid expensive re-balancing operations and maintain high user satisfaction. However, it is a challenging task to develop such an efficient bike sharing system with appropriate station locations. Indeed, the bike station demand is influenced by multiple factors of the surrounding environment and complex public transportation networks. Limited efforts have been made to develop demand-and-balance prediction models for bike sharing systems by considering all these factors. To this end, in this paper, we propose a bike sharing network optimization approach by considering multiple influential factors. The goal is to enhance the quality and efficiency of the bike sharing service by selecting the right station locations. Along this line, we first extract fine-grained discriminative features from human mobility data, points of interest (POIs), as well as station network structures. Then, prediction models based on Artificial Neural Networks (ANN) are developed for predicting station demand and balance. In addition, based on the learned patterns of station demand and balance, a genetic algorithm based optimization model is built to choose a set of stations from a large number of candidates such that the station usage is maximized and the number of unbalanced stations is minimized. Finally, the extensive experimental results on the NYC CitiBike sharing system show the advantages of our approach for optimizing the station site allocation in terms of the bike usage as well as the required re-balancing efforts.
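
The station selection step can be pictured with a small genetic algorithm sketch. Everything here is an assumption made for illustration rather than the paper's formulation: the fitness function (total predicted demand minus a fixed penalty per unbalanced station), the `tolerance` and `penalty` values, the crossover and mutation operators, and all function names. The `demand` and `balance` lists stand in for the outputs of the prediction models.

```python
import random

def fitness(selection, demand, balance, tolerance=0.2, penalty=10.0):
    """Total predicted usage minus a penalty per unbalanced station (assumed form)."""
    usage = sum(demand[s] for s in selection)
    unbalanced = sum(1 for s in selection if abs(balance[s]) > tolerance)
    return usage - penalty * unbalanced

def crossover(a, b, k, rng):
    """Child draws k stations from the union of both parents."""
    pool = sorted(set(a) | set(b))
    return tuple(sorted(rng.sample(pool, k)))

def mutate(selection, n, rng):
    """Swap one selected station for one unselected candidate."""
    chosen = set(selection)
    outside = [c for c in range(n) if c not in chosen]
    if not outside:
        return tuple(sorted(chosen))
    chosen.remove(rng.choice(sorted(chosen)))
    chosen.add(rng.choice(outside))
    return tuple(sorted(chosen))

def select_stations(demand, balance, k, generations=50, pop_size=20, seed=0):
    """Search for k station indices with high fitness among len(demand) candidates."""
    rng = random.Random(seed)
    n = len(demand)
    score = lambda s: fitness(s, demand, balance)
    population = [tuple(sorted(rng.sample(range(n), k))) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # elitism: the best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            child = crossover(rng.choice(parents), rng.choice(parents), k, rng)
            if rng.random() < 0.3:  # assumed mutation rate
                child = mutate(child, n, rng)
            children.append(child)
        population = parents + children
    return max(population, key=score)
```

Because the best half of each generation is carried over unchanged, the best fitness in the population never decreases from one generation to the next.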


International Conference on Data Mining | 2014

Exploiting Heterogeneous Human Mobility Patterns for Intelligent Bus Routing

Yanchi Liu; Chuanren Liu; Nicholas Jing Yuan; Lian Duan; Yanjie Fu; Hui Xiong; Songhua Xu; Junjie Wu

Optimal planning for public transportation is one of the keys to sustainable development and better quality of life in urban areas. Compared to private transportation, public transportation uses road space more efficiently and produces fewer accidents and emissions. In this paper, we focus on the identification and optimization of flawed bus routes to improve utilization efficiency of public transportation services, according to people's real demand for public transportation. To this end, we first provide an integrated mobility pattern analysis between the location traces of taxicabs and the mobility records in bus transactions. Based on mobility patterns, we propose a localized transportation mode choice model, with which we can accurately predict the bus travel demand for different bus routes. This model is then used for bus routing optimization, which aims to convert as many people from private transportation to public transportation as possible given budget constraints on the bus route modification. We also leverage the model to identify region pairs with flawed bus routes, which are effectively optimized using our approach. To validate the effectiveness of the proposed methods, extensive studies are performed on real-world data collected in Beijing, which contain 19 million taxi trips and 10 million bus trips.


International Conference on Data Mining | 2016

POI Recommendation: A Temporal Matching between POI Popularity and User Regularity

Zijun Yao; Yanjie Fu; Bin Liu; Yanchi Liu; Hui Xiong

Point of interest (POI) recommendation, which provides personalized recommendation of places to mobile users, is an important task in location-based social networks (LBSNs). However, quite different from traditional interest-oriented merchandise recommendation, POI recommendation is more complex due to the timing effects: we need to examine whether the POI fits a user's availability. While there are some prior studies which included the temporal effect in POI recommendations, they overlooked the compatibility between the time-varying popularity of POIs and the regular availability of users, which we believe has a non-negligible impact on user decision-making. To this end, in this paper, we present a novel method which incorporates the degree of temporal matching between users and POIs into personalized POI recommendations. Specifically, we first profile the temporal popularity of POIs to show when a POI is popular to visit by mining the spatio-temporal human mobility and POI category data. Secondly, we propose latent user regularities to characterize when a user is regularly available for exploring POIs, which are learned with a user-POI temporal matching function. Finally, results of extensive experiments with real-world POI check-in and human mobility data demonstrate that our proposed user-POI temporal matching method delivers substantial advantages over baseline models for POI recommendation tasks.
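
One simple way to picture a degree of temporal matching is to compare a POI's popularity profile with a user's availability profile over the same time slots. The sketch below uses cosine similarity as an assumed stand-in; the abstract does not specify the matching function, and in the paper the user regularities are latent and learned rather than given. The function and parameter names are hypothetical.

```python
import math

def temporal_match(poi_popularity, user_availability):
    """Cosine similarity between two equal-length time-slot profiles.

    Each list holds one nonnegative weight per time slot (e.g. per hour).
    Returns 0.0 if either profile is all zeros.
    """
    dot = sum(p * a for p, a in zip(poi_popularity, user_availability))
    norm = math.sqrt(sum(p * p for p in poi_popularity)) * math.sqrt(
        sum(a * a for a in user_availability)
    )
    return dot / norm if norm else 0.0
```

Under this stand-in, a POI that is busy only in the evening scores high for a user who is free in the evening and zero for a user who is free only in the morning.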


IEEE Transactions on Mobile Computing | 2016

Service Usage Classification with Encrypted Internet Traffic in Mobile Messaging Apps

Yanjie Fu; Hui Xiong; Xinjiang Lu; Jin Yang; Can Chen

The rapid adoption of mobile messaging Apps has enabled us to collect massive amounts of encrypted Internet traffic of mobile messaging. The classification of this traffic into different types of in-App service usages can support intelligent network management, such as managing network bandwidth budget and providing quality of service. Traditional approaches for classification of Internet traffic rely on packet inspection, such as parsing HTTP headers. However, messaging Apps are increasingly using secure protocols, such as HTTPS and SSL, to transmit data. This imposes significant challenges on the performance of service usage classification by packet inspection. To this end, in this paper, we investigate how to exploit encrypted Internet traffic for classifying in-App usages. Specifically, we develop a system, named CUMMA, for classifying service usages of mobile messaging Apps by jointly modeling user behavioral patterns, network traffic characteristics, and temporal dependencies. Along this line, we first segment Internet traffic from traffic-flows into sessions with a number of dialogs in a hierarchical way. Also, we extract the discriminative features of traffic data from two perspectives: (i) packet length and (ii) time delay. Next, we learn a service usage predictor to classify these segmented dialogs into single-type usages or outliers. In addition, we design a clustering Hidden Markov Model (HMM) based method to detect mixed dialogs from outliers and decompose mixed dialogs into sub-dialogs of single-type usage. Indeed, CUMMA enables mobile analysts to identify service usages and analyze end-user in-App behaviors even for encrypted Internet traffic. Finally, the extensive experiments on real-world messaging data demonstrate the effectiveness and efficiency of the proposed method for service usage classification.


SIAM International Conference on Data Mining | 2014

A New Framework for Traffic Anomaly Detection

Jinsong Lan; Cheng Long; Raymond Chi-Wing Wong; Youyang Chen; Yanjie Fu; Danhuai Guo; Shuguang Liu; Yong Ge; Yuanchun Zhou; Jianhui Li

Trajectory data is becoming more and more popular nowadays and extensive studies have been conducted on trajectory data. One important research direction for trajectory data is anomaly detection, which aims to find all anomalies based on trajectory patterns in a road network. In this paper, we introduce a road segment-based anomaly detection problem, which is to detect the abnormal road segments, each of which has its “real” traffic deviating from its “expected” traffic, and to infer the major causes of anomalies on the road network. First, a deviation-based method is proposed to quantify the anomaly of each road segment. Second, based on the observation that one anomaly from a road segment can trigger other anomalies from the road segments nearby, a diffusion-based method based on a heat diffusion model is proposed to infer the major causes of anomalies on the whole road network. To validate our methods, we conduct intensive experiments on a large real-world GPS dataset of about 23,000 taxis in Shenzhen, China to demonstrate the performance of our algorithms.
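
The two ideas in this abstract, a per-segment deviation score and heat spreading between neighboring segments, can be sketched as follows. Both formulas are assumptions chosen for illustration (relative deviation, and a single explicit diffusion step with coefficient `alpha`); the abstract does not give the paper's definitions, and the sketch shows only forward diffusion, not the inference of root causes. All names are hypothetical.

```python
def deviation_scores(real, expected):
    """Relative deviation of observed from expected traffic per road segment.

    Assumption: expected traffic is strictly positive for every segment.
    """
    return {seg: abs(real[seg] - expected[seg]) / expected[seg] for seg in real}

def diffuse(heat, neighbors, alpha=0.5):
    """One explicit heat-diffusion step on an undirected road graph.

    heat maps segment -> current anomaly heat; neighbors maps segment -> list
    of adjacent segments. Heat flows from hotter to cooler neighbors.
    """
    return {
        seg: h + alpha * sum(heat[n] - h for n in neighbors[seg])
        for seg, h in heat.items()
    }
```

On a symmetric neighbor graph this update conserves total heat, so an anomaly at one segment is redistributed to nearby segments rather than created or destroyed.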


Knowledge Discovery and Data Mining | 2016

Days on Market: Measuring Liquidity in Real Estate Markets

Hengshu Zhu; Hui Xiong; Fangshuang Tang; Qi Liu; Yong Ge; Enhong Chen; Yanjie Fu

Days on Market (DOM) refers to the number of days a property is on the active market, which is an important measurement of market liquidity in the real estate industry. Indeed, at the micro level, DOM is not only a special concern of house sellers, but also a useful indicator for potential buyers to evaluate the popularity of a house. At the macro level, DOM is an important indicator of real estate market status. However, it is very challenging to measure DOM, since a variety of factors can affect the DOM of a property. To this end, in this paper, we aim to measure real estate liquidity by examining multiple factors in a holistic manner. A specific goal is to predict the DOM of a given property listing. Specifically, we first extract key features from multiple types of heterogeneous real estate-related data, such as house profiles and geo-social information of residential communities. Then, based on these features, we develop a multi-task learning based regression approach for predicting the DOM of real estate listings. This approach can effectively learn district-aware models for different property listings by considering multiple factors. Finally, we conduct extensive experiments on real-world real estate data collected in Beijing and develop a prototype system for practical use. The experimental results clearly validate the effectiveness of the proposed approach for measuring liquidity in real estate markets.

Collaboration


Dive into Yanjie Fu's collaborations.

Top Co-Authors

Yong Ge

University of Arizona

Jiawei Zhang

Florida State University

Enhong Chen

University of Science and Technology of China
