Benfu Lv
Chinese Academy of Sciences
Publication
Featured research published by Benfu Lv.
PLOS ONE | 2013
Qingyu Yuan; Elaine O. Nsoesie; Benfu Lv; Geng Peng; Rumi Chunara; John S. Brownstein
Several approaches have been proposed for near real-time detection and prediction of the spread of influenza. These include the use of search query data for influenza-related terms, which has been explored as a tool for augmenting traditional surveillance methods. In this paper, we present a method that uses Internet search query data from Baidu to model and monitor influenza activity in China. The objectives of the study are to present a comprehensive technique for: (i) keyword selection, (ii) keyword filtering, (iii) index composition and (iv) modeling and detection of influenza activity in China. The sequential time series for the selected composite keyword index is significantly correlated with Chinese influenza case data. In addition, one-month-ahead prediction of influenza cases for the first eight months of 2012 has a mean absolute percent error of less than 11%. To our knowledge, this is the first study on the use of search query data from Baidu in conjunction with this approach for estimation of influenza activity in China.
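The accuracy figure quoted above is a mean absolute percent error (MAPE). A minimal Python sketch of that metric follows; the function name and the sample numbers are hypothetical and are not taken from the paper.

```python
def mean_absolute_percent_error(actual, predicted):
    """Return the MAPE in percent. Assumes no actual value is zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average the absolute relative errors, then express as a percentage.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical monthly case counts, for illustration only (not data from the paper).
actual_cases = [100.0, 200.0, 400.0]
predicted_cases = [110.0, 180.0, 400.0]
mape = mean_absolute_percent_error(actual_cases, predicted_cases)
```

Because each error is divided by the observed value, the metric is undefined when an observed count is zero, which is why the sketch states that assumption explicitly.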
Annals of Operations Research | 2015
Ying Liu; Hong Li; Geng Peng; Benfu Lv; Chong Zhang
Online customer segmentation is a significant research topic in customer relationship management. Previous literature mainly studied the differences between non-purchasers and purchasers and lacked further segmentation of online purchasers, although significant heterogeneity still exists within purchaser groups. This paper focuses on segmenting Chinese online purchasers based on a large volume of real transaction data from Taobao.com. We first extracted and investigated behavior indicators of Chinese online purchasers and classified the purchasers into six types by cluster analysis: economical purchasers, active-star purchasers, direct purchasers, high-loyalty purchasers, risk-averse purchasers and credibility-first purchasers. We then built an empirical model to estimate the sensitivity of each type of online purchaser to three mainstream promotion strategies (discount, advertising and word-of-mouth), and found that economical purchasers are the most sensitive to discount promotion, direct purchasers are the most sensitive to advertising promotion, and active-star purchasers are the most sensitive to word-of-mouth promotion. Finally, the implications of online purchaser classification for marketing strategies are discussed.
Proceedings of the Data Mining and Intelligent Knowledge Management Workshop on | 2012
Ying Liu; Benfu Lv; Geng Peng; Qingyu Yuan
The correlations between Internet search data and socio-economic indicators have been demonstrated in many studies, but the groundwork for these studies, data preprocessing, which determines the quality of the results, has lacked a systematic methodology. In this paper, we develop a comprehensive method for Internet search data preprocessing, which includes three critical steps: (a) keyword selection, (b) time difference measurement, and (c) leading index composition. Applying our method to Chinese stock market prices, we obtain a leading keyword index with a stable leading relationship and a high degree of fit. Specifically, the correlation coefficient between our leading keyword index and the Shanghai Composite Index reaches 98.7%, and a Granger test confirms that the keyword index has significant predictive ability for the Shanghai Composite Index. Adding the keyword index to the AR model reduces the MAPE from 3.8% to 1.4%, and each percentage point change in the keyword index is associated with a 0.136 percentage point move in the same direction in the Shanghai Composite Index in the next period.
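One common way to measure the time difference in step (b) is to correlate a keyword series with the target series at several leads and keep the lead with the highest correlation. The Python sketch below illustrates that idea only; the abstract does not say which measure the authors used, and the function names and sample series are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def best_lead(keyword, target, max_lead):
    """Return (lead, r) maximizing corr(keyword[t], target[t + lead]) for 0 <= lead <= max_lead."""
    best = (0, pearson(keyword, target))
    for lead in range(1, max_lead + 1):
        # Pair each keyword value with the target value `lead` periods later.
        r = pearson(keyword[:-lead], target[lead:])
        if r > best[1]:
            best = (lead, r)
    return best


# Hypothetical series: the target repeats the keyword series two periods later.
keyword_volume = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0]
target_index = [9.0, 9.0, 1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
lead, r = best_lead(keyword_volume, target_index, max_lead=3)
```

Each additional lead shortens the overlapping sample, so `max_lead` should stay small relative to the series length.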
International Conference on Wireless Communications, Networking and Mobile Computing | 2007
Haiquan Long; Benfu Lv; Tianbo Zhao; Ying Liu
The performance and capabilities of Web search engines are an important area of research. Millions of Chinese people use Web search engines every day. This paper describes an evaluation, based on human judgments, of three popular Chinese search engines, which are examined as a case study. We develop an online tool to assist human evaluation. The evaluation takes into account relevance as well as the subjectivity of search engine users. The results show that there is only a small degree of difference between the three engines. They also provide evidence that some factors do influence the experience of search engine users. The report may be a useful supplement to other comparative studies of search engines being undertaken by search engine researchers.
African Journal of Business Management | 2016
Xiaoxuan Li; Qi Wu; Geng Peng; Benfu Lv
In many studies, search engine data have proved effective as an explanatory variable for analysis and forecasting, including the prediction of tourism volumes. However, both the search data and the tourism volumes were always affected by noise. Without noise processing, the predictive ability of search engine data might be weak or even invalid. As a noise-processing method, the Hilbert-Huang Transform (HHT) can deal with non-linear and non-stationary data. This study proposed a model, CLSI-HHT, that combines denoising with forecasting from search engine data. The search queries were first composed into an index, and the noise was then extracted from the index and the tourism volume sequences by HHT. The study then forecast tourism volumes with the effective (denoised) series. The results demonstrated that the CLSI-HHT model significantly outperformed the baselines, while the index model without denoising performed nearly the same as the time series model. Moreover, wavelet transform and filtering were compared with HHT for denoising, and the results implied that HHT achieved a higher signal-to-noise ratio (SNR) and forecast more accurately. The study concluded that noise processing is necessary for tourism forecasting with search engine data, and that HHT can be an effective denoising method. Key words: Hilbert-Huang Transform (HHT), search engine data, noise-processing, wavelet transform.
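The SNR comparison mentioned above can be illustrated with one common definition: the ratio of reference-signal power to residual power, in decibels. The Python sketch below assumes that definition; the abstract does not state the exact formula used, and the function name and sample values are hypothetical.

```python
import math


def snr_db(reference, denoised):
    """SNR in dB, treating (reference - denoised) as the residual noise.

    Assumed definition: 10 * log10(sum(reference^2) / sum(residual^2)).
    """
    signal_power = sum(v * v for v in reference)
    noise_power = sum((v - d) ** 2 for v, d in zip(reference, denoised))
    if noise_power == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical values, for illustration only.
reference_series = [1.0, 2.0, 3.0, 4.0]
denoised_series = [1.1, 1.9, 3.1, 3.9]
snr = snr_db(reference_series, denoised_series)
```

Under this definition a higher value means the denoised series stays closer to the reference, which matches the sense in which one denoising method is said to beat another.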
Archive | 2012
Chong Zhang; Benfu Lv; Geng Peng; Ying Liu; Qingyu Yuan
Web search data, which record the concerns and interests of hundreds of millions of searchers, reflect trends in their behavior and provide an essential data basis for the study of macro-economic issues. This paper establishes a conceptual framework based on commodity market and equilibrium price theory and reveals a correlation and lead-lag relationship between web search data and the consumer price index (CPI). Empirical results indicate that there is a co-integration relationship between web search data and CPI. The model fits CPI well, with a model fit of 0.978.
Annals of Operations Research | 2015
Ying Liu; Yibing Chen; Sheng Wu; Geng Peng; Benfu Lv
Previous studies have revealed that Internet search data is a new source of data that can be used to predict the stock market. In this new, data-driven research field, choosing a method for preprocessing data is crucial to achieving accurate prediction performance. This paper proposes a preprocessing method for Internet search data, the composite leading search index (CLSI), which consists of three steps: (a) keyword selection, (b) time difference measurement, and (c) leading index composition. We demonstrate the validity of CLSI by comparing its results with those from the search volume index (SVI), which is the most commonly used method in previous literature. We build a time series model (TS) with error correction and a support vector regression (SVR) model for stock trend prediction, and combine them into four models for comparison: SVI–TS, CLSI–TS, SVI–SVR, and CLSI–SVR. We test these four models in the context of the Chinese stock market, which interests more and more investors, and analyze the results on nine datasets: stable periods, peak periods and trough periods of the Shanghai Composite Index, the Shenzhen Composite Index, and the Hushen 300 Index, respectively. The results show that, with TS and SVR as forecasting models, CLSI performs better than SVI on the majority of the test datasets and has almost the same performance as SVI on the remaining ones. This suggests, to some extent, that CLSI is a more efficient preprocessing method for Internet search data in stock trend prediction.
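Step (c), leading index composition, can be sketched as follows, under the assumption that each selected keyword series is standardized, shifted by its measured lead, and averaged with equal weights. The abstract does not describe the actual weighting, so the Python code below is an illustrative sketch with hypothetical names and data, not the CLSI procedure itself.

```python
import statistics


def zscore(series):
    """Standardize a series to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(series)
    spread = statistics.pstdev(series)
    return [(value - mean) / spread for value in series]


def compose_leading_index(keyword_series, leads):
    """Average standardized keyword series after shifting each by its lead.

    The value returned for target period t uses keyword k at period t - leads[k],
    so the first max(leads) periods are dropped. Equal weights are an assumption.
    """
    standardized = {name: zscore(values) for name, values in keyword_series.items()}
    length = min(len(values) for values in standardized.values())
    max_lead = max(leads.values())
    index = []
    for t in range(max_lead, length):
        aligned = [standardized[name][t - leads[name]] for name in standardized]
        index.append(statistics.fmean(aligned))
    return index


# Hypothetical keyword volumes and leads, for illustration only.
keyword_series = {"keyword_a": [1.0, 2.0, 3.0, 4.0], "keyword_b": [2.0, 4.0, 6.0, 8.0]}
leads = {"keyword_a": 0, "keyword_b": 1}
leading_index = compose_leading_index(keyword_series, leads)
```

Standardizing first keeps a high-volume keyword from dominating the composite purely because of its scale.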
Archive | 2012
Fan Liu; Benfu Lv; Geng Peng; Xiuting Li
Recent research has demonstrated that search data can be used to detect public health trends and to support short-term syndromic surveillance. In this paper, we study the problem of influenza epidemic surveillance using Google search data. We develop a hybrid model with a dynamic search query set, which was more accurate in influenza forecasting than Google Flu Trends, especially for forecasts of irregular new influenza strains. This research is valuable for improving the timeliness of syndromic surveillance.
Archive | 2012
Ying Liu; Benfu Lv; Geng Peng; Chong Zhang
Internet search data can be used to study market transaction behavior. We first establish a conceptual framework to reveal the lead-lag relationship between search data and the stock market from the micro-perspective of investors’ behavior. We then develop three types of composite search indices: an investor action index, a market condition index, and a macroeconomic index. The empirical test indicates a cointegration relationship between the search indices and the annual return rate of the Shanghai Composite Index. In the long-term trend, a one percentage point increase in each of the three types of search indices is associated with an increase of 0.22, 0.56 and 0.83 percentage points, respectively, in the annual return rate in the next month. Furthermore, a Granger causality test shows that the search indices have significant predictive power for the annual return rate of the Shanghai Composite Index.
IEEE International Conference on Emergency Management and Management Sciences | 2011
Dongsheng Liang; Benfu Lv
Building on our 2010 study, which applied the Diffusion Theory paradigm to analyze the network news communication of crisis events, this paper names the idea that, in the field of Diffusion Theory, the cumulative number of online news discussions follows an S-shaped curve over time the “Liang-Lv” Hypothesis. It then tests the “Liang-Lv” Hypothesis on more samples to advance the research. The paper focuses on a single piece of news and follows two study paths: the one-news-more-websites path and the more-news-one-website path.
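An S-shaped cumulative curve is often modeled with a logistic function. The Python sketch below assumes that form and a known ceiling (`capacity`), and estimates the growth rate and midpoint by least squares on logit-transformed counts. The abstract does not specify the curve or the fitting method, so the function names, parameters and data here are hypothetical.

```python
import math


def logistic(t, capacity, rate, midpoint):
    """Cumulative count at time t under a logistic (S-shaped) growth curve."""
    return capacity / (1.0 + math.exp(-rate * (t - midpoint)))


def fit_logistic_given_capacity(times, cumulative, capacity):
    """Estimate (rate, midpoint), assuming the ceiling `capacity` is known.

    Uses ln(y / (capacity - y)) = rate * (t - midpoint), which is linear in t.
    Requires 0 < y < capacity for every observation.
    """
    z = [math.log(y / (capacity - y)) for y in cumulative]
    n = len(times)
    mean_t, mean_z = sum(times) / n, sum(z) / n
    slope = sum((t - mean_t) * (v - mean_z) for t, v in zip(times, z)) / sum(
        (t - mean_t) ** 2 for t in times
    )
    intercept = mean_z - slope * mean_t
    return slope, -intercept / slope


# Synthetic cumulative discussion counts generated from known parameters.
times = list(range(10))
counts = [logistic(t, capacity=1000.0, rate=0.8, midpoint=4.5) for t in times]
rate, midpoint = fit_logistic_given_capacity(times, counts, capacity=1000.0)
```

With real data the ceiling is usually unknown, in which case a non-linear least-squares fit over all three parameters would be the more natural choice.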