Wanying Ding
Drexel University
Publication
Featured research published by Wanying Ding.
Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) on | 2013
Wanying Ding; Xiaoli Song; Lifan Guo; Zunyan Xiong; Xiaohua Hu
Sentiment analysis studies public opinion toward an entity and is an important research area in data mining. Many sentiment analysis models have recently been proposed, including supervised and unsupervised approaches. However, the role of supervised models has been undermined by the phenomenon of big data, and unsupervised models are drawing increasing attention. Most current unsupervised methods are based on Latent Dirichlet Allocation (LDA) and require the number of aspects to be specified in advance, which makes them subjective. In addition, these methods treat factual words and opinion words alike and assume that a sentence contains only one aspect; these limitations make the existing unsupervised methods unsatisfactory. To solve these problems, this paper proposes a novel hybrid Hierarchical Dirichlet Process-Latent Dirichlet Allocation (HDP-LDA) model. The model automatically determines the number of aspects, distinguishes factual words from opinion words, and effectively extracts aspect-specific sentiment words. Experimental results show that the model clearly captures the aspects people mention and the specific sentiment words they use for each aspect, improving the performance of sentiment analysis. Finally, we compared our model with influential topic models, namely JST, ASUM, and MaxEnt-LDA, on online restaurant reviews and found that it performs very well.
International ACM SIGIR Conference on Research and Development in Information Retrieval | 2017
Jianliang Gao; Bo Song; Zheng Chen; Weimao Ke; Wanying Ding; Xiaohua Hu
In this paper, we propose a novel k-anonymization scheme to counter deanonymization queries on social networks. With this scheme, all entities are protected by k-anonymization, which means that attackers cannot re-identify a target with confidence higher than 1/k. The proposed scheme minimizes the modification of the original network and accordingly maximizes the utility preserved in the published data while achieving k-anonymization privacy protection. Extensive experiments on real data sets demonstrate the effectiveness of the proposed scheme; the efficacy of the k-anonymized networks is verified using the distributions of PageRank and betweenness and their Kolmogorov-Smirnov (K-S) tests.
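The 1/k bound and the K-S comparison mentioned in this abstract can be illustrated with a small sketch. It is not the paper's scheme: it assumes, purely for illustration, that the attacker's only knowledge is a node's degree, and the function names and toy degree lists are hypothetical.

```python
from collections import Counter


def max_reidentification_confidence(degrees):
    """Worst-case confidence of singling out a node by its degree.

    Assumption (not from the abstract): the attacker knows only the
    target's degree, so a node hides among all nodes sharing that degree.
    """
    group_sizes = Counter(degrees)
    return 1.0 / min(group_sizes.values())


def is_k_anonymous(degrees, k):
    """True if no node can be re-identified with confidence above 1/k."""
    return max_reidentification_confidence(degrees) <= 1.0 / k


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: the largest gap between the empirical CDFs."""
    def ecdf(sample, x):
        return sum(1 for v in sample if v <= x) / len(sample)

    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(ecdf(sample_a, x) - ecdf(sample_b, x)) for x in points)


# Hypothetical degree sequences before and after anonymization.
original = [1, 2, 2, 3, 3, 3]
published = [2, 2, 2, 3, 3, 3]

print(is_k_anonymous(original, 2))        # the degree-1 node is unique
print(is_k_anonymous(published, 3))       # every degree is shared by 3 nodes
print(ks_statistic(original, published))  # small value: distribution barely moved
```

A small K-S statistic between the original and published distributions is one way to read "utility preservation": the anonymized network still looks statistically similar to the original on the measured property.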
Information Retrieval | 2017
Mengwen Liu; Wanying Ding; Dae Hoon Park; Yi Fang; Rui Yan; Xiaohua Hu
A number of online marketplaces enable customers to buy or sell used products, which raises the need for ranking tools that help them find desirable items among a huge pool of choices. To the best of our knowledge, no prior work has investigated the task of used product ranking, which has unique characteristics compared with regular product ranking. While a few ranking metrics exist (e.g., price, conversion probability) that measure the “goodness” of a product, they do not consider the time factor, which is crucial in used product trading because each used product is often unique, whereas new products are usually abundant in supply. In this paper, we introduce a novel time-aware metric, “sellability”, defined as the time taken for a used item to be traded, to quantify an item's value. To estimate the “sellability” values of newly listed used products and to present users with a ranked list of the most relevant results, we propose a combined Poisson regression and listwise ranking model. The model fits the distribution of “sellability” well. In addition, it is designed to optimize the loss functions for regression and ranking simultaneously, unlike previous approaches, which are conventionally learned with a single cost function, i.e., either regression or ranking. We evaluate our approach in the domain of used vehicles. Experimental results show that the proposed model improves both regression and ranking performance compared with non-machine-learning and machine-learning baselines.
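The idea of optimizing a regression loss and a ranking loss together can be sketched as follows. The abstract does not give the paper's exact objective, so this is an assumption-laden illustration: it pairs a Poisson negative log-likelihood with a ListNet-style top-one cross-entropy, and the mixing weight `alpha` and all function names are hypothetical.

```python
import math


def poisson_nll(log_rates, durations):
    """Mean Poisson negative log-likelihood, with rate = exp(log_rate)."""
    total = 0.0
    for s, y in zip(log_rates, durations):
        total += math.exp(s) - y * s + math.lgamma(y + 1)
    return total / len(durations)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def listwise_loss(pred_scores, true_scores):
    """ListNet-style top-one cross-entropy between two score lists."""
    p_true = softmax(true_scores)
    p_pred = softmax(pred_scores)
    return -sum(t * math.log(p) for t, p in zip(p_true, p_pred))


def combined_loss(log_rates, durations, alpha=0.5):
    """Weighted sum of the regression loss and the ranking loss.

    Items that sell faster should rank higher, so durations are negated
    to turn them into ranking scores (an assumption of this sketch).
    """
    regression = poisson_nll(log_rates, durations)
    predicted = [math.exp(s) for s in log_rates]
    ranking = listwise_loss([-d for d in predicted], [-y for y in durations])
    return alpha * regression + (1.0 - alpha) * ranking


# Hypothetical days-to-sell for three listings.
durations = [2, 5, 9]
good = [math.log(y) for y in durations]            # predicts each duration exactly
bad = [math.log(y) for y in reversed(durations)]   # predicts the reverse order

print(combined_loss(good, durations) < combined_loss(bad, durations))
```

In a trained model the log-rates would come from a parameterized function of item features, and both terms would be minimized jointly by gradient descent; here they are fixed lists so that the two loss terms can be inspected directly.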
2015 International Conference on Computing, Networking and Communications (ICNC) | 2015
Yue Shang; Wanying Ding; Mengwen Liu; Xiaoli Song; Tony Xiaohua Hu; Yuan An; Haohong Wang; Lifan Guo
Search engines have become an indispensable part of modern life, and they generate hundreds and thousands of search logs every second throughout the world. With the explosive growth of online information, a key issue for web search services is to better understand users' needs from short search queries and to match users' preferences as closely as possible. However, because personal information is lacking in some scenarios and finding a relevant user group is computationally expensive, personalized search is a challenging problem. In this work, we propose a novel scalable framework based on a multimodal Restricted Boltzmann Machine (RBM) for user intent mining and prediction. The framework works in an unsupervised manner and adapts to various situations regardless of the amount of individual information available. In other words, it can handle scenarios with no or only limited personal history; the more individual data is available, the more accurate the user intent prediction and the better the framework reflects changes in an individual's interests. The framework outputs a binary representation for each query log, which to some extent addresses the data sparsity problem and reduces the computational complexity of finding users with similar interests. Experimental results show that the model learns reasonable user intent categories during training, according to a qualitative analysis of the top-ranked contexts and websites for each class. It also achieves competitive performance when no individual data is provided. Moreover, when more individual data is provided (10 history queries), overall precision improves by up to 10%.
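Why a binary representation makes it cheaper to find users with similar interests can be shown with a short sketch. The abstract does not say how codes are compared, so this assumes Hamming distance over bit-packed integers; the user IDs, code values, and function names are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def nearest_users(query_code, user_codes, top_n=2):
    """Return the user IDs whose codes are closest to the query code.

    Ties are broken by user ID so that the result is deterministic.
    """
    ranked = sorted(
        user_codes.items(),
        key=lambda item: (hamming(query_code, item[1]), item[0]),
    )
    return [user for user, _ in ranked[:top_n]]


# Hypothetical 6-bit codes, standing in for the binary hidden-layer output.
codes = {"u1": 0b101100, "u2": 0b101101, "u3": 0b010011}

print(nearest_users(0b101100, codes))
```

Each comparison is a single XOR plus a bit count, independent of how sparse the original query features were, which is the kind of saving the abstract alludes to.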
Conference on Information and Knowledge Management | 2015
Wanying Ding; Yue Shang; Lifan Guo; Xiaohua Hu; Rui Yan; Tingting He
International Conference on Big Data | 2016
Wanying Ding; Yue Zhang; Chaomei Chen; Xiaohua Hu
Archive | 2016
Wanying Ding; Lifan Guo; Yue Shang; Haohong Wang
Archive | 2016
Wanying Ding; Yue Shang; Lifan Guo; Dae Hoon Park; Haohong Wang
Knowledge Discovery and Data Mining | 2015
Wanying Ding; Yue Shang; Dae Hoon Park; Lifan Guo; Xiaohua Hu
Archive | 2014
Dae Hoon Park; Lifan Guo; Wanying Ding; Haohong Wang