Shuotian Bai
Chinese Academy of Sciences
Publication
Featured research published by Shuotian Bai.
Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) | 2013
Shuotian Bai; Bibo Hao; Ang Li; Sha Yuan; Rui Gao; Tingshao Zhu
Personality can be defined as a set of characteristics that makes a person unique. The study of personality is of central importance in psychology. Conventional personality assessment relies on self-report inventories, which require considerable manual effort and cannot be carried out in real time. To address these problems, this research aims to objectively measure the Big-Five personality traits from Sina Microblog usage. Based on a user study of 444 users, this paper proposes multi-task regression and incremental regression algorithms to predict the Big-Five personality traits from online behaviors. The results indicate that personality can be predicted with high accuracy from online Microblog usage.
PLOS ONE | 2016
Nan Zhao; Dongdong Jiao; Shuotian Bai; Tingshao Zhu
The growing need for automated analysis of web texts, especially short texts on Social Network Services (SNS), places new demands on computerized text analysis instruments. The extensive use of such instruments, for example the Linguistic Inquiry and Word Count (LIWC), rests on their psychometric properties. In this study, Sina Weibo statuses were analyzed by rater coding and by the Simplified Chinese version of LIWC (SCLIWC) to evaluate the validity of SCLIWC in detecting psychological expressions in Weibo statuses (n = 60) and in identifying the psychological meaning of a single Weibo status (n = 11). Significant correlations between human ratings and SCLIWC scores, together with high sensitivity in capturing single statuses containing expressions identified by raters, supported the validity of SCLIWC in detecting psychological expressions. The results also suggested that SCLIWC detects psychological expressions in SNS short texts more efficiently with a status count scoring method than with the word count method commonly used with LIWC. However, SCLIWC may not perform well in identifying the psychological meaning of a single SNS short text because it over-identifies target expressions. This study provides initial evidence for the validity of SCLIWC and indicates how to use it efficiently on SNS short texts.
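The difference between the two scoring methods mentioned in this abstract can be illustrated with a minimal sketch. It is not the SCLIWC implementation: the `LEXICON` entries, the category names, and both function names are hypothetical, and the statuses are assumed to be already segmented into tokens (real Weibo text would first need Chinese word segmentation).

```python
from typing import Dict, List, Set

# Hypothetical toy lexicon; the real SCLIWC dictionary is not reproduced here.
LEXICON: Dict[str, Set[str]] = {
    "posemo": {"happy", "love", "great"},
    "negemo": {"sad", "angry"},
}


def word_count_score(statuses: List[List[str]], category: str) -> float:
    """Share of all tokens, pooled over the statuses, that belong to the category."""
    words = LEXICON[category]
    tokens = [tok for status in statuses for tok in status]
    if not tokens:
        return 0.0
    return sum(tok in words for tok in tokens) / len(tokens)


def status_count_score(statuses: List[List[str]], category: str) -> float:
    """Share of statuses that contain at least one token from the category."""
    words = LEXICON[category]
    if not statuses:
        return 0.0
    hits = sum(any(tok in words for tok in status) for status in statuses)
    return hits / len(statuses)
```

Under these assumptions, a user who repeats one emotional word many times in a single status raises the word count score but not the status count score, which counts each status at most once.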
Web Intelligence and Agent Systems: An International Journal | 2014
Shuotian Bai; Sha Yuan; Bibo Hao; Tingshao Zhu
Personality can be defined as a set of characteristics that makes a person unique. Psychological theory suggests that people’s behavior reflects their personality, so it is feasible to predict personality from behavior. Conventional personality assessment relies on self-report inventories: participants must fill in a tedious inventory to obtain their personality scores. In large-scale investigations, every returned inventory has to be scored manually, which requires considerable manual effort and cannot be done in real time. To avoid these shortcomings, this research aims to objectively predict the Big-Five personality traits from the usage records of Sina Microblog. Since its initial launch in December, 2005, Sina Microblog has been the leading microblogging service provider in China. Millions of users upload and download resources via microblogging statuses every day. Based on an online survey of 444 active users, this paper analyzes the relationships between personality and online behavior. Furthermore, it proposes multi-task regression and incremental regression to predict the Big-Five personality traits from online behaviors. The results indicate significant correlations between different personality dimensions. In addition, the training data set is sufficiently reliable, and multi-task regression performs better than the other modeling algorithms.
Chinese Science Bulletin | 2015
Li Ang; Bibo Hao; Shuotian Bai; Tingshao Zhu
To improve social harmony and stability, it is essential to acquire public psychological profiles in real time. However, traditional methods of psychological assessment fail to meet this requirement. This paper proposes a novel method for predicting psychological features from web behavioral data. Using a microblogging platform, we built predictive models for identifying mental health status and subjective well-being. The correlation between the predicted and actual values of depression can reach 0.41, and the highest correlation for subjective well-being is 0.6. The results indicate an effective overall performance of the established predictive models. This study demonstrates that, based on web data analysis, it is possible to efficiently predict psychological features and to update the predicted outcomes in real time.
International Conference on Human Centered Computing | 2014
Xiaoqian Liu; Dong Nie; Shuotian Bai; Bibo Hao; Tingshao Zhu
Personality research on social media has recently become a hot topic owing to the rapid development of social media as well as the central importance of personality in psychology, but adequate and appropriately labeled samples are hard to acquire. Our research aims to choose the right users to label in order to improve prediction accuracy. Given a set of Microblog users’ public information (e.g., number of followers) and a few labeled users, the task is to predict the personality of the remaining unlabeled users. An active learning regression algorithm is employed to build the prediction model, and the experimental results demonstrate that our method predicts the personality of Microblog users fairly well.
Active Media Technology | 2014
Zengda Guan; Dong Nie; Bibo Hao; Shuotian Bai; Tingshao Zhu
Some research has sought to predict users’ personality from their web behaviors, usually by applying supervised learning methods that fit a model on a training dataset and predict on a test dataset. However, when the training dataset has a different distribution from the test dataset, violating the independent and identically distributed (i.i.d.) assumption, traditional supervised learning models may not perform well on the test dataset. We therefore introduce a new regression transfer learning framework to deal with this problem and propose two local regression instance-transfer methods. We use clustering and k-nearest-neighbor to reweight the importance of each training instance so that it adapts to the test dataset distribution, and then train a weighted risk regression model for prediction. We perform experiments in which the user datasets come from different genders and from different districts. The results indicate that our methods reduce mean square error by up to about 30% compared with non-transfer methods and outperform other transfer methods overall.
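The general idea of reweighting training instances toward the test distribution and then fitting a weighted regression can be sketched as follows. This is an illustrative simplification, not the authors' algorithm: it uses a single scalar feature, skips the clustering step, and assumes one particular k-nearest-neighbor heuristic (the share of test points among a training point's nearest neighbors in the pooled sample). The function names `knn_weights` and `weighted_linear_fit` are hypothetical.

```python
from typing import List, Tuple


def knn_weights(train_x: List[float], test_x: List[float], k: int = 3) -> List[float]:
    """Weight each training point by the share of test points among its
    k nearest neighbours in the pooled train+test sample (assumed heuristic)."""
    pooled = [(x, False) for x in train_x] + [(x, True) for x in test_x]
    weights = []
    for i, xi in enumerate(train_x):
        # Distances to every other pooled point; index i is the point itself.
        others = [(abs(xi - x), is_test)
                  for j, (x, is_test) in enumerate(pooled) if j != i]
        others.sort(key=lambda pair: pair[0])
        nearest = others[:k]
        if not nearest:
            weights.append(0.0)
            continue
        weights.append(sum(is_test for _, is_test in nearest) / len(nearest))
    return weights


def weighted_linear_fit(xs: List[float], ys: List[float],
                        ws: List[float]) -> Tuple[float, float]:
    """Closed-form weighted least squares for y = a + b * x; returns (a, b)."""
    total = sum(ws)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    mean_x = sum(w * x for w, x in zip(ws, xs)) / total
    mean_y = sum(w * y for w, y in zip(ws, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
    if sxx == 0:
        raise ValueError("weighted variance of x is zero")
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope
```

With these assumptions, training points that lie in regions densely populated by test points receive larger weights, so the fitted line is driven mainly by the part of the training data that resembles the test data.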
Advanced Information Networking and Applications | 2017
Sha Yuan; Zhe Tao; Tingshao Zhu; Shuotian Bai
With the continuous growth of micro-blog services, Sina Weibo is increasingly present in the daily lives of ordinary Chinese individuals. More than one hundred million tweets are posted on Sina Weibo every day. By analyzing this mass of data in a timely manner, media companies could learn how to generate buzz for new films, famous stars, or fashion shows more effectively. However, how to predict in real time which topics will become the most popular search terms on Sina Weibo remains an open problem. In this paper, we present a real-time hot topic prediction method on an online platform. Experiments are carried out on the platform to evaluate the proposed scheme. The results show that our model achieves an average precision of 44.32%, with a median of 45.83%. The proposed method can predict hot topics about 9.5 hours in advance on average.
International Conference on Pervasive Computing | 2013
Zengda Guan; Shuotian Bai; Tingshao Zhu
When a task in a certain domain lacks sufficient labels and good features, traditional supervised learning methods usually perform poorly. Transfer learning addresses this problem by transferring data and knowledge from a related domain to improve learning performance on the target task. Sometimes the related task and the target task share the same labels but have different data distributions and heterogeneous features. In this paper, we propose a general heterogeneous transfer learning framework that combines linear kernel and graph regulation. The linear kernel is used to project the original data of both domains into a Reproducing Kernel Hilbert Space, in which both tasks have the same feature dimensions and closely matched data distributions. Graph regulation is designed to preserve the geometric structure of the data. We present the algorithms in both unsupervised and supervised forms. Experiments on a synthetic dataset and a real dataset of user web behavior and personality demonstrate the effectiveness of our method.
Conference on Recommender Systems | 2013
Rui Gao; Bibo Hao; Shuotian Bai; Lin Li; Ang Li; Tingshao Zhu
Proceedings of the 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) | 2014
Dong Nie; Zengda Guan; Bibo Hao; Shuotian Bai; Tingshao Zhu