Bibo Hao
Chinese Academy of Sciences
Publications
Featured research published by Bibo Hao.
BHI 2013 Proceedings of the International Conference on Brain and Health Informatics - Volume 8211 | 2013
Rui Gao; Bibo Hao; He Li; Yusong Gao; Tingshao Zhu
The words that people use can reveal their emotional states, intentions, thinking styles, individual differences, and more. LIWC (Linguistic Inquiry and Word Count) has been widely used for psychological text analysis, and its dictionary is its core component. A Traditional Chinese version of the LIWC dictionary has been released, which is a translation of the English LIWC dictionary. However, Simplified Chinese, which is the world's most widely used language, differs subtly from Traditional Chinese. Furthermore, both the English LIWC dictionary and the Traditional Chinese dictionary were developed for relatively formal text. Microblogging has become increasingly popular in China, but the original LIWC dictionaries give little consideration to words popular on microblogs, which makes them less applicable to microblog text analysis. In this study, a Simplified Chinese LIWC dictionary is established according to the LIWC categories. After the Traditional Chinese dictionary was translated into Simplified Chinese, the five thousand words most frequently used in microblogs were added to the dictionary. Four graduate students of psychology rated whether each word belonged in a category. The reliability and validity of the Simplified Chinese LIWC dictionary were tested by these four judges. This new dictionary could contribute to future text analysis of microblogs.
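To illustrate how a LIWC-style dictionary is typically applied, the sketch below computes the share of tokens that fall into each category. The toy dictionary, its category names, and the function name `category_proportions` are hypothetical and are not taken from the Simplified Chinese LIWC dictionary itself. Chinese text also needs word segmentation first, which is assumed to have been done already.

```python
from collections import Counter

# Hypothetical toy dictionary (word -> categories); not the real LIWC data.
TOY_DICTIONARY = {
    "开心": ["posemo", "affect"],
    "难过": ["negemo", "affect"],
    "我": ["pronoun"],
}


def category_proportions(tokens, dictionary):
    """Return, for each category, the share of tokens that belong to it."""
    counts = Counter()
    for token in tokens:
        # A word may belong to several categories; unknown words count for none.
        for category in dictionary.get(token, []):
            counts[category] += 1
    total = len(tokens)
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}


# An already segmented post (assumed input format).
tokens = ["我", "今天", "很", "开心"]
proportions = category_proportions(tokens, TOY_DICTIONARY)
```

Dividing by the total token count keeps posts of different lengths comparable, which is the usual reason for reporting proportions rather than raw counts.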
Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) on | 2013
Shuotian Bai; Bibo Hao; Ang Li; Sha Yuan; Rui Gao; Tingshao Zhu
Personality can be defined as a set of characteristics that makes a person unique. The study of personality is of central importance in psychology. Conventional personality assessment is performed by self-report inventory, which requires considerable manual effort and cannot be done in real time. To solve these problems, this research aims to objectively measure the Big-Five personality from Sina Microblog usage. Based on a user study with 444 users, this paper proposes multi-task regression and incremental regression algorithms to predict the Big-Five personality from online behaviors. The results indicate that personality can be predicted with high accuracy from online microblog usage.
JMIR mental health | 2015
Li Guan; Bibo Hao; Qijin Cheng; Paul S. F. Yip; Tingshao Zhu
Background: Traditional offline assessment of suicide probability is time-consuming, and it is difficult to convince at-risk individuals to participate. Identifying individuals with high suicide probability through online social media has advantages in efficiency and in its potential to reach hidden individuals, yet little research has focused on this specific field. Objective: The objective of this study was to apply two classification models, Simple Logistic Regression (SLR) and Random Forest (RF), to examine the feasibility and effectiveness of identifying microblog users in China with high suicide probability through profile and linguistic features extracted from Internet-based data. Methods: A total of 909 Chinese microblog users completed an Internet survey. Those scoring one SD above the sample mean on the total Suicide Probability Scale (SPS) score, and those scoring one SD above the mean on each of the four subscale scores, were labeled as high-risk individuals for overall suicide probability and for the corresponding dimension, respectively. Profile and linguistic features were fed into two machine learning algorithms (SLR and RF) to train models that aim to identify high-risk individuals in general suicide probability and in its four dimensions. Models were trained and then tested by 5-fold cross validation, in which both the training set and the test set were generated from the whole sample by stratified random sampling. Three classic performance metrics (Precision, Recall, F1 measure) and a specifically defined metric, “Screening Efficiency”, were adopted to evaluate model effectiveness. Results: Classification performance was generally comparable between SLR and RF. With the best-performing classification models, we were able to retrieve over 70% of the labeled high-risk individuals in overall suicide probability as well as in the four dimensions. Screening Efficiency of most models varied from 1/4 to 1/2. Precision of the models was generally below 30%.
Conclusions: Individuals in China with high suicide probability are recognizable by profile and text-based information from microblogs. Although there is still much room to improve the performance of the classification models, this study may shed light on preliminary screening of at-risk individuals via machine learning algorithms, which can work side by side with expert scrutiny to increase efficiency in large-scale surveillance of suicide probability from online social media.
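The labeling rule, the stratified 5-fold split, and the three classic metrics described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names and example scores are hypothetical, "one SD above the mean" is read as strictly greater than mean plus one population SD, and the paper's own "Screening Efficiency" metric is left out because its definition is not given here.

```python
import random
from statistics import mean, pstdev


def label_high_risk(scores):
    """Label a score True when it exceeds the sample mean by more than one SD.

    Assumption: population SD and a strict inequality.
    """
    threshold = mean(scores) + pstdev(scores)
    return [score > threshold for score in scores]


def stratified_folds(labels, k=5, seed=0):
    """Split indices into k folds, spreading each class evenly across folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for value in (True, False):
        indices = [i for i, label in enumerate(labels) if label == value]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin over the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def precision_recall_f1(actual, predicted):
    """Compute precision, recall, and F1 for the positive (high-risk) class."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical questionnaire scores and classifier output.
scores = [10, 12, 11, 13, 30, 12]
actual = label_high_risk(scores)
predicted = [False, True, False, False, True, False]
precision, recall, f1 = precision_recall_f1(actual, predicted)
folds = stratified_folds([True] * 10 + [False] * 40, k=5)
```

In this toy case the single high-risk participant is retrieved (recall of 1.0) at the cost of one false alarm (precision of 0.5), the same trade-off between high recall and low precision that the Results report.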
active media technology | 2014
Bibo Hao; Lin Li; Rui Gao; Ang Li; Tingshao Zhu
Subjective well-being (SWB), which refers to how people experience the quality of their lives, is of great use to public policy-makers as well as to economic and sociological research. Traditionally, the measurement of SWB relies on time-consuming and costly self-report questionnaires. Nowadays, people are motivated to share their experiences and feelings on social media, so we propose to sense SWB from the vast amount of user-generated data on social media. Using 1785 users’ social media data with SWB labels, we train machine learning models that are able to “sense” individual SWB. Our model, which attains state-of-the-art prediction accuracy, can then be applied to identify the SWB of large numbers of social media users in a timely manner and at low cost.
PeerJ | 2015
Ang Li; Xiaoxiao Huang; Bibo Hao; Bridianne O'Dea; Helen Christensen; Tingshao Zhu
Introduction. Broadcasting a suicide attempt on social media has become a public health concern in many countries, particularly in China. In these cases, social media users are likely to be the first to witness the suicide attempt, and their attitudes may determine their likelihood of joining rescue efforts. This paper examines Chinese social media (Weibo) users’ attitudes towards suicide attempts broadcast on Weibo. Methods. A total of 4,969 Weibo posts were selected from a customised Weibo User Pool which consisted of 1.06 million active users. The selected posts were then independently coded by two researchers using a coding framework that assessed: (a) Themes, (b) General attitudes, (c) Stigmatising attitudes, (d) Perceived motivations, and (e) Desired responses. Results and Discussion. More than one third of Weibo posts were coded as “stigmatising” (35%). Among these, 22%, 16%, and 15% of posts were coded as “deceitful,” “pathetic,” and “stupid,” respectively. Among the posts which reflected different types of perceived motivations, 57% of posts were coded as “seeking attention.” Among the posts which reflected desired responses, 37% were “not saving” and 28% were “encouraging suicide.” Furthermore, among the posts with negative desired responses (i.e., “not saving” and “encouraging suicide”), 57% and 17% of them were related to different types of stigmatising attitudes and perceived motivations, respectively. Specifically, 29% and 26% of posts reflecting both stigmatising attitudes and negative desired responses were coded as “deceitful” and “pathetic,” respectively, while 66% of posts reflecting both perceived motivations and negative desired responses were coded as “seeking attention.” Very few posts “promoted literacy” (2%) or “provided resources” (8%). Gender differences existed in multiple categories.
Conclusions. This paper confirms the need for stigma reduction campaigns for Chinese social media users to improve their attitudes towards those who broadcast their suicide attempts on social media. Results of this study support the need for improved public health programs in China and may be insightful for other countries and other social media platforms.
international conference on cross-cultural design | 2013
Bibo Hao; Lin Li; Ang Li; Tingshao Zhu
The rapid development of social media has brought about vast amounts of user-generated content. Computational cyber-psychology, an interdisciplinary subject area, employs machine learning approaches to explore underlying psychological patterns. Our research aims at identifying users’ mental health status through their social media behavior. We collected both users’ social media data and mental health data from the most popular Chinese microblog service provider, Sina Weibo. By extracting linguistic and behavioral features and applying machine learning algorithms, we made a preliminary attempt to automatically identify users’ mental health status, which has previously been measured mainly by well-designed psychological questionnaires. Our classification model achieved an accuracy of 72%, and the continuous prediction model achieved a correlation of 0.3 with the questionnaire-based score.
Web Intelligence and Agent Systems: An International Journal | 2014
Shuotian Bai; Sha Yuan; Bibo Hao; Tingshao Zhu
Personality can be defined as a set of characteristics that makes a person unique. Psychological theory suggests that people’s behavior is a reflection of their personality, so it is feasible to predict personality from behavior. Conventional personality assessment is performed by self-report inventory: participants need to fill in a tedious inventory to get their personality scores. In a large-scale investigation, every returned inventory needs manual computation, which requires considerable manual effort and cannot be done in real time. To avoid these shortcomings, this research aims to objectively predict the Big-Five personality from the usage records of Sina Microblog. Since its initial launch in December 2005, Sina Microblog has been the leading microblogging service provider in China. Millions of users upload and download resources via microblog statuses every day. By conducting an online survey of 444 active users, this paper analyzes the relationships between personality and online behavior. Furthermore, this research proposes multi-task regression and incremental regression to predict the Big-Five personality from online behaviors. The results indicate that the correlations between different personality dimensions are significant. In addition, the training data set is sufficiently reliable, and multi-task regression performs better than the other modeling algorithms.
Chinese Science Bulletin | 2015
Ang Li; Bibo Hao; Shuotian Bai; Tingshao Zhu
To improve social harmony and stability, it is essential to acquire public psychological profiles in real time. However, traditional methods of psychological assessment have failed to meet this requirement. This paper proposes a novel method for predicting psychological features based on web behavioral data. Using a microblogging platform, we built predictive models for identifying mental health status and subjective well-being. The correlation between the predicted and actual values of depression reaches 0.41, and the highest correlation for subjective well-being is 0.6. The results indicate that the established predictive models perform effectively overall. This study demonstrates that, based on web data analysis, it is possible to efficiently predict psychological features and to update the predicted outcomes in real time.
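The correlations reported above compare predicted scores with questionnaire scores. The abstract does not say which coefficient was used; the sketch below assumes the Pearson correlation coefficient, and the function name `pearson_r` and the example values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(var_x * var_y)


# Hypothetical questionnaire scores and model predictions.
questionnaire_scores = [12.0, 18.0, 9.0, 22.0, 15.0]
predicted_scores = [14.0, 16.0, 11.0, 19.0, 15.0]
r = pearson_r(questionnaire_scores, predicted_scores)
```

A value near 1 would mean the predictions rank and scale users much like the questionnaire does, while values such as 0.41 or 0.6 indicate a moderate linear association.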
international conference on human centered computing | 2014
Xiaoqian Liu; Dong Nie; Shuotian Bai; Bibo Hao; Tingshao Zhu
Personality research on social media has recently become a hot topic, owing to the rapid development of social media as well as the central importance of personality in psychology, but it is hard to acquire enough appropriately labeled samples. Our research aims to choose the right users to label in order to improve prediction accuracy. Given a set of microblog users’ public information (e.g., number of followers) and a few labeled users, the task is to predict the personality of the other, unlabeled users. An active learning regression algorithm is employed to build the predictive model in this paper, and the experimental results demonstrate that our method can predict the personality of microblog users fairly well.
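The abstract does not state which query strategy was used to choose users for labeling. As one possible illustration of the selection step in pool-based active learning, the sketch below uses greedy farthest-first sampling in feature space, a simple diversity heuristic swapped in here as an assumption. The function names and the one-dimensional feature values are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_users_to_label(features, labeled, budget):
    """Greedily pick up to `budget` unlabeled users to send for labeling.

    Each step picks the unlabeled user whose nearest already-selected or
    labeled user is farthest away, so new labels cover unexplored regions.
    """
    if not labeled:
        raise ValueError("at least one labeled user is required to start")
    known = list(labeled)
    candidates = [i for i in range(len(features)) if i not in known]
    chosen = []
    for _ in range(min(budget, len(candidates))):
        best = max(
            candidates,
            key=lambda i: min(
                squared_distance(features[i], features[j]) for j in known
            ),
        )
        chosen.append(best)
        known.append(best)
        candidates.remove(best)
    return chosen


# Hypothetical one-feature users (e.g., a scaled follower count).
features = [[0.0], [1.0], [2.0], [10.0], [11.0]]
chosen = select_users_to_label(features, labeled=[0], budget=2)
```

After each round of labeling, a regression model would be retrained on the enlarged labeled set; that training step is omitted here.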
active media technology | 2014
Zengda Guan; Dong Nie; Bibo Hao; Shuotian Bai; Tingshao Zhu
Some research has been done to predict users’ personality based on their web behaviors. These studies usually use supervised learning methods to build a model on a training dataset and predict on a test dataset. However, when the training dataset and the test dataset have different distributions, which violates the independent and identically distributed (i.i.d.) assumption, traditional supervised learning models may perform poorly on the test dataset. We therefore introduce a new regression transfer learning framework to deal with this problem, and propose two local regression instance-transfer methods. We use clustering and k-nearest-neighbor to reweight the importance of each training instance so that it adapts to the test dataset distribution, and then train a weighted risk regression model for prediction. We perform experiments in which the user datasets come from different genders and from different districts, and the results indicate that our methods can reduce mean squared error by up to about 30% compared with non-transfer methods and outperform other transfer methods overall.
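The idea of reweighting training instances toward the test distribution can be sketched as follows. This covers only a k-nearest-neighbor variant with a simple counting rule, which is an assumption and not the paper's exact weighting scheme: each training instance is weighted by how many test instances count it among their k nearest training neighbors, and a weighted least-squares line is then fitted. The function names and data are hypothetical.

```python
def knn_weights(train_x, test_x, k=3):
    """Weight each training point by how often it is a k-nearest neighbor
    of a test point (assumed weighting rule, one-dimensional features)."""
    weights = [0.0] * len(train_x)
    for t in test_x:
        nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - t))
        for i in nearest[:k]:
            weights[i] += 1.0
    return weights


def weighted_linear_fit(x, y, w):
    """Fit y = slope * x + intercept by weighted least squares."""
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:
        raise ValueError("weighted x values have no spread")
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: the training set mixes two regimes (y = x and y = 3x),
# while the test users all fall in the second regime.
train_x = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
train_y = [0.0, 1.0, 2.0, 30.0, 33.0, 36.0]
test_x = [10.5, 11.5]
weights = knn_weights(train_x, test_x, k=2)
slope, intercept = weighted_linear_fit(train_x, train_y, weights)
```

Training points far from every test point receive zero weight, so the fitted line follows only the regime that the test users actually occupy.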