Assessing the Severity of Health States based on Social Media Posts
Shweta Yadav, Joy Prakash Sain, Amit Sheth, Asif Ekbal, Sriparna Saha, Pushpak Bhattacharyya
Shweta Yadav
Department of Computer Science, Wright State University
Dayton, Ohio, [email protected]
Joy Prakash Sain
Department of Computer Science, Wright State University
Dayton, Ohio, [email protected]
Amit Sheth
Artificial Intelligence Institute, University of South Carolina
Columbia, South Carolina, [email protected]
Asif Ekbal ∗, Sriparna Saha † and Pushpak Bhattacharyya ‡
Department of Computer Science & Engineering, Indian Institute of Technology Patna
Bihar, India
Email: ∗ [email protected], † [email protected], ‡ [email protected]

Abstract: The unprecedented growth of Internet users has resulted in an abundance of unstructured information on social media, including health forums, where patients request health-related information or opinions from other users. Previous studies have shown that online peer support has limited effectiveness without expert intervention. Therefore, a system capable of assessing the severity of the health state from patients' social media posts can help health professionals (HPs) in prioritizing users' posts. In this study, we inspect the efficacy of different aspects of Natural Language Understanding (NLU) in identifying the severity of the user's health state in relation to two perspectives (tasks): (a) Medical Condition (i.e., Recover, Exist, Deteriorate, Other) and (b) Medication (i.e., Effective, Ineffective, Serious Adverse Effect, Other) in online health communities. We propose a multi-view learning framework that models both the textual content and the contextual information to assess the severity of the user's health state. Specifically, our model utilizes NLU views such as sentiment, emotions, personality, and use of figurative language to extract the contextual information. The diverse NLU views prove effective on both tasks, as well as on the individual diseases, in assessing a user's health.

Index Terms: Natural Language Understanding, Social Media, Biomedical Natural Language Processing
I. INTRODUCTION
The volume of patient-generated healthcare data is growing immensely. The primary contributors to this enormous amount of data are social networks, forums, and blogs where patients share their medical problems and treatment experiences, including adverse reactions to medical products. According to the Pew Internet & American Life Project [1]–[4], almost 80% of Internet users in the US explore health-related topics in online health forums. Among them, 63% look for information about specific medical problems, and nearly 47% look for medical treatments or procedures.

The manuscript is accepted for publication at the 25th International Conference on Pattern Recognition.

In online health communities (OHCs) and support groups, healthcare professionals (HPs) provide clinical intervention when required, for example, when patients need an interpretation or details of clinical concepts, or a medical consultation. In these settings, patients benefit from the knowledge of both peer patients and HPs simultaneously [5]–[7]. [8] reports that users on health forums expect the involvement of HPs for quality suggestions and virtual observation. Another survey, carried out by Pew Internet Research, shows that more than 80% of patients preferred to consult HPs rather than peer patients for information on prescription drugs, medical diagnoses, and alternative treatment options. However, participating in large-scale discussion forums is time-consuming for HPs. A study [9] of the most active communities on WebMD.com showed that the level of HP participation was extremely low (only . % of the posts were answered). Hence, novel strategies are necessary for prioritizing blog posts based on the severity of one's health state, so that HPs can efficiently select the posts that need their expertise and respond effectively and in time.
Specifically, we explore two important facets of the health state, proposed by [10], that can help in assessing the severity of one's health state from social media posts:
• Status of the medical/health condition (e.g., a patient whose medical condition is deteriorating even after clinical trials tends to be more severe than a patient who has just started experiencing the medical symptoms).
• Consequences of the medication/treatment (e.g., a patient reporting an adverse drug effect will be more severe than a patient whose treatment was ineffective).
In Table I, we provide examples and descriptions of the classes associated with the facets described above.

To evaluate our study, we use the benchmark dataset made available through the LRE map [10]. The dataset draws on blog posts from two popular medical discussion forums, namely 'patient.info' and 'dailystrength.org', over four groups, namely asthma, allergy, depression and anxiety.

Footnote: https://pewrsr.ch/2wEL6JU

TABLE I. Examples and description of classes for Task 1: Medical Condition and Task 2: Medication. The 'Exist' class belongs to those users who are reporting their medical problem (symptoms) but are not under any form of treatment/medication. The 'Deteriorate' class consists of those users whose medical problem has worsened even after taking the treatment/medication.

Task 1: Medical Condition
Class label | Health blog-post
Recover | “my high resolution CT scan came back normal...I’m doing better after a long, breathless blue journey”
Exist | “Been having eye problems ...lots of swelling redness and eye discharge.”
Deteriorate | “It’s been just over three months and I’m actually feeling worse! my IgG levels are rising from 766 to 1423, but I don’t feel good...”
Other | “Not everyone uses or likes Facebook. Let’s remember there are so many who are looking for information and support on this cvid forum.–Our voice can help Other”

Task 2: Medication
Class label | Health blog-post
Effective | “I think the plaquenil is helping- been on it for almost 3 months”
Ineffective | “I have been on Hizentra for almost a year...I don’t seem to be getting sick as much or as bad (yay) but things areen’t normal, that’s for sure”
Serious Adverse Effect | “I was given propranolo for migraine associated vertigo. I’ve only taken 5 mgs fir last 2 nights and I’ve had bad nausea since starting it, I want to know if I can just stop taking it now without any problems ”
Other | “Hello ladies, I am curious if anyone has any knowledge or experience with kidney symptoms resulting from using either Gammagard or other IG therapy brands”

Prior studies [10], [11] have treated this task as a traditional document classification task and proposed neural network frameworks that capture content-level information from the user's blog post. However, social media texts often carry slang terminology, grammatical errors, and figurative language, and hold information that is highly contextual. In such situations, mining only linguistic information turns out to be generally inefficient and calls for extra information or clues. The study conducted by [12] also demonstrates this requirement by showing how traditional classifiers fail in instances where humans need additional context. They further illustrate the importance of speaker and topical information associated with the text for incorporating such context.

Advances in NLU technology are one of the most promising avenues for discovering vital contextual information from such data. Motivated by that, in this study we hypothesize that NLU views such as emotions, sentiments, personality and usage of figurative language can help in understanding the vital contextual clues and discourse of the blog post required for detecting the health states. In the remainder of this section, we provide the detailed motivation behind utilizing the various NLU views in the multi-view learning framework.

The emotions that patients express towards their personal situation could be an important indicator of their health. Understanding the emotion can capture a user's mental and physical health and can be applied to microblog posts [13] for behavioral decision making.
Similarly, in mental health diagnosis, certain personality traits correlate with the diagnosis. Personality can be defined as the characteristic patterns of an individual's thinking, behaving, and emotional feeling. Hence, automatic identification of an individual's personality type can have a wide range of applications in personalized health diagnosis [14] and in discovering the user's behavior.

We also explore sentiments, which focus on extracting opinions and affects from the textual content [15]. It seems natural that incorporating this knowledge can help discover emotional statements in online health text classification. Previous research has provided evidence to suggest that people's mental and physical health can be predicted by analyzing their sentiment [10], [16] and the words they use [17], [18].

Similarly, in sentiment analysis, the presence of figurative language (FL) such as sarcasm in a text can work as an unexpected polarity reverser, which may undermine the accuracy of the system if not addressed adequately [19].

In this work, we propose a multi-view learning framework [20] that jointly models both the content and the contextual information specific to the tasks. It begins by processing contextual information using several NLU views such as emotions, personality, sentiments, and use of figurative language (FL). Following the contextual modeling phase, we perform content modeling using Bidirectional Encoder Representations from Transformers (BERT) to extract the task-agnostic view. The task-agnostic view is then fused with the various NLU-derived views to obtain the final representation used for predicting the health of patients.
Our contributions include:
1) A multi-view learning framework that allows the integration of different semantics captured from social media texts to support several aspects of natural language understanding (NLU) required by the health severity assessment task.
2) A comprehensive evaluation of the framework by testing it on a publicly available dataset and comparing its performance against state-of-the-art baselines.
3) A demonstration of the effectiveness of various views such as emotion, sarcasm, personality, and sentiment on the diseases/disorders, which provide complementary information for assessing the severity of health states.

II. RELATED WORK
The literature shows increased attention on OHCs for computationally discovering patient health [21]. A majority of these studies follow a qualitative approach based on the manual categorization of posts by domain experts. The categorization includes: (i) the type of support [22], [23], (ii) the type of emotion and sentiment expressed [24], and (iii) discussion of other illness-specific topics or adverse drug effects [2], [25]. Below, we describe some of the prior research that utilized patients' social media text.

Several techniques have been devised to automatically process OHC content and uncover user behaviours [9] and characteristics [26]. [27] formulated OHCs from a social support viewpoint and defined three variables: type of support, source of support, and setting in which the support is exchanged in online cancer communities. [28] developed a new content analysis method to (a) recognize various healthcare participants, (b) discover current trends, and (c) analyze the sentiments expressed by different healthcare participants in lung cancer, diabetes, and breast cancer forums. [29] proposed a text mining technique to classify users' participation based on the different types of social support, such as informational support, emotional support, and companionship. Further, they developed a supervised machine learning approach to predict whether a user will churn from an OHC. The challenge defined in [30] aimed to automatically classify user posts from an online mental health forum into four categories (crisis/red/amber/green) according to the need for urgent attention. [5] developed a text classification technique for assisting the moderators in OHCs. Specifically, they devised a classification scheme to automatically categorize blog posts over codes such as 'Asking for medical information', 'Asking for peer-patients' experience', 'General chatting', and 'Miscellaneous', using WebMD's online diabetes community data.
Other prominent research in this area includes the work of [31], [32]. Recently, [10] introduced a novel annotation scheme for analyzing medical sentiment in social media text that can capture the severity of the user's health states. They utilized a CNN to understand the possible sentiment. [11] further extended the study with a multitask learning framework that captures multiple facets of medical sentiment simultaneously.

III. MATERIALS AND METHODS
First, we define the task and detail the approach used in the study. Later, we describe the dataset and provide the training details, followed by the baseline models.
A. Task Definition:
For a given medical forum post $M$ consisting of $n$ sentences, i.e., $M = \{s_1, s_2, s_3, \ldots, s_n\}$, the task is to predict the two aspects of health status '$Y$' and '$Z$' from a discrete set of medical conditions '$Y$ = {Recover, Exist, Deteriorate, Other}' and medications '$Z$ = {Effective, Ineffective, Serious Adverse Effect, Other}'.

B. Summary of the Approach
Given the user's forum post $M$ to be classified, the proposed system leverages both task-agnostic and context-specific views of the text. To capture the task-agnostic view, we model $M$ using BERT to obtain a vector representation of the medical blog post. BERT generates abstract representations of words by capturing bi-directional context in the input sentence [33]. Basically, BERT aims to learn deep bidirectional representations from the Transformer stack [34]. For modeling the contextual views, the proposed system utilizes multiple NLU-based features/views (emotion, sarcasm, personality, and sentiment) extracted from the user's forum post. For extracting the emotion, sarcasm and personality views we adopt the domain adaptation approach [35], where we train a model in one domain (i.e., publicly available gold data) and extract the features in another domain (i.e., our dataset). Finally, we fuse the NLU views with the BERT-generated task-agnostic view, and the fused representation is used to categorize the medical forum post. Figure 1 shows the architecture of our approach.

C. Task-agnostic View
The task-agnostic view is generated using the BERT network. We employ the pre-trained BERT model having Transformer layers ($L$), each having heads for self-attention and hidden dimension of , to extract the feature representation of the medical forum post. The pre-trained model has shown state-of-the-art performance in various natural language processing tasks [36], [37]. The pre-trained BERT model is highly efficient in generating the task-agnostic input representation from the Transformer architecture [38]. This enables even low-resource tasks to benefit from deep bi-directional architectures [39] and from the unsupervised training framework used to obtain the pre-trained network. We perform extensive experiments to obtain an effective representation of the medical forum post. These experiments show that aggregating the last three layers of the BERT model achieves the best result. Given a forum post $M$, we tokenize it with the WordPiece tokenizer [40] into $n$ tokens $\{t_1, t_2, t_3, \ldots, t_n\}$. We use the special [CLS] token representation as the task-specific feature for the medical forum post $M$. We represent this task-agnostic view as the content feature $\hat{d}_{cont}$.

D. Contextual (NLU) View
In this section, we discuss in detail all the generated NLU views (emotion, sarcasm, personality, and sentiment).

Emotion View: Humans are emotional beings; emotion plays an intrinsic role in human life. It influences our decision making [41], shapes our behavior [42], and affects mental and physical health [43].

Towards this, we study the user's forum post with respect to five primary emotions, namely 'anger', 'disgust', 'joy', 'fear', and 'sadness', that can assist the model in capturing the overall health condition of an individual. We also examine the post for more fine-grained emotions (i.e., 'valence', 'arousal', and 'dominance') that reveal the user's state of feeling [44].

Footnote: shorturl.at/nDJPY

Fig. 1. Proposed architecture for predicting the health severity assessment task.

To extract the emotion views from a forum post, we leverage a system similar to the CNN network [45]. We utilize the benchmark EmoBank-2017 dataset [46] and EmoInt-2017 dataset [47] to train the model for fine- and coarse-grained emotion analysis, respectively. After training the model, we extract the feature representation for each forum post.

Given a forum post $M$ consisting of $n$ words, after extracting the pooled representation, we pass it to a hidden layer to generate the view representation as follows:
$$F_e = \varphi(W_e \hat{d}_e + b_e) \tag{1}$$
where $\varphi$ is a non-linear activation function, and $W_e$ and $b_e$ are the weight and bias of the last hidden layer. The pooled feature for a given window size $k$ is generated as follows:
$$d_e = pool(conv(emb(M), k; \theta)) \tag{2}$$
where $emb(\cdot)$, $conv(\cdot)$ and $pool(\cdot)$ are the embedding, convolution and pooling operations of a standard CNN model, and $\theta$ denotes the model parameters. The final extracted feature $\hat{d}_e$ is obtained by concatenating the features of the different window sizes. The fine- and coarse-grained emotion views $F_{fe}$ and $F_{ce}$ are extracted using the aforementioned equations.
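To make Eqs. (1) and (2) concrete, the following is a minimal NumPy sketch of the forward pass only: convolution over word windows, max-over-time pooling, concatenation across window sizes, and a final hidden layer. The embedding table, filter banks, window sizes (2, 3, 4), layer widths, and the choice of ReLU and tanh are all illustrative assumptions with random parameters; in the paper the parameters come from a CNN trained on the emotion corpora.

```python
import numpy as np


def conv_pool(emb, filters):
    """Eq. (2): convolve filters of window size k over the post, then max-pool.

    emb:     (n_words, emb_dim) embedded post, i.e. emb(M)
    filters: (n_filters, k, emb_dim) convolution filters (part of theta)
    """
    n_words = emb.shape[0]
    k = filters.shape[1]
    # Every window of k consecutive words: (n_positions, k, emb_dim).
    windows = np.stack([emb[i:i + k] for i in range(n_words - k + 1)])
    # Response of every filter at every position: (n_positions, n_filters).
    responses = np.tensordot(windows, filters, axes=([1, 2], [1, 2]))
    # ReLU followed by max-over-time pooling (assumed activation/pooling).
    return np.maximum(responses, 0.0).max(axis=0)


def emotion_view(token_ids, emb_table, filter_banks, W_e, b_e):
    """Eq. (1): F_e = phi(W_e d_hat_e + b_e), with phi = tanh (assumed)."""
    emb = emb_table[token_ids]
    # d_hat_e: concatenation of the pooled features of each window size.
    d_hat = np.concatenate([conv_pool(emb, f) for f in filter_banks])
    return np.tanh(W_e @ d_hat + b_e)


# Hypothetical toy setup: 50-word vocabulary, 10-dim embeddings, 8 filters per window size.
rng = np.random.default_rng(0)
emb_table = rng.normal(size=(50, 10))
filter_banks = [rng.normal(size=(8, k, 10)) for k in (2, 3, 4)]
W_e = rng.normal(size=(16, 24))  # 24 = 3 window sizes x 8 filters
b_e = np.zeros(16)

post = np.array([3, 17, 8, 42, 5, 11])  # token ids of a six-word post
F_e = emotion_view(post, emb_table, filter_banks, W_e, b_e)
```

Here `F_e` is a 16-dimensional vector standing in for one emotion view; the same routine with a separately trained model would give the second (fine- or coarse-grained) view.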
Sarcasm View: Humans seamlessly perform high-level semantic tasks by subconsciously utilizing a vast collection of composite linguistic units along with their background knowledge to visualize the reality. Social media texts often contain FL, whose presence makes them challenging to process for any NLP application, sentiment analysis in particular. In sentiment analysis, FL such as sarcasm can work as an unexpected polarity reverser, which may undermine the accuracy of the system if not addressed adequately [19]. In medical forums, patients seeking support for their medical problems often use sarcasm to express their emotion.

In the following example, a patient with an anxiety problem describes her health condition sarcastically:
“... It goes from pain to slight discomfort.. I cant move. Great way to start the day !”
Here, the phrase “Great way to start the day !” is used in a sarcastic sense to express the medical condition of not feeling good. To model this feature for each forum post in our framework, we utilize the approach proposed by [48]. The model uses two CNN layers followed by a Long Short Term Memory (LSTM) [49] network. We compute the sarcasm scores as follows:
$$F_s = \sigma(W_s \cdot V_c + b_s) \tag{3}$$
where $\sigma$ represents the sigmoid activation function, and $V_c$, $W_s$ and $b_s$ are the input, weight, and bias term of the output layer. The input vector of the output layer, $V_c$, is the concatenated feature vector that combines the features extracted by the LSTM layers, which capture the state of mind of a user.

Personality View: One's behavior characterizes personality, sympathy, emotion, thought process, and motivation. Our personality impacts many aspects of our lives, such as decision making, life choices, and physical and mental health. For extracting the personality view, we employ an approach similar to the one used for the emotion view. We utilize the corpus curated by [50], which contains , essays labeled with big-five personality traits, to train the model. We use a CNN as the learning model for the multi-label personality detection task. After training, the CNN model is used to infer the personality traits present in each medical forum post by extracting the activation of the CNN's last hidden layer, which we call the personality vector. The extracted personality vector can be represented as:
$$F_p = \varphi(W_p \hat{d}_p + b_p) \tag{4}$$
where $\varphi$ is a non-linear activation function, and $\hat{d}_p$ (i.e., the extracted pooled feature), $W_p$, and $b_p$ are the input, weight, and bias of the last hidden layer.

Sentiment View: We generate the sentiment views from the forum posts as described below:
1) Word-level Sentiment (WS): Sentiment clue words provide important features for deciding the sentiment of the users. Besides, adding a negation to a sentiment word can change the polarity. For example, “I’m stable” carries positive sentiment, but with a negation, as in “I’m not stable”, the sentiment polarity changes. Briefly, there are two types of sentiment events by which we can capture the sentiments of users: occurrences of sentiment words (SW) and occurrences of sentiment words with negation (NSW). This feature calculates the positive ($SW^{+}$), negative ($SW^{-}$) and objective ($SW^{O}$) score for each word by capturing the sentiment events [51]. The publicly available SentiWordNet (SWN) is used to calculate the score for each word as follows:
$$SW^{+} = \big(tf(SW) \cdot f^{+}(SW) + tf(NSW) \cdot f^{-}(SW)\big) \cdot idf(SW) \tag{5}$$
$$SW^{-} = \big(tf(SW) \cdot f^{-}(SW) + tf(NSW) \cdot f^{+}(SW)\big) \cdot idf(SW) \tag{6}$$
$$SW^{O} = tfidf(SW) \cdot f^{O}(SW) \tag{7}$$
Here, $tf$ and $idf$ represent the term and inverse document frequencies, respectively. $f^{+}(SW)$, $f^{-}(SW)$ and $f^{O}(SW)$ are the positive, negative and objective scores, respectively, obtained from SentiWordNet. The word-level sentiment feature of a forum post having $n$ words is obtained as follows:
$$WS^{+} = \frac{\sum_{SW \in n} SW^{+}}{n} \tag{8}$$
$$WS^{-} = \frac{\sum_{SW \in n} SW^{-}}{n} \tag{9}$$
$$WS^{O} = \frac{\sum_{SW \in n} SW^{O}}{n} \tag{10}$$

Footnote: http://sentiwordnet.isti.cnr.it/
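The word-level scores of Eqs. (5) to (10) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: `TOY_LEXICON` is a hypothetical stand-in for SentiWordNet with made-up scores, a sentiment word counts as negated when the immediately preceding token is a negator, missing idf values default to 1.0, and the tf in Eq. (7) is taken to count both plain and negated occurrences.

```python
from collections import Counter

# Hypothetical stand-in for SentiWordNet: word -> (f_pos, f_neg, f_obj).
TOY_LEXICON = {
    "stable": (0.50, 0.00, 0.50),
    "worse": (0.00, 0.75, 0.25),
}
# Assumed negation cue: the token right before the sentiment word.
NEGATORS = {"not", "no", "never"}


def word_level_sentiment(tokens, idf, lexicon=TOY_LEXICON):
    """Return (WS+, WS-, WS_O) for one tokenized post."""
    tf_sw, tf_nsw = Counter(), Counter()  # plain vs. negated occurrences
    for i, tok in enumerate(tokens):
        if tok in lexicon:
            negated = i > 0 and tokens[i - 1] in NEGATORS
            (tf_nsw if negated else tf_sw)[tok] += 1

    ws_pos = ws_neg = ws_obj = 0.0
    for word in set(tf_sw) | set(tf_nsw):
        f_pos, f_neg, f_obj = lexicon[word]
        w_idf = idf.get(word, 1.0)  # assumed default when idf is unknown
        ws_pos += (tf_sw[word] * f_pos + tf_nsw[word] * f_neg) * w_idf  # Eq. (5)
        ws_neg += (tf_sw[word] * f_neg + tf_nsw[word] * f_pos) * w_idf  # Eq. (6)
        ws_obj += (tf_sw[word] + tf_nsw[word]) * w_idf * f_obj          # Eq. (7)

    n = max(len(tokens), 1)  # guard against empty posts
    return ws_pos / n, ws_neg / n, ws_obj / n  # Eqs. (8)-(10)


plain = word_level_sentiment("i am stable".split(), idf={})
negated = word_level_sentiment("i am not stable".split(), idf={})
```

With these toy scores the plain post has a larger positive than negative score, and the negation swaps the two, which is the polarity reversal the feature is designed to capture.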
2) Target-specific Sentiment (TS): After analyzing the validation data, we observe that approximately % of the posts depict sentiments in the context of certain stative verbs such as ‘feel’, ‘suffer’, and ‘experience’. We design this feature by considering a context window of [- , ] words and selecting the most effective stative verb. After that, the negative and positive densities of a post are calculated as the ratio of the frequency of the clue words to the number of words in the context (i.e., in this case). For example, if a post contains more than one instance of the term ‘feel’, we calculate the score for each instance individually and take the maximum. If the word ‘feel’ appears at the $i$-th position in a forum post, then the score is calculated as follows:
$$Score(+) = \sum_{m=i-k}^{i-1} w_m \times f^{+}(SW_m) + \sum_{n=i+1}^{i+k} w_n \times f^{+}(SW_n) \tag{11}$$
$$Score(-) = \sum_{m=i-k}^{i-1} w_m \times f^{-}(SW_m) + \sum_{n=i+1}^{i+k} w_n \times f^{-}(SW_n) \tag{12}$$
where $k$ is the context window size, and the weights are $w_m = m + k - i + 1$ and $w_n = k - n + i + 1$, so that words closer to the stative verb receive larger weights. The aggregate scores $TS^{+}$ and $TS^{-}$ of a forum post are calculated as follows:
$$TS^{+} = \max\big(Score_{t=0}(+), Score_{t=1}(+), \ldots, Score_{t=T}(+)\big) \tag{13}$$
$$TS^{-} = \max\big(Score_{t=0}(-), Score_{t=1}(-), \ldots, Score_{t=T}(-)\big) \tag{14}$$
where $T$ is the number of sentiment-bearing words in the post.

E. Multi-view Fusion Layer
We take a multi-view learning approach to combine the various views discussed above into a comprehensive embedding for each medical forum post. We use an extension of Canonical Correlation Analysis (CCA) [52] to perform fusion from multiple views. The extended CCA captures maximal information between multiple views and creates a combined representation. This extension is called Generalized CCA (GCCA) [53], and it has been used in the literature to fuse multiple sources of information into a single source. GCCA finds $G$, $U_i$ by solving the optimization problem
$$\arg\min_{G, U_i} \sum_i \| G - X_i U_i \|_F^2 \tag{15}$$
such that $G^T G = I$,
where $G \in \mathbb{R}^{m \times k}$ contains the fused feature representation matrix, $X_i \in \mathbb{R}^{m \times d_i}$ corresponds to the data matrix for the $i$-th view, and $U_i \in \mathbb{R}^{d_i \times k}$ maps from latent space to observable view $i$. However, since all the views are not equally important, we employ the weighted GCCA (wGCCA) [54]. In this representation, we add a weight term to the above equation as follows:
$$\arg\min_{G, U_i} \sum_i w_i \| G - X_i U_i \|_F^2 \tag{16}$$
such that $G^T G = I$ and $w_i \geq 0$, where $w_i$ represents the importance of the $i$-th view in the fusion process. The columns of $G$ are the eigenvectors of $\sum_i w_i X_i (X_i' X_i)^{-1} X_i'$, and the solution for $U_i$ is $U_i = (X_i' X_i)^{-1} X_i' G$.

We use wGCCA to obtain the final feature representation $G$. Finally, the classification of a forum post is carried out by the following equation:
$$p(Y = y \mid M, G) = softmax_y(G^T W + b) = \frac{e^{G^T W_y + b_y}}{\sum_{k=1}^{K} e^{G^T W_k + b_k}} \tag{17}$$

TABLE II. Dataset statistics for both the categories of medical sentiment.
[Table body not recoverable from the source; surviving column headers: Dataset, Categories, Classes.]

TABLE III. Performance comparison of our proposed model with the baselines on both the datasets.
Models | Techniques Used | Medical Condition: Precision | Recall | F-Score | Medications: Precision | Recall | F-Score
Baseline 1 | BERT | 72.70 | 73.15 | 72.89 | 86.64 | 87.55 | 86.81
Baseline 2 | BioBERT | 72.42 | 72.30 | 72.28 | 86.68 | 86.97 | 86.76
Baseline 3 | MTL [11] | 66.71 | 64.33 | 65.5 | 85.33 | 81.90 | 83.58
Proposed Approach | NLU based Multi-view Learning | 75.52 | 80.25 | 77.45 | 89.52 | 89.91 | 89.57

TABLE IV. Feature ablation study: the system performance (in F1 score) by removing one view at a time.
Index | View | Medical Condition | Medications
(1) | All | 77.45 | 89.57
(2) | - Emotion (coarse) | 75.08 | 87.32
(3) | - Emotion (fine) | 75.44 | 88.61
(4) | - Sarcasm | 77.10 | 86.82
(5) | - Personality | 74.91 | 87.66
(6) | - Word-level Sentiment | 74.54 | 85.94
(7) | - Target-specific Sentiment | 75.85 | 85.27

F. Dataset
We evaluate the performance of the system on the publicly available dataset [10] obtained from a popular online health forum. The forum posts were collected from four online discussion groups: Depression, Allergy, Asthma, and Anxiety. The dataset consists of , medical forum posts related to Task 1: medical conditions and , forum posts related to the category of Task 2: medication. We have extended the previous dataset by including one more class (‘Other’), which covers miscellaneous blog posts. These blog posts do not explicitly provide any information regarding a medical condition or treatment but are more of a general enquiry. A more detailed description of each class can be obtained from [10]. The dataset statistics are provided in Table II. We perform a 10-fold cross-validation experiment on both the datasets.

Footnote: https://patient.info

Ethics:
Our project involves analysis of anonymized data that is publicly available and used by other publications. It does not involve any direct interaction with any individuals or their personally identifiable data. Thus, this study was reviewed by the Wright State University IRB and received an exemption determination.

IV. EXPERIMENTAL RESULTS
Here, we present results on the severity assessment task. Thereafter, we provide a technical interpretation of the results, followed by an ablation study. We used Recall, Precision and F1-Score to evaluate our proposed approach against state-of-the-art baselines. As baseline models, we used BERT, BioBERT [55], and the multi-task adversarial learning framework [11].

We report the performance of our proposed approach along with the baselines in Table III on task-1 (Medical Condition) and task-2 (Medication). The results show that the BERT model is the best among all the baseline models. The proposed approach achieves an improvement of . , . , and . F-Score points for task-1, and . , . , and . F-Score points for task-2 over the baselines , , and respectively. A statistical significance test (t-test) shows that the improvements over the baselines are significant ( p-value < . ).

To prove the effectiveness of each view, we conduct ablation experiments on our proposed model. As shown in Table IV, the performance of the model declines to varying degrees when we remove different views from the model. All the declines are significant with p ≤ . . Similarly, for the target-specific sentiment, we observe a decline of . . On the medication task, the sentiment view is again found to be important. Removing the word-level and target-specific sentiment views reduces the model performance by . and . F-Score points, respectively. This shows that the sentiment views are the primary contributors to determining the health states. The two other significant views affecting the final predictions are emotion (fine) and personality. As they directly reflect the behavior of a user, their exclusion leads to a decline in performance. The impact of the sarcasm view is found to be smaller compared to the other views, as its removal drops the performance by . and . F-Score points on the Medical Condition and Medications datasets, respectively.
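Since the comparisons above are all differences between F-Scores, they can be recomputed directly from the values reported in Tables III and IV. The short Python sketch below does this; the dictionary and variable names are illustrative, and the only data used are the F-Score columns of the two tables.

```python
# F-Scores from Table III: model -> (Medical Condition, Medications).
TABLE_III_F = {
    "Baseline 1 (BERT)": (72.89, 86.81),
    "Baseline 2 (BioBERT)": (72.28, 86.76),
    "Baseline 3 (MTL)": (65.5, 83.58),
}
PROPOSED_F = (77.45, 89.57)  # proposed approach, also row (1) "All" of Table IV

# F-Scores from Table IV: removed view -> (Medical Condition, Medications).
TABLE_IV_F = {
    "Emotion (coarse)": (75.08, 87.32),
    "Emotion (fine)": (75.44, 88.61),
    "Sarcasm": (77.10, 86.82),
    "Personality": (74.91, 87.66),
    "Word-level Sentiment": (74.54, 85.94),
    "Target-specific Sentiment": (75.85, 85.27),
}


def deltas(reference, table):
    """F-Score points by which `reference` exceeds each row of `table`, per task."""
    return {
        name: tuple(round(ref - val, 2) for ref, val in zip(reference, scores))
        for name, scores in table.items()
    }


gains_over_baselines = deltas(PROPOSED_F, TABLE_III_F)
drops_when_removed = deltas(PROPOSED_F, TABLE_IV_F)

for name, (cond, med) in drops_when_removed.items():
    print(f"- {name}: {cond:.2f} (Medical Condition), {med:.2f} (Medications)")
```

Sorting `drops_when_removed` by either component gives a quick ranking of how much each view contributes on that task.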
Although our analysis shows that the identification of sarcasm is crucial, its small impact could be due to our learning strategy, since the use of FL in the medical domain is quite different from that in the general domain.

V. DISCUSSION
In this section, we study a few cases from both the datasets where our model correctly identifies various aspects of the health states with the help of the NLU views.
• Case 1: Effect of the emotion view
Consider the following example from task-1:
“why does anxiety feel like you have to make yourself breathe instead of letting your body breathe on its own. Am’i the only one.”
In the absence of the emotion views, the system misclassifies the post as ‘Other’. However, the inclusion of the emotion views assists in understanding the user's implicit state of mind, and the post is classified correctly as ‘Exist’. The system captures the anger and disgust emotions present in the text, which are highly correlated with the emotion distribution associated with the medical condition category ‘Exist’.
• Case 2: Effect of the sarcasm view
In the following post, the user sarcastically expresses his/her condition:
“Lol I’m just a big ball of anxiety fun.”
The system misclassifies the post as ‘Recover’ in the absence of the sarcasm view, which may be due to the presence of positive sentiment-bearing words. However, the sarcasm view helps the model to predict the class ‘Exist’ correctly. The contextual cues extracted from a post are not always enough to understand someone's feelings; common sense and background knowledge about the topic of discussion are also required. Such situations are prevalent in forum posts with very long sentences.
• Case 3: Effect of the personality view
In our study we find the personality view to be very useful in understanding users having an anxiety disorder. Consider the following example from the anxiety group of task-1:
“cutting open my arm. The sensible bit of me says NO just a reaction to my new meds, the other half want to self destruct... I don’t know which one is going to win...”
The personality view identified signs of a neurotic personality in the above post, which overlap with the symptoms of anxiety. This helped the system in correctly classifying the user's mental state as having the anxiety disorder.
• Case 4: Effect of the sentiment view
As shown in Table IV, the sentiment views help the model boost the performance by nearly 2% F-Score. In the following example:
“I think it’s because I’m afraid of feeling ill when I’m out. This past week I increased citalopram to 20 mg and zI don’t know if it’s making me feel worse.”
the sentiment views capture the negative sentiment-bearing words (i.e., afraid, ill, and worse) and encode this information to help the model correctly predict the class as ‘Serious Adverse Effect’.

VI. CONCLUSIONS AND FUTURE WORK
In this paper, we identify the severity of a user’s health state by analyzing different medical aspects (such as medical condition and outcome of treatment) from their social media texts. We validate our study on a benchmark dataset curated from medical web forums. We propose a deep learning model that leverages various NLU views such as emotion, sarcasm, personality, and sentiment along with the textual content to classify medical forum posts. The evaluation shows that combining the content view with the context views is an effective way to boost classification performance. In the future, we would like to explore other facets of a user’s health state such as ‘Consequence of a treatment’ and ‘Certainty of a diagnosis’. In addition to sarcasm, we would also like to model other forms of figurative language such as ‘metaphor’ and ‘irony’, which are widely used in social media texts.
VII. ACKNOWLEDGEMENT
Amit Sheth acknowledged partial support from NMH award R01MH105384 Modeling Social Behavior for Healthcare Utilization in Depression. All findings and opinions are of authors and not sponsors. Sriparna Saha would like to acknowledge the support of SERB WOMEN IN EXCELLENCE AWARD 2018.