Public Sentiment Toward Solar Energy: Opinion Mining of Twitter Using a Transformer-Based Language Model
Serena Y. Kim, Koushik Ganesan, Princess Dickens, Soumya Panda
Serena Y. Kim a,b,*, Koushik Ganesan c, Princess Dickens d and Soumya Panda e
a Computer Science, University of Colorado Boulder, 1111 Engineering Dr, Boulder, CO 80309 USA
b School of Public Affairs, University of Colorado Denver, 1380 Lawrence St., Suite 500, Denver, CO 80204 USA
c Physics, University of Colorado Boulder, 2000 Colorado Ave, Boulder, CO 80309 USA
d Linguistics, University of Colorado Boulder, Hellems 290, Boulder, CO 80309 USA
e Business Analytics, University of Colorado Boulder, 995 Regent Dr, Boulder, CO 80309 USA
ARTICLE INFO
Keywords:
Solar energy; Renewable energy policy; Sentiment analysis; Machine learning; Natural language processing; RoBERTa; Social media
ABSTRACT
Public acceptance and support for renewable energy are important determinants of renewable energy policies and market conditions. This paper examines public sentiment toward solar energy in the United States using data from Twitter, a micro-blogging platform on which people post messages known as tweets. We filtered tweets specific to solar energy and performed a classification task using Robustly optimized Bidirectional Encoder Representations from Transformers (RoBERTa). Analyzing 71,262 tweets from late January to early July 2020, we find that public sentiment varies significantly across states. Within the study period, the Northeastern U.S. region shows more positive sentiment toward solar energy than does the Southern U.S. region. Solar radiation does not correlate with variation in solar sentiment across states. We also find that public sentiment toward solar correlates with renewable energy policy and market conditions, specifically Renewable Portfolio Standards (RPS) targets, customer-friendly net metering policies, and a mature solar market.
1. Introduction
For the first time in U.S. history, solar energy recently became more cost efficient than coal energy in the long term [36]. Solar energy contributes to greenhouse gas (GHG) emission reduction, offers new business opportunities to landowners and energy providers, and allows utility customers to lower their utility bills. Despite these advantages, only about 2% of U.S. electricity generation currently comes from solar [64]. The low proportion of energy generated from solar may be attributed to multiple factors, including skepticism surrounding financial benefits, system reliability, and conflicts of interest with respect to utility revenue preservation [73]. Although public sentiment and preferences regarding renewable energy have been studied in the last decade [43, 59, 20], few studies have documented public perception of solar energy specifically. Public opinion on solar energy is worth investigating separately because different renewable sources such as geothermal, hydroelectricity, wind, and bioenergy have unique advantages and requirements depending on the existing climate conditions and state-specific energy policies. Furthermore, most of the existing literature relies on surveys and interviews to understand public perception of renewable energy. Although surveys and interviews can provide targeted and relevant data, they can be susceptible to selection and response biases [9]. For example, people who are more supportive of renewable energy may be more likely to respond to surveys and interviews, which may bias the results. Social media is an increasingly popular means to express opinions and preferences and can thus provide valuable information for understanding public opinion on solar energy. In this paper, our first aim is to understand public opinion using data from Twitter, a micro-blogging platform on which people can post and interact with messages known as "tweets".
With over 50 million active users in the United States alone [2], Twitter provides an ideal platform for opinion mining, as extensive amounts of data are collected across a wide range of demographics and geographical locations. To achieve this aim, we utilized Robustly optimized Bidirectional Encoder Representations from Transformers (RoBERTa) [38], which builds on recent developments in the fields of Natural Language Processing (NLP) and Machine Learning (ML). RoBERTa is extensively pre-trained and can later be fine-tuned for a domain-specific task, yielding highly accurate results. Our second aim in this paper is to examine whether energy policies and market conditions explain public opinion on solar energy. In particular, we focus on state-level characteristics, including renewable energy generation capacity, renewable energy portfolio standards (RPS), net metering, renewable energy incentives, and solar market maturity. The results suggest that public opinion on solar energy varies widely across states and is more likely to be positive in states with aggressive RPS targets, customer-friendly net metering rules, and a more mature solar market. This study has policy implications for addressing spatial disparities in public support for solar energy and for future opportunities for solar energy deployment. In particular, states which implemented specific renewable energy incentives are more supportive of solar energy, which may have implications for solar deployment. We also discuss the benefits of using social media as a source of data on public opinion and of applying RoBERTa for sentiment classification in gauging public sentiment toward solar energy.
*Corresponding author
serena.kim@colorado.edu (S.Y. Kim); koushik.ganesan@colorado.edu (K. Ganesan); princess.dickens@colorado.edu (P. Dickens); soumya.panda@colorado.edu (S. Panda)
ORCID(s): 0000-0003-3839-3674 (S.Y. Kim)
Preprint submitted to Elsevier
2. Background
Public acceptance and awareness of renewable energy are critical to the development of renewable energy industries and technology [52]. On one hand, lack of institutional, political, and community support is often identified as a key barrier to renewable energy development [73]. Landscape modification and visual intrusion of facilities are highlighted as main reasons for residents' opposition to renewable energy [6, 71]. On the other hand, the contribution of renewable energy to the local and regional economy positively influences public support for renewable energy [47, 7]. Not surprisingly, individuals' opinions and preferences regarding renewable energy are highly associated with their beliefs about and perceptions of global warming, climate change, and environmental risk [16, 43]. Views on renewable energy are also determined by personal characteristics such as education, party identification, and age [20, 56]. Government policies and public opinion on renewable energy may relate to each other in a bidirectional sense. Public acceptance and support for renewable energy may affect renewable energy policy adoption, which in turn encourages renewable energy deployment. Renewable energy programs and financial incentives can also mitigate uncertainties in energy transition, garnering public support and acceptance for renewable energy. For example, RPS policy design and framing strongly influence broad public support for renewable energy technologies [59]. Additionally, net metering, an electricity billing policy that allows renewable energy system owners (e.g., residential, commercial, and industrial buildings) to use the renewable energy they generate, has been shown to facilitate solar energy deployment [12, 13].
A well-designed, transparent net metering policy can mitigate market and revenue uncertainties and help gain support from key stakeholders, including utilities, solar businesses, and customers [5], although compensation rate limitations negatively affect solar energy preferences after reaching a certain tipping point [12]. Public beliefs about the effectiveness of the policy-making and planning process are an important predictor of public sentiment toward renewable energy. Public perceptions of fairness in decision making about the siting of renewable energy facilities can mitigate public resistance to renewable energy deployments [71]. Having trust in the policy makers responsible for renewable energy development has a direct effect on public opinion [25]. Greater openness to information sharing about alternative energy options can enhance public support for renewable energy [78]. However, lack of a common understanding about the planning process contributes to public reluctance toward renewable energy adoption [8]. Public education and outreach around administrative and technological aspects of solar energy can help reverse negative perceptions of renewable energy development [53, 25]. Leveraging social networks and facilitating participation in the planning and development process can also enhance public trust in solar energy development [44]. Electricity market characteristics and conditions also likely influence public opinion on renewable energy. Public preferences depend on the price of conventional electricity because renewable energy is considered an alternative to conventional electricity generation. While the availability of low-cost fossil fuel-generated electricity can make it difficult to justify renewable energy development [10], increased energy prices can enhance public acceptance of renewable energy [61].
In addition, high upfront costs and lack of adequate financing options are major barriers to public support for renewable energy [44, 25]. Solar businesses contribute to local economic development by lowering solar installation costs and creating new jobs, which in turn leads to public support for renewable energy development. Existing literature also documents temporal and spatial variations in preferences and opinions regarding renewable energy. Public opinion about renewable energy changes over time [66, 1]. For example, Hamilton et al. [20] find that public acceptance of renewable energy shows clear upward trends from 2011 to 2018. Historical events (e.g., energy shortages and electricity price increases) can affect public awareness about renewable energy technology [1].
Public acceptance of renewable energy also varies geographically, by communities [11], states [37, 20], and countries [58]. Spatial and geographical characteristics, such as region-specific culture [4], solar radiation [57], and energy autonomy of the region [58], influence spatial variations in public opinion. Most empirical studies have used surveys and interviews to measure public opinion, sentiment, awareness, and perception of renewable energy. The literature finds broad public support for renewable energy across the United States [59], Finland [33], Mexico [19], Spain [54], South Korea [29], Portugal [60], Greece [24], and worldwide [52, 72]. Surveys and interviews have advantages in gauging individual-level demographic information, such as gender, education, income [34], distance to renewable energy facilities, and previous experience with renewable energy technologies [58], which are among the key determinants of individuals' preferences regarding renewable energy. However, surveys and interviews are limited in gauging temporal dynamics and geographical variations in public opinion. Researchers are beginning to use social media, especially Twitter, to examine public sentiment toward renewable energy [1, 23, 46, 37]. Jain and Jain [23] compare five different machine learning techniques for sentiment analysis and find that the Support Vector Machine (SVM) achieves higher accuracy than K-Nearest Neighbor, Naive Bayes, AdaBoost, and Bagging algorithms for sentiment classification on renewable energy related tweets. Using both traditional and social media for opinion mining, Nuortimo and Harkonen [46] find that public opinion on solar and wind has been the most positive compared to other energy sources, including coal, nuclear, and biomass. Using Twitter data from 2014 to 2016, Abdar et al. [1] find that Alaskans' energy preferences have become more supportive of renewable energy over time.
3. Materials and methods
Our data are from two main sources: Twitter [63] and U.S. federal and state government agencies. We use Twitter data for sentiment analysis, or opinion mining, of solar energy. Twitter has been a valuable source for opinion mining, but the manual classification of a large number of tweets is difficult and time-consuming. Thus, we use NLP and ML methods to detect public opinion on solar energy automatically. We discuss our data collection and preprocessing in the following section. Section 3.2 outlines recent developments and approaches in opinion mining. Section 3.3 explains our sentiment classification model built upon RoBERTa. Finally, Section 3.4 summarizes the energy policy and market data used in this study.
The Twitter Application Program Interface (API) was used to collect tweets, which are posts created by individuals on Twitter, specific to solar energy. Tweepy, a Python library for accessing the Twitter API, was used to stream live tweets in real time [63]. We used ten keywords to stream live tweets: 'solar energy', 'solar panel', 'solar PV', 'solar photovoltaic', 'solar battery', 'solar thermal', 'solar power', 'solar-powered', 'solar generation', and 'solar subsidies.' In total, 406,811 tweets specific to solar energy were collected between January and early July of 2020. We removed URLs, "RT (ReTweets)" markers, and images from the original tweets using the preprocessing library [41], which aids in processing text strings. We identified a list of words that make a tweet irrelevant to public sentiment toward solar energy. These words include 'Pokemon', 'Superman', 'galaxy', 'eclipse', 'solar plexus', 'solar-powered human', and 'I will become your sun.' We excluded 16,245 tweets that included these words. We also excluded the tweets that included the ten keywords (e.g., 'solar energy', 'solar panel', 'solar PV', 'solar photovoltaic') only in the user identification (i.e., screen name and description) but not in the text, quoted text, or extended text. We excluded 64,253 tweets through this process. Not all tweets include the geographic information that is essential to this study. Only about 40% of the tweets have geographic information, indicating either where the users are based or where they tweeted. Among this 40%, about half of the tweets are from countries other than the United States. For the purpose of this study, we only need the tweets posted by users with geolocations associated with the United States and/or the tweets by users who identify themselves as based in the U.S.
Thus, we extracted tweets with geographic information based on the self-reported locations in the user profiles as well as the latitude/longitude coordinates using Carmen, a library for geolocating tweets [15]. Our final dataset includes 71,262 unlabelled tweets as a result of this extraction process. We randomly selected 5,122 of the 71,262 tweets and manually classified them into one of two groups: "positive toward solar" or "negative toward solar." Tweets that include job announcements and product advertisements in the solar industry are classified into the positive class, as these tweets support the solar energy industry. Retweets are included since retweeting is a way to express individuals' support for or opposition to solar energy. We did not include a neutral class because there were not enough tweets (8% of collected tweets) which we would consider neutral. In this case, including a neutral class could have resulted in lower performance or prediction accuracy [68, 48]. The tweets classified as neutral were deleted from the manually annotated dataset (5,122 tweets). Table 1 illustrates examples of tweets belonging to the two classes, positive and negative solar sentiment.
Table 1
Sample Tweets for Two Classes
Sentiment | Tweets
Positive | Solar energy has never been easier and more affordable to install.
Positive | Oil is the the way of the past. Solar and other forms of clean energy is where you have to invest.
Positive | Add solar and get paid twice - Equity and Energy!
Negative | I've seen gopher tortoises and red-shouldered hawks rendered bereft of habitat because of solar panel farms.
Negative | Solar and wind power are totally dependent on weather and can't be trusted.
Negative | Solar is expensive to maintain and return is not what everyone is shouting about.
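The cleaning and keyword filtering steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the study used the preprocessing library [41], whereas the regular expressions, the function names `clean_text` and `is_relevant`, and the case-insensitive substring matching here are assumptions made for the example.

```python
import re

# Streaming keywords listed in the text (lower-cased for matching).
SOLAR_KEYWORDS = [
    "solar energy", "solar panel", "solar pv", "solar photovoltaic",
    "solar battery", "solar thermal", "solar power", "solar-powered",
    "solar generation", "solar subsidies",
]

# Terms the text lists as making a tweet irrelevant to solar sentiment.
IRRELEVANT_TERMS = [
    "pokemon", "superman", "galaxy", "eclipse", "solar plexus",
    "solar-powered human", "i will become your sun",
]

# Assumed patterns standing in for the library-based cleaning step.
URL_PATTERN = re.compile(r"https?://\S+")
RT_PATTERN = re.compile(r"^RT\s+@\w+:?\s*")


def clean_text(text: str) -> str:
    """Strip URLs and a leading 'RT @user:' marker, then collapse whitespace."""
    text = URL_PATTERN.sub("", text)
    text = RT_PATTERN.sub("", text)
    return " ".join(text.split())


def is_relevant(text: str) -> bool:
    """Keep a tweet only if its text has a keyword and no irrelevant term."""
    lowered = text.lower()
    has_keyword = any(k in lowered for k in SOLAR_KEYWORDS)
    has_irrelevant = any(t in lowered for t in IRRELEVANT_TERMS)
    return has_keyword and not has_irrelevant
```

Checking the keyword against the tweet text only (rather than the screen name or profile description) mirrors the exclusion rule described above; the irrelevant-term check takes precedence, so a phrase such as 'solar-powered human' is dropped even though it contains a streaming keyword.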
ML approaches have been widely applied to opinion mining, also known as sentiment analysis. Opinion mining is the computational treatment of opinions, sentiments, and subjectivity of text [28]. Sentiment classification techniques based on NLP and ML have been widely applied for opinion mining in diverse areas, including public health [79], movie reviews [27], airline service [67], political news [26], and the COVID-19 pandemic [17, 40]. Sentiment classification is a sub-discipline of text classification, which is concerned with assigning a text to a class in order to analyze the opinion or sentiment it expresses. Although human emotions and intents are highly complex in real life, the current state-of-the-art sentiment analysis has achieved higher performance with a simpler classification task, such as classifying texts into two (e.g., positive and negative) or three categories (e.g., positive, negative, and neutral). Recent developments in NLP have produced effective ways for automatic sentiment analysis using ML algorithms [50]. Lexicon-based approaches using supervised and unsupervised ML algorithms, including Naive Bayes and Support Vector Machines (SVM), have generated high accuracy in sentiment classification. They have gained wide popularity in opinion mining [21]. More recently, deep learning approaches including Gated Recurrent Units (GRUs), long short-term memory (LSTM), and Convolutional Neural Networks (CNN) have started using non-static word embedding and have shown even higher performance than the previous ML approaches in multiple sentiment classification tasks [39, 49]. Although neural network models have established higher accuracy with their automated feature learning, the application of such models is often limited by their heavy reliance on manually annotated data for training language models [50].
Within neural network models, considerable progress has been achieved by the models using the Transformer architecture, which is based on a self-attention mechanism [65]. Among the models this architecture has given rise to, BERT (Bidirectional Encoder Representations from Transformers), which uses contextual sentence and word embeddings to address the limitations of RNNs and LSTMs, has notably improved the performance of sentiment analysis [14]. BERT is also the first large-scale approach to training a bidirectional language model in an unsupervised way, using two main strategies, namely "masked language model" (MLM) and "next sentence prediction". A number of BERT-based models have been developed since 2018, including DistilBERT [55], ALBERT [35], XLNet [74], and RoBERTa [38]. One of the key advantages of the BERT-based models is that users do not need a large corpus of texts to train their models. Users only need to fine-tune a BERT-based model using area/task-specific supervised training data (e.g., manually annotated tweets) because BERT is already pre-trained on a large corpus from Wikipedia and books and utilizes richer contextual information. The sentence-level vectors are more convenient for downstream NLP tasks, including sentiment classification, detecting sentence similarity, and learning word vector embedding.
Figure 1: Graphic representation of our model based on RoBERTa. The input embeddings are denoted as E, and the final hidden vector is denoted as T. Every input example starts with CLS, a special token. C is the final hidden vector of the special CLS token. Trm is Transformer. Adapted from Devlin et al. [14].
Our sentiment classification model is based on RoBERTa, a Robustly Optimized BERT [38]. Like other BERT-based models, RoBERTa is powerful because it pre-trains the language model with a bidirectional representation of words, meaning that the model is not restricted to reading texts either right-to-left or left-to-right. This deep bidirectional pre-training structure provides the model with more information about context. RoBERTa also achieves higher accuracy than previous BERT-based models, including BERT, BERT-Large, and XLNet. This higher accuracy is achieved by several modifications: pre-training the language model with 8-times larger batches over 10-times more data; training on 5-times longer sequences; using a Byte-Pair Encoding (BPE) vocabulary instead of the character-level vocabulary; removing the next sentence prediction (NSP) objective; and dynamically changing the masking pattern applied to the training data instead of using static masking [38]. Thus, we chose RoBERTa as our base model, as it was considered the state-of-the-art sentiment classification approach as of January 2020. In total, we manually annotated 5,122 tweets without duplicates, and the annotated tweets were used to construct our train (80%), development (10%), and test (10%) sets. As shown in Figure 1, we fine-tuned RoBERTa using 4,097 annotated tweets. During the annotation process, we detected sarcastic expressions (e.g., "Solar power? Yeah, right. That is a top concern for millions of Americans during COVID.") and classified the tweets including such expressions into the negative class. We used the HuggingFace transformer library [70], which includes the standard RoBERTa-based architecture with 12 hidden layers and 12 attention heads, but with some modifications to the hyper-parameters, the parameters whose values control the learning process in machine learning. We also used the AdamW optimizer [32] to minimize the cross-entropy loss.
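As an illustration of the 80/10/10 partition described above, the following sketch shuffles a list of annotated items and slices it into train, development, and test sets. The function name `split_annotated`, the fixed random seed, and the choice to truncate fractional sizes are assumptions for this example; the text does not say how the split was drawn.

```python
import random


def split_annotated(items, train_frac=0.8, dev_frac=0.1, seed=0):
    """Shuffle and split items into train, development, and test lists."""
    shuffled = list(items)
    # A seeded generator keeps the (assumed) split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_dev = int(len(shuffled) * dev_frac)
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]  # remainder goes to the test set
    return train, dev, test


# Stand-in identifiers for the 5,122 annotated tweets.
tweet_ids = range(5122)
train, dev, test = split_annotated(tweet_ids)
print(len(train), len(dev), len(test))
```

With 5,122 items, truncating 80% gives 4,097 training examples, which matches the number of tweets the text reports for fine-tuning; the remaining 1,025 items are divided between the development and test sets.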
To optimize the parameters, we conducted a number of experiments testing multiple combinations of the hyper-parameters on the development set (10% of the annotated tweets). In our final model, we fine-tuned RoBERTa with a learning rate of 6e-6, an ε of 1e-8, and dropout of 0.1. We set the maximum sequence length to 128 tokens, used a batch size of 16, and trained for 4 epochs on a Tesla P100 GPU. Using this final set of hyper-parameters, our model achieves 90.7% accuracy with an F1 score of 0.927 on the test set (another 10% of annotated tweets).
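For readers less familiar with the two reported metrics, the sketch below computes accuracy and the F1 score for a binary classifier from lists of true and predicted labels. The labels are toy values rather than the study's test set, and treating the positive class as the F1 target is an assumption, since the text does not state which averaging was used.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels, with F1 on the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, f1


# Toy labels (hypothetical): 1 = positive toward solar, 0 = negative.
y_true = [1, 1, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(acc, f1)
```

Because positive tweets outnumber negative ones in this kind of data, F1 on the positive class can exceed accuracy, which is consistent with the pattern in the reported figures (0.927 versus 90.7%).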
The second aim of this study is to examine associations between public sentiment on solar energy and renewable energy policy and market characteristics. The characteristics examined in this study include renewable energy generation, RPS, net metering, the number of existing renewable energy incentives and policies, electricity price, and solar energy market maturity.
A state's existing renewable energy generation capacity may be positively associated with public acceptance of solar energy. Renewable generation is measured as the percentage of renewable energy (excluding hydroelectric energy) generated in a state in megawatt-hours as of 2019. Electricity generation data are from the U.S. Energy Information Administration (EIA) [64]. Distributed energy generation (e.g., rooftop solar) is not included in the dataset due to a lack of nationwide data on distributed energy generation capacity.
RPS
RPS policy is a state-level mechanism that requires utilities to generate or purchase a certain percentage of energy from renewable sources. Although RPS policies share common characteristics, the design features vary widely across states. Existing studies have developed and used different RPS measures, including a binary measure of whether the state has RPS policies [75], an RPS percentage target [22], an interaction between the percentage target and the target year [31], marginal RPS targets [30], and an RPS target combined with "free trade" of Renewable Energy Certificates (RECs) [77]. In order to take into account such variations, we construct a measure for RPS as follows:
$$\mathrm{RPS}_i = \frac{\mathrm{TargetPercent}_i - [\ldots]}{\mathrm{TargetYear}_i - [\ldots]} \qquad (1)$$
States that had already achieved an RPS target or did not have RPS policies as of 2019 received a score of 0. RPS data as of the fourth quarter of 2019 are from state government agencies. Iowa and Texas do not have a target year, but both states had already achieved their RPS targets and received a score of 0.
Net metering data are from the North Carolina Clean Energy Technology Center [3]. Some states have enacted net metering policies and other distributed generation (DG) compensation rules to allow electricity customers generating electricity (e.g., rooftop solar) to use that generated energy at any time. Solar system owners send the excess energy back to the electricity grid and are compensated at a specific rate. DG compensation policies provide substantial incentives for solar energy generation, as commercial and residential building owners can use at night the solar energy generated during sunny or cloudy days. Net metering policies can vary significantly in terms of design and features. Thus, we construct an additive measure capturing five key features:
a. $\mathrm{Mechanism}_i$: The existence of statewide net metering mechanisms (4 = statewide net metering; 3 = statewide alternative compensation mechanism; 2 = some customers (e.g., residential buildings) receive net metering benefits; 1 = only selective utilities (e.g., IOUs) provide net metering; 0 = no net metering or alternative DG compensation)
b. $\mathrm{Cap}_i$: Net metering capacity limitations, which regulate the size of systems that can receive net metering benefits in states (1 = unlimited system size; 0 = otherwise)
c. $\mathrm{Subscriber}_i$: Net metering subscriber size limitation (1 = unlimited; 0 = otherwise)
d. $\mathrm{Compensation}_i$: Compensation rate for energy generation (1 = compensate for customer rates; 0 = otherwise)
e. $\mathrm{Rollover}_i$: Rollover of remaining energy is allowed (2 = allowed without any limitations; 1 = partially allowed or allowed only until the end of billing year; 0 = not allowed)
$$\mathrm{NEM}_i = \mathrm{Mechanism}_i + \mathrm{Cap}_i + \mathrm{Subscriber}_i + \mathrm{Compensation}_i + \mathrm{Rollover}_i \qquad (2)$$
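The additive net metering measure in equation (2) can be sketched as a small data structure. The class name `NetMeteringPolicy`, its field names, and the example state are hypothetical and only illustrate how the five coded features in items a through e combine into a score between 0 and 9.

```python
from dataclasses import dataclass


@dataclass
class NetMeteringPolicy:
    mechanism: int     # 0-4, coded as in item a
    cap: int           # 0-1, coded as in item b
    subscriber: int    # 0-1, coded as in item c
    compensation: int  # 0-1, coded as in item d
    rollover: int      # 0-2, coded as in item e

    def score(self) -> int:
        """Additive NEM score from equation (2); ranges from 0 to 9."""
        upper_bounds = {
            "mechanism": 4, "cap": 1, "subscriber": 1,
            "compensation": 1, "rollover": 2,
        }
        # Reject codes outside the ranges defined for each feature.
        for name, upper in upper_bounds.items():
            value = getattr(self, name)
            if not 0 <= value <= upper:
                raise ValueError(f"{name} must be between 0 and {upper}")
        return (self.mechanism + self.cap + self.subscriber
                + self.compensation + self.rollover)


# Hypothetical state: statewide net metering, limited system size,
# unlimited subscribers, customer-rate compensation, partial rollover.
example = NetMeteringPolicy(mechanism=4, cap=0, subscriber=1,
                            compensation=1, rollover=1)
print(example.score())  # 7
```

Because the mechanism component alone spans 0 to 4, it carries almost half of the maximum score, so the existence and reach of a statewide mechanism weighs more heavily in this measure than any single design feature.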
Figure 2: Spatial patterns in sentiment toward solar energy across the United States (map legend: average sentiment score)
States which enacted more renewable energy policies and incentives may have greater public support for renewable energy. Thus, we included the number of renewable energy policies and incentives by state as of October 2019. The data were collected from the Database of State Incentives for Renewables & Efficiency [42].
Public sentiment on solar energy may be more positive in states with a more mature solar market, as this creates jobs and supports local economic development. We measure solar market maturity by the number of solar industry jobs per 1 million people in the state population in 2019. The data were obtained from the National Solar Jobs Census [62].
Higher electricity prices may be associated with higher public sentiment toward solar energy because distributed renewable energy generation helps reduce electricity bills in places with high electricity rates [61, 10]. Thus, we include in our analysis the average price of electricity to ultimate residential customers by state as of August 2019. The electricity data are from the U.S. Energy Information Administration (EIA) [64].
Solar radiation
The level of solar radiation positively predicts solar photovoltaic system installations [57] and solar energy generation [22]. As solar energy systems achieve greater efficiency and performance in regions with high solar radiation, we expect that annual average solar radiation is positively associated with sentiment toward solar energy. Solar radiation is measured in kWh/m²/day and is aggregated at the state level. The solar radiation data are from the National Renewable Energy Laboratory's National Solar Radiation Database [45].
Table 2
Descriptive statistics
Variables | Obs. | Mean | SD | Min | Max | (1) | (2) | (3) | (4) | (5) | (6) | (7)
(1) Average sentiment score | 51 | 7.546 | .805 | 4.96 | 9.08 | 1 | | | | | |
(2) Renewable generation | 51 | 17.078 | 17.079 | 1 | 67 | .13 | 1 | | | | |
(3) RPS | 51 | 1.568 | 1.825 | 0 | 9 | .40 | -.06 | 1 | | | |
(4) Net metering | 51 | 6 | 2.088 | 0 | 9 | .43 | .07 | .12 | 1 | | |
(5) Renewable incentives | 51 | 68.961 | 41.386 | 13 | 217 | .28 | .07 | .16 | .09 | 1 | |
(6) Solar market maturity | 51 | .725 | .695 | 0 | 2 | .44 | .09 | .24 | .18 | .23 | 1 |
(7) Electricity price | 51 | 13.863 | 4.441 | | 33 | .26 | -.04 | .26 | .17 | .02 | .34 | 1
(8) Solar radiation | 51 | 4.333 | .589 | 3 | 6 | -.22 | .08 | -.14 | -.13 | .03 | .18 | -.11
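The pairwise correlations in columns (1) through (7) of Table 2 can be illustrated with a short sketch of the Pearson coefficient. That the table reports Pearson correlations is an assumption, since the coefficient type is not stated, and the five data points below are hypothetical rather than the study's state-level values.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical net metering scores and sentiment scores for five states.
net_metering = [2, 5, 6, 8, 9]
sentiment = [6.1, 7.0, 7.4, 7.3, 8.5]
r = pearson_r(net_metering, sentiment)
print(round(r, 3), round(r ** 2, 3))
```

In a regression with a single predictor, the share of variance explained equals the squared correlation. For instance, the .43 correlation between net metering and sentiment in Table 2 squares to about .185, which is close to the 18.8% of variation attributed to net metering in model 3; the small gap is plausibly due to rounding of the reported correlation.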
4. Results
4.1. Public opinion on solar energy by state
Figure 2 presents the sentiment score findings across the United States. The District of Columbia (9.077), Utah (8.771), and Connecticut (8.692) are most supportive of solar energy, while Mississippi (4.962), Alabama (5.745), and Oklahoma (6.224) are least supportive. There is a statistically significant difference (one-way ANOVA, F(3, …) = …, p = .028) across the four U.S. Census regions (Northeast, Midwest, South, and West), but the Bartlett's test with Bonferroni adjustment indicates that only the difference between the Northeast and the South is statistically significant (Northeast - South = .975, p = .042). Figure 3 displays average sentiment scores of the 50 states and the District of Columbia (n = 51) between January 12 and July 7, 2020. The average sentiment score can vary from 0 (all tweets are negative toward solar energy) to 10 (all tweets are positive toward solar energy). The state average sentiment score ranges from 4.962 (Mississippi) to 9.077 (District of Columbia). The 95% confidence intervals depend on the number of tweets collected from each state. The number of tweets from each state varies from 64 (North Dakota) to 11,788 (California), with an overall mean of 1397.294 and standard deviation of 1967.161, resulting in larger confidence intervals for North Dakota, South Dakota, and Wyoming. However, the number of tweets per 1 million people in the state population has a lower standard deviation of 307.087 for a mean of 242.996, indicating that the average number of tweets adjusted for state population is more evenly distributed. Table 2 presents the descriptive statistics for all variables included in the statistical analysis. The average sentiment score of each state is calculated by averaging the sentiment scores of all tweets from that state. Thus, the mean of the average sentiment scores (7.546) in Table 2 is slightly different from the national average (7.835), as the means for states are based on non-weighted state averages. The highest correlation between the policy and market variables is .34, between solar market maturity and electricity price. Table 3 presents the results from the regression of sentiment score on the renewable energy policy and market variables. In models 1-7, we examine bivariate associations between sentiment and each renewable energy policy/market variable. The full model (model 8) includes all predictor variables. Model 1 demonstrates that there is no significant relationship between public sentiment and current renewable energy generation. In model 2, however, public sentiment on solar energy is more positive in states which enacted higher annual RPS targets (β = .178, p < .01).
We also find that states with more customer-friendly net metering policies (e.g., statewide net metering mechanisms, no capacity limitations, year-to-year rollover) have a more positive sentiment toward solar energy (β = .167, p < .01) in model 3. This finding is consistent in the full model (β = .115, p < .01). Model 3 suggests that net metering explains 18.8% of the variation in public sentiment on solar energy. The number of renewable energy incentives in states is positively correlated with public sentiment in model 4 (β = .005, p < .05), but this relationship is not statistically significant in the full model. In model 5, we find that solar energy is perceived more positively in states with a more mature solar market (β = .508, p < .01). This finding also holds in the full model (β = .373, p < .01). The results from model 6 suggest the average sentiment score is higher in states with higher electricity prices (β = .046, p < .05), but this association is not statistically significant in the full model. Lastly, solar radiation is not associated with public sentiment on solar energy. The full model has no multicollinearity issues (mean VIF = ).

| State | No. of Tweets | Score |
|---|---|---|
| District of Columbia | 2,274 | 9.077 |
| Utah | 773 | 8.771 |
| Connecticut | 879 | 8.692 |
| Arkansas | 207 | 8.502 |
| Hawaii | 402 | 8.483 |
| Washington | 2,116 | 8.426 |
| Massachusetts | 1,733 | 8.413 |
| California | 11,788 | 8.380 |
| Rhode Island | 127 | 8.346 |
| Minnesota | 1,414 | 8.260 |
| Colorado | 2,375 | 8.248 |
| New Mexico | 523 | 8.222 |
| New York | 5,650 | 8.191 |
| Indiana | 1,320 | 8.167 |
| Illinois | 2,171 | 8.125 |
| Vermont | 325 | 8.123 |
| Maryland | 1,021 | 8.100 |
| Virginia | 1,641 | 8.062 |
| Iowa | 469 | 8.038 |
| Nevada | 1,180 | 7.941 |
| Maine | 359 | 7.939 |
| Oregon | 1,466 | 7.906 |
| Delaware | 181 | 7.901 |
| New Jersey | 1,328 | 7.877 |
| Wisconsin | 906 | 7.859 |
| Pennsylvania | 1,554 | 7.780 |
| North Carolina | 1,794 | 7.547 |
| Michigan | 1,552 | 7.519 |
| New Hampshire | 301 | 7.409 |
| Ohio | 1,936 | 7.392 |
| Florida | 5,072 | 7.376 |
| South Dakota | 87 | 7.356 |
| North Dakota | 64 | 7.344 |
| West Virginia | 216 | 7.269 |
| Montana | 255 | 7.255 |
| Idaho | 333 | 7.237 |
| Arizona | 1,851 | 7.175 |
| Kansas | 631 | 7.147 |
| Nebraska | 255 | 7.137 |
| Missouri | 722 | 7.105 |
| Georgia | 1,473 | 7.101 |
| Texas | 5,989 | 7.058 |
| Wyoming | | 6.937 |
| Louisiana | 532 | 6.805 |
| Kentucky | 548 | 6.770 |
| Alaska | 213 | 6.761 |
| Tennessee | 1,025 | 6.488 |
| South Carolina | 589 | 6.435 |
| Oklahoma | 527 | 6.224 |
| Alabama | 738 | 5.745 |
| Mississippi | 266 | 4.962 |

Figure 3: Average sentiment score in each state (n=71,262; national average=7.835; error bars indicate 95% confidence intervals)
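Two points from the descriptive results can be illustrated with a minimal sketch: confidence intervals widen as the number of tweets shrinks, and an unweighted mean of state averages can differ from the tweet-weighted national average. The sketch assumes each tweet is scored 0 (negative) or 10 (positive) and uses a normal-approximation interval; the paper does not state how its intervals were computed, and all counts below are hypothetical, not the study's data.

```python
import math
from statistics import mean, stdev


def state_score(labels):
    """Return (score, ci_low, ci_high) on the 0-10 scale for one state.

    labels: list of 0/1 tweet labels (1 = positive toward solar energy).
    Assumption: each tweet contributes 0 or 10, and the 95% interval
    uses the normal approximation mean +/- 1.96 * sd / sqrt(n).
    """
    scores = [10 * y for y in labels]
    n = len(scores)
    avg = mean(scores)
    half_width = 1.96 * stdev(scores) / math.sqrt(n)
    return avg, avg - half_width, avg + half_width


# Hypothetical states with the same positive share but different tweet counts.
small = [1] * 48 + [0] * 16      # 64 tweets, 75% positive
large = [1] * 7500 + [0] * 2500  # 10,000 tweets, 75% positive

for name, labels in [("small state", small), ("large state", large)]:
    avg, lo, hi = state_score(labels)
    print(f"{name}: score={avg:.3f}, 95% CI=({lo:.3f}, {hi:.3f})")

# Unweighted mean of state averages vs. tweet-weighted (pooled) average.
states = {"A": [1] * 90 + [0] * 10, "B": [1] * 500 + [0] * 500}
state_avgs = [state_score(v)[0] for v in states.values()]
unweighted = mean(state_avgs)                              # every state counts equally
pooled = mean(10 * y for v in states.values() for y in v)  # every tweet counts equally
print(f"unweighted={unweighted:.3f}, pooled={pooled:.3f}")
```

In this toy setup the state with more tweets pulls the pooled average toward its own score, which is the same mechanism that separates the mean of state averages from the national average in the paper.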
Table 3
Public sentiment toward solar energy and renewable energy policy and market characteristics

| Variables | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
|---|---|---|---|---|---|---|---|---|
| Renewable generation | .006 (.006) | | | | | | | .005 (.005) |
| RPS | | .178*** (.058) | | | | | | .106*** (.029) |
| Net metering | | | .167*** (.048) | | | | | .115*** (.037) |
| Renewable incentives | | | | .005** (.002) | | | | .003 (.002) |
| Solar market maturity | | | | | .508*** (.129) | | | .373*** (.119) |
| Electricity price | | | | | | .046** (.021) | | .002 (.019) |
| Solar radiation | | | | | | | -.301 (.221) | -.293 (.156) |
| Number of states | 51 | 51 | 51 | 51 | 51 | 51 | 51 | 51 |
| R2 | .016 | .163 | .188 | .079 | .193 | .066 | .048 | .469 |

***p < **p < *p <
Robust standard errors are in parentheses.
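To make the structure of the bivariate models in Table 3 concrete, the following is a minimal sketch of a one-predictor OLS fit with a heteroskedasticity-robust standard error. The paper reports robust standard errors without naming the estimator, so the HC1 correction used here is an assumption, as are the function name `ols_robust` and the six hypothetical data points.

```python
import math


def ols_robust(x, y):
    """Bivariate OLS: return (slope, HC1 robust SE of the slope, R-squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Sandwich variance of the slope with the HC1 small-sample factor n / (n - 2):
    # Var(slope) = n / (n - 2) * sum((x_i - x_bar)^2 * e_i^2) / sxx^2
    meat = sum(((xi - x_bar) ** 2) * (e ** 2) for xi, e in zip(x, resid))
    se = math.sqrt(n / (n - 2) * meat / sxx ** 2)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - sum(e ** 2 for e in resid) / sst
    return slope, se, r2


# Hypothetical data: a policy index (x) and average sentiment score (y) for six states.
policy_index = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0]
sentiment = [6.9, 7.2, 7.4, 7.9, 7.8, 8.4]

slope, se, r2 = ols_robust(policy_index, sentiment)
print(f"slope={slope:.3f}, robust SE={se:.3f}, R2={r2:.3f}")
```

In a bivariate model, R2 is the share of variation in the outcome explained by the single predictor, which is how the 18.8% figure for net metering in model 3 is read.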
5. Discussion
This paper aimed to understand how sentiment toward solar energy varies across states, and how policy and market factors explain such sentiment. The results indicate that overall sentiment toward solar energy is positive but that substantial variation exists across states. As of early 2020, the Northeast region is the most supportive of solar energy, while the South region is the least supportive. Solar radiation, which determines the efficiency and performance of the solar energy generation system, does not explain geographical variation in public sentiment toward solar, although this may be a result of such radiation varying more widely in states with the largest land area. We do find, however, that energy policies and solar market maturity explain the variation in sentiment. This finding suggests that state energy policy programs could be effective measures to build public support for renewable energy.

We also detect temporal variation in sentiment over the 6-month period of Twitter data collection. As shown in Figure 4, events can potentially shift public opinion on solar energy. The sharp decline in sentiment score on March 23-25 may have been driven by the congressional discussion of including tax incentives for renewable energy in the Coronavirus Aid, Relief, and Economic Security (CARES) Act of 2020. To ensure the robustness of our results, we employed an alternative model excluding the tweets from March 23-25. As shown in Table A1, the main findings from the alternative model were consistent with the original model, except for the estimated effect of renewable incentives. In the alternative model, renewable incentives are statistically significant in explaining solar sentiment in both the bivariate and full models. The alternative model also explains more of the total variation (R2 = .499).

Our analysis finds that the proportion of energy generated from renewable sources does not relate to public sentiment toward solar energy, but the RPS targets appear to matter.
That is, public opinion on solar energy aligns more closely with state renewable energy policy goals than with current renewable energy generation. Although our analysis does not examine causal relationships, these findings suggest public opinion and policy may affect each other: public support for solar energy may impact RPS policy adoption and design, but at the same time, a progressive RPS policy may help build public support for solar energy. The design and framing of RPS policies may shape public support for renewable energy [59].

Our analysis also finds that net metering policy is a strong predictor of public opinion on solar energy. As net metering offers direct financial benefits to residential and commercial solar system owners, we suspect that state net metering policies influence public support for solar energy. Considering the impact of net metering rules and retail rates on consumer-side solar system deployment [12], specific design elements (e.g., compensation rate, capacity limitations, subscriber size limitations), as opposed to simply having net metering rules, may play an important role in fostering positive public perceptions of solar energy. For example, indefinite rollover of excess renewable generation credits and compensation at retail electricity prices provide more explicit incentives, which in turn may promote positive perception of solar energy.

Figure 4: Trend of public sentiment toward solar energy from January 12, 2020 to July 07, 2020

Solar market maturity, measured by the number of solar jobs per 1 million residents in the state population, is one of the most important predictors of public sentiment toward solar energy. This finding suggests public support of solar energy could positively influence solar energy market growth, and vice versa. Public acceptance of solar energy may be one of the factors that attracts solar businesses. At the same time, economies of scale in the solar market could enable solar firms to lower system installation costs and grid fees, generating "trickle down" effects that benefit solar energy customers and increase public support for solar energy.

This study has a few key limitations. First, the Twitter data we collected may not be fully representative of all U.S. residents due to demographic disparities between Twitter users and non-users [2]. Twitter users tend to be younger and more politically liberal than the general public [69], and at the same time, younger and more liberal populations tend to favor renewable energy development [20]. Therefore, our estimate of the average sentiment score (7.835) is likely to be inflated compared to the true sentiment score. Considering that sentiment classification can be highly dependent on the platform from which the training data are extracted, future research could address this limitation by incorporating data from multiple social media platforms. Second, our analysis captures public sentiment toward solar energy between late January and early July 2020, amid a global pandemic, which may limit the generalizability of the findings.
Although the number of tweets used to estimate the average sentiment score is over 70,000, and the findings are robust to inclusion/exclusion of outlier values from March 23-25, the sentiment score could be made less sensitive to such events if the data covered a longer period of time. In future research, we aim to collect data from multiple years and explore the sequential order of public sentiment and renewable energy policies to answer the question: Does public perception of solar energy precede renewable energy policies, or vice versa? Finally, although our language model achieves high accuracy (90.7%) with binary classification (positive or negative), it is not sufficient to capture fine-grained human emotions, such as happiness, joy, excitement, anger, sadness, frustration, fear, and sarcasm. As proposed by Abdar et al. [1], public preferences and attitudes on renewable energy can be gauged along multiple dimensions, including valence (positive versus negative) and arousal (high versus low level of activation). Future research should aim to capture multidimensional emotions, for example, by utilizing SentiBERT to capture compositional sentiment semantics [76], modeling irony and sarcasm detection [51], and/or explicitly modeling label semantics in guiding an encoder network [18].

6. Conclusions

This study aimed to understand public sentiment toward solar energy in the United States and examine the relationship between sentiment and renewable energy policy and market characteristics. We collected 71,262 tweets from late January to early July 2020 and analyzed the data using RoBERTa, a state-of-the-art language model. We find that public sentiment on solar energy varies widely across states. RPS and net metering policies and solar market maturity are important predictors of public sentiment toward solar energy. The main contributions of this study are as follows:

i. We examine public opinion and sentiment specific to solar energy, which the existing literature has not investigated separately as extensively.

ii. This study is one of few empirical studies using social media, ML, and NLP approaches to understand public opinion and sentiment regarding renewable energy. Applying RoBERTa, which is a state-of-the-art language model as of early 2020, we propose a new method to examine public opinion and sentiment toward solar energy. Our sentiment classification model achieves high (90.7%) accuracy by using manually annotated tweets, which are area-specific and of higher quality, to fine-tune RoBERTa.

iv. This study provides a comprehensive picture of the geographical variation in public sentiment regarding solar energy across states. We demonstrate that this variation is explained by state policy and market characteristics while refuting the theory that it can be explained by local solar radiation amounts.

v. Our analysis offers important policy implications, as we provide empirical evidence of the positive relationship between public sentiment toward solar energy and renewable energy policy and market characteristics. States that wish to gain public support for solar energy may need to consider implementing consumer-friendly net metering policies and supporting state solar market growth.

CRediT authorship contribution statement

Serena Y. Kim: Conceptualization of the study, Language modeling, Data collection, Statistical and spatial analysis, Writing - original draft. Koushik Ganesan: Neural network modeling, Language modeling, Writing - review and editing. Princess Dickens: Language modeling, Data curation, Writing - review and editing. Soumya Panda: Language modeling, Data curation, Writing - review.

Acknowledgement

The authors thank Heeyoung Kwon, Katharina Kann, and William Swann for helpful suggestions.
Appendix

Table A1
Public sentiment toward solar energy and renewable energy policy and market characteristics (excluding outlier values from March 23-25, 2020)

| Variables | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
|---|---|---|---|---|---|---|---|---|
| Renewable generation | .004 (.005) | | | | | | | .002 (.005) |
| RPS | | .146*** (.049) | | | | | | .082*** (.029) |
| Net metering | | | .120*** (.045) | | | | | .081** (.034) |
| Renewable incentives | | | | .005*** (.002) | | | | .003** (.002) |
| Solar market maturity | | | | | .457*** (.100) | | | .333** (.124) |
| Electricity price | | | | | | .045*** (.016) | | .011 (.015) |
| Solar radiation | | | | | | | -.165 (.178) | -.164 (.125) |
| Number of states | 51 | 51 | 51 | 51 | 51 | 51 | 51 | 51 |
| R2 | .013 | .177 | .158 | .106 | .249 | .099 | .048 | .499 |

*** p < p < *p <

References

& Microposts, pp. 34-41.
[10] Bunting, A., 2004. Opposition to wind power: Can it be a catalyst for improving public understanding of energy usage, in: Technologies, Publics and Power Conference, February, Citeseer. pp. 1-13.
[11] Bush, D., Hoagland, P., 2016. Public opinion and the environmental, economic and aesthetic impacts of offshore wind. Ocean & Coastal Management 120, 70-79.
[12] Comello, S., Reichelstein, S., 2017. Cost competitiveness of residential solar pv: The impact of net metering restrictions. Renewable and Sustainable Energy Reviews 75, 46-57.
[13] Darghouth, N.R., Wiser, R.H., Barbose, G., Mills, A.D., 2016. Net metering and market feedback loops: Exploring the impact of retail rate design on distributed pv deployment. Applied Energy 162, 713-722.
[14] Devlin, J., Chang, M.W., Lee, K., Toutanova, K., & Social Science 29, 72-83.
[17] Duong, V., Pham, P., Yang, T., Wang, Y., Luo, J., 2020. The ivory tower lost: How college students respond differently than the general public to the covid-19 pandemic. arXiv preprint arXiv:2004.09968.
[18] Gaonkar, R., Kwon, H., Bastan, M., Balasubramanian, N., Chambers, N., 2020. Modeling label semantics for predicting emotional reactions. arXiv preprint arXiv:2006.05489.
[19] Hagen, B., Pijawka, D., 2015. Public perceptions and support of renewable energy in north america in the context of global climate change. International Journal of Disaster Risk Science 6, 385-398.
[20] Hamilton, L.C., Hartter, J., Bell, E., 2019. Generation gaps in us public opinion on renewable energy and climate change. PLOS one 14, e0217608.
[21] Hasan, A., Moin, S., Karim, A., Shamshirband, S., 2018. Machine learning-based sentiment analysis for twitter accounts. Mathematical and Computational Applications 23, 11.
[22] Herche, W., 2017. Solar energy strategies in the us utility market. Renewable and Sustainable Energy Reviews 77, 590-595.
[23] Jain, A., Jain, V., A., Inkpen, D., 2006. Sentiment classification of movie reviews using contextual valence shifters. Computational intelligence 22, 110-125.
[28] Khairnar, J., Kinikar, M., 2013. Machine learning algorithms for opinion mining and sentiment classification. International Journal of Scientific and Research Publications 3, 1-6.
[29] Kim, J., Park, S.Y., Lee, J., 2018. Do people really want renewable energy? who wants renewable energy?: Discrete choice model of reference-dependent preference in south korea. Energy Policy 120, 761-770.
[30] Kim, J.E., Tang, T., 2020. Preventing early lock-in with technology-specific policy designs: The renewable portfolio standards and diversity in renewable energy technologies. Renewable and Sustainable Energy Reviews 123, 109738.
[31] Kim, S.Y., 2020. Institutional arrangements and airport solar pv. Energy Policy 143, 111536.
[32] Kingma, D.P., Ba, J.,
[34] Ladenburg, J., 2010. Attitudes towards offshore wind farms-the role of beach visits on attitude and demographic and attitude relations. Energy Policy 38, 1297-1304.
[35] Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., Soricut, R., 2019. Albert: A lite bert for self-supervised learning of language representations. arXiv preprint arXiv:1909.11942.
[36] Lazard, 2019. Levelized cost of energy and levelized cost of storage. URL: https://www.lazard.com/perspective/lcoe2019/.
[37] Li, R., Crowe, J., Leifer, D., Zou, L., Schoof, J., & Social Science 56, IO I Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, Lewis, M., Zettlemoyer, L., Stoyanov, V., 2019. Roberta: A robustly optimized bert pretraining approach. ArXiv abs/1907.11692. URL: https://arxiv.org/abs/1907.11692.
[39] Mathew, L., Bindu, V., 2020. A review of natural language processing techniques for sentiment analysis using pre-trained models, in: 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC), IEEE. pp. 340-345.
[40] Müller, M., Salathé, M., Kummervold, P.E., 2020. Covid-twitter-bert: A natural language processing model to analyse covid-19 content on twitter. arXiv preprint arXiv:2005.07503.
[41] Murphy, M., 2018. Preprocessing. Available from https://pypi.org/project/preprocessing/.
[42] NCCleanEnergyTechnologyCenter, 2019. Database of state incentives for renewables efficiency. URL: https://www.dsireusa.org. Accessed: 2020-04-05. NC Clean Energy Technology Center.
[43] Noblet, C.L., Teisl, M.F., Evans, K., Anderson, M.W., McCoy, S., Cervone, E., 2015. Public preferences for investments in renewable energy production and energy efficiency. Energy Policy 87, 177-186.
[44] Noll, D., Dawes, C., Rai, V., 2014. Solar community organizations and active peer effects in the adoption of residential pv. Energy policy 67, 330-343.
[45] NREL, 2019. The national solar radiation database: Solar irradiance data. Available from https://maps.nrel.gov/nsrdb-viewer. The National Renewable Energy Laboratory.
[46] Nuortimo, K., Härkönen, J., & Social Science 21, 167-179.
[48] Pak, A., Paroubek, P., 2010. Twitter based system: Using twitter for disambiguating sentiment ambiguous adjectives, in: Proceedings of the 5th International Workshop on Semantic Evaluation, pp. 436-439.
[49] Poria, S., Cambria, E., Gelbukh, A., 2016. Aspect extraction for opinion mining with a deep convolutional neural network. Knowledge-Based Systems 108, 42-49.
[50] Poria, S., Hazarika, D., Majumder, N., Mihalcea, R., 2020. Beneath the tip of the iceberg: Current challenges and new directions in sentiment analysis research. arXiv preprint arXiv:2005.00357.
[51] Potamias, R.A., Siolas, G., Stafylopatis, A.G., 2019. A transformer-based approach to irony and sarcasm detection. arXiv preprint arXiv:1911.10401.
[52] Qazi, A., Hussain, F., Rahim, N.A., Hardaker, G., Alghazzawi, D., Shaban, K., Haruna, K., M., Braga, A.C., 2018. Modelling perception and attitudes towards renewable energy technologies. Renewable Energy 122, 688--097.
[54] del Rio, P., Mir-Artigues, P., 2012. Support for solar pv deployment in spain: Some policy lessons. Renewable and Sustainable Energy Reviews 16, 5557-5566.
[55] Sanh, V., Debut, L., Chaumond, J., Wolf, T., A.J., Brun, S., 2015. Beyond the sun-socioeconomic drivers of the adoption of small-scale photovoltaic installations in germany. Energy Research & Social Science 10, 220-227.
[58] Schumacher, K., Krones, F., McKenna, R., Schultmann, F., 2019. Public acceptance of renewable energies and energy autonomy: A comparative study in the french, german and swiss upper rhine region. Energy policy 126, 315-332.
[59] Stokes, L.C., Warshaw, C., 2017. Renewable energy policy design and framing influence public support in the united states. Nature Energy 2, 1-6.
[60] Sütterlin, B., Siegrist, M., 2017. Public acceptance of renewable energy technologies from an abstract versus concrete perspective and the positive imagery of solar power. Energy Policy 106, 356-366.
[61] Teisl, M.F., McCoy, S., Marrinan, S., Noblet, C.L., Johnson, T., Wibberly, M., Roper, R., Klein, S., 2015. Will offshore energy face "fair winds and following seas"?: Understanding the factors influencing offshore wind acceptance. Estuaries and coasts 38, 279-286.
[62] TheSolarFoundation, 2020. National solar jobs census 2019. Available from https://www.SolarJobsCensus.org. Created: 2020-02-01, Accessed: 2020-04-15.
[63] Twitter, . Tweepy. Available from https://www.tweepy.org/.
[64] U.S.EnergyInformationAdministration, 2019. Electricity generation by energy source. Available from https://www.eia.gov/electricity/monthly/. Created: 2020-04-01, Accessed: 2020-04-15.
[65] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I., . Attention is all you need, in: Proceedings of the 31st International Conference on Neural Information Processing Systems.
[66] Von Borgstede, C., Andersson, M., Johnsson, F., 2013. Public attitudes to climate change and carbon mitigation-implications for energy associated behaviours. Energy Policy 57, 182-193.
[67] Wan, Y., Gao, Q., 2015. An ensemble sentiment classification system of twitter data for airline services analysis, in: 2015 IEEE international conference on data mining workshop (ICDMW), IEEE. pp. 1318-1325.
[68] Wang, W., Wu, J., pewresearch.org.
[70] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., Brew, J., 2019. Hugging face's transformers: State-of-the-art natural language processing. ArXiv URL: https://arxiv.org/abs/
[71] Wolsink, M., 2007. Wind power implementation: the nature of public attitudes: equity and fairness instead of 'backyard motives'. Renewable and sustainable energy reviews 11, 1188-1207.
[72] WorldPublicOpinion, 2007. World publics strongly favor requiring more wind and solar energy, more efficiency, even if it increases costs. Available from http://worldpublicopinion.net.
[73] Wyllie, Essah, E.A., Ofetotse, E.L., J., mining: Identifying negative sentiment about hpv vaccines on twitter.