Can Google searches help nowcast and forecast unemployment rates in the Visegrad Group countries?

Abstract

Online activity of the Internet users has been repeatedly shown to provide a rich information set for various research fields. We focus on the job-related searches on Google and their possible usefulness in the region of the Visegrad Group -- the Czech Republic, Hungary, Poland and Slovakia. Even for rather small economies, the online searches of their inhabitants can be successfully utilized for macroeconomic predictions. Specifically, we study the unemployment rates and their interconnection to the job-related searches. We show that the Google searches strongly enhance both nowcasting and forecasting models of the unemployment rates.


arXiv [q-fin.EC]

Jaroslav Pavlicek and Ladislav Kristoufek*

* E-mail: [email protected]


Introduction

Online activity has become an inherent part of modern society and of the way of life of its members. The Internet provides a vast amount of information to its users as well as aid and assistance in times of need. During the current financial crisis and the following economic and production crises, most of the developed as well as developing economies have been hit by an economic downturn, which is usually tightly connected with growing unemployment. Job loss can be a very traumatizing experience with a long-lasting impact on one's life. Seeking a new job then becomes an integral part of everyday life. In the current digital era, job seeking is not restricted to job offices; the seekers (as well as potential employers) more frequently turn to the Internet as a source of information and new possibilities. As such, the job seekers leave a digital track of their activity.

Analysis and examination of various patterns of online activity have become a fruitful branch of research in recent years, with some exciting applications such as elections [1], investment allocation [2, 3], private consumption [4] and consumers' behavior [5], future orientation [6], earnings announcements [7], spreading of diseases [8–12], and economics and finance [13–19]. Turning back to unemployment and its possible examination using the online activity of Internet users, some research has been done in this area as well, focusing primarily on Google search queries. The first study of the possible connection between Google search activity and unemployment rates, which examines the series in Germany, shows the usefulness of adding search query data to the models [20]. Subsequent research [21–23] analyzes the connection between the queries and claims for unemployment benefits in the USA, and the unemployment rate itself has been studied as well [24, 25]. Even a job search activity index based on the Google search data has been developed [26]. Most of these studies focus on the US economy and its modeling, while the other economies are studied rather marginally [27, 28].

Here, we focus on the possible connection between job-related search queries on the Google search engine and the unemployment rate in the countries of the so-called Visegrad Group (the Czech Republic, Hungary, Poland and Slovakia). Our contributions lie in the following. First, we focus on a set of countries which would normally be treated as marginal and thus not much studied. However, if the utility of the online search activity (and specifically of Google searching) is to be claimed, its efficiency should be shown not only for developed and well-covered countries but also for the smaller ones, and the results might prove useful to policy makers even in such regions. Second, we provide a careful, step-by-step procedure for unemployment modeling, focusing not only on simple correlations but also on nowcasting, forecasting and causality. And third, we deliver a cross-country comparison, which is rather unique among comparable studies, as these focus primarily on one specific country.

Results

The unemployment rates have undergone quite heterogeneous evolution in the analyzed countries (Fig. 1). In the Czech Republic, the rate ranged between 4% and 9% between 2004 and 2013. Initially, there was a significant downward trend from 2004 to 2008, when the rate dropped from 9% to 4%. As the recession hit the Czech Republic in 2008, the rate started to rise and reached its new maximum of 8.5% in 2010. Since then, the unemployment rate has fluctuated between 7% and 8.5%. The Hungarian unemployment rate was steadily rising from 2004 to 2010, when it reached its new maximum of nearly 12%. After that, the rate fluctuated between 10.5% and 12% for almost 3 years and started declining in 2013. The unemployment in Poland experienced a steady decline from the astronomical rate of nearly 22% in 2004 to 6% in 2009. However, as the recession hit Poland, the unemployment rate began rising again. With some minor fluctuations, it smoothly increased to the current level of approximately 10%. And in Slovakia, the unemployment rate seems to follow a pattern similar to that of the Czech Republic, although on a different scale. In 2004, Slovakia had an unemployment rate of almost 20%. This rate decreased linearly to 8% in 2009. With the hit of the recession, the unemployment rate quickly escalated to 16%, around which it has been fluctuating until today.

The evolution of the Google searches is illustrated in Fig. 2. There are evident seasonal patterns in all four series. Hungary is characterized by a quite regularly increasing trend in the Google searches, whereas Slovakia shows the opposite, and the remaining two analyzed series remain quite stable in time. Even though some connection between the Google searches and the unemployment rates for the Czech Republic and Hungary seems visible to the naked eye, we can hardly claim any relationship without a proper analysis.

Basic relationship

As the initial step, we present the results of the stationarity tests, which tell us whether we should analyze the original series or some of their transformations. In Tab. 1, we show the results of the ADF and KPSS tests (see the Methods section for more details) for the original as well as the logarithmic series and their first differences. The outcome is quite straightforward, as we do not reject unit roots for either of the original series (or for the logarithmic transformation of the Google searches; we do not examine the logarithmic transformation of the unemployment time series as these are already in the percentage representation). Further testing, which is not reported here, shows no cointegration relationship between the unemployment and the search queries series, so that we need to proceed with the first differences of the series. For most of the cases, we support stationarity of the first differences. In the analysis, we further proceed with the first differences of the unemployment rate and the first logarithmic differences of the Google searches. We opt for this combination as the pair of percentage representation and logarithmic transformation allows for a straightforward interpretation as an elasticity, i.e. as a proportional relationship.

For the very basic relationship between the unemployment rate and the intensity of the job-related searches on Google, we study the following equation

$$\Delta UR_t = \alpha_0 + \alpha_1 \Delta \log(GI)_t + \varepsilon_t \tag{1}$$

where $\Delta UR_t$ and $\Delta \log(GI)_t$ stand for the first difference of an unemployment rate at time $t$ and the first logarithmic difference of the Google searches at time $t$, respectively, for a given country, and $\varepsilon_t$ is an error term.

The elasticity between the Google searches and the unemployment rate from Eq. 1 is estimated at 0.5538 (with a p-value of 0.0533), 0.2056 (0.0726), 0.3317 (0.2163) and 0.4630 (0.0062) for the Czech Republic, Hungary, Poland and Slovakia, respectively, with heteroskedasticity and autocorrelation consistent (HAC) standard errors.
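As a rough illustration of how the point estimate of $\alpha_1$ in Eq. 1 can be obtained, the following minimal sketch differences the two series and fits a simple least-squares line. The function names and the numbers are hypothetical and made up for illustration; they are not the paper's data, and the HAC standard errors used in the text are not computed here.

```python
import math


def first_difference(series):
    """Return x_t - x_{t-1} for consecutive observations."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def log_difference(series):
    """Return log(x_t) - log(x_{t-1}); all values must be positive."""
    return [math.log(curr) - math.log(prev) for prev, curr in zip(series, series[1:])]


def ols_simple(x, y):
    """Least-squares fit of y = a0 + a1 * x; returns (a0, a1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up monthly values, for illustration only (not the paper's data).
unemployment_rate = [7.0, 7.2, 7.1, 7.5, 7.4, 7.8]   # in %
google_index = [50.0, 55.0, 52.0, 60.0, 57.0, 66.0]  # search intensity index

# Regress the differenced unemployment rate on the log-differenced index.
alpha_0, alpha_1 = ols_simple(log_difference(google_index),
                              first_difference(unemployment_rate))
print(f"alpha_0 = {alpha_0:.4f}, alpha_1 = {alpha_1:.4f}")
```

The slope `alpha_1` plays the role of the elasticity in Eq. 1; inference on it would additionally require HAC standard errors, which an econometrics library would normally supply.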
The proportional relationship thus varies across the analyzed countries, but it remains positive for all four and statistically significant for three out of four (at least at the 10% significance level). Specifically, the relationship is very strong for the Czech Republic and Slovakia, with the value around 0.5. This shows that the changes in the unemployment rate are well projected into the online search queries for vacancies and job-related terms. Studying the connection between these two variables thus seems promising and worth further utilization and investigation.

Nowcasting

Macroeconomic time series, such as the unemployment rates, have a special property which is not present for financial series or other series in natural sciences: they are available with a pronounced lag. This is due to the data processing and collection, which usually take several months, and even after such a period, there are sometimes corrections to the reported values. Such a characteristic makes a series which is available immediately without any lag and which is strongly correlated with the variable of interest very useful for forecasting the present value of the variable without waiting for several months. Such forecasting of the present is usually referred to as “nowcasting”.

In the previous section, we have shown that the Google searches for job-related terms are significantly correlated with the unemployment rate, which makes the search queries potentially useful for nowcasting of the unemployment. As a nowcasting model, we consider the following one

$$\Delta UR_t = \beta_0 + \sum_{i=3}^{12} \beta_i \Delta UR_{t-i} + \sum_{j=0}^{12} \gamma_j \Delta \log(GI)_{t-j} + \varepsilon_t \tag{2}$$

where the unemployment rate is assumed to be available with a three-month lag. We again consider the differenced series due to the stationarity issues discussed above. Both series are kept up to the lag of 12 months, which controls for the seasonal pattern in both series.

The results of the nowcasting models are summarized in Tab. 2. There we show the adjusted $R^2$ ($\bar{R}^2$) as a measure of the models' quality controlling for the number of explanatory variables. We observe that for all countries, the inclusion of the Google series enhances the model strongly. The $\bar{R}^2$ increases by approximately a third for all countries but Poland, for which it increases slightly less. Nonetheless, inclusion of the search queries improves the model significantly for all countries, as reported by the $F$-statistics for the joint insignificance of the searches. All series are jointly significant even at the 1% level.

Forecasting & Causality

The nowcasting results are very promising and they illustrate the usefulness of the Google searches series. However, we are also interested in whether such usefulness is mainly due to the unavailability of the unemployment data or whether the search queries data provide additional informative value as well. To find out, we also carry out a standard forecasting exercise where we hypothesize what would happen were the unemployment data available straightaway. If the Google series improve even such a hypothetical model, we conclude that the search queries data bring additional information to the model in addition to being strongly correlated with the changes in the unemployment rate itself.

For the forecasting exercise, we utilize the standard vector autoregressive model (VAR, see the Methods section for more details). The specific model takes the following form

$$\begin{aligned}
\Delta UR_t &= \beta_{1,0} + \sum_{i=1}^{p} \beta_{1,i} \Delta UR_{t-i} + \sum_{i=1}^{p} \gamma_{1,i} \Delta \log(GI)_{t-i} + \varepsilon_{1,t} \\
\Delta \log(GI)_t &= \beta_{2,0} + \sum_{i=1}^{p} \beta_{2,i} \Delta UR_{t-i} + \sum_{i=1}^{p} \gamma_{2,i} \Delta \log(GI)_{t-i} + \varepsilon_{2,t}
\end{aligned} \tag{3}$$

and it is compared to a simple autoregressive model of unemployment

$$\Delta UR_t = \delta_0 + \sum_{i=1}^{p} \delta_i \Delta UR_{t-i} + \nu_t \tag{4}$$

with $p$ denoting the lag order in both models.

For comparison purposes, we use two measures of the forecasting quality: root mean squared error and mean absolute error (RMSE and MAE, respectively, see the Methods section for more details). These measures are very straightforward: the lower they are, the better the model performs. In addition, we utilize the Diebold-Mariano test [29], which compares the forecasting performance of two models with the null hypothesis of the models performing the same (see the Methods section for more details). The model is estimated on the series between January 2004 and December 2012 and the forecasting period is set between January and December 2013.

The summary of the forecasting performances is given in Tab. 3. There we can see that for all countries, the forecasting performance of the models increases strongly with the addition of the Google searches. This is further supported by the results of the Diebold-Mariano test, which gives significant results, i.e. the model using the Google data outperforms the one without them, for all countries at least at the 5% significance level. The online search data thus evidently provide additional informative value to the unemployment modeling.

As the last step of the analysis, we provide a causality examination. We are thus interested in the specific relationship between the two analyzed series. Concretely, we examine whether increasing unemployment causes people to look up the job-related terms more, or whether the increased online activity signals potential tensions on the job market, or both, or neither. To do so, we utilize the Granger causality framework (see the Methods section for more details), which is built on the VAR analysis. The results are summarized in Tab. 3. Note that the null hypothesis of the Granger causality test is “no Granger causality”. Therefore, if the null hypothesis is rejected, the causality is claimed to be found. The findings are quite homogeneous. For three out of four countries (Hungary being the exception), we report causality in both directions. The influence thus goes in both directions and the series strongly influence each other.

Discussion

Online activity of the Internet users has been proven useful in various fields. Nowcasting the unemployment rate is one of these fields. Contrary to the prevailing trend in the literature focusing on the well-developed (Western) countries, we have focused on utilizing the job-related Google searches in the Visegrad Group countries, i.e. the Czech Republic, Hungary, Poland and Slovakia. Even though the data availability and utilization of the Internet might not be as widespread in the region as one would expect for the developed countries, we have shown that the online searches in fact provide a very strong basis for the unemployment modeling.

In summary, we have shown that the basic dynamics of the Google searches for the job-related terms closely follows that of the unemployment rates. Further, we have utilized this idea to successfully nowcast the unemployment rates using the current and lagged values of the Google searches. Such results have been shown to be caused not simply by the fact that the unemployment rates are not immediately available but also by the additional informative value of the online searches. Our findings indicate that the information left online by the Internet users can be easily utilized even for small or medium countries such as the ones of the Visegrad Group.

Methods

Data

The monthly unemployment data for the Czech Republic, Hungary, Poland and Slovakia have been obtained from the Eurostat database. The basis of unemployment measurement among the EU countries lies in the EU Labour Force Survey (EU LFS), a continuous and harmonized household survey which is, in accordance with the EU legislation, carried out in each member state. The monthly data from Eurostat are estimates based on the results of the EU LFS. Since there are no legal obligations for the EU countries to deliver monthly data, these data are often interpolated/extrapolated using national survey or registered unemployment data.

According to Eurostat, an unemployed person is defined as someone aged between 15 and 74 without work during the reference week who is available to start working within two weeks and who has actively sought employment at some time during the last four weeks. In our analysis, we use the general (both sexes, 15-74 years old) raw (not seasonally adjusted) unemployment rate. We do this since we do not know the method used for the seasonal adjustment and the Google data are not seasonally adjusted either.

The Google search queries data have been downloaded from the Google Trends webpage. As the languages of the studied countries differ, we have looked for various terms. As Czech, Polish and Slovak are all Slavonic languages, the searched words are very similar or even the same. For Czech, we searched for “práce”, for Polish “praca” and for Slovak “práce” as well. For Hungarian, we used the term “állás”. For the Slavonic languages, the terms are equivalent to “job” or “work”, and for Hungarian, it is close to “job” or “work” but rather in the sense of looking for it. The term “állás” provides better results than the more straightforward “munka”, which would be closer to the more standard meanings of the terms “job” or “work”.

The weekly series obtained from the Google Trends site have been transformed to monthly series on the basis of the number of days of each week falling into the given month. All series, both of the unemployment rate and the Google searches, are studied between January 2004 and December 2013.
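One possible reading of the weekly-to-monthly transformation described above is sketched below. It assumes that each weekly value applies equally to all seven days starting at the week's first day and that the monthly value is the average over the covered days of that month; both this weighting scheme and the function name `weekly_to_monthly` are assumptions made for illustration, as the text does not spell out the exact procedure.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_to_monthly(weekly):
    """Convert (week_start, value) pairs to day-weighted monthly means.

    Each weekly value is assigned to the 7 days beginning at week_start;
    the result maps (year, month) to the mean over the covered days.
    """
    totals = defaultdict(float)
    day_counts = defaultdict(int)
    for week_start, value in weekly:
        for offset in range(7):
            day = week_start + timedelta(days=offset)
            key = (day.year, day.month)
            totals[key] += value
            day_counts[key] += 1
    return {key: totals[key] / day_counts[key] for key in sorted(totals)}


# Made-up weekly index values, for illustration only.
weekly_index = [
    (date(2004, 1, 28), 70.0),  # 4 days in January, 3 days in February
    (date(2004, 2, 4), 35.0),   # 7 days in February
]
monthly_index = weekly_to_monthly(weekly_index)
for (year, month), value in monthly_index.items():
    print(year, month, round(value, 2))
```

A week that straddles two months thus contributes to both, in proportion to the number of its days in each month. Months that are only partly covered by the input (January in this toy example) are averaged over the covered days only.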

Stationarity

A stochastic process $\{z_t\}$ is stationary if for every collection of time indices $1 \leq t_1 < t_2 < \dots < t_m$, the joint probability distribution of $(z_{t_1}, z_{t_2}, \dots, z_{t_m})$ is the same as the joint probability distribution of $(z_{t_1+h}, z_{t_2+h}, \dots, z_{t_m+h})$ for all integers $h \geq 1$ [30]. To test for a unit root, we use the augmented Dickey-Fuller (ADF) test [31], which is based on the regression

$$\Delta z_t = \alpha + \theta z_{t-1} + \gamma t + \sum_{i=1}^{p} \phi_i \Delta z_{t-i} + \varepsilon_t \tag{5}$$

in order to perform the test, where $\alpha$ and $\gamma t$ are an intercept and a time trend, respectively, $\phi_i$ are the coefficients of the lagged differences and $p$ represents the lag order. The null hypothesis under which the series contains a unit root is

$$H_0: \theta = 0$$

against the alternative

$$H_A: \theta < 0.$$

The ADF test statistic is then computed as the usual $t$-statistic, which, however, follows a more complicated distribution under the null hypothesis. Due to the relatively short time series, we set the number of lags arbitrarily to three.

The null hypothesis of the KPSS test [32] is opposite to the one of the ADF test, i.e. the KPSS test has the null hypothesis of stationarity. The test is based on the OLS regression of the series $\{z_t\}$

$$z_t = \alpha + \gamma t + k \sum_{i=0}^{t} \xi_i + \varepsilon_t \tag{6}$$

where $\alpha$ and $\gamma t$ again represent an intercept and a time trend, respectively, and $\xi_i$ are independent and identically distributed random variables with a zero mean and a unit variance. The null hypothesis of stationarity is

$$H_0: k = 0$$

against the alternative

$$H_A: k \neq 0.$$

The KPSS test statistic is defined as

$$KPSS = \frac{\sum_{t=1}^{n} S_t^2}{n^2 \hat{\omega}_T^2}$$

where $S_t$ is the partial sum of residuals, $S_t = \sum_{i=1}^{t} \hat{\varepsilon}_i$, and $\hat{\omega}_T^2$ is an estimator of the spectral density at frequency zero.

Vector autoregression

Vector autoregression (VAR) is simply a system of temporally dependent series. More precisely, denote the number of variables $k$ and the length of the series $T$; then a VAR of order $p$ is generally represented by the equation

$$y_t = \alpha + A_1 y_{t-1} + A_2 y_{t-2} + \dots + A_p y_{t-p} + \varepsilon_t \tag{7}$$

where $y_t$ and $\varepsilon_t$ are $k \times T$ matrices representing the studied series and residuals, respectively, $\alpha$ represents a vector of constants and $A_i$ are time-invariant matrices replacing the traditional $\beta_i$ coefficients. The selection of an appropriate lag order $p$ is usually based on a specific information criterion.

In the VAR framework, the Granger causality concept is usually used as well. The causality testing simply consists in testing the joint significance of one of the variables in the equation for some other variable. The testing procedure is thus an $F$-test for joint significance of a specific variable. It needs to be noted that such causality is strictly statistical and it should always be treated with caution.

Forecasting

To compare the forecasting accuracy of the proposed models, we utilize three measures: mean absolute error (MAE), root mean squared error (RMSE) and the Diebold-Mariano test [29].

MAE measures the average value of absolute losses. In other words, it gives the average deviation of the forecast from the realized value in absolute terms. It is given by the equation

$$MAE = \frac{1}{T} \sum_{i=1}^{T} |f_i - y_i| = \frac{1}{T} \sum_{i=1}^{T} a_i \tag{8}$$

where $f_i$ stands for the predicted value, $y_i$ is the actual value and $a_i = |f_i - y_i|$.

RMSE is quite similar to the mean absolute error as it is simply a square root of the mean squared error, and it is defined as

$$RMSE = \sqrt{\frac{1}{T} \sum_{i=1}^{T} (f_i - y_i)^2} = \sqrt{\frac{1}{T} \sum_{i=1}^{T} s_i} \tag{9}$$

where $f_i$ stands for the predicted value, $y_i$ is the actual value and $s_i = (f_i - y_i)^2$.

Diebold & Mariano [29] propose a test to compare the predictive accuracy of two competing forecasts. Let $\{\varepsilon^1_t\}_{t=1}^{T}$ and $\{\varepsilon^2_t\}_{t=1}^{T}$ be the sequences of forecast-error losses of two competing forecasts, measured by a particular loss function (e.g. the absolute error loss $a_i$ in Eq. 8 or the squared error loss $s_i$ in Eq. 9). The null and alternative hypotheses are then stated as

$$H_0: E\{\varepsilon^1_t\} = E\{\varepsilon^2_t\}, \qquad H_A: E\{\varepsilon^1_t\} \neq E\{\varepsilon^2_t\}.$$

The Diebold-Mariano test assesses the accuracy based on the loss differential $d_t = \varepsilon^1_t - \varepsilon^2_t$ and the underlying null $H_0: E\{d_t\} = 0$. The test statistic is

$$S = \frac{\bar{d}}{\sqrt{\widehat{LRV}_{\bar{d}} / T}}$$

where $\bar{d}$ is the mean loss differential,

$$LRV_{\bar{d}} = \gamma_0 + 2 \sum_{j=1}^{\infty} \gamma_j, \qquad \gamma_j = \mathrm{cov}(d_t, d_{t-j}),$$

and $\widehat{LRV}_{\bar{d}}$ is a consistent estimate of the asymptotic (long-run) variance of $\sqrt{T} \bar{d}$. Under the null hypothesis, the test statistic converges to a standard normal distribution, so that $S \overset{A}{\sim} N(0, 1)$ [29].
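The three measures above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions rather than the implementation used for the reported results: the long-run variance is estimated with a simple truncated sum of autocovariances up to a user-chosen `max_lag` (an assumption; other consistent estimators are possible), the p-value is two-sided to match the hypotheses stated above, and all function names and numbers are hypothetical.

```python
import math


def mae(forecast, actual):
    """Mean absolute error, Eq. 8."""
    return sum(abs(f - y) for f, y in zip(forecast, actual)) / len(actual)


def rmse(forecast, actual):
    """Root mean squared error, Eq. 9."""
    return math.sqrt(sum((f - y) ** 2 for f, y in zip(forecast, actual)) / len(actual))


def autocovariance(d, lag):
    """Sample autocovariance of d at the given lag (divisor T)."""
    n = len(d)
    mean = sum(d) / n
    return sum((d[t] - mean) * (d[t - lag] - mean) for t in range(lag, n)) / n


def diebold_mariano(loss_1, loss_2, max_lag=0):
    """Return (S, two-sided p-value) for the loss differential d = loss_1 - loss_2.

    Assumption: the long-run variance is a truncated sum of autocovariances
    up to max_lag; this estimate can be non-positive in small samples.
    """
    d = [a - b for a, b in zip(loss_1, loss_2)]
    n = len(d)
    d_bar = sum(d) / n
    lrv = autocovariance(d, 0) + 2.0 * sum(autocovariance(d, j) for j in range(1, max_lag + 1))
    if lrv <= 0.0:
        raise ValueError("non-positive long-run variance estimate")
    stat = d_bar / math.sqrt(lrv / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))  # 2 * (1 - Phi(|S|))
    return stat, p_value


# Made-up forecasts, for illustration only (not the paper's data).
actual = [7.9, 8.1, 8.0, 8.3, 8.2, 8.4]
forecast_ar = [7.6, 8.4, 7.7, 8.0, 8.6, 8.1]      # model without Google data
forecast_google = [7.8, 8.2, 7.9, 8.2, 8.3, 8.3]  # model with Google data

sq_loss_ar = [(f - y) ** 2 for f, y in zip(forecast_ar, actual)]
sq_loss_google = [(f - y) ** 2 for f, y in zip(forecast_google, actual)]
print("RMSE:", rmse(forecast_ar, actual), rmse(forecast_google, actual))
print("MAE: ", mae(forecast_ar, actual), mae(forecast_google, actual))
print("DM:  ", diebold_mariano(sq_loss_ar, sq_loss_google))
```

With the differential defined as the loss of the first forecast minus the loss of the second, a positive statistic indicates that the second forecast has the lower average loss.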

Acknowledgments

The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement No. FP7-SSH-612955 (FinMaP) and the Czech Science Foundation project No. P402/12/G097 “DYME – Dynamic Models in Economics”.

References

1. Metaxas PT, Mustafaraj E (2012) Social media and the elections. Science 338: 472-473.
2. Mondria J, Wu T, Zhang Y (2010) The determinants of international investment and attention allocation: Using internet search query data. Journal of International Economics 82: 85-95.
3. Kristoufek L (2013a) Can google trends search queries contribute to risk diversification? Sci Rep 3.
4. Vosen S, Schmidt T (2011) Forecasting private consumption: survey-based indicators vs. google trends. Journal of Forecasting 30: 565-578.
5. Goel S, Hofman J, Lahaie S, Pennock DM, Watts DJ (2010) Predicting consumer behavior with Web search. Proceedings of the National Academy of Sciences of the United States of America 107: 17486-17490.
6. Preis T, Moat HS, Stanley HE, Bishop SR (2012) Quantifying the advantage of looking forward. Scientific Reports 2: 350.
7. Drake MS, Roulstone DT, Thornock JR (2012) Investor information demand: Evidence from google searches around earnings announcements. Journal of Accounting Research 50: 1001-1040.
8. Polgreen PM, Chen Y, Pennock DM, Nelson FD, Weinstein RA (2008) Using internet searches for influenza surveillance. Clinical Infectious Diseases 47: 1443-1448.
9. Ginsberg J, Mohebbi MH, Patel RS, Brammer L, Smolinski MS, et al. (2009) Detecting influenza epidemics using search engine query data. Nature 457: 1012-1014.
10. Carneiro H, Mylonakis E (2009) Google Trends: A Web-Based Tool for Real-Time Surveillance of Disease Outbreaks. Clinical Infectious Diseases 49: 1557-64.
11. Seifter A, Schwarzwalder A, Geis K, Aucott J (2010) The utility of “Google Trends” for epidemiological research: Lyme disease as an example. Geospatial Health 4: 135-137.
12. Dugas A, Hsieh YH, Levin S, Pines J, Mareiniss D, et al. (2012) Google Flu Trends: Correlation With Emergency Department Influenza Rates and Crowding Metrics. Clinical Infectious Diseases 54: 463-469.
13. Preis T, Reith D, Stanley HE (2010) Complex dynamics of our economic life on different scales: insights from search engine query data. Philosophical Transactions of the Royal Society A 368: 5707-5719.
14. Choi H, Varian HR (2012) Predicting the present with google trends. The Economic Record 88: 2-9.
15. Bordino I, Battiston S, Caldarelli G, Cristelli M, Ukkonen A, et al. (2012) Web search queries can predict stock market volumes. PLoS One 7: e40014.
16. Kristoufek L (2013b) Bitcoin meets google trends and wikipedia: Quantifying the relationship between phenomena of the internet era. Sci Rep 3: 3415.
17. Preis T, Moat HS, Stanley HE (2013) Quantifying trading behavior in financial markets using google trends. Scientific Reports 3: 1684.
18. Moat HS, Curme C, Avakian A, Kenett DY, Stanley HE, et al. (2013) Quantifying wikipedia usage patterns before stock market moves. Scientific Reports 3: 1801.
19. Curme C, Preis T, Stanley H, Moat H (2014) Quantifying the semantics of search behavior before stock market moves. Proceedings of the National Academy of Sciences doi:10.1073/pnas.1324054111.
20. Askitas N, Zimmermann KF (2009) Google econometrics and unemployment forecasting. Applied Economics Quarterly 55: 107-120.
21. Choi H, Varian HR (2009b) Predicting initial claims for unemployment benefits. Technical report, Google.
22. Bughin JR (2011) 'Nowcasting' the Belgian economy. Working papers series, Université Libre de Bruxelles (ULB) - European Center for Advanced Research in Economics and Statistics (ECORE); McKinsey & Company.
23. Scott SL, Varian HR (2014) Predicting the present with bayesian structural time series. Int J of Mathematical Modelling and Numerical Optimisation 5: 4-23.
24. D'Amuri F, Marcucci J (2010) Google it! Forecasting the US Unemployment Rate with a Google Job Search index. Working Papers 2010.31, Fondazione Eni Enrico Mattei.
25. D'Amuri F, Marcucci J (2012) The predictive power of Google searches in forecasting unemployment. Temi di discussione (Economic working papers) 891, Bank of Italy, Economic Research and International Relations Area.
26. Baker S, Fradkin A (2011) What Drives Job Search? Evidence from Google Search Data. Discussion Papers 10-020, Stanford Institute for Economic Policy Research.
27. Chadwick MG, Sengul G (2012) Nowcasting Unemployment Rate in Turkey: Let's Ask Google. Working Papers 1218, Research and Monetary Policy Department, Central Bank of the Republic of Turkey.
28. Karam F, Fondeur Y (2012) Can Google Data Help Predict French Youth Unemployment? Documents de recherche 12-03, Centre d'Études des Politiques Économiques (EPEE), Université d'Evry Val d'Essonne.
29. Diebold FX, Mariano RS (1995) Comparing Predictive Accuracy. Journal of Business & Economic Statistics 13: 253-63.
30. Wooldridge J (2008) Introductory Econometrics: A Modern Approach. ISE - International Student Edition. Cengage Learning.
31. Dickey DA, Fuller WA (1979) Distribution of the estimators for autoregressive time series with a unit root. Journal of the American Statistical Association 74: 427-431.
32. Kwiatkowski D, Phillips PC, Schmidt P, Shin Y (1992) Testing the null hypothesis of stationarity against the alternative of a unit root: How sure are we that economic time series have a unit root? Journal of Econometrics 54: 159-178.

Figures

[Figure 1: line plot; vertical axis: unemployment rate in %; series: Czech Republic, Hungary, Poland, Slovakia]

Figure 1. Unemployment rate in the Visegrad countries. The group of countries is evidently quite heterogeneous in the unemployment rates. The Hungarian rate starts at the lowest level but increases steadily during the whole period. The Czech rate begins at quite low levels and decreases up to the outbreak of the financial crisis, when the rate surges until 2010, after which it remains quite stable. The Polish and Slovakian rates commence at very high levels of unemployment, which go down until the outbreak of the crisis, after which they change the trends similarly to the Czech rate.

[Figure 2: line plot; vertical axis: Google index (20 to 100); horizontal axis: years 2004 to 2014; series: Czech Republic, Hungary, Poland, Slovakia]

Figure 2. Google search queries for the job-related terms in the Visegrad countries. The patterns are again quite heterogeneous and the connection between the Google searches and the unemployment rates can be observed for the Czech and Hungarian rates. For the other two, the connection is not visible to the naked eye. Detailed treatment of the interconnections is given in the Results section of the text.

Tables

Table 1. Stationarity testing

| | Czech Rep. | Hungary | Poland | Slovakia |
|---|---|---|---|---|
| ADF test | | | | |
| Unemployment | -1.6066 | -1.78611 | -2.6739* | -2.6438* |
| - first difference | -5.3860*** | -4.5134*** | -3.5267*** | -4.2349*** |
| Google | -2.6399* | -1.76434 | -2.1745 | -1.4327 |
| - logarithm | -2.6504* | -2.0239 | -2.3280 | -0.9082 |
| - difference | -11.2221*** | -10.2869*** | -11.1560*** | -10.5487*** |
| - logarithmic difference | -11.3131*** | -10.7993*** | -11.0750*** | -10.5391*** |
| KPSS test | | | | |
| Unemployment | 0.5399** | *** | *** | *** |
| - first difference | 0.1932 | 0.2848 | 0.6708** | ** |
| Google | 0.4208** | *** | *** | *** |
| - logarithm | 0.4048* | *** | *** | *** |
| - difference | 0.0730 | 0.1406 | 0.1358 | 0.0436 |
| - logarithmic difference | 0.0818 | 0.1201 | 0.1362 | 0.0821 |

Table 2. Nowcasting summary

| | Czech Rep. | Hungary | Poland | Slovakia |
|---|---|---|---|---|
| $\bar{R}^2$ without Google | 0.2796 | 0.3590 | 0.4673 | 0.1605 |
| $\bar{R}^2$ with Google | 0.3763 | 0.4469 | 0.5521 | 0.2193 |
| Google insignificant, $F$-stat | 5.8507 | 2.5828 | 2.5251 | 2.7135 |
| Google insignificant, p-value | 0.0000 | 0.0089 | 0.0057 | 0.0031 |

Table 3. Forecasting and causality summary

| | Czech Rep. | Hungary | Poland | Slovakia |
|---|---|---|---|---|
| RMSE, no Google | 0.3686 | 0.3156 | 0.1990 | 0.1907 |
| RMSE, Google | 0.3216 | 0.2742 | 0.1467 | 0.1359 |
| RMSE, change | -12.75% | -13.11% | -26.28% | -28.73% |
| MAE, no Google | 0.3100 | 0.2183 | 0.1437 | 0.1638 |
| MAE, Google | 0.2744 | 0.1889 | 0.1164 | 0.1009 |
| MAE, change | -11.50% | -13.49% | -19.01% | -38.41% |
| DM test, test statistic | 2.4160 | 2.8870 | 2.0220 | 1.7750 |
| DM test, p-value | 0.0078 | 0.0019 | 0.0216 | 0.0379 |
| Google → Unemployment, test statistic | 6.0311 | 1.1163 | 3.8285 | 3.1555 |
| Google → Unemployment, p-value | 0.0000 | 0.3614 | 0.0002 | 0.0012 |
| Unemployment → Google, test statistic | 2.5545 | 1.4951 | 2.5635 | 2.5641 |
| Unemployment → Google, p-value | | | | |