A rapidly updating stratified mix-adjusted median property price index model
Robert Miller∗ and Phil Maguire†
Dept. of Computer Science, National University of Ireland, Maynooth, Kildare, Ireland.
Email: ∗[email protected], †[email protected]

Abstract—Homeowners, first-time buyers, banks, governments and construction companies are highly interested in following the state of the property market. Currently, property price indexes are published several months out of date and hence do not offer the up-to-date information which housing market stakeholders need in order to make informed decisions. In this article, we present an updated version of a central-price-tendency-based property price index which uses geospatial property data and stratification in order to compare similar houses. The expansion of the algorithm to include additional parameters, enabled by a new data structure implementation and a richer dataset, allows for the construction of a far smoother and more robust index than the original algorithm produced.
I. INTRODUCTION
House price indexes provide vital information to the political, financial and sales markets, greatly affecting the operation and services of lending institutions and influencing important governmental decisions [1]. As one of the largest asset classes, house prices can even offer insight regarding the overall state of a nation's economy [2]. Property value trends can predict near-future inflation or deflation and also have a considerable effect on the gross domestic product and the financial markets [3], [4].

There are a multitude of stakeholders interested in the development and availability of an algorithm which can offer an accurate picture of the current state of the housing market, including home buyers, construction companies, governments, banks and homeowners [5], [6].

Due to the recent global financial crisis, house price indexes and forecasting models play a more crucial role than ever. The key to providing a more robust and up-to-date overview of the housing market lies in machine learning and statistical analysis of big data [7]. The primary aim is the improvement of currently popular algorithms for calculating and forecasting price changes, while making such indexes faster to compute and more regularly updated. Such advances could potentially play a key role in identifying price bubbles and preventing future collapses in the housing market [8], [9].

Hedging against market risk has been shown to be potentially beneficial to all stakeholders; however, it relies on having up-to-date and reliable price change information, which is generally not publicly available [7], [10]. This restricts the possibility of this tool becoming a mainstream option for homeowners and small businesses.
In this article, we expand upon previous work by [5] on a stratified, mix-adjusted median property price model by applying that algorithm to a larger and richer dataset of property listings, and we explore the improvements in smoothness offered by evolving the original algorithm, enabled by the use of a new data structure [11].

II. PROPERTY PRICE INDEX MODELS
In this section we detail the three main classes of existing property price indexes: the hedonic regression, repeat-sales and central-price tendency methods.

A. Hedonic Regression
Hedonic regression [12] is a method which considers all of the characteristics of a house (e.g. bedrooms, bathrooms, land size, location) and calculates how much weight each of these attributes has in relation to the overall price of the house. While it has been shown by [13] to be the most robust measure in general, outperforming the repeat-sales and mix-adjusted median methods, it requires a vast amount of detailed data and the interpretation of an experienced statistician in order to produce a result [5], [14].

As hedonic regression rests on the assumption that the price of a property can be broken down into its integral attributes, the algorithm in theory should consider every possible characteristic of the house. However, it would be impractical to obtain all of this information. As a result, specifying a complete set of regressors is extremely difficult [15].

The great number of free parameters which require tuning in hedonic regression also leads to a high chance of overfitting the model [5].
B. Repeat-sales
The repeat-sales method [16] is the most commonly used method of reporting housing sales in the United States and uses repeated sales of the same property over long periods of time to calculate change. An enhanced, weighted version of this algorithm was explored by [17]. The advantage of this method comes in the simplicity of constructing and understanding the index; historical sales of the same property are compared with each other, and thus the attributes of each house need not be known nor considered. The trade-off for this simplicity comes at the cost of requiring enormous amounts of data stretched across long periods of time [18].

It has also been theorised that the sample of repeat sales is not representative of the housing market as a whole. For example, in a study by [19], only 7% of detached homes were resold in the study period, while 30% of apartments had multiple sales in the same dataset. It is argued that this phenomenon occurs due to the 'starter home hypothesis': houses which are cheaper and in worse condition generally sell more frequently due to young homeowners upgrading [19], [20], [21]. This leads to over-representation of inexpensive and poorer quality property in the repeat-sales method. Cheap houses are also sometimes purchased for renovation or are sold quickly if the homeowner becomes unsatisfied with them, which contributes to this selection bias [19]. Furthermore, newly constructed houses are under-represented in the repeat-sales model, as a brand new property cannot be a repeat sale unless it is immediately sold on to a second buyer [20].

As a result of the low number of repeat transactions, an overwhelming amount of data is discarded [22]. This leads to great inefficiency in the index's use of the data available to it. In the commonly used repeat-sales algorithm by [17], almost 96% of the property transactions are disregarded due to incompatibility with the method [15].

C. Central Price Tendency
Central-price tendency models have been explored as an alternative to the more commonly used methods detailed previously. The model relies on the principle that large sets of clustered data tend to exhibit a noise-cancelling effect and result in a stable, smooth output [5]. Furthermore, central price tendency models offer a greater level of simplicity than the highly theoretical hedonic regression model. When compared to the repeat-sales method, central tendency models offer more efficient use of their dataset, both in the sense of quantity and time period spread [5], [23].

According to a study of house price index models by [13], the central-tendency method employed by [23] significantly outperforms the repeat-sales method despite utilising a much smaller dataset. However, the method is criticised as it does not consider the constituent properties of a house and is thus more prone to inaccurate fluctuations due to a differing mix of sample properties between time periods [13]. For this reason, [13] finds that the hedonic regression model still outperforms the mix-adjusted median model used by [23]. Despite this, the simplicity and efficient data utilisation that the method offers were argued to justify these drawbacks [23], [13].

An enhancement to the mix-adjusted median algorithm of [23] was later shown to outperform the robustness of the hedonic regression model used by the Irish Central Statistics Office [5], [24]. The primary drawback of this algorithm was its long execution time and high algorithmic complexity due to brute-force geospatial search, which prevented the algorithm from being further expanded, both in terms of algorithmic features and the size of the dataset [11].
D. Improvement Attempts
With the aim of overcoming the issue of algorithmic complexity in the method described by [5], a niche data structure was designed primarily for the purpose of greatly speeding up the geospatial proximity search while sacrificing minimal algorithmic precision. The GeoTree offers a substantial performance improvement when applied to the original algorithm while producing an almost identical index [11]. Through application of the GeoTree, the restrictions on the original algorithm have been lifted, and we can now explore the performance of an evolved implementation of the algorithm on a richer, alternative dataset while introducing further parameters.

III. CASE STUDY: MYHOME PROPERTY LISTING DATA
MyHome [25] are a major player in property sale listings in Ireland. With data on property asking prices collected since 2011, MyHome have a rich database of detailed data regarding houses which have been listed for sale. MyHome have provided access to their dataset for the purposes of this research.
A. Dataset Overview
The data provided by MyHome includes verified GPS co-ordinates, the number of bedrooms, the type of dwelling and further information for most of its listings. It is important to note, however, that this dataset consists of asking prices, rather than the sale prices featured in the less detailed Irish Property Price Register data (used in the original algorithm) [5].

The dataset consists of a total of 718,351 property listing records over the period February 2011 to March 2019 (inclusive). This amounts to a mean of 7,330 listings per month (with a standard deviation of 1,689); however, this raw data requires some filtering for errors and outliers.
B. Data Filtration
As with the majority of human-collected data, some pruning must be done to the MyHome dataset in order to remove outliers and erroneous data. Firstly, not all transactions in the dataset include verified GPS co-ordinates or data on the number of bedrooms. These records are discarded immediately for the purpose of the enhanced version of the algorithm. They account for 16.5% of the dataset. Furthermore, any property listed with greater than six bedrooms will not be considered. These properties are not representative of a standard house on the market, and the number of such listings amounts to just 1% of the entire dataset.

Any data entries which do not include an asking price cannot be used for house price index calculation and must be excluded. Such records amount to 3.6% of the dataset. Additionally, asking price records with a price of less than €[…] are excluded.

C. Comparison with PPR Dataset
The mean number of filtered monthly listings available in our dataset represents a 157% increase on the 2,200 mean monthly records used in the original algorithm's index computation [5]. Furthermore, the dataset in question is significantly more precise and accurate than the PPR dataset, owing to the ability to prune it more effectively. The PPR dataset consists of address data entered by hand from written documents and does not use the Irish postcode system, meaning that addresses are often vague or ambiguous. This results in some erroneous data being factored into the model computation, as there is no effective way to prune this data [5]. The MyHome dataset has been filtered to include verified addresses only, as described previously.

The PPR dataset has no information on the number of bedrooms or any other key characteristics of a property. This can result in dilapidated properties, apartment blocks, inherited properties (which have an inaccurate sale value used for taxation purposes) and mansions mistakenly being counted as houses [5]. Our dataset consists of only single properties, and the filtration process described previously greatly reduces the number of such unrepresentative samples making their way into the index calculation.

The "sparse and frugal" PPR dataset was capable of outperforming the CSO's hedonic regression model with a mix-adjusted median model [5]. With the larger, richer and more thoroughly pruned MyHome dataset, further algorithmic enhancements to this model are possible.

IV. PERFORMANCE MEASURES
Property prices are generally assumed to change in a smooth, calm manner over time [26], [27]. According to [5], the smoothest index is, in practice, the most robust index. As a result, smoothness is considered to be one of the strong indicators of reliability for an index. However, the 'smoothness' of a time series is not well defined, nor immediately intuitive to measure mathematically.

The standard deviation of the time series offers some insight into the spread of the index around the mean index value. A high standard deviation indicates that the index changes tend to be large in magnitude. While this is useful in investigating the 'calmness' of the index (how dramatic its changes tend to be), it is not a reliable smoothness measure, as it is possible to have a very smooth graph with sizeable changes.

The standard deviation of the differences is a much more reliable measure of smoothness. A high standard deviation of the differences indicates that there is a high degree of variance among the differences, i.e. the change from point to point is unpredictable and somewhat wild. A low value for this metric indicates that the changes in the graph behave in a more calm manner.

Finally, we present a metric which we have defined, the mean spike magnitude μ_ΔX (MSM) of a time series X. This is intended to measure the mean value of the contrast between changes each time the trend direction of the graph flips. In other words, it is designed to measure the average size of the 'spikes' in the graph.

Given that D_X = {d_1, ..., d_n} is the set of differences in the time series X, we say that the pair (d_i, d_{i+1}) is a spike if d_i and d_{i+1} have different signs. Then S_i = |d_{i+1} - d_i| is the spike magnitude of the spike (d_i, d_{i+1}).

The mean spike magnitude of X is defined as:

    μ_ΔX = (1 / |S_X|) Σ_{S ∈ S_X} S

where S_X = {S_1, S_2, ..., S_t} is the set of all spike magnitudes of X.
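The two smoothness measures above can be sketched in a few lines of Python. This is an illustrative implementation only; the function name and the use of the population standard deviation are our assumptions, not taken from the original work.

```python
import statistics

def smoothness_metrics(index):
    """Compute smoothness measures for a price index given as a
    list of monthly index values."""
    # First differences of the time series.
    diffs = [b - a for a, b in zip(index, index[1:])]
    # Standard deviation of the differences: lower means smoother.
    sd_diffs = statistics.pstdev(diffs)
    # Mean spike magnitude (MSM): average |d_{i+1} - d_i| over
    # consecutive difference pairs whose signs disagree.
    spikes = [abs(d2 - d1) for d1, d2 in zip(diffs, diffs[1:])
              if d1 * d2 < 0]
    msm = sum(spikes) / len(spikes) if spikes else 0.0
    return sd_diffs, msm
```

For a toy series [100, 102, 101, 104, 103], the differences are [2, -1, 3, -1]; every consecutive pair flips sign, giving spike magnitudes 3, 4 and 4 and hence an MSM of 11/3.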
V. ALGORITHMIC EVOLUTION
A. Original Price Index Algorithm
The central price tendency algorithm introduced by [5] was designed around a key limitation: extremely frugal data. The only data available for each property was location, sale date and sale price. The core concept of the algorithm relies on using geographical proximity in order to match similar properties historically for the purpose of comparing sale prices. While this method is likely to match certain properties inaccurately, the key concept of central price tendency is that these mismatches should average out over large datasets and cancel noise.

The first major component of the algorithm is the voting stage. The aim of this is to remove properties from the dataset which are geographically isolated. The index relies on matching historical property sales which are close in location to a property in question. As a result, isolated properties will perform poorly, as it will not be possible to find sufficiently near property matches for them.

In order to filter out such properties, each property in the dataset gives one vote to its closest neighbour, or to a certain set number of nearest neighbours. Once all of these votes have been cast, the total number of votes per property is enumerated and the segment of properties with the lowest votes is removed. In the implementation of the algorithm used in [5], this amounted to ten percent of the dataset.

Once the voting stage of the algorithm is complete, the next major component is the stratification stage. This is the core of the algorithm and involves stratifying average property changes on a month-by-month comparative basis, which then serve as multiple points of reference when computing the overall monthly change. The following is a detailed explanation of the original algorithm's implementation.

First, take a particular month in the dataset which will serve as the stratification base, m_b. Then we iterate through each house sale record in m_b, represented by h_{m_b}.
We must now find the nearest neighbour of h_{m_b} in each preceding month in the dataset, through a proximity search. For each month m_x prior to m_b, refer to the nearest neighbour of h_{m_b} in m_x as h_{m_x}. We are then able to compute the change between the sale price of h_{m_b} and the nearest sold neighbour to it in each of the preceding months {m_1, ..., m_n}, as a ratio of h_{m_b} to h_{m_x} for x ∈ {1, ..., n}. Once this is done for every property in m_b, we have a catalogue of sale price ratios for every month prior to m_b, and thus we can look at the median price difference between m_b and each historic month.

However, this is only stratification with one base, referred to as stage three in the original article [5]. We then expand the algorithm by using every month in the dataset as a stratification base. The result of this is that every month in the dataset now has price reference points to every month which preceded it, and we can use these reference points as a way to compare month to month.

Assume that m_x and m_{x+1} are consecutive months in the dataset, so that we have two sets of median ratios {r_x(m_1), ..., r_x(m_{x-1})} and {r_{x+1}(m_1), ..., r_{x+1}(m_x)}, where r_a(m_y) represents the median property sale ratio between months m_a and m_y, with m_a the chosen stratification base. In order to compute the property price index change from m_x to m_{x+1}, we look at the difference between r_x(m_i) and r_{x+1}(m_i) for each i ∈ {1, ..., x-1} and take the mean of those differences. As such, we are not directly comparing each month; rather, we are contrasting the relationship of both months in question to each historical month and taking an average of those comparisons.

This results in a central price tendency based property index that outperformed the national Irish hedonic regression based index while using a far more frugal set of data to do so.

B. GeoTree
The largest drawback of the original index lies in its computational complexity; it is extremely slow to run. This is due to the performance impact of requiring a repeated search for neighbours of each data point. This limitation prevented the algorithm from scaling to larger datasets, more refined time periods and more regular updating. A custom data structure, the GeoTree, was developed in order to trade off a small amount of accuracy in return for the ability to retrieve a cluster of neighbours of any property in constant time [11]. This data structure relies on representing the geographical location of properties as geohash strings.

The GeoTree data structure functions by placing the geohash, character by character, into a tree structure where each branch at each level represents an alphanumeric character. Under each branch of the tree there is also a list node which caches all of the property records which exist as an entry in that subtree, allowing the O(1) retrieval of those records. The number of sequential characters in common from the start of a pair of geohashes puts a bound on the distance between those two geohashes. Thus, by traversing down the tree and querying the list nodes, the GeoTree can return a list of approximate nearest neighbours in O(1) time [11].

As can be seen in [11, Table I], the performance improvement to the index offered by the GeoTree is profound and sacrifices very little in terms of precision, with the resulting indexes proving close to identical. This development allows the scope of the index algorithm to be widened, including the introduction of larger datasets with richer data, more frequent updating and the development of new algorithmic features, some of which will be explored in this article.

C. Geohash+

Extended geohashes, which we will refer to as geohash+, are geohashes which have been modified to encode additional information regarding the property at that location.
Additional parameters are encoded by adding characters in front of the geohash; the value of the character at each such position corresponds to the value of the parameter which that character represents. Figure 1 demonstrates the structure of a geohash+ with two additional parameters, p_1 and p_2.

    geohash+:   p_1 p_2   x_1 x_2 ... x_n
               (params)     (geohash)

Fig. 1: geohash+ format

Any number of parameters can be prepended to the geohash. In the context of properties, this includes the number of bedrooms, the number of bathrooms, an indicator of the type of property (detached house, semi-detached house, apartment etc.), a parameter representing floor size ranges and any other attribute desired for comparison.

Alternative applications of geohash+ could include a situation where a rapid survey of nearby live vehicles of a certain type is required. If we prepend a parameter to the geohash locations of vehicles representing each vehicle's type (e.g. one character value for cars, another for vans, another for motorcycles, and so forth), we can use the GeoTree data structure to rapidly survey the SCBs around a particular vehicle, with separate SCBs generated for each type automatically.

D. GeoTree Performance with geohash+

Due to the design of the GeoTree data structure, a geohash+ will be inserted into the tree in exactly the same manner as a regular geohash [11]. If the original GeoTree had a height of h for a dataset with h-length geohashes, then the GeoTree accepting that geohash extended to a geohash+ with p additional parameters prepended should have a height of h + p.
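To make the mechanics concrete, the following is a minimal sketch of how geohash+ keys behave inside a GeoTree-style trie. This is our own simplification, not the implementation from [11]; the class and method names and the sample geohashes are assumptions.

```python
class GeoTreeNode:
    """One level of a GeoTree-style trie."""
    def __init__(self):
        self.children = {}   # next character -> child node
        self.records = []    # cache of every record in this subtree

class GeoTree:
    """Each node caches the records stored anywhere below it, so a
    prefix lookup returns a whole bucket of approximate neighbours
    in a single short traversal."""
    def __init__(self):
        self.root = GeoTreeNode()

    def insert(self, key, record):
        # A geohash+ key is consumed character by character; prepended
        # parameter characters (e.g. a bedroom count) are treated
        # exactly like ordinary geohash characters.
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, GeoTreeNode())
            node.records.append(record)

    def neighbours(self, key, depth):
        # Records sharing the first `depth` characters of `key`; a
        # longer shared geohash prefix bounds the geographic distance.
        node = self.root
        for ch in key[:depth]:
            if ch not in node.children:
                return []
            node = node.children[ch]
        return node.records

# Prepending the bedroom count ("3") means only properties with the
# same number of bedrooms can ever share a prefix bucket.
tree = GeoTree()
tree.insert("3gc7x9u", "house A")   # 3-bed at geohash gc7x9u
tree.insert("3gc7x9v", "house B")   # 3-bed at a nearby geohash
tree.insert("2gc7x9u", "house C")   # 2-bed at the same location
```

Here `tree.neighbours("3gc7x9u", 5)` returns houses A and B but never house C: the 2-bed property diverges at the very first character, so the search space is narrowed to matching bedroom counts before any geographic comparison takes place.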
However, both of these are fixed, constant, user-specified parameters which are independent of the number of data points, and hence do not affect the constant-time performance of the GeoTree.

The major benefit of this design is that the ranged proximity search will interpret the additional parameters as regular geohash characters when constructing the common buckets upon insertion, and also when finding the SCB in any search, without introducing additional performance or complexity drawbacks.

E. Enhanced Price Index
In order to enhance our price index model, we prepend a parameter to the geohash of each property representing the number of bedrooms present within that property. As a result, when the GeoTree is performing the SCB computation, it will now only match properties which are both nearby and share the same number of bedrooms. This allows the index model to compare the prices of properties which are more similar across the time series, and thus should result in a smoother, more accurate measure of the change in prices over time.

The technical implementation of this algorithmic enhancement is handled almost entirely by the GeoTree automatically, owing to its design. As described previously, the GeoTree sees the additional parameter no differently to any other character in the geohash, and due to its placement at the start of the geohash, the search space is instantly narrowed to properties with a matching number of bedrooms, x, by taking the x branch in the tree at the first step of traversal.

VI. RESULTS
We ran the algorithm on the MyHome data without factoring in any additional parameters as a control step. We then created a GeoTree with geohash+ entries consisting of the number of bedrooms in the house prepended to the geohash for the property.

A. Comparison of Time Series
Table I shows the performance metrics previously described applied to the algorithms discussed in this paper: original PPR, PPR with GeoTree, MyHome without bedroom factoring and MyHome with bedroom factoring. While both the standard deviation of the differences and the MSM show that some smoothness is sacrificed by the GeoTree implementation of the PPR algorithm, the index running on MyHome's data without bedroom factoring approximately matches the smoothness of the original algorithm. Furthermore, when bedroom factoring is introduced, the algorithm produces by far the smoothest index, with the standard deviation of the differences being 26.2% lower than that of the PPR (original) algorithm presented in [5], while the MSM sits 58.2% lower.

If we compare the MyHome results in isolation, we can clearly observe that the addition of bedroom matching makes a very significant impact on the index performance. While the trend of each graph is observably similar, Figure 2 demonstrates that month-to-month changes are less erratic and appear less prone to large, spontaneous dips. Considering the smoothness metrics, the introduction of bedroom factoring generates a decrease of 26.8% in the standard deviation of the differences and a decrease of approximately 48.4% in the MSM. These results show a clear improvement from tightening the accuracy of property matching and are promising for the potential future inclusion of additional parameters, should such data become available.

Figure 2 corresponds with the results of these metrics, with the MyHome data (bedrooms factored) index appearing the smoothest time series of the four which are compared. It is important to note that the PPR data is based upon actual sale prices, while the MyHome data is based on listed asking prices of properties which are up for sale, and as such may produce somewhat different results.

It is well known that properties sell extremely well in spring and towards the end of the year, the former being the most popular period for property sales. Furthermore, the months towards late summer and shortly after tend to be the least busy periods of the year for selling property [28]. These phenomena can be observed in Figure 2, where there is a dramatic increase in the listed asking prices of properties in the spring months and towards the end of each year, while the less popular months tend to experience a slump in price movement. As such, the two PPR graphs and the MyHome data (bedrooms not factored) graph follow more or less the same trend in price action and their graphs tend to meet often; however, the majority of the price action in the MyHome data graphs tends to wait for the popular selling months. The PPR graph does not experience these phenomena because selling property can be a long, protracted process: due to a myriad of factors such as price bidding, paperwork, legal hurdles, mortgage applications and delays in reporting, final sale notifications can happen outside of the time period in which the sale price is agreed between buyer and seller.

VII. CONCLUSION
The introduction of bedroom factoring as an additional parameter in the pairing of nearby properties has been shown to have a profound impact on the smoothness of the mix-adjusted median property price index, which had already been shown to outperform a popularly used implementation of the hedonic regression model. This improvement was made possible by the acquisition of a richer dataset and the development of the GeoTree structure, which greatly increased the performance of the algorithm. There is future potential for the introduction of further property characteristics (such as the number of bathrooms, property type etc.) in the proximity matching part of the algorithm, should such data be acquired.

Furthermore, the design of the data structure used ensures that minimal computational complexity is added when considering the technical implementation of this algorithmic adjustment. As a result, the index can be computed quickly enough that it would be possible to have real-time updates (e.g. as often as every 5 minutes) to the price index, if a sufficiently rich stream of continuous data were available to the algorithm. Large property listing websites, such as Zillow, likely have enough live, incoming data that such an index would be feasible to compute at this frequency; however, this volume of data is not publicly available for testing.
TABLE I: Index Comparison Statistics

Algorithm                   St. Dev   St. Dev of Differences   MSM
PPR (original)              16.524    2.191                    23.30
PPR (GeoTree)               16.378    2.518                    29.78
MyHome (without bedrooms)   12.898    2.209                    18.91
MyHome (with bedrooms)      12.985    1.617                     9.75
Fig. 2: Comparison of index on PPR and MyHome data sets, from 02-2011 to 03-2019 [data limited to 09-2018 for PPR]
REFERENCES

[1] W. E. Diewert, J. de Haan, and R. Hendriks, "Hedonic regressions and the decomposition of a house price index into land and structure components," Econometric Reviews, vol. 34, no. 1-2, pp. 106–126, 2015. [Online]. Available: https://doi.org/10.1080/07474938.2014.944791
[2] K. Case, R. Shiller, and J. Quigley, "Comparing wealth effects: The stock market versus the housing market," Advances in Macroeconomics, vol. 5, no. 1, 2001.
[3] M. Forni, M. Hallin, M. Lippi, and L. Reichlin, "Do financial variables help forecasting inflation and real activity in the euro area?" Journal of Monetary Economics, …
[4] … Journal of Emerging Market Finance, vol. 12, no. 3, pp. 239–291, 2013. [Online]. Available: https://doi.org/10.1177/0972652713512913
[5] P. Maguire, R. Miller, P. Moser, and R. Maguire, "A robust house price index using sparse and frugal data," Journal of Property Research, vol. 33, no. 4, pp. 293–308, 2016. [Online]. Available: https://doi.org/10.1080/09599916.2016.1258718
[6] V. Plakandaras, R. Gupta, P. Gogas, and T. Papadimitriou, "Forecasting the U.S. real house price index," Rimini Centre for Economic Analysis, Working Paper series 30-14, Nov 2014. [Online]. Available: https://ideas.repec.org/p/rim/rimwps/30 14.html
[7] J. R. Hernando, Humanizing Finance by Hedging Property Values, …
[8] … International Journal of Housing Markets and Analysis, vol. 8, no. 1, pp. 135–147, 2015. [Online]. Available: https://doi.org/10.1108/IJHMA-04-2014-0010
[9] P. Klotz, T. C. Lin, and S.-H. Hsu, "Modeling property bubble dynamics in Greece, Ireland, Portugal and Spain," Journal of European Real Estate Research, vol. 9, no. 1, pp. 52–75, 2016. [Online]. Available: https://doi.org/10.1108/JERER-11-2014-0038
[10] P. Englund, M. Hwang, and J. M. Quigley, "Hedging housing risk," The Journal of Real Estate Finance and Economics, vol. 24, no. 1, pp. 167–200, Jan 2002. [Online]. Available: https://doi.org/10.1023/A:1013942607458
[11] R. Miller and P. Maguire, "GeoTree: a data structure for constant time geospatial search enabling a real-time mix-adjusted median property price index," arXiv e-prints, arXiv:2008.02167, Aug. 2020.
[12] J. F. Kain and J. M. Quigley, "Measuring the value of housing quality," Journal of the American Statistical Association, …
[13] … Housing Studies, vol. 27, no. 5, pp. 643–666, 2012. [Online]. Available: https://doi.org/10.1080/02673037.2012.697551
[14] S. Bourassa, M. Hoesli, and J. Sun, "A simple alternative house price index method," Journal of Housing Economics, vol. 15, no. 1, pp. 80–97, Mar 2006.
[15] B. Case, H. O. Pollakowski, and S. M. Wachter, "On choosing among house price index methodologies," Real Estate Economics, vol. 19, no. 3, pp. 286–307, 1991. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1111/1540-6229.00554
[16] M. J. Bailey, R. F. Muth, and H. O. Nourse, "A regression method for real estate price index construction," Journal of the American Statistical Association, …
[17] … Journal of Housing Economics, …
[18] …
[19] … The Journal of Real Estate Finance and Economics, vol. 37, pp. 163–186, Jan 2008.
[20] G. Costello and C. Watkins, "Towards a system of local house price indices," Housing Studies, vol. 17, no. 6, pp. 857–873, 2002. [Online]. Available: https://doi.org/10.1080/02673030216001
[21] R. E. Dorsey, H. Hu, W. J. Mayer, and H. C. Wang, "Hedonic versus repeat-sales housing price indexes for measuring the recent boom-bust cycle," Journal of Housing Economics, …
[22] … The Journal of Real Estate Finance and Economics, vol. 14, no. 1, pp. 75–88, Jan 1997. [Online]. Available: https://doi.org/10.1023/A:1007720001268
[23] N. Prasad and A. Richards, "Improving median housing price indexes through stratification," Journal of Real Estate Research, vol. 30, no. 1, pp. 45–72, 2008. [Online]. Available: https://ideas.repec.org/a/jre/issued/v30n12008p45-72.html
[24] N. O'Hanlon, "Constructing a national house price index for Ireland," Journal of the Statistical and Social Inquiry Society of Ireland, …
[25] …
[26] … Journal of Economic Geography, vol. 3, no. 1, pp. 57–73, Jan 2003. [Online]. Available: https://doi.org/10.1093/jeg/3.1.57
[27] J. M. Clapp, H. Kim, and A. E. Gelfand, "Predicting spatial patterns of house prices using LPR and Bayesian smoothing," Real Estate Economics, vol. 30, no. 4, pp. 505–532, 2002. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1111/1540-6229.00048
[28] L. Paci, M. A. Beamonte, A. E. Gelfand, P. Gargallo, and M. Salvador, "Analysis of residential property sales using space-time point patterns," …