Bayesian Reconstruction of Missing Observations
Shun Kataoka*, Muneki Yasuda†, and Kazuyuki Tanaka*

*Graduate School of Information Sciences, Tohoku University, Sendai 980-8579, Japan
†Graduate School of Science and Engineering, Yamagata University, Yonezawa 992-8510, Japan

Abstract:
We focus on an interpolation method referred to as Bayesian reconstruction in this paper. Whereas in standard interpolation methods missing data are interpolated deterministically, in Bayesian reconstruction, missing data are interpolated probabilistically using a Bayesian treatment. In this paper, we address the framework of Bayesian reconstruction and its application to the traffic data reconstruction problem in the field of traffic engineering. In the latter part of this paper, we describe the evaluation of the statistical performance of our Bayesian traffic reconstruction model using a statistical mechanical approach and clarify its statistical behavior.
Methods for interpolating missing data are important in various scientific fields. A standard interpolation method, such as spline interpolation, is a deterministic interpolation technique. An alternative, probabilistic, interpolation technique has been developed in the last few years. In this probabilistic interpolation technique, which is called Bayesian reconstruction, the Bayesian treatment is used to interpolate and reconstruct missing regions. To the best of our knowledge, Bayesian reconstruction was first implemented in the digital image inpainting filter. The digital image inpainting filter is used in the process of reconstructing lost or deteriorated parts of images [1] (see figure 1). Previously, two of the authors applied Bayesian reconstruction to the digital image inpainting filter [3, 4]. Bayesian reconstruction is now becoming the standard technique in the digital image inpainting filter [5].

Figure 1: Example of digital image inpainting filter. The left image is the original scratched image and the center image is masked by the black regions. We reconstructed the masked region using the digital image inpainting filter to restore the original damaged image. The right image is the reconstructed image obtained by using the digital image inpainting method proposed in reference [2].

It can be expected that the framework of Bayesian reconstruction will be utilized in various reconstruction problems, and therefore, its use should not be limited to image processing. Recently, the authors applied Bayesian reconstruction to the traffic data reconstruction problem [6]. Traffic data reconstruction is an important preprocessing step that precedes traffic prediction tasks such as travel time prediction, density prediction, and route planning. In order to provide accurate information to drivers, a broad-scale database of real-time vehicular traffic over an entire city is required.
However, in practice, it is difficult to collect the traffic data for an entire city, because traffic sensors are not installed on all roads. Therefore, the objective of traffic data reconstruction is to reconstruct the states of unobserved roads, on which traffic sensors are not installed, by using information from observed roads, on which traffic sensors are installed.

In the first part of this paper, we introduce the details of Bayesian reconstruction, and subsequently give an overview of the Bayesian traffic data reconstruction method proposed in reference [6], together with some new numerical results.

Corresponding author: [email protected]
In the Bayesian framework, we suppose observations (observed data) are probabilistically drawn from a specific probability distribution, referred to as the prior probability, because observations suffer from uncertainty whose origin is physical noise, incompleteness of some elements, and so on.

Suppose that there exists an $n$-dimensional observation $x = \{x_i \in \mathbb{R} \mid i \in V = \{1, 2, \ldots, n\}\}$, which is generated from the prior probability $P_{\mathrm{prior}}(x)$, and that we cannot observe a part of the elements in the observation for some reason. Since the observation is probabilistically generated, we can treat its elements as random variables. We define the set of labels of missing elements by $\mathcal{M} \subseteq V$ and the complementary set of $\mathcal{M}$ by notation $\mathcal{O}$, i.e., $\mathcal{O} := V \setminus \mathcal{M}$; therefore, $\mathcal{O}$ is the set of labels of observed elements. Given an observation $x$, we describe the values of the observed elements by notation $y$ to distinguish them from unobserved elements and collectively express the observed elements by $y = \{y_i \in \mathbb{R} \mid i \in \mathcal{O}\}$. The values of the elements in set $\mathcal{O}$ are fixed by the observation $y$. The Bayesian reconstruction considered in this paper consists of reconstructing the unobserved elements, $\mathcal{M}$, in the observation by using the observed elements, $\mathcal{O}$; in other words, the objective is to estimate the values of $x_{\mathcal{M}}$ by using $y$, where notation $x_A$ is the set of $x_i$ belonging to set $A \subseteq V$, i.e., $x_A := \{x_i \mid i \in A\}$.

In order to reconstruct the missing elements from the Bayesian point of view, we first formulate the posterior probability of the missing elements, $P(x_{\mathcal{M}} \mid y)$, by the Bayesian rule

$$P(x_{\mathcal{M}} \mid y) = \frac{P(x_{\mathcal{M}}, y)}{P(y)}, \quad (1)$$

where the value of $y$ is fixed by the observation. By using Dirac's delta, we have

$$P(x_{\mathcal{M}}, y) = \Big( \prod_{i \in \mathcal{O}} \delta(y_i - x_i) \Big) P(x_{\mathcal{M}}, x_{\mathcal{O}}) = \Big( \prod_{i \in \mathcal{O}} \delta(y_i - x_i) \Big) P_{\mathrm{prior}}(x). \quad (2)$$

It should be noted that, if $x$ are discrete variables, Dirac's delta is replaced by Kronecker's delta.
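To make the role of the delta-function likelihood in equation (2) concrete: it simply clamps the observed elements, so maximizing the posterior over $x_{\mathcal{M}}$ amounts to maximizing the prior with $x_{\mathcal{O}}$ fixed. The following minimal sketch illustrates this for three binary variables with an arbitrary made-up prior table; the numbers are illustrative assumptions, not from the paper.

```python
# Toy MAP reconstruction: with a delta-function likelihood, the
# posterior P(x_M | y) is proportional to the prior with the observed
# entries clamped, so we enumerate x_M and pick the maximizer.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 3
# unnormalized prior P_prior(x) over {0,1}^3 (arbitrary positive table)
prior = rng.random((2, 2, 2))

observed = {0: 1}                       # O = {0}, observed value y_0 = 1
missing = [i for i in range(n) if i not in observed]

best, best_p = None, -1.0
for xm in itertools.product([0, 1], repeat=len(missing)):
    x = [0] * n
    for i, yi in observed.items():
        x[i] = yi                       # clamp observed elements
    for i, v in zip(missing, xm):
        x[i] = v                        # candidate values for x_M
    p = prior[tuple(x)]                 # proportional to P(x_M | y)
    if p > best_p:
        best, best_p = xm, p

print("MAP estimate of x_M:", best)
```

For discrete variables the Kronecker delta plays the same clamping role, which is why the enumeration above never varies the observed entry.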
From equations (1) and (2), we have

$$P(x_{\mathcal{M}} \mid y) \propto P_{\mathrm{likelihood}}(y \mid x) P_{\mathrm{prior}}(x), \quad (3)$$

where

$$P_{\mathrm{likelihood}}(y \mid x) := \prod_{i \in \mathcal{O}} \delta(y_i - x_i).$$

Probability $P_{\mathrm{likelihood}}(y \mid x)$ is referred to as the likelihood in the Bayesian framework. In Bayesian reconstruction, we consider the suitable reconstructed values of the unobserved elements, $x_{\mathcal{M}}^*$, to be the values of $x_{\mathcal{M}}$ that maximize the posterior probability in equation (3), i.e.,

$$x_{\mathcal{M}}^* = \arg\max_{x_{\mathcal{M}}} P(x_{\mathcal{M}} \mid y) = \arg\max_{x_{\mathcal{M}}} P_{\mathrm{likelihood}}(y \mid x) P_{\mathrm{prior}}(x). \quad (4)$$

The above reconstruction scheme requires prior probabilities that describe the hidden probabilistic mechanisms of observations. However, unfortunately, in almost all situations we do not know the details of the prior probabilities. Therefore, in order to implement a Bayesian reconstruction system, we should model the unknown prior probabilities.

2.1 Prior Modeling based on Markov Random Fields

One of the models of prior probabilities of observed data that is presently available is the Markov random field (MRF). MRFs can easily treat complex spatial interactions among observational data points that create a variety of appearance patterns. Consider an undirected graph $G(V, E)$, where $V = \{1, 2, \ldots, n\}$ is the set of vertices and $E = \{(i, j)\}$ is the set of undirected edges between the vertices. An MRF is usually defined on such an undirected graph $G(V, E)$ by assigning each variable $x_i$ to the corresponding vertex $i$. Edge $(i, j)$ expresses a spatial interaction between variable $x_i$ and variable $x_j$. To construct a probabilistic model, we define the joint probability of $x$, $P_{\mathrm{model}}(x)$. On the undirected graph $G(V, E)$, if we assume a spatial Markov property among the random variables, $x$, and the positivity of the model, $P_{\mathrm{model}}(x) >$
$0$, then by the Hammersley-Clifford theorem the model can be expressed as

$$P_{\mathrm{model}}(x) = \frac{1}{Z} \exp\Big( \sum_{i \in V} \phi_i(x_i) + \sum_{(i,j) \in E} \psi_{ij}(x_i, x_j) \Big), \quad (5)$$

without loss of generality. The first term in the exponent, $\phi_i(x_i)$, is a potential function on vertex $i$ that determines the characteristic of $x_i$, and the second term in the exponent, $\psi_{ij}(x_i, x_j)$, is a potential function between vertices $i$ and $j$ that determines the interaction between $x_i$ and $x_j$. $\sum_{(i,j) \in E}$ represents the summation running over all edges, and $Z$ denotes the normalization constant, sometimes referred to as the partition function, defined by

$$Z := \int_{-\infty}^{\infty} \exp\Big( \sum_{i \in V} \phi_i(x_i) + \sum_{(i,j) \in E} \psi_{ij}(x_i, x_j) \Big) \, dx.$$

The model in equation (5) is the MRF that is most frequently used. In the MRF, let us consider the conditional probability of $x_i$ expressed as

$$P_{\mathrm{model}}(x_i \mid x_{-i}) = \frac{P_{\mathrm{model}}(x)}{\int_{-\infty}^{\infty} P_{\mathrm{model}}(x) \, dx_i}, \quad (6)$$

where $x_{-i}$ denotes the set of all variables except $x_i$: $x_{-i} = \{x_j \mid j \in V \setminus \{i\}\}$. Equations (5) and (6) lead to

$$P_{\mathrm{model}}(x_i \mid x_{-i}) = P_{\mathrm{model}}(x_i \mid x_{\partial(i)}), \quad (7)$$

where $\partial(i)$ denotes the set of vertices connected to vertex $i$ in the graph, and $x_{\partial(i)}$ denotes the set of variables on the vertices belonging to $\partial(i)$; that is, $x_{\partial(i)}$ is the set of nearest-neighbor variables of $x_i$: $x_{\partial(i)} = \{x_j \mid j \in \partial(i)\}$. Equation (7) states that the variable $x_i$ depends only on its nearest-neighbor variables in the conditional probability, and this constitutes the spatial Markov property of the MRF.

In order to implement the MRF in equation (5), the forms of the potential functions should be determined. This is one of the most important points in MRF modeling. Parametrically, we model the potential functions by certain parametric functions with parameter $\theta$,

$$P_{\mathrm{model}}(x \mid \theta) = \frac{1}{Z(\theta)} \exp\Big( \sum_{i \in V} \phi_i(x_i \mid \theta_i) + \sum_{(i,j) \in E} \psi_{ij}(x_i, x_j \mid \theta_{ij}) \Big). \quad (8)$$
Thus, we should find the optimal values of the parameters. The standard method for achieving this is provided by the field of machine learning theory, as described below.

The objective of MRF modeling is to model the unknown prior probability of observations, $P_{\mathrm{prior}}(x)$. Therefore, the optimal values of the parameters, $\theta^*$, should minimize some distance between the prior probability and our model $P_{\mathrm{model}}(x \mid \theta)$. The Kullback-Leibler divergence (KLD),

$$K(P_1 \| P_2) := \int_{-\infty}^{\infty} P_1(x) \ln \frac{P_1(x)}{P_2(x)} \, dx, \quad (9)$$

is often utilized as a measure of the difference between two distinct probabilities, $P_1(x)$ and $P_2(x)$. The value of the KLD is always non-negative and is zero when the two probabilities are equivalent. Thus, we consider that two distinct probabilities are close to each other when the value of the KLD is small. In terms of the KLD, we suppose the optimal values of the parameters are given by minimizing the value of the KLD between the prior probability and our model,

$$\theta^* = \arg\min_{\theta} K(P_{\mathrm{prior}} \| P_{\mathrm{model}}).$$

However, we cannot perform this minimization because we do not know the prior probability.

Since we do not know the prior probability, we suppose instead that we have many complete observations generated from the prior probability. We describe the observations by $D = \{y^{(\mu)} \in \mathbb{R}^n \mid \mu = 1, 2, \ldots, N\}$, and we define the empirical distribution of the $N$ complete observations by

$$Q_D(x) := \frac{1}{N} \sum_{\mu=1}^{N} \prod_{i \in V} \delta\big(y_i^{(\mu)} - x_i\big). \quad (10)$$

The empirical distribution is the frequency distribution of the $N$ complete observations. It should be noted that, if $x$ are discrete variables, Dirac's delta is again replaced by Kronecker's delta. We suppose the empirical distribution has some important properties of the prior probability, and we suppose that the optimal values of the parameters are approximately obtained by minimizing the value of the KLD between the empirical distribution and our model,

$$\theta^* = \arg\min_{\theta} K(P_{\mathrm{prior}} \| P_{\mathrm{model}}) \approx \arg\min_{\theta} K(Q_D \| P_{\mathrm{model}}). \quad (11)$$
This minimization can be performed if we have complete observations generated from the prior probability. (Here, "complete" means that each observation includes no missing points.) Equation (11) is rewritten as

$$\theta^* \approx \arg\min_{\theta} K(Q_D \| P_{\mathrm{model}}) = \arg\max_{\theta} \int_{-\infty}^{\infty} Q_D(x) \ln P_{\mathrm{model}}(x \mid \theta) \, dx.$$

This corresponds to the maximum likelihood estimation (MLE) in statistics. In the above scheme, we assumed there are no missing points in the $N$ observations used in the estimation of the parameters. If the observations include missing points, we use an alternative strategy and apply the expectation-maximization (EM) algorithm.

From the above arguments, the (parametric) Bayesian reconstruction system is summarized as follows. Before the reconstructions, we design the potential functions in our MRF model $P_{\mathrm{model}}(x \mid \theta)$ and estimate the optimal values of the parameters, $\theta^*$, in advance by using many complete observations and equation (11). Then, the reconstruction is approximately performed using the constructed model $P_{\mathrm{model}}(x \mid \theta^*)$ instead of the prior probability in equation (4), i.e.,

$$x_{\mathcal{M}}^* = \arg\max_{x_{\mathcal{M}}} P_{\mathrm{likelihood}}(y \mid x) P_{\mathrm{prior}}(x) \approx \arg\max_{x_{\mathcal{M}}} P_{\mathrm{likelihood}}(y \mid x) P_{\mathrm{model}}(x \mid \theta^*). \quad (12)$$

Since

$$\arg\max_{x_{\mathcal{M}}} P_{\mathrm{likelihood}}(y \mid x) P_{\mathrm{model}}(x \mid \theta^*) = \arg\max_{x_{\mathcal{M}}} P_{\mathrm{model}}(x_{\mathcal{M}} \mid x_{\mathcal{O}} = y, \theta^*),$$

the suitable reconstructed values of the unobserved missing points, $x_{\mathcal{M}}^*$, are the values that maximize the conditional probability of our model,

$$P_{\mathrm{model}}(x_{\mathcal{M}} \mid x_{\mathcal{O}}, \theta^*) = \frac{P_{\mathrm{model}}(x \mid \theta^*)}{\int_{-\infty}^{\infty} P_{\mathrm{model}}(x \mid \theta^*) \, dx_{\mathcal{M}}},$$

with $x_{\mathcal{O}}$ fixed by the observation $y$.

In this section, we give an overview of the application of the Bayesian reconstruction scheme presented in the previous section to the traffic data reconstruction problem proposed by the authors [6], together with some new numerical results.

3.1 MRF model for Bayesian Traffic Data Reconstruction

We applied the Bayesian reconstruction scheme to traffic data reconstruction.
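As a minimal illustration of the maximum-likelihood step in equation (11), consider the degenerate case with no interactions ($J = 0$ in the Gaussian model used later), where the model factorizes into independent Gaussians $N(h/\xi, 1/\xi)$ and the MLE has the closed form $\hat{\xi} = 1/\widehat{\mathrm{var}}$, $\hat{h} = \widehat{\mathrm{mean}} \cdot \hat{\xi}$. This is a toy assumption for demonstration; learning the full MRF requires gradient-based maximization of the log-likelihood.

```python
# Closed-form MLE for the interaction-free model
#   P(x | h, xi) ∝ exp(h x - xi x^2 / 2) = N(h/xi, 1/xi),
# fitted to synthetic "complete observations". The true parameter
# values below are illustrative.
import numpy as np

rng = np.random.default_rng(1)
h_true, xi_true = 2.0, 4.0
data = rng.normal(h_true / xi_true, np.sqrt(1.0 / xi_true), size=100_000)

xi_hat = 1.0 / data.var()        # MLE of the variance parameter
h_hat = data.mean() * xi_hat     # MLE of the bias parameter
print(f"h ≈ {h_hat:.2f}, xi ≈ {xi_hat:.2f}")  # close to (2.0, 4.0)
```

With enough complete observations the estimates concentrate around the true parameters, which is the empirical-distribution argument behind equation (11).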
Our goal is to reconstruct the states of roads, and therefore, the random variables $x$ are assigned to roads. In order to formulate an MRF for a road network, we construct an undirected graph $G(V, E)$ as follows: we assign a vertex to each road and draw an edge between two roads that are connected to each other at a traffic intersection (see figure 2).

Figure 2: Undirected graph representation for a road network. (a) Road network with six roads and two intersections. (b) Vertices are assigned to roads. (c) Edges are drawn between two roads that are connected to each other at intersections.

On the undirected graph, we define the MRF by

$$P_{\mathrm{model}}(x \mid \theta) = \frac{1}{Z(\theta)} \exp\Big( \sum_{i \in V} h_i x_i - \frac{\xi}{2} \sum_{i \in V} x_i^2 - \frac{J}{2} \sum_{(i,j) \in E} (x_i - x_j)^2 \Big), \quad (13)$$

where $\theta = \{h, \xi, J\}$ are the parameters of the model. The variable $x_i \in \mathbb{R}$ expresses the state of road $i$. In this paper, we consider $x$ as traffic densities, according to the method in reference [6]. The traffic density on road $i$ is defined by the number of cars per unit area on road $i$; high densities tend to lead to traffic jams. This MRF is obtained by setting

$$\phi_i(x_i) = h_i x_i - \xi x_i^2 / 2, \qquad \psi_{ij}(x_i, x_j) = -J (x_i - x_j)^2 / 2$$

in equation (8). Here, $h_i$ is the bias that controls the level of the traffic density of road $i$, and parameters $\xi > 0$ and $J \geq 0$ determine the strengths of the on-site potential and the interaction between neighboring roads, respectively. We denote the set of unobserved roads by $\mathcal{M}$ and the set of observed roads by $\mathcal{O}$. After determining the values of the parameters by the machine learning method in equation (11), from equation (12), the reconstructed densities on the unobserved roads are obtained by

$$x_{\mathcal{M}}^* = \arg\max_{x_{\mathcal{M}}} P_{\mathrm{model}}(x_{\mathcal{M}} \mid x_{\mathcal{O}} = y, \theta^*), \quad (14)$$

where $y$ represents the densities on the observed roads. Since our model in equation (13) is a multi-dimensional Gaussian, the conditional probability is also a multi-dimensional Gaussian. Therefore, equation (14) is rewritten as

$$x_{\mathcal{M}}^* = \int_{-\infty}^{\infty} x_{\mathcal{M}} \, P_{\mathrm{model}}(x_{\mathcal{M}} \mid x_{\mathcal{O}} = y, \theta^*) \, dx_{\mathcal{M}}. \quad (15)$$

Hence, we find that the reconstructed densities, $x_{\mathcal{M}}^*$, are the expectations of $P_{\mathrm{model}}(x_{\mathcal{M}} \mid x_{\mathcal{O}} = y, \theta^*)$.
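Because the conditional distribution in equation (15) is Gaussian, those expectations can be computed exactly by one linear solve: equation (13) implies a precision matrix $A$ with $A_{ii} = \xi + |\partial(i)| J$ and $A_{ij} = -J$ on edges, and the conditional mean of the missing block satisfies $A_{\mathcal{MM}} x_{\mathcal{M}} = h_{\mathcal{M}} - A_{\mathcal{MO}} y$. The graph, parameters, and observations in the following sketch are illustrative assumptions, not taken from the Sendai network.

```python
# Exact Gaussian conditioning for the MRF of equation (13) on a tiny
# illustrative "road network" (a 4-node path graph).
import numpy as np

edges = [(0, 1), (1, 2), (2, 3)]        # roads connected at intersections
n, xi, J = 4, 1.0, 0.5                  # made-up parameter values
h = np.array([0.2, 0.4, 0.4, 0.2])      # made-up biases

# Build the precision matrix implied by equation (13).
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = -J
    A[i, i] += J
    A[j, j] += J
A += xi * np.eye(n)

M, O = [1, 2], [0, 3]                   # missing and observed roads
y = np.array([0.30, 0.25])              # observed densities (assumed)

# Conditional mean: solve A_MM x_M = h_M - A_MO y.
x_M = np.linalg.solve(A[np.ix_(M, M)], h[M] - A[np.ix_(M, O)] @ y)
print("reconstructed densities:", x_M)
```

On a real network the same solve applies, with $A_{\mathcal{MM}}$ sparse and amenable to iterative solvers.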
The expectations are obtained by solving the simultaneous equations

$$x_i = \frac{1}{\xi + |\partial(i)| J} \Big( h_i + J \sum_{j \in \partial(i)} z_j \Big), \quad i \in \mathcal{M}, \quad (16)$$

(Footnote: Although this expression seems to differ slightly from the original model proposed in reference [6], it is essentially equivalent to the original model.)
by an iteration method, where $|A|$ denotes the number of elements in the assigned set $A$ and

$$z_j = \begin{cases} x_j & j \in \mathcal{M} \\ y_j & j \in \mathcal{O} \end{cases}.$$

Equation (16) is known as the mean-field equation, which is obtained by the naive mean-field approximation for $P_{\mathrm{model}}(x_{\mathcal{M}} \mid x_{\mathcal{O}} = y, \theta^*)$, and is also known as the Gauss-Seidel method. It is known that in Gaussian graphical models (GGMs) mean-field equations and Gauss-Seidel methods are equivalent in general, and that mean-field equations always provide exact expectations [7].

In this section, we describe the application of our Bayesian traffic data reconstruction to the road network of the city of Sendai (shown in figure 3) and show the performance of our model. Figure 3 shows the road network of Sendai, which consists of about ten thousand roads. According to our Bayesian traffic data reconstruction scheme, we first defined the MRF model shown in equation (13) for the road network. The structure of the MRF was constructed according to figure 2.

Figure 3: Road network of Sendai, Japan. This network consists of about ten thousand roads.

The parameters were determined by the maximum likelihood estimation shown in section 2.2 with regularization [6]. Regularizations are frequently used to avoid over-fitting to noise in training data. In the maximum likelihood estimation, we used $N = 359$ complete traffic data generated by a traffic simulator for the road network of Sendai. Although the traffic data were not real, they were presumed to represent typical behavior of traffic in Sendai. Using the MRF model, we reconstructed the traffic densities of Sendai.

Figure 4: True traffic density used in our numerical experiment. Each road is colored according to its traffic density. The network in the left panel is the entire network and the network in the right panel is an enlarged image of the center of Sendai.

Figure 4 shows the traffic density data that were not used in the learning, namely, the test data.
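As a concrete illustration of the iterative solution of equation (16) discussed above, the following sketch runs the Gauss-Seidel (naive mean-field) sweeps on a small path graph; the graph and all numbers are illustrative assumptions, not taken from the Sendai network.

```python
# Gauss-Seidel / naive mean-field iteration of equation (16):
#   x_i = (h_i + J * sum_{j in ∂(i)} z_j) / (xi + |∂(i)| J),
# where z_j is the current estimate x_j for missing neighbors and the
# observed value y_j otherwise. All values below are made up.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # 4-road path graph
xi, J = 1.0, 0.5
h = {0: 0.2, 1: 0.4, 2: 0.4, 3: 0.2}
y = {0: 0.30, 3: 0.25}                  # observed roads O = {0, 3}
missing = [1, 2]                        # unobserved roads M = {1, 2}

x = {i: 0.0 for i in missing}           # initialization
for _ in range(100):                    # sweep until the fixed point
    for i in missing:
        s = sum(x[j] if j in x else y[j] for j in neighbors[i])
        x[i] = (h[i] + J * s) / (xi + len(neighbors[i]) * J)

print({i: round(v, 4) for i, v in x.items()})
```

Consistent with the equivalence noted above for GGMs, the fixed point of these sweeps coincides with the exact conditional expectations of the Gaussian model.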
In order to visually represent the densities, we quantized the densities into five stages at 0.03 intervals; each road is colored according to its quantized traffic density. The densities increase in the following order: black, blue, green, yellow, and red. Thus, a road colored black has a density in the interval [0, 0.03). In our experiment, each road was treated as unobserved with missing probability $p = 0.$
8. Figure 5 shows the positions of the unobserved roads, which are colored red. We reconstructed the densities on the unobserved roads using the densities on the observed roads, which are colored black in the figure. Our Bayesian reconstruction result is shown in figure 6.

Figure 5: Positions of unobserved roads, where the unobserved roads are colored red. About 80% of roads are unobserved. The network in the left panel is the entire network and the network in the right panel is an enlarged image of the center of Sendai.

Figure 6: Reconstruction result obtained by using our Bayesian reconstruction, where each road is colored according to its traffic density. The network in the left panel is the entire network and the network in the right panel is an enlarged image of the central area of Sendai.

The mean square error (MSE) between the true densities shown in figure 4 and the reconstructed densities shown in figure 6 on the unobserved roads is approximately 0.001447, where the MSE is defined by

$$\mathrm{MSE} := \frac{1}{|\mathcal{M}|} \sum_{i \in \mathcal{M}} \big( x_i^{\mathrm{data}} - x_i^{\mathrm{recon}} \big)^2, \quad (17)$$

where $x_i^{\mathrm{data}}$ is the true density on road $i$ and $x_i^{\mathrm{recon}}$ is the density on road $i$ reconstructed by our method. The scatter plot of this reconstruction is shown in figure 7. The correlation coefficient of this scatter plot is approximately 0.919, which is close to one. Thus, our simple MRF model can be expected to capture a static statistical property of the traffic data.

Next, we address the average performance of our reconstruction method versus the value of the missing probability $p$. Figure 8 shows the MSE versus the value of the missing probability $p$. Each point is the average value of the MSE over 100 trials using the leave-one-out cross-validation method. It can be seen that the error increases with the value of the missing probability $p$.
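The MSE of equation (17) and the correlation coefficient used for the scatter plot can be computed as follows; the arrays are random stand-ins for the true and reconstructed densities, so the printed numbers do not correspond to the experiment.

```python
# Evaluation metrics: MSE over the missing set (equation (17)) and the
# correlation coefficient of the true-vs-reconstructed scatter plot.
# Synthetic densities are used purely for illustration.
import numpy as np

rng = np.random.default_rng(2)
x_data = rng.uniform(0.0, 0.15, size=1000)           # "true" densities
x_recon = x_data + rng.normal(0.0, 0.02, size=1000)  # "reconstructed"

mse = np.mean((x_data - x_recon) ** 2)
corr = np.corrcoef(x_data, x_recon)[0, 1]
print(f"MSE = {mse:.4f}, correlation = {corr:.3f}")
```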
In this section, we clarify the relationship between the model parameters in the MRF model in equation (13) and the reconstruction performance from a statistical mechanical point of view.

Figure 7: Scatter plot of the true densities in figure 4 and the reconstructed densities in figure 6.

Figure 8: MSE versus the missing probability $p$. The solid curve is the Bezier interpolation of the points.

In our analysis, we assume that the prior probability of traffic data has the same form as our model in equation (13),

$$P_{\mathrm{prior}}(x \mid \theta_0) = \frac{1}{Z(\theta_0)} \exp\Big( \sum_{i \in V} h_i^0 x_i - \frac{\xi_0}{2} \sum_{i \in V} x_i^2 - \frac{J_0}{2n} \sum_{i<j} (x_i - x_j)^2 \Big), \quad (18)$$

where, for the convenience of the analysis, the interactions are defined over all pairs of variables, and $\theta_0 = \{h^0, \xi_0, J_0\}$ are the true parameters. The parameters $\theta = \{h, \xi, J\}$ of the reconstruction model may deviate from $\theta_0$; in particular, the errors in the biases, $\varepsilon_i := h_i - h_i^0$, are assumed to be drawn from a distribution $p_\varepsilon(\varepsilon)$ with mean $\mu_\varepsilon$ and variance $\sigma_\varepsilon^2$. The reconstruction performance is measured by the quantity $E$ defined in equation (27).

Figure 9 shows the plot of $E$ against the error $r$ when $\xi = \xi_0$, $\mu_\varepsilon = \sigma_\varepsilon = 0$, $p = 0.5$, and $J = J_0 + r$. It can be seen that, where $J$ is bigger than $J_0$, the performance level is relatively robust. In contrast, the performance level drastically decreases when $J$ is smaller than $J_0$.

Figure 9: Plot of $E$ against error $r$ when $\xi = \xi_0$, $\mu_\varepsilon = \sigma_\varepsilon = 0$, $p = 0.5$, and $J = J_0 + r$. The vertical axis is $E$ in equation (27). The inset is an enlarged plot around $r = 0$.

From equation (27), it can be seen that the dependency on the missing rate $p$ arises when model errors exist in the bias parameters or in the variance parameters $\xi$ and $\xi_0$. Next, we consider the case where model errors exist in the bias parameters and the variance parameters. Figure 10 shows the plot of $E$ against the missing rate $p$ when $\sigma_\varepsilon = 0$ and $J = J_0$, with nonzero model errors in $\xi$ and $\mu_\varepsilon$. The reconstruction performance level decreases with the increase in the value of $p$. This performance behavior seems to be qualitatively similar to the behavior of our numerical traffic density reconstruction in figure 8. From this result, we can presume that the behavior of the MSE in figure 8 is caused primarily by the model errors in either the biases or the variance, or both.

The parameters in our reconstruction model were determined by the training data as described in section 3.2.
However, in the training, it was not easy to find the truly optimal values of the parameters from the training data, because the number of training data was much smaller than the number of parameters. This could be one of the reasons why the model errors exist.

Figure 10: Plot of $E$ against missing rate $p$ when $\sigma_\varepsilon = 0$ and $J = J_0$, with nonzero model errors in $\xi$ and $\mu_\varepsilon$. The vertical axis is $E$ in equation (27).

In this paper, we introduced the Bayesian reconstruction framework, in which missing data are probabilistically interpolated, and overviewed the application of Bayesian reconstruction to the problem of traffic data reconstruction in the field of traffic engineering. Our traffic reconstruction model in equation (13) neglects some real traffic properties, for example, traffic lanes, contraflows, and so on. Nevertheless, the results of our reconstruction seem to be accurate. It can be expected that our simple GGM captures the static statistical properties of traffic data and that it can be used as an important base model for Bayesian traffic reconstruction. The extension of our model by taking real traffic properties into account should be addressed in a future study.

In the latter part of this paper, we evaluated the statistical performance of our reconstruction by using a statistical mechanical analysis, that is, a mean-field analysis. In our analysis, we used a simplified reconstruction model, which has no network structure, for the convenience of the calculations. However, since real road networks do have network structures, the result of our evaluation can only be a rough approximation. The authors have proposed a method based on belief propagation to evaluate the statistical properties of Bayesian reconstruction on structured networks in the context of image processing [8]. We expect that the method can be applied to Bayesian traffic reconstruction and can lead to more realistic evaluations.
A Mean-field Analysis for the Traffic Data Reconstruction Model

The partition function of the model in equation (18) is

$$Z(\theta) = \int_{-\infty}^{\infty} \exp\Big( \sum_{i \in V} h_i x_i - \frac{\xi}{2} \sum_{i \in V} x_i^2 - \frac{J}{2n} \sum_{i<j} (x_i - x_j)^2 \Big) \, dx.$$

Performing the Gaussian integral, the free energy per variable, $F(\theta) := -\frac{1}{n} \ln Z(\theta)$, is

$$F(\theta) = -\frac{1}{2} \ln \frac{2\pi}{\xi + J} - \frac{1}{2n(\xi + J)} \sum_{i \in V} h_i^2 - \frac{J}{2n^2(\xi + J)\xi} \Big( \sum_{i \in V} h_i \Big)^2, \quad (30)$$

so that the first- and second-order moments of the prior model are given by

$$\langle x_i \rangle_{\mathrm{prior}} = -n \frac{\partial F(\theta)}{\partial h_i} = \frac{h_i}{\xi + J} + \frac{J}{n(\xi + J)\xi} \sum_{j \in V} h_j \quad (31)$$

and

$$\langle x_i x_j \rangle_{\mathrm{prior}} = -n \frac{\partial^2 F(\theta)}{\partial h_i \partial h_j} + \langle x_i \rangle_{\mathrm{prior}} \langle x_j \rangle_{\mathrm{prior}} = \frac{\delta_{ij}}{\xi + J} + \frac{J}{n(\xi + J)\xi} + \langle x_i \rangle_{\mathrm{prior}} \langle x_j \rangle_{\mathrm{prior}}, \quad (32)$$

respectively, where $\langle \cdots \rangle_{\mathrm{prior}} = \int_{-\infty}^{\infty} (\cdots) P_{\mathrm{prior}}(x \mid \theta) \, dx$ and $\delta_{ij}$ is Kronecker's delta.

Next, we find the free energy for the conditional probability of the reconstruction model in equation (20). The conditional probability is expressed by

$$P_{\mathrm{model}}(x_{\mathcal{M}} \mid x_{\mathcal{O}}, \theta) = \frac{1}{Z(\theta, x_{\mathcal{O}})} \exp\Big( \sum_{i \in \mathcal{M}} \big(\beta_i + f(x_{\mathcal{O}})\big) x_i - \frac{1}{2}\Big(\xi + \frac{(n-1)J}{n}\Big) \sum_{i \in \mathcal{M}} x_i^2 + \frac{J}{n} \sum_{\substack{i<j \\ i,j \in \mathcal{M}}} x_i x_j \Big),$$

where $\beta_i$ is the bias of road $i$ in the reconstruction model and $f(x_{\mathcal{O}}) := (J/n) \sum_{j \in \mathcal{O}} x_j$. The corresponding free energy, $F(\theta, x_{\mathcal{O}}) := -\frac{1}{t} \ln Z(\theta, x_{\mathcal{O}})$, with $t := |\mathcal{M}|$ and missing rate $p := t/n$, is

$$F(\theta, x_{\mathcal{O}}) = -\frac{1}{2} \ln \frac{2\pi}{\xi + J} - \frac{1}{2t(\xi + J)} \sum_{i \in \mathcal{M}} \big(\beta_i + f(x_{\mathcal{O}})\big)^2 - \frac{pJ}{2t^2(\xi + J)\big(\xi + (1-p)J\big)} \Big( \sum_{i \in \mathcal{M}} \big(\beta_i + f(x_{\mathcal{O}})\big) \Big)^2, \quad (34)$$

so that the first-order moments of the conditional probability of the reconstruction model are given by

$$\langle x_i \rangle_{\mathcal{M}|\mathcal{O}} = -t \frac{\partial F(\theta, x_{\mathcal{O}})}{\partial \beta_i} = \frac{\beta_i + f(x_{\mathcal{O}})}{\xi + J} + \frac{pJ}{t(\xi + J)\big(\xi + (1-p)J\big)} \sum_{j \in \mathcal{M}} \big(\beta_j + f(x_{\mathcal{O}})\big) \quad (35)$$

for all $i \in \mathcal{M}$, where we define $\langle \cdots \rangle_{\mathcal{M}|\mathcal{O}} := \int_{-\infty}^{\infty} (\cdots) P_{\mathrm{model}}(x_{\mathcal{M}} \mid x_{\mathcal{O}}, \theta) \, dx_{\mathcal{M}}$.

We consider the thermodynamic limit by taking the limits $n, t \to \infty$ while $p$ is fixed at a finite constant.
In the thermodynamic limit, the free energies in equations (30) and (34) reduce to

$$F(\theta) = -\frac{1}{2} \ln \frac{2\pi}{\xi + J} - \frac{\sigma_h^2 + \mu_h^2}{2(\xi + J)} - \frac{J \mu_h^2}{2(\xi + J)\xi}$$

and

$$F(\theta, x_{\mathcal{O}}) = -\frac{1}{2} \ln \frac{2\pi}{\xi + J} - \frac{\sigma_h^2 + \sigma_\varepsilon^2 + \big(\mu_h + \mu_\varepsilon + f(x_{\mathcal{O}})\big)^2}{2(\xi + J)} - \frac{pJ \big(\mu_h + \mu_\varepsilon + f(x_{\mathcal{O}})\big)^2}{2(\xi + J)\big(\xi + (1-p)J\big)},$$

respectively, where $\mu_h$ and $\sigma_h^2$ are the average and the variance of $p_h(h)$, respectively, and $\mu_\varepsilon$ and $\sigma_\varepsilon^2$ are the average and the variance of $p_\varepsilon(\varepsilon)$, respectively. $\{\mu_h, \mu_\varepsilon, \sigma_h^2, \sigma_\varepsilon^2\}$ appear because of the law of large numbers. Similarly, the moments in equations (31), (32), and (35) reduce to

$$\langle x_i \rangle_{\mathrm{prior}} = \frac{h_i}{\xi + J} + \frac{J \mu_h}{(\xi + J)\xi}, \quad (36)$$

$$\langle x_i x_j \rangle_{\mathrm{prior}} = \frac{\delta_{ij}}{\xi + J} + \langle x_i \rangle_{\mathrm{prior}} \langle x_j \rangle_{\mathrm{prior}}, \quad (37)$$

and

$$\langle x_i \rangle_{\mathcal{M}|\mathcal{O}} = \frac{\beta_i + f(x_{\mathcal{O}})}{\xi + J} + \frac{pJ \big(\mu_h + \mu_\varepsilon + f(x_{\mathcal{O}})\big)}{(\xi + J)\big(\xi + (1-p)J\big)}, \quad (38)$$

respectively, in the thermodynamic limit.

Acknowledgment

The authors are very grateful to Professor Masao Kuwahara and Dr. Jinyoung Kim of the Graduate School of Information Sciences, Tohoku University, for providing road network data and traffic simulation data. This work was partially supported by grants-in-aid (nos. 25280089, 24700220, and 25-7259) from the Ministry of Education, Culture, Sports, Science and Technology of Japan.
References

[1] Bertalmío, M., Sapiro, G., Caselles, V., Ballester, C., "Image inpainting," Proceedings of SIGGRAPH 2000 (2000).
[2] Yasuda, M., Tanaka, K., "Mean field approximation for fields of experts," Interdisciplinary Information Sciences: 113–119 (2013).
[3] Yasuda, M., Ohkubo, J., Tanaka, K., "Digital image inpainting based on Markov random field," Proceedings of 2005 International Conference on Computational Intelligence for Modelling, Control and Automation (CIMCA'05): 747–752 (2005).
[4] Yasuda, M., Ohkubo, J., Tanaka, K., "Digital image inpainting algorithm by using Gaussian graphical model," Information Technology Letters (Proceedings of Forum of Information Technology (FIT) 2006): 225–228 (2006) (in Japanese).
[5] Roth, S., Black, M.J., "Fields of experts," International Journal of Computer Vision: 205–229 (2009).
[6] Kataoka, S., Yasuda, M., Furtlehner, C., Tanaka, K., "Traffic data reconstruction based on Markov random field modeling," Inverse Problems: 025003 (2014).
[7] Weiss, Y., Freeman, W.T., "What makes a good model of natural images," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2007).
[8] Kataoka, S., Yasuda, M., Tanaka, K., "Statistical analysis of Gaussian image inpainting problems," Journal of the Physical Society of Japan, 81.