Computationally Efficient Multiscale Neural Networks Applied To Fluid Flow In Complex 3D Porous Media
Javier Santos, Ying Yin, Honggeun Jo, Wen Pan, Qinjun Kang, Hari Viswanathan, Masa Prodanovic, Michael Pyrcz, Nicholas Lubbers
A PREPRINT
Javier E. Santos
The University of Texas at Austin
Ying Yin
Xi’an Jiaotong University
Honggeun Jo
The University of Texas at Austin
Wen Pan
The University of Texas at Austin
Qinjun Kang
Los Alamos National Laboratory
Hari S. Viswanathan
Los Alamos National Laboratory
Maša Prodanović
The University of Texas at Austin
Michael J. Pyrcz
The University of Texas at Austin
Nicholas Lubbers
Los Alamos National Laboratory

February 17, 2021

ABSTRACT
The permeability of complex porous materials is of interest to many engineering disciplines. This quantity can be obtained via direct flow simulation, which provides the most accurate results, but is very computationally expensive. In particular, the simulation convergence time scales poorly as simulation domains become tighter or more heterogeneous. Semi-analytical models that rely on averaged structural properties (i.e., porosity and tortuosity) have been proposed, but these features only summarize the domain, resulting in limited applicability.

On the other hand, data-driven machine learning approaches have shown great promise for building more general models by virtue of accounting for the spatial arrangement of the domains' solid boundaries. However, prior approaches building on the Convolutional Neural Network (ConvNet) literature concerning 2D image recognition problems do not scale well to the large 3D domains required to obtain a Representative Elementary Volume (REV). As such, most prior work focused on homogeneous samples, where a small REV entails that the global nature of fluid flow could be mostly neglected, and accordingly, the memory bottleneck of addressing 3D domains with ConvNets was side-stepped. Therefore, important geometries such as fractures and vuggy domains could not be well-modeled.

In this work, we address this limitation with a general multiscale deep learning model that is able to learn from porous media simulation data. By using a coupled set of neural networks that view the domain on different scales, we enable the evaluation of large ( > ) images in approximately one second on a single Graphics Processing Unit. This model architecture opens up the possibility of modeling domain sizes that would not be feasible using traditional direct simulation tools on a desktop computer. We validate our method with a laminar fluid flow case using vuggy samples and fractures.
As a result of viewing the entire domain at once, it is able to perform accurate prediction on domains exhibiting a large degree of heterogeneity. We expect the methodology to be applicable to many other transport problems where complex geometries play a central role.

Keywords: Convolutional Neural Networks · Multiscale · Machine Learning · Permeability · Lattice-Boltzmann · Representative Elementary Volume (REV)
In the last few decades, micro-tomographic imaging in conjunction with direct numerical simulations (digital rock technologies) has been developed extensively to act as a complementary tool for laboratory measurements of porous materials [1]. Many of these breakthroughs are partly thanks to advances in data storing and sharing [2], wider availability of imaging facilities [3], and better technologies (hardware and software) to visualize fine-scale features of porous media [4]. Nevertheless, characterization based on stand-alone images does not provide enough insight into how the small-scale structures affect the macroscopic behavior for a given phenomenon (i.e., fluid flow). A more robust way of understanding these (and potentially being able to upscale them) is through simulating the underlying physics of fluid flow.

The increase in speed and availability of computational resources (graphics processing units, supercomputer clusters, and cloud computing) has made it possible to develop direct simulation methods that obtain petrophysical properties based on 3D images [5, 6, 7, 8]. However, solving these problems in time frames that could allow their industrial applicability requires thousands of computing cores. Furthermore, the most insight could be gained from repeated simulation with dynamically changing conditions (influence of diagenetic processes such as cementation and compaction, surface properties like roughness, or tuning the segmentation of a sample to match experimental measurements), where solving a forward physics model several times (in similar domains) would be necessary. This is prohibitively expensive in many cases. A machine learning approach that could give fast and accurate approximations is of great interest.

A particular physical framework of interest in digital rock physics is to describe how a fluid flows through a given material driven by a pressure difference.
This is relevant to characterize how easy it is for a fluid to travel through a specific sample, and it can also reveal preferential fluid paths and potential bottlenecks for flow. By understanding the fluid behavior in a small sample, it is possible to use this data to inform larger-scale processes about the effect of the microstructure. The simplest and most important way to summarize the microstructural effects on flow is with a permeability, which is a volume-averaged property derived from the fluid velocity and describes how well a fluid can advance through its connected void-space. Knowing the permeability is of interest not only for petroleum engineering [9], carbon capture and sequestration [10], or aquifer exploitation [11], but also in geothermal engineering [12], membrane design, and fuel cell applications [13].

Despite the fact that there are many published analytical solutions and computational algorithms to obtain the permeability in a faster manner, they do not work well in the presence of strong heterogeneities associated with important geometries such as fractures. This is partly due to the fact that most of these proposed solutions are computed based on averaged properties of the solid structure (like the porosity and the tortuosity of the sample [14, 15, 16, 17]). The main issue is that samples with very similar average structural values could have widely different volumetric flux behaviors (i.e., when fractures or vugs are present). For instance, a certain porous structure could have permeability values spanning three orders of magnitude depending on whether the domain is not fractured, or whether it hosts a fracture parallel or perpendicular to flow.
While these situations significantly affect permeability, the porosity remains relatively unchanged; there is no known route for characterizing the relationship between flow and microstructure in terms of a small number of variables.

To obtain a measure of the permeability of a sample taking into account the 3D microstructure, a fluid flow simulation can be carried out with a wide variety of iterative numerical methods to approximate the solution of the Navier-Stokes equation [18]. One of the most prominent is the Lattice-Boltzmann Method (LBM). Although these simulations are performed at a much smaller scale relative to a natural reservoir, they provide the critical parameters to enable the upscaling of hard data (cores coming from wells or outcrops) into field-scale simulators. Although it would be desirable to simulate bigger computational volumes that contain more information about the reservoir of interest (since imaging technology can provide volumes that are or larger), it is computationally expensive, making it very difficult to perform routinely or repeatedly.

A representative elementary volume (REV) has to be ensured to reliably utilize these properties in large-scale (field) simulations (and thus upscale). An REV is defined as the size of a window where measurements are scale-independent and that accurately represents the system [19]. Notwithstanding, having an REV for, e.g., porosity (which is easily determined from a segmented image) does not guarantee that this window size would have a representative response in a flow property like permeability. As shown in [20], for fairly homogeneous samples, the side length of the window needed to obtain an REV in a dynamic property is at least five times what it is for the structure (porosity). This is one of the reasons why porosity alone is a poor estimate for permeability: even when the microstructural windows are similar, the flow structures that they host could be very different due to the global nature of the steady-state solution.
In the context of single fractures, it is still unclear whether an REV exists [21, 22]. This highlights the need for more advanced methods that can provide accurate solutions on large samples and take into account all the complexities of the domain.
In the last decade, Convolutional Neural Networks (ConvNets) have become a prominent tool in the field of image analysis. These have taken over traditional tools for computer vision tasks such as image recognition and semantic segmentation, as a result of being easily trained to create spatially-aware relationships between inputs and outputs. This is accomplished with learnable kernels, which can be identified with small windows that have the same dimensionality as the input data (i.e., 2D for images, 3D for volumes). They have been successfully applied in many tasks regarding earth science disciplines [23, 24, 25], and particularly in the field of digital rocks [26, 27, 28, 29, 30]. These architectures have also been useful for solving flow [31], successfully modeling the relationship between 3D microstructure and flow response much more accurately than empirical formulas that depend only on averaged properties.

However, ConvNets are expensive to scale to 3D volumes. This is due to the fact that these structures are memory intensive, so traditional networks used for computer vision tasks (i.e., the UNet [32] or the ResNet [33]) limit the input sizes to be around . As shown in [31], one could subsample the domain into smaller pieces to use these architectures, where the subsample does not need to be an REV, but it has to be accompanied by features that inform the model about the original location of this subdomain (i.e., tortuosity, connectivity, distance to the non-slip boundaries). This method provides accurate results; nevertheless, predictions stop being reliable in domains with large heterogeneities (such as a fracture or a vug).

A multiscale approach that is able to capture large- and small-scale aspects of the microstructure simultaneously is an attractive proposal to overcome this limitation. Multiscale approaches have precedent in the ConvNet literature. For example, Karras et al. [34] described progressive growing of a generative network to build a high-resolution model of image datasets by starting with coarser, lower-resolution models and adding to them. Perhaps the most relevant to our work is SinGAN [35], another generative model that uses a linked set of networks to describe images at different scales; finer scales build upon the models for coarser scales. We invoke similar principles to build the MS-Net for hierarchical regression, which performs regression based on a hierarchical principle: coarse inputs provide broad information about the data, and progressively finer-scale inputs can be used to refine this information. A schematic of the workflow is shown in Figure 1. In this paper we use the MS-Net to learn relationships between pore structure and flow fields of steady-state solutions from LBM. Our model starts by analyzing a coarsened version of a porous domain (where the main heterogeneities affecting flow are present), and then proceeds to make a partial prediction of the velocity field. This is then passed subsequently to finer-scale models to refine this coarse prediction until the entire flow field is recovered.
This paradigm exhibits advantages over other ConvNet approaches such as the PoreFlow-Net [31] with regard to the ability to learn on heterogeneous domains and in terms of the computational expense of the model. While applied here to fluid flow, we believe this hierarchical regression paradigm could be applied to many disciplines dealing with 3D volumes, not limited to the problems studied here.

The rest of this manuscript is organized as follows. In Section 2 we describe our methods, and in Section 3 we describe the data we have applied our methods to. In Section 4 we describe the results of training to two different data sources. We show the results on test data comprised of a variety of samples presenting a wide range of heterogeneities at different scales. In Section 5 we provide discussion, including comments on the memory-efficiency of the approach, and we conclude in Section 6.
Our end goal is to train a neural network to learn a mapping between pore structure and the single-phase velocity field of a fluid, fixing the fluid properties and driving force. We aim to capture the steady-state fluid flow and associated statistics thereof, such as the permeability, but emphasize that other field quantities at steady-state could be addressed with the same framework. The main requirement for our approach is to have a domain (constituting a 3D binary array) and a matching response (simulation or experiment) across that domain. Additional information would be needed to capture more complex situations such as time-dependent (unsteady) flow.

The task of learning the physics of transport in porous media requires a model that can learn complex relationships (like the one between structure and fluid velocity), and that has the capacity to generalize to many possible domains (outside of the ones used for training) for its broader applicability. The standard approach in deep learning applications to obtaining a model with these two properties is: 1) increasing its depth (its number of layers), and 2) increasing its width (thereby increasing the number of neurons). Although these strategies typically result in higher accuracy, they always result in a larger number of neurons required to evaluate the model. The memory cost is proportional not only to the width and depth, but also to the volume that needs to be analyzed. In practice this has limited the volume with which 3D data can be evaluated on a single GPU to sizes on the order of [31].
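As a rough illustration of this scaling (a back-of-the-envelope sketch under our own simplifying assumptions, not the authors' accounting), the activation memory of a fully convolutional 3D network grows linearly with depth and width, and with the cube of the domain side length:

```python
def conv3d_activation_memory_gb(side, n_layers, n_filters, bytes_per_float=4):
    # Hypothetical sizing helper: one feature volume per filter per layer,
    # ignoring weights, gradients, and framework overhead.
    voxels = side ** 3
    return n_layers * n_filters * voxels * bytes_per_float / 1e9
```

Under these assumptions, doubling the side of the domain multiplies the activation memory by eight, which is why dense 3D ConvNets saturate a single GPU well before reaching REV-scale volumes.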
Figure 1: Overview of our multiscale network prediction. Starting from a 3D µCT fractured carbonate (an unsegmented cross-section of the domain is shown in the back, where unconnected microporosity can be observed, and the main fracture of the domain is shown in dashed red lines), we predict the single-phase flow field of the image by combining predictions made over multiple resolutions of the input. Each of the neural network models (depicted as purple boxes) scans the image with different window sizes (which increase exponentially). The predictions of the models are all linked together to provide an approximation of the Navier-Stokes solution.

One approach to address this limitation is to break large domains into sub-samples, and augment the feature set so that it contains hand-crafted information pertaining to the relative location of each sub-sample [31]. This can add information about the local and global boundaries surrounding the subsample. However, a clear limitation of this approach is its applicability to domains containing large-scale heterogeneity. Figure 2 shows the variation of properties as a function of window size for various data analyzed in this paper, and it is clear that in some cases the REV may be much larger than 80 voxels. If this data is split into sub-domains, the large-scale information about features is limited to the hand-crafted information fed to the model, and the power of machine learning to analyze the correlations in the 3D input space is limited to window sizes that are orders of magnitude smaller than the REV.

Figure 2: Coefficient of variation (ratio of the standard deviation to the mean) of the porosity and fluid flow field for domains subsampled using increasingly larger window sizes. We show four examples: a sphere pack, a vuggy core, an imaged carbonate fracture (from Figure 1), and a synthetic fracture (from Section 4.2).
For samples presenting large heterogeneities (like the fractures), very large windows are necessary to capture representative porosity and flow patterns.

To address the difficulties with training to small volumes, we propose the MultiScale Network (MS-Net), a neural network system to learn physics in complex porous materials. The MS-Net is a coupled system of convolutional neural networks that allows training on entire samples to understand the relationship between pore structure and single-phase flow physics, including large-scale effects. This makes it possible to provide accurate flow field estimations in large domains, including large-scale heterogeneity, without complex feature engineering.
In the following sections, we first provide an overview of how convolutional neural networks work, and explain our proposed system of single-scale models, the MS-Net, which work collectively to construct a prediction for a given sample. We then explain our loss function, which couples these networks together. Finally, we explain the coarsening and refining operations used to move inputs and outputs, respectively, between different scales.
Our model is comprised of individual, memory-inexpensive neural networks, which are described in Section A. All of the individual submodels of our system are composed of identical fully convolutional networks (which means that the dimensions of the 3D inputs are not modified along the way). The most important component of a convolutional network is the convolutional layer. This layer contains kernels (or filters) of size $k_{size}$ that are slid across the input to create feature maps via the convolution operation:

$$x_{out} = f\left( \sum_{i=1}^{F} x_{in} * k_i + b_i \right), \qquad (1)$$

where $F$ denotes the number of kernels of that layer, $*$ is the convolution operation, $b_i$ a bias term, and $f$ is a non-linear activation function. A detailed explanation of these elements is provided in [36]. The elements of these kernels are called trainable parameters, and they operate on all the voxels of the domain. These parameters are optimized (or learned) during training. By stacking these convolutional layers, a convolutional network can build a model which is naturally translationally covariant; that is, a shift of the input image volume produces a shift in the output image volume [37]. In this work we use $k_{size} = 3$.

An important concept in convolutional neural networks is the field of vision. The field of vision (FoV) dictates to what extent parts of the input might affect sections of the output. For the case of a fully convolutional neural network with no coarsening inside the layers of the network (like ours), the FoV is given by the following relation:

$$\mathrm{FoV} = L(k_{size} - 1) + 1, \qquad (2)$$

where $L$ is the number of convolutional layers of the network, and $k_{size}$ the size of the kernel. For the case of the single network architecture used here (see Section A for details), the FoV is 11 voxels. This is much smaller than the REV of any of our samples (Figure 2). It is worth noting that it is not possible to add more layers to increase the FoV and still train with samples that are or larger on current GPUs. To overcome this, we propose a system of multiscale neural networks, which is explained in the next section.

To be able to train a model with large samples, we propose a system of interacting small neural networks. The individual neural network structure is described in Section A. Each neural network takes as input the same domain at different scales (as explained in Section 2.3). Each network is responsible for capturing the fluid response at a certain resolution and passing it to the next network (as shown in Figure 3).

What changes between the individual networks is the number of inputs that they receive. The coarsest model receives only the domain representation at the coarsest scale (Equation 4), while the subsequent ones receive two: the domain representation at the appropriate scale, and the prediction from the previous scale (Figure 3 and Equation 5). As mentioned above, the input's linear size is reduced by a factor of two between every scale.

Mathematically, the system of networks can be described as such:

$$X_n = C(X_{n-1}) \qquad (3)$$
$$\hat{y}_N = NN_N(X_N) \qquad (4)$$
$$\hat{y}_{n-1} = NN_{n-1}\left(X_{n-1}, R_m(\hat{y}_n)\right) + R_m(\hat{y}_n) \qquad (5)$$
$$\vdots$$
$$\hat{y}_0 = NN_0\left(X_0, R_m(\hat{y}_1)\right) + R_m(\hat{y}_1), \qquad (6)$$

where $N$ indicates the coarsest scale, $n$ indexes scales, $NN_n$ the individual neural networks, and $C(\cdot)$ and $R_m(\cdot)$ the coarsening and refinement operations, respectively, which are explained in Section 2.4. In this system of equations, the input is first coarsened as many times as there are networks.
The coarsest network has the largest FoV with respect to the input, and processes the largest-scale aspects of the problem. The results of this network are used both as a component of the output of the system, and are made available to the finer-scale networks, so that finer-scale, more local processing that can resolve details of flow has access to the information processed at larger scales. The process of coarsening an image progressively doubles the field of vision per scale, yielding the following formula for the FoV in the MS-Net:

$$\mathrm{FoV}_{MS\text{-}Net} = \left( L(k_{size} - 1) + 1 \right) \cdot 2^{N}, \qquad (7)$$

where $N$ is the number of coarsening steps. As we stated in the previous section, a single network has a FoV of 11 voxels. With our proposed system, the network one scale coarser can effectively see a window of 22 voxels of the original image, and so on, with the FoV increasing exponentially with the number of scales. Using these principles, the model is able to analyze large portions of the image that can contain large-scale heterogeneities affecting flow. The sizes of the windows do not strictly need to be REVs, since the network still processes the entire image at once. Nevertheless, the bigger the FoV, the easier it is to learn the long-range interactions of big heterogeneities affecting flow. Computationally, it would be possible to add enough scales to have FoVs that are close to the entire computational size of the sample ( − ). Early experiments with a very large number of scales revealed that this limits the applicability of the model when applied to small samples.

Figure 3: The MS-Net pipeline. Our model consists of a system of fully convolutional neural networks where the feed-forward pass is done from coarse to fine (left to right). Each scale (n) learns the relationship of solid structure and velocity response at that particular image resolution. The number of scales is set by the user, and all these scales are trained simultaneously. In this figure we show the original (finest) scale, the coarsest (n), and the second coarsest (n−1). The masked refinement step is explained in Section 2.4. X represents the original structure and ŷ the final prediction of the model.

Our workflow relies on a multiscale modeling approach. We use the scale number to denote how many times the original image has been coarsened. Hence, scale 0 denotes the original image, scale 1 an image that has been coarsened once, and so on.
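The FoV formulas and the coarse-to-fine recursion of Equations 3–6 can be sketched in a few lines of numpy (a minimal illustration assuming factor-of-2 average pooling and a naive refinement; the `nets` callables stand in for the trained networks $NN_n$ and are not the authors' implementation):

```python
import numpy as np

def fov_single(n_layers, k_size=3):
    # Eq. (2): field of vision of one fully convolutional network
    return n_layers * (k_size - 1) + 1

def fov_ms_net(n_layers, n_scales, k_size=3):
    # Eq. (7): each factor-of-2 coarsening doubles the effective FoV
    return fov_single(n_layers, k_size) * 2 ** n_scales

def coarsen(x):
    # C: average each 2x2x2 neighborhood into one coarse voxel
    d, h, w = (s // 2 for s in x.shape)
    return x.reshape(d, 2, h, 2, w, 2).mean(axis=(1, 3, 5))

def refine(y):
    # naive R: replicate each coarse voxel into a 2x2x2 block
    return y.repeat(2, axis=0).repeat(2, axis=1).repeat(2, axis=2)

def ms_net_forward(x0, nets):
    # Eq. (3): build the image pyramid X_0 ... X_N by repeated coarsening
    xs = [x0]
    for _ in range(len(nets) - 1):
        xs.append(coarsen(xs[-1]))
    y = nets[-1](xs[-1], None)            # Eq. (4): coarsest prediction
    for n in range(len(nets) - 2, -1, -1):
        y_up = refine(y)                  # stands in for R_m here
        y = nets[n](xs[n], y_up) + y_up   # Eqs. (5)-(6): residual refinement
    return y
```

Because every finer network adds a residual on top of the refined coarse prediction, the coarsest network's output is implicitly present at every scale, consistent with the description above.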
As the input feature for each network we use the Euclidean distance transform of the binary domain (the Euclidean distance), which labels each void voxel with the distance to the closest solid wall (seen in Figure 4).
We selected this feature because it is very simple and inexpensive to compute, and provides more information than the binary image alone. The fact that no additional inputs are needed makes the MS-Net less dependent on problem-specific feature engineering.

This distance is related to the velocity field in a non-linear way, which must be learned by the network. Nonetheless, it is possible to visualize how coarser images provide more straightforward information about fluid flow. In Figure 5 we show input domains at different scales (top row) and corresponding scatter plots relating the feature value to the magnitude of the velocity. At scale zero, the distance value is not strongly related to the velocity; for a given distance value, the velocity may still range over more than three orders of magnitude. At scale 3, the feature and velocity have been coarsened several times, and a clearer relationship between the distance and velocity emerges. It is then the job of the $N$th neural network to determine how the 3D pattern of features is non-linearly related to the velocity at this scale, and to pass this information on to networks that operate at a finer scale, as shown in Figure 3.

Figure 5: Top: XY-plane cross-sections of the velocity in the Z-direction at increasingly coarser scales. Bottom: Scatter plots of the normalized distance transform vs. velocity after coarsening steps. As the system is coarsened, the correlation between the distance transform and the velocity becomes stronger. The normalized distance on the x-axis ranges from 0 to 1.

We use the term coarsening ($C$) to describe the operation of reducing the size of an image by operating on (in this case, averaging) its neighboring pixels.
We use the term refinement ($R$) to denote the operation of increasing the computational size of an image, but not the amount of information (this is also known as image upscaling, but we use the term refinement to avoid potential confusion with upscaling in reservoir engineering or other surrogate modeling). The coarsening and refining operations should have the following properties applied to data $z$ (i.e., input or output volumes):

$$\langle z_{n-1} \rangle = \langle C(z_{n-1}) \rangle \qquad (8)$$
$$z_n = C(R(z_n)), \qquad (9)$$

where the angle brackets $\langle \cdot \rangle$ represent the volumetric average over space, and the operation $R(\cdot)$ projects solutions from a coarse space back into the finer-resolution space while assigning zero predictions to regions that are occupied by the solid. The first equation indicates that coarsening should preserve the average prediction, and the second says that refinement should be a pseudo-inverse for coarsening; that is, if we take an image, refine it, and then subsequently coarsen it, we should arrive back at the original image. Note that the opposite operation, coarsening followed by refinement, cannot be invertible, as the coarser-scale image manifestly contains less information than the fine-scale one.

The coarsening operation is simple. As mentioned in Section 2.3, we first coarsen our input domain n times. Coarsening is applied via a simple nearest-neighbor average; every region of the image is coarsened to a single voxel by averaging. This operation is known as pooling in image processing.

The refinement operation is more subtle. There exists a naive near-neighbors refinement algorithm, wherein the voxel value is replicated across each region in the refined image.
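A minimal numpy sketch of these operations and their properties (assuming factor-of-2 pooling; `masked_refine` follows the masked refinement described in this section but is our own illustrative implementation, not the authors' code):

```python
import numpy as np

def coarsen(x):
    # C: average each 2x2x2 neighborhood into one coarse voxel
    d, h, w = (s // 2 for s in x.shape)
    return x.reshape(d, 2, h, 2, w, 2).mean(axis=(1, 3, 5))

def refine_naive(y):
    # naive R: replicate each coarse voxel into a 2x2x2 block
    return y.repeat(2, axis=0).repeat(2, axis=1).repeat(2, axis=2)

def masked_refine(y, fluid):
    # R_m: redistribute each coarse value over the fluid voxels of its block,
    # leaving solid voxels exactly zero while conserving the block average
    frac = coarsen(fluid)                     # fluid fraction per coarse voxel
    weight = np.zeros_like(frac)
    np.divide(1.0, frac, out=weight, where=frac > 0)
    return refine_naive(y * weight) * fluid

rng = np.random.default_rng(1)
z = rng.random((8, 8, 8))
assert np.isclose(coarsen(z).mean(), z.mean())      # Eq. (8): average preserved
zc = coarsen(z)
assert np.allclose(coarsen(refine_naive(zc)), zc)   # Eq. (9): pseudo-inverse
```

Note that, in this sketch, applying `masked_refine` to the coarsened binary domain recovers the original domain exactly, matching the uniqueness property claimed for the masked refinement.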
However, this presents difficulties for prediction in porous media; namely, if this operation is used for refinement, flow properties from coarser networks will be predicted on solid voxels where they are by definition zero, and the fine-scale networks will be forced to learn how to cancel these predictions exactly. Early experiments with this naive refinement operation confirmed that this behavior is problematic for learning.

Instead, we base our refinement operation on a refinement mask derived from the input solid/fluid nodes. This is performed such that, when refined back to the finest scale, the prediction will be exactly zero on the solid nodes and constant on the fluid nodes, while conserving the average. We refer to this masked refinement as $R_m$. This requires computing refinement masks that re-weight regions in the refined field based on the percentage of solid nodes in each sub-region. Refinement masks for a particular example are visualized in Figure 6. An example calculation and pseudo-code for this operation are given in the appendix, Section A.3. The masked refinement operation can then simply be computed as naive refinement followed by multiplication by the mask. Figure A.4 demonstrates the difference between the operations by comparing naive refinement with masked refinement.

The masked refinement operation is cheap and parameter-free; nothing needs to be learned by the neural network, unlike, for example, the transposed convolution [36]. We thus find it an apt way to account for the physical constraints posed by fields defined only within the pore. The masked refinement operation is also the unique refinement operation such that, when applied to the input binary domain, coarsening and refinement are true inverses; the masked refinement operation recovers the original input domain from the coarsened input domain.

Figure 6: Schematic of the coarsening and masking process. Top: Starting from the original domain, a series of coarsening steps are performed; neighboring voxels are averaged to produce one voxel at the following scale. Structural information is lost along the way.
Bottom: Masks for each scale, which re-weight a naive refinement operator. These masks have larger weights in regions where the prediction must be re-distributed, near the boundaries with solid nodes, and are zero in regions that correspond entirely to solid nodes.

To train the MS-Net, the weights of the convolutional kernels and biases (Equation 12) are optimized to minimize the following loss, based on the mean squared error for every sample $s$ at every scale $i$ between the prediction at that scale $\hat{y}_{i,s}$ and the true value coarsened to that scale $y_{i,s}$:

$$\mathcal{L} = \sum_{s=0}^{S} \sum_{i=0}^{n} \frac{\left\langle (y_{i,s} - \hat{y}_{i,s})^2 \right\rangle}{\sigma^2_{y_s}}, \qquad (10)$$

where $n$ is the total number of scales and $S$ the number of samples. This equation accounts for the variance in predictions, $\sigma^2_{y_s}$, in order to weight samples that contain very different overall velocity scales (permeabilities) more evenly. Since the coarsest scale is implicitly present in the solution at every scale (cf. Equation 6), the coarsest model is encouraged to output most of the magnitude of the velocity. This loss function is also useful for training with samples of varying structural heterogeneity (and fluid response), since the mean squared error is normalized by the sample variance to obtain a dimensionless quantity that is consistent for every sample.
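The variance-normalized sum over scales can be sketched as follows (a schematic numpy version of Equation 10 for a single sample; the list layout and function name are our own):

```python
import numpy as np

def ms_net_loss(y_true_scales, y_pred_scales, var_y):
    # Variance-normalized MSE summed over scales for one sample (cf. Eq. 10).
    # y_true_scales / y_pred_scales: lists of 3D arrays, finest to coarsest;
    # var_y: variance of this sample's finest-scale velocity field.
    return sum(((yt - yp) ** 2).mean() / var_y
               for yt, yp in zip(y_true_scales, y_pred_scales))
```

Summing this quantity over samples gives the total loss; because each term is divided by the sample's own velocity variance, tight and highly permeable samples contribute on a similar scale during training.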
To train and test our proposed model, we carried out single-phase simulations using our in-house multi-relaxation-time D3Q19 (three dimensions and 19 discrete velocities) lattice-Boltzmann code [38]. Our computational domains are periodic in the z-direction, where an external force is applied to drive the fluid forward, simulating a pressure drop. The rest of the domain faces are treated as impermeable. The simulation is said to achieve convergence when the coefficient of variation of the velocity field is smaller than − between 1000 consecutive iterations. We ran each simulation on 96 cores at the Texas Advanced Computing Center. The output of the LB solver is the velocity field in the direction of flow (here, the z-direction). To calculate the permeability of our sample we use the following equation:

$$k_{sample} = \frac{\bar{v}\,\mu}{\Delta p} \left( \frac{dx}{dl} \right)^2, \qquad (11)$$

where $\bar{v}$ is the mean of the velocity field in the direction of flow, $\mu$ and $\Delta p$ are the viscosity and the pressure gradient, respectively, and $dx/dl$ is the resolution of the sample (in meters per voxel). This is called the Darcy equation. Although we used the LBM to carry out our simulations, the following workflow is method-agnostic. It only relies on having a voxelized 3D domain with its corresponding voxelized response.
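A small helper illustrating this conversion (function names are our own, and we assume the resolution enters squared, as dimensional analysis of Darcy's law suggests, so that the result carries units of area):

```python
def permeability_m2(v_mean, mu, delta_p, resolution_m_per_voxel):
    # Darcy estimate (cf. Eq. 11): mean velocity times viscosity over the
    # pressure gradient, scaled by the voxel resolution squared to give m^2.
    return v_mean * mu / delta_p * resolution_m_per_voxel ** 2

def darcy_to_m2(k_darcy):
    # 1 darcy ~ 9.869233e-13 m^2; handy when reporting results in darcys
    return k_darcy * 9.869233e-13
```

All quantities except the resolution are taken directly from the (lattice-unit) simulation output, so only the voxel size needs to be known to report a physical permeability.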
Below we present two computational experiments (porous media and single fractures) that we carried out to show how the MS-Net is able to learn from 3D domains with heterogeneities at different scales. In the first subsection we show to what extent the MS-Net is able to learn from very simple sphere-pack geometries in order to accurately predict a wide range of realistic samples from the Digital Rocks Portal [2]. It is worth noting that simulating the training set took less than one hour per sample, and training the model took seven hours, while some samples in the test set took over a day to achieve convergence with the numerical LBM solver. In the second experiment, we show that training on two fracture samples of different aperture sizes and roughness parameters is enough to estimate permeabilities for a wider family of aperture sizes and roughness values.
To explore the ability of the MS-Net to learn the main features of flow through porous media, we utilize a series of five numerically dilated 256³ sphere packs (starting from the original imaged sphere pack of [39]) to train the network (Figure B.1). The porosities of the training samples range from 10 to 29%, and their permeabilities from 1 to 37 darcys. For reference, the simulation of the tightest sample took less than 50 minutes to converge.

Our model consists of four scales, using 2^{n+1} filters per scale (from 2 in the finest model to 16 in the coarsest). During training, each sample is passed through the model (as shown in Figure 3), and the model parameters (the numbers in the 3D filters and the biases) are optimized to obtain a functional relationship between the 3D image and the velocity field by minimizing the loss function (Equation 10). In short, we are looking to obtain a relation that gives the velocity v_z as a function of the distance transform feature X, that is, v_z = f(X).

The network was trained for 2500 epochs, which took approximately seven hours. The first 1000 epochs of the training are shown in Figure 7. We also tried augmenting the training dataset with 90 degree rotations about the flow axis, but no benefits were observed. The loss function value per sample and the mean velocity per scale of the network are plotted in Figure 7. As seen in the top row of the figure, the coarsest scale is responsible for most of the velocity magnitude, and finer-scale models make comparatively small adjustments. The bottom row of plots shows the loss for each sample. We note that the normalization of our loss (Equation 10) puts the loss for each sample on a similar scale, despite the considerable variance in porosity and permeability between samples. Training on the sphere-pack dataset reveals that the model can learn the main factors contributing to flow through permeable media.
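The coarse-to-fine behavior described above (the coarsest scale carrying most of the velocity magnitude, finer scales adding corrections) can be sketched as follows; `compose_scales`, the nearest-neighbor `upsample`, and the omission of the solid-geometry masks of Appendix A.3 are our simplifications, not the paper's implementation:

```python
import numpy as np

def upsample(v):
    """Nearest-neighbor refinement by a factor of 2 along each axis."""
    for dim in range(v.ndim):
        v = np.repeat(v, 2, axis=dim)
    return v

def compose_scales(per_scale_predictions):
    """Combine predictions ordered coarsest-first: each finer scale adds a
    (typically small) correction to the upsampled running estimate."""
    v = per_scale_predictions[0]
    for correction in per_scale_predictions[1:]:
        v = upsample(v) + correction
    return v
```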
To assess the performance of the trained MS-Net, we used Fontainebleau sandstones [40] at two different computational sizes. Cross-sections of these structures are visualized in the right panel of Figure B.1. The results are presented in Table 1. The relative percent error of the permeability can be calculated as e_r = |1 - k_pred/k|. The typical accuracy of the permeability is approximately within 10%. One remarkable fact is that the model retains approximately the same accuracy when applied to samples of either size. It is worth noting that the simulation of the sample with a porosity of 9.8% took 13 hours to converge; this single sample takes as much computational effort as the entire construction of the training data and model.
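The error measure above is straightforward to apply to the ratios reported in Table 1 (a prediction within 10% corresponds to e_r <= 0.1; a sketch, not the paper's code):

```python
def relative_error(k_true, k_pred):
    """Relative permeability error: e_r = |1 - k_pred / k_true| (a fraction)."""
    return abs(1.0 - k_pred / k_true)
```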
Figure 7: (top) Normalized mean velocity per scale. Coarser scales are shown with lighter lines. The normalized mean velocity of each sample is shown on top. (bottom) Values of the loss function for each sample during training. This plot shows that, even though the permeabilities of the samples span three orders of magnitude, the network assigns roughly the same importance to each one under the proposed loss function (its value is of the same order of magnitude for all samples). It is also worth noting that most of the permeability magnitude is predicted by the coarsest scale (since this scale is implicitly given extra importance in the loss computation from Equation 6).

Table 1: Results of the Fontainebleau sandstone test set. We show the true permeability k, calculated using the LBM, and the ratio between the true permeability and the prediction of our model, k_pred/k. (Columns: Porosity [%], Size, k [m²], k_pred/k.)

To assess the ability of the model trained with sphere packs to predict flow in more heterogeneous materials, we tested samples of different computational size and complexity. We split the data into three groups according to their type:
• Group I: Artificially created samples: a sphere pack with an additional grain dilation (which lowered the porosity) relative to the tightest training sample; a vuggy core created by removing 10% of the matrix grains from the original sphere pack [39] to open up pore space and create disconnected vugs [41, 42], after which the grains were numerically dilated twice to simulate cement growth; a sample made of spheres of different sizes, where the porosity starts at 35% at the inlet and decreases to 2% at the outlet; and this last sample reflected in the direction of flow.
• Group II: Realistic samples: Bentheimer sandstone [43], an artificial multiscale sample (microsand) [44], and Castlegate sandstone [44].
The sizes were selected such that each sample constitutes an REV.
• Group III: Fractured domains: a segmented micro-CT image of a fractured carbonate [45], a layered bidisperse packing recreating a propped fracture [46], and a sphere pack where the spheres intersecting a plane in the middle of the sample were shifted to create a preferential conduit [46], together with this same structure rotated 90 degrees with respect to the direction of flow (so that the fracture is in the middle plane of the flow axis and deters flow).

We also tried to carry out a simulation of a structure with an additional grain dilation (4.7% porosity); however, the job timed out after 48 hours without achieving convergence. The tightest sample (porosity 7.5%) took 26 hours running on 100 cores to achieve convergence. Besides an accurate permeability estimate, another measure of precision is the loss function value at the finest scale (from Equation 10). These two are related, but not simple transformations of each other: the loss function provides a volumetric average of the flow field error. We normalized this value by the sample's porosity, which results in a quantity that is roughly the same for all samples. Visualizations of some of these samples can be seen in Figure B.2, and the prediction results are shown in Table 2.

Table 2: Results of the predictions on the test set. We additionally show the ratio between the loss (Equation 10) and the porosity, which is another measure of the accuracy of the prediction. (Columns: Group, Sample, Size, φ, k [m²], loss/φ, k_pred/k.)

Table 2 reveals remarkable performance on the breadth of geometries considered. Samples from all groups are predicted very well, with permeability errors for the most part within about 25% of the true value, across samples ranging over three orders of magnitude in permeability.

Figure 8: Cross-sections of the Castlegate sandstone simulation result, the MS-Net prediction, and the L1 (absolute) error.

Further analysis revealed that the highest voxel-wise errors were located in the most tortuous paths. We hypothesize that this is because the original training set did not contain structures like these. Nevertheless, the highest errors are an order of magnitude smaller than the true velocity.

Two notable failure cases emerged. In the first, the Castlegate sandstone, we find that the flow field prediction is still somewhat reasonable, as visualized in Figure 8.
The largest failure case (highlighted in bold in Table 2) is the fractured sphere pack with a fracture parallel to the fluid flow. In this case, the model is not able to provide an accurate flow field, due to the difference in flow behavior that a large preferential channel (like this synthetic fracture) imposes compared with the training data; as a result, the predicted permeability is off by a factor of 5. Likewise, this sample also has the highest loss value. However, since no example of a similar nature is found in the training set, we investigate in the following section the ability of the model to predict on parallel fractures when presented with parallel fractures during training.
Figure 9: Five synthetic fracture cross-sections. The distance transform maps of five fractures with different roughness are shown; all of the domains are merged into one figure for brevity. Their fractal exponent (D_f) decreases from 2.5 to 2.1 (top to bottom).

Since the MS-Net is able to see the entire domain at each iteration, we carried out an additional numerical experiment with domains hosting single rough fractures. The domains were created synthetically using the model proposed by [47], where the roughness of the fracture surfaces is controlled by a fractal dimension D_f. Cross-sections of the domains can be seen in Figure 9. We utilize two sets of domains, each having a different mean fracture aperture (44 and 22 voxels) and five fractures with increasing roughness [48]. The total computational size of each domain is 256x256x60. Since these domains were created synthetically, the voxel length can be scaled to any desired length as long as the flow remains in the laminar regime. We trained our model using two of these synthetic fractures and tested it on the other eight. The results can be seen in Table 3.

Table 3: Results of training and testing on different fractures. The first column indicates whether the sample was part of the training set, followed by the mean aperture Ap, fractal exponent D_f, and permeability k.

Train  Ap  D_f  k [Darcy]  k_pred/k
       44  2.1  953        1.03
       44  2.2  764        1.08
       44  2.3  577        1.01
       44  2.4  423        1.03
  ✓    44  2.5  301
  ✓    22  2.1  224        1.007
       22  2.2  191        1.003
       22  2.3  154        1.01
       22  2.4  120        1.04
       22  2.5   91        0.98
       Propped fracture         97   1.17
       Fractured sphere pack   520   1.03

We selected two training samples with different mean aperture and roughness exponent so that the network might learn how these factors affect flow. From the results in Table 3 we can conclude that the network is able to extrapolate accurately, constructing solutions of the flow field for a wide variety of fractures. The training fractures have permeabilities of 224 and 301 darcys, whereas accurate test results are found ranging from 91 to 953 darcys. This gives strong evidence that the model is able to distill the underlying geometrical factors affecting flow.

We contrast the machine learning approach with classical methods such as the cubic law [49]. For these synthetically created fractures, the cubic law would return a permeability value that depends only on the aperture size, whereas the LBM
data reveals that the roughness can influence the permeability by a factor of 3. There have been a number of papers attempting to modify the cubic law for a more accurate permeability prediction; however, there is evidence that those predictions can be off by six orders of magnitude [49]. There are also other approaches in line with our hypothesis that a fracture aperture field should be analyzed with local moving windows [50].
The number of scales used could be varied; for our experiments we chose to train a model with four scales, and this number is a hyperparameter that could be explored in further research. The FoV of the coarsest model is 88 voxels wide, and the model itself operates on the entire domain simultaneously, rather than on subdomains. For comparison, the FoV of the PoreFlow-Net [31] is 20 voxels, and it operated on subdomains due to memory limitations.

We have shown the MS-Net performing inference on volumes chosen so that the LBM solutions could be obtained in a reasonable time-frame; the MS-Net can be scaled to larger systems on a single GPU. Table 4 reports the maximum system size for which a forward pass of the finest-scale model was successful on various recent GPU architectures, without modifying our workflow. Additional strategies, such as distributing the computation across multiple GPUs or pruning the trained model [51, 52], could push this scheme to even larger computational domains. For all architectures tested, the prediction time was on the order of one second, whereas LBM simulations of a tight material may take several days to converge, even when running on hundreds of cores.

Table 4: Prediction size achieved on three different GPUs.
GPU                   Size achieved
Nvidia M6000 (24 GB)  704³
Nvidia P100 (12 GB)   640³
Nvidia A100 (40 GB)   832³

We also believe that our workflow could be utilized for compressing simulation data since, as seen in Figure 7, a single model is able to learn several simulations with a high degree of fidelity, making them more easily portable (this is also called representation learning in deep learning). A training example is on the order of 500 MB of data in single-precision floating point, whereas the trained model is approximately 25 KB. Thus, when training to a single example, the neural network encodes the solution using a small fraction of a byte per voxel; it is tremendously more efficient than any floating-point representation.
One would also need to keep the input domain to recover the solution, but this is itself a binary array that is more easily compressed than the fluid flow field. For example, we applied standard compression methods to the binary input array of the Castlegate sandstone, which then fit into 2.4 MB of memory.

It is well established that semi-analytical formulas or correlations derived from experiments can fail to predict permeability by orders of magnitude. Porosity values alone can be misleading because they do not account for how certain structures affect flow in a given direction, or for the presence of heterogeneities. However, going beyond simple approximations is often expensive. We have presented MS-Net, a multiscale convolutional network approach, in order to better utilize imaging technologies for identifying pore structures and associating them with flow fields. When trained on sphere packs and fractures, MS-Net learns complex relationships efficiently, and a trained model can make new predictions in seconds, whereas new LBM computations can take hours to days to evaluate.

We believe it would be possible to train the MS-Net with more data to create a predictive model able to generalize to more domains simultaneously (unsaturated soils, membranes, mudrocks, and kerogen). This could be done using active learning principles, carrying out simulations where the model has a low degree of confidence in its prediction, as in [53].

The MS-Net architecture is an efficient way of training with large 3D arrays compared to standard neural network architectures from computer vision. Although this model is shown to return accurate predictions, there are desirable physical properties that might be realized by a future variant, such as mass or momentum continuity. One avenue of future work could be to design conservation modules for the MS-Net using such hard constraints for ConvNets [54].
An important hurdle in applying such techniques to porous media problems is that the bounded domains make their implementation more challenging.
Another important area of future work would be to address data from different scientific domains. This includes similar endeavors such as steady-state multiphase flow, wave propagation through a solid matrix, and component transport in porous media. The model could also be applied to other 3D problems, such as astronomical flows, or the flow of blood through highly branched vessel structures.

Lastly, we believe that an important endeavor is to create more realistic domains, with multiscale features such as fractal statistics. One avenue to pursue such methods is the Generative Adversarial Network (GAN), an ML technique in which a generator model learns to create new data by fooling a discriminator model (the adversary) that is trained to distinguish between real data and the generator's outputs. The multiscale technique has been applied to many real-world datasets such as human faces, but has not, to our knowledge, been used to construct synthetic porous media.
Acknowledgments

We gratefully recognize the Texas Advanced Computing Center, Los Alamos National Laboratory's Institutional Computing, and Los Alamos National Laboratory's Darwin cluster for their high-performance computing resources. M. Pyrcz, J. Santos, and H. Jo acknowledge support from the DIRECT Industry Affiliates Program (IAP), and M. Prodanović and J. Santos acknowledge support from the Digital Rock Petrophysics IAP, both of The University of Texas at Austin. J. Santos, Q. Kang, H. Viswanathan, and N. Lubbers were funded in part by the U.S. Department of Energy through Los Alamos National Laboratory's Laboratory Directed Research and Development program (LANL-LDRD). J. Santos would like to thank Alexander Hillsley for his assistance with the diagrams, Hasan Khan for providing the vuggy geometry, and Rafael Salazar-Tío for many useful discussions.
Appendix A Single neural network description
The individual submodels of our system are fully convolutional networks (which means that the dimensions of the 3D inputs are not modified along the way). Each of them is composed of stacks of the following layers:
Figure A.1: Schematic of a single-scale model.

• 3D convolution with a 3x3x3 kernel: This layer contains kernels (or filters) of size 3x3x3 that are slid across the input to create feature maps via the convolution operation:

x_{out} = \sum_{i=1}^{F} x_{in} * k_i + b_i ,    (12)
where F denotes the number of kernels in that layer, * is the convolution operation, and b a bias term. The numbers contained in these kernels are called trainable parameters and are optimized during training.

• Instance Normalization [55]: This layer normalizes its inputs to have a mean of zero and a standard deviation of one, which facilitates training a model with samples that have strong velocity contrasts (different orders of magnitude). This is done for every sample using its individual statistics:

x_{out} = \frac{x_{in} - \bar{x}}{\sqrt{\sigma^2 + \epsilon}} ,    (13)

where \bar{x} is the sample mean, \sigma its standard deviation, and \epsilon a small constant to avoid division by zero. This layer allows better flow of information (by constraining the mean and standard deviation of the outputs) and reduces the risk of training diverging.

• Continuously Differentiable Exponential Linear Unit (CELU) [56]: This layer helps to build non-linear relationships (like the one between pore structure and velocity field shown in Figure 5). All data passing through this layer is transformed using the following equation:

x_{out} = \max(0, x_{in}) + \min(0, \alpha (e^{x_{in}/\alpha} - 1)) ,    (14)

where \alpha is set to 2. We utilize this function because it speeds up and improves training by virtue of not having vanishing gradients and by having mean values near zero. The outputs of this layer are constrained between minus two and infinity.

• 3D convolution with a 1x1x1 kernel: This kernel acts as a linear regressor that reduces the dimensionality of the output to a single 3D image (in our case, the velocity field). It combines all the feature maps from the previous blocks (Figure A.1) and outputs a single 3D matrix (the velocity at that particular scale).

The fourth block of our network does not include an activation function because we would like to give the model the expressive power to output negative velocities.

A.1 Normalization of the data and initialization of the network parameters
We start the training workflow by coarsening the initial inputs n times (depending on the number of scales desired, Section 2.3). Then we center the velocity of the entire training set near one by dividing the LB simulation results by a constant (this procedure can be seen in Figure A.2). This has the advantage of not having to compute and store summary statistics of the training set (as opposed to default normalization approaches), and it preserves the solid values as zero.

We have also observed that if we scale the weights of the last layer of the coarsest model to output results close to one (the mean velocity of our normalized data), training exhibits a speed-up of several hours, since the initial prediction is a closer approximation to the solution than under the default initialization scheme [57].

Figure A.2: (left) Mean velocity per scale of the LB simulation results (in lattice units). Each dot represents a sample. The samples have the same mean velocity at each scale. (right) Mean velocity after normalizing the data.
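A minimal version of this constant-based normalization can be sketched as follows (our own numpy illustration; the constant used in the paper may be chosen differently):

```python
import numpy as np

def normalize_by_constant(velocity_fields):
    """Divide every training velocity field by one global constant so that the
    mean pore velocity is near one. Solid voxels (zeros) remain exactly zero,
    and only the single constant needs to be stored to undo the scaling."""
    pore_values = np.concatenate([v[v != 0].ravel() for v in velocity_fields])
    c = pore_values.mean()
    return [v / c for v in velocity_fields], c
```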
A.2 Coarsening (pooling) operation
The coarsening (or pooling) operation is defined as

k = J / 2^d , \quad C(x_{in}) = x_{in} * k ,    (15)

where J is an array of all ones (of size 2 along each dimension), d is the number of dimensions of the problem, and * is the convolution operation with a stride of 2.

A.3 Mask calculation
We use the following Python code to generate the masks for each sample:

from torch.nn import AvgPool3d

# binary_image: float tensor of shape (1, 1, D, H, W); 1 = pore, 0 = solid
masks = [None] * (num_scales - 1)
pooled = [None] * num_scales
pooled[0] = binary_image
for scale in range(1, num_scales):
    pooled[scale] = AvgPool3d(kernel_size=2)(pooled[scale - 1])
    denom = pooled[scale].clone()
    denom[denom == 0] = 1e8  # avoid dividing by zero in fully solid blocks
    for dim in (2, 3, 4):    # repeat along z, y, x back to the finer grid
        denom = denom.repeat_interleave(repeats=2, dim=dim)
    masks[scale - 1] = pooled[scale - 1] / denom

Listing 1: Python example for obtaining the masks.

Figure A.3: Schematic of the coarsening and refining process.

Figure A.4: Comparison between plain upscaling and the proposed masked upscaling. Even though not all details are fully recovered with this method, it preserves the mean velocity predictions from the coarse scale and does not allow fluid in the solid space. Note how the dimensions of the image change (not to scale).
As an example, if we have a 2D neighborhood of 2x2 pixels where three of the pixels are solid, when we upscale our image the only void pixel receives the entire coarse-scale value, so the mean velocity of the block is preserved and the solid pixels remain at zero.
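This 2x2 example can be worked through numerically; the following standalone numpy sketch reproduces the coarsening of Equation 15 and the masked upscaling (our illustration of the logic in Listing 1, not the paper's torch code):

```python
import numpy as np

# 2x2 neighborhood: three solid pixels (0) and one void pixel (1)
binary = np.array([[0.0, 0.0],
                   [0.0, 1.0]])

# Coarsening (Eq. 15): average pooling collapses the block to a single value
coarse_fraction = binary.mean()        # -> 0.25

# Mask: fine-scale binary image divided by the upsampled coarse average
mask = binary / coarse_fraction        # void pixel -> 4.0, solid pixels -> 0.0

# Masked upscaling of a coarse-scale velocity of 1.0
fine = mask * 1.0

assert fine.mean() == 1.0              # block mean velocity is preserved
assert fine[binary == 0].sum() == 0.0  # no fluid placed in the solid
```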
Appendix B Training and testing data
Figure B.1: Superimposed cross-sections of the sphere-pack training set and the sandstone test set. The image shows the five binary samples per set, superimposed for visualization purposes. The highs of the color bar correspond to solids that are present in every domain, while the lows are sections present only in the lower-porosity samples.

Figure B.2: Samples of the additional test set: a) vuggy core, b) porosity gradient, c) propped fracture, d) fractured sphere pack. The computational size of these samples is 256³.
Figure B.3: More images of the test set.
References

[1] Laura L. Schepp, Benedikt Ahrens, Martin Balcewicz, Mandy Duda, Mathias Nehler, Maria Osorno, David Uribe, Holger Steeb, Benoit Nigon, Ferdinand Stockhert, Donald A. Swanson, Mirko Siegert, Marcel Gurris, and Erik H. Saenger. Digital rock physics and laboratory considerations on a high-porosity volcanic rock.
Scientific Reports, 10(1):1–16, 2020. [2] Masa Prodanovic, Maria Esteva, Matthew Hanlon, Gaurav Nanda, and Prateek Agarwal. Digital Rocks Portal: a repository for porous media images. [3] V. Cnudde and M. N. Boone. High-resolution X-ray computed tomography in geosciences: A review of the current technology and applications.
Earth-Science Reviews , 123:1–17, 2013.[4] Dorthe Wildenschild and Adrian P. Sheppard. X-ray imaging and analysis techniques for quantifying pore-scalestructure and processes in subsurface porous medium systems.
Advances in Water Resources , 51:217–246, 1 2013.[5] Chongxun Pan, M. Hilpert, and C. T. Miller. Lattice-Boltzmann simulation of two-phase flow in porous media.
Water Resources Research , 40(1):1–14, 2004.[6] Alexandre Tartakovsky and Paul Meakin. Modeling of surface tension and contact angles with smoothed particlehydrodynamics.
Physical Review E - Statistical, Nonlinear, and Soft Matter Physics , 72(2):1–9, 2005.[7] Joshua A White, Ronaldo I Borja, and Joanne T Fredrich. Calculating the effective permeability of sandstone withmultiscale lattice Boltzmann/finite element simulations.
Acta Geotechnica , 1(4):195–209, 2006.[8] Patrick Jenny, S H Lee, and Hamdi A Tchelepi. Multi-scale finite-volume method for elliptic problems insubsurface flow simulation.
Journal of Computational Physics , 187(1):47–67, 2003.[9] Huafeng Sun, Sandra Vega, and Guo Tao. Journal of Petroleum Science and Engineering Analysis of heterogeneityand permeability anisotropy in carbonate rock samples using digital rock physics.
Journal of Petroleum Scienceand Engineering , 156(September 2016):419–429, 2017.[10] Clare E Bond, Yannick Kremer, Gareth Johnson, Nigel Hicks, Robert Lister, Dave G Jones, R Stuart Haszeldine,Ian Saunders, Stuart M V Gilfillan, Zoe K Shipton, and Jonathan Pearce. International Journal of Greenhouse GasControl The physical characteristics of a CO 2 seeping fault : The implications of fracture permeability for carboncapture and storage integrity.
International Journal of Greenhouse Gas Control, 61:49–60, 2017.
[11] K.J. Cunningham and M.C. Sukop. Multiple Technologies Applied to Characterization of the Porosity and Permeability of the Biscayne Aquifer, Florida. Technical Report, February 2011. [12] Eduardo Molina, Gloria Arancibia, Josefa Sepúlveda, Tomás Roquer, Domingo Mery, and Diego Morata. Digital Rock Approach to Model the Permeability in an Artificially Heated and Fractured Granodiorite from the Liquiñe Geothermal System (39°S).
Rock Mechanics and Rock Engineering , 53(3):1179–1204, 2020.[13] Brian Holley and Amir Faghri. Permeability and effective pore radius measurements for heat pipe and fuel cellapplications.
Applied Thermal Engineering , 26(4):448–462, 2006.[14] P C Carman. Permeability of saturated sands, soils and clays.
The Journal of Agricultural Science , 29:262–273,1939.[15] P. G. Carman. Fluid flow through granular beds.
Chemical Engineering Research and Design , 75(1 SUPPL.):S32–S48, 1997.[16] Josef Kozeny. Ueber kapillare Leitung des Wassers im Boden.
Akad. Wiss.Wien , 136:271–306, 1927.[17] Jacob Bear.
Dynamics of fluids in porous media . American Elsevier, 1972.[18] Nishank Saxena, Ronny Hofmann, Faruk O. Alpak, Steffen Berg, Jesse Dietderich, Umang Agarwal, Kunj Tandon,Sander Hunter, Justin Freeman, and Ove Bjorn Wilson. References and benchmarks for pore-scale flow simulatedusing micro-CT images of porous media and digital rocks.
Advances in Water Resources , 109:211–235, 2017.[19] Yehuda Bachmat and Jacob Bear. On the concept and size of a representative elementary volume (REV). In
Advances in transport phenomena in porous media , pages 3–20. Springer, 1987.[20] Molly S. Costanza-Robinson, Benjamin D. Estabrook, and David F. Fouhey. Representative elementary volumeestimation for porosity, moisture saturation, and air-water interfacial areas in unsaturated porous media: Dataquality implications.
Water Resources Research , 47(7):1–12, 2011.[21] J.E. Santos, M. Prodanovic, C.J. Landry, and H. Jo. Determining the impact of mineralogy composition for multi-phase flow through hydraulically induced fractures. In
SPE/AAPG/SEG Unconventional Resources TechnologyConference 2018, URTC 2018 , 2018.[22] Eric J Guiltinan, J. E. Santos, M. Bayani Cardenas, D. Nicolas Espinoza, and Qinjun Kang. Two-phase fluid flowproperties of rough fractures with heterogeneous wettability: analysis with lattice Boltzmann simulations.
WaterResources Research , 12 2020.[23] Abdullah Alakeely and Roland N. Horne. Simulating the Behavior of Reservoirs with Convolutional and RecurrentNeural Networks.
SPE Reservoir Evaluation and Engineering , 23(3):992–1005, 2020.[24] H. Jo, J.E. Santos, and M.J. Pyrcz. Conditioning well data to rule-based lobe model by machine learning with agenerative adversarial network.
Energy Exploration and Exploitation , 38(6), 2020.[25] Wen Pan, Carlos Torres-Verdín, and Michael J. Pyrcz. Stochastic Pix2pix: A New Machine Learning Methodfor Geophysical and Well Conditioning of Rule-Based Channel Reservoir Models.
Natural Resources Research ,2020.[26] Eric Guiltinan, Javier E Santos, and Qinjun Kang. Residual Saturation During Multiphase Displacement inHeterogeneous Fractures with Novel Deep Learning Prediction. In
Unconventional Resources TechnologyConference (URTeC) , Austin, 2020. Society of Petroleum Engineers (SPE).[27] Lukas Mosser, Olivier Dubrule, and Martin J. Blunt. Reconstruction of three-dimensional porous media usinggenerative adversarial neural networks. 2017.[28] Lukas Mosser, Olivier Dubrule, and Martin J. Blunt. Stochastic reconstruction of an oolitic limestone by generativeadversarial networks. pages 1–22, 2017.[29] Traiwit Chung, Ying Da Wang, Ryan T. Armstrong, and Peyman Mostaghimi. CNN-PFVS: Integrating NeuralNetwork and Finite Volume Models to Accelerate Flow Simulation on Pore Space Images.
Transport in PorousMedia , 135(1):25–37, 2020.[30] Abhishek Bihani, Hugh Daigle, and Javier E Santos. MudrockNet : Semantic Segmentation of Mudrock SEMImages through Deep Learning MudrockNet : Semantic Segmentation of Mudrock SEM Images through DeepLearning. (August), 2020.[31] Javier E. Santos, Duo Xu, Honggeun Jo, Christopher J. Landry, Masa Prodanovi´c, and Michael J. Pyrcz. PoreFlow-Net: A 3D convolutional neural network to predict fluid flow through porous media.
Advances in Water Resources, 138(February):103539, 4 2020. [32] Debesh Jha, Pia H. Smedsrud, Michael A. Riegler, Dag Johansen, and Thomas De Lange. ResUNet++: An Advanced Architecture for Medical Image Segmentation.
[33] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Identity mappings in deep residual networks.
LectureNotes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes inBioinformatics) , 9908 LNCS:630–645, 2016.[34] Samuli Laine Jaakko Lehtinen Tero Karras, Timo Aila. Progressive Growing of GANs for Improved Quality,Stability, and Variation. pages 1–26, 2018.[35] Tamar Rott Shaham, Tali Dekel, and Tomer Michaeli. SinGAN: Learning a Generative Model from a SingleNatural Image. 2019.[36] Ian Goodfellow, Yoshua Bengio, and Aaron Courville.
Deep Learning . MIT Press, 2016.[37] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning.
Nature , 521(7553):436–444, 5 2015.[38] Dominique D’Humieres, Irina Ginzburg, Krafczyk Manfred, Lallemand Pierre, and Luo Li-Shi. Multiple-Relaxation-Time Lattice Boltzmann Models in 3D Multiple-relaxation-time lattice Boltzmann.
NASA
SPE Journal , 24(3):1164–1178, 2019.[42] Hasan Javed Khan, David DiCarlo, and Masa Prodanovi´c. The effect of vug distribution on particle straining inpermeable media.
Journal of Hydrology
SPE Journal , 14(3):11–14, 9 2009.[46] Masa Prodanovic, Steven L Bryant, and Zuleima T Karpyn. Investigating Matrix / Fracture Transfer via a LevelSet Method for Drainage and Imbibition. (March):125–136, 2010.[47] Steven R. Ogilvie, Evgeny Isakov, and Paul W.J. Glover. Fluid flow through rough fractures in rocks. II: A newmatching model for rough rock fractures.
Earth and Planetary Science Letters
Journal of Geophysical Research B:Solid Earth , 120(8):5453–5466, 2015.[50] Assaf P. Oron and Brian Berkowitz. Flow in rock fractures: The local cubic law assumption reexamined.
WaterResources Research , 34(11):2811–2825, 1998.[51] Hidenori Tanaka. Pruning neural networks without any data by iteratively conserving synaptic flow. (NeurIPS),2020.[52] Hao Li, Hanan Samet, Asim Kadav, Igor Durdanovic, and Hans Peter Graf. Pruning filters for efficient convnets.In , number2016, pages 1–13, 2017.[53] Javier E. Santos, Mohammed Mehana, Hao Wu, Masa Prodanovi´c, Qinjun Kang, Nicholas Lubbers, HariViswanathan, and Michael J. Pyrcz. Modeling Nanoconfinement Effects Using Active Learning.
The Journal ofPhysical Chemistry C , 2020.[54] Arvind T. Mohan, Nicholas Lubbers, Daniel Livescu, and Michael Chertkov. Embedding Hard Physical Constraintsin Neural Network Coarse-Graining of 3D Turbulence. pages 1–13, 1 2020.[55] Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. Instance Normalization: The Missing Ingredient for FastStylization. (2016), 2016.[56] Jonathan T. Barron. Continuously differentiable exponential linear units. arXiv , (3):1–2, 2017.[57] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Delving deep into rectifiers: Surpassing human-levelperformance on imagenet classification.