[PDF] Learning the constitutive relation of polymeric flows with memory

Abstract

We develop a learning strategy to infer the constitutive relation for the stress of polymeric flows with memory. We make no assumptions regarding the functional form of the constitutive relations, except that they should be expressible in differential form as a function of the local stress- and strain-rate tensors. In particular, we use a Gaussian Process regression to infer the constitutive relations from stress trajectories generated from small-scale (fixed strain-rate) microscopic polymer simulations. For simplicity, a Hookean dumbbell representation is used as a microscopic model, but the method itself can be generalized to incorporate more realistic descriptions. The learned constitutive relation is then used to perform macroscopic flow simulations, allowing us to update the stress distribution in the fluid in a manner that accounts for the microscopic polymer dynamics. The results using the learned constitutive relation are in excellent agreement with full Multi-Scale Simulations, which directly couple micro/macro degrees of freedom, as well as the exact analytical solution given by the Maxwell constitutive relation. We are able to fully capture the history dependence of the flow, as well as the elastic effects in the fluid. We expect the proposed learning/simulation approach to be used not only to study the dynamics of entangled polymer flows, but also for the complex dynamics of other Soft Matter systems, which possess a similar hierarchy of length- and time-scales.

Full PDF

Naoki Seryo, Takeshi Sato, ∗ John J. Molina, † and Takashi Taniguchi † Department of Chemical Engineering, Kyoto University, Kyoto 615-8510, Japan (Dated: August 6, 2020)We develop a learning strategy to infer the constitutive relation for the stress of polymeric ﬂowswith memory. We make no assumptions regarding the functional form of the constitutive relations,except that they should be expressible in diﬀerential form as a function of the local stress- andstrain-rate tensors. In particular, we use a Gaussian Process regression to infer the constitutiverelations from stress trajectories generated from small-scale (ﬁxed strain-rate) microscopic polymersimulations. For simplicity, a Hookean dumbbell representation is used as a microscopic model,but the method itself can be generalized to incorporate more realistic descriptions. The learnedconstitutive relation is then used to perform macroscopic ﬂow simulations, allowing us to updatethe stress distribution in the ﬂuid in a manner that accounts for the microscopic polymer dynamics.The results using the learned constitutive relation are in excellent agreement with full Multi-ScaleSimulations, which directly couple micro/macro degrees of freedom, as well as the exact analyti-cal solution given by the Maxwell constitutive relation. We are able to fully capture the historydependence of the ﬂow, as well as the elastic eﬀects in the ﬂuid. We expect the proposed learn-ing/simulation approach to be used not only to study the dynamics of entangled polymer ﬂows, butalso for the complex dynamics of other Soft Matter systems, which possess a similar hierarchy oflength- and time-scales.

I. INTRODUCTION

Polymeric materials are ubiquitous in our modern in-dustrial societies, having transformed our food, infras-tructure, and modes of transportation. Not surpris-ingly, the 20th century has been dubbed the “PolymerAge” by Rubinstein and Colby[1]. There is a growingdemand for producing more high-functioning polymericproducts, and to do this in a cost-eﬀective way. Unfor-tunately, there is still much we do not understood aboutpolymer physics, particularly with regards to the fab-rication method of sophisticated polymer products. Atpresent, one of the preferred manufacturing methods forpolymeric materials is polymer melt processing, where amolten polymer is extruded or molded into the desiredshape, before allowing it to cool and solidify[2]. To ac-complish these requirements, we need to understand notonly the macroscopic ﬂow behavior of the polymer pro-cess, but also the microscopic dynamics of the polymerchains, in order to reliably control the resultant proper-ties of the product. However, it is not easy to understandsuch properties using only experimental observations,due to the hierarchy of length- and time-scales neededto characterize the microscopic polymer dynamics andthe macroscopic ﬂow. Computer simulations, which pro-vide an alternative and complimentary approach, havebecome an indispensable tool for studying such systems. ∗ Present address: Institute for Chemical Research, Kyoto Univer-sity, Kyoto 611-0011, Japan. † Corresponding [email protected] (J. J. Molina)[email protected] (T. Taniguchi)

Molecular Dynamics (MD) simulations provide access tothe detailed dynamics of polymer chains, but they areunable to deal with the macroscopic ﬂow, because of theprohibitive cost. Computational Fluid Dynamics (CFD)simulations, with an appropriate constitutive equationfor the stress tensor of the polymeric material[3], can pre-dict the macroscopic ﬂow behavior, but cannot provideany microscopic information on the constituent polymerchains. To address this issue, and elucidate the micro-scopic origin of the ﬂow problems at hand, Multi-ScaleSimulation (MSS) methods, which make it possible to si-multaneously consider the dynamics at both scales, havebeen extensively developed over the last twenty years.Within the MSS approach, the macroscopic and micro-scopic degrees of freedom are coupled through the stressand strain-rate tensor ﬁelds. Originally proposed byLaso and ¨Ottinger in 1993[4, 5], with the so-called CON-NFFESSIT model, these types of approaches representthe state-of-the-art in polymer rheology, as they pro-vide a rigorous connection between the microscopic andmacroscopic degrees of freedom[6–16]. However, giventhe computational complexity, they have been limited tosimple ﬂow geometries and small system sizes, and haveyet to be widely adopted within industry.In this paper, we demonstrate how to leverage thepower of Machine-Learning techniques to accelerate theMSS to the point where they are competitive with exist-ing macroscopic descriptions, without any signiﬁcant lossof accuracy. In particular, we will show that it is possibleto learn the constitutive relation from training data gen-erated from small system size microscopic simulations.To this end, we adopt a simple microscopic description,which models the polymers as non-interacting Hookeandumbbells. We note that the proposed method is appli- a r X i v : . [ phy s i c s . c o m p - ph ] A ug cable to any microscopic polymer model, the Hookeandumbbell model has been chosen for its simplicity. Asis well known, in the limit when the number of dumb-bells goes to inﬁnity, the time evolution equation for thestress of such an ensemble converges to an analytic ex-pression. This makes it possible to give a stringent as-sessment of our proposed ML approach. We then performsimulations at ﬁxed strain-rates, in order to measure thetime-evolution of the stress. This information is used asthe training data for a Gaussian Process (GP) regres-sion, in order to learn the corresponding constitutive re-lation. A GP approach avoids over-ﬁtting of the dataand allows us to incorporate unknown and/or noisy datawithin a Bayesian framework, in a convenient and eﬃ-cient manner[17–20]. The learned constitutive relationsare then used in macroscopic ﬂow simulations, openingthe possibility of probing length- and time-scales thatwould be unreachable with standard MSS techniques.Previous work by Zhao et al. has used a similar approachto learn the constitutive relation of generalized Newto-nian ﬂuids[21], but as proposed, the method cannot beapplied to non-Newtonian viscoelastic materials that ex-hibit a history dependent ﬂow. A recent extension of thismethod has used GP to learn the eﬀective viscosity andrelaxation time needed to parameterize a given viscoelas-tic constitutive relation[22]. In both cases, however, thefunctional form of the constitutive equation is a ﬁxed in-put of the method. Here we show that this restrictioncan be lifted, and that the constitutive relation itself canbe learned.Compared to full MSS (using Hookean dumbbells), weobtain speedups of around two orders of magnitude, andwe expect this will only increase when more realistic(computationally expensive) polymer models are used,as the cost of performing the macroscopic ﬂow simula-tions remains constant. Finally, we note that the pro-posed learning strategy, which is here used to learn theconstitutive relation of polymeric ﬂows from microscopicdata, is not limited only to polymeric materials. In fact,we envision similar approaches being used to bridge be-tween length- and time-scales in other Soft-Matter sys-tems, such as colloidal dispersions or cellular tissues. II. MULTI-SCALE SIMULATIONSA. Macroscopic Model

In order to consider the memory eﬀects inherent topolymer ﬂows we adopt a Lagrangian particle descrip-tion to describe the dynamics of the ﬂuid. This allowsus to account for the convection of the polymer chainsand the corresponding strain-rate history dependence ontheir dynamics. In particular, we will use the SmoothParticle Hydrodynamics (SPH) method(see Appendix Afor details)[23]. The system is discretized into ﬂuid par-ticles, carrying mass, momentum, and all relevant hydro-dynamic variable for the problem under consideration.

FIG. 1. Schematic representation of the MSS method usedto study the micro/macro coupling of polymeric ﬂows. Theﬂuid is discretized into ﬂuid particles carrying mass and mo-mentum, as well as microscopic polymer simulators. Solvingfor the dynamics of the polymers, under the macroscopicallyobtained velocity gradient κ , allows us to compute the micro-scopic polymer contribution to the local stress σ . The resul-tant stress distribution is then used to solve for the macro-scopic ﬂow dynamics. Let x i and v i be the position and velocity of parti-cle i ; its time-evolution is determined by the followingequations d x i d t = v i (1)d v i d t = ρ i − [ ∇ · ( σ − p I )] x i + F ( x i ) (2)where ρ i is the density of the ﬂuid particle and F ( x i )is any external force acting on the particle (at position x i ). The ﬁrst term on the right-hand side of Eq. (2)corresponds to the forces on the particle due to internalstresses in the ﬂuid (with p the pressure ﬁeld). The stress σ comes from the time-dependent state of the polymerchains, i.e., the orientation and stretching of the dumb-bells. The pressure p i is deﬁned via the following quasi-incompressible equation of state p i = c s ρ γ (cid:20)(cid:18) ρ i ρ (cid:19) γ − (cid:21) (3)with c s the speed of sound and ρ the initial density.Since we are interested in low-Reynolds number ﬂows,we use γ = 1[24].All that remains is to specify how σ is computed. Thesimplest approach would be to adopt a constitutive equa-tion (e.g., Oldroyd-B), instead, within a MSS approach,we place microscopic polymer simulators inside each ﬂuidparticle, in order to directly couple the microscopic andmacroscopic degrees of freedom[8, 9, 13–15]. The choiceis then reduced to that of deﬁning an appropriate micro-scopic model for the polymeric ﬂuid. B. Microscopic Model

To describe the rheological properties of the polymericﬂuid, We choose the simplest possible microscopic model,that of non-interacting Hookean dumbbells. Thus, weplace N p polymer chains inside each of the N f ﬂuid par-ticles, with each polymer chain represented by two pointparticles connected by a Hookean spring (Fig. 1). Whilemore realistic microscopic models are known, such as theﬁnite-extensible Hookean dumbbell model[25], the Rousemodel[26], the Kremer-Grest beads-spring model[27],the Doi-Edwards reptation model[28], and the slip-linkmodels[29–33], the basic Hookean dumbbell model oﬀersone main advantage over the others: In the limit whenthe number of dumbbells goes to inﬁnity, the exact ana-lytical constitutive equation for the stress is known andcorresponds to that of a Maxwell viscoelastic ﬂuid. Thisprovides us with analytical results against which we cantest our proposed learning strategy. However, the learn-ing method we propose is not contingent on this choice,and in fact, will be most useful when considering moresophisticated polymer models, for which full MSS canbecome prohibitively expensive.Since we do not consider interactions between dumb-bells, the conﬁguration of the system can be describedsolely in terms of the distance vector r between the twobeads composing the dumbbell. The dynamics of thechains are then determined by the following Langevinequation for r [25, 34]d r d t = κ · r − ζ H r + 1 ζ ξ (4) κ = t ∇ v (5)where H is the spring constant, κ the velocity gradienttensor, ζ the friction coeﬃcient, and ξ a random-forcesatisfying the ﬂuctuation-dissipation theorem, (cid:104) ξ (cid:105) = and (cid:104) ξ ( t ) ξ ( t (cid:48) ) (cid:105) = 4 k B T ζ I δ ( t − t (cid:48) ), with I the unit ten-sor. The polymer contribution to the stress tensor canbe expressed in Kramers’ form as[35] σ = n (cid:16) H (cid:104) r t r (cid:105) − k B T I (cid:17) (6)where n is the number density of dumbbells, k B the Boltz-mann constant, and T the temperature. From Eqs. (4)-(6), the following constitutive equation for the stress ten-sor can be derivedd σ d t = σ · t κ + κ · t σ + nk B T (cid:16) κ + t κ (cid:17) − Hζ σ (7)which corresponds to the upper-convected Maxwellmodel. C. Nondimensionalized Equations

To facilitate the simulation and analysis, we willrewrite the previous equations in non-dimensional form, introducing basic units for both macroscopic and micro-scopic descriptions. These two sets are labeled with anuppercase (M) and lowercase (m) superscript, for macro-scopic and microscopic units, respectively. At the macro-scopic level, the characteristic scales are set by the length L and velocity U of the ﬂow problem under consider-ation, and the corresponding ﬂuid-advection time scale τ (M) = L/U . Together with the shear-viscosity of theﬂuid η s , and the initial (target) density ρ , we use thesecharacteristic scales to set the units of density, time,length, and stress to be ρ = ρ , t = τ (M) , l = L , and σ = η s /τ (M) = η s U/L , respectively.With these units, the equations governing the macro-scopic dynamics, Eqs.(1)-(3), becomed (cid:101) x i d (cid:101) t = (cid:101) v i (8)Re d (cid:101) v i d (cid:101) t = (cid:101) ρ − i (cid:2) (cid:101) ∇ · ( (cid:101) σ − (cid:101) p I ) (cid:3) (cid:101) x i + (cid:101) F ( (cid:101) x i ) (9) (cid:101) p i = Cs ( (cid:101) ρ −

1) (10)where a tilde (˜ · ) denotes an adimensional variable. Here,the control parameters are the Reynolds number, deﬁnedas Re = ρ U L/η s = ρ U /σ , and the dimensionlessartiﬁcial sound speed Cs = c s ρ t /η s .At the microscopic level, the characteristic scales aregiven by the equilibrium length of the dumbbells l eq = (cid:112) k B T /H , the dumbbell relaxation time τ (m) = λ = ζ/ H , and the shear viscosity of the Maxwell ﬂuid η s ≡ nk B T λ . To facilitate comparisons between the micro-scopic and macroscopic models, we will use the sametime and stress units for both, t = t and σ = σ = nk B T τ (m) /τ (M) , while the microscopic unit oflength is taken to be the equilibrium dumbbell length l = l eq . Thanks to the coupling between the macro-scopic ﬂow and the microscopic chain dynamics, the Deb-orah number, deﬁned as the ratio of time-scales associ-ated to the microscopic dumbbell relaxation and macro-scopic ﬂuid advection, De = τ (m) /τ (M) , has appeared inthe deﬁnition of the unit stress, σ (m) = σ (M) = nk B T De.With these microscopic units, Eqs. (4)-(7) becomed (cid:101) r d (cid:101) t = (cid:101) κ · (cid:101) r − (cid:101) r + (cid:114) (cid:101) ξ (11) (cid:101) σ = 3De (cid:18) (cid:104) (cid:101) r t (cid:101) r (cid:105) − I (cid:19) (12)d (cid:101) σ d (cid:101) t = (cid:101) σ · t (cid:101) κ + (cid:101) κ · t (cid:101) σ + 1De (cid:16)(cid:101) κ + t (cid:101) κ (cid:17) − (cid:101) σ (13)with the Deborah number De as the only control param-eter. Note that (cid:101) κ = κ t and (cid:101) ξ = (cid:113) t / (4 k B T ζ ) ξ ,with (cid:104) (cid:101) ξ (cid:105) = and (cid:68) (cid:101) ξ (cid:0)(cid:101) t (cid:1) (cid:101) ξ (cid:0)(cid:101) t (cid:48) (cid:1)(cid:69) = I δ ( (cid:101) t − (cid:101) t (cid:48) ). Finally, westress that, while we have chosen τ (M) as unit of time,it can also be useful to use the dumbbell relaxation time τ (m) ≡ λ as reference. Scaled time values using bothunits are directly related by the Deborah number, with (cid:101) t = t/τ (M) = ( t/λ )De.In summary, the MSS method amounts to solvingEqs. (8)-(10) and (11)-(12). With the Reynolds num-ber Re, the Deborah number De, and the dimension-lessartiﬁcial sound speed Cs as the main control parameters.Unless stated otherwise, we focus on the Deborah num-ber regime 10 − ≤ De ≤

10, at low-Reynolds numbersRe (cid:46)

1; with the sound-speed Cs = 10 chosen to keepthe density variations within (cid:46) (cid:101) t (cid:39) − .The strain-rate tensor is evaluated at the macroscopiclevel and used as input for the microscopic simulators, inorder to evolve the conﬁguration of the polymers (dumb-bells). Then, the polymer contribution to the stress iscomputed and used as input to the macroscopic ﬂowsimulation, in order to update the ﬂuid velocity, andthe process is repeated (see Figure 1). This methodhas been successfully used to study a polymer melt-spinning process[14], as well as ﬂows of well-entangledpolymer melts in a contraction-expansion channel (withthe dumbbell model replaced by the Doi-Takimoto Slip-Link model)[15]. Unfortunately, while the predictive ca-pabilities of such a bottom-up approach represent thecurrent state-of-the-art in the ﬁeld of polymer simula-tions, their heavy computational cost has mostly limitedthem to relatively simple ﬂow-geometries in 2D. To ad-dress this issue, we propose a learning strategy basedon Gaussian Processes, in order to uncouple the micro-scopic and macroscopic degrees of freedom in the gov-erning equations. However, our method remains Multi-Scale, in the sense that the microscopic model is usedto generate the training data used to learn the appro-priate constitutive equation, i.e. how to evolve σ . Thiswill allow us to consider the time-evolution of the stressat the macroscopic level, but in a way that satisﬁes thedynamics of the underlying microscopic polymer model. III. LEARNING METHODA. Gaussian Processes (GP)

Formally, a Gaussian Process is deﬁned as “a collectionof random variables, any ﬁnite number of which have ajoint Gaussian distribution”[19]. It provides a (prior)probability distribution over functions f , allowing us touse known values of the function, the so-called “training”data, to infer the values of the function at new “test”positions. Let f ( x ) be a function from R D to R , whichis sampled at N values of the input x , x i ∈ R D ( i =1 , . . . , N ). We denote by X = ( x , . . . , x N ) the D × N design matrix, and f = f ( X ) = ( f , . . . , f N ) the cor-responding output matrix, with f i = f ( x i ). The f i areconsidered to be correlated random variables, where thecorrelation between any two of them, f i and f j , is as-sumed to be a function only of the input values, x i and x j . The joint distribution for the f i is a multi-variateGaussian, speciﬁed in terms of an average function µ ( x )and a correlation function k ( x , x (cid:48) ). The probability of observing function values f at X ,given µ and k , is p ( f ( X ) | X , µ, k ) = 1 (cid:113) (2 π ) N det K ( X , X ) (14) × exp (cid:104) − t δ f ( X ) · K ( X , X ) − · δ f ( X ) (cid:105) where δ f = f − µ , and K ( X , X ) denotes the N × N correlation matrix, whose ( i, j )-th entry is deﬁned as K ( A , B ) ij = k ( A i , B j ). Note that a GP is uniquelydeﬁned in terms of its average and correlation functions, (cid:10) f ( x ) (cid:11) = µ ( x ) (15) (cid:10) δf ( x ) δf ( x (cid:48) ) (cid:11) = k ( x , x (cid:48) ) (16)Without loss of generality, and in the absence of any in-formation about f , one can take µ ( x ) = 0, which leavesonly k to be speciﬁed. Note that no assumptions havebeen made regarding the functional form of f , the corre-lation function k only determines the higher-order prop-erties of the family of functions from which f is drawn,such as continuity, diﬀerentiability, and periodicity. Fol-lowing Rassmussen and Williams[19], we also use the fol-lowing shorthand notation to specify a GP f ∼ N ( µ , K ) (17)which should be interpreted according to Eq. (14). Thefact that the function values at diﬀerent positions ( f ( x )and f ( x (cid:48) )) are correlated, with (cid:104) δf ( x ) δf ( x (cid:48) ) (cid:105) = k ( x , x (cid:48) ),is what allows us to make predictions. Basically, the datafor f ( x ), measured at the “training” points x , allowsus to “learn” how the data is correlated. This is doneby inferring the hyper-parameters Θ of the correlationfunction k , given the training data set. These hyper-parameters Θ determine the precise shape of k , and thusthe properties of the random functions f drawn from theGP. Once we have learned how, and to what degree, thefunction values are correlated with each other, we use thisinformation, together with the known values of f at thetraining points x , to predict the values of the function f (cid:63) at new “test” locations x (cid:63) .Assuming the N input values consist of n trainingpoints and m test points, we partition the input and out-put data into training and test data sets, to arrive at thefollowing (prior) joint distribution (cid:20) ff (cid:63) (cid:21) ∼ N (cid:18)(cid:20) µ ( X ) µ ( X (cid:63) ) (cid:21) , (cid:20) K ( X , X ) K ( X , X (cid:63) ) K ( X (cid:63) , X ) K ( X (cid:63) , X (cid:63) ) (cid:21)(cid:19) (18)where X = ( x , . . . , x n ) is now the design matrix for thetraining points and X (cid:63) = ( x (cid:63) , . . . , x m(cid:63) ) that of the testpoints. Thus, the correlation sub-matrices K ( X , X ), K ( X , X (cid:63) ) = t K ( X (cid:63) , X ), and K ( X (cid:63) , X (cid:63) ) have dimen-sions n × n , n × m , and m × m , respectively. We stress thefact that Eqs. (17) and (18) are equivalent, at this pointwe have simply relabeled the points as belonging to eitherthe training or test data sets. In addition, if the training FIG. 2. Schematic representation of the GP-MSS strategy used to learn the constitutive relation of polymeric ﬂows frommicroscopic data. (1) We perform small-scale microscopic simulations at ﬁxed strain-rate κ , in order to obtain the input ( κ , σ )and output ˙ σ training data. (2) Placing a GP prior on the constitutive relation, ˙ σ ∼ N ( µ, K ), the training data is usedto learn the hyper-parameters Θ (specifying the function variance and length-scales) of the GP, by maximizing the posteriordistribution for Θ . (3) The probability distribution for the constitutive relation at new “test” inputs ( κ (cid:63) , σ (cid:63) ) is then speciﬁedby a conditional GP, i.e., ˙ σ (cid:63) | σ ∼ N ( ν , Σ). We take the average value to be the best estimate, ˙ σ (cid:63) = ν , and use this withina macroscopic simulation, in order to update the stress at each point in the ﬂuid. The prediction uncertainty is given by thecorresponding covariance Σ, and it can be used to perform on-the-ﬂy diagnostics[21]. data is not known exactly, but subject to noise, we canaccount for this by adding the corresponding contribu-tion to the covariance sub-matrix. For example, in thepresence of Gaussian noise with variance σ , we wouldsimply replace K ( X , X ) with K ( X , X ) + σ I n × n .The beneﬁt of using GP comes from the Gaussian formof the distributions, since this allows us to perform mostof the calculations analytically. In particular, the con-ditional distribution for the function values at the testpoints, conditioned on the training data, can be obtainedfrom Bayes’ rule as p ( f (cid:63) | f ) = p ( f (cid:63) , f ) /p ( f ), and it re-sults in yet another GP[19] f (cid:63) | f ∼ N ( ν , Σ) (19) ν = µ ( X (cid:63) ) + K ( X (cid:63) , X ) · K ( X , X ) − · δ f ( X )Σ = K ( X (cid:63) , X (cid:63) ) − K ( X (cid:63) , X ) · K ( X , X ) − · K ( X , X (cid:63) )The prediction for the function values at the test pointsis then given by the mean ν , with the covariance ma-trix Σ providing a measure of the uncertainty at eachpoint. Conceptually, one can interpret this prediction asthe result of drawing random functions from the prior f ∼ N ( µ , K ), and keeping only those that are consis-tent with the measured training data. The average andvariance of the functions that remain will coincide with ν and Σ, i.e., f (cid:63) ∼ N ( ν , Σ).We have used a squared-exponential kernel for all ourGP regressions. For a 1D regression problem, the kernel is deﬁned as k ( x, x (cid:48) ; Γ , l ) = Γ k SE ( x, x (cid:48) ; l ) (20) k SE ( x, x (cid:48) ; l ) = exp (cid:20) − ( x − x (cid:48) ) l (cid:21) Here, Θ = (Γ , l ) are the hyper-parameters that must beinferred from the data, with Γ specifying the amplitude ofthe function variance and l the characteristic length-scaleover which the function is varying in the x -dimension.This commonly used Squared-Exponential kernel resultsin GPs that are inﬁnitely diﬀerentiable, but other choicesare possible, and might be more suitable for a given learn-ing problem. Kernels for higher dimensional functionspaces can be deﬁned by taking sums or products of 1Dkernels[19, 36]. As an example, the following are validGP kernels for 2D functions, with input x = ( x , x ) k sum ( x , x (cid:48) ; Γ , l , l ) = Γ (cid:0) k SE ( x , x (cid:48) ; l ) + k SE ( x , x (cid:48) ; l ) (cid:1) k mul ( x , x (cid:48) ; Γ , l , l ) = Γ k SE ( x , x (cid:48) ; l ) · k SE ( x , x (cid:48) ; l )In principle, we have independent sets of hyper-parameters for each dimension, but for simplicity, whenusing additive kernels we will assume that the ampli-tude of the variance is equal along all dimensions, i.e.,Γ = Γ = Γ . B. GP Accelerated Multi-Scale Simulations

The idea of learning a constitutive relation for poly-mer ﬂows from microscopic data is not entirely new. In-deed, previous work by Zhao et al.[21] has consideredthis precise problem, in order to perform macroscopicpolymer ﬂow simulations without having to introducean arbitrary constitutive relation. In this work, theyhave assumed a generalized Newtonian model, and usedsimple-shear simulations to learn the apparent viscosity η (app) = σ xy / ˙ γ , of (inelastic) non-Newtonian ﬂuids asa function of the shear-rate ˙ γ . When performing themacroscopic ﬂow simulations, the maximum shear rate(computed from the second invariant of the strain ratetensor) is taken as the local shear rate, and used within aGP regression scheme in order to obtain η (app) , and thusthe local shear stress σ xy = η (app) ˙ γ . In practice, thisamounts to placing a GP prior on the stress tensor itself σ ∼ N ( µ, K ) (21)However, as acknowledged by the authors, this excludesmany interesting rheological properties of (viscoelastic)non-Newtonian ﬂuids, as it does not allow for any typeof history dependence in the ﬂow, and relies on a separa-tion of time-scales between the microscopic and macro-scopic dynamics. This history dependence, which arisesfrom the internal stresses, is one of the most signiﬁcantfeatures of polymeric ﬂows. Subsequent work by Zhaoet al.[22] has considered viscoelastic ﬂows, but the learn-ing is restricted to parameterizing a constitutive relationwith a predetermined functional form. Therefore, thegoal of the current work is to generalize the learning strat-egy, so that it can be used to model viscoelastic polymericﬂuids without having to specify any constitutive relation.Following Zhao et al.[21, 22], we also use small-scalemicroscopic simulations of polymer chains at ﬁxed strain-rates to generate the training data necessary to learn theconstitutive relation. However, we now place a GP prioron the time-derivative of the stress-tensordd t σ ≡ ˙ σ ∼ N ( µ, K ) (22)not on the stress-tensor itself (or the eﬀective viscosity).This diﬀerence allows us to consider the time-dependentmemory eﬀects crucial to describe the dynamics of poly-mer chains. Even in the absence of polymer entangle-ment, this memory eﬀect is non-negligible, thanks to theﬁnite relaxation time of the polymer stretching and reori-entation. No assumptions are made regarding the formof the constitutive equation, except for the fact that itshould be expressed in diﬀerential form, as a function ofthe local instantaneous stress σ and strain-rate κ tensors.This includes most commonly used diﬀerential models,such as the Maxwell, Jeﬀreys, and Oldroyd ﬂuids[3]. Fi-nally, while it is possible to consider correlated output,we use a separate GP for each independent component ofthe stress tensor, resulting in D ( D + 1) / D -dimensions.Here, we have assumed a separation of length-scalesbetween the microscopic and macroscopic descriptions,such that at the microscopic level the system can be con-sidered to be homogeneous, i.e., all ﬁeld gradients areeﬀectively zero. It is only at the macroscopic-level where ﬁeld gradient induced phenomena are incorporated. Theappropriate microscopic length scale needed for this ap-proximation to be valid will depend on the precise ﬂowproblem being studied. For the systems presented here itis given by the characteristic size of the polymer chains.It is still possible (in principle) to apply the same typeof learning strategy to microscopic models in which ﬁeld-gradients are not homogeneous. In such cases, the consti-tutive relations would be functions of the local stresses,strains, and their gradients. However, such considera-tions lie outside the scope of the current work.A schematic diagram of the three-step learning strat-egy, consisting of data generation, learning, and predic-tion steps, is given in Fig. 2. For the ﬁrst step, weperformed ﬁxed strain-rate simulations of the 3D mi-croscopic model, corresponding to either simple-shear orplanar elongational ﬂow κ (shear) =  γ

00 0 00 0 0  ,  γ  (23) κ (elongational) =  ˙ ε − ˙ ε

00 0 0  (24)with ˙ γ and ˙ ε the shear and elongational ﬂow rates, re-spectively. These particular ﬂows are chosen because thelearned constitutive relations will be used in simulationswith an imposed 2D ﬂow, where the z -direction is as-signed to be neutral. However, if one wants to apply thisstrategy to a more complex situation, additional appliedﬂows, with diﬀerent deformation rate tensors, will needto be used during the learning process. All simulationswere started from a random and isotropic initial conﬁgu-ration of the polymer chains ( (cid:104) σ (cid:105) = 0). The simulationswere performed until a steady-state was reached, suchthat ˙ σ (cid:39)

0. Each simulation provides us with a tra-jectory ( t, σ ( t )), from which we can compute ˙ σ ( t ). Werandomly chose a fraction of these tuples ( κ , σ , ˙ σ ) toserve as training data for the GP regression, with input x = ( κ , σ ) and output f ( x ) = ˙ σ . To compute ˙ σ we use aﬁnite-diﬀerence approximation over a characteristic time-scale ∆ t ( c ) . To reduce the noise in the measurements for σ and/or ˙ σ , it is recommended to apply a data smooth-ing operation, the details of which are given below.The second step is to train the model, i.e. to determinethe hyper-parameters Θ of the kernel function k ( x , x (cid:48) ),as well as any unknown uncertainties in the test data for˙ σ . The posterior probability distribution for the hyper-parameters, given the model and measured training data,is determined from Bayes’ theorem as p ( Θ | X , f ) ∝ p ( f | X , Θ ) p ( Θ ) (25)where the likelihood p ( f | X , Θ ) is given by the corre-sponding GP, Eq. (14), and p ( Θ ) is a suitably chosenprior for the hyper-parameters. A full Bayesian treat-ment, in which we integrate out the hyper-parametersand propagate the uncertainties during the simulation ispossible, but for simplicity, we will consider only a point-wise solution. For the 1D problem considered below, wetake the posterior average Θ (avg)0 = (cid:90) d Θ Θ p ( Θ | X , f ) (26)estimated from Hamiltonian Monte-Carlo (HMC) simu-lations. For the 2D problem, we instead use a stochasticgradient-based optimization method[37] to maximize thelog-posterior, and ﬁnd the “optimal” value Θ (map)0 = arg max Θ log p ( Θ | X , f ) (27)Finally, the optimized Θ are used to parameterize theconditional distributions for the test data f (cid:63) | f , as deﬁnedin Eq.(19). In practical terms, we use the (conditional)mean ν = (cid:104) ˙ σ (cid:63) (cid:105) as our prediction for the constitutive rela-tion, such that the stress of each ﬂuid particle is updatedaccording to σ i ( t + ∆ t ) = σ i ( t ) + (cid:104) ˙ σ (cid:63) (cid:105) i ∆ t (28)where (cid:104) ˙ σ (cid:63) (cid:105) i is a function of the training data x = ( κ , σ ),as well as the instantaneous (test) strain-rate and stresstensors at the position of particle i , x (cid:63) = ( κ (cid:63) , σ (cid:63) ) =( κ i ( t ) , σ i ( t )).We refer to this method, which uses a constitutiverelation learned through a Gaussian Process regressionscheme, within a macroscopic ﬂow simulation, as a GPaccelerated Multi-Scale Simulation, or GP-MSS. We aremainly interested in learning this constitutive relationfrom microscopic polymer simulations, but to check theconsistency of this approach, we will also consider learn-ing from the constitutive relations themselves (e.g., fromthe upper-convected Maxwell model). C. Algorithmic Complexities

To understand the beneﬁts of our proposed GP-MSSapproach with respect to a standard MSS, we should con-sider the complexities of both algorithms. Since bothmethods employ the same macroscopic description forthe ﬂow, any gains or losses will be found in the calcula-tion of the stresses. For the MSS, this is done by solvingfor the microscopic dynamics of the polymer chains. As-suming a coarse-grained entanglement model, the timeand memory requirements will scale as O ( N f × N p × z ),with z the number of entanglement points per chain (forthe case of non-interacting dumbbells we have z = 1).Our experience shows that N p × z should be of the or-der of 10 − or larger, in order to obtain reliablestress measurements. The complexity associated to theGP learning procedure is divided in two: the trainingstep and the prediction step. The former is the most ex-pensive of the two, scaling in time as O (cid:0) n (cid:1) , and inmemory as O ( n training )[38, 39]. However, we note thatthis procedure only needs to be done once, the resulting constitutive relation can then be used to perform arbi-trarily complex ﬂow simulations. To evaluate the per-formance of the GP-MSS, we focus then on the cost ofmaking new predictions to update the stresses, as givenby Eq. 28. This involves evaluating the average of theconditional GP at the test positions X (cid:63) , correspondingto the values of κ and σ for each ﬂuid particle. FromEq. (19), we see that we must compute expressions ofthe form K ( X (cid:63) , X ) · (cid:16) K ( X , X ) − · δ f ( X ) (cid:17) , where theterm in parenthesis depends only on the training points X . This term, which evaluates to a vector of length n training , can also be precomputed. The prediction stepcan then be reduced to a simple matrix-vector multipli-cation, which scales as O ( N f × n training ). However, thememory requirements are still only O ( n training ).The asymptotic ratio of the GP-MSS to MSS stresscalculation time is then O ( n training / ( N p × z )), whichhighlights the importance of generating a minimal high-quality training data set. The rule-of-thumb is to havethe number of training points be smaller than the numberof degrees of freedom in the microscopic polymer chainsimulator. For the MSS simulations considered here, weneed N p (cid:39) dumbbells to obtain reliable ﬂow predic-tions. Using a random sampling protocol to generate thetraining data, we can achieve this same level of accuracyusing n training (cid:39) training point. Although this doesnot account for the constant proportionality factors as-sociated to each of the calculation costs, this two orderof magnitude speedup is in agreement with the results ofour simulations.The issue of generating an optimal training dataset,which in this case will maximize the time-saving withrespect to the MSS, while keeping the same level of accu-racy, is at the heart of most ML problems. It is quitelikely such a protocol will have to be tailored to thespeciﬁc microscopic model one wishes to study, as wellas the macroscopic ﬂow regimes to be simulated. Forthe systems studied here, random sampling proved to beenough, but it is not sure whether or not this will gener-alize to more complex setups. IV. RESULTS

We will consider two basic ﬂow problems in order tovalidate our proposed learning strategy (see Fig. 3): (1)simple-shear ﬂow and (2) pressure driven ﬂow. Giventheir symmetry, the ﬂow in both systems is eﬀectivelyone-dimensional, but a complete description of their dy-namics requires that we account for all components of thestress, and their coupling to the ﬂow. Furthermore, if thisapproach is to be applied to general geometries, it shouldbe capable of learning the appropriate form of the con-stitutive relation for the stress without any simplifyingassumptions (although it can be useful to introduce thisadditional information in speciﬁc cases). To show howour learning procedure can be extended as the dimen-

FIG. 3. Schematic representation of the two standard ﬂowproblems we have used to test our learning strategy, simple-shear ﬂow and pressure driven ﬂow. For the former, we con-sider a simpliﬁed 1D description, while for the latter we takeinto account the full 2D nature of the stress and strain-ratetensors. sionality of the system increases, we will study (1) thesimple-shear ﬂow problem in 1D and the (2) pressure-driven ﬂow problem in 2D.For the simple-shear ﬂow case we assume that v y = 0and the strain-rate and stress tensors have only one non-zero component, i.e., the xy component. The learningproblem is then 2D, since the constitutive relation isof the form ˙ (cid:101) σ xy ( (cid:101) κ xy , (cid:101) σ xy ). For the pressure-driven casewe consider the full 2D nature of the stress and strain-rate dependence, such that the learning problem is now7-dimensional, ˙ (cid:101) σ αβ ( (cid:101) κ xx , (cid:101) κ xy , (cid:101) κ yx , (cid:101) κ yy , (cid:101) σ xx , (cid:101) σ xy , (cid:101) σ yy ). Forthe ﬂows we are considering (cid:101) κ αz = (cid:101) κ zα = 0, and sinceTr ( (cid:101) κ ) = 0, we have that (cid:101) κ xx = − (cid:101) κ yy . Instead of ex-plicitly introducing such relationships into the model, wehave preferred to learn them directly from the trainingdata. As we have chosen a system of non-interactingHookean dumbbells for our microscopic polymer model,we are able to compare our results with the exact analyt-ical constitutive equation, Eq.(13). Thus, we can checkthe convergences and sensitivity of our results, both interms of the number of dumbbells N p used in the mi-croscopic simulations, as well as the number of train-ing points used in the GP regression n training . Finally,since the absolute value of the local stress-tensor is not aphysically measurable quantity, as opposed to the forcesor velocities, we prefer to evaluate the accuracy of thelearning strategy by comparing the error in the macro-scopic predictions for the forces in the ﬂuid and the ﬂowvelocities. A. Simple-Shear Flow (2D learning)

To arrive at an eﬀective 1D description for this simple-shear ﬂow problem we have assumed the stress is a func-tion of y only, and that the system starts from a relaxedstate (cid:101) σ αβ ( t = 0) = 0. The Maxwell constitutive relation, Eq. (13), then takes the following form˙ (cid:101) σ xy ( (cid:101) y, (cid:101) t ) = 1De (cid:20)(cid:101) κ xy (cid:0)(cid:101) y, (cid:101) t (cid:1) − (cid:101) σ xy (cid:0)(cid:101) y, (cid:101) t (cid:1)(cid:21) (29)where (cid:101) σ xx ( t ) = (cid:101) σ yy ( t ) = (cid:101) σ zz ( t ) = (cid:101) σ xz ( t ) = (cid:101) σ yz ( t ) = 0.Microscopic training data was generated by perform-ing ﬁxed shear-rate simulations for a system of N p =10 , , non-interacting dumbbells in 3D for two dif-ferent Deborah numbers, De = 1 and 10. In each case, weused nine diﬀerent values of the adimensionalized shear-rate (cid:101) κ xy within the range [ − , (cid:101) κ xy = 0.The time step was set to ∆ (cid:101) t = 10 − and the simulationswere performed up to a maximum time of (cid:101) t max = 5De.The trajectory of the stress (cid:101) σ xy ( t ) was then smoothedusing a Gaussian ﬁlter with a width of 5 × − τ ( m ) , inorder to reduce the noise in the estimates for ˙ (cid:101) σ xy . Thetime-derivative of the stress, obtained from the smootheddata, was computed using the following ﬁnite-diﬀerenceapproximation˙ (cid:101) σ xy ( (cid:101) y, (cid:101) t ) = (cid:101) σ xy ( (cid:101) t ) − (cid:101) σ xy ( (cid:101) t − ∆ (cid:101) t )∆ (cid:101) t (30)The resulting ( (cid:101) t, (cid:101) σ xy , ˙ (cid:101) σ xy ) data is then partitioned intoten (possibly overlapping) randomly selected intervalsover (cid:101) σ xy , of width (max ( (cid:101) σ xy ) − min ( (cid:101) σ xy )) /

10. Aver-aging over the points contained in each interval allowsus to deﬁne a corresponding pair of input x = ( (cid:101) κ xy , (cid:101) σ xy )and output f ( x ) = ˙ (cid:101) σ xy training points. The variance (cid:15) in ˙ (cid:101) σ xy , within each interval, is used as an estimate ofthe measurement error, and added to the correspond-ing covariance sub-matrix K ( X , X ), in order to per-form the GP regression (see Eq. 18), i.e., K ( X , X ) ij = k ( X i , X j ) + (cid:15) i δ ij .We use a product kernel for this 2D regression problem x = ( x , x ) = ( (cid:101) σ xy , (cid:101) κ xy ), such that k ( x , x (cid:48) ) = Γ k SE ( x , x (cid:48) ; l ) · k SE ( x , x (cid:48) ; l ) (31)We thus have three hyper-parameters, Γ, and the twolength scales for (cid:101) σ xy and (cid:101) κ xy , so that Θ = (Γ , l , l ).We use independent and uniform logarithmic priors, p (ln Θ i ) ∝ const, which are scale-invariant, such that p ( Θ ) = p (Γ) · p ( l ) · p ( l ) ∝ · l · l To train the model, we performed Hamiltonian Monte-Carlo simulations, using the No-U-Turn Sampler(NUTS), with the PyMC3 python package[40–42]. Theoptimal parameters Θ are then taken to be the averagesover the posterior distribution Θ avg , and used to deﬁnethe conditional GP for the constitutive relation. Fig. 4shows the results of this learning procedure for the caseof De = 10. We are able to learn the Maxwell consti-tutive relation, Eq. (29), providing excellent predictionsover a wide range of parameters. However, the predic-tion error increases the farther we get from the trainingpoints, as expected. This highlights the importance ofgenerating a well chosen training data set, representativeof the region in function space one is interested in. Itis reassuring to note that even with the naive randomsampling strategy outlined above, we are able to obtainprecise predictions for the constitutive relation. Whilethis is due to the simplicity of the function, we will showbelow that this approach also generalizes to more compli-cated functional forms and higher dimensions. Using thislearned constitutive relation, we performed ﬂow simula-tions under simple shear, at Re = 10, and compared theresults to those obtained from standard MSS (with mi-croscopic dumbbell simulators), as well from the Maxwellconstitutive equation.Considering the one-dimensional nature of the ﬂowproblem, one can use a simple Eulerian description in-stead of the Lagrangian one. Thus, the system was dis-cretized in the vertical y direction, using 128 grid points,with the velocity of the top and bottom walls set to (cid:101) v = 1and (cid:101) u = 0, respectively. Starting from a quiescent ﬂuid,a velocity wave will start at the top wall, and propa-gate through the channel, bouncing back and forth atthe walls, before a steady-state linear velocity proﬁle isobtained[43]. This transient regime is more pronouncedat high De, as evidenced by the three kicks in the time-evolution of the stress at De = 10, corresponding to thearrival of the wave-front. For comparison, at De = 1only one kick is observed (see Fig. 5 (a-b)). The excellentagreement obtained between the MSS and GP-MSS pre-dictions for the stress is further evidence of the success inlearning the constitutive relation. The stress ﬂuctuationsobtained from the MSS at long times are a consequenceof the large statistical ﬂuctuations that come from usinga ﬁnite number of dumbbells. It is encouraging to seethat the GP-MSS predictions do not show such behavior.This is because our learning strategy properly accountsfor the measurement error in the training data, allowingus to infer the “true” function. To further test our abilityto capture the history-dependence of the ﬂow, we havealso considered an oscillatory shear ﬂow (not shown), andobtained a similar level of agreement between MSS andGP-MSS predictions. We note that a generalized Newto-nian approach, which assumes the stress σ is a functionof κ , would not be able to capture this memory eﬀect[44].Fig. 5 (c-d) shows the maximum absolute error in thepredicted velocities, among all points in the system, asa function of the number of dumbbells N p used in themicroscopic simulations. This error is evaluated with re-spect to the velocities obtained from macroscopic simu-lations with the exact constitutive relation. Not surpris-ingly, given the good agreement in the stress proﬁles, thevelocities also coincide. There is however a small oﬀsetin the transient regime[43], with the GP-MSS velocitywave showing a slight delay (advance) with respect tothe MSS or Maxwell predictions at De = 1 (10). This isthe main source error reported in Fig. 5 (c-d), and is dueto the small number of training data around this partic-ular point in ( (cid:101) κ , (cid:101) σ ) space. Two aspects deserve to be highlighted: First, in all cases we have considered, theerror of the GP-MSS results is of the same order of mag-nitude as the MSS results. While the error can be twotimes larger, for times near the beginning of the start-up ﬂow, it can also be considerably smaller, particularlyat steady-state. This is most obvious for the case withsmall number of dumbbells N p = 10 . Second, the er-ror decreases considerably as N p → ∞ , as this decreasesthe statistical errors in both the MSS and the trainingdata used to learn the constitutive relations. However,we note that the accuracy of the GP-MSS will also beaﬀected by the quality of the training points. In the caseof De = 10, for example, we happened to obtain slightlybetter results for N p = 10 than for N p = 10 . B. Pressure-Driven Flow (7D learning)

For the pressure driven-ﬂow problem we will considerthe full 2D nature of the system. The learning problemthen consists of three GP regressions, one each for ˙ (cid:101) σ xx ,˙ (cid:101) σ xy , and ˙ (cid:101) σ yy , all of them functions in a seven-dimensionalspace x = ( (cid:101) κ xx , (cid:101) κ xy , (cid:101) κ yx , (cid:101) κ yy , (cid:101) σ xx , (cid:101) σ xy , (cid:101) σ yy ). Given theincreased complexity with respect to the eﬀective 1D ﬂowconsidered previously, which resulted in a 2D learningproblem, we have simpliﬁed the procedure to generatethe training data. This is not directly related to the typeof ﬂow, but rather to the dimensionality of the problem.Microscopic 3D simulations for N p (cid:46) non-interactingdumbbells at ﬁxed strain-rate are performed. We con-sidered both simple-shear and planar elongational ﬂowswithin the range | (cid:101) κ αβ | ≤ (cid:101) κ xx , (cid:101) κ xy , and (cid:101) κ yy , in addition, results for (cid:101) κ yx were obtained by taking the transpose of those at (cid:101) κ xy . The simulation time step was ﬁxed to ∆ (cid:101) t = 10 − ,the maximum time was set to be (cid:101) t max = 10De, and thecorresponding σ ( t ) trajectories where saved and used togenerate the training data. For this, we randomly se-lected N (cid:39) points ( (cid:101) t, (cid:101) κ γδ , (cid:101) σ αβ ), and included theinitial state (cid:101) σ αβ ( t = 0) = 0. Then, we use each pointto deﬁne a time interval of width 0 . τ (m) , with τ (m) thepolymer relaxation time. For the training data, we takethe stress to be the average stress within the time in-terval, whereas the time derivative of the stress ˙ (cid:101) σ αβ istaken to be the diﬀerence at the two end-points, i.e., us-ing a coarse-grained time-step of ∆ t (c) (cid:39) . (cid:101) σ αβ cannot be ignored, but it isalso not easy to evaluate in this high-dimensional space.Therefore, we introduce this error (cid:15) αβ as an additionalhyper-parameter that should be learned from the data.It is in such situations where the beneﬁt of adopting aBayesian approach pays oﬀ.To summarize, we place a GP prior on each ˙ σ αβ , suchthat ˙ σ αβ = f ( x ) ∼ N (0 , K ( X , X )). For the covariancematrix, we have K ( X , X ) ij = k ( X i , X j ) + (cid:15) δ ij , withthe measurement error assumed to be constant, equal for0all x . Here, we have used an additive ﬁrst-order kernel,such that k ( x , x (cid:48) ) = Γ (cid:88) Λ k (Λ) ( x Λ , x (cid:48) Λ ; l Λ ) (32)with Λ = 1 , · · · , x = ( (cid:101) κ xx , (cid:101) κ xy , (cid:101) κ yx , (cid:101) κ yy , (cid:101) σ xx , (cid:101) σ xy , (cid:101) σ yy ).The choice of an additive kernel is motivated by the factthat it allows for non-local interactions, making it lesssusceptible to the curse of dimensionality and allowingfor better extrapolation in regions far way from the train-ing data. In contrast, a multiplicative Kernel will rapidlyrevert to the mean away from the training data[36, Ch.2.4and 6.1]. This is not an issue for the 2D learning problemconsidered previously, as it was easy to generate a largeenough sample of training points. For each 1D kernel k (Λ) we have one associated hyper-parameter or length-scale l Λ , for a total of seven length-scale hyper-parameters.Together with Γ and the (unknown) measurement error (cid:15) αβ associated to the training data for ˙ (cid:101) σ αβ , this resultsin a total of nine hyper-parameters needed to learn each˙ (cid:101) σ αβ . Note that we are assuming that the measurementerror is constant, i.e., it does not depend on x , althoughit can be diﬀerent for the diﬀerent components of theconstitutive relation. The optimal values were obtainedby maximizing the log-posterior, Eq. (25), assuming aconstant prior for the hyper-parameters ( p ( Θ ) = const).For this, we use ADAM[45], a stochastic gradient descentalgorithm[37, 46], as implemented in GPyTorch[38, 39].This learning procedure results in three distinct func-tions ˙ (cid:101) σ xx , ˙ (cid:101) σ xy , and ˙ (cid:101) σ yy of seven variables. The mostinteresting, non-trivial components of the constitutive re-lations (for the ﬂows we are considering) are the xx and xy components, visualized in Fig. 6 for three diﬀerentnumber of training points n training = 1 , × generated from microscopic simulations of N p = 10 dumbbells[47]. As for the 1D problem considered above,we compare our results with data generated from the ex-act Maxwell constitutive relation ( N p → ∞ ), and, as ex-pected, obtain better agreement as n training increases. Fi-nally, for comparison purposes, we have also learned theconstitutive relation from training data generated fromthe exact solution, i.e., the Maxwell model ( N p → ∞ ).We used the learned constitutive relations to perform2D simulations for the pressure driven ﬂow problem, com-paring our GP-MSS results with those of the conventionalMSS and the exact Maxwell constitutive relation. Weperform SPH simulations for all three cases, in a channelwhose width (cid:101) L x = 0 . (cid:101) L y = 1. The ﬂuid is discretized using 540particles, initially arranged on a regular lattice withinthe system, corresponding to a 18 ×

30 (width × height)array of particles. The initial distance between each par-ticle is (cid:101) b = 1 /

30, the smoothing length is (cid:101) h = 1 . (cid:101) b , andthe cut-oﬀ length is (cid:101) R c = 3 (cid:101) h . We performed simulationsat Re = 10 − , for De (cid:46) − , which is enough to observeelastic eﬀects in the ﬂow. Fig.7 shows the time-evolution Parameter Description ValueRe Reynolds number 10 − De Deborah number 10 − − − Cs artiﬁcial sound-speed 10∆ (cid:101) t time-step 10 − N p Number of dumbbells (MSS) 10 − N f No. of ﬂuid particles 540 = 18 × (cid:101) L x , (cid:101) L y ) System Size (0 . , . (cid:101) b Initial particle size 1 / (cid:101) h Smoothing length 1 . (cid:101) b (cid:101) R c Cut-oﬀ length 3 (cid:101) h (cid:101) F e External driving force (12 , n training GP training points 1 × − × ∆ (cid:101) t (c) GP coarse-graining window 0 . of the velocity at the center of the channel ( (cid:101) y = 0 .

5) forDe = 1 , × − , as obtained from GP-MSS us-ing the constitutive relation learned from exact trainingdata ( N p → ∞ ), SPH simulations using the Maxwell con-stitutive equation, and a corresponding Newtonian ﬂuid(De = 0). The default parameters for these 2 D (SPH)MSS and GP-MSS are summarized in Table I.As expected, at steady-state the system is indistin-guishable from a Newtonian ﬂuid. However, at shorttimes (cid:101) t (cid:46) . (cid:38) − .The diﬀerences between the GP-MSS results, using thelearned constitutive equation, and those from the exactconstitutive equation are negligible. While these GP-MSS relied on a constitutive relation learned from exactnoiseless data ( N p → ∞ ), using a ﬁnite number of dumb-bells N p to generate the training data yields similar levelof agreement, as will be shown below.First, although there is a small diﬀerence between thelearned constitutive relation and the exact solution (seeFig. 6), particularly for the (cid:101) σ xx component, the macro-scopic predictions are in excellent agreement. This canbe seen when looking at the forces in the ﬂuid, as shownin Fig. 8 for three diﬀerent positions along the heightof the channel. Indeed, GP-MSS results using consti-tutive equations learned ( n training = 10 ) from the ex-act solution ( N p → ∞ ) or from microscopic simulations( N p = 10 ) are indistinguishable from each other at thisscale, and they coincide with macroscopic simulation re-sults using the Maxwell constitutive relation, althoughthere is a small lag in the forces for the N p = 10 MSScase. Second, we show that increasing the number oftraining points results in more accurate constitutive re-1lations, and thus more reliable macroscopic ﬂow simula-tions. We used the three constitutive relations of Fig. 6,generated from n training = 1 , × training points,to perform GP-MSS, and compared the predicted ve-locity proﬁles with the exact solution, as given by SPHsimulations using the Maxwell constitutive relation[48].Fig. 9 shows the maximum absolute error in the velocityas a function of time. Increasing the number of train-ing points dramatically reduces the error in the simu-lations. Furthermore, the simulations using the consti-tutive relation learned on N p = 10 dumbbells give thesame level accuracy as those using the constitutive re-lation learned from the exact solution. All things beingequal, increasing the number of training points will givebetter results; however, what matters is the quality ofthe training data set. This is the reason why the best re-sults are obtained with the constitutive relations learnedfrom the exact data, even though only a relatively smallnumber of training points are used. V. CONCLUSIONS

We have developed a learning strategy that is able toinfer the constitutive relation of polymer melt ﬂows froma small number of microscopic or coarse-grained polymersimulations. For this, we have used a Bayesian learningapproach based on Gaussian Process (GP) regressions.GPs provide a probability distribution over functions, al-lowing us to infer the most likely values, given knowntraining data. In addition, we can estimate the uncer-tainty in the predictions, as well as incorporate unknownor incomplete data (e.g., measurement errors). Previouswork has shown how one can use this type of approachto learn the eﬀective viscosity of a polymer melt ﬂow,as a function of the local shear-rate[21], or to parame-terize a viscoelastic constitutive relation[22]. Here, wedemonstrate that a learning scheme that includes mem-ory eﬀects can be developed, which is crucial in orderto describe the ﬂow dynamics of entangled polymers incomplex ﬂow geometries. This method has great poten-tial for polymer processing, as it will allow us to considerﬂow problems relevant to industrial settings. In addi-tion, we believe that similar learning strategies can be de-signed for other soft-matter systems, where the presenceof multiple length- and time-scales gives rise to complexdynamical behavior that is expensive to simulate directly.To validate the method, we have adopted the simplestpossible microscopic polymer model, that of an ensem-ble of non-interacting Hookean dumbbells, since the ex-act constitutive equation is known in this case. Thismodel was used in ﬁxed strain-rate κ simulations, undersimple-shear and planar elongation, to generate the re-quired training data. For this, the time-evolution of thestress σ ( t ) was used to estimate the time-derivative ˙ σ .Assuming that the constitutive relation can be written indiﬀerential form, as a function of the local strain-rate andstress, the goal is to learn the function ˙ σ ( κ , σ ). Thus, the training data consists of input points ( κ , σ ) and thecorresponding output ˙ σ . We randomly selected a subsetof (cid:39) points and used them within a GP regressionscheme in order to determine the optimal posterior dis-tribution p ( ˙ σ (cid:63) | κ , σ , κ (cid:63) , σ (cid:63) ) for the constitutive equation˙ σ (cid:63) at new input test points ( κ (cid:63) , σ (cid:63) ), for which ˙ σ is notknown. This conditional probability distribution is a GP,with average and covariance that are functions of boththe training and test points.We set the average (cid:104) ˙ σ (cid:63) (cid:105) , over the posterior distribu-tion, to be our best prediction for the constitutive rela-tion, and this function was then used within a macro-scopic simulation in order to predict the ﬂow behavior.In this way, we were able to carry out all our simulationsat the macroscopic level, without having to impose anyconstitutive relation. Again, all we assumed was thatthe time-derivative of the stress is a function of the localstress and strain-rate tensors. The appropriate constitu-tive equation is learned from a (relatively) small num-ber of microscopic simulations. The resultant method,which we have referred to as GP-MSS, gives results thatare as accurate as conventional MSS, at a fraction of thecost. With our non-optimized python+numpy code, thediﬀerence in runtime between the full MSS (using 10 dumbbells) and the GP-MSS (with n training = 10 train-ing points) is ∼

100 times for the 2D pressure driven ﬂowproblem. We note that the GP-MSS run only marginallyslower than simulations using the Maxwell constitutiverelations. Thus, we get the best of both worlds, achievingrun times comparable to macroscopic simulations, with-out sacriﬁcing the accuracy provided by a microscopicpolymer description. We expect the speedup aﬀorded bythe GP-MSS to improve dramatically when more realis-tic, and complex, microscopic polymer models are used.However, the reliability and eﬃciency of the GP-MSS willdepend on the quality and size of the training dataset.Thus, care should be taken when devising the data gener-ating protocol. As an added beneﬁt to our approach, wenote that it is also possible to maintain (consistent) infor-mation on the microscopic degrees of freedom, for exam-ple, by adopting a multi-ﬁdelity representation[49, 50]. Inthis case, a small number of microscopic simulators couldbe introduced in order to provide accurate (localized)stress measurements, which are then fused together withthe approximate predictions provided by the learned con-stitutive relation. In future work we will apply this learn-ing approach to tackle the problem of entangled polymermelt ﬂows. This will allow us to consider 3D ﬂows incomplex geometries, which have so far remained out ofreach for standard MSS techniques.

Appendix A: The SPH Method1. Introduction

The Smooth Particle Hydrodynamics (SPH) method,a particle-based method originally developed to solve as-2trophysics problems, provides a computationally conve-nient way to solve the Navier-Stokes equation in a La-grangian framework[23]. The ﬂuid is discretized into anumber N f of ﬂuid particles that carry mass, momen-tum, and energy. Any function of the system can then beexpressed as an interpolation over the (disordered) ﬂuidparticles, which serve as interpolation points. Considera scalar quantity A and a vector quantity V , which de-pend on position. The value of A ( x ) or V ( x ), at a givenposition x , which need not correspond to the position ofany of the ﬂuid particles, is given by A ( x ) = (cid:90) A ( x (cid:48) ) W ( x − x (cid:48) , h )d x (cid:48) (A1) V ( x ) = (cid:90) V ( x (cid:48) ) W ( x − x (cid:48) , h )d x (cid:48) (A2)with similar expressions for higher-order tensor ﬁelds.Here, W ( x , h ) is a smoothing or interpolating kernel thatshould integrate to unity and tend to a delta function inthe limit when the smoothing length h goes to zero. Inthis work, we adopt a Gaussian interpolating kernel W ( r , h ) = 1( h √ π ) D exp (cid:34) − (cid:107) r (cid:107) h (cid:35) (A3)with r a D -dimensional distance vector and (cid:107)·(cid:107) the L2-norm, i.e., (cid:107) r (cid:107) = (cid:16)(cid:80) Di =1 r i (cid:17) / . We note that derivativesof these functions can be easily evaluated, as the deriva-tive operator can be transferred to the kernel, for whichanalytic results can be computed in advance. Thus, ∇ A ( x ) = (cid:90) d x (cid:48) A ( x (cid:48) ) ∇ x W ( x − x (cid:48) , h ) ∇ · V ( x ) = (cid:90) d x (cid:48) V ( x (cid:48) ) · ∇ x W ( x − x (cid:48) , h ) (A4)Numerically, these integrals are replaced by sums overthe ﬂuid particles, i.e., the interpolating points, such that A ( x ) = (cid:90) d x (cid:48) A ( x (cid:48) ) W ( x − x (cid:48) ) −→ (cid:88) (cid:48) j (cid:48) ∈R ( x ) m j (cid:48) ρ j (cid:48) A ( x j (cid:48) ) W ( x − x j (cid:48) ) (A5)with m i and ρ i the mass and density of the i -th ﬂuidparticle. To reduce the computational burden, thesesums have been truncated to only include ﬂuid parti-cles within a cutoﬀ region R ( x ) centered at x , such that (cid:107) x j (cid:48) − x (cid:107) ≤ R c = 3 h . For notational simplicity, inwhat follows we will denote these truncated sums using aprimed summation symbol. As an example, the densityof each element can be computed by taking A = ρ andevaluating the function at x = x i , with A i ≡ A ( x i ) and W ij = W ( x i − x j ) ρ i = (cid:88) (cid:48) j m j W ij (A6) with the mass a constant.To solve the equations of motion for the ﬂuid particles,Eqs. (1)-(2), we need to evaluate the forces acting oneach particle, which involves computing the divergenceof the stress tensor. Instead of directly discretizing ∇ · σ , the “golden-rules” of SPH state that formulas shouldbe rewritten to place the density inside the diﬀerentialoperators[ ? ]. In this way, the forces are evaluated usingthe following expression ρ − ∇ · ( σ − p I ) ≡ ∇ · (cid:18) σ − p I ρ (cid:19) + σ − p I ρ ∇ ρ (A7)Therefore, the time derivative of the ﬂuid particle veloc-ity v i , Eq. (2), is given byd v i d t = (cid:88) j m j (cid:34)(cid:18) σ − p I ρ (cid:19) i + (cid:18) σ − p I ρ (cid:19) j (cid:35) · ∇ i W ij + F ( x i ) (A8)Finally, the equations of motion for the SPH particlesare discretized in time and integrated using the followingsecond-order scheme x n +1 i = x ni + v ni ∆ t + 12 m i (cid:18) d v i d t (cid:19) n (∆ t ) (A9) v n +1 i = v ni + 12 m i (cid:34)(cid:18) d v i d t (cid:19) n +1 + (cid:18) d v i d t (cid:19) n (cid:35) ∆ t

2. Modiﬁed SPH

Rigid boundaries, be they ﬁxed walls or moving parti-cles, can be incorporated within the SPH framework bydiscretizing them with boundary or wall particles. How-ever, using the standard SPH method introduced above,an unnatural ﬂow is observed in the presence of suchboundaries. To remove these artifacts, the values of theﬁrst and second derivatives of the physical quantities ofinterest can be used within the weighted averages[51], toarrive at the so-called Modiﬁed SPH (MSPH). The ap-propriate expressions can be obtained by starting fromthe second-order Taylor expansion of the quantity of in-terest. For a scalar quantity A ( x j ) we have A ( x j ) (cid:39) A ( x i ) + ∇ A ( x i ) · ( x j − x i ) (A10)+ 12 t ( x j − x i ) · ∇ i ∇ i A ( x i ) · ( x j − x i ) A j (cid:39) A i + ∇ A i · x ji + 12 t x ji · ∇∇ A i · x ji with x ij = x i − x j . Multiplying both sides of Eq. (A10)by the volume element m j /ρ j times the smoothing func-3tion W ij and summing over all particles, we obtain (cid:88) (cid:48) j m j ρ j A j W ij = (cid:88) (cid:48) j m j ρ j W ij A i (A11)+ (cid:88) (cid:48) j m j ρ j W ij x ji · ∇ A i + 12 (cid:88) (cid:48) j m j ρ j W ij t x ji x ji : ∇∇ A i which relates A to its ﬁrst and second order derivates. Tosolve this equation, we then need the corresponding rela-tions for ∇ A and ∇∇ A , which are given in Eqs. (A12)-(A13) (cid:88) (cid:48) j m j ρ j A j ∇ i W ij = (cid:88) (cid:48) j m j ρ j ∇ i W ij A i (A12)+ (cid:88) (cid:48) j m j ρ j ∇ i W ij x ji · ∇ A i = 12 (cid:88) (cid:48) j m j ρ j ∇ W ij t x ji x ji : ∇∇ A i (cid:88) (cid:48) j m j ρ j A j ∇ i ∇ i W ij = (cid:88) (cid:48) j m j ρ j ∇ i ∇ i W ij A i (A13)+ (cid:88) (cid:48) j m j ρ j ∇ i ∇ i W ij x ji · ∇ A i + 12 (cid:88) (cid:48) j m j ρ j ∇ i ∇ i W ij t x ji x ji : ∇∇ A i Eqs. (A11)-(A13) can be conveniently expressed in ma-trix form as follows: t i = B i · f i (A14)where f i = t (cid:16) A i , ∇ A i , ∇∇ A i (cid:17) is a vector whose en-tries are formed from A i and its derivatives. In 2D this re-sults in six independent components, thanks to the com-mutativity of the partial derivatives, such that f i ≡  A i A i,x A i,y A i,xx A i,xy A i,yy  (A15)where commas are used to denote partial derivatives, ∂ α A i = A i,α and ∂ α ∂ β A i = A i,αβ . Likewise, the vec-tor t i is composed using W ij and it’s derivatives t Ki = (cid:88) (cid:48) j m j ρ j A j Φ Kij (A16) where Φ Kij ( K = 1 , . . . ,

6) are the components of Φ ij = t ( W ij , ∇ i W ij , ∇ i ∇ i W ij ), given by Φ ij ≡  W ij W ij,x W ij,y W ij,xx W ij,xy W ij,yy  (A17)Finally, the B i matrix is deﬁned as B KLi = (cid:88) (cid:48) j m j ρ j Φ Kij Θ Lij (A18)with Θ Lij the components of the Θ ij vector, given by Θ ij =  x i − x j )( y i − y j ) ( x i − x j ) ( x i − x j )( y i − y j ) ( y i − y j )  (A19)Within the MSPH method, we use f i = B − i · t i to de-ﬁne the physical quantity A i of particle i at position x i .Similar expressions can be deﬁned for vector quantitiesand higher order tensors.

3. Boundary Conditions

To enforce the no-slip boundary condition at theﬂuid/solid interface, we use a virtual particle method[15,52]. The virtual particles are placed by reﬂecting theoutermost-layer of wall particles with respect to theboundary. Then, the velocity vector at the positionsof these virtual particles v (virtual) i is computed using theMSPH method, and the velocities of the symmetric wallparticles are set according to v (wall) i = − v (virtual) i . In thisway, the weighted average of the particle velocities at theboundary is guaranteed to be zero. ACKNOWLEDGMENTS

The authors would like to thank Prof. Ryoichi Ya-mamoto, Prof. Matthew Turner and Dr. Simon K.Schnyder for fruitful discussion. This work was sup-ported by the Japan Society for the Promotion of Sci-ence (Grants-in-Aid for Scientiﬁc Research KAKENHIno. 19H01862 and Wakate B no. 17K17825), the Oga-sawara Foundation, and the SPIRITS 2020 of KyotoUniversity. Figures and movies were generated usingMatplotlib[53], a Python 2D plotting library.4 (cid:101) κ x y (cid:101) σ x y ˙ (cid:101) σ x y De = 10

Training DataGP PredictionMaxwell Constitutive Equation − − − e σ x y ˙ e σ (Maxwell) xy − − −

50 0 50 100 150 e κ xy − − − e σ x y (cid:12)(cid:12)(cid:12) ˙ e σ (GP) xy − ˙ e σ (Maxwell) xy (cid:12)(cid:12)(cid:12) − − − FIG. 4. (color online) The learned constitutive relation forthe 1D simple shear ﬂow problem at De = 10 using N p = 10 dumbbells for the microscopic model. (top) Training points,GP prediction, and the exact Maxwell constitutive equation( N p = ∞ limit). (middle) Color map showing the exact con-stitutive relation ˙ (cid:101) σ Maxwell xy , as a function of (cid:101) κ xy and (cid:101) σ xy . (bot-tom) Color map showing the absolute error between the GPprediction and the exact solution. The markers in the bottomtwo graphs show the location of the training data set. e σ x y De = 1 . . . . De = 10

MSSGP-MSS t/λ M a x i m u m E rr o r( e v x ) × − t/λ × − MSSGP-MSS

FIG. 5. (color online) (a-b) Time evolution of the stress obtained from simple-shear ﬂow simulations (Re = 10) with thelearned constitutive equation (GP-MSS) as well as those from the full MSS, using microscopic dumbbell simulators ( N p = 10 ).Dark (light) colored lines correspond to small (large) values of the channel height (cid:101) y . At t = 0, a velocity wave starts at (cid:101) y = 1and propagates down the channel, bouncing back at the lower wall.(c-d) Maximum absolute error in the velocity, for all pointsin the channel, obtained from the MSS and GP-MSS predictions. The error is computed with respect to the results of the(exact) Maxwell constitutive equations, | (cid:101) v x − (cid:101) v (Maxwell) x | . Results for diﬀerent number of dumbbells N p = 10 , , used inthe microscopic simulations are shown. . . . ˆ ˙ σ xx . . . ˆ σ xx − ˆ ˙ σ x y GP-MSS ( N p = 10 ) Constitutive Equation − ˆ σ xy FIG. 6. (color online) GP predictions (cross symbols) for theconstitutive relation ˙ σ , learned using n training = 1 , × training points (generated from microscopic simulationswith N p = 10 dumbbells). Exact results (dot symbols) wereobtained from the Maxwell constitutive relation, correspond-ing to the N p → ∞ limit. The data clearly shows that theresults improve as the number of training points increases, asexpected. A caret ˆ( · ) indicates data that has been scaled tolie in the range [ − , κ γδ , σ αβ ) and output( ˙ σ αβ ) scaled separately, to facilitate visualization. − − − − e t e v x GP-MSS ( N p = ∞ ) Constitutive EquationNewtonian FluidGP-MSS ( N p = ∞ ) Constitutive EquationNewtonian Fluid

FIG. 7. (color online) Time evolution of the velocity, at (cid:101) y = 0 .

5, for the pressure-driven ﬂow problem (Re = 10 − ).Results for three diﬀerent De are shown, together with thecorresponding Newtonian ﬂuid (De = 0). Results obtainedusing the (solid line) exact Maxwell constitutive equation arecompared with (open symbols) GP-MSS using a constitu-tive relation learned from ( n training = 10 ) noiseless trainingpoints generated from the Maxwell model. − . − . . . . × − . − . . . . d e v x / d e t × Constitutive EquationGP-MSS ( N p = ∞ ) GP-MSS ( N p = 10 ) − − t/λ − . − . . . . × FIG. 8. (color online) Scaled force at three locations alongthe height of the channel, at (cid:101) y ∼ /

2, 1 /

4, and 0, for thepressure-driven ﬂow problem (Re = 10 − ). Results are fromGP-MSS, using a constitutive relation obtained from micro-scopic simulations ( N p = 10 dumbbells), as well as from theexact constitutive equation ( N p → ∞ ), with n training = 10 .Exact results, obtained from simulations using the Maxwellconstitutive relation are also shown. t/λ . . . . . M a x i m u m E rr o r n training10 ( N p = ∞ )10 · · FIG. 9. (color online) Maximum absolute error in the ve-locity obtained from GP-MSS of the pressure-driven ﬂow,max (cid:16)(cid:101) v − (cid:101) v (Maxwell) (cid:17) , using constitutive relations learned ondiﬀerent number of training points (generated from micro-scopic simulations using N p = 10 dumbbells). Results us-ing a constitutive relations learned from the exact solution( N p = ∞ ) are also shown. The simulations were performedat Re = De = 10 − . [1] M. Rubinstein and R. H. Colby, Polymer Physics , 1st ed.(Oxford University Press, Oxford, 2003).[2] National Research Council,

Polymer Science and Engi-neering: The Shifting Research Frontiers (The NationalAcademies Press, Washington, DC, 1994).[3] R. G. Larson,

Constitutive Equations for Polymer Meltsand Solutions : Butterworths Series in Chemical Engi-neering (Butterworth-Heinemann, Stoneham, 1988).[4] M. Laso and H. C. ¨Ottinger, Calculation of viscoelasticﬂow using molecular models: the connﬀessit approach,Journal of Non-Newtonian Fluid Mechanics , 1 (1993).[5] H. C. ¨Ottinger, Beyond Equilibrium Thermodynamics ,1st ed. (John Wiley & Sons, Ltd, Hoboken, New Jersey,2005).[6] O. Borodin, D. Bedrov, G. D. Smith, J. Nairn, andS. Bardenhagen, Multiscale modeling of viscoelasticproperties of polymer nanocomposites, Journal of Poly-mer Science Part B: Polymer Physics , 1005 (2005).[7] S. Yasuda and R. Yamamoto, A model for hybrid simula-tions of molecular dynamics and computational ﬂuid dy-namics, Physics of Fluids , 10.1063/1.3003218 (2008).[8] T. Murashima and T. Taniguchi, Multiscale Lagrangianﬂuid dynamics simulation for polymeric ﬂuid, Journal ofPolymer Science Part B: Polymer Physics , 886 (2010).[9] T. Murashima and T. Taniguchi, Multiscale simulation ofhistory-dependent ﬂow in entangled polymer melts, EPL(Europhysics Letters) , 18002 (2011).[10] Y. Masubuchi, T. Uneyama, and K. Saito, A multiscalesimulation of polymer processing using parameter-basedbridging in melt rheology, Journal of Applied PolymerScience , 2740 (2012).[11] S. Yasuda and R. Yamamoto, Synchronized molecular-dynamics simulation via macroscopic heat and momen-tum transfer: An application to polymer lubrication,Physical Review X , 041011 (2014).[12] C. Wu, Multiscale Modeling Scheme for SimulatingPolymeric Melts: Application to Poly(Ethylene Oxide),Macromolecular Theory and Simulations , 1700066(2018).[13] T. Sato, K. Takase, and T. Taniguchi, Multiscale Simu-lation of Polymer Melt Spinning by Using the DumbbellModel, Nihon Reoroji Gakkaishi , 265 (2017).[14] T. Sato and T. Taniguchi, Multiscale simulations for en-tangled polymer melt spinning process, Journal of Non-Newtonian Fluid Mechanics , 34 (2017).[15] T. Sato, K. Harada, and T. Taniguchi, Multiscale Simu-lations of Flows of a Well-Entangled Polymer Melt in aContractionExpansion Channel, Macromolecules , 547(2019).[16] Y. Mu, N. Li, L. Hang, G. Zhao, J. Gao, and Z. Niu,Investigation of the Rheological Behaviors of PolymericMaterials in the Film Casting Process through MultiscaleModeling and Simulation Method, Macromolecular The-ory and Simulations , 1900001 (2019).[17] E. T. Jaynes, Probability Theory: The Logic of Science ,1st ed. (Cambridge University Press, New York, 2003).[18] D. S. Sivia and J. Skilling,

Data Analysis: A BayesianTutorial , 2nd ed. (Oxford University Press, Oxford,2006).[19] C. E. Rasmussen and C. K. I. Williams,

Gaussian Pro-cesses for Machine Learning (The MIT Press, Cam- bridge, 2005).[20] K. P. Murphy,

Machine Learning : A Probabilistic Per-spective , 1st ed. (MIT Press, Cambridge, 2012).[21] L. Zhao, Z. Li, B. Caswell, J. Ouyang, and G. E. Kar-niadakis, Active learning of constitutive relation frommesoscopic dynamics for macroscopic modeling of non-Newtonian ﬂows, Journal of Computational Physics ,116 (2018).[22] L. Zhao, Z. Li, Z. Wang, B. Caswell, J. Ouyang, andG. E. Karniadakis, Active- and transfer-learning appliedto microscale-macroscale coupling to simulate viscoelas-tic ﬂows, , 1 (2020), arXiv:2005.04382.[23] J. Monaghan, Smoothed particle hydrodynamics, AnnualReview of Astronomy and Astrophysics , 543 (1992).[24] J. P. Morris, P. J. Fox, and Y. Zhu, Modeling LowReynolds Number Incompressible Flows Using SPH,Journal of Computational Physics , 214 (1997).[25] R. Byron Bird, C. F. Curtiss, R. C. Armstrong, andO. Hassager, Dynamics of polymeric liquids, Vol.2 : Ki-netic Theory , 2nd ed. (Wiley-Interscience, New York,1987).[26] P. E. Rouse, A theory of the linear viscoelastic propertiesof dilute solutions of coiling polymers, The Journal ofChemical Physics , 1272 (1953).[27] K. Kremer and G. S. Grest, Dynamics of entangled linearpolymer melts: A molecular-dynamics simulation, TheJournal of Chemical Physics , 5057 (1990).[28] M. Doi and S. F. Edwards, The theory of polymer dy-namics , 1st ed. (Oxford University Press, Oxford, 1986).[29] Y. Masubuchi, J. I. Takimoto, K. Koyama, G. Ianniru-berto, G. Marrucci, and F. Greco, Brownian simulationsof a network of reptating primitive chains, Journal ofChemical Physics , 4387 (2001).[30] M. Doi, J. I. Takimoto, P. G. De Gennes, R. Magerle,A. N. Semenov, D. J. Read, M. E. Cates, and X. H.Zheng, Molecular modelling of entanglement, Philosoph-ical Transactions of the Royal Society A: Mathematical,Physical and Engineering Sciences , 641 (2003).[31] J. D. Schieber, J. Neergaard, and S. Gupta, A full-chain, temporary network model with sliplinks, chain-length ﬂuctuations, chain connectivity and chain stretch-ing, Journal of Rheology , 213 (2003).[32] A. E. Likhtman, Single-chain slip-link model of entan-gled polymers: Simultaneous description of neutron spin-echo, rheology, and diﬀusion, Macromolecules , 6128(2005).[33] T. Uneyama and Y. Masubuchi, Multi-chain slip-springmodel for entangled polymer dynamics, Journal of Chem-ical Physics , 154902 (2012).[34] R. Byron Bird, R. C. Armstrong, and O. Hassager, Dy-namics of polymeric liquids, Vol. 1: Fluid Mechanics ,2nd ed. (Wiley-Interscience, New York, 1987).[35] R. B. Bird and C. F. Curtiss, Molecular theory expres-sions for the stress tensor in ﬂowing polymeric liquids,Journal of Polymer Science: Polymer Symposia , 187(1985).[36] D. Duvenaud, Automatic model construction with Gaus-sian processes , Ph.D. thesis, University of Cambridge(2014).[37] G. Carleo, I. Cirac, K. Cranmer, L. Daudet, M. Schuld,N. Tishby, L. Vogt-Maranto, and L. Zdeborov´a, Machine learning and the physical sciences, Reviews of ModernPhysics , 045002 (2019).[38] J. R. Gardner, G. Pleiss, D. Bindel, K. Q. Weinberger,and A. G. Wilson, GPyTorch: blackbox matrix-matrixGaussian process inference with GPU acceleration, in Advances in Neural Information Processing Systems 31 ,edited by S. Bengio, H. Wallach, H. Larochelle, K. Grau-man, N. Cesa-Bianchi, and R. Garnett (Curran Asso-ciates Inc., 2018) pp. 7587–7597.[39] K. A. Wang, G. Pleiss, J. R. Gardner, S. Tyree, K. Q.Weinberger, and A. G. Wilson, Exact Gaussian Pro-cesses on a Million Data Points, in

Advances in NeuralInformation Processing Systems 32 , edited by H. Wal-lach, H. Larochelle, A. Beygelzimer, F. D’Alch´e-Buc,E. Fox, and R. Garnett (Curran Associates, Inc., 2019)pp. 14648–14659.[40] M. Betancourt, A Conceptual Introduction to Hamilto-nian Monte Carlo, arXiv:1701.02434 (2017).[41] M. D. Hoﬀman and A. Gelman, The no-U-turn sampler:Adaptively setting path lengths in Hamiltonian MonteCarlo, Journal of Machine Learning Research , 1593(2014).[42] J. Salvatier, T. V. Wiecki, and C. Fonnesbeck, Probabilis-tic programming in Python using PyMC3, PeerJ Com-puter Science , e55 (2016).[43] See Supplemental Material SM1 and SM2 at [URL] forthe time-evolution of the velocity proﬁle, for the simple-shear ﬂow case, obtained from MSS and GP-MSS, as wellas the exact solution given by the Maxwell constitutiverelation. Results for De = 1 (SM1) and De = 10 (SM2)are provided.[44] See Supplemental material SM3 at [URL] for the time-evolution of the velocity proﬁle, for oscillatory-shear ﬂow,obtained from MSS and GP-MSS, as well as the ex-act solution given by the Maxwell constitutive relation.Simulations are performed by setting the velocity of thetop wall to be v x = U cos ( ωt ), where the magnitude U and frequency ω of the shear ﬂow are set by the Deb-orah and (squared) Womersley numbers, De = 1 andWo = L ρω/η = 20, respectively. For comparison pur-poses, we have also shown simulation results for a cor-responding generalized Newtonian ﬂuid, i.e., assuming aconstitutive relation of the form σ xy = η (eﬀ) ( ˙ γ ) · ˙ γ (with˙ γ the shear rate, and η (eﬀ) the eﬀective viscosity), forwhich there is no memory eﬀect.[45] We mainly used the default parameters proposed byKingma and Ma for Machine Learning problems, namely, the step size is α = 10 − , the hyper-parameters are β = 0 . β = 0 . ε = 10 − . We have set themaximum number of iterations to be 500.[46] D. P. Kingma and J. L. Ba, Adam: A method forstochastic optimization, in (2014) pp. 1–15.[47] See Supplemental Material SM4 at [URL] for the fullconstitutive relation map used in the learning procedurefor the 2D pressure driven ﬂow problem, for the case of n training = 1 × points, together with data generatedfrom the exact Maxwell constitutive relation. In addi-tion, we also plot the trajectory data ( κ ( t ) , σ ( t ) , ˙ σ ( t ))obtained from GP-MSS (De = 1 × − ) for three repre-sentative points ˜ y ∼ , / , / − (open symbols), using constitutive relationslearned from n training = 1 , , × points (generatedfrom microscopic simulations with N p = 10 dumbbells),as well as the exact solution given by the Maxwell consti-tutive relation (solid line). Results obtained using a con-stitutive relation learned from n training = 1 × pointsgenerated from the exact constitutive relation are alsoshown (ﬁlled symbols).[49] P. Perdikaris, M. Raissi, A. Damianou, N. D. Lawrence,and G. E. Karniadakis, Nonlinear information fusion al-gorithms for data-eﬃcient multi-ﬁdelity modelling, Pro-ceedings of the Royal Society A: Mathematical, Physicaland Engineering Science , 20160751 (2017).[50] M. Raissi, P. Perdikaris, and G. E. Karniadakis, Infer-ring solutions of diﬀerential equations using noisy multi-ﬁdelity data, Journal of Computational Physics , 736(2017).[51] G. M. Zhang and R. C. Batra, Modiﬁed smoothedparticle hydrodynamics method and its application totransient problems, Computational Mechanics , 137(2004).[52] S. Adami, X. Y. Hu, and N. A. Adams, A generalized wallboundary condition for smoothed particle hydrodynam-ics, Journal of Computational Physics , 7057 (2012).[53] J. D. Hunter, Matplotlib: A 2D Graphics Environment,Computing in Science and Engineering9