An innovative Statistical Learning Tool based on Partial Differential Equations, intending livestock Data Assimilation
Hélène Flourent, Emmanuel Frénod, Vincent Sincholle
NutriX, France; Université Bretagne Sud, Laboratoire de Mathématiques de Bretagne Atlantique, UMR CNRS, Campus de Tohannic, Vannes, France; See-d, rue Henri Becquerel, Vannes Cedex, France
Abstract
Realistic modeling of biological mechanisms requires a large volume of prior knowledge and leads to heavy mathematical models. On the other hand, the classical Machine Learning algorithms, such as Neural Networks, need a large quantity of data to be fitted. Nevertheless, to predict the evolution of biological variables we are often facing a lack of knowledge and a lack of data, especially in the livestock sector. Therefore, we explored an intermediate approach, called "Data-Model Coupling". We demonstrated that parametrized Partial Differential Equations (PDEs) can be embedded in a data fitting process and then in an efficient predictive Statistical Learning tool. We postulated that all the physico-chemical phenomena occurring in an animal body can be summarized by the circulation, the evolution and the action of an overall information flow. We built the PDE system which mathematically translates our assumption and we fitted it on data.

The applications of our approach to data relative to the growth of farm animals showed that it increases the forecasting accuracy and reduces the training data dependency of the resulting predictive tool. Moreover, learning the dynamics linking the inputs and the outputs confers to the tool the capability to be trained on a given range of data and then to be accurately applied outside this range of data. This extrapolation capability is a real improvement over existing predictive tools.

Keywords: Statistical Learning, PDE, Forecasting, Data Assimilation, Data-Model Coupling, Biological Mathematical Modeling.
Smart Farming corresponds to the use of new technologies to make the farm production processes more efficient. As can be identified in [1], [2], [3], [4], [5] and [6], in the agri-food sector, simulating and predicting the effects of nutrition on animal performances are two decisive and strategic goals for breeders and companies to optimize animal efficiency. However, the biological phenomena linking the nutrition and the performances of animals are complex. Furthermore, in most cases, to build tools able to predict the evolution of biological variables, it is necessary to jointly manage the complexity of the phenomena occurring in the studied biological system and the lack of data available to fit those tools.

(arXiv preprint [stat.OT], Jan. The company wishes to remain anonymous.)

Data Assimilation is an approach that embeds mathematical theories, Data Science and Computer Science processes to estimate the most likely state of a connected system at an instant t (see [7], [8], [9] and [10]). To do so, it combines the information given by a predictive tool and the one contained in a more or less continuous stream of collected data. To very briefly sum up, it consists of considering that data flows are gathered to correct, at a given frequency, the simulation done by the predictive tool. This correction takes into account that collected data contain noise and the predictive tool embeds an intrinsic model error.

This combination of information could permit to know the state of an animal or a group of animals, in terms of health and performances, according to their ingestions and the drugs that are administered to them. Hence, this concept constitutes an interesting and promising way to oversee future livestock and address the Smart Farming issues ([11], [12] and [13]).

Biological data are not easy to collect and generally contain a large variability ([14] and [15]).
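To make the correction step just described concrete, here is a minimal scalar sketch in which a model prediction and a noisy observation are blended, each weighted by its error variance. This is the standard optimal-interpolation (Kalman-type) update, shown purely as an illustration; it is not the specific assimilation scheme of any tool discussed in this paper, and all numeric values are invented.

```python
# Minimal sketch of a Data Assimilation correction step: blend a model
# prediction with a noisy observation, weighting each by its error variance.
# Standard optimal-interpolation (Kalman-type) gain; all values illustrative.

def assimilate(prediction, pred_var, observation, obs_var):
    """Return the corrected state estimate and its variance."""
    gain = pred_var / (pred_var + obs_var)   # trust data more when model error dominates
    state = prediction + gain * (observation - prediction)
    var = (1.0 - gain) * pred_var
    return state, var

# A weight simulated at 52.0 (model error variance 4.0) corrected by a
# noisy measurement of 50.0 (sensor variance 1.0):
state, var = assimilate(52.0, 4.0, 50.0, 1.0)
```

Because the model error variance is larger than the sensor's here, the corrected state (50.4) lies closer to the observation, and the posterior variance (0.8) is smaller than both input variances.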
Hence, to perform Data Assimilation in the livestock sector it is necessary to develop efficient and light predictive tools able to be fitted on few and scattered data relative to complex phenomena.

According to Vázquez-Cruz et al. [16], there are currently two general approaches to build tools predicting biological responses. On the one hand, realistic modeling of biological mechanisms requires a large volume of prior knowledge and generally leads to heavy mathematical models ([17] and [18]). However, the complex implementation of these models limits their adaptability, in particular when it comes to processing or assimilating field data. On the other hand, the structure of classical Machine Learning (ML) algorithms, such as Neural Networks, has limited ability to take into account the existence of complex underlying phenomena, and these algorithms need to be fitted on a large quantity of data to compensate for the absence of prior biological expertise ([19], [20], [21] and [22]). Hence, due to their lack of adaptability or their inability to be fitted to few data, the existing tools are not entirely appropriate for achieving Data Assimilation in the context of "Biological Small Data".

We assumed that a global and synthetic consideration of the biological processes may help gain precision, in comparison with a classical ML tool which integrates no prior knowledge. We also assumed that this synthetic consideration permits us to do it while keeping a flexible and light tool, in comparison with a tool based on realistic models. Therefore, we explored an intermediate approach, named "Data-Model Coupling", to build predictive tools able to deal with both the complexity of the biological responses and the current lack of data. This emerging approach is midway between the realistic modeling and the "Black Box" approach. As seen in [23], [24], [25] and [26], the Data-Model Coupling approach consists of building a mathematical model, corresponding to a mathematical synthesis of the studied system.
Then, the parameters contained in the model are fitted to data. As in the above-cited studies, the construction of our tool is based on an optimal combination of knowledge, to design a relevant mathematical model, and data, to optimize the model parameters. In our approach, the mathematical model is a wisely designed parametrized PDE system.

Data Assimilation is a long-term objective of our research work. Nevertheless, to finally obtain a tool particularly suitable to perform Data Assimilation, this long-term objective has strongly guided the whole modeling approach introduced in this paper.

Several contributions can be identified in this paper. In the first place, the applications of this tool, on collected data relative to the growth of farm animals, put in evidence its extrapolation capability, which is a real improvement over existing predictive tools. As illustrated in Figure 1, in the application presented in Section 4, our tool was trained on a short training period to link the inputs and the outputs. Yet, it can accurately predict the outputs from the inputs outside the range of the Training Data. This is the peculiarity of our approach: the model learns synthetic dynamics linking the inputs and the outputs by fitting parameter-dependent evolution equations. Once the parameters are fitted, those dynamics can be applied outside and even far from the Training Data range.

This extrapolation capability permits to reduce the amount of data to collect and thus to reduce the costs relative to the experiments and the data management. It also permits to extend the validity period of the prediction provided by the tool. Hence, when it comes to Data Assimilation issues, in perspective with what was explained above, the correction of the prediction via the use of data could be less frequent and thus the computational costs could be lower.
Figure 1: Schematization of the extrapolation capability of the developed tool (weight as a function of time, over the training period and the application period).

The second contribution of our exploration is the development of a concept between the reality and the model (Figure 2). In most cases, the objective of a mathematical model is to translate the reality, adopting a higher or lower abstraction level. In our approach, a differentiation between the reality and our model is made. Indeed, the used support of reflection is not directly the real animal, but an Avatar which conceptually and essentially outlines the global dynamics occurring in the animal body. A large number of physico-chemical phenomena occur in the animal body in response to the ingestion or the injection of molecules. They lead, some time later, to the change of biological variables. This supply of molecules and those biological variables can be monitored and recorded to generate Input and Output data. We assumed that this kind of Inputs and Outputs can be linked by a dynamical model which is a mathematical translation of the Avatar. Therefore, we designed the PDE system mathematically translating our assumption and describing the convection, the diffusion and the action of an overall information flow.
[Figure 2 diagram. Reality: components: animal body, blood, molecules, nutrients, enzymes, hormones, organs, muscles, ...; processes: ingestion of nutrients, injection of molecules, rumination, blood circulation, nervous circulation, digestion, fermentation, absorption, assimilation, metabolism, ... Avatar: components: information flows; processes: convection, diffusion, fixation, delay, usage, saturation. Model: mathematical functions, parameter-dependent evolution equations, convection operators, diffusion operators, delay operators, integral operators, transfer functions.]
Figure 2: Schematization of the different sets considered in our approach and their own components and processes.

Thirdly, the application of our approach on real data showed that our tool can accurately link biological-related Inputs and Outputs, even if it is fitted on few, scattered and noisy data.

Data-Model Coupling is so far essentially used in the fields of meteorology (see [27]), hydrology (see [28], [29] and [30]), biogeochemistry (see [31], [32], [33] and [34]) and oceanography (see [35]). The successful use of a Data-Model Coupling approach to treat biological issues can also be considered as a contribution of this paper.

In this paper, we will show that the use of a short and relevant PDE system in a fitting process leads to the construction of an efficient predictive tool having a low data dependency and a high information extraction capability. This tool can be used to predict the evolution of biological variables according to the ingestions and the injections of molecules in the animal body. The objective of the application presented in this paper was to predict the growth of two groups of animals of a specific species, according to their initial weight and their feeding behavior. But the genericity and the parsimony of our tool might ensure its suitability to predict other performance indicators relative to other farm species.

This low data dependency and this high information extraction capability allow the use of few data to fit our tool. Therefore it can be used to reduce the costs relative to experiments, data collection, and data storage. Furthermore, in comparison with the existing predictive tools, these capabilities also make our tool more suitable to efficiently perform Data Assimilation, even if the frequency of data collection and the quality of the collected data are low.

To summarize, in our approach we distinguished different dimensions. As illustrated by Figure 3, there is the
Reality, in which there are Intakes and Injections inducing complex biological processes in the animal body. Some Sensors extract information from this Reality, which is stored in databases made of Inputs and Outputs. Since the model is not directly assimilable to the reality, the inflows and the outflows of the model are also not directly assimilable to the input and the output data. The Inputs have to be translated by a mathematical function into Entries, that are pieces of information integrated into the Mathematical Model and that induce the generation of Outcomes, also linked to the Outputs extracted from the Reality by a mathematical function.

Figure 3: Articulation of the different elements of our exploration.

As can be noticed in Figure 3, our exploration relies on the relationship between several diverse elements such as the
Real Animal, the Avatar and the Mathematical Model. The Algorithm comes out of the discretization of the Mathematical Model, i.e. the PDE system which mathematically translates what takes place in the Avatar. This system of PDEs contains parameters corresponding to biological-like factors that can be learned from a database. The Program corresponds to the code that manages this learning step. It uses an iterative training process during which an optimization algorithm finds the values of the parameters that minimize the difference between the measured and the predicted Outputs. The Tool finally corresponds to the Mathematical Model parameterized with the values of the parameters obtained at the end of the learning step.

The presence, in the mathematical model, of parameters that can be learned from data confers learning ability to the tool based on this model. Using a database, we obtained a tool able to reconstitute dynamics between inputs and outputs to perform forecasts and extrapolations. Hence, the constructed tool can be considered as a Statistical Learning Tool.

In this paper, we will present our modeling approach, the conception of our tool and the results of the applications of our approach on fictitious and real data. After this Introduction, putting this research work in its proper context, we will detail in Section 2 the conception of the mathematical model and its applicability. We tested the well-functioning and the capacities of our tool in two different ways. First, we established a fitting method taking into account the relations existing between the model parameters and we generated a database to test this fitting method on it. This first application on mastered data allowed us to verify the ability of our tool to fit the parameters. Those simulation tests are presented in Section 3. After those tests on fictitious data, we applied our approach to data collected on a farm and relative to the feeding behavior and the growth of two groups of animals. The results are presented in Section 4. This application demonstrated the prediction capability of the tool in real conditions.

We put in evidence the potential of our tool and the improvement conferred by it. To do so, we compared the capabilities of our tool with the ones of some Logistic Models, Mechanistic Models and Machine Learning algorithms. These comparisons will be detailed in Section 5.

In our approach, particular attention was paid to the construction of the
Mathematical Model embedded in the final predictive tool. Indeed, the designing of this model - that is a PDE system - was the key element to achieve our objectives of lightness, accuracy and learning potency.
Through the conceptualization of the Avatar, we set up a parsimonious summary of any biological process. Indeed, we mathematically summarized the global intern dynamics of the animal via several equations and mathematical operators which we assumed necessary and sufficient.

We hypothesized that, when a molecule or a group of molecules enters the body of a living organism, it circulates in the body through a network of vessels containing fluid. It integrates this fluid and uses it as a vector to evolve via convection and diffusion mechanisms. In the network of vessels, the molecules may be in competition with other mechanisms which may delay their progression. The circulating molecules may then be captured and accumulated in an organ or a specific tissue. During this storage, the molecules can be used and induce a change in some biological variables. Then, we built the PDE system which mathematically translates the previously set up summary, illustrated by Figure 4.
Figure 4: Schematization of the Mathematical Model.

We modeled our Avatar using variables, densities, and fields that are all unitless and dimensionless. We also reduced the considered geometrical space to the interval [0; 1]. We considered a Forward Flow Φ_f and a Backward Flow Φ_b streaming in this one-dimensional geometrical space. As described in the introduction, these flows can be seen as a very synthetic summary of blood circulation, a circulation in the nervous system or a circulation in the digestive tract, according to the problematic.

The Inputs involved in our tool can correspond to collected data relative to feed intakes, water intakes, and administered drugs. These Inputs are integrated in the Mathematical Model via a function Q transforming these Inputs into information inflows, called Entries. We modeled that part of the injected information circulates in a forward direction, via Φ_f, and the rest circulates backward, via Φ_b. This information can evolve via convection and diffusion phenomena. We assumed that the circulating information can be delayed, captured, stored and used to ultimately induce a modification in the Outcomes O. These Outcomes correspond to the model outflows. Those Outcomes are transformed by a mathematical function to be comparable with collected outputs.

Therefore, Φ_f(t, x) and Φ_b(t, x) are, at each instant t, two space densities respectively associated with a forward flux with a velocity ω and a backward flux with a velocity −ω. The spatial density Φ_f(t, x) is supposed to be solution to:

\frac{\partial \Phi_f}{\partial t}(t,x) + \omega \frac{\partial \Phi_f}{\partial x}(t,x) - c \frac{\partial}{\partial x}\left[ \chi \frac{\partial (\Phi_f + \Phi_b)}{\partial x} \right](t,x) = \frac{1}{2} Q(t,x) - f F(x)\, \Phi_f(t,x) - r\, \Phi_f(t,x),    (1)

Similarly, Φ_b(t, x) is supposed to be solution to:

\frac{\partial \Phi_b}{\partial t}(t,x) - \omega \frac{\partial \Phi_b}{\partial x}(t,x) - c \frac{\partial}{\partial x}\left[ \chi \frac{\partial (\Phi_f + \Phi_b)}{\partial x} \right](t,x) = \frac{1}{2} Q(t,x) - f F(x)\, \Phi_b(t,x) + r\, \Phi_f(t,x),    (2)

In these equations, the parameter c is the diffusion velocity of the information. The space-time density Q corresponds to an external source of information. The function F is worth 1 in certain areas of the involved geometrical space and 0 in others. The area where this function is worth 1 corresponds to the location of the entity capturing the information. The parameter f determines the rate of fixed information. The parameter r determines the fraction of the circulating information transferred from the Forward Flow to the Backward Flow, which induces a delay in the progression of the information. The function χ is compactly supported in (0, 1), mainly constant and worth 1. This function, integrated into the diffusion term, makes diffusion vanish at the edges of the domain.

At each time t, the spatial density Ψ(t, x), associated with the fixed information, is solution to:

\frac{\partial \Psi}{\partial t}(t,x) = f F(x) \left[ \Phi_b(t,x) + \Phi_f(t,x) \right] - u\, \Psi(t,x).    (3)

The parameter u is the coefficient determining the usage rate of the fixed information. At each time t, the spatial density Ξ(t, x), associated with the used information, is solution to:

\frac{\partial \Xi}{\partial t}(t,x) = u\, \Psi(t,x).    (4)

The parameter Ω corresponds to the area of action of the circulating information on the Outcome. O(t) is the Outcome of the model, given by:

O(t) = \int_\Omega \Xi(t,x)\, dx.    (5)

In this mathematical model we imposed:

\forall t \in (0, +\infty), \quad \Phi_f(t,0) = \Phi_b(t,0) \quad \text{and} \quad \Phi_b(t,1) = \Phi_f(t,1).    (6)

These conditions allow the circulating information to move back and forth between the two edges of the domain. The initial conditions Φ_f(0, x), Φ_b(0, x), Ψ(0, x), Ξ(0, x) and O(0) are given for all x in (0, 1).

The previously presented mathematical model, made of Equations (1), (2), (3), (4) and (5), can be used to simulate and predict an accumulative process. Hence, it can be used to study data relative to a total production over a given period. The fourth equation of the model is the «usage» equation. This equation determines the action of the injected information on the variable to predict. Therefore, this equation has to be adapted to the different ways in which an intake or an injection may affect a biological variable.

To model a logistic growth, we added a limiter in this equation. In this case, the «usage» equation becomes:

\frac{\partial \Xi}{\partial t}(t,x) = u\, \Psi(t,x) \left( \frac{L - O(t)}{L} \right)    (4b)

With this version of the equation, data related to the change in weight of an animal can be tackled. This equation is essentially the differential equation of Verhulst [36]:

\frac{\partial y}{\partial t}(t) = r\, y(t) \left( \frac{K - y(t)}{K} \right)    (7)

whose structure is equivalent. Indeed, in the case when nothing depends on x, the value of u is very high and Ω is the whole interval [0; 1], then Ξ, Ψ and O are very similar. Hence, Equations (7) and (4b) are essentially the same.

It may also be necessary to model variations, to use our tool to treat, for example, data concerning drug effects. To do so, we have to be able to model an increase or a decrease in the Outcomes. Since it is the case of most biological variables, we assumed that these Outcomes may vary between an upper and a lower bound.
Hence, we built two other equations. The equation

\frac{\partial \Xi}{\partial t}(t,x) = -\left( \Xi(t,x) - \mathrm{Upp} \right) - u\, \Psi(t,x) \left( \Xi(t,x) - \mathrm{Low} \right)    (4c)

models that the fixed information Ψ orients the Outcomes O toward a state that is lower than the steady state, and the equation

\frac{\partial \Xi}{\partial t}(t,x) = -u\, \Psi(t,x) \left( \Xi(t,x) - \mathrm{Upp} \right) - \left( \Xi(t,x) - \mathrm{Low} \right)    (4d)

models that the fixed information Ψ orients the Outcomes O toward a state which is greater than the steady state. In these two cases, the Outcomes vary between a lower bound Low and an upper bound Upp.

The «usage» equation has to be defined a priori according to the variable to predict and the used inputs. We expect that the mathematical models made of Equations (1), (2), (3), (5) and Equations (4), (4b), (4c) or (4d) are sufficiently generic to be fitted on data relative to all the different farm species.
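To give a concrete feel for the transport part of Equations (1) and (2), here is a minimal numerical sketch: the forward density Φ_f alone, with the source, fixation and transfer terms dropped, semi-discretized with finite differences in space and advanced with the classical fourth-order Runge-Kutta scheme in time (the discretization choices described in the next subsection). The first-order upwind stencil for the convection term is our own illustrative choice, as are the grid sizes and the values of ω and c, picked to respect the CFL constraints.

```python
import numpy as np

# Pure convection-diffusion of the forward density Phi_f on [0, 1]:
# first-order upwind differences for omega * dPhi/dx, centered differences
# for the diffusion term (kept zero at the edges, mimicking the role of chi),
# and classical RK4 in time. All numeric values are illustrative assumptions.
nx = 50
dx, dt = 1.0 / nx, 0.001
omega, c = 0.5, 0.01
x = (np.arange(nx) + 0.5) * dx              # cell centers

def rhs(phi):
    conv = np.zeros_like(phi)
    conv[1:] = -omega * (phi[1:] - phi[:-1]) / dx   # upwind convection
    lap = np.zeros_like(phi)
    lap[1:-1] = (phi[2:] - 2 * phi[1:-1] + phi[:-2]) / dx**2
    return conv + c * lap                           # diffusion vanishes at edges

def rk4_step(phi):
    k1 = rhs(phi)
    k2 = rhs(phi + 0.5 * dt * k1)
    k3 = rhs(phi + 0.5 * dt * k2)
    k4 = rhs(phi + dt * k3)
    return phi + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

phi = np.exp(-200.0 * (x - 0.2) ** 2)        # initial information pulse
for _ in range(400):                         # advance to t = 0.4
    phi = rk4_step(phi)
# the pulse is transported toward x ~ 0.2 + omega * 0.4 = 0.4, spreading out
```

With ω dt/dx = 0.025 and c dt/dx² = 0.025, the sketch stays well inside the explicit stability limits; the first-order upwind scheme adds some numerical diffusion, which is acceptable for an illustration.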
For the discretization of the Mathematical Model, we first used the classical Finite Difference method, with a given space step, to obtain semi-discrete in space equations. And, because the Mathematical Model is coded using the R software, we used the R-function ode.1D, developed by Soetaert et al. [37], to manage the temporal discretization of the semi-discrete equations. This R-function calls upon the fourth-order Runge-Kutta method with a given time step (see [38]). In this first exploration, to find a compromise between precision and calculation time, we parameterized the mesh with a time step of . and a space step of . .

The system of PDEs contains several parameters: ω, c, r, f, u and L. The diffusion parameter c is the least influential model parameter. Hence, we set it to . . All the other parameters are fitted from a database by using an optimization algorithm. To do so, we used the function directL, developed by Johnson [39], which is embedded in R ([40]) and applies the DIRECT algorithm developed by Finkel [41]. This algorithm searches for the optimal values of the parameters, that is the values that minimize the error associated with the model on a given training database.

A detailed mathematical analysis of the model and its discretization will be performed in an upcoming paper. Nevertheless, we already know that the convection and diffusion speeds must satisfy the CFL conditions (see [42] and [43]). Indeed, since we set the discretization steps (Section 2.3), ω must be smaller than . and c must be smaller than . .

To fit the model we have to specify lower and upper values for each parameter, between which the optimization algorithm will search for their optimal values. Therefore, a comprehensive study of the ranges of values of the different model parameters was performed and presented in the working paper [44]. We refer to it for the details of this study.

The objective of this section is to present the tests by simulation performed to verify the ability of the tool to learn parameters from noisy biological data. To do so, we started by generating a fictitious database from the parameterized mathematical model made of Equations (1), (2), (3), (4) and (5).
Then, we used this database to study the compensation effects existing between the parameters, to simulate the fitting of the parameters and then to verify that the model fits the data correctly.
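A fictitious noisy database of this kind can be sketched as follows. The paper's actual generation procedure is in its Appendix I and is not reproduced here; this sketch only illustrates the idea of combining a smooth reference curve, individual variability on its parameters, and additive measurement noise, with all distributions and values being our own illustrative assumptions.

```python
import numpy as np

# Illustrative construction of a fictitious database of 50 noisy Output
# curves: a logistic-like reference curve, individual variability on the
# asymptote and growth rate, and additive Gaussian measurement noise.
rng = np.random.default_rng(42)
t = np.linspace(0.0, 1.0, 30)

def one_individual():
    K = 100.0 * (1.0 + 0.05 * rng.normal())       # individual asymptotic weight
    r = 3.0 * (1.0 + 0.05 * rng.normal())         # individual growth rate
    clean = K / (1.0 + 19.0 * np.exp(-r * 5.0 * t))
    return clean + rng.normal(0.0, 1.0, t.size)   # measurement noise

curves = np.array([one_individual() for _ in range(50)])
training, test = curves[:35], curves[35:]         # train/test split
```

The split into a Training part and a Test part mirrors the way the generated database is used in the rest of this section.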
We generated a fictitious database containing 50 individuals, that is 50 Output Curves. The objective was to obtain a database having the same characteristics as a real field database. To do so, we integrated noise and individual variations in it. The construction of this fictitious database is presented in Appendix I. We refer to it for the details.

Figure 13 shows an example of the generated curves without and with noise. We divided the obtained database into two parts: a Training Database and a Test Database.

In the rest of this section, we supposed that we have an experimental-like database and a model containing four parameter values to determine: ω, r, f and u.

Figure 5: Example of simulated curves without and with noise.

A study of the compensation effects existing between ω and r and between f and u put in evidence the existence of relations between these two pairs of parameters. For example, the relation existing between the parameters f and u can be noticed in Figure 6. This study is presented in Appendix II. We refer to it for the details.

We concluded from this study that, using Nadaraya-Watson kernel regressions (see [45] and [46]), we obtained a non-parametric relationship linking ω and r in the form

r = \hat{m}_\omega(\omega) + \epsilon_\omega,    (8)

and another one linking f and u in the form

u = \hat{m}_f(f) + \epsilon_f,    (9)

where \hat{m}_\omega and \hat{m}_f correspond to the Nadaraya-Watson estimators and \epsilon_\omega and \epsilon_f are the residuals.

Figure 6: Representation of the value of an error indicator (the Relative Residual Sum of Squares) according to the values of f and u.

Knowing the relationship existing between ω and r and the one existing between f and u, it is possible to fit ω and f and then deduce the values of r and u. Hence, these relations permit to reduce the number of parameters to learn simultaneously and so facilitate and reinforce the fitting process.
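The Nadaraya-Watson estimator used for Equations (8) and (9) can be sketched as follows. The Gaussian kernel, the bandwidth h and the synthetic (f, u) sample below are illustrative assumptions, not the paper's data.

```python
import numpy as np

# Minimal Nadaraya-Watson kernel regression: a kernel-weighted local average
# of the responses, here with a Gaussian kernel of bandwidth h.

def nadaraya_watson(x_train, y_train, x_query, h=0.1):
    """Estimate E[y | x] at each query point as a weighted mean of y_train."""
    x_query = np.atleast_1d(np.asarray(x_query, dtype=float))
    w = np.exp(-0.5 * ((x_query[:, None] - x_train[None, :]) / h) ** 2)
    return (w @ y_train) / w.sum(axis=1)

# Synthetic (f, u) pairs with a hidden relation u = 2 f^2 plus noise:
rng = np.random.default_rng(0)
f_vals = rng.uniform(0.0, 1.0, 200)
u_vals = 2.0 * f_vals**2 + rng.normal(0.0, 0.05, 200)
u_hat = nadaraya_watson(f_vals, u_vals, 0.5)   # estimate u at f = 0.5
```

Once such an estimator \hat{m}_f is available, fitting f and reading off u = \hat{m}_f(f) reduces the number of parameters to optimize simultaneously, as described above.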
We fitted the parameters to the Training Database and then we tested the accuracy of the obtained model by calculating the error made on the Test Database.

Fitting of ω and f and accuracy of the obtained predictive tool. To perform several fittings we sampled the Training Database: we sampled curves from the Training Database and we fitted the parameters to the sampled curves. By proceeding in this manner, we performed several fittings of the parameters. To determine the values of ω, r, f and u, we fitted ω and f on the selected curves of the Training Database and we deduced the values of r and u using Equalities (8) and (9). To optimize the parameters, we used the R-function directL to find the values of ω and f minimizing the function

f_{obj}(\omega, f) = \frac{1}{n} \sum_{i=1}^{n} \left( \sum_{j=1}^{m} \left( \frac{y_{ij}^{obs} - y_{ij}^{pred}(\omega, f)}{y_{ij}^{obs}} \right)^2 \right).    (10)

After the fittings of the parameters, we obtained values of ω, r, f and u. We calculated the average value and the Relative Standard Deviation (RSD) of each parameter (Table 1). We also looked at the fit of the model, calculating from the Training Database the value of the Determination Coefficient R² (Figure 7 and Table 1). The results show that the model fits the curves of the Training Database well (R² = 0.97).

Figure 7: Examples of results given by the predictive tool in comparison with some training curves.

Table 1: Average and Relative Standard Deviation of the parameters and the Determination Coefficient calculated on the Training Database.

To validate the ability of the tool to learn parameters from noisy data, we calculated the accuracy of the model on the Test Database. To do so, we calculated the Relative Residual Sum of Squares (RRSS) and the Determination Coefficient associated with each curve contained in the Test Database and we obtained the distributions shown in Figure 8. The RRSS is low and the Determination Coefficient is high, indicating that the model fits the curves of the Test Database well.

We compared the R² and the RRSS associated with the Generator Model (R²_Gener and RRSS_Gener), i.e. the model used to generate the fictitious database, and the R² and the RRSS associated with the Fitted Model (R²_Fit and RRSS_Fit) (Figure 8 and Table 2). RRSS_Fit is low and this value is very similar to the value of RRSS_Gener. R²_Fit is high and this value is also very similar to the value of R²_Gener. These indicators thus demonstrate that the model fitting method is highly satisfactory and that the error associated with the adjusted model is limited to the amount of noise and individual differences initially integrated into the generated database.

Table 2: Comparison between the indicators associated with the Generator Model and the Fitted Model.

Figure 8: Distributions of the RRSS and of the R² coefficient associated with the Generator Model and the Fitted Model.
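The fitting step above can be sketched as follows. A coarse grid search over a bounded box stands in here for the DIRECT algorithm, and a toy two-parameter curve stands in for the PDE model; the objective mirrors the relative squared error of Equation (10). Everything in this sketch (the curve, the bounds, the grid resolution) is an illustrative assumption.

```python
import numpy as np

# Toy two-parameter "model" y(t) = a * (1 - exp(-b t)) fitted by minimizing
# a relative squared error over a bounded parameter box, in the spirit of
# Equation (10). A coarse grid search substitutes for DIRECT.
t = np.linspace(0.1, 5.0, 20)
a_true, b_true = 3.0, 0.8
y_obs = a_true * (1.0 - np.exp(-b_true * t))     # noiseless toy "data"

def f_obj(a, b):
    y_pred = a * (1.0 - np.exp(-b * t))
    return np.sum(((y_obs - y_pred) / y_obs) ** 2)

# search within prescribed lower/upper parameter bounds
a_grid = np.linspace(1.0, 5.0, 81)
b_grid = np.linspace(0.1, 2.0, 81)
errs = [[f_obj(a, b) for b in b_grid] for a in a_grid]
ia, ib = np.unravel_index(np.argmin(errs), (81, 81))
a_fit, b_fit = a_grid[ia], b_grid[ib]
# a_fit, b_fit recover a_true, b_true up to the grid resolution
```

Like DIRECT, this search only needs objective evaluations and box bounds, which is why prescribing sensible parameter ranges (as studied in [44]) matters for the real fitting procedure.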
In this section, we present an application of our approach to field data. The database we used is confidential; therefore only the dimensionless Inputs and Outputs are presented.
The objective of this application is to build a tool that can predict the logistic growthof animals according to their initial weights and their intakes all along a given period.
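The logistic growth targeted in this application follows Verhulst-type dynamics, recalled as Equation (7) in Section 2.2. A minimal forward-Euler sketch, with illustrative values of r, K and the initial weight (none of them fitted values from the paper):

```python
# Forward-Euler integration of the Verhulst equation dy/dt = r*y*(K - y)/K,
# the logistic behavior underlying the modified "usage" Equation (4b).
# r, K, y0 and the time grid are illustrative, not fitted, values.
r, K, y0 = 1.5, 100.0, 5.0
dt, T = 0.01, 10.0
y = y0
for _ in range(int(T / dt)):
    y += dt * r * y * (K - y) / K
# y approaches the carrying capacity K without overshooting it
```

In the application below, the parameter L of Equation (4b) plays the role of the carrying capacity K: the maximum weight attainable by the studied species.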
To mimic a logistic behavior, we used Equation (4b) as the «usage» equation (Section 2.2). In this equation, L corresponds to the maximum weight attainable by the animals of the studied species. Experts have an idea of the value of L. Therefore, during the model fitting, the value of this parameter was searched in a restricted range of values. Therefore, in this application we used the mathematical model made of Equations (1), (2), (3), (4b) and (5). This model, considered as a growth model, contained five parameters to fit: ω, r, f, u and L.

The database we used is made of two parts, corresponding to two different groups of animals monitored during two different periods (Table 3). The first group was monitored over a unit-period from t = 0 until t = 1. For this group, the weight of the animals was measured at t = 0 and at t = 1. The second group was monitored from t = 0 until t = 2. . For this group, the weight of the animals was measured at t = 0, t = 0. , t = 1. and at t = 2. . For both groups, intakes of each individual were recorded over each time-step of 0.16 time-unit. Therefore, for each individual, information relative to those intakes is periodically injected in the model with a time-step of 0.16.

The dataset relative to the first group constitutes our Training Database and the dataset relative to the second group constitutes our Test Database.

Table 3: Description of the data used. Time step of the
Entries injections: ∆t_In = 0.16.

As in Section 3.2 and Appendix II, we built relationships between some parameters of the model by applying the same methodology. Using Nadaraya-Watson kernel regressions, we obtained a non-parametric relation linking ω and r and another one linking f and u. Knowing the relationships between these parameters, it is possible to fit ω and f and then deduce the values of r and u. Therefore, in this application we only fitted ω, f and L and deduced the values of r and u.

The parameters were fitted to the Training Database by minimizing the difference between the simulated and the real Outputs at time t = 1. Hence, to fit the parameters, we used the algorithm DIRECT to minimize the function

f_{obj}(\omega, f, L) = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{O_i^{obs}(1) - O_i^{pred}(1)}{O_i^{obs}(1)} \right)^2,    (11)

where n is the number of individuals contained in the Training Database and O_i^{obs}(1) and O_i^{pred}(1) correspond respectively to the values of the observed and the predicted Outputs for the i-th individual at t = 1.

To perform several fitting procedures, we sampled the Training Database: we randomly selected individuals before each fitting procedure and we fitted the parameters on the data associated with the selected individuals. We thus performed several fittings and obtained sets of values of (ω, r, f, u, L). We calculated the average and the
RSD of each parameter (Table 4). The RSD of each parameter is low, indicating that our fitting method identified a single set of parameter values minimizing the error associated with the Fitted Model. The existence of a single optimal set of values of (ω, r, f, u, L) attests to the identifiability of the model. We parameterized the model with the average values of the parameters.

We then calculated the error associated with the model on the Training Database. To do so, we calculated the Average Relative Error (ARE) between the measured and the predicted values of the Output at time t = 1:

ARE(t) = \frac{1}{n} \sum_{i=1}^{n} \sqrt{ \left( \frac{O^i_{obs}(t) - O^i_{pred}(t)}{O^i_{obs}(t)} \right)^2 }.   (12)

Table 4: Average values and RSD of the fitted parameters; ARE calculated at time t = 1 on the Training Database.

| Parameter  | Mean | RSD |
| ω          | .24  | 0.  |
| r          | .91  | 0.  |
| f          | .01  | 0.  |
| u          | .49  | 0.  |
| L          | .70  | 0.  |
| ARE(1) (%) | 1.83 | 0.  |

The
ARE value calculated on the Training Database at time t = 1 is 1.83% (Table 4). This result is satisfactory, but the accuracy of the model must be calculated on a Test Database to ensure that the model does not overfit the training data. To do so, we calculated the ARE given by Equation (12) on the Test Database at times t = 0., t = 1. and t = 2. (Table 5 and Figure 9).

Figure 9: Difference obtained between the measured (+) and predicted (×) values of the Output variable at different times t for the individuals in the Test Database.

Table 5: Average Relative Error (ARE) calculated on the Test Database at different instants.

6 Discussion of the results

The error associated with the model is low on the Test Database, and the errors made before and after time t = 1 remain low. These results indicate that our tool can be trained on a very small database to link the inputs and the outputs, and can then accurately predict the weight of the animals over a period more than twice as long as the training one. This extrapolation capability, obtained despite the very low quantity of training data, illustrates that our tool holds a high potential for information extraction. As demonstrated below, this capability distinguishes our approach from other inference methods.

Moreover, in addition to this information-extraction potential, the extrapolation capability helps to reduce the training-data dependency of our tool. Indeed, we demonstrated that our tool can be applied outside the training data range and provide accurate extrapolations. Hence, we do not need to fit it on data covering the whole curve to predict, and so we can use a smaller Training Database. This extrapolation capability therefore reduces the duration of data collection and of in situ experiments, and thus the computational and experimental costs.

According to [16] and [47], the current methods used to simulate and predict logistic growth processes involve two main types of models: Phenomenological Models, corresponding to «Black Box» models, and Mechanistic Models, corresponding to «White Box» models. In this section, we compare some models belonging to these two main categories with the Biomimetic Statistical Learning tool presented in this paper.
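The accuracy criteria used throughout this comparison, the ARE of Equation (12) and the ARA of Equation (15), are straightforward to compute. Below is an illustrative Python sketch (the paper's own implementation is in R, and these function names are ours, not the authors'):

```python
# Illustrative implementation of Equations (12) and (15).
import math

def average_relative_error(obs, pred):
    """ARE(t): mean over individuals of the absolute relative error (Eq. (12))."""
    assert len(obs) == len(pred) and len(obs) > 0
    return sum(math.sqrt(((o - p) / o) ** 2) for o, p in zip(obs, pred)) / len(obs)

def average_relative_accuracy(obs, pred):
    """ARA(t) = 1 - ARE(t) (Eq. (15))."""
    return 1.0 - average_relative_error(obs, pred)
```

For instance, observed weights [100, 200] against predictions [90, 220] give ARE = (10/100 + 20/200)/2 = 0.1, hence ARA = 0.9.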
As defined in [16], the Phenomenological Models include Linear, Multiple Linear and Nonlinear Regressions, Logistic Models and Neural Networks.
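Logistic Models of this family, such as the Gompertz and Verhulst equations recalled below, are one-dimensional ordinary differential equations. A minimal Python sketch of integrating them with a classical fixed-step fourth-order Runge-Kutta scheme (the parameter values used here are arbitrary illustrations, not the fitted ones):

```python
# Fixed-step RK4 integration of the Gompertz and Verhulst growth models.
import math

def rk4(f, y0, t0, t1, n_steps):
    """Classical 4th-order Runge-Kutta for dy/dt = f(t, y)."""
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

def gompertz(a_G, K_G):
    # dN/dt = a_G * N * ln(K_G / N)
    return lambda t, N: a_G * N * math.log(K_G / N)

def verhulst(a_V, K_V):
    # dN/dt = a_V * N * (1 - N / K_V)
    return lambda t, N: a_V * N * (1 - N / K_V)
```

Both models saturate at their carrying capacity K, which is why they produce sigmoid growth curves; the Verhulst solution can be checked against its closed form N(t) = K / (1 + ((K − N0)/N0) e^{−at}).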
The models of Gompertz [48],

\frac{dN(t)}{dt} = a_G \, N(t) \, \ln\left( \frac{K_G}{N(t)} \right),   (13)

and Verhulst [36],

\frac{dN(t)}{dt} = a_V \, N(t) \left( 1 - \frac{N(t)}{K_V} \right),   (14)

are two models frequently used to model growth processes (e.g. see [49], [50], [51], [52], [53] and [54]). We fitted the parameters of these two models on our Training Database by using the same optimization algorithm that we used to fit the Biomimetic Model and by minimizing ARE(1).

To test and compare the accuracy of the different models, we calculated on the Test Database the Average Relative Accuracy, ARA, at different times t:

ARA(t) = 1 - ARE(t),   (15)

where ARE is given by Equation (12). We used the three parameterized models to generate the growth curves of the individuals of the Test Database, and we compared the measured and the predicted values at times t = 0., t = 1. and t = 2.. The results contained in Table 6 and the curves of Figures 10 and 11 show that the curves generated from the Gompertz model featured a premature deceleration, whereas the Verhulst model is associated with a good accuracy over the whole studied period.

The similarity of the results from the Verhulst and the Biomimetic Growth Models was expected, because our model includes an equation assimilable to the Verhulst equation (see Section 2.2). The real advantage of our biomimetic growth model is its ability to integrate input data. Indeed, the Verhulst equation only takes into account the initial conditions of the system under study, whereas our model also integrates intakes throughout the studied period. This capability to integrate additional information helps refine the results and increases the accuracy of our model. Moreover, since the Verhulst and Gompertz models cannot integrate input data over time, they are not able to perform Data Assimilation, contrary to our tool.

Table 6: Parameter values and ARA(1) calculated on the Training Database.

| Model      | a        | K           | ARA(1) |
| Gompertz   | a_G = 0. | K_G = 0.563 | 0.     |
| Verhulst   | a_V = 0. | K_V = 1.563 | 0.     |
| Biomimetic |          |             | .      |

Figure 10: The
ARA calculated on the Test Database at different times t and associated with the different models.

Figure 11: Plot of the predicted growth curves of two individuals contained in the Test Database with the different models.

Over the past decade, the use of Machine Learning (ML) algorithms, and especially of Neural Networks (NN), has been on the rise [55]. According to some studies ([56], [57], [58] and [59]), the popularity of these tools can be explained by the ease of their implementation and the diversity of issues that these algorithms can handle. Nevertheless, these algorithms are based on relatively simple mathematical models that cannot easily take into account complex phenomena, such as delay and saturation. Hence, we applied different Neural Networks to our Training Database to compare this kind of ML tool with our Biomimetic Growth Model. We tested six Neural Networks having different numbers of nodes and hidden layers, taking as inputs the initial weight of each individual and their periodically recorded intakes (Table 7).

Table 7: The ARA calculated on the Training Database (ARA_Train) and on the Test Database (ARA_Test), at t = 1, with different Neural Networks. The Neural Network (k_1, ..., k_i, ..., k_n) corresponds to a Neural Network containing n hidden layers, where the i-th hidden layer contains k_i nodes.

| Structure   | ARA_Train(1) (%) | ARA_Test(1) (%) |
| (4)         | 99.9             | 78.8            |
| (4,3)       | 99.8             | 90.5            |
| (6,5)       | 99.7             | 93.4            |
| (4,6,6,3)   | 99.9             | 94.8            |
| (5,7,7,7,4) | 99.8             | 95.3            |
| (5,9,9,9,5) | 99.9             | 93              |

We fitted each tested Neural Network on our Training Database by using the R function neuralnet developed by Fritsch et al. [60], and we calculated the accuracy of those Neural Networks on the Training and on the Test Database. The results given in Table 7 show that all the tested Neural Networks fit the curves of the Training Database better than those of the Test Database. This shows that the tested Neural Networks overfit the training curves, particularly when their structure is composed of too many or too few nodes and hidden layers. The accuracy of the Neural Networks on the Test Database increases up to a certain number of nodes and hidden layers, and then decreases when the complexity of the structure continues to increase. On the Test Database, the highest accuracy value is reached using the Neural Network containing five hidden layers, but this value is lower than that obtained using our Biomimetic Model (Table 7 and Figure 10).

Nevertheless, the accuracy of these ML tools is satisfactory, and the real advantage of our Biomimetic tool over Neural Networks is its extrapolation capability. Indeed, as for the Biomimetic Model, the studied Neural Networks were fitted only from the value of the Output at t = 1. In this case, the fitted classical Neural Networks can only be used to predict the Output at t = 1. Hence, Neural Networks cannot interpolate or extrapolate, in contrast to our Biomimetic Model. Therefore, contrary to our tool, Neural Networks cannot be used in a "Biological Small Data" context to reduce the experimental and computational costs, and they are less suitable for performing Data Assimilation in this context.

Mechanistic Growth Models are another kind of tool that gathers biological inputs to predict the growth of plants or animals. Some models of this type have been developed in [61], [62], [63] and [64].
These models integrate numerous Inputs, not all of which are available in our study; hence, they cannot be applied to our database. Therefore, we only compared the structure, the functioning and the objectives of those Mechanistic Models with our Biomimetic one. As noted in [16], [65], [17] and [66], the construction of Mechanistic Growth Models generally focuses on the biological meaning of the overall model. Therefore, the construction of explanatory mechanistic models takes time, requires a large quantity of zootechnical knowledge and results in complex models. As explained in [67], [17] and [68], these models contain a large number of unknown parameters and include many factors, forcing the user to enter a large number of Input values, which are sometimes difficult or costly to obtain. Hence, their complex structure makes Mechanistic Realistic Models inappropriate for data fitting and Data Assimilation.
To conclude, we built a Biomimetic Statistical Learning tool based on a PDE system embedding the mathematical expression of biological determinants. This PDE system contains parameters that can be fitted to data, and it was carefully designed to have a high learning potency and a great accuracy while remaining light and flexible.

In the particular «Biological Small Data» context, the performed applications showed that this tool has a higher accuracy than the existing tools. Moreover, our Biomimetic Statistical Learning tool distinguishes itself by its suitability for performing Data Assimilation even when the frequency of data collection and the quality of the collected data are low. The extrapolation capability of our tool, coupled with its high learning potency, permits fitting it on few data and applying it accurately outside the range of the training data. Hence, the quantity of collected data can be reduced, as can the costs relative to experiments and data management.

To sum up, our tool can be used to predict health and performance indicators according to the ingestion or the injection of molecules in an animal body, and to perform accurate and inexpensive Livestock Data Assimilation. The pursuit of an optimal combination between the use of data and the use of prior knowledge via PDEs seems to be a promising way to build Artificial Intelligence (AI) tools with a strong learning ability and a weak training-data dependency.

Nevertheless, the results coming from the Biomimetic Model were obtained under a certain number of hypotheses. Model Selection methods could be applied to select the structure of the Mathematical Model, permitting a more satisfying trade-off between the ARE and the number of parameters to learn. This suggested improvement will be studied in a forthcoming work.
Acknowledgements
The authors are very grateful to D. Causeur, G. Durrieu and E. Fokoué for fruitful discussions related to this article.
References

[1] M. McPhee. "Mathematical modelling in agricultural systems: A case study of modelling fat deposition in beef cattle for research and industry". 2009.
[2] L. Puillet, O. Martin, D. Sauvant, and M. Tichit. "Introducing efficiency into the analysis of individual lifetime performance variability: a key to assess herd management". In: Animal. url: https://hal.archives-ouvertes.fr/hal-01137029.
[3] O. Martin and D. Sauvant. "A teleonomic model describing performance (body, milk and intake) during growth and over repeated reproductive cycles throughout the lifespan of dairy cattle. 2. Voluntary intake and energy partitioning". In: Animal.
[4] J. D. Nkrumah, J. Basarab, Z. Wang, C. Li, M. Price, E. Okine, D. H. Crews, and S. S. Moore. "Genetic and phenotypic relationships of feed intake and measures of efficiency with growth and carcass merit of beef cattle". In: Journal of Animal Science 85 (2007), pp. 2711–2720.
[5] H. Nesetrilova. "Multiphasic growth models for cattle". In: Czech Journal of Animal Science 50 (2005), pp. 347–354.
[6] J. Basarab, M. Price, J. L. Aalhus, E. Okine, W. M. Snelling, and K. L. Lyle. "Residual feed intake and body composition in young growing cattle". In: Canadian Journal of Animal Science 83 (2003), pp. 189–204.
[7] D. Auroux and J. Blum. "Back and forth nudging algorithm for data assimilation problems". In: Comptes Rendus Mathematique.
[8] In: Journal of Marine Systems.
[9] In: Monthly Weather Review.
[10] In: Remote Sensing.
[11] In: Agricultural Sciences in China. issn: 1671-2927. doi: 10.1016/S1671-2927(11)60156-9.
[12] B. Rijk. "Integration of sensor data in crop models for precision agriculture". 2013.
[13] S. Janssen, C. H. Porter, A. D. Moore, I. N. Athanasiadis, I. Foster, J. W. Jones, and J. M. Antle. "Towards a new generation of agricultural system models, data, and knowledge products: building an open web-based approach to agricultural data, system modeling and decision support. AgMIP". In: Towards a New Generation of Agricultural System Models, Data, and Knowledge Products 91 (2015).
[14] J. C. W. Locke, A. J. Millar, and M. S. Turner. "Modelling genetic networks with noisy and varied experimental data: the circadian clock in Arabidopsis thaliana". In: Journal of Theoretical Biology 234.3 (2005), pp. 383–393.
[15] Y. Qi, Z. Bar-Joseph, and J. Klein-Seetharaman. "Evaluation of different biological data and computational classification methods for use in protein interaction prediction". In: Proteins: Structure, Function, and Bioinformatics. doi: 10.1002/prot.20865.
[16] M. A. Vázquez-Cruz, A. Espinosa-Calderón, A. R. Jiménez-Sánchez, and R. Guzmán-Cruz. "Mathematical Modeling of Biosystems". In: Biosystems Engineering: Biofactories for Food Production in the Century XXI. Cham: Springer International Publishing, 2014, pp. 51–76. doi: 10.1007/978-3-319-03880-3_2.
[17] D. Bastianelli and D. Sauvant. "Modelling the mechanisms of pig growth". In: Livestock Production Science (1997).
[18] O. Martin and D. Sauvant. "A teleonomic model describing performance (body, milk and intake) during growth and over repeated reproductive cycles throughout the lifespan of dairy cattle. 1. Trajectories of life function priorities and genetic scaling". In: Animal (2010).
[19] A. C. Tan and D. Gilbert. "An empirical comparison of supervised machine learning techniques in bioinformatics". In: Proceedings of the First Asia-Pacific Bioinformatics Conference, Volume 19. Australian Computer Society, 2003, pp. 219–222.
[20] J. Shavlik, L. Hunter, and D. Searls. "Introduction". In: Machine Learning. issn: 1573-0565. doi: 10.1007/BF00993376.
[21] T. Hubbard and A. Reinhardt. "Using neural networks for prediction of the subcellular location of proteins". In: Nucleic Acids Research. issn: 0305-1048. doi: 10.1093/nar/26.9.2230.
[22] S. H. Dumpala, R. Chakraborty, and S. K. Kopparapu. "k-FFNN: A priori knowledge infused Feed-forward Neural Networks". 2017. arXiv preprint.
[23] E. Frénod. "A PDE-like Toy-Model of Territory Working". In: Understanding Interactions in Complex Systems - Toward a Science of Interaction. Cambridge Scholars Publishing, 2017, pp. 37–47. url: https://hal.archives-ouvertes.fr/hal-00817522.
[24] A. Rousseau and M. Nodet. "Modélisation mathématique et assimilation de données pour les sciences de l'environnement". In: Bulletin de l'APMED 505 (2013), pp. 467–472.
[25] W. J. Sacks, D. S. Schimel, and R. K. Monson. "Coupling between carbon cycling and climate in a high-elevation, subalpine forest: a model-data fusion analysis". In: Oecologia. issn: 0029-8549. url: http://link.springer.com/10.1007/s00442-006-0565-2.
[26] L. Wang, H. Zhang, K. C. L. Wong, H. Liu, and P. Shi. "Physiological-model-constrained noninvasive reconstruction of volumetric myocardial transmembrane potentials". In: IEEE Transactions on Biomedical Engineering.
[27] In: Quarterly Journal of the Royal Meteorological Society. url: https://rmets.onlinelibrary.wiley.com/doi/abs/10.1256/003590002321042135.
[28] G. Kim and A. P. Barros. "Space–time characterization of soil moisture from passive microwave remotely sensed imagery and ancillary data". In: Remote Sensing of Environment. issn: 0034-4257. doi: 10.1016/S0034-4257(02)00014-7.
[29] W. L. Crosson, C. A. Laymon, R. Inguva, and M. P. Schamschula. "Assimilating remote sensing data in a surface flux–soil moisture model". In: Hydrological Processes 16 (2002), pp. 1645–1662.
[30] D. S. Mackay, S. Samanta, R. R. Nemani, and L. E. Band. "Multi-objective parameter estimation for simulating canopy transpiration in forested watersheds". In: Journal of Hydrology. issn: 0022-1694. doi: 10.1016/S0022-1694(03)00130-6.
[31] D. J. Barrett. "Steady state turnover time of carbon in the Australian terrestrial biosphere". In: Global Biogeochemical Cycles. url: https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2002GB001860.
[32] D. Barrett, M. Hill, L. Hutley, J. Beringer, J. H. Xu, G. Cook, J. Carter, and R. J. Williams. "Prospects for improving savanna biophysical models by using multiple-constraints model-data assimilation methods". In: Australian Journal of Botany.
[33] P. J. Rayner, M. Scholze, W. Knorr, T. Kaminski, R. Giering, and H. Widmann. "Two decades of terrestrial carbon fluxes from a carbon cycle data assimilation system (CCDAS)". In: Global Biogeochemical Cycles. url: https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2004GB002254.
[34] W. J. Sacks, D. S. Schimel, R. K. Monson, and B. H. Braswell. "Model-data synthesis of diurnal and seasonal CO2 fluxes at Niwot Ridge, Colorado". In: Global Change Biology. url: https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1365-2486.2005.01059.x.
[35] P. Ailliot, E. Frénod, and V. Monbet. "Long term object drift in the ocean with tide and wind". In: SIAM Journal on Multiscale Modeling and Simulation. url: https://hal.archives-ouvertes.fr/hal-00129093.
[36] P. F. Verhulst. "Notice sur la loi que la population suit dans son accroissement". In: Corresp. Math. Phys. 10 (1838), pp. 113–126.
[37] K. Soetaert, T. Petzoldt, and R. Woodrow Setzer. "Solving Differential Equations in R: Package deSolve". In: Journal of Statistical Software. issn: 1548-7660.
[38] W. Enright. "The Numerical Analysis of Ordinary Differential Equations: Runge-Kutta and General Linear Methods". In: SIAM Review. doi: 10.1137/1031147.
[39] S. G. Johnson. The NLopt nonlinear-optimization package. 2008.
[40] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2014.
[41] D. E. Finkel. DIRECT Optimization Algorithm. North Carolina State University, 2003.
[42] R. Courant, K. Friedrichs, and H. Lewy. "Über die partiellen Differenzengleichungen der mathematischen Physik". In: Mathematische Annalen.
[43] Wolfram MathWorld - A Wolfram Web Resource. 2014.
[44] H. Flourent. "Study of the ranges of values of a Biomimetic Statistical Learning Tool parameters". Working paper. 2019. url: https://hal.archives-ouvertes.fr/hal-02067374.
[45] E. A. Nadaraya. "On estimating regression". In: Theory of Probability & Its Applications.
[46] In: Sankhyā: The Indian Journal of Statistics, Series A (1964), pp. 359–372.
[47] R. Guzmán-Cruz, R. Castaneda-Miranda, J. Garcia-Escalante, L. Solis-Sánchez, D. Alaniz-Lumbreras, J. Mendoza-Jasso, A. Lara-Herrera, G. Ornelas-Vargas, E. Gonzalez-Ramirez, and R. Montoya-Zamora. "Evolutionary Algorithms in Modelling of Biosystems". 2011. isbn: 978-953-307-171-8.
[48] B. Gompertz. "XXIV. On the nature of the function expressive of the law of human mortality, and on a new mode of determining the value of life contingencies. In a letter to Francis Baily, Esq. FRS &c". In: Philosophical Transactions of the Royal Society of London 115 (1825), pp. 513–583.
[49] C. P. Winsor. "The Gompertz Curve as a Growth Curve". In: Proceedings of the National Academy of Sciences. issn: 0027-8424. doi: 10.1073/pnas.18.1.1.
[50] N. K. Sakomura, F. A. Longo, E. O. Oviedo-Rondon, C. Boa-Viagem, and A. Ferraudo. "Modeling energy utilization and growth parameter description for broiler chickens". In: Poultry Science.
[51] In: Life Sciences.
[52] In: Journal of Biological Chemistry.
[53] The Chemical Basis of Growth and Senescence. JB Lippincott Company, 1923.
[54] P. Román-Román and F. Torres-Ruiz. "Modelling logistic growth by a new diffusion process: Application to biological systems". In: Biosystems.
[55] In: Commun. ACM 55 (2012), pp. 78–87.
[56] M. T. Gorczyca, H. F. M. Milan, A. S. C. Maia, and K. G. Gebremedhin. "Machine learning algorithms to predict core, skin, and hair-coat temperatures of piglets". In: Computers and Electronics in Agriculture 151 (2018), pp. 286–294. issn: 0168-1699. doi: 10.1016/j.compag.2018.06.028.
[57] J. J. Valletta, C. Torney, M. Kings, A. Thornton, and J. Madden. "Applications of machine learning in animal behaviour studies". In: Animal Behaviour 124 (2017), pp. 203–220. issn: 0003-3472. doi: 10.1016/j.anbehav.2016.12.005.
[58] C. Ma, H. H. Zhang, and X. Wang. "Machine learning for Big Data analytics in plants". In: Trends in Plant Science. issn: 1360-1385. doi: 10.1016/j.tplants.2014.08.004.
[59] R. H. L. Ip, L. M. Ang, K. P. Seng, J. C. B., and J. E. Pratley. "Big data and machine learning for crop protection". In: Computers and Electronics in Agriculture 151 (2018), pp. 376–383. issn: 0168-1699. doi: 10.1016/j.compag.2018.06.008.
[60] S. Fritsch, F. Guenther, and M. Suling. neuralnet: Training of neural networks. R package version 1.32. 2012. url: http://CRAN.R-project.org/package=neuralnet.
[61] D. Bastianelli, D. Sauvant, and A. Rerat. "Mathematical modeling of digestion and nutrient absorption in pigs". In: Journal of Animal Science (1996).
[62] J. Mach and Z. Kristkova. "Modelling The Cattle Breeding Production in the Czech Republic". In: AGRIS on-line Papers in Economics and Informatics.
[63] In: Animal: an international journal of animal bioscience.
[64] E. C. T. Zúñiga, I. L. L. Cruz, and A. R. García. "Parameter estimation for crop growth model using evolutionary and bio-inspired algorithms". In: Applied Soft Computing 23 (2014), pp. 474–482. issn: 1568-4946. doi: 10.1016/j.asoc.2014.06.023.
[65] L. O. Tedeschi, D. G. Fox, R. D. Sainz, L. G. Barioni, S. R. de Medeiros, and C. Boin. "Mathematical models in ruminant nutrition". In: Scientia Agricola 62 (2005), pp. 76–91. issn: 0103-9016.
[66] D. E. Beever, A. J. Rook, J. France, M. S. Dhanoa, and M. Gill. "A review of empirical and mechanistic models of lactational performance by the dairy cow". In: Livestock Production Science. issn: 0301-6226. doi: 10.1016/0301-6226(91)90061-T.
[67] D. Wallach, B. Goffinet, J. E. Bergez, P. Debaeke, D. Leenhardt, and J. N. Aubertot. "Parameter estimation for crop models". In: Agronomy Journal.
[68] In: Publication - European Association for Animal Production 78 (1995), pp. 223–223.

Appendix I: Generation of the Learning Database
To test the learning capability of the model, we generated a Learning Database containing 50 individuals, that is, 50 Output Curves. The objective is to obtain a database having the same characteristics as a real field database. To that end, we integrated noise and individual variability into this fictitious database.
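The three generation steps detailed in the remainder of this appendix (Normal laws for the parameters, Uniform laws for the Inputs, additive Gaussian noise) can be sketched as follows. This is an illustrative Python sketch: the variances, the number of injections and the seed are placeholder choices, and the curve simulator itself is omitted; the paper's implementation is in R.

```python
# Sketch of the fictitious-database generation: individual variability via
# Normal laws, fictitious Inputs via Uniform laws, and Gaussian noise.
import random

random.seed(0)  # reproducibility of the fictitious database

def draw_individual_parameters():
    # Normal laws centered on the Table 8 values; variances here are arbitrary.
    return {
        "omega": random.gauss(10, 1.0),
        "r": random.gauss(35, 3.5),
        "f": random.gauss(800, 80.0),
        "u": random.gauss(125, 12.5),
    }

def draw_inputs(n_injections):
    # Injected volume VolQ and injection moment c_t, both Uniform on [0, 1].
    return [(random.uniform(0, 1), random.uniform(0, 1))
            for _ in range(n_injections)]

def add_noise(curve, sigma):
    # Centered Gaussian noise added to every point of a simulated Output Curve.
    return [y + random.gauss(0, sigma) for y in curve]

# 50 individuals, i.e. 50 parameter sets with their fictitious Inputs.
database = [
    {"params": draw_individual_parameters(), "inputs": draw_inputs(3)}
    for _ in range(50)
]
```

Each entry of `database` would then be run through the PDE model to produce one Output Curve before `add_noise` is applied.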
I.1 Integration of individual variability
The model parameters are constants to determine. Nevertheless, to introduce individual variability in the generated data, we considered (only in this Section) the parameters as biological-like factors following a Normal distribution. To simulate individual differences, we assigned to each parameter a Normal distribution centered on an arbitrarily chosen value and with a given relative variance (see Table 8). From these Normal probability laws, we generated values of the parameters ω, r, f and u. Their respective statistical and probabilistic distributions are given in Figure 12.

Figure 12: Distributions of the parameters ω, r, f and u.

I.2 Generation of fictitious Inputs
The Inputs integrated into the model correspond to the injected volume (VolQ) and the moment of the injection (c_t). These parameters can take any value between 0 and 1; therefore, we applied a Uniform distribution over the interval [0; 1] to these two types of Inputs (Table 8). From the values of the parameters and the fictitious Inputs, we generated the Output Curves.

I.3 Addition of a random noise
Continuing with the objective of obtaining an experimental-like database, we added noise to the Output Curves. To do so, we added to the generated curves a random component following a centered Gaussian distribution (Table 8). Figure 13 shows some examples of generated curves without and with noise. We divided the obtained database into two datasets: a Training Database and a Test Database.

In the rest of this Section, we assumed that we have an experimental-like database and a model containing four parameter values to determine.

Table 8: The distributions followed by the parameters and the Inputs.

| Parameter | Probability law |
| ω         | N(10, ·)        |
| r         | N(35, ·)        |
| f         | N(800, ·)       |
| u         | N(125, ·)       |
| VolQ      | U(0, 1)         |
| c_t       | U(0, 1)         |
| Noise     | N(0, ·)         |

Figure 13: Example of simulated curves without and with noise.

Appendix II: Study of the compensation effects existing between the model parameters
Among the parameters ω, r, f and u, some parameters offset each other. The velocity ω can be offset by the delay r that the information undergoes: for example, a low convection speed associated with a short delay may induce kinetics equivalent to those induced by a high convection speed associated with a long delay. The fixation f and the use u of the information are also two counterbalanced processes: for instance, a high fixation rate followed by a low usage of the information can induce the same effect on the Outcome as a low fixation rate followed by an important use of the fixed information. Therefore, relations exist between the parameters of those two couples. The objective of this part is to use the fictitious Training Database to study these relations.
II.1 Study of the relationship between ω and r

First, we demonstrated the relationship existing between ω and r by calculating the error made on the Training Database by the model parametrized with different (ω, r) pairs. To do so, we ranged over the domain ω × r and we calculated the Relative Residual Sum of Squares (RRSS) associated with the models parametrized with the different tested (ω, r) pairs:

RRSS(\omega, r) = \sum_{i=1}^{n} \left( \sum_{j=1}^{m} \left( \frac{y^{ij}_{obs} - y^{ij}_{pred}(\omega, r)}{y^{ij}_{obs}} \right)^2 \right),   (16)

where n corresponds to the number of individuals contained in the Training Database and m to the number of points on the curves. y^{ij}_{obs} and y^{ij}_{pred} correspond respectively to the observed and the predicted value of the j-th point of the i-th individual. Therefore, the RRSS corresponds to the sum of the squared relative differences between the predicted curves and the initially generated curves.

Figures 14 and 15 give the values of the RRSS according to the values of ω and r. The existence of a series of equivalent pairs, that is, a series of pairs leading to the same value of the RRSS, can be seen in Figure 14(a). There is an area where the RRSS values are lower (Figure 15), corresponding to the curve EC in Figure 14(b). We assumed that the optimal (ω_Opt, r_Opt) pair, inducing the lowest RRSS, belongs to this curve. Therefore, we set out to determine the equation of the curve EC.

II.2 Search for the (ω_Opt, r_Opt) pairs inducing the lowest RRSS
Figure 14: The value of the RRSS according to ω and r (a: left), and the schema of the different Equivalent Couples (EC) (b: right).

Figure 15: The 3D representation of the value of the RRSS according to ω and r.

To find the equation of the curve EC, we sought, for different values of ω, the value of r minimizing the RRSS. To do that, for each tested value of ω we used the optimization algorithm DIRECT to find the value of r minimizing the objective function

f_{obj}(r) = \frac{1}{n} \sum_{i=1}^{n} \left( \sum_{j=1}^{m} \left( \frac{y^{ij}_{obs} - y^{ij}_{pred}(\omega, r)}{y^{ij}_{obs}} \right)^2 \right),   (17)

corresponding to the average RRSS. To obtain several fitted values of r for each tested value of ω, we sampled the Training Database: we drew subsets of curves from the training curves and fitted r on each subset, ultimately obtaining three values of r for each tested value of ω (Figure 16). Using a Nadaraya-Watson kernel regression (see [45] and [46]), we obtained a non-parametric relationship linking ω_Opt and r_Opt in the form

r_{opt} = \hat{m}(\omega_{opt}) + \epsilon,   (18)

where \hat{m} corresponds to the Nadaraya-Watson estimator.

Figure 16: The Nadaraya-Watson kernel regression linking the (ω_Opt, r_Opt) parameter pairs.

Knowing the relationship between ω_Opt and r_Opt, it is possible to deduce one of these two parameters from the value of the other. Hence, this relationship reduces the number of parameters that need to be learned simultaneously.
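The Nadaraya-Watson estimator m̂ of Equation (18) admits a very compact implementation. A minimal Python sketch with a Gaussian kernel (the bandwidth h and the function names are illustrative choices, not the paper's):

```python
# Minimal Nadaraya-Watson kernel regression with a Gaussian kernel.
import math

def nadaraya_watson(x_train, y_train, h):
    """Return m_hat with m_hat(x) = sum_i K((x - x_i)/h) y_i / sum_i K((x - x_i)/h)."""
    def kernel(z):
        return math.exp(-0.5 * z * z)  # unnormalized Gaussian kernel

    def m_hat(x):
        weights = [kernel((x - xi) / h) for xi in x_train]
        return sum(w * yi for w, yi in zip(weights, y_train)) / sum(weights)

    return m_hat
```

Fitted on the sampled (ω_Opt, r_Opt) pairs, `m_hat(omega)` would return the deduced value of r, so that only ω needs to be learned explicitly.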
II.3 Study of the relationship between f and u

There is also a compensation effect between f and u: a high value of f can be compensated by a low value of u, and vice versa. As above for ω and r, we sought to determine the relationship existing between f and u, so as to deduce one of these two parameters from the other one and further reduce the number of parameters to learn simultaneously. As above, we ranged over the domain f × u and calculated the RRSS of the models parameterized with different pairs of values of (f, u) (Figures 17 and 18). This study demonstrates a series of equivalent pairs. There is an area where the RRSS values are lower (Figure 18), corresponding to the EC curve in Figure 17(a). We assumed that the optimal (f_Opt, u_Opt) pair, inducing the lowest RRSS, belongs to this curve. Therefore, we set out to determine the equation of this curve.
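The grid scans of II.1 and II.3 can be sketched as follows. In this illustrative Python sketch, `simulate` is a hypothetical stand-in for the parameterized model (it must return one predicted curve per individual), and `rrss` follows Equation (16):

```python
# Grid scan of a parameter pair, keeping for each first coordinate the second
# coordinate minimizing the RRSS, tracing the curve of Equivalent Couples.
def rrss(curves_obs, curves_pred):
    """Relative Residual Sum of Squares over all individuals and curve points."""
    total = 0.0
    for obs, pred in zip(curves_obs, curves_pred):
        total += sum(((yo - yp) / yo) ** 2 for yo, yp in zip(obs, pred))
    return total

def equivalent_couples(curves_obs, simulate, f_grid, u_grid):
    """For each tested f, return the u value minimizing RRSS(f, u)."""
    return [
        (f, min(u_grid, key=lambda u: rrss(curves_obs, simulate(f, u))))
        for f in f_grid
    ]
```

In the paper the inner minimization is performed with DIRECT rather than an exhaustive scan of `u_grid`; the exhaustive version above is only the simplest way to exhibit the compensation ridge.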
II.4 Search for the (f_Opt, u_Opt) pairs that lead to the lowest RRSS