An innovative Statistical Learning Tool based on Partial Differential Equations, intending livestock Data Assimilation
Hélène Flourent, Emmanuel Frénod, Vincent Sincholle
NutriX, France; Université Bretagne Sud, Laboratoire de Mathématiques de Bretagne Atlantique, UMR CNRS, Campus de Tohannic, Vannes, France; See-d, rue Henri Becquerel, Vannes Cedex, France
Abstract
Realistic modeling of biological mechanisms requires a large volume of prior knowledge and leads to heavy mathematical models. On the other hand, the classical Machine Learning algorithms, such as Neural Networks, need a large quantity of data to be fitted. Nevertheless, to predict the evolution of biological variables we are often facing a lack of knowledge and a lack of data, especially in the livestock sector. Therefore, we explored an intermediate approach, called "Data-Model Coupling". We demonstrated that parametrized Partial Differential Equations (PDEs) can be embedded in a data fitting process and then in an efficient predictive Statistical Learning tool. We postulated that all the physico-chemical phenomena occurring in an animal body can be summarized by the circulation, the evolution and the action of an overall information flow. We built the PDE system which mathematically translates our assumption and we fitted it on data.

The applications of our approach to data relative to the growth of farm animals showed that it increases the forecasting accuracy and reduces the training data dependency of the resulting predictive tool. Moreover, learning the dynamics linking the inputs and the outputs confers to the tool the capability to be trained on a given range of data and then to be accurately applied outside this range of data. This extrapolation capability is a real improvement over existing predictive tools.

Keywords: Statistical Learning, PDE, Forecasting, Data Assimilation, Data-Model Coupling, Biological Mathematical Modeling.
Smart Farming corresponds to the use of new technologies to make the farm production processes more efficient. As can be identified in [1], [2], [3], [4], [5] and [6], in the agri-food sector, simulating and predicting the effects of nutrition on animal performances are two decisive and strategic goals for breeders and companies to optimize animal efficiency. However, the biological phenomena linking the nutrition and the performances of animals are complex. Furthermore, in most cases, to build tools able to predict the evolution of biological variables, it is necessary to jointly manage the complexity of the phenomena occurring in the studied biological system and the lack of data available to fit those tools.

(arXiv preprint [stat.OT], Jan. The company wishes to remain anonymous.)

Data Assimilation is an approach that embeds mathematical theories, Data Science and Computer Science processes to estimate the most likely state of a connected system at an instant t (see [7], [8], [9] and [10]). To do so, it combines the information given by a predictive tool and the one contained in a more or less continuous stream of collected data. To very briefly sum up, it consists of considering that data flows are gathered to correct, at a given frequency, the simulation done by the predictive tool. This correction takes into account that collected data contain noise and the predictive tool embeds an intrinsic model error.

This combination of information could permit to know the state of an animal or a group of animals, in terms of health and performances, according to their ingestions and the drugs that are administered to them. Hence, this concept constitutes an interesting and promising way to oversee future livestock and address the Smart Farming issues ([11], [12] and [13]).

Biological data are not easy to collect and generally contain a large variability ([14] and [15]).
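To make the correction step just described concrete, here is a minimal scalar sketch in which a model prediction and a noisy observation are blended, each weighted by its error variance. This is the standard optimal-interpolation (Kalman-type) update, shown purely as an illustration; it is not the specific assimilation scheme of any tool discussed in this paper, and all numeric values are invented.

```python
# Minimal sketch of a Data Assimilation correction step: blend a model
# prediction with a noisy observation, weighting each by its error variance.
# Standard optimal-interpolation (Kalman-type) gain; all values illustrative.

def assimilate(prediction, pred_var, observation, obs_var):
    """Return the corrected state estimate and its variance."""
    gain = pred_var / (pred_var + obs_var)   # trust data more when model error dominates
    state = prediction + gain * (observation - prediction)
    var = (1.0 - gain) * pred_var
    return state, var

# A weight simulated at 52.0 (model error variance 4.0) corrected by a
# noisy measurement of 50.0 (sensor variance 1.0):
state, var = assimilate(52.0, 4.0, 50.0, 1.0)
```

Because the model error variance is larger than the sensor's here, the corrected state (50.4) lies closer to the observation, and the posterior variance (0.8) is smaller than both input variances.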
Hence, to perform Data Assimilation in the livestock sector it is necessary to develop efficient and light predictive tools able to be fitted on few and scattered data relative to complex phenomena.

According to Vázquez-Cruz et al. [16], there are currently two general approaches to build tools predicting biological responses. On the one hand, realistic modeling of biological mechanisms requires a large volume of prior knowledge and generally leads to heavy mathematical models ([17] and [18]). However, the complex implementation of these models limits their adaptability, in particular when it comes to processing or assimilating field data. On the other hand, the structure of classical Machine Learning (ML) algorithms, such as Neural Networks, has limited ability to take into account the existence of complex underlying phenomena, and these algorithms need to be fitted on a large quantity of data to compensate for the absence of prior biological expertise ([19], [20], [21] and [22]). Hence, due to their lack of adaptability or their inability to be fitted to few data, the existing tools are not entirely appropriate for achieving Data Assimilation in the context of "Biological Small Data".

We assumed that a global and synthetic consideration of the biological processes may help gain precision, in comparison with a classical ML tool which integrates no prior knowledge. We also assumed that this synthetic consideration permits us to do it while keeping a flexible and light tool, in comparison with a tool based on realistic models. Therefore, we explored an intermediate approach, named "Data-Model Coupling", to build predictive tools able to deal with both the complexity of the biological responses and the current lack of data. This emerging approach is midway between the realistic modeling and the "Black Box" approach. As seen in [23], [24], [25] and [26], the Data-Model Coupling approach consists of building a mathematical model, corresponding to a mathematical synthesis of the studied system.
Then, the parameters contained in the model are fitted to data. As in the above-cited studies, the construction of our tool is based on an optimal combination of knowledge, to design a relevant mathematical model, and data, to optimize the model parameters. In our approach, the mathematical model is a wisely designed parametrized PDE system.

Data Assimilation is a long-term objective of our research work. Nevertheless, to finally obtain a tool particularly suitable to perform Data Assimilation, this long-term objective has strongly guided the whole modeling approach introduced in this paper.

Several contributions can be identified in this paper. In the first place, the applications of this tool, on collected data relative to the growth of farm animals, put in evidence its extrapolation capability, which is a real improvement over existing predictive tools. As illustrated in Figure 1, in the application presented in Section 4, our tool was trained on a short training period to link the inputs and the outputs. Yet, it can accurately predict the outputs from the inputs outside the range of the Training Data. This is the peculiarity of our approach: the model learns synthetic dynamics linking the inputs and the outputs by fitting parameter-dependent evolution equations. Once the parameters are fitted, those dynamics can be applied outside and even far from the Training Data range.

This extrapolation capability permits to reduce the amount of data to collect and thus to reduce the costs relative to the experiments and the data management. It also permits to extend the validity period of the prediction provided by the tool. Hence, when it comes to Data Assimilation issues, in perspective with what was explained above, the correction of the prediction via the use of data could be less frequent and thus the computational costs could be lower.
Figure 1: Schematization of the extrapolation capability of the developed tool (weight as a function of time, over the training period and the application period).

The second contribution of our exploration is the development of a concept between the reality and the model (Figure 2). In most cases, the objective of a mathematical model is to translate the reality, adopting a higher or lower abstraction level. In our approach, a differentiation between the reality and our model is made. Indeed, the used support of reflection is not directly the real animal, but an Avatar which conceptually and essentially outlines the global dynamics occurring in the animal body. A large number of physico-chemical phenomena occur in the animal body in response to the ingestion or the injection of molecules. They lead, some time later, to the change of biological variables. This supply of molecules and those biological variables can be monitored and recorded to generate Input and Output data. We assumed that this kind of Inputs and Outputs can be linked by a dynamical model which is a mathematical translation of the Avatar. Therefore, we designed the PDE system mathematically translating our assumption and describing the convection, the diffusion and the action of an overall information flow.
[Figure 2 diagram. Reality: components: animal body, blood, molecules, nutrients, enzymes, hormones, organs, muscles, ...; processes: ingestion of nutrients, injection of molecules, rumination, blood circulation, nervous circulation, digestion, fermentation, absorption, assimilation, metabolism, ... Avatar: components: information flows; processes: convection, diffusion, fixation, delay, usage, saturation. Model: mathematical functions, parameter-dependent evolution equations, convection operators, diffusion operators, delay operators, integral operators, transfer functions.]
Figure 2: Schematization of the different sets considered in our approach and their own components and processes.

Thirdly, the application of our approach on real data showed that our tool can accurately link biological-related Inputs and Outputs, even if it is fitted on few, scattered and noisy data.

Data-Model Coupling is so far essentially used in the fields of meteorology (see [27]), hydrology (see [28], [29] and [30]), biogeochemistry (see [31], [32], [33] and [34]) and oceanography (see [35]). The successful use of a Data-Model Coupling approach to treat biological issues can also be considered as a contribution of this paper.

In this paper, we will show that the use of a short and relevant PDE system in a fitting process leads to the construction of an efficient predictive tool having a low data dependency and a high information extraction capability. This tool can be used to predict the evolution of biological variables according to the ingestions and the injections of molecules in the animal body. The objective of the application presented in this paper was to predict the growth of two groups of animals of a specific species, according to their initial weight and their feeding behavior. But the genericity and the parsimony of our tool might ensure its suitability to predict other performance indicators relative to other farm species.

This low data dependency and this high information extraction capability allow the use of few data to fit our tool. Therefore it can be used to reduce the costs relative to experiments, data collection, and data storage. Furthermore, in comparison with the existing predictive tools, these capabilities also make our tool more suitable to efficiently perform Data Assimilation, even if the frequency of data collection and the quality of the collected data are low.

To summarize, in our approach we distinguished different dimensions. As illustrated by Figure 3, there is the
Reality, in which there are Intakes and Injections inducing complex biological processes in the animal body. Some Sensors extract information from this Reality, which is stored in databases made of Inputs and Outputs. Since the model is not directly assimilable to the reality, the inflows and the outflows of the model are also not directly assimilable to the input and the output data. The Inputs have to be translated by a mathematical function into Entries, that are pieces of information integrated into the Mathematical Model and that induce the generation of Outcomes, also linked to the Outputs extracted from the Reality by a mathematical function.

Figure 3: Articulation of the different elements of our exploration.

As can be noticed in Figure 3, our exploration relies on the relationship between several diverse elements such as the
Real Animal, the Avatar and the Mathematical Model. The Algorithm comes out of the discretization of the Mathematical Model, i.e. the PDE system which mathematically translates what takes place in the Avatar. This system of PDEs contains parameters corresponding to biological-like factors that can be learned from a database. The Program corresponds to the code that manages this learning step. It uses an iterative training process during which an optimization algorithm finds the values of the parameters that minimize the difference between the measured and the predicted Outputs. The Tool finally corresponds to the Mathematical Model parameterized with the values of the parameters obtained at the end of the learning step.

The presence, in the mathematical model, of parameters that can be learned from data confers learning ability to the tool based on this model. Using a database, we obtained a tool able to reconstitute dynamics between inputs and outputs to perform forecasts and extrapolations. Hence, the constructed tool can be considered as a Statistical Learning Tool.

In this paper, we will present our modeling approach, the conception of our tool and the results of the applications of our approach on fictitious and real data. After this Introduction, putting this research work in its proper context, we will detail in Section 2 the conception of the mathematical model and its applicability. We tested the well-functioning and the capacities of our tool in two different ways. First, we established a fitting method taking into account the relations existing between the model parameters and we generated a database to test this fitting method on it. This first application on mastered data allowed us to verify the ability of our tool to fit the parameters. Those simulation tests are presented in Section 3. After those tests on fictitious data, we applied our approach to data collected on a farm and relative to the feeding behavior and the growth of two groups of animals. The results are presented in Section 4. This application demonstrated the prediction capability of the tool in real conditions.

We put in evidence the potential of our tool and the improvement conferred by it. To do so, we compared the capabilities of our tool with the ones of some Logistic Models, Mechanistic Models and Machine Learning algorithms. These comparisons will be detailed in Section 5.

In our approach, particular attention was paid to the construction of the
Mathematical Model embedded in the final predictive tool. Indeed, the designing of this model - that is a PDE system - was the key element to achieve our objectives of lightness, accuracy and learning potency.
Through the conceptualization of the Avatar, we set up a parsimonious summary of any biological process. Indeed, we mathematically summarized the global intern dynamics of the animal via several equations and mathematical operators which we assumed necessary and sufficient.

We hypothesized that, when a molecule or a group of molecules enters the body of a living organism, it circulates in the body through a network of vessels containing fluid. It integrates this fluid and uses it as a vector to evolve via convection and diffusion mechanisms. In the network of vessels, the molecules may be in competition with other mechanisms which may delay their progression. The circulating molecules may then be captured and accumulated in an organ or a specific tissue. During this storage, the molecules can be used and induce a change in some biological variables. Then, we built the PDE system which mathematically translates the previously set up summary, illustrated by Figure 4.
Figure 4: Schematization of the Mathematical Model.

We modeled our Avatar using variables, densities, and fields that are all unitless and dimensionless. We also reduced the considered geometrical space to the interval [0; 1]. We considered a Forward Flow Φ_f and a Backward Flow Φ_b streaming in this one-dimensional geometrical space. As described in the introduction, these flows can be seen as a very synthetic summary of blood circulation, a circulation in the nervous system or a circulation in the digestive tract, according to the problematic.

The Inputs involved in our tool can correspond to collected data relative to feed intakes, water intakes, and administered drugs. These Inputs are integrated in the Mathematical Model via a function Q transforming these Inputs into information inflows, called Entries. We modeled that part of the injected information circulates in a forward direction, via Φ_f, and the rest circulates backward, via Φ_b. This information can evolve via convection and diffusion phenomena. We assumed that the circulating information can be delayed, captured, stored and used to ultimately induce a modification in the Outcomes O. These Outcomes correspond to the model outflows. Those Outcomes are transformed by a mathematical function to be comparable with collected outputs.

Therefore, Φ_f(t, x) and Φ_b(t, x) are, at each instant t, two space densities respectively associated with a forward flux with a velocity ω and a backward flux with a velocity −ω. The spatial density Φ_f(t, x) is supposed to be solution to:

\frac{\partial \Phi_f}{\partial t}(t,x) + \omega \frac{\partial \Phi_f}{\partial x}(t,x) - c \frac{\partial}{\partial x}\left[ \chi \frac{\partial (\Phi_f + \Phi_b)}{\partial x} \right](t,x) = \frac{1}{2} Q(t,x) - f F(x)\, \Phi_f(t,x) - r\, \Phi_f(t,x),    (1)

Similarly, Φ_b(t, x) is supposed to be solution to:

\frac{\partial \Phi_b}{\partial t}(t,x) - \omega \frac{\partial \Phi_b}{\partial x}(t,x) - c \frac{\partial}{\partial x}\left[ \chi \frac{\partial (\Phi_f + \Phi_b)}{\partial x} \right](t,x) = \frac{1}{2} Q(t,x) - f F(x)\, \Phi_b(t,x) + r\, \Phi_f(t,x),    (2)

In these equations, the parameter c is the diffusion velocity of the information. The space-time density Q corresponds to an external source of information. The function F is worth 1 in certain areas of the involved geometrical space and 0 in others. The area where this function is worth 1 corresponds to the location of the entity capturing the information. The parameter f determines the rate of fixed information. The parameter r determines the fraction of the circulating information transferred from the Forward Flow to the Backward Flow, which induces a delay in the progression of the information. The function χ is compactly supported in (0, 1), mainly constant and worth 1. This function, integrated into the diffusion term, makes diffusion vanish at the edges of the domain.

At each time t, the spatial density Ψ(t, x), associated with the fixed information, is solution to:

\frac{\partial \Psi}{\partial t}(t,x) = f F(x) \left[ \Phi_b(t,x) + \Phi_f(t,x) \right] - u\, \Psi(t,x).    (3)

The parameter u is the coefficient determining the usage rate of the fixed information. At each time t, the spatial density Ξ(t, x), associated with the used information, is solution to:

\frac{\partial \Xi}{\partial t}(t,x) = u\, \Psi(t,x).    (4)

The parameter Ω corresponds to the area of action of the circulating information on the Outcome. O(t) is the Outcome of the model, given by:

O(t) = \int_\Omega \Xi(t,x)\, dx.    (5)

In this mathematical model we imposed:

\forall t \in (0, +\infty), \quad \Phi_f(t,0) = \Phi_b(t,0) \quad \text{and} \quad \Phi_b(t,1) = \Phi_f(t,1).    (6)

These conditions allow the circulating information to move back and forth between the two edges of the domain. The initial conditions Φ_f(0, x), Φ_b(0, x), Ψ(0, x), Ξ(0, x) and O(0) are given for all x in (0, 1).

The previously presented mathematical model, made of Equations (1), (2), (3), (4) and (5), can be used to simulate and predict an accumulative process. Hence, it can be used to study data relative to a total production over a given period. The fourth equation of the model is the «usage» equation. This equation determines the action of the injected information on the variable to predict. Therefore, this equation has to be adapted to the different ways in which an intake or an injection may affect a biological variable.

To model a logistic growth, we added a limiter in this equation. In this case, the «usage» equation becomes:

\frac{\partial \Xi}{\partial t}(t,x) = u\, \Psi(t,x) \left( \frac{L - O(t)}{L} \right)    (4b)

With this version of the equation, data related to the change in weight of an animal can be tackled. This equation is essentially the differential equation of Verhulst [36]:

\frac{\partial y}{\partial t}(t) = r\, y(t) \left( \frac{K - y(t)}{K} \right)    (7)

whose structure is equivalent. Indeed, in the case when nothing depends on x, the value of u is very high and Ω is the whole interval [0; 1], then Ξ, Ψ and O are very similar. Hence, Equations (7) and (4b) are essentially the same.

It may also be necessary to model variations, to use our tool to treat, for example, data concerning drug effects. To do so, we have to be able to model an increase or a decrease in the Outcomes. Since it is the case of most biological variables, we assumed that these Outcomes may vary between an upper and a lower bound.
Hence, we built two other equations. The equation

\frac{\partial \Xi}{\partial t}(t,x) = -\left( \Xi(t,x) - \mathrm{Upp} \right) - u\, \Psi(t,x) \left( \Xi(t,x) - \mathrm{Low} \right)    (4c)

models that the fixed information Ψ orients the Outcomes O toward a state that is lower than the steady state, and the equation

\frac{\partial \Xi}{\partial t}(t,x) = -u\, \Psi(t,x) \left( \Xi(t,x) - \mathrm{Upp} \right) - \left( \Xi(t,x) - \mathrm{Low} \right)    (4d)

models that the fixed information Ψ orients the Outcomes O toward a state which is greater than the steady state. In these two cases, the Outcomes vary between a lower bound Low and an upper bound Upp.

The «usage» equation has to be defined a priori according to the variable to predict and the used inputs. We expect that the mathematical models made of Equations (1), (2), (3), (5) and Equations (4), (4b), (4c) or (4d) are sufficiently generic to be fitted on data relative to all the different farm species.
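To give a concrete feel for the transport part of Equations (1) and (2), here is a minimal numerical sketch: the forward density Φ_f alone, with the source, fixation and transfer terms dropped, semi-discretized with finite differences in space and advanced with the classical fourth-order Runge-Kutta scheme in time (the discretization choices described in the next subsection). The first-order upwind stencil for the convection term is our own illustrative choice, as are the grid sizes and the values of ω and c, picked to respect the CFL constraints.

```python
import numpy as np

# Pure convection-diffusion of the forward density Phi_f on [0, 1]:
# first-order upwind differences for omega * dPhi/dx, centered differences
# for the diffusion term (kept zero at the edges, mimicking the role of chi),
# and classical RK4 in time. All numeric values are illustrative assumptions.
nx = 50
dx, dt = 1.0 / nx, 0.001
omega, c = 0.5, 0.01
x = (np.arange(nx) + 0.5) * dx              # cell centers

def rhs(phi):
    conv = np.zeros_like(phi)
    conv[1:] = -omega * (phi[1:] - phi[:-1]) / dx   # upwind convection
    lap = np.zeros_like(phi)
    lap[1:-1] = (phi[2:] - 2 * phi[1:-1] + phi[:-2]) / dx**2
    return conv + c * lap                           # diffusion vanishes at edges

def rk4_step(phi):
    k1 = rhs(phi)
    k2 = rhs(phi + 0.5 * dt * k1)
    k3 = rhs(phi + 0.5 * dt * k2)
    k4 = rhs(phi + dt * k3)
    return phi + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

phi = np.exp(-200.0 * (x - 0.2) ** 2)        # initial information pulse
for _ in range(400):                         # advance to t = 0.4
    phi = rk4_step(phi)
# the pulse is transported toward x ~ 0.2 + omega * 0.4 = 0.4, spreading out
```

With ω dt/dx = 0.025 and c dt/dx² = 0.025, the sketch stays well inside the explicit stability limits; the first-order upwind scheme adds some numerical diffusion, which is acceptable for an illustration.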
For the discretization of the Mathematical Model, we first used the classical Finite Difference method, with a given space step, to obtain semi-discrete in space equations. And, because the Mathematical Model is coded using the R software, we used the R-function ode.1D, developed by Soetaert et al. [37], to manage the temporal discretization of the semi-discrete equations. This R-function calls upon the fourth-order Runge-Kutta method with a given time step (see [38]). In this first exploration, to find a compromise between precision and calculation time, we parameterized the mesh with a time step of . and a space step of . .

The system of PDEs contains several parameters: ω, c, r, f, u and L. The diffusion parameter c is the least influential model parameter. Hence, we set it to . . All the other parameters are fitted from a database by using an optimization algorithm. To do so, we used the function directL, developed by Johnson [39], which is embedded in R ([40]) and applies the DIRECT algorithm developed by Finkel [41]. This algorithm searches for the optimal values of the parameters, that is the values that minimize the error associated with the model on a given training database.

A detailed mathematical analysis of the model and its discretization will be performed in an upcoming paper. Nevertheless, we already know that the convection and diffusion speeds must satisfy the CFL conditions (see [42] and [43]). Indeed, since we set the discretization steps (Section 2.3), ω must be smaller than . and c must be smaller than . .

To fit the model we have to specify lower and upper values for each parameter, between which the optimization algorithm will search for their optimal values. Therefore, a comprehensive study of the ranges of values of the different model parameters was performed and presented in the working paper [44]. We refer to it for the details of this study.

The objective of this section is to present the tests by simulation performed to verify the ability of the tool to learn parameters from noisy biological data. To do so, we started by generating a fictitious database from the parameterized mathematical model made of Equations (1), (2), (3), (4) and (5).
Then, we used this database to study the compensation effects existing between the parameters, to simulate the fitting of the parameters and then to verify that the model fits the data correctly.
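A fictitious noisy database of this kind can be sketched as follows. The paper's actual generation procedure is in its Appendix I and is not reproduced here; this sketch only illustrates the idea of combining a smooth reference curve, individual variability on its parameters, and additive measurement noise, with all distributions and values being our own illustrative assumptions.

```python
import numpy as np

# Illustrative construction of a fictitious database of 50 noisy Output
# curves: a logistic-like reference curve, individual variability on the
# asymptote and growth rate, and additive Gaussian measurement noise.
rng = np.random.default_rng(42)
t = np.linspace(0.0, 1.0, 30)

def one_individual():
    K = 100.0 * (1.0 + 0.05 * rng.normal())       # individual asymptotic weight
    r = 3.0 * (1.0 + 0.05 * rng.normal())         # individual growth rate
    clean = K / (1.0 + 19.0 * np.exp(-r * 5.0 * t))
    return clean + rng.normal(0.0, 1.0, t.size)   # measurement noise

curves = np.array([one_individual() for _ in range(50)])
training, test = curves[:35], curves[35:]         # train/test split
```

The split into a Training part and a Test part mirrors the way the generated database is used in the rest of this section.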
We generated a fictitious database containing 50 individuals, that is 50 Output Curves. The objective was to obtain a database having the same characteristics as a real field database. To do so, we integrated noise and individual variations in it. The construction of this fictitious database is presented in Appendix I. We refer to it for the details.

Figure 13 shows an example of the generated curves without and with noise. We divided the obtained database into two parts: a Training Database and a Test Database.

In the rest of this section, we supposed that we have an experimental-like database and a model containing four parameter values to determine: ω, r, f and u.

Figure 5: Example of simulated curves without and with noise.

A study of the compensation effects existing between ω and r and between f and u put in evidence the existence of relations between these two pairs of parameters. For example, the relation existing between the parameters f and u can be noticed in Figure 6. This study is presented in Appendix II. We refer to it for the details.

We concluded from this study that, using Nadaraya-Watson kernel regressions (see [45] and [46]), we obtained a non-parametric relationship linking ω and r in the form

r = \hat{m}_\omega(\omega) + \epsilon_\omega,    (8)

and another one linking f and u in the form

u = \hat{m}_f(f) + \epsilon_f,    (9)

where \hat{m}_\omega and \hat{m}_f correspond to the Nadaraya-Watson estimators and \epsilon_\omega and \epsilon_f are the residuals.

Figure 6: Representation of the value of an error indicator (the Relative Residual Sum of Squares) according to the values of f and u.

Knowing the relationship existing between ω and r and the one existing between f and u, it is possible to fit ω and f and then deduce the values of r and u. Hence, these relations permit to reduce the number of parameters to learn simultaneously and so facilitate and reinforce the fitting process.
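The Nadaraya-Watson estimator used for Equations (8) and (9) can be sketched as follows. The Gaussian kernel, the bandwidth h and the synthetic (f, u) sample below are illustrative assumptions, not the paper's data.

```python
import numpy as np

# Minimal Nadaraya-Watson kernel regression: a kernel-weighted local average
# of the responses, here with a Gaussian kernel of bandwidth h.

def nadaraya_watson(x_train, y_train, x_query, h=0.1):
    """Estimate E[y | x] at each query point as a weighted mean of y_train."""
    x_query = np.atleast_1d(np.asarray(x_query, dtype=float))
    w = np.exp(-0.5 * ((x_query[:, None] - x_train[None, :]) / h) ** 2)
    return (w @ y_train) / w.sum(axis=1)

# Synthetic (f, u) pairs with a hidden relation u = 2 f^2 plus noise:
rng = np.random.default_rng(0)
f_vals = rng.uniform(0.0, 1.0, 200)
u_vals = 2.0 * f_vals**2 + rng.normal(0.0, 0.05, 200)
u_hat = nadaraya_watson(f_vals, u_vals, 0.5)   # estimate u at f = 0.5
```

Once such an estimator \hat{m}_f is available, fitting f and reading off u = \hat{m}_f(f) reduces the number of parameters to optimize simultaneously, as described above.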
We fitted the parameters to the Training Database and then we tested the accuracy of the obtained model by calculating the error made on the Test Database.

Fitting of ω and f and accuracy of the obtained predictive tool. To perform several fittings we sampled the Training Database: we sampled curves from the Training Database and we fitted the parameters to the sampled curves. By proceeding in this manner, we performed several fittings of the parameters. To determine the values of ω, r, f and u, we fitted ω and f on the selected curves of the Training Database and we deduced the values of r and u using Equalities (8) and (9). To optimize the parameters, we used the R-function directL to find the values of ω and f minimizing the function

f_{obj}(\omega, f) = \frac{1}{n} \sum_{i=1}^{n} \left( \sum_{j=1}^{m} \left( \frac{y_{ij}^{obs} - y_{ij}^{pred}(\omega, f)}{y_{ij}^{obs}} \right)^2 \right).    (10)

After the fittings of the parameters, we obtained values of ω, r, f and u. We calculated the average value and the Relative Standard Deviation (RSD) of each parameter (Table 1). We also looked at the fit of the model, calculating from the Training Database the value of the Determination Coefficient R² (Figure 7 and Table 1). The results show that the model fits the curves of the Training Database well (R² = 0.97).

Figure 7: Examples of results given by the predictive tool in comparison with some training curves.

Table 1: Average and Relative Standard Deviation of the parameters and the Determination Coefficient calculated on the Training Database.

To validate the ability of the tool to learn parameters from noisy data, we calculated the accuracy of the model on the Test Database. To do so, we calculated the Relative Residual Sum of Squares (RRSS) and the Determination Coefficient associated with each curve contained in the Test Database and we obtained the distributions shown in Figure 8. The RRSS is low and the Determination Coefficient is high, indicating that the model fits the curves of the Test Database well.

We compared the R² and the RRSS associated with the Generator Model (R²_Gener and RRSS_Gener), i.e. the model used to generate the fictitious database, and the R² and the RRSS associated with the Fitted Model (R²_Fit and RRSS_Fit) (Figure 8 and Table 2). RRSS_Fit is low and this value is very similar to the value of RRSS_Gener. R²_Fit is high and this value is also very similar to the value of R²_Gener. These indicators thus demonstrate that the model fitting method is highly satisfactory and that the error associated with the adjusted model is limited to the amount of noise and individual differences initially integrated into the generated database.

Table 2: Comparison between the indicators associated with the Generator Model and the Fitted Model.

Figure 8: Distributions of the RRSS and of the R² coefficient associated with the Generator Model and the Fitted Model.
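The fitting step above can be sketched as follows. A coarse grid search over a bounded box stands in here for the DIRECT algorithm, and a toy two-parameter curve stands in for the PDE model; the objective mirrors the relative squared error of Equation (10). Everything in this sketch (the curve, the bounds, the grid resolution) is an illustrative assumption.

```python
import numpy as np

# Toy two-parameter "model" y(t) = a * (1 - exp(-b t)) fitted by minimizing
# a relative squared error over a bounded parameter box, in the spirit of
# Equation (10). A coarse grid search substitutes for DIRECT.
t = np.linspace(0.1, 5.0, 20)
a_true, b_true = 3.0, 0.8
y_obs = a_true * (1.0 - np.exp(-b_true * t))     # noiseless toy "data"

def f_obj(a, b):
    y_pred = a * (1.0 - np.exp(-b * t))
    return np.sum(((y_obs - y_pred) / y_obs) ** 2)

# search within prescribed lower/upper parameter bounds
a_grid = np.linspace(1.0, 5.0, 81)
b_grid = np.linspace(0.1, 2.0, 81)
errs = [[f_obj(a, b) for b in b_grid] for a in a_grid]
ia, ib = np.unravel_index(np.argmin(errs), (81, 81))
a_fit, b_fit = a_grid[ia], b_grid[ib]
# a_fit, b_fit recover a_true, b_true up to the grid resolution
```

Like DIRECT, this search only needs objective evaluations and box bounds, which is why prescribing sensible parameter ranges (as studied in [44]) matters for the real fitting procedure.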
In this section, we present an application of our approach to field data. The database we used is confidential; therefore only the dimensionless Inputs and Outputs are presented.
The objective of this application is to build a tool that can predict the logistic growthof animals according to their initial weights and their intakes all along a given period.
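The logistic growth targeted in this application follows Verhulst-type dynamics, recalled as Equation (7) in Section 2.2. A minimal forward-Euler sketch, with illustrative values of r, K and the initial weight (none of them fitted values from the paper):

```python
# Forward-Euler integration of the Verhulst equation dy/dt = r*y*(K - y)/K,
# the logistic behavior underlying the modified "usage" Equation (4b).
# r, K, y0 and the time grid are illustrative, not fitted, values.
r, K, y0 = 1.5, 100.0, 5.0
dt, T = 0.01, 10.0
y = y0
for _ in range(int(T / dt)):
    y += dt * r * y * (K - y) / K
# y approaches the carrying capacity K without overshooting it
```

In the application below, the parameter L of Equation (4b) plays the role of the carrying capacity K: the maximum weight attainable by the studied species.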
To mimic a logistic behavior, we used Equation (4b) as the «usage» equation (Section 2.2). In this equation, L corresponds to the maximum weight attainable by the animals of the studied species. Experts have an idea of the value of L. Therefore, during the model fitting, the value of this parameter was searched in a restricted range of values. Therefore, in this application we used the mathematical model made of Equations (1), (2), (3), (4b) and (5). This model, considered as a growth model, contained five parameters to fit: ω, r, f, u and L.

The database we used is made of two parts, corresponding to two different groups of animals monitored during two different periods (Table 3). The first group was monitored over a unit-period from t = 0 until t = 1. For this group, the weight of the animals was measured at t = 0 and at t = 1. The second group was monitored from t = 0 until t = 2. . For this group, the weight of the animals was measured at t = 0, t = 0. , t = 1. and at t = 2. . For both groups, intakes of each individual were recorded over each time-step of 0.16 time-unit. Therefore, for each individual, information relative to those intakes is periodically injected in the model with a time-step of 0.16.

The dataset relative to the first group constitutes our Training Database and the dataset relative to the second group constitutes our Test Database.

Table 3: Description of the data used. Time step of the
Entries injections: ∆t_In = 0.16.

As in Section 3.2 and Appendix II, we built relationships between some parameters of the model by applying the same methodology. Using Nadaraya-Watson kernel regressions, we obtained a non-parametric relation linking ω and r and another one linking f and u. Knowing the relationships between these parameters, it is possible to fit ω and f and then deduce the values of r and u. Therefore, in this application we only fitted ω, f and L and deduced the values of r and u.

The parameters were fitted to the Training Database by minimizing the difference between the simulated and the real Outputs at time t = 1. Hence, to fit the parameters, we used the algorithm DIRECT to minimize the function

f_{obj}(\omega, f, L) = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{O_i^{obs}(1) - O_i^{pred}(1)}{O_i^{obs}(1)} \right)^2,    (11)

where n is the number of individuals contained in the Training Database and O_i^{obs}(1) and O_i^{pred}(1) correspond respectively to the values of the observed and the predicted Outputs for the i-th individual at t = 1.

To perform several fitting procedures, we sampled the Training Database: we randomly selected individuals before each fitting procedure and we fitted the parameters on the data associated with the selected individuals. We thus performed several fittings and obtained sets of values of (ω, r, f, u, L). We calculated the average and the
RSD of each parameter (Table 4). The RSD of each parameter is low, indicating that our fitting method identified a single set of parameter values minimizing the error associated with the Fitted Model. The existence of a single optimal set of values of (ω, r, f, u, L) attests to the identifiability of the model. We parameterized the model with the average values of the parameters.

We then calculated the error associated with the model on the Training Database. To do so, we calculated the Average Relative Error (ARE) between the measured and the predicted values of the Output at time t = 1:

ARE(t) = \frac{1}{n} \sum_{i=1}^{n} \sqrt{ \left( \frac{O^i_{obs}(t) - O^i_{pred}(t)}{O^i_{obs}(t)} \right)^2 }.   (12)

Table 4: Average values and RSD of the fitted parameters; ARE calculated at time t = 1 on the Training Database.

| Parameter  | Mean | RSD |
| ω          | .24  | 0.  |
| r          | .91  | 0.  |
| f          | .01  | 0.  |
| u          | .49  | 0.  |
| L          | .70  | 0.  |
| ARE(1) (%) | 1.83 | 0.  |

The
ARE value calculated on the Training Database at time t = 1 is 1.83% (Table 4). This result is satisfactory, but the accuracy of the model must be calculated on a Test Database to ensure that the model does not overfit the training data. To do so, we calculated the ARE given by Equation (12) on the Test Database at times t = 0., t = 1. and t = 2. (Table 5 and Figure 9).

Figure 9: Difference obtained between the measured (+) and predicted (×) values of the Output variable at different times t for the individuals in the Test Database.

Table 5: Average Relative Error (ARE) calculated on the Test Database at different instants.

6 Discussion of the results

The error associated with the model is low on the Test Database, and the errors made before and after time t = 1 remain low. These results indicate that our tool can be trained on a very small database to link the inputs and the outputs, and can then accurately predict the weight of the animals over a period more than twice as long as the training one. This extrapolation capability, obtained despite the very low quantity of training data, illustrates that our tool holds a high potential for information extraction. As demonstrated below, this capability distinguishes our approach from other inference methods.

Moreover, in addition to this information-extraction potential, the extrapolation capability helps to reduce the training-data dependency of our tool. Indeed, we demonstrated that our tool can be applied outside the training data range and provide accurate extrapolations. Hence, we do not need to fit it on data covering the whole curve to predict, and so we can use a smaller Training Database. This extrapolation capability therefore reduces the duration of data collection and of in situ experiments, and thus the computational and experimental costs.

According to [16] and [47], the current methods used to simulate and predict logistic growth processes involve two main types of models: Phenomenological Models, corresponding to «Black Box» models, and Mechanistic Models, corresponding to «White Box» models. In this section, we compare some models belonging to these two main categories with the Biomimetic Statistical Learning tool presented in this paper.
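The accuracy criteria used throughout this comparison, the ARE of Equation (12) and the ARA of Equation (15), are straightforward to compute. Below is an illustrative Python sketch (the paper's own implementation is in R, and these function names are ours, not the authors'):

```python
# Illustrative implementation of Equations (12) and (15).
import math

def average_relative_error(obs, pred):
    """ARE(t): mean over individuals of the absolute relative error (Eq. (12))."""
    assert len(obs) == len(pred) and len(obs) > 0
    return sum(math.sqrt(((o - p) / o) ** 2) for o, p in zip(obs, pred)) / len(obs)

def average_relative_accuracy(obs, pred):
    """ARA(t) = 1 - ARE(t) (Eq. (15))."""
    return 1.0 - average_relative_error(obs, pred)
```

For instance, observed weights [100, 200] against predictions [90, 220] give ARE = (10/100 + 20/200)/2 = 0.1, hence ARA = 0.9.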
As defined in [16], the Phenomenological Models include Linear, Multiple Linear and Nonlinear Regressions, Logistic Models and Neural Networks.
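Logistic Models of this family, such as the Gompertz and Verhulst equations recalled below, are one-dimensional ordinary differential equations. A minimal Python sketch of integrating them with a classical fixed-step fourth-order Runge-Kutta scheme (the parameter values used here are arbitrary illustrations, not the fitted ones):

```python
# Fixed-step RK4 integration of the Gompertz and Verhulst growth models.
import math

def rk4(f, y0, t0, t1, n_steps):
    """Classical 4th-order Runge-Kutta for dy/dt = f(t, y)."""
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

def gompertz(a_G, K_G):
    # dN/dt = a_G * N * ln(K_G / N)
    return lambda t, N: a_G * N * math.log(K_G / N)

def verhulst(a_V, K_V):
    # dN/dt = a_V * N * (1 - N / K_V)
    return lambda t, N: a_V * N * (1 - N / K_V)
```

Both models saturate at their carrying capacity K, which is why they produce sigmoid growth curves; the Verhulst solution can be checked against its closed form N(t) = K / (1 + ((K − N0)/N0) e^{−at}).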
The models of Gompertz [48],

\frac{dN(t)}{dt} = a_G \, N(t) \, \ln\left( \frac{K_G}{N(t)} \right),   (13)

and Verhulst [36],

\frac{dN(t)}{dt} = a_V \, N(t) \left( 1 - \frac{N(t)}{K_V} \right),   (14)

are two models frequently used to model growth processes (e.g. see [49], [50], [51], [52], [53] and [54]). We fitted the parameters of these two models on our Training Database by using the same optimization algorithm that we used to fit the Biomimetic Model and by minimizing ARE(1).

To test and compare the accuracy of the different models, we calculated on the Test Database the Average Relative Accuracy, ARA, at different times t:

ARA(t) = 1 - ARE(t),   (15)

where ARE is given by Equation (12). We used the three parameterized models to generate the growth curves of the individuals of the Test Database, and we compared the measured and the predicted values at times t = 0., t = 1. and t = 2.. The results contained in Table 6 and the curves of Figures 10 and 11 show that the curves generated from the Gompertz model featured a premature deceleration, whereas the Verhulst model is associated with a good accuracy over the whole studied period.

The similarity of the results from the Verhulst and the Biomimetic Growth Models was expected, because our model includes an equation assimilable to the Verhulst equation (see Section 2.2). The real advantage of our biomimetic growth model is its ability to integrate input data. Indeed, the Verhulst equation only takes into account the initial conditions of the system under study, whereas our model also integrates intakes throughout the studied period. This capability to integrate additional information helps refine the results and increases the accuracy of our model. Moreover, since the Verhulst and Gompertz models cannot integrate input data over time, they are not able to perform Data Assimilation, contrary to our tool.

Table 6: Parameter values and ARA(1) calculated on the Training Database.

| Model      | a        | K           | ARA(1) |
| Gompertz   | a_G = 0. | K_G = 0.563 | 0.     |
| Verhulst   | a_V = 0. | K_V = 1.563 | 0.     |
| Biomimetic |          |             | .      |

Figure 10: The
ARA calculated on the Test Database at different times t and associated with the different models.

Figure 11: Plot of the predicted growth curves of two individuals contained in the Test Database with the different models.

Over the past decade, the use of Machine Learning (ML) algorithms, and especially of Neural Networks (NN), has been on the rise [55]. According to some studies ([56], [57], [58] and [59]), the popularity of these tools can be explained by the ease of their implementation and the diversity of issues that these algorithms can handle. Nevertheless, these algorithms are based on relatively simple mathematical models that cannot easily take into account complex phenomena, such as delay and saturation. Hence, we applied different Neural Networks to our Training Database to compare this kind of ML tool with our Biomimetic Growth Model. We tested six Neural Networks having different numbers of nodes and hidden layers, taking as inputs the initial weight of each individual and their periodically recorded intakes (Table 7).

Table 7: The ARA calculated on the Training Database (ARA_Train) and on the Test Database (ARA_Test), at t = 1, with different Neural Networks. The Neural Network (k_1, ..., k_i, ..., k_n) corresponds to a Neural Network containing n hidden layers, where the i-th hidden layer contains k_i nodes.

| Structure   | ARA_Train(1) (%) | ARA_Test(1) (%) |
| (4)         | 99.9             | 78.8            |
| (4,3)       | 99.8             | 90.5            |
| (6,5)       | 99.7             | 93.4            |
| (4,6,6,3)   | 99.9             | 94.8            |
| (5,7,7,7,4) | 99.8             | 95.3            |
| (5,9,9,9,5) | 99.9             | 93              |

We fitted each tested Neural Network on our Training Database by using the R function neuralnet developed by Fritsch et al. [60], and we calculated the accuracy of those Neural Networks on the Training and on the Test Database. The results given in Table 7 show that all the tested Neural Networks fit the curves of the Training Database better than those of the Test Database. This shows that the tested Neural Networks overfit the training curves, particularly when their structure is composed of too many or too few nodes and hidden layers. The accuracy of the Neural Networks on the Test Database increases up to a certain number of nodes and hidden layers, and then decreases when the complexity of the structure continues to increase. On the Test Database, the highest accuracy value is reached using the Neural Network containing five hidden layers, but this value is lower than that obtained using our Biomimetic Model (Table 7 and Figure 10).

Nevertheless, the accuracy of these ML tools is satisfactory, and the real advantage of our Biomimetic tool over Neural Networks is its extrapolation capability. Indeed, as for the Biomimetic Model, the studied Neural Networks were fitted only from the value of the Output at t = 1. In this case, the fitted classical Neural Networks can only be used to predict the Output at t = 1. Hence, Neural Networks cannot interpolate or extrapolate, in contrast to our Biomimetic Model. Therefore, contrary to our tool, Neural Networks cannot be used in a "Biological Small Data" context to reduce the experimental and computational costs, and they are less suitable for performing Data Assimilation in this context.

Mechanistic Growth Models are another kind of tool that gathers biological inputs to predict the growth of plants or animals. Some models of this type have been developed in [61], [62], [63] and [64].
These models integrate numerous Inputs, not all of which are available in our study; hence, they cannot be applied to our database. Therefore, we only compared the structure, the functioning and the objectives of those Mechanistic Models with our Biomimetic one. As noted in [16], [65], [17] and [66], the construction of Mechanistic Growth Models generally focuses on the biological meaning of the overall model. Therefore, the construction of explanatory mechanistic models takes time, requires a large quantity of zootechnical knowledge and results in complex models. As explained in [67], [17] and [68], these models contain a large number of unknown parameters and include many factors, forcing the user to enter a large number of Input values, which are sometimes difficult or costly to obtain. Hence, their complex structure makes Mechanistic Realistic Models inappropriate for data fitting and Data Assimilation.
To conclude, we built a Biomimetic Statistical Learning tool based on a PDE system embedding the mathematical expression of biological determinants. This PDE system contains parameters that can be fitted to data, and it was carefully designed to have a high learning potency and a great accuracy while remaining light and flexible.

In the particular «Biological Small Data» context, the performed applications showed that this tool has a higher accuracy than the existing tools. Moreover, our Biomimetic Statistical Learning tool distinguishes itself by its suitability for performing Data Assimilation even when the frequency of data collection and the quality of the collected data are low. The extrapolation capability of our tool, coupled with its high learning potency, permits fitting it on few data and applying it accurately outside the range of the training data. Hence, the quantity of collected data can be reduced, as can the costs relative to experiments and data management.

To sum up, our tool can be used to predict health and performance indicators according to the ingestion or the injection of molecules in an animal body, and to perform accurate and inexpensive Livestock Data Assimilation. The pursuit of an optimal combination between the use of data and the use of prior knowledge via PDEs seems to be a promising way to build Artificial Intelligence (AI) tools with a strong learning ability and a weak training-data dependency.

Nevertheless, the results coming from the Biomimetic Model were obtained under a certain number of hypotheses. Model Selection methods could be applied to select the structure of the Mathematical Model, permitting a more satisfying trade-off between the ARE and the number of parameters to learn. This suggested improvement will be studied in a forthcoming work.
Acknowledgements
The authors are very grateful to D. Causeur, G. Durrieu and E. Fokoué for fruitful discussions related to this article.
References

[1] M. McPhee. "Mathematical modelling in agricultural systems: A case study of modelling fat deposition in beef cattle for research and industry". 2009.
[2] L. Puillet, O. Martin, D. Sauvant, and M. Tichit. "Introducing efficiency into the analysis of individual lifetime performance variability: a key to assess herd management". In: Animal. url: https://hal.archives-ouvertes.fr/hal-01137029.
[3] O. Martin and D. Sauvant. "A teleonomic model describing performance (body, milk and intake) during growth and over repeated reproductive cycles throughout the lifespan of dairy cattle. 2. Voluntary intake and energy partitioning". In: Animal.
[4] J. D. Nkrumah, J. Basarab, Z. Wang, C. Li, M. Price, E. Okine, D. H. Crews, and S. S. Moore. "Genetic and phenotypic relationships of feed intake and measures of efficiency with growth and carcass merit of beef cattle". In: Journal of Animal Science 85 (2007), pp. 2711–2720.
[5] H. Nesetrilova. "Multiphasic growth models for cattle". In: Czech Journal of Animal Science 50 (2005), pp. 347–354.
[6] J. Basarab, M. Price, J. L. Aalhus, E. Okine, W. M. Snelling, and K. L. Lyle. "Residual feed intake and body composition in young growing cattle". In: Canadian Journal of Animal Science 83 (2003), pp. 189–204.
[7] D. Auroux and J. Blum. "Back and forth nudging algorithm for data assimilation problems". In: Comptes Rendus Mathematique.
[8] In: Journal of Marine Systems.
[9] In: Monthly Weather Review.
[10] In: Remote Sensing.
[11] In: Agricultural Sciences in China. issn: 1671-2927. doi: 10.1016/S1671-2927(11)60156-9.
[12] B. Rijk. "Integration of sensor data in crop models for precision agriculture". 2013.
[13] S. Janssen, C. H. Porter, A. D. Moore, I. N. Athanasiadis, I. Foster, J. W. Jones, and J. M. Antle. "Towards a new generation of agricultural system models, data, and knowledge products: building an open web-based approach to agricultural data, system modeling and decision support. AgMIP". In: Towards a New Generation of Agricultural System Models, Data, and Knowledge Products 91 (2015).
[14] J. C. W. Locke, A. J. Millar, and M. S. Turner. "Modelling genetic networks with noisy and varied experimental data: the circadian clock in Arabidopsis thaliana". In: Journal of Theoretical Biology 234.3 (2005), pp. 383–393.
[15] Y. Qi, Z. Bar-Joseph, and J. Klein-Seetharaman. "Evaluation of different biological data and computational classification methods for use in protein interaction prediction". In: Proteins: Structure, Function, and Bioinformatics. doi: 10.1002/prot.20865.
[16] M. A. Vázquez-Cruz, A. Espinosa-Calderón, A. R. Jiménez-Sánchez, and R. Guzmán-Cruz. "Mathematical Modeling of Biosystems". In: Biosystems Engineering: Biofactories for Food Production in the Century XXI. Cham: Springer International Publishing, 2014, pp. 51–76. doi: 10.1007/978-3-319-03880-3_2.
[17] D. Bastianelli and D. Sauvant. "Modelling the mechanisms of pig growth". In: Livestock Production Science (1997).
[18] O. Martin and D. Sauvant. "A teleonomic model describing performance (body, milk and intake) during growth and over repeated reproductive cycles throughout the lifespan of dairy cattle. 1. Trajectories of life function priorities and genetic scaling". In: Animal (2010).
[19] A. C. Tan and D. Gilbert. "An empirical comparison of supervised machine learning techniques in bioinformatics". In: Proceedings of the First Asia-Pacific Bioinformatics Conference, Volume 19. Australian Computer Society, 2003, pp. 219–222.
[20] J. Shavlik, L. Hunter, and D. Searls. "Introduction". In: Machine Learning. issn: 1573-0565. doi: 10.1007/BF00993376.
[21] T. Hubbard and A. Reinhardt. "Using neural networks for prediction of the subcellular location of proteins". In: Nucleic Acids Research. issn: 0305-1048. doi: 10.1093/nar/26.9.2230.
[22] S. H. Dumpala, R. Chakraborty, and S. K. Kopparapu. "k-FFNN: A priori knowledge infused Feed-forward Neural Networks". 2017. arXiv preprint.
[23] E. Frénod. "A PDE-like Toy-Model of Territory Working". In: Understanding Interactions in Complex Systems - Toward a Science of Interaction. Cambridge Scholars Publishing, 2017, pp. 37–47. url: https://hal.archives-ouvertes.fr/hal-00817522.
[24] A. Rousseau and M. Nodet. "Modélisation mathématique et assimilation de données pour les sciences de l'environnement". In: Bulletin de l'APMED 505 (2013), pp. 467–472.
[25] W. J. Sacks, D. S. Schimel, and R. K. Monson. "Coupling between carbon cycling and climate in a high-elevation, subalpine forest: a model-data fusion analysis". In: Oecologia. issn: 0029-8549. url: http://link.springer.com/10.1007/s00442-006-0565-2.
[26] L. Wang, H. Zhang, K. C. L. Wong, H. Liu, and P. Shi. "Physiological-model-constrained noninvasive reconstruction of volumetric myocardial transmembrane potentials". In: IEEE Transactions on Biomedical Engineering.
[27] In: Quarterly Journal of the Royal Meteorological Society. url: https://rmets.onlinelibrary.wiley.com/doi/abs/10.1256/003590002321042135.
[28] G. Kim and A. P. Barros. "Space–time characterization of soil moisture from passive microwave remotely sensed imagery and ancillary data". In: Remote Sensing of Environment. issn: 0034-4257. doi: 10.1016/S0034-4257(02)00014-7.
[29] W. L. Crosson, C. A. Laymon, R. Inguva, and M. P. Schamschula. "Assimilating remote sensing data in a surface flux–soil moisture model". In: Hydrological Processes 16 (2002), pp. 1645–1662.
[30] D. S. Mackay, S. Samanta, R. R. Nemani, and L. E. Band. "Multi-objective parameter estimation for simulating canopy transpiration in forested watersheds". In: Journal of Hydrology. issn: 0022-1694. doi: 10.1016/S0022-1694(03)00130-6.
[31] D. J. Barrett. "Steady state turnover time of carbon in the Australian terrestrial biosphere". In: Global Biogeochemical Cycles. url: https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2002GB001860.
[32] D. Barrett, M. Hill, L. Hutley, J. Beringer, J. H. Xu, G. Cook, J. Carter, and R. J. Williams. "Prospects for improving savanna biophysical models by using multiple-constraints model-data assimilation methods". In: Australian Journal of Botany.
[33] P. J. Rayner, M. Scholze, W. Knorr, T. Kaminski, R. Giering, and H. Widmann. "Two decades of terrestrial carbon fluxes from a carbon cycle data assimilation system (CCDAS)". In: Global Biogeochemical Cycles. url: https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2004GB002254.
[34] W. J. Sacks, D. S. Schimel, R. K. Monson, and B. H. Braswell. "Model-data synthesis of diurnal and seasonal CO2 fluxes at Niwot Ridge, Colorado". In: Global Change Biology. url: https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1365-2486.2005.01059.x.
[35] P. Ailliot, E. Frénod, and V. Monbet. "Long term object drift in the ocean with tide and wind". In: SIAM Journal on Multiscale Modeling and Simulation. url: https://hal.archives-ouvertes.fr/hal-00129093.
[36] P. F. Verhulst. "Notice sur la loi que la population suit dans son accroissement". In: Corresp. Math. Phys. 10 (1838), pp. 113–126.
[37] K. Soetaert, T. Petzoldt, and R. Woodrow Setzer. "Solving Differential Equations in R: Package deSolve". In: Journal of Statistical Software. issn: 1548-7660.
[38] W. Enright. "The Numerical Analysis of Ordinary Differential Equations: Runge-Kutta and General Linear Methods". In: SIAM Review. doi: 10.1137/1031147.
[39] S. G. Johnson. The NLopt nonlinear-optimization package. 2008.
[40] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2014.
[41] D. E. Finkel. DIRECT Optimization Algorithm. North Carolina State University, 2003.
[42] R. Courant, K. Friedrichs, and H. Lewy. "Über die partiellen Differenzengleichungen der mathematischen Physik". In: Mathematische Annalen.
[43] Wolfram MathWorld - A Wolfram Web Resource. 2014.
[44] H. Flourent. "Study of the ranges of values of a Biomimetic Statistical Learning Tool parameters". Working paper. 2019. url: https://hal.archives-ouvertes.fr/hal-02067374.
[45] E. A. Nadaraya. "On estimating regression". In: Theory of Probability & Its Applications.
[46] In: Sankhyā: The Indian Journal of Statistics, Series A (1964), pp. 359–372.
[47] R. Guzmán-Cruz, R. Castaneda-Miranda, J. Garcia-Escalante, L. Solis-Sánchez, D. Alaniz-Lumbreras, J. Mendoza-Jasso, A. Lara-Herrera, G. Ornelas-Vargas, E. Gonzalez-Ramirez, and R. Montoya-Zamora. "Evolutionary Algorithms in Modelling of Biosystems". 2011. isbn: 978-953-307-171-8.
[48] B. Gompertz. "XXIV. On the nature of the function expressive of the law of human mortality, and on a new mode of determining the value of life contingencies. In a letter to Francis Baily, Esq. FRS &c". In: Philosophical Transactions of the Royal Society of London 115 (1825), pp. 513–583.
[49] C. P. Winsor. "The Gompertz Curve as a Growth Curve". In: Proceedings of the National Academy of Sciences. issn: 0027-8424. doi: 10.1073/pnas.18.1.1.
[50] N. K. Sakomura, F. A. Longo, E. O. Oviedo-Rondon, C. Boa-Viagem, and A. Ferraudo. "Modeling energy utilization and growth parameter description for broiler chickens". In: Poultry Science.
[51] In: Life Sciences.
[52] In: Journal of Biological Chemistry.
[53] The Chemical Basis of Growth and Senescence. JB Lippincott Company, 1923.
[54] P. Román-Román and F. Torres-Ruiz. "Modelling logistic growth by a new diffusion process: Application to biological systems". In: Biosystems.
[55] In: Commun. ACM 55 (2012), pp. 78–87.
[56] M. T. Gorczyca, H. F. M. Milan, A. S. C. Maia, and K. G. Gebremedhin. "Machine learning algorithms to predict core, skin, and hair-coat temperatures of piglets". In: Computers and Electronics in Agriculture 151 (2018), pp. 286–294. issn: 0168-1699. doi: 10.1016/j.compag.2018.06.028.
[57] J. J. Valletta, C. Torney, M. Kings, A. Thornton, and J. Madden. "Applications of machine learning in animal behaviour studies". In: Animal Behaviour 124 (2017), pp. 203–220. issn: 0003-3472. doi: 10.1016/j.anbehav.2016.12.005.
[58] C. Ma, H. H. Zhang, and X. Wang. "Machine learning for Big Data analytics in plants". In: Trends in Plant Science. issn: 1360-1385. doi: 10.1016/j.tplants.2014.08.004.
[59] R. H. L. Ip, L. M. Ang, K. P. Seng, J. C. B., and J. E. Pratley. "Big data and machine learning for crop protection". In: Computers and Electronics in Agriculture 151 (2018), pp. 376–383. issn: 0168-1699. doi: 10.1016/j.compag.2018.06.008.
[60] S. Fritsch, F. Guenther, and M. Suling. neuralnet: Training of neural networks. R package version 1.32. 2012. url: http://CRAN.R-project.org/package=neuralnet.
[61] D. Bastianelli, D. Sauvant, and A. Rerat. "Mathematical modeling of digestion and nutrient absorption in pigs". In: Journal of Animal Science (1996).
[62] J. Mach and Z. Kristkova. "Modelling The Cattle Breeding Production in the Czech Republic". In: AGRIS on-line Papers in Economics and Informatics.
[63] In: Animal: an international journal of animal bioscience.
[64] E. C. T. Zúñiga, I. L. L. Cruz, and A. R. García. "Parameter estimation for crop growth model using evolutionary and bio-inspired algorithms". In: Applied Soft Computing 23 (2014), pp. 474–482. issn: 1568-4946. doi: 10.1016/j.asoc.2014.06.023.
[65] L. O. Tedeschi, D. G. Fox, R. D. Sainz, L. G. Barioni, S. R. de Medeiros, and C. Boin. "Mathematical models in ruminant nutrition". In: Scientia Agricola 62 (2005), pp. 76–91. issn: 0103-9016.
[66] D. E. Beever, A. J. Rook, J. France, M. S. Dhanoa, and M. Gill. "A review of empirical and mechanistic models of lactational performance by the dairy cow". In: Livestock Production Science. issn: 0301-6226. doi: 10.1016/0301-6226(91)90061-T.
[67] D. Wallach, B. Goffinet, J. E. Bergez, P. Debaeke, D. Leenhardt, and J. N. Aubertot. "Parameter estimation for crop models". In: Agronomy Journal.
[68] In: Publication - European Association for Animal Production 78 (1995), pp. 223–223.

Appendix I: Generation of the Learning Database
To test the learning capability of the model, we generated a Learning Database containing 50 individuals, that is, 50 Output Curves. The objective is to obtain a database having the same characteristics as a real field database. To that end, we integrated noise and individual variability into this fictitious database.
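The three generation steps detailed in the remainder of this appendix (Normal laws for the parameters, Uniform laws for the Inputs, additive Gaussian noise) can be sketched as follows. This is an illustrative Python sketch: the variances, the number of injections and the seed are placeholder choices, and the curve simulator itself is omitted; the paper's implementation is in R.

```python
# Sketch of the fictitious-database generation: individual variability via
# Normal laws, fictitious Inputs via Uniform laws, and Gaussian noise.
import random

random.seed(0)  # reproducibility of the fictitious database

def draw_individual_parameters():
    # Normal laws centered on the Table 8 values; variances here are arbitrary.
    return {
        "omega": random.gauss(10, 1.0),
        "r": random.gauss(35, 3.5),
        "f": random.gauss(800, 80.0),
        "u": random.gauss(125, 12.5),
    }

def draw_inputs(n_injections):
    # Injected volume VolQ and injection moment c_t, both Uniform on [0, 1].
    return [(random.uniform(0, 1), random.uniform(0, 1))
            for _ in range(n_injections)]

def add_noise(curve, sigma):
    # Centered Gaussian noise added to every point of a simulated Output Curve.
    return [y + random.gauss(0, sigma) for y in curve]

# 50 individuals, i.e. 50 parameter sets with their fictitious Inputs.
database = [
    {"params": draw_individual_parameters(), "inputs": draw_inputs(3)}
    for _ in range(50)
]
```

Each entry of `database` would then be run through the PDE model to produce one Output Curve before `add_noise` is applied.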
I.1 Integration of individual variability
The model parameters are constants to determine. Nevertheless, to introduce individual variability in the generated data, we considered (only in this Section) the parameters as biological-like factors following a Normal distribution. To simulate individual differences, we assigned to each parameter a Normal distribution centered on an arbitrarily chosen value and with a given relative variance (see Table 8). From these Normal probability laws, we generated values of the parameters ω, r, f and u. Their respective statistical and probabilistic distributions are given in Figure 12.

Figure 12: Distributions of the parameters ω, r, f and u.

I.2 Generation of fictitious Inputs
The Inputs integrated into the model correspond to the injected volume (VolQ) and the moment of the injection (c_t). These parameters can take any value between 0 and 1; therefore, we applied a Uniform distribution over the interval [0; 1] to these two types of Inputs (Table 8). From the values of the parameters and the fictitious Inputs, we generated the Output Curves.

I.3 Addition of a random noise
Continuing with the objective of obtaining an experimental-like database, we added noise to the Output Curves. To do so, we added to the generated curves a random component following a centered Gaussian distribution (Table 8). Figure 13 shows some examples of generated curves without and with noise. We divided the obtained database into two datasets: a Training Database and a Test Database.

In the rest of this Section, we assumed that we have an experimental-like database and a model containing four parameter values to determine.

Table 8: The distributions followed by the parameters and the Inputs.

| Parameter | Probability law |
| ω         | N(10, ·)        |
| r         | N(35, ·)        |
| f         | N(800, ·)       |
| u         | N(125, ·)       |
| VolQ      | U(0, 1)         |
| c_t       | U(0, 1)         |
| Noise     | N(0, ·)         |

Figure 13: Example of simulated curves without and with noise.

Appendix II: Study of the compensation effects existing between the model parameters
Among the parameters ω, r, f and u, some parameters offset each other. The velocity ω can be offset by the delay r that the information undergoes: for example, a low convection speed associated with a short delay may induce kinetics equivalent to those induced by a high convection speed associated with a long delay. The fixation f and the use u of the information are also two counterbalanced processes: for instance, a high fixation rate followed by a low usage of the information can induce the same effect on the Outcome as a low fixation rate followed by an important use of the fixed information. Therefore, relations exist between the parameters of those two couples. The objective of this part is to use the fictitious Training Database to study these relations.
II.1 Study of the relationship between ω and r

First, we demonstrated the relationship existing between ω and r by calculating the error made on the Training Database by the model parametrized with different (ω, r) pairs. To do so, we ranged over the domain ω × r and we calculated the Relative Residual Sum of Squares (RRSS) associated with the models parametrized with the different tested (ω, r) pairs:

RRSS(\omega, r) = \sum_{i=1}^{n} \left( \sum_{j=1}^{m} \left( \frac{y^{ij}_{obs} - y^{ij}_{pred}(\omega, r)}{y^{ij}_{obs}} \right)^2 \right),   (16)

where n corresponds to the number of individuals contained in the Training Database and m to the number of points on the curves. y^{ij}_{obs} and y^{ij}_{pred} correspond respectively to the observed and the predicted value of the j-th point of the i-th individual. Therefore, the RRSS corresponds to the sum of the squared relative differences between the predicted curves and the initially generated curves.

Figures 14 and 15 give the values of the RRSS according to the values of ω and r. The existence of a series of equivalent pairs, that is, a series of pairs leading to the same value of the RRSS, can be seen in Figure 14(a). There is an area where the RRSS values are lower (Figure 15), corresponding to the curve EC in Figure 14(b). We assumed that the optimal (ω_Opt, r_Opt) pair, inducing the lowest RRSS, belongs to this curve. Therefore, we set out to determine the equation of the curve EC.

II.2 Search for the (ω_Opt, r_Opt) pairs inducing the lowest RRSS
Figure 14: The value of the RRSS according to ω and r (a: left), and the schema of the different Equivalent Couples (EC) (b: right).

Figure 15: The 3D representation of the value of the RRSS according to ω and r.

To find the equation of the curve EC, we sought, for different values of ω, the value of r minimizing the RRSS. To do that, for each tested value of ω we used the optimization algorithm DIRECT to find the value of r minimizing the objective function

f_{obj}(r) = \frac{1}{n} \sum_{i=1}^{n} \left( \sum_{j=1}^{m} \left( \frac{y^{ij}_{obs} - y^{ij}_{pred}(\omega, r)}{y^{ij}_{obs}} \right)^2 \right),   (17)

corresponding to the average RRSS. To obtain several fitted values of r for each tested value of ω, we sampled the Training Database: we drew subsets of curves from the training curves and fitted r on each subset, ultimately obtaining three values of r for each tested value of ω (Figure 16). Using a Nadaraya-Watson kernel regression (see [45] and [46]), we obtained a non-parametric relationship linking ω_Opt and r_Opt in the form

r_{opt} = \hat{m}(\omega_{opt}) + \epsilon,   (18)

where \hat{m} corresponds to the Nadaraya-Watson estimator.

Figure 16: The Nadaraya-Watson kernel regression linking the (ω_Opt, r_Opt) parameter pairs.

Knowing the relationship between ω_Opt and r_Opt, it is possible to deduce one of these two parameters from the value of the other. Hence, this relationship reduces the number of parameters that need to be learned simultaneously.
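The Nadaraya-Watson estimator m̂ of Equation (18) admits a very compact implementation. A minimal Python sketch with a Gaussian kernel (the bandwidth h and the function names are illustrative choices, not the paper's):

```python
# Minimal Nadaraya-Watson kernel regression with a Gaussian kernel.
import math

def nadaraya_watson(x_train, y_train, h):
    """Return m_hat with m_hat(x) = sum_i K((x - x_i)/h) y_i / sum_i K((x - x_i)/h)."""
    def kernel(z):
        return math.exp(-0.5 * z * z)  # unnormalized Gaussian kernel

    def m_hat(x):
        weights = [kernel((x - xi) / h) for xi in x_train]
        return sum(w * yi for w, yi in zip(weights, y_train)) / sum(weights)

    return m_hat
```

Fitted on the sampled (ω_Opt, r_Opt) pairs, `m_hat(omega)` would return the deduced value of r, so that only ω needs to be learned explicitly.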
II.3 Study of the relationship between f and u

There is also a compensation effect between f and u: a high value of f can be compensated by a low value of u, and vice versa. As above for ω and r, we sought to determine the relationship existing between f and u, so as to deduce one of these two parameters from the other one and further reduce the number of parameters to learn simultaneously. As above, we ranged over the domain f × u and calculated the RRSS of the models parameterized with different pairs of values of (f, u) (Figures 17 and 18). This study demonstrates a series of equivalent pairs. There is an area where the RRSS values are lower (Figure 18), corresponding to the EC curve in Figure 17(a). We assumed that the optimal (f_Opt, u_Opt) pair, inducing the lowest RRSS, belongs to this curve. Therefore, we set out to determine the equation of this curve.
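The grid scans of II.1 and II.3 can be sketched as follows. In this illustrative Python sketch, `simulate` is a hypothetical stand-in for the parameterized model (it must return one predicted curve per individual), and `rrss` follows Equation (16):

```python
# Grid scan of a parameter pair, keeping for each first coordinate the second
# coordinate minimizing the RRSS, tracing the curve of Equivalent Couples.
def rrss(curves_obs, curves_pred):
    """Relative Residual Sum of Squares over all individuals and curve points."""
    total = 0.0
    for obs, pred in zip(curves_obs, curves_pred):
        total += sum(((yo - yp) / yo) ** 2 for yo, yp in zip(obs, pred))
    return total

def equivalent_couples(curves_obs, simulate, f_grid, u_grid):
    """For each tested f, return the u value minimizing RRSS(f, u)."""
    return [
        (f, min(u_grid, key=lambda u: rrss(curves_obs, simulate(f, u))))
        for f in f_grid
    ]
```

In the paper the inner minimization is performed with DIRECT rather than an exhaustive scan of `u_grid`; the exhaustive version above is only the simplest way to exhibit the compensation ridge.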
II.4 Search for the (f_Opt, u_Opt) pairs that lead to the lowest RRSS