Spline and Kernel Mixed Nonparametric Regression for Malnourished Children Model in West Nusa Tenggara
Abstract
Article history: Received: 23-12-2020; Revised: 16-01-2021; Accepted: 21-01-2021

Health sector development is essential to improving the quality of human life, especially in West Nusa Tenggara (NTB) Province. Based on data from the NTB Provincial Health Office from 2011 to 2016, the number of children under five suffering from malnutrition continued to increase, driven by several contributing factors. Therefore, an appropriate analysis is needed to model malnutrition among children under five in NTB Province in 2016, which consists of 10 districts, based on the variables that influence it. The analysis in this study was carried out using a mixed truncated spline and kernel nonparametric regression model. The estimate of the nonparametric regression curve depends on the optimal knot points and bandwidth parameters, which were determined using Generalized Cross-Validation (GCV). The analysis yielded a mixed truncated spline and kernel nonparametric regression model with optimal knot points $k_{11} = 10.13846154$; $k_{12} = 72.19692308$; $k_{13} = 44.25846154$; $k_{14} = 91.01461538$ for each variable and optimal bandwidths $h_1 = 0.5561442$; $h_2 = 1.0220133$; $h_3 = 0.7163110$; $h_4 = 1.2240464$; and $h_5 = 1.2900146$, with a GCV value of $0.003038719$. The resulting mixed model is a good model as judged by its $R^2$ and MSE values. In addition, the MAPE value indicates a high degree of accuracy, so the model forecasts very well.

Keywords: Bandwidth; Kernel; Knot; Malnutrition; Truncated Spline

This is an open access article under the CC BY-SA license.
DOI: https://doi.org/10.30812/varian.v4i2.1003

A. INTRODUCTION
Health development is organized to improve everyone's awareness, willingness, and ability to live healthily, so as to achieve a high level of public health.
To achieve this as soon as possible, a sound health information system is needed, especially in West Nusa Tenggara (NTB) Province. Improving the quality of human life is one of the most critical steps toward a better future. One indicator for monitoring the level of public health is the nutritional status of babies. Nutritional status describes a state of balance expressed through particular variables. If this balance is disturbed, body growth tends to be impaired. Many cases of low birth weight, and even infant death, are attributed to mothers who were malnourished before giving birth. A baby with low birth weight is at risk of nutrient deficiency and even malnutrition (BPS, 2016).

According to the NTB Provincial Health Office in 2016, many cases of malnutrition were found in West Nusa Tenggara Province. Data for 2011-2015 show that malnutrition cases were declining, but they increased to 403 cases in 2016. According to Ramadani et al. (2013), several factors cause malnutrition in babies, such as the percentage of babies receiving non-exclusive care; low birth weight babies (<2500 g); houses categorized as unhealthy; households with access to clean water; the active integrated center of the ministry; and the use of health facilities. Other factors are the percentage of incomplete immunization; the percentage receiving vitamin A; the percentage of households using health facilities; the percentage of babies receiving health services; the percentage of poor villagers; and the percentage of first marriages before the age of 15 (Maulani et al., 2016).

The percentage of malnourished children is one of the simple parameters for assessing babies' nutritional status (WHO, 2010). One way to monitor this percentage is to model the relationship between malnutrition and the factors that affect it.

Jurnal Varian | Vol. 4, No. 2, April 2021, pp. 99-108
The relationship between a dependent variable and one or more independent variables can be expressed statistically through a regression model. Regression analysis is a statistical method for describing the systematic relationship between variables (Daoud, 2017). According to Hardle et al. (2004), two approaches can be used to determine the regression curve, namely the parametric regression approach and the nonparametric regression approach. Parametric regression requires assumptions to be satisfied, such as normality and constant variance. In practice, these assumptions, normality for example, are often violated. A statistical technique is therefore needed that is not bound by strict assumptions or by a particular form of regression curve. One such alternative is the nonparametric regression approach. This approach is used when prior information about the shape of the regression curve is limited or unavailable (Eubank, 1999). The nonparametric methods most often used are the truncated spline and the kernel estimator.

In general, children's growth patterns tend to change at certain ages. The shape of this pattern cannot be specified in advance, so estimates obtained from parametric regression are inaccurate. For this reason, data on the percentage of children under five suffering from malnutrition are modeled with nonparametric regression (Pratiwi, 2017). Initial analysis shows that the pattern between the percentage of malnourished children under five and several of the variables that influence it fluctuates over certain intervals. This characteristic is consistent with the spline approach, which is highly flexible and does not depend on the subjectivity of the researcher (Eubank, 1999).
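As a minimal illustration of the contrast drawn above between parametric regression and the spline approach, the following Python sketch fits a straight line and a one-knot truncated linear spline to synthetic data whose slope changes at a known point. The data, the knot location, and the helper names `truncated_linear_design` and `mse_of_fit` are assumptions made for illustration only; they are not the study's data, knots, or code.

```python
import numpy as np

def truncated_linear_design(t, knot):
    """Design matrix with columns 1, t, (t - knot) * I(t >= knot)."""
    t = np.asarray(t, dtype=float)
    hinge = np.where(t >= knot, t - knot, 0.0)
    return np.column_stack([np.ones_like(t), t, hinge])

def mse_of_fit(X, y):
    """Ordinary least-squares fit of y on X; returns the in-sample MSE."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.mean((y - X @ coef) ** 2))

# Synthetic data (hypothetical, not from the study):
# the slope changes from +1.0 to -0.5 at t = 5.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 50)
y = np.where(t < 5.0, t, 5.0 - 0.5 * (t - 5.0)) + rng.normal(0.0, 0.1, t.size)

# Parametric fit: a single straight line.
X_line = np.column_stack([np.ones_like(t), t])
# Truncated linear spline with one knot placed at the (assumed known) change point.
X_spline = truncated_linear_design(t, knot=5.0)

mse_line = mse_of_fit(X_line, y)
mse_spline = mse_of_fit(X_spline, y)
print(f"straight line MSE:   {mse_line:.4f}")
print(f"one-knot spline MSE: {mse_spline:.4f}")
```

Because the straight-line design is nested in the spline design, the spline's in-sample MSE can never be larger. What the sketch shows is how wide the gap becomes when the slope really does change at the knot. In practice the knot is not known in advance, which is why the study selects it by GCV.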
In addition, other variables that influence the percentage of malnourished children show no particular pattern in the data, which makes them suitable for the kernel approach, since it can model data without a specific pattern. This approach has also received special attention from researchers because it converges relatively fast compared with other approaches (Hardle et al., 2004).

Two basic assumptions are usually made when a nonparametric regression model is used. The first is that, in the multivariable case, every independent variable is assumed to follow the same pattern. The second is that researchers use only one form of estimator for all independent variables. In practice, the data pattern often differs from one independent variable to another. Therefore, if only one estimator is used to estimate the nonparametric regression curve, the resulting estimate does not match the data pattern. As a result, the estimated regression model is less precise and tends to produce large errors (Budiantara et al., 2015). Based on this description, this study models the percentage of malnourished children under five in NTB Province using a mixed truncated spline and kernel nonparametric regression model.

B. LITERATURE REVIEW
1. Nonparametric Regression
Nonparametric regression is an approach used to determine the pattern of the relationship between dependent and independent variables when the regression curve is unknown or when there is no complete prior information about the shape of the data pattern. This approach is highly flexible because the data are expected to determine the form of the estimated regression curve, without being influenced by the researcher's subjectivity. The general nonparametric regression model is given in Equation (1).
$$y_i = f(x_i) + \varepsilon_i, \quad i = 1, 2, \ldots, n \tag{1}$$

Here $y_i$ is the dependent variable, $x_i$ is the independent variable, $f(x_i)$ is a regression function of unknown shape, and $\varepsilon_i$ is an error assumed to be random with zero mean and constant variance (Eubank, 1999).

Muhammad Sopian Sauri, Nonparametric Regression Mixed...

2. Mixed Estimator Nonparametric Regression Truncated Spline and Kernel
Consider data $(t_i, z_i, y_i)$, where the relationship between the independent variables $(t_i, z_i)$ and the dependent variable $y_i$ is assumed to follow a nonparametric regression model. In general, the nonparametric regression model is defined as in Equation (2).

$$y_i = \mu(t_i, z_i) + \varepsilon_i, \quad i = 1, 2, \ldots, n \tag{2}$$

The regression curve $\mu(t_i, z_i)$ is assumed to be of unknown shape and smooth, meaning continuous and differentiable. The random error $\varepsilon_i$ is normally distributed with zero mean and constant variance. The regression curve $\mu(t_i, z_i)$ is assumed to be additive, meaning it can be written as in Equation (3).

$$\mu(t_i, z_i) = f(t_i) + g(z_i) \tag{3}$$

The main problem in mixed nonparametric regression is how to obtain an estimate of the regression curve of the form below, with a vector of bandwidth parameters $\mathbf{h}$ and a vector of knot points $\mathbf{k}$.

$$\hat{\mu}_{\mathbf{h},\mathbf{k}}(t_i, z_i) = \hat{f}_{\mathbf{k}}(t_i) + \hat{g}_{\mathbf{h}}(z_i) \tag{4}$$

To obtain the mixed truncated spline and kernel regression estimator, the regression curve $f_{\mathbf{k}}(t_i)$ is approximated by a truncated spline function with knot points $\mathbf{k} = (k_1, k_2, \ldots, k_k)$, and the regression curve $g_{\mathbf{h}}(z_i)$ is approximated by a kernel function. For example, consider the following basis for the truncated spline space, where $I$ is an indicator function.
$$\left\{ 1, t, t^2, \ldots, t^m, (t - k_1)^m I(t \ge k_1), (t - k_2)^m I(t \ge k_2), \ldots, (t - k_k)^m I(t \ge k_k) \right\} \tag{5}$$

The regression curve $f_{\mathbf{k}}(t_i)$ can be written as follows, with $\theta_0, \theta_1, \ldots, \theta_m, \phi_1, \phi_2, \ldots, \phi_k$ being unknown parameters.

$$f_{\mathbf{k}}(t_i) = \theta_0 + \theta_1 t_i + \cdots + \theta_m t_i^m + \phi_1 (t_i - k_1)^m I(t_i \ge k_1) + \cdots + \phi_k (t_i - k_k)^m I(t_i \ge k_k) \tag{6}$$

Moreover, the estimate of the kernel regression curve $g(z_i)$ can be presented as the following formula.