Application of Quantile Regression: Modeling Body Mass Index in Ethiopia

Received: March 26, 2018 Revised: May 10, 2018 Accepted: May 15, 2018

Abstract: Background: Child malnutrition is the leading public health problem in developing countries. It is a major cause of child morbidity and mortality. Children under five are the group most vulnerable to malnutrition. Body Mass Index (BMI) is a measure of nutritional status and is defined as the ratio of weight (kg) to squared height (m²). Studying the determinants of under-five children's BMI is an important issue that needs to be addressed. This study applies quantile regression to study the determinants of under-five children's BMI in Ethiopia.


INTRODUCTION
Having healthy individuals in the population equates to the wealth of a country. Nutrition is the vital precondition for good health. Body Mass Index (BMI) is used as a screening tool to indicate whether a person is underweight, overweight, obese or a healthy weight for their height. However, BMI is not a direct measure of body fatness. If a person's BMI is outside the healthy range, the risk of illness or death may increase significantly. For children, BMI depends on age and sex and is often referred to as BMI-for-age. A high amount of body fat in adults or children can lead to weight-related diseases and other health issues, and being underweight can also put one at risk of health issues [1,2].
The nutrition of infants and young children is a matter of great concern in any society. About 45 percent of deaths of children under the age of five are linked to malnutrition [3]. In 2015, more than half of stunted under-five children lived in Asia, while more than one-third lived in Africa. Sub-Saharan Africa has one of the highest levels of child malnutrition. In Ethiopia, 29 percent of children under the age of five are underweight, and 9 percent are severely underweight. According to the 2016 EDHS report, overall 38 percent of children under the age of five are stunted, 10 percent are wasted, 24 percent are underweight, and 1 percent are overweight. This indicates that Ethiopia is among the countries with the highest rates of malnutrition in Sub-Saharan Africa.
Globally, an estimated 101 million children under five years of age, or 16%, were underweight (i.e. weight-for-age below -2 SD) in 2011, a 36 percent decrease from an estimated 159 million in 1990. Although the prevalence of stunting and underweight among children under five years of age has decreased worldwide since 1990, overall progress is insufficient and millions of children remain at risk [4,5]. Malnutrition is therefore a considerable health problem that needs due attention, because reducing malnutrition in children improves their health status. This in turn improves the health status of future generations and is indispensable for the economic growth and development of the society under consideration.
Children under the age of five with a BMI at or above the 95th percentile, between the 85th and 95th percentiles, and between the 5th and 85th percentiles are classified as obese, overweight and normal (healthy weight), respectively [6]. The cutoff point for underweight, below the 5th percentile, is based on recommendations by the World Health Organization Expert Committee on Physical Status. The percentiles are age-specific for children but not for adults [7].
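As a minimal sketch of the definitions above (it is not part of the study's SAS analysis), the following Python snippet computes BMI from weight and height and maps a BMI-for-age percentile to the categories just listed. The function names are illustrative. The percentile is assumed to have been looked up already from an age- and sex-specific reference chart, which is not reproduced here, and assigning a value of exactly the 85th percentile to "overweight" is an assumption, since the text does not say how boundary values are handled.

```python
def bmi(weight_kg, height_m):
    """Body Mass Index: weight in kilograms divided by squared height in metres."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def classify_bmi_percentile(percentile):
    """Map a BMI-for-age percentile (0-100) to a weight category.

    Cutoffs follow the text: below 5th = underweight, 5th to 85th = normal,
    85th to 95th = overweight, at or above 95th = obese.
    Assumption: a value of exactly 85 is treated as overweight.
    """
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must lie between 0 and 100")
    if percentile < 5:
        return "underweight"
    if percentile < 85:
        return "normal"
    if percentile < 95:
        return "overweight"
    return "obese"


# Hypothetical child: 12 kg and 0.90 m, sitting at the 40th BMI-for-age percentile.
print(round(bmi(12.0, 0.90), 2), classify_bmi_percentile(40))
```

The percentile lookup is deliberately left out: it requires reference growth data that the paper cites but does not include.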

METHODS AND MATERIALS
For this study, the 2016 Ethiopian Demographic and Health Survey was used. The survey was carried out by the Central Statistical Agency of Ethiopia. For the survey, 645 clusters were selected, 202 in urban areas and 443 in rural areas. The survey was conducted in 16650 residential households, 5232 in urban areas and 11418 in rural areas. The sample was expected to generate an estimated 16663 completed interviews with women aged 15-49, 5514 in urban areas and 11149 in rural areas, and 14195 completed interviews with men aged 15-59, 4472 in urban areas and 9723 in rural areas [8].

Study Variable
The response variable in this study is under-five children's BMI, which is a continuous variable. The explanatory variables used in this study are: child's age, sex of child, weight of child at birth, mother's current age, mother's BMI, educational attainment of mother, mother's work status, religion, region, wealth index, place of residence (rural or urban), and current marital status. These socio-economic and demographic factors have been identified by several researchers as likely intermediate variables among the determinants of children's nutritional status [18].
The main objective of this study is to apply quantile regression to identify factors associated with different quantiles of under-five children's BMI as a function of age and other relevant factors. The results will help policy makers understand the areas they need to focus on in order to enhance the planning and evaluation of health policies that prevent children's deaths and improve children's health, diet and growth.

Statistical Methods
Quantiles are a generalization of percentiles for continuous random variables. Quantiles are cut points dividing the range of a probability distribution into contiguous intervals with equal probabilities, or dividing the observations in a sample in the same way. In SAS, the QUANTREG procedure models the effect of covariates on the conditional quantiles of a response variable by means of quantile regression. Ordinary least squares (OLS) regression models the relationship between one or more covariates X and the conditional mean of the response variable given X = x, that is, E[Y|X=x]. Quantile regression, which was introduced by Koenker and Bassett in 1978, extends the regression model to conditional quantiles of the response variable, such as the 0.25 quantile (25th percentile), the 0.5 quantile (50th percentile), the 0.75 quantile (75th percentile) and so on, rather than just the conditional mean of the response variable. Quantile regression is the appropriate tool when conditional quantiles are of interest. It is also particularly useful when the rate of change in the conditional quantile, expressed by the regression coefficients, depends on the quantile [9].
Quantile regression, which includes median regression as a special case, provides a complete picture of the covariate effect when a set of percentiles is modeled. It can therefore capture important features of the data that might be missed by models that average over the conditional distribution [9-11]. Quantile regression methods are applied to continuous-response data with no zero values and can possibly be utilized in the context of count data [12,13].
Suppose Y is the response variable and X is the p-dimensional predictor. Let $F_Y(y \mid X = x) = P(Y \le y \mid X = x)$ denote the conditional cumulative distribution function of Y given X = x. Then the τth conditional quantile of Y is defined as:

$$Q_\tau(Y \mid X = x) = \inf\{y : F_Y(y \mid x) \ge \tau\} \tag{1}$$

where the quantile level τ ranges between 0 and 1. In particular, the median is $Q_{0.5}(Y \mid X = x)$. The quantile regression model is described by the conditional τth quantiles of the response Y for given values of the predictors $x_1, x_2, \ldots, x_k$. It is a natural extension of the traditional mean model $E(Y \mid X = x) = x'\beta$:

$$Q_\tau(Y \mid X = x) = x'\beta(\tau) \tag{2}$$

where $\beta(\tau)$ is the unknown parameter vector.
Eq. (2) specifies the changes in the conditional quantiles. Since any τth quantile can be used, it is possible to model any predetermined position of the distribution and to achieve a more complete understanding of how the response distribution is affected by the predictors. Researchers can thus choose the positions on the response distribution that suit their specific inquiries.
For a random sample $\{y_1, \ldots, y_n\}$ of Y, it is well known that the sample median minimizes the sum of absolute deviations, Eq. (3):

$$\text{median} = \arg\min_{\xi \in \mathbb{R}} \sum_{i=1}^{n} |y_i - \xi| \tag{3}$$

Likewise, the general τth sample quantile ξ(τ), which is the analogue of Q(τ), is formulated as the minimizer:

$$\xi(\tau) = \arg\min_{\xi \in \mathbb{R}} \sum_{i=1}^{n} \rho_\tau(y_i - \xi)$$

where $\rho_\tau(z) = z(\tau - I(z < 0))$, 0 < τ < 1, and I(·) denotes the indicator function. The loss function $\rho_\tau$ assigns a weight of τ to positive residuals $y_i - \xi$ and a weight of 1 - τ to negative residuals. Using this loss function, the linear conditional quantile function extends the τth sample quantile ξ(τ) to the regression setting in the same way that the linear conditional mean function extends the sample mean.
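The role of the loss function can be illustrated with a small Python sketch, written under stated assumptions rather than taken from the study: `check_loss` implements ρ_τ(z) = z(τ - I(z < 0)), and `sample_quantile` searches the observed values for one that minimizes the summed loss. Restricting the search to observed values is sufficient because the summed loss is piecewise linear and convex in ξ, so it attains its minimum at a data point. The function names and the BMI-like numbers are hypothetical.

```python
def check_loss(z, tau):
    """Check loss rho_tau(z) = z * (tau - I(z < 0)).

    Positive residuals get weight tau, negative residuals weight (1 - tau).
    """
    return z * (tau - (1.0 if z < 0 else 0.0))


def sample_quantile(ys, tau):
    """Return an observed value xi that minimizes sum_i rho_tau(y_i - xi)."""
    if not 0 < tau < 1:
        raise ValueError("tau must lie strictly between 0 and 1")
    return min(sorted(ys), key=lambda xi: sum(check_loss(y - xi, tau) for y in ys))


# Hypothetical BMI-like values (not taken from the survey data).
ys = [13.1, 14.6, 15.2, 16.0, 21.5]
print(sample_quantile(ys, 0.5))   # median as the minimizer for tau = 0.5
print(sample_quantile(ys, 0.25))  # lower quartile as the minimizer for tau = 0.25
```

When several values minimize the loss, this sketch returns the smallest of them; other conventions for breaking such ties exist.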

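OLS regression estimates the linear conditional mean function $E(Y \mid X = x) = x'\beta$ by solving:

$$\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^p} \sum_{i=1}^{n} (y_i - x_i'\beta)^2$$

The estimated parameter $\hat{\beta}$ minimizes the sum of squared residuals in the same way that the sample mean $\hat{\mu}$ minimizes the sum of squares:

$$\hat{\mu} = \arg\min_{\mu \in \mathbb{R}} \sum_{i=1}^{n} (y_i - \mu)^2$$

Quantile regression likewise estimates the linear conditional quantile function, $Q_\tau(Y \mid X = x) = x'\beta(\tau)$, by solving Eq. (5):

$$\hat{\beta}(\tau) = \arg\min_{\beta \in \mathbb{R}^p} \sum_{i=1}^{n} \rho_\tau(y_i - x_i'\beta) \tag{5}$$

For any quantile τ ∈ (0,1), the quantity $\hat{\beta}(\tau)$ is called the τth regression quantile. The case τ = 0.5, which minimizes the sum of absolute residuals, corresponds to median regression, which is also known as L1 regression. The set of regression quantiles {β(τ): τ ∈ (0,1)} is referred to as the quantile process.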
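To make the minimization concrete, here is a hedged Python sketch for the special case of one predictor plus an intercept. It is not the algorithm used by the SAS QUANTREG procedure. It relies on a known property of this linear-programming problem: when the x values are not all equal, some optimal line passes through at least two data points, so scanning all lines through pairs of observations finds an exact minimizer. The function names and data are hypothetical, and the approach is practical only for small samples.

```python
from itertools import combinations


def check_loss(z, tau):
    """Check loss rho_tau(z) = z * (tau - I(z < 0))."""
    return z * (tau - (1.0 if z < 0 else 0.0))


def fit_line_quantile(xs, ys, tau):
    """Fit y = b0 + b1 * x at quantile level tau by exhaustive search.

    Every line through two observations with distinct x values is a candidate;
    the one with the smallest summed check loss is returned as (b0, b1).
    """
    best = None
    points = list(zip(xs, ys))
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # a vertical line is not a regression line
        b1 = (y2 - y1) / (x2 - x1)
        b0 = y1 - b1 * x1
        loss = sum(check_loss(y - (b0 + b1 * x), tau) for x, y in points)
        if best is None or loss < best[0]:
            best = (loss, b0, b1)
    if best is None:
        raise ValueError("need at least two observations with distinct x values")
    return best[1], best[2]


# Hypothetical data: four points on y = 1 + 2x and one much larger value.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 30]
print(fit_line_quantile(xs, ys, 0.5))
```

With more predictors the same idea would need all subsets of p observations, which is why production software uses simplex or interior-point methods instead.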
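Quantile regression minimizes Eq. (6):

$$\sum_{i:\, e_i \ge 0} \tau\,|e_i| + \sum_{i:\, e_i < 0} (1 - \tau)\,|e_i| \tag{6}$$

where $e_i = y_i - x_i'\beta$ is the ith residual. The sum gives the asymmetric penalties τ|e_i| for under-prediction and (1 - τ)|e_i| for over-prediction.
The SAS QUANTREG procedure computes the quantile function $Q_\tau(Y \mid X = x)$ and conducts statistical inference on the estimated parameters $\hat{\beta}(\tau)$.
The τth quantile regression estimator $\hat{\beta}_\tau$ minimizes over $\beta_\tau$ the objective function in Eq. (7):

$$\sum_{i:\, y_i \ge x_i'\beta_\tau} \tau\,|y_i - x_i'\beta_\tau| + \sum_{i:\, y_i < x_i'\beta_\tau} (1 - \tau)\,|y_i - x_i'\beta_\tau| \tag{7}$$

where 0 < τ < 1; the first sum runs over the observations with $y_i \ge x_i'\beta_\tau$ (under-prediction) and the second over those with $y_i < x_i'\beta_\tau$ (over-prediction). We write $\beta_\tau$ instead of β because different choices of τ yield different estimates of β.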

Since the τth conditional quantile of Y given x is given by $Q_\tau(y_i \mid x_i) = x_i'\beta_\tau$, its estimate is given by $\hat{Q}_\tau(y_i \mid x_i) = x_i'\hat{\beta}_\tau$. As one increases τ continuously from 0 to 1, one traces the entire conditional distribution of Y, conditional on x. Note that the various quantile regression estimates are correlated. The parameter estimates in quantile regression models have the same interpretation as those of any other linear model, namely as rates of change. Therefore, in a similar way to the OLS model, the coefficient $\beta_i(\tau)$ of the quantile regression model can be interpreted as the rate of change of the τth quantile of the dependent variable distribution per unit change in the value of the ith regressor [14].

Advantages of Quantile Regression
Quantile regression is a useful model if the interest is in conditional quantile functions. Its main advantage over ordinary least squares regression is that its estimates are more robust against outliers. More broadly, quantile regression supports different kinds of measures, including measures of central tendency and statistical dispersion, which help to obtain a more comprehensive analysis of the relationship between variables [10]. Unlike ordinary least squares regression, quantile regression does not assume a particular distribution for the response, nor does it assume a constant variance for the response, so it offers considerable model robustness. It also allows us to study the impact of predictors on different quantiles of the response distribution, and thus provides a complete picture of the relationship between the dependent and explanatory variables. Quantile regression is also flexible because it does not involve a link function or a distributional assumption (such as the normal or Poisson distribution) that relates the variance and the mean of the response variable. The BMI considered in this study is a continuous outcome, which is the type of response to which quantile regression is applied.
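The robustness claim can be illustrated in the simplest setting, a location estimate without covariates, where the OLS analogue is the sample mean and the median-regression analogue is the sample median. The sketch below uses Python's standard `statistics` module and made-up numbers (not survey data) to show how one extreme value moves each summary.

```python
from statistics import mean, median

# Hypothetical BMI-like values; the second list replaces one value with an outlier.
clean = [14.0, 15.0, 16.0, 17.0, 18.0]
contaminated = [14.0, 15.0, 16.0, 17.0, 48.0]


def location_shift(before, after):
    """Return how far the mean and the median move between two samples."""
    return {
        "mean": mean(after) - mean(before),
        "median": median(after) - median(before),
    }


shift = location_shift(clean, contaminated)
print(shift)  # the mean moves with the outlier, the median does not
```

This shows robustness of the median in the response direction only; it says nothing about unusual values of the predictors.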

RESULTS
The quantile regression model was applied to the 2016 Ethiopian DHS data, and the results are discussed herein. The SAS QUANTREG procedure was used for model fitting. Table 2 shows that 51% of the children were males and the remaining 49% were females. The largest share of the children were from Oromia (15.4%), followed by the Somali (13.4%) and SNNPR (12.5%) regions. More than three-fourths of the children live in rural areas (81.9%). About 53.8% of them were from underprivileged (poor) families. About 42.1 percent of the children had average weight at birth. Children whose birth weight is less than 2.5 kilograms, or children reported to be "very small" or "smaller than average", have a higher than average risk of early childhood death [8]. The sample consists of children for whom weight and height measurements were observed.

Quantile Regression Analysis
Table 3 shows the estimates and the significance of the parameters across quantile levels. At the 0.05 quantile, current age of mother, mother's BMI, region (Addis Ababa, SNNPR and Somali), and child size at birth (average and large) were found to have a significant effect on under-five children's BMI. At the 0.5 quantile, current age of child, current age of mother, mother's BMI, region (Addis Ababa, Afar, Dire Dawa, Gambela, SNNPR, Somali), place of residence, wealth index (poor and middle) and weight of child at birth (average and large) were found to have a significant effect on under-five children's BMI. Similarly, at the 0.85 quantile, current age of child, mother's current age, mother's BMI, region (Addis Ababa, Dire Dawa, Oromia, SNNPR and Somali) and weight of child at birth (average and large) were found to significantly affect under-five children's BMI. The findings at the other quantile levels (0.25, 0.75 and 0.95) are also reported in Table 3.
At the 0.5 quantile, intercept = 14.61, which is the predicted value of the 0.5 quantile of under-five children's BMI when all the explanatory variables are zero. $\hat{\beta}_3$(0.5 quantile) = 0.07 indicates the rate of change of the 0.5 quantile (Q2) of the dependent variable distribution per unit change in the value of the third regressor (mother's BMI), keeping all the other explanatory variables constant. In other words, the Q2 regression coefficient indicates that the median of under-five children's BMI increases by 0.07 for every one-unit increase in mother's BMI, holding all the other explanatory variables constant. Q2 is a value that has 50% of the observations smaller than or equal to it.
At the 0.75 quantile, intercept = 15.93, which is the predicted value of the 0.75 quantile of under-five children's BMI when all the explanatory variables are zero. $\hat{\beta}_2$(0.75 quantile) = -0.01 indicates the rate of change of the 0.75 quantile (Q3) of the dependent variable distribution per unit change in the value of the second regressor (current age of mother), keeping all the other explanatory variables constant. In other words, the Q3 regression coefficient indicates that the 0.75 quantile of under-five children's BMI decreases by 0.01 for every one-unit increase in current age of mother, holding all the other explanatory variables constant. Q3 is a value that has 75% of the observations smaller than or equal to it; in other words, 25% of the observations are greater than it.
The rate of change of the coefficients across quantile levels for other significant predictors can be interpreted in the same way as above.
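The arithmetic behind this interpretation can be sketched in a few lines of Python. The two coefficients are the values quoted in the text (0.07 for mother's BMI at the 0.5 quantile and -0.01 for mother's current age at the 0.75 quantile); the helper name, the chosen changes in the regressors, and the reading of one unit of mother's age as one year are assumptions made for illustration.

```python
def quantile_shift(coefficient, delta):
    """Predicted change in a conditional quantile of child BMI.

    coefficient: quantile regression coefficient at a given quantile level.
    delta: change in the regressor, with all other regressors held constant.
    """
    return coefficient * delta


# Coefficients quoted in the text; the deltas are hypothetical.
median_shift = quantile_shift(0.07, 5.0)    # mother's BMI higher by 5 units
upper_shift = quantile_shift(-0.01, 10.0)   # mother older by 10 units of age

print(f"change in the 0.5 quantile of child BMI: {median_shift:+.2f}")
print(f"change in the 0.75 quantile of child BMI: {upper_shift:+.2f}")
```

The shift applies to the stated conditional quantile only; a different quantile level has its own coefficient and therefore its own shift.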

Graphical Assessment of the Explanatory Variables
Figs. (1 to 5) present a concise summary of the quantile regression results for the study variables. Each plot depicts one coefficient in the quantile regression model, with the shaded area depicting a 95% pointwise confidence band. In the first panel, the intercept of the model may be interpreted as the estimated conditional quantile function of under-five children's BMI across quantile levels. It is higher in the upper quantiles than in the lower quantiles; the graph shows an upward-sloping line across the quantiles. The second plot shows that current age of child has a negative effect on under-five children's BMI, especially in the upper rather than the lower quantiles. The third plot shows the effect of mother's current age, which is negative across quantile levels. The fourth plot shows that mother's BMI has a positive effect in the upper quantiles; the graph shows an upward-sloping line across the quantiles. The effect of being a female child compared to a male child is negative across the quantiles. Type of place of residence has a positive effect in the middle quantiles. Quantile plots for region, wealth index, working status, religion and weight of child at birth are also presented in Figs. (1-5). The values on the Y-axis in each graph indicate the estimated coefficient of the variable across quantile levels.

DISCUSSION
This paper used quantile regression for the analysis of under-five children's BMI using the 2016 Ethiopian Demographic and Health Survey. The estimates across quantile levels allow us to study the impact of predictors on different quantiles of the response variable, and thus provide a complete picture of the relationship between the dependent and independent variables. Current age of child (except at the 0.05 quantile), current age of mother (except at the 0.05 and 0.95 quantiles), mother's BMI, region (SNNPR and Somali) and weight of child at birth (average and large) were found to be important variables significantly affecting under-five children's BMI across the quantile levels.

CONCLUSION
The findings of this study indicate that studying BMI is still an important issue among children under five years of age in Ethiopia.In addition, the findings of the study show that not only education but also environmental and socioeconomic factors were found to have significant effects on under-five children's BMI.Improving the nutritional status of mothers will consequently improve the nutritional status of their children.Policy makers need to focus on the influence of significant factors across all quantile levels to develop strategies of enhancing normal or healthy weight status of under-five children in Ethiopia.
The views expressed in this publication are those of the author(s) and not necessarily those of AAS, NEPAD Agency, Wellcome Trust or the UK government.

Fig. (2). Quantile processes with 95% confidence bands for sex of a child and place of residence.

Fig. (5). Quantile processes with 95% confidence bands for educational level, marital status and size of a child at birth.

At the 0.25 quantile, intercept = 13.27, which is the predicted value of the 0.25 quantile of under-five children's BMI when all the explanatory variables are zero. $\hat{\beta}_1$(0.25 quantile) = -0.19 indicates the rate of change of the 0.25 quantile (Q1) of the dependent variable distribution per unit change in the value of the first regressor (current age of child), keeping all the other explanatory variables constant. In other words, the Q1 regression coefficient indicates that the 0.25 quantile of under-five children's BMI decreases by 0.19 for every one-unit increase in current age of child, holding all the other explanatory variables constant. Q1 is a value that has 25% of the observations smaller than or equal to it.