An Explicit Research Methodology Suitable for Studying Factors Influencing Under-Five Child Mortality in Four East African Nations

RESEARCH ARTICLE

The Open Public Health Journal, 10 Jul 2025. DOI: 10.2174/0118749445407557250706090014

Abstract

Introduction

Under-five mortality remains a significant public health challenge in East Africa, exacerbated by socioeconomic, environmental, and health systems inequities. This paper aims to provide a transparent and replicable methodology for studying the determinants of child mortality.

Method

A pragmatic, deductive research approach was followed, structured by the research onion model. The resulting framework combines fixed effects models, additive regression, Cox models, Bayesian meta-analysis, and time series analysis, applied to DHS survey data and World Bank panel data.

Results

Due to the methodological nature of this research, no empirical results are reported. However, the article provides processes for variable selection, data harmonization, model justification, and validation that should be carried out in further empirical research.

Discussion

The proposed method encourages transparency, reproducibility, and generalizability across data structures. It seeks to reinforce evidence-based child survival policy-making, strengthening the methodological quality of international health comparisons.

Conclusions

The structured methodology supports both cross-sectional and longitudinal studies of under-five mortality, and it can inform future research and intervention strategies toward the attainment of SDG 3.2 in East Africa.

Keywords: Under-five mortality, Panel data, DHS, Bayesian meta-analysis, Fixed effects, Research onion, East Africa.

1. INTRODUCTION

Under-five mortality is a key indicator of community well-being and is used to track family health, population safety, socioeconomic status, and quality of life, as measured by the life expectancy of a given population [1]. Improving child survival, a key aspect of family health, has been among the targets of national development programmes in recent decades. Many countries that reported high under-five mortality rates in the early 1990s, including Uganda, have since defied the odds and shown a substantial decline [2], [3], [4]. By 2015, Bangladesh had reduced its under-five mortality rate by two-thirds relative to 1990 [5]. In sub-Saharan Africa, Rwanda has seen a 60 percent improvement in child survival. A reduction of 67 per 1,000 live births was adopted at the 1990 World Summit for Children [5]. Many researchers have documented the importance of studying the determinants of under-five mortality around the world, which is crucial for policy interventions and for achieving Sustainable Development Goal 3.2 by 2030 [5], [6].

Numerous studies have investigated the factors influencing under-five mortality using Demographic and Health Survey (DHS) data, often following the foundational analytical approach of Mosley and Chen [7], which divides determinants into proximate and distal categories. For example, logistic regression models have been used to study rural-urban disparities [3], while Bayesian survival models have been employed to estimate the effects of household and community factors [4]. More advanced techniques, such as multilevel modelling, survival analysis, Bayesian meta-analysis, and dynamic panel regression, are now being used to handle data complexity and heterogeneity [8], [9]. However, few studies have attempted to combine these methodologies into a coherent methodological framework that is transparent, reproducible, and tailored to comparative regional studies. Furthermore, while the research onion model of Saunders et al. [10] provides a structured decision-making framework for study design, its use in epidemiological studies, particularly child mortality research in East Africa, is limited. This paper aims to fill that gap by explicitly employing the research onion framework to guide methodological decisions for investigating under-five mortality using DHS and panel data from four East African nations.

Many well-documented studies have examined the factors associated with under-five mortality, applying different statistical models to Demographic and Health Survey (DHS) data from around the world and adopting the conceptual framework of Mosley and Chen [7]; some of these studies use panel data. However, little research on the determinants of under-five mortality explicitly defines a methodological framework for public health studies or adopts the research ‘onion’ model.

Research methodology is an important part of any study, playing a crucial role in ensuring systematic consistency between the selection of tools, techniques, or methods and the underlying philosophy [10], [11], [12]. The primary objective of this paper is to provide a brief overview of the research ‘onion’ model and offer explicit guidance for health science analysts or researchers studying factors associated with under-five mortality in four East African countries.

2. METHOD AND MATERIAL

The research ‘onion’ model provides a summary of the key issues that researchers need to consider before undertaking any research. We consider the six layers, or stages, described by Saunders et al. [10], [11], [12]. Fig. (1) below shows the layers of the research onion model.

Fig. (1).

Adapted from Saunders et al. [10], this figure illustrates six methodological layers that guide the structuring of a research design.

The outermost layer is the research philosophy, which determines the fundamental framework for the entire research process [13]. It refers to the researchers' worldview, or the lens through which they understand the nature of reality [10], [12]. The most common research philosophies include positivism (objective reality that can be measured and observed, often in quantitative research); interpretivism (seeking to understand social reality from the perspective of those involved, often in qualitative research); realism (combining objective and subjective perspectives); and pragmatism (focusing primarily on practical outcomes, which may guide the choice of research method) [10], [11], [12], [13], [14].

The next layer is the research approach, which describes how the research will develop theory or answer questions [14]. The deductive approach begins with a theory, proposes a hypothesis, and finally collects data to test that hypothesis [12], [15]. In contrast, the inductive approach begins with data collection, seeks patterns, and develops a theory based on these patterns. The abductive approach iterates between data and theory and can be especially beneficial in mixed-methods research [10], [11].

Next is the research methodological choice layer, which comprises the overarching strategy of the research [15]. In this decision, one chooses between mono-method (only one data type is collected, either quantitative or qualitative), mixed-methods (both qualitative and quantitative techniques), or multi-method approaches (multiple techniques of the same type, such as several qualitative methods) [11], [12], [15].

The fourth layer is the research strategy, where the researcher decides on the specific plan to answer the research question [12], [16], [17]. Options include surveys, case studies, experiments, ethnography, grounded theory, action research, and archival research, each of which has specific strengths that determine its applicability to the research objectives and questions [17].

The fifth layer is the time horizon, which explains the temporal scope of the study [12], [17]. A cross-sectional study captures data at a single point in time (a snapshot), while a longitudinal study tracks changes over a longer period [18].

The last layer of the research onion relates to data collection and analysis [10], [18]. In this step, researchers select the specific tools and techniques that they will use for data collection and analysis [10], [11], [12], [18]. Quantitative research utilizes structured questionnaires, statistical tests, descriptive statistics, and statistical models [18], [19], [20], [21]. Qualitative research employs, for example, interviews, focus groups, and thematic analysis. Mixed-methods research uses both qualitative and quantitative methods of data collection and analysis [19], [21].

3. RESULTS

The research onion model is a conceptual framework developed by Saunders et al. [10]. This study adopts this framework to guide researchers in their methodology for studying factors associated with under-five mortality. This conceptual framework includes six concentric layers, each representing a different component of the research process:

The outer layer represents the philosophical orientation that underpins the research; here, problem-solving is treated as pragmatic and adaptable [16]. The second layer is the research approach, concerned with testing theories that rely on previous literature [10], [16]. The third layer is the methodological choice, here quantitative methods for both data collection and data analysis. The fourth layer is the strategy, here record review, which involves reviewing existing records or documents. The fifth layer reflects the time horizon: studies are executed either at one point in time (snapshot or cross-sectional) or over time (longitudinal) [16]. The innermost layer focuses on the methods of data collection and analysis, with a strong emphasis on secondary data and statistical models.

Researchers can now use a redesigned methodological architecture that aligns the six key levels of study design with under-five mortality studies. Fig. (2) shows an updated onion model that considers the intricacy of panel and survey data from several East African countries. It provides insight into each layer of the onion, from philosophical considerations to analytical tools.

Fig. (2).

Onion model to study factors associated with child mortality.

3.1. Layer 1: Philosophical World View

In this research, a pragmatic worldview is adopted (as shown in Fig. 2), which allows for choosing methods and strategies based on what best solves the research problem. This perspective promotes the application of several quantitative methods to validate hypotheses and generate manageable solutions [18], [21], [22]. Because this study focuses on the statistical investigation of East African child mortality using survey data and panel data, it follows that a pragmatic approach can align with both the complexity of the research questions and the necessity of an evidence-based, rigorous approach to the data itself [21].

3.2. Layer 2: Research Approach

This study employs a deductive research approach. We start by stating hypotheses, for example that certain socioeconomic and environmental factors, such as sanitation, maternal education, and economic conditions, play a significant role in child mortality rates [18], [22], [23]. The researcher then tests these hypotheses by analysing data from several surveys across several East African countries and uses statistical models to confirm or reject the suggested relationships [23], [24], [25], [26].

3.3. Layer 3: Methodological Choice

The study, therefore, maintains a strictly quantitative methodological approach. This is a statistical approach that uses numeric data to test relationships among several factors and child mortality [18], [19]. We can then use regression models, analysis of variance (ANOVA), and hybrid models to analyze the data and estimate the effect of maternal education level, sanitation conditions, and economic conditions on child mortality.
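As an illustration of the ANOVA step named above, the following minimal sketch computes a one-way F statistic using only the Python standard library. The function name `one_way_anova_f` and the regional mortality figures are hypothetical and are not drawn from any DHS or World Bank dataset; a real analysis would also obtain a p-value from the F distribution with a statistical package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of observations per group
    (for example, under-five mortality rates by region).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical mortality rates (per 1,000 live births) in three regions.
example_groups = [[61.0, 62.0, 63.0], [62.0, 63.0, 64.0], [65.0, 66.0, 67.0]]
f_value, df1, df2 = one_way_anova_f(example_groups)
```

A large F value relative to the F distribution with `df1` and `df2` degrees of freedom would suggest that mean mortality differs between at least two of the groups.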

3.4. Layer 4: Strategy

An archival strategy is adopted, as this study considers secondary data that is freely available online [18]. Multiple demographic and health surveys obtained from the DHS program and panel data from the World Bank website, as well as academic sources, will be used in this study. These datasets possess some strengths and limitations [9], [25], [27]. To analyse such data efficiently, in the research strategy, some statistical techniques have been adopted:

• Regression models are used to quantify the relationships between child mortality and independent variables, such as sanitation, maternal education, and economic status. Such models estimate the strength and direction of these relationships [23].

• ANOVA (Analysis of Variance) is used to assess whether mean child mortality differs significantly between groups (such as regions or income groups).

• Hybrid models combine several statistical techniques, thereby improving the robustness of the analysis. Hybrid models combine techniques such as additive regression and fixed effects regression models to enhance prediction accuracy and offer a comprehensive understanding of the underlying factors contributing to child mortality [23].

We also conduct time-series analysis to account for cross-sectional and temporal variation in the data. This strategy enables the study to detect patterns of change over time and across countries, accounting for both differences between countries and trends within them.
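The fixed effects component of the strategy above can be sketched with the within (demeaning) transformation, which removes each country's time-invariant level before estimating a slope. This is a simplified single-regressor illustration under assumed data, not the full hybrid model; the function name `within_estimator` and all numbers are hypothetical.

```python
from collections import defaultdict


def within_estimator(rows):
    """Estimate a fixed effects slope by demeaning within each country.

    `rows` is a list of (country, x, y) tuples, for example
    (country, sanitation access in %, under-five mortality rate).
    """
    by_country = defaultdict(list)
    for country, x, y in rows:
        by_country[country].append((x, y))

    sxy = 0.0
    sxx = 0.0
    for obs in by_country.values():
        x_bar = sum(x for x, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            # Deviations from the country mean remove country-level effects.
            sxy += (x - x_bar) * (y - y_bar)
            sxx += (x - x_bar) ** 2
    return sxy / sxx


# Hypothetical panel: two countries with different baseline mortality levels
# but the same within-country association (slope of -2).
example_rows = [
    ("A", 1.0, 98.0), ("A", 2.0, 96.0), ("A", 3.0, 94.0),
    ("B", 10.0, 30.0), ("B", 11.0, 28.0), ("B", 12.0, 26.0),
]
slope = within_estimator(example_rows)
```

Because the country means are subtracted first, the estimate reflects changes within countries over time and is not driven by level differences between them.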

3.5. Layer 5: Time-Horizon

The study combines both cross-sectional and longitudinal time horizons to examine child mortality in East Africa. The cross-sectional component draws on several survey data sets, each collected at a different point in time. Each survey enables the examination of disparities in child mortality across geographical and demographic strata at a single time point, resulting in a snapshot of the factors associated with mortality.

The study utilizes panel data, which enables the assessment of changes in child mortality over time within the same group of countries. The longitudinal approach is particularly useful in identifying trends and measuring changes in the impact of socioeconomic and environmental factors over time by analyzing data for the same countries at several time points. This timeframe is significant because sanitation, health expenditures, and gross domestic product have long-term impacts on child mortality, which helps explain the phenomenon.

3.6. Layer 6: Techniques and Procedures

This study encapsulates a set of techniques and procedures to ensure thorough analysis of the data:

• Data are collected from multiple sources, drawing on a wide array of survey and panel data that include several variables pertinent to child mortality.

• Regression models, ANOVA, and hybrid methods are used for statistical analysis. These methods make it possible to quantify associations and evaluate potentially causal factors associated with child mortality.

• Time-series analysis is used to track trends in child mortality over time.

• To achieve a more comprehensive understanding of the determinants of child mortality, Bayesian Meta-Analysis will be employed to pool the results from various surveys, considering the uncertainty in the data.
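One common way to formalise the Bayesian meta-analysis listed above is a normal-normal hierarchical model. The specification below is a sketch under stated assumptions: the symbols ($\hat{\theta}_k$ for the effect estimate from survey $k$, $s_k$ for its standard error, $\mu$ for the pooled effect, $\tau$ for the between-survey standard deviation) and the particular weakly informative priors are illustrative choices, not the authors' specification.

```latex
\begin{align*}
\hat{\theta}_k \mid \theta_k &\sim \mathcal{N}\!\left(\theta_k,\; s_k^{2}\right),
  & k &= 1, \dots, K \quad \text{(sampling model for survey $k$)} \\
\theta_k \mid \mu, \tau &\sim \mathcal{N}\!\left(\mu,\; \tau^{2}\right)
  & & \text{(between-survey heterogeneity)} \\
\mu &\sim \mathcal{N}\!\left(0,\; 10^{2}\right), \qquad
\tau \sim \text{Half-Normal}(1)
  & & \text{(assumed weakly informative priors)}
\end{align*}
```

Under this formulation, uncertainty in each survey estimate enters through $s_k$, while $\tau$ captures how much the true effects vary across survey rounds and countries.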

3.7. Operationalization of Variables

The current study adopts a multi-layered technique based on the research onion model, with the innermost layer focusing on data collection and analysis for dependent and independent variables. The considered data comes from two sources: aggregated panel values from the World Bank and disaggregated values retrieved from DHS data in four East African nations. As a result, the outcome and explanatory variables must be defined differently to correspond to the subsequent analytical technique.

3.8. Dependent Variable

At the core of the data analysis layer, the dependent variable in the panel data is the under-five mortality rate, defined as the number of deaths among children under the age of five per 1,000 live births. It is a continuous outcome variable calculated annually and provided at the national level. In contrast, the DHS data's dependent variable is the child survival status, which is a dummy variable equal to one if the child died before the age of five and zero if the child lived. It is extracted from the DHS dataset's birth history module, allowing for the use of logistic and survival models in the study.
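The DHS outcome described above can be sketched as a small function that returns both the death indicator and the observed follow-up time, so that the same record can feed logistic or survival models. The argument names mirror the standard DHS recode variables B5 (child is alive), B7 (age at death in months), B3 (date of birth, century month code), and V008 (date of interview, century month code); treating the inputs in this form, and censoring at 60 months, are assumptions of this illustration. The function name `under_five_outcome` is hypothetical.

```python
def under_five_outcome(alive, age_at_death_months, birth_cmc, interview_cmc):
    """Return (died_before_five, months_observed) for one birth record.

    died_before_five is 1 if the child died before 60 months, else 0.
    months_observed is the follow-up time, capped at 60 months.
    """
    if alive:
        # Child alive at interview: censored at current age or at 60 months.
        return 0, min(interview_cmc - birth_cmc, 60)
    if age_at_death_months is None:
        raise ValueError("age at death is required for a deceased child")
    if age_at_death_months < 60:
        return 1, age_at_death_months
    # Death at or after the fifth birthday is not an under-five death.
    return 0, 60


# Hypothetical records.
died_at_7_months = under_five_outcome(0, 7, 1300, 1400)
alive_aged_10_months = under_five_outcome(1, None, 1390, 1400)
```

Note that a child who is alive but younger than five at the interview has not yet completed the exposure period; keeping the follow-up time alongside the indicator lets a survival model treat such records as censored.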

3.9. Independent Variables

The study employs many independent variables, which are consistent with the methodological choices stated in Layer 3 of the onion (quantitative strategy) and guided by the research strategy in Layer 4 (archival analysis of secondary datasets). Independent variables are included in both datasets, selected based on established literature linking them to child health outcomes, as shown in Table 1 below.

Table 1.
Variables that can be considered when studying factors associated with Under-five Mortality using both Panel and DHS data.
| Variable | Description | Type |
| --- | --- | --- |
| Age of a mother | Age group of the mother | Categorical |
| The sex of a child | Sex of the child | Categorical |
| Mothers’ education level | Maternal education level | Categorical |
| Breastfeeding status | Breastfeeding status of the child | Binary |
| Household Wealth | Wealth index quintiles | Ordinal |
| Sanitation | Access to improved sanitation (%) | Continuous |
| Health expenditure | Per capita health expenditure | Continuous |
| Gross Domestic Product | GDP per capita | Continuous |
| Country | Country indicator | Categorical |
| Time | Time in years | Continuous |

To ensure comparability and analytical rigour, properly defined and recoded variables are harmonised across datasets [24], [25]. For example, if the definition of a variable varies between DHS survey years or nations, conventional recoding and categorisation methods are used. Furthermore, panel data variables are log-transformed or differenced as needed to indicate temporal change, and DHS binary or categorical variables are recoded to guarantee consistency across survey waves. This approach supports the development of a model that more broadly generalises the findings and enables the use of macro-level panel regression models as well as micro-level logistic, survival, and hybrid models across various East African contexts.
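The two harmonisation steps mentioned above, transforming panel series and recoding categorical survey variables, can be sketched as follows. The category labels, the integer codes in `EDUCATION_RECODE`, and both function names are hypothetical; actual DHS labels differ by survey and must be checked against each survey's documentation.

```python
import math

# Hypothetical common coding scheme for maternal education across survey waves.
EDUCATION_RECODE = {
    "no education": 0,
    "primary": 1,
    "secondary": 2,
    "higher": 2,  # assumed: collapsed into "secondary or higher"
}


def recode_education(label):
    """Map a survey-specific education label to the common integer code."""
    return EDUCATION_RECODE[label.strip().lower()]


def log_difference(series):
    """Return first differences of the log of a yearly series.

    `series` is a list of (year, value) pairs with positive values,
    for example GDP per capita for one country. The result is a list of
    (year, log(value_t) - log(value_{t-1})) pairs, ordered by year.
    """
    ordered = sorted(series)
    logs = [(year, math.log(value)) for year, value in ordered]
    return [
        (logs[i][0], logs[i][1] - logs[i - 1][1])
        for i in range(1, len(logs))
    ]


# Hypothetical GDP per capita values for one country.
growth = log_difference([(2001, 110.0), (2000, 100.0), (2002, 121.0)])
```

Log differences approximate year-on-year growth rates, which is one way the panel variables could express temporal change on a comparable scale across countries.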

4. DISCUSSION

This paper provides a collection of statistical models suitable for assessing under-five mortality using DHS and panel data from East African countries [26], [27]. For DHS data, an extended Cox proportional hazards model with a frailty term (the Cox frailty model) is preferable because it can handle time-to-event outcomes while accounting for unobserved heterogeneity between clusters or provinces [27]. The frailty term is commonly assumed to follow a normal distribution and captures the random effects responsible for intra-cluster correlation. The model is well suited to survival outcomes such as age at child death, which is commonly provided in DHS birth records [26], [27]. Diagnostic tools such as Schoenfeld residuals and log-minus-log survival plots should be employed to evaluate the proportional hazards assumption [28].
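For concreteness, the Cox frailty model described above can be written as follows. The notation is an assumption of this sketch: $h_{ij}(t)$ is the hazard of death at age $t$ for child $i$ in cluster $j$, $h_0(t)$ is the baseline hazard, $\mathbf{x}_{ij}$ is the covariate vector, and $w_j$ is the cluster-level frailty on the log-hazard scale.

```latex
\begin{align*}
h_{ij}(t \mid w_j) &= h_0(t)\,\exp\!\left(\mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + w_j\right),
  \qquad w_j \sim \mathcal{N}\!\left(0,\; \sigma^{2}\right) \\[4pt]
\log\!\left[-\log S_{ij}(t \mid w_j)\right]
  &= \log H_0(t) + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + w_j,
  \qquad H_0(t) = \int_0^{t} h_0(u)\,du
\end{align*}
```

The second line shows why log-minus-log plots are informative: under proportional hazards, the curves for different covariate groups differ only by a constant vertical shift, so they should appear roughly parallel.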

Fixed effects analysis is recommended for panel data to account for unobserved, time-invariant differences between nations, which are common in policy and health event analysis [29], [30]. Generalized additive models (GAMs) are used to account for nonlinear relationships between variables, such as the relationship between GDP and the under-five mortality rate [31], [32]. Combining modelling techniques, such as fixed effects and GAMs, is recommended because the combination accommodates both smooth functional relationships and fixed heterogeneity [33]. A Bayesian meta-analysis approach is proposed to pool effect measures from many DHS survey rounds and countries. This approach accounts for between-study variation and uncertainty, using informative or weakly informative priors [34]. The posterior distribution is estimated using MCMC techniques, and convergence is assessed using diagnostics such as trace plots and the Gelman-Rubin statistic [34].
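The Gelman-Rubin convergence diagnostic mentioned above compares the variance between MCMC chains with the variance within them. The sketch below implements the basic (non-split) version of the statistic for a single parameter, assuming chains of equal length; the function name `gelman_rubin` and the example draws are hypothetical.

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Return the potential scale reduction factor (R-hat) for one parameter.

    `chains` is a list of equal-length lists of posterior draws,
    one list per MCMC chain (at least two chains, two draws each).
    """
    n = len(chains[0])
    chain_means = [mean(chain) for chain in chains]

    # W: average within-chain sample variance.
    within = mean([variance(chain) for chain in chains])
    # B: between-chain variance, scaled by the chain length.
    between = n * variance(chain_means)

    # Pooled estimate of the posterior variance.
    var_hat = (n - 1) / n * within + between / n
    return (var_hat / within) ** 0.5


# Hypothetical draws: two chains stuck in different regions of the posterior.
r_hat_poor = gelman_rubin([[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]])
```

Values close to 1 are consistent with convergence, whereas values well above 1, as in the example with separated chains, indicate that the chains have not mixed.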

For missing records, multiple imputation using chained equations is recommended, particularly for DHS variables such as maternal education and wealth index. Furthermore, DHS survey weights are applied to account for unequal probabilities of selection and non-response, yielding nationally representative and unbiased estimates [35]. Short gaps in panel records may be filled using country-specific historical trends [8]. Major predictors, such as maternal age, urban/rural status, and regional inequalities, should also be included or adjusted for in the model to minimize omitted-variable bias. These models can be utilized in future research projects, offering a powerful and adaptable statistical framework for investigating the drivers of under-five mortality in East Africa.
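The effect of applying survey weights can be illustrated with a weighted proportion. In DHS recode files the sample weight (for example V005) is stored as an integer with six implied decimal places, so it is divided by 1,000,000 before use. The function name `weighted_proportion` and the records below are hypothetical, and the sketch gives a point estimate only; design-based standard errors would additionally require the cluster and stratum identifiers.

```python
def weighted_proportion(outcomes, raw_weights):
    """Return the survey-weighted proportion of a binary outcome.

    `outcomes` holds 0/1 values (1 = child died before age five) and
    `raw_weights` holds the stored integer sample weights.
    """
    weights = [w / 1_000_000 for w in raw_weights]
    weighted_events = sum(w * y for w, y in zip(weights, outcomes))
    return weighted_events / sum(weights)


# Hypothetical records: the unweighted proportion is 0.5,
# but the weighted proportion is lower because the first death
# comes from an over-sampled (down-weighted) group.
example_outcomes = [1, 0, 0, 1]
example_weights = [500_000, 1_500_000, 1_000_000, 1_000_000]
estimate = weighted_proportion(example_outcomes, example_weights)
```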

5. LIMITATIONS

Several significant constraints exist when applying statistical models to observational data, such as the Demographic and Health Surveys used in this study. First, such data cannot be used to eliminate biases caused by unmeasured confounding [35]. Cultural practices, healthcare quality, intra-household decision-making, and many other determinants are not captured in regular DHS surveys, yet they have a significant impact on child survival outcomes. The omission of such variables might result in biased parameter estimates, as the associations that appear in the data may be driven by these unobserved factors rather than true causal effects [27], [35]. Second, data harmonisation between DHS survey years involves significant complexity [24], [25]. The questionnaire structure, variable definitions, and sampling frames vary over time. For example, the definition of access to improved sanitation may differ between survey rounds, resulting in measurements that are not fully comparable.

Furthermore, country-specific sampling strategies, such as stratification procedures, cluster selection, and urban-rural oversampling, may alter the results, meaning that some variation is due to the survey design rather than the measured variables [26]. These difficulties have a significant impact on the model's robustness and generalisability, since the results may be influenced by methodological discrepancies rather than true similarities and differences between countries. Thus, failing to address these limitations diminishes the validity of statistical inference and renders the conclusions inapplicable to other contexts or policy applications [36], [37].

CONCLUSION

This study presents a comprehensive and semi-structured methodology for analysing under-five mortality in East Africa using the research onion framework. Underpinned by the pragmatic paradigm and deductive approach, the integration of philosophical perspectives, research strategies, and statistical methodologies results in an integrated and transparent instrument that other public health researchers may readily adopt. By embracing various quantitative models, such as hybrid regression, Bayesian meta-analysis, extended Cox models, and time series, the methodology provides the flexibility needed for application to a wide range of data types. Specifically, the inclusion of both cross-sectional and panel data provides an opportunity to investigate historical trends and country-specific patterns in more detail. However, the methodology acknowledges significant drawbacks, such as the risk of unmeasured confounding, cross-country diversity in sampling strategies, and the challenge of harmonising diverse surveys. In the future, this methodological paradigm can be used to analyze DHS datasets from Kenya, Rwanda, Tanzania, and Uganda. Such investigations will utilize extended Cox and hybrid regression models to illuminate the impact of socioeconomic variables on child mortality. Furthermore, they will conduct model diagnostics, uncertainty measurement, and simulation experiments with groups of children, resulting in policy-relevant findings and supporting the monitoring of SDG 3.2.

AUTHOR CONTRIBUTION

It is hereby acknowledged that all authors have accepted responsibility for the manuscript's content and consented to its submission. They have meticulously reviewed all results and unanimously approved the final version of the manuscript.

LIST OF ABBREVIATIONS

ANOVA = Analysis of Variance
DHS = Demographic and Health Survey
FE = Fixed Effects
GDP = Gross Domestic Product
GAM = Generalized Additive Model
MCMC = Markov Chain Monte Carlo
SDG = Sustainable Development Goals
U5MR = Under-Five Mortality Rate

ETHICS APPROVAL AND CONSENT TO PARTICIPATE

Not applicable.

HUMAN AND ANIMAL RIGHTS

Not Applicable.

CONSENT FOR PUBLICATION

Not applicable.

AVAILABILITY OF DATA AND MATERIAL

All data generated or analyzed during this study are included in this published article.

FUNDING

None.

CONFLICT OF INTEREST

The authors declare no conflict of interest, financial or otherwise.

ACKNOWLEDGEMENTS

Declared none.

REFERENCES

[1] Tulchinsky TH, Varavikova EA. Measuring, monitoring, and evaluating the health of a population. The New Public Health 2014; 91.
[2] Ayiko R, Antai D, Kulane A. Trends and determinants of under-five mortality in Uganda. East Afr J Public Health 2009; 6(2): 136-40.
[3] Ettarh R, Kimani J. Determinants of under-five mortality in rural and urban Kenya. Rural Remote Health 2012; 12(1): 1812.
[4] Nasejje JB, Mwambi HG, Achia TNO. Understanding the determinants of under-five child mortality in Uganda including the estimation of unobserved household and community effects using both frequentist and Bayesian survival analysis approaches. BMC Public Health 2015; 15(1): 1003.
[5] Gribble JN, Preston SH. Goals of the World Summit for Children and their implications for health policy in the 1990s. The Epidemiological Transition: Policy and Planning Implications for Developing Countries: Workshop Proceedings 1993.
[6] Kumar S, Kumar N, Vivekadhish S. Millennium development goals (MDGS) to sustainable development goals (SDGS): Addressing unfinished agenda and strengthening sustainable development and partnership. Indian J Community Med 2016; 41(1): 1-4.
[7] Mosley WH, Chen LC. An analytical framework for the study of child survival in developing countries. Popul Dev Rev 1984; 10: 25-45.
[8] Moler-Zapata S, Kreif N, Ochalek J, Mirelman AJ, Nadjib M, Suhrcke M. Estimating the health effects of expansions in health expenditure in Indonesia: a dynamic panel data approach. Appl Health Econ Health Policy 2022; 20(6): 881-91.
[9] Hsiao C. Benefits and limitations of panel data. Econom Rev 1985; 4(1): 121-74.
[10] Saunders M, Lewis P, Thornhill A. Research methods for business students. 2003.
[11] Saunders M, Lewis P, Thornhill A. Research methods for business students. 2009.
[12] Saunders MN, Lewis P, Thornhill A, Bristow A. Understanding research philosophy and approaches to theory development. 2015. Available from: https://www.researchgate.net/citation/309102603_Understanding_research_philosophies_and_approaches
[13] Mbanaso UM, Abrahams L, Okafor KC. Research philosophy, design and methodology. Research Techniques for Computer Science, Information Systems and Cybersecurity 2023.
[14] Bianchi L. Exploring ways of defining the relationship between research philosophy and research practice. Jo Emergent Science 2021; 20: 32.
[15] Arbale H, Mutisya DN. Book Review: “Research Methods for Business Students” (Eighth Edition) by Mark N. K. Saunders, Philip Lewis, and Adrian Thornhill (Pearson Education, 2019). African Quarter Social SciRev 2024; 1(2): 8-21.
[16] Melnikovas A. Towards an explicit research methodology: adapting research onion model for futures studies. J Futures Stud 2018; 23(2).
[17] Armstrong M. Design layers. The Students' Guide to Learning Design and Research 2020.
[18] Alturki R. Research onion for smart IoT-enabled mobile applications. Sci Program 2021; 2021(1): 1-9.
[19] Willson S, Miller K. Data collection 2014.
[20] Sahay A. Peeling Saunder’s research onion. Research Gate Art 2016; 3(2): 1-5.
[21] Yilmaz K. Comparison of quantitative and qualitative research traditions: Epistemological, theoretical, and methodological differences. Eur J Educ 2013; 48(2): 311-25.
[22] Pearl J. The deductive approach to causal inference. J Causal Inference 2014; 2(2): 115-29.
[23] Machila N, Sompa M, Muleya G, Pitsoe V. Teachers understanding and attitudes towards inductive and deductive approaches to teaching social sciences. Multid J Lang Soc Sci Edu 2018; 1(2): 120-37.
[24] Tomescu-Dubrow I, Wolf C, Slomczynski KM, Jenkins JC, Eds. Survey Data Harmonization in the Social Sciences 2023.
[25] Romero HL, Dijkman RM, Grefen PWPJ, van Weele AJ, de Jong A. Measures of process harmonization. Inf Softw Technol 2015; 63: 31-43.
[26] Boerma JT, Sommerfelt AE. Demographic and health surveys (DHS): Contributions and limitations. World Health Stat Q 1993; 46(4): 222-6.
[27] Ayele DG, Zewotir TT, Mwambi H. Survival analysis of under-five mortality using Cox and frailty models in Ethiopia. J Health Popul Nutr 2017; 36(1): 25.
[28] Hess KR. Graphical methods for assessing violations of the proportional hazards assumption in cox regression. Stat Med 1995; 14(15): 1707-23.
[29] Allison PD. Fixed effects regression models 2009.
[30] Giesselmann M, Schmidt-Catran AW. Interactions in fixed effects regression models. Sociol Methods Res 2022; 51(3): 1100-27.
[31] Wood SN. Generalized additive models: An introduction with R 2017.
[32] Marra G, Wood SN. Practical variable selection for generalized additive models. Comput Stat Data Anal 2011; 55(7): 2372-87.
[33] von Stosch M, Glassey J. Benefits and challenges of hybrid modeling in the process industries: An introduction. Hybrid Modeling in Process Industries 2018; 1-12.
[34] Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian data analysis 1995.
[35] Zhang Z. Missing data imputation: Focusing on single imputation. Ann Transl Med 2016; 4(1): 9.
[36] Frank KA. Impact of a confounding variable on a regression coefficient. Sociol Methods Res 2000; 29(2): 147-94.
[37] Kesidou E, Narasimhan R, Ozusaglam S, Wong CY. Dynamic openness for network-enabled product and process innovation: A panel-data analysis. Int J Oper Prod Manage 2022; 42(3): 257-79.