An Analysis of Health Insurance Data Using the Directed Acyclic Graph: An Application in Nigeria

Pannapa Changpetch1, *, Mary I. Akinyemi2
1 Department of Mathematics, Faculty of Science, Mahidol University, Bangkok, Thailand
2 Department of Mathematics, University of Lagos, Akoka, Lagos, Nigeria

Creative Commons License
© 2020 Changpetch & Akinyemi.

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

* Address correspondence to this author at Department of Mathematics, Faculty of Science, Mahidol University, Bangkok, Thailand; E-mail:



In this study, we used the total amount of insurance claims from patients in Nigeria as the data to investigate the direct and indirect effects of the diagnoses.


We applied the Directed Acyclic Graph (DAG) with the total amount of the claims for each month for 89 diagnoses using datasets drawn from private insurance companies in Nigeria from January 2015 to September 2016, which provided 21 records for each diagnosis.


The result from DAG showed three pairs of direct effects: (1) Absolute Neutrophil Count (ANC) had a direct effect on appendectomy, (2) Sexually Transmitted Infections (STIs) had a direct effect on caesarean section, and (3) Glaucoma had a direct effect on insomnia.


The most interesting result pertained to the third pair of diagnoses, glaucoma and insomnia, which is pertinent to research worldwide. We not only explored the relationship in a scientific way but also established the direction of the effect, which provides a basis for recommendations for healthcare in Nigeria and worldwide.

Keywords: Diagnosis, Direct effect, Directed acyclic graph, Glaucoma, Insomnia, Insurance data.


In the present study, the Directed Acyclic Graph (DAG) was used for the first time to analyze health insurance data. The objective was to find relationships between 89 diagnoses in health insurance data from Nigeria. Unlike regression analysis, a classical method for exploring relationships between variables, DAG has the key advantage of establishing the direction of the relationships identified.

Established techniques that have been used to analyze insurance data include neural networks [2-6], decision trees [7, 8], association rules [9], Bayesian networks [10], and genetic algorithms [11, 12]. However, these techniques were used primarily for detecting fraud, not for investigations related to healthcare. In addition, researchers have applied naïve Bayes to evaluate the risk to people carrying life insurance [13], and association rules and neural segmentation to detect patterns in pathology services ordered and to classify general practitioners into groups based on the nature and style of their practice [14].

In our application of DAG to health insurance data, the most interesting result among the 89 diagnoses is that glaucoma has a direct effect on insomnia. Research studies undertaken worldwide [15-25] have focused on relationships between these two conditions, but most theories have yet to be subjected to adequate scientific testing [26]. In addition to identifying this relationship, we considered possible reasons for it, including the possibility that the medication used to treat glaucoma may cause insomnia, which is covered in the discussion section.


This study is based on monthly health insurance data from the information technology department of a local Health Management Organization (HMO) in Nigeria. Covering the period from January 2015 to September 2016, the original data consisted of private insurance company and federal (Nigerian Health Insurance Scheme (NHIS)) records. However, we limited our analysis to the data drawn from the private sector because this sector had a much larger number of claims than the federal sector. It should be noted, however, that we did not have access to demographic information, such as the age and sex of the people who had made the claims.

We summarized the data by finding the total monetary amount of the claims made each month for each diagnosis. For the focal period, one U.S. dollar was equal to 727,415 Naira and 727,776 Naira, in 2015 and 2016, respectively. We eliminated the numerous diagnoses whose claims totaled less than 100,000 Nigerian Naira over the focal period, along with a few claims with unclear definitions, which left 89 diagnoses for the analysis. The variables of interest were the total monetary amounts of the claims made each month for each of these 89 diagnoses over the 21-month period from January 2015 to September 2016. There were therefore 21 observations for each of the 89 variables, which are listed in Table 1.
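The summarization step described above can be sketched as follows. This is only an illustration under stated assumptions: the record layout (month, diagnosis, amount), the function names, and all of the amounts are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict

# Hypothetical claim records: (month, diagnosis, amount in Naira).
# All values are invented for illustration only.
claims = [
    ("2015-01", "Glaucoma", 60_000.0),
    ("2015-01", "Glaucoma", 25_000.0),
    ("2015-02", "Glaucoma", 40_000.0),
    ("2015-01", "Insomnia", 70_000.0),
    ("2015-02", "Insomnia", 55_000.0),
    ("2015-01", "Carbuncle", 5_000.0),
]

THRESHOLD = 100_000.0  # Naira, total over the focal period (from the text)


def monthly_totals(records):
    """Sum claim amounts per diagnosis and month."""
    totals = defaultdict(lambda: defaultdict(float))
    for month, diagnosis, amount in records:
        totals[diagnosis][month] += amount
    return {d: dict(m) for d, m in totals.items()}


def drop_small_diagnoses(totals, threshold):
    """Keep diagnoses whose claims over all months reach the threshold."""
    return {d: m for d, m in totals.items() if sum(m.values()) >= threshold}


totals = monthly_totals(claims)
kept = drop_small_diagnoses(totals, THRESHOLD)
print(sorted(kept))
```

Each retained diagnosis then contributes one variable with one value per month, which is the shape of the 21-by-89 data set used in the analysis.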


In this study, we focused on exploring the relationships between diagnoses based on insurance data in a scientific way. We were interested both in discovering any relationships that existed and in determining the directions of those relationships, specifically the direct and indirect effects of one diagnosis on another.

In this section, we construct a Directed Acyclic Graph (DAG) (also referred to as a Bayesian network [27, 28]) to identify the direct and indirect effects on one another of the total monthly claim amounts for the 89 diagnoses, as found in private insurance data from Nigeria for the focal period. A number of algorithms exist for constructing DAGs, falling essentially into three categories: constraint-based algorithms, Greedy Search (GS) algorithms [29], and score-based algorithms [30].
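For reference, a DAG over the variables x1, ..., x89 corresponds to the standard Bayesian-network factorization of their joint distribution. The notation pa(x_i) for the set of parents of x_i in the graph is introduced here for illustration and does not appear in the original text:

```latex
P(x_1, \dots, x_{89}) \;=\; \prod_{i=1}^{89} P\bigl(x_i \mid \mathrm{pa}(x_i)\bigr)
```

An arrow from x_j to x_i means that x_j belongs to pa(x_i); a variable with no parents contributes only its marginal term P(x_i).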

Table 1. Variables representing the total monetary amount of the claims each month for 89 diagnoses.
Variable | Total Amount of Claims Each Month for Diagnosis | Variable | Total Amount of Claims Each Month for Diagnosis | Variable | Total Amount of Claims Each Month for Diagnosis
x1 | Abdominal pain | x31 | Diarrhoea | x61 | Mss pain/myalgia
x2 | Allergy | x32 | Dyspepsia | x62 | Muscle spasms
x3 | Amenorrhea | x33 | Ear wax impaction | x63 | Myopia
x4 | Anaemia | x34 | Enteritis | x64 | Neonatal jaundice
x5 | ANC (absolute neutrophil count) | x35 | Family planning | x65 | Neonatal sepsis
x6 | Appendicectomy (appendectomy) | x36 | Food poisoning | x66 | Normal delivery
x7 | Appendicitis | x37 | Gastritis | x67 | Otitis media
x8 | Arthralgia | x38 | Gastroenteritis | x68 | Ovarian cyst
x9 | Arthritis | x39 | Gingivitis | x69 | Paronychia/whitlow
x10 | Asthma | x40 | Glaucoma | x70 | Pelvic inflammatory disease
x11 | Boil/furunculosis | x41 | Haemorrhoids | x71 | Peptic ulcer disease
x12 | Breast lump | x42 | Heartburn/GERD | x72 | Pharyngitis
x13 | Bronchitis | x43 | Helminthiasis | x73 | Pneumonia
x14 | Burns | x44 | Hepatitis | x74 | Presbyopia
x15 | Caesarean section | x45 | Hernia | x75 | Pterygium
x16 | Carbuncle | x46 | Hyperemesis gravidarum | x76 | Road traffic accident
x17 | Cellulitis | x47 | Hyperlipidemia | x77 | RTI (respiratory tract infection)
x18 | Cervical spondylosis | x48 | Hypertension | x78 | Sepsis
x19 | Chicken pox | x49 | Immunization | x79 | Sprain/fracture
x20 | Circumcision/ear piercing | x50 | Impetigo | x80 | Sexually transmitted infection (STI)
x21 | Cold/catarrh | x51 | Infertility | x81 | Stress
x22 | Colitis | x52 | Injury | x82 | Tension headache
x23 | Conjunctivitis | x53 | Insomnia | x83 | Threatened abortion
x24 | Constipation | x54 | Lipoma NOS | x84 | Tonsillitis
x25 | Consultation/review | x55 | Lower respiratory tract infection (LRTI) | x85 | Typhoid fever/enteric fever
x26 | Coryza/allergic rhinitis | x56 | Lumbar spondylosis | x86 | Urinary tract infection (UTI)
x27 | Cyesis | x57 | Malaria fever/plasmodiasis | x87 | Upper respiratory tract infection (URTI)
x28 | Dental caries | x58 | Measles | x88 | Uterine myoma/fibroid
x29 | Dermatitis | x59 | Menorrhagia | x89 | Vaginal candidiasis
x30 | Diabetes mellitus | x60 | Miscarriage | - | -
Fig. (1). Directed acyclic graph for 89 variables.

We employed an algorithm implemented in the BayesiaLab software. Note that all the variables are discretized in the BayesiaLab implementation. We tried all the search algorithms available in the BayesiaLab software (Maximum Spanning Tree, Taboo, EQ, SopLEQ, and Taboo Order) and found that they all yielded the same results.
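To make the discretization step concrete, the sketch below bins a continuous monthly series into three equal-frequency levels and computes the empirical mutual information between two binned series. This is not the BayesiaLab procedure, whose discretization settings are not given in the text. The binning rule, the number of bins, the function names, and the two series are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def equal_frequency_bins(values, n_bins=3):
    """Assign each value a bin index 0..n_bins-1 by rank (equal-frequency).

    Ties are broken by position, which is adequate for a sketch.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def mutual_information(a, b):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (pa[x] * pb[y]))
        for (x, y), c in pab.items()
    )


# Two hypothetical 21-month series; the second rises with the first.
glaucoma = [1000.0 * (m + 1) for m in range(21)]
insomnia = [500.0 * (m + 1) + 200.0 for m in range(21)]

g_bins = equal_frequency_bins(glaucoma)
i_bins = equal_frequency_bins(insomnia)
mi = mutual_information(g_bins, i_bins)
print(mi)
```

Mutual information is symmetric in its two arguments, so a measure of this kind can indicate that two variables are related but cannot by itself orient the arrow between them.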


In our analysis, the total monetary amounts of claims each month for the 89 diagnoses are represented by x1–x89. The results from DAG are shown in Fig. (1).

In Fig. (1), a direct effect is identified for only three pairs of the 89 diagnoses. In the first pair, ANC (absolute neutrophil count) (x5) had a direct effect on appendectomy (x6). In the second pair, STIs (sexually transmitted infections) (x80) had a direct effect on caesarean section (x15). In the third pair, glaucoma (x40) had a direct effect on insomnia (x53).
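The three reported arrows can be written down as a small graph to show what "direct" and "indirect" effects mean structurally. The sketch below is an illustration only: the node labels and helper functions are hypothetical, and the code encodes the reported result rather than reproducing the learning procedure.

```python
from collections import deque

# Edges reported in the text, stored as parent -> list of children.
edges = {
    "x5 ANC": ["x6 appendectomy"],
    "x80 STI": ["x15 caesarean section"],
    "x40 glaucoma": ["x53 insomnia"],
}


def is_acyclic(graph):
    """Kahn's algorithm: True if all nodes can be topologically ordered."""
    nodes = set(graph) | {c for cs in graph.values() for c in cs}
    indegree = {n: 0 for n in nodes}
    for children in graph.values():
        for c in children:
            indegree[c] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for c in graph.get(node, []):
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)
    return visited == len(nodes)


def effects(graph, source):
    """Return (direct, indirect) sets of nodes affected by `source`.

    Direct: children of `source`. Indirect: nodes reachable only
    through at least one intermediate node.
    """
    direct = set(graph.get(source, []))
    reachable, stack = set(), list(direct)
    while stack:
        node = stack.pop()
        if node not in reachable:
            reachable.add(node)
            stack.extend(graph.get(node, []))
    return direct, reachable - direct


direct, indirect = effects(edges, "x40 glaucoma")
print(is_acyclic(edges), direct, indirect)
```

With only three disjoint arrows, every effect in this graph is direct; an indirect effect would appear only if one of the child nodes had an outgoing arrow of its own.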

The relationship between the first pair, ANC and appendectomy, is easily understood given that a high ANC value is an inflammation marker that indicates the need for an appendectomy procedure [31]. As stated by Al-Gaithy, ANC and the preoperative evaluation of white blood cells (WBCs) are certainly the most widely used references in determining the severity of acute appendicitis [32].

For the second pair, STIs and caesarean section, the relationship can be explained by the recommendation from the Centers for Disease Control and Prevention (CDC) according to which pregnant women who have been diagnosed with certain STIs should give birth by caesarean section. As indicated in the CDC’s treatment guidelines for sexually transmitted diseases, “Cesarean section is recommended for all women in labor with active genital herpes lesions or early symptoms, such as vulvar pain and itching” [33, 34].

For the relationship between the third pair, glaucoma and insomnia, related studies connect Obstructive Sleep Apnea (OSA), a very common sleep disorder, with a number of eye diseases, including glaucoma, the second leading cause of blindness worldwide [26].


Several studies link OSA to glaucoma. McNab [15] and Robert et al. [16] noticed the occurrence of primary open angle glaucoma or POAG, the most common type of glaucoma, in groups of patients with OSA and floppy eyelid syndrome. Further, researchers have shown that this kind of glaucoma co-occurs with OSA [17-24]. In these studies, 20 to 57% of the sample of patients with POAG or Normal-Tension Glaucoma (NTG), another common kind of glaucoma, were also diagnosed with OSA. In other studies of patients with OSA, researchers have estimated that 2 to 27% of this population have POAG or NTG, as compared with an estimate of 2% in the general population [24, 26]. In addition, Seixas et al. investigated the relationship between visual impairment, insomnia, and anxiety/depression symptoms among Russian immigrants [25]. The results show that after the data were adjusted for the effect of anxiety/depression symptoms, those with a visual impairment were twice as likely as those without a visual impairment to report insomnia.

Most of the previous work has shown that many patients with OSA also have glaucoma. However, none of the work has shown the direction of this relationship.

In this study, DAG is used for the first time in the literature to explore, in a scientific way, relationships between diagnoses and the directions of those relationships, based on health care insurance data from the private sector in Nigeria. We found that glaucoma had a direct effect on insomnia. Given this direction, one factor that may explain the effect of glaucoma on insomnia is the medication used for patients with certain kinds of glaucoma: beta-blockers, which are often used to treat glaucoma [35], can cause insomnia [36-38].


Given the evidence of a relationship between glaucoma and insomnia found in the present study, future research should focus on investigating (i) the prevalence rate of insomnia among patients with glaucoma in Nigeria, (ii) the medication used to treat glaucoma in Nigeria, and (iii) the effects of this medication. Based on this research, physicians who treat patients with glaucoma would have a basis for educating patients about possible adverse reactions [35]. Research in this area could be extended to include a consideration of the medication used to treat glaucoma in Nigeria and worldwide.


Not applicable.


Not applicable.


Not applicable.


The data supporting the findings of the article are available from the corresponding author [P.C.] upon reasonable request.




The authors declare no conflict of interest, financial or otherwise.


Declared none.


[1] Li J, Huang KY, Jin J, Shi J. A survey on statistical methods for health care fraud detection. Health Care Manage Sci 2008; 11(3): 275-87.
[2] Cooper C. Turning information into action. Computer associates: The software that manages ebusiness report 2003. Retrieved from
[3] Hall C. Intelligent data mining at IBM: New products and applications. Intell Software Strat 1996; 7(5): 1-11.
[4] He H, Wang J, Graco W, Hawkins S. Application of neural networks to detection of medical fraud. Expert Syst Appl 1997; 13(4): 329-36.
[5] Ortega PA, Figueroa CJ, Ruz GA. A medical claim fraud/abuse detection system based on data mining: A case study in Chile. Proceedings of the International Conference on Data Mining; Las Vegas, NV. 2006; pp. 224-31.
[6] Shapiro AF. The merging of neural networks, fuzzy logic, and genetic algorithms. Insur Math Econ 2002; 31: 115-31.
[7] Bonchi F, Giannotti F, Mainetto G, Pedreschi D. A classification-based methodology for planning auditing strategies in fraud detection. Proceedings of SIGKDD99; New York, NY. 1999; pp. 175-84.
[8] Williams GJ, Huang Z. Mining the knowledge mine: The Hot Spots methodology for mining large real world databases. Advanced topics in artificial intelligence (lecture notes in artificial intelligence) 1997; Vol. 1342: 340-8.
[9] Viveros MS, Nearhos JP, Rothman MJ. Applying data mining techniques to a health insurance information system. Proceedings of the 22nd VLDB Conference; Mumbai, India. 1996; pp. 286-94.
[10] Ormerod T, Morley N, Ball L, Langley C, Spenser C. Using ethnography to design a Mass Detection Tool (MDT) for the early discovery of insurance fraud. CHI’03 Extended Abstracts on Human Factors in Computing Systems 2003; 650-1.
[11] He H, Hawkins S, Graco W, Yao X. Application of genetic algorithms and k-nearest neighbour method in real world medical fraud detection problem. J Adv Comput Intell Intelligent Inform 2000; 4(2): 130-7.
[12] Williams G. Evolutionary Hot Spots data mining: An architecture for exploring for interesting discoveries. Lecture Notes in Computer Science 1999; 1574: 184-93.
[13] Jurek A, Zakrzewska D. Improving naïve Bayes models of insurance risk by unsupervised classification. Proceedings of the International Multiconference on Computer Science and Information Technology; Wisła, Poland. 2008; pp. 137-44.
[14] Viveros MS, Nearhos JP, Rothman MJ. Applying data mining techniques to a health insurance information system. In: Vijayaraman TM, Buchmann AP, Mohan C, Sarda NL, Eds. Proceedings of the 22nd International Conference on Very Large Data Bases 1996; pp. 286-94.
[15] McNab AA. Floppy eyelid syndrome and obstructive sleep apnea. Ophthal Plast Reconstr Surg 1997; 13(2): 98-114.
[16] Robert PY, Adenis JP, Tapie P, Melloni B. Eyelid hyperlaxity and Obstructive Sleep Apnea (O.S.A.) syndrome. Eur J Ophthalmol 1997; 7(3): 211-5.
[17] Mojon DS, Hess CW, Goldblum D, Böhnke M, Körner F, Mathis J. Primary open-angle glaucoma is associated with sleep apnea syndrome. Ophthalmologica 2000; 214(2): 115-8.
[18] Mojon DS, Hess CW, Goldblum D, et al. Normal-tension glaucoma is associated with sleep apnea syndrome. Ophthalmologica 2002; 216(3): 180-4.
[19] Onen SH, Mouriaux F, Berramdane L, Dascotte JC, Kulik JF, Rouland JF. High prevalence of sleep-disordered breathing in patients with primary open-angle glaucoma. Acta Ophthalmol Scand 2000; 78(6): 638-41.
[20] Marcus DM, Costarides AP, Gokhale P, et al. Sleep disorders: A risk factor for normal-tension glaucoma? J Glaucoma 2001; 10(3): 177-83.
[21] Mojon DS, Hess CW, Goldblum D, et al. High prevalence of glaucoma in patients with sleep apnea syndrome. Ophthalmology 1999; 106(5): 1009-12.
[22] Bendel RE, Kaplan J, Heckman M, Fredrickson PA, Lin SC. Prevalence of glaucoma in patients with obstructive sleep apnoea: A cross-sectional case-series. Eye (Lond) 2008; 22(9): 1105-9.
[23] Geyer O, Cohen N, Segev E, et al. The prevalence of glaucoma in patients with sleep apnea syndrome: Same as in the general population. Am J Ophthalmol 2003; 136(6): 1093-6.
[24] Sergi M, Salerno DE, Rizzi M, et al. Prevalence of normal tension glaucoma in obstructive sleep apnea syndrome patients. J Glaucoma 2007; 16(1): 42-6.
[25] Seixas A, Ramos AR, Gordon-Strachan GM, Fonseca VA, Zizi F, Jean-Louis G. Relationship between visual impairment, insomnia, anxiety/depressive symptoms among Russian immigrants. J Sleep Med Disord 2014; 1(2): 1009.
[26] Waller EA, Bendel RE, Kaplan J. Sleep disorders and the eye. Mayo Clin Proc 2008; 83(11): 1251-61.
[27] Pearl J. Causal inference in statistics: An overview. Stat Surv 2009; 3: 96-146.
[28] Spirtes P, Glymour C, Scheines R. Causation, prediction, and search 2nd ed. 2000.
[29] Scutari M. Learning Bayesian networks with the bnlearn R Package. J Stat Softw 2010; 35(3): 1-22.
[30] Conrady S, Jouffe L. Bayesian networks and BayesiaLab: A practical introduction for researchers 2015.
[31] Andersson RE. Meta-analysis of the clinical and laboratory diagnosis of appendicitis. Br J Surg 2004; 91(1): 28-37.
[32] Al-Gaithy ZK. Clinical value of total white blood cells and neutrophil counts in patients with suspected appendicitis: Retrospective study. World J Emerg Surg 2012; 7(1): 32.
[33] ACOG Practice Bulletin. Clinical management guidelines for obstetrician-gynecologists. No. 82 June 2007. Management of herpes in pregnancy. Obstet Gynecol 2007; 109(6): 1489-98.
[34] Workowski KA, Bolan GA. Centers for Disease Control and Prevention. Sexually transmitted diseases treatment guidelines. Clin Infect Dis 2015; 61(Suppl. 8): S759-62.
[35] Inoue K. Managing adverse effects of glaucoma medications. Clin Ophthalmol 2014; 8: 903-13.
[36] Scheer FAJL, Morris CJ, Garcia JI, et al. Repeated melatonin supplementation improves sleep in hypertensive patients treated with beta-blockers: A randomized controlled trial. Sleep (Basel) 2012; 35(10): 1395-402.
[37] Fares A. Night-time exogenous melatonin administration may be a beneficial treatment for sleeping disorders in beta blocker patients. J Cardiovasc Dis Res 2011; 2(3): 153-5.
[38] Stoschitzky K, Sakotnik A, Lercher P, et al. Influence of beta-blockers on melatonin release. Eur J Clin Pharmacol 1999; 55(2): 111-5.