# Mathematical analysis in the prediction of Diabetes Mellitus in Pima Indian population.

1. INTRODUCTION

This work is a clinical study of actual patients who acquired diabetes mellitus through hereditary factors as well as phenotypic characteristics. Data (http://www.diabetes.org ...) were collected from 768 females of Pima Indian origin who were tested for diabetes mellitus, of whom 268 tested positive. Eight major factors considered responsible for causing diabetes were measured experimentally: PRG (number of times pregnant), PLASMA (plasma glucose concentration at two hours in an oral glucose tolerance test), BP (diastolic blood pressure), THICK (triceps skin fold thickness), INSULIN (two-hour serum insulin), BMI (body mass index, weight in kg divided by the square of height in m), DPF (diabetes pedigree function) and AGE (in years). A ninth variable, RESPONSE, records the outcome (1: diabetic, 0: non-diabetic). A statistical model (Richard and Lippmann 1991) has been developed as explained below. Conclusions are drawn about the dominance of each factor in the cause (http://www.diabetes.ca) of the disorder, and statistical coefficients are calculated to substantiate them.

2. MATERIALS AND METHODS

SAMPLE DATA (http://niddk.nith.gov): PIMA INDIAN DIABETES DATABASE (768 RECORDS) FROM

2.1 ASSOCIATION OF ATTRIBUTES

In the theory of attributes, two attributes are said to be associated if they appear together in a larger number of cases than would be expected if they were independent. When two attributes are present or absent together in the data and the actual frequency exceeds the expected frequency, the association is positive (http://www.methodisthealth.com). If the actual frequency is less than the expected frequency, the association is negative, and if the two frequencies are equal, the attributes are independent. Yule's coefficient of association Y is computed from the data: Y = +1 indicates perfect positive association and Y = -1 indicates perfect negative association (Pillai and Bagavathi 2001).

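Yule's coefficient is

$$Y = \frac{(AB)(\alpha\beta) - (A\beta)(\alpha B)}{(AB)(\alpha\beta) + (A\beta)(\alpha B)}$$

where A denotes diabetic, $\alpha$ non-diabetic, B the age attribute present and $\beta$ the age attribute absent.

For age below 22 years (TABLE-4), Y = -0.7511. The attack of diabetes is negatively associated with females of age below 22 years.

For age above 40 years (TABLE-5), Y = 0.4632. Y is positive, indicating that diabetes is positively associated with age above 40 years.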
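The two coefficients can be checked directly from the cell counts in TABLE-4 and TABLE-5. The following is a minimal sketch, assuming the standard 2x2 form of Yule's coefficient; the function and argument names are illustrative and do not come from the paper. It reproduces the reported values to about three decimal places.

```python
def yule_coefficient(ab: int, alpha_b: int, a_beta: int, alpha_beta: int) -> float:
    """Yule's coefficient of association for a 2x2 table.

    ab         -- attribute present, diabetic      (AB)
    alpha_b    -- attribute present, non-diabetic  (alpha B)
    a_beta     -- attribute absent, diabetic       (A beta)
    alpha_beta -- attribute absent, non-diabetic   (alpha beta)
    """
    same_direction = ab * alpha_beta      # (AB)(alpha beta)
    opposite_direction = a_beta * alpha_b # (A beta)(alpha B)
    return (same_direction - opposite_direction) / (same_direction + opposite_direction)


# Cell counts from TABLE-4 (age below 22 years).
y_below_22 = yule_coefficient(ab=5, alpha_b=59, a_beta=263, alpha_beta=441)

# Cell counts from TABLE-5 (age above 40 years).
y_above_40 = yule_coefficient(ab=107, alpha_b=98, a_beta=161, alpha_beta=402)

print(f"Y (age below 22 years) = {y_below_22:.3f}")
print(f"Y (age above 40 years) = {y_above_40:.3f}")
```

A negative result means the age attribute and diabetes occur together less often than independence would predict, and a positive result means they occur together more often.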

2.3 POSTERIOR PROBABILITY METHOD (Hung et al. 1996)

NULL HYPOTHESIS:

"All eight factors contribute to the existence of diabetes, and they contribute equally to its cause."

Let $P(X_1) \times P(X_2) \times \dots \times P(X_8) = P(X_1 \cap X_2 \cap X_3 \cap \dots \cap X_8)$. $P(E_1/A)$ is the posterior probability (Richard and Lippmann 1991) of a patient being diabetic due to one of the factors. Let $E_1, E_2, E_3, \dots, E_8$ be a set of independent events corresponding to the factors PRG, PLASMA, BP, THICK, INSULIN, BMI, DPF and AGE, with RESPONSE as the outcome. First, the mean and probability of each event are calculated.

Let A be the main event under consideration (diabetic or not). By the Bayesian principle of conditional probability (Lippmann 1987):

$$P(E_1/A) = \frac{P(E_1)\,P(A/E_1)}{\sum_{i=1}^{8} P(E_i)\,P(A/E_i)}$$

The probabilities $P(X_i)$, written $P_i$, are: $P_1 = 0.0106$, $P_2 = 0.3367$, $P_3 = 0.1923$, $P_4 = 0.0570$, $P_5 = 0.2224$, $P_6 = 0.0891$, $P_7 = 0.00128$, $P_8 = 0.0923$.
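To illustrate how the Bayes formula above turns these probabilities into posteriors, here is a minimal sketch. The priors are the eight values listed in the text. The likelihoods $P(A/E_i)$ are not reported in the paper, so the values used here are hypothetical placeholders chosen only to make the example run, and the function name is likewise an assumption.

```python
# Prior probabilities P(E_i) for the eight factors, as listed in the text.
FACTORS = ["PRG", "PLASMA", "BP", "THICK", "INSULIN", "BMI", "DPF", "AGE"]
PRIORS = [0.0106, 0.3367, 0.1923, 0.0570, 0.2224, 0.0891, 0.00128, 0.0923]

# HYPOTHETICAL likelihoods P(A/E_i): placeholder values for illustration only,
# the paper does not report them.
LIKELIHOODS = [0.30, 0.60, 0.35, 0.30, 0.40, 0.45, 0.35, 0.50]


def posteriors(priors, likelihoods):
    """Return P(E_i/A) for every i using Bayes' theorem."""
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    # Numerators P(E_i) * P(A/E_i), one per factor.
    joint = [p * lik for p, lik in zip(priors, likelihoods)]
    # Denominator: sum over i of P(E_i) * P(A/E_i).
    evidence = sum(joint)
    if evidence == 0:
        raise ValueError("total probability of A is zero")
    return [j / evidence for j in joint]


post = posteriors(PRIORS, LIKELIHOODS)
for name, value in zip(FACTORS, post):
    print(f"P({name}/A) = {value:.4f}")
```

Because every numerator is divided by the same denominator, the posteriors always sum to one, whatever likelihoods are supplied.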

The product is $P(X_1) \times P(X_2) \times P(X_3) \times \dots \times P(X_8) = 5.959 \times 10^{-10} \approx 0$, and the sum is $P(X_1) + P(X_2) + P(X_3) + \dots + P(X_8) = 0.99$.

Since $P\big((E_1/A) \cap (E_2/A) \cap \dots \cap (E_8/A)\big) = 0$, no individual has all eight factors, so every factor is independently a positive contributor to the diabetes disorder, which proves the null hypothesis.

3. RESULTS & CONCLUSIONS

This work aimed at a clinical study of actual patients acquiring diabetes mellitus, in order to form a mathematical model based on both hereditary factors and phenotypic characteristics. TABLE-2 and TABLE-3 give the correlation of each variable with every other variable. In particular, the correlation of each variable with variable 9 (diabetic response) is around 0.9 or close to unity, which shows that:

* Each factor contributes to the cause of diabetes.

* Each factor is related to every other factor, which shows that all eight parameters are interdependent in causing diabetes mellitus.

* The maximum correlation was found between variable 8 (AGE) and the diabetic response. Hence age is a major factor in causing diabetes.

* No individual has all eight factors, so every factor is independently a positive contributor to the diabetes disorder, which proves our null hypothesis.

From Yule's coefficient, the attack of diabetes is negatively associated with females of age below 22 years and positively associated with age above 40 years.
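The correlation coefficients referred to in TABLE-2 and TABLE-3 can be computed as in the following sketch. It is an illustration of the two standard definitions (Pearson's product-moment coefficient, and Spearman's coefficient as the Pearson correlation of average ranks), not the authors' procedure. It uses only the six sample rows shown in TABLE-1, so its output is not expected to reproduce the published tables, which are based on all 768 records. All function names are illustrative.

```python
from math import sqrt

# The six sample rows shown in TABLE-1. The full database has 768 rows.
COLUMNS = ["PRG", "PLASMA", "BP", "THICK", "INSULIN", "BMI", "DPF", "AGE", "RESPONSE"]
ROWS = [
    (6, 148, 72, 35, 0, 33.6, 0.627, 50, 1),
    (1, 85, 66, 29, 0, 26.6, 0.351, 31, 0),
    (8, 183, 64, 0, 0, 23.3, 0.672, 32, 1),
    (1, 89, 66, 23, 94, 28.1, 0.167, 21, 0),
    (0, 137, 40, 35, 168, 43.1, 2.288, 33, 1),
    (5, 116, 74, 0, 0, 25.6, 0.201, 30, 0),
]


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def column(name):
    """Extract one named column from the sample rows."""
    idx = COLUMNS.index(name)
    return [row[idx] for row in ROWS]


response = column("RESPONSE")
for name in COLUMNS[:-1]:
    values = column(name)
    print(f"{name:8s} Pearson = {pearson(values, response):+.3f}  "
          f"Spearman = {spearman(values, response):+.3f}")
```

Both coefficients lie between -1 and +1. Spearman's coefficient depends only on the ordering of the values, so it is less sensitive than Pearson's to outliers such as the zero entries in THICK and INSULIN.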

ACKNOWLEDGEMENTS

We are grateful to our diabetologist at Trichy and to Prof. Dr. RM. Pitchappan, Senior Professor and Head, Department of Immunology, School of Biological Sciences, Centre for Excellence in Genomic Sciences, Madurai Kamaraj University, Madurai, and consultant, All India Institute of Medical Sciences, for their support and encouragement.

REFERENCES

(1.) Gupta, S.C. and Kapoor, V.K. 2001. Fundamentals of Applied Statistics. Sultan Chand & Sons, third edition, Jan 2001.

(2.) http://www.diabetes.org Home page for the American Diabetes Association.

(3.) http://www.diabetes.ca/about diabetes/index.html Contains facts about type 1 and type 2 diabetes.

(4.) http://www.methodisthealth.com/diabetes/1.signs.html Statistics about type 1 diabetes.

(5.) http://www.niddk.nih.gov/ Home page for the National Institute of Diabetes and Digestive and Kidney Diseases.

(6.) Hung, M.S., Hu, M.Y., Shankar, M.S. and Patuwo, B.E. 1996. Estimating posterior probabilities in classification problems with neural networks. International Journal of Computational Intelligence and Organization.

(7.) Knowler, W.C., Pettitt, D.J., Savage, P.J. and Bennett, P.H. 1981. Diabetes incidence in Pima Indians: contributions of obesity and parental diabetes. American Journal of Epidemiology, 113 (2): 144-156.

(8.) Lippmann, R.P. 1987. An introduction to computing with neural nets. IEEE ASSP Magazine, 4 (2): 4-22.

(9.) Pillai, R.S.N. and Bagavathi, V. 2001. Statistics. S. Chand & Company Ltd, 2001.

(10.) Richard, M.D. and Lippmann, R. 1991. Neural network classifiers estimate Bayesian a posteriori probabilities. Neural Computation, 3: 461-483.

* R. Jamuna and ** K. Meena

* PG Department of Computer Science, Seethalakshmi Ramasami College, Trichy 620 002, India. E-mail: rjamuna2002@yahoo.co.in

** PG & Research Department of Computer Science, Srimathi Indira Gandhi College, Trichy 620 002.
```
TABLE-1: NATIONAL INSTITUTE OF DIABETES AND DIGESTIVE AND KIDNEY DISEASES

PRG  PLASMA   BP  THICK  INSULIN   BMI    DPF  AGE  RESPONSE

  6     148   72     35        0  33.6  0.627   50         1
  1      85   66     29        0  26.6  0.351   31         0
  8     183   64      0        0  23.3  0.672   32         1
  1      89   66     23       94  28.1  0.167   21         0
  0     137   40     35      168  43.1  2.288   33         1
  5     116   74      0        0  25.6  0.201   30         0

CORRELATION COEFFICIENTS AMONG ATTRIBUTES (http://www.diabetes.org)

TABLE-2: SPEARMAN'S CORRELATION COEFFICIENT (Gupta and Kapoor 2001)

Spearman's    Var 1  Var 2   Var 3  Var 4  Var 5    Var 6  Var 7  Var 8  Var 9
Coefficient   PRG    PLASMA  BP     THICK  INSULIN  BMI    DPF    AGE    RESPONSE

Var 1         1.000  1.000   0.999  0.999  1.000    0.986  0.886  0.960  0.961
Var 2         1.000  1.000   1.000  1.000  1.000    0.986  0.886  0.961  0.960
Var 3         0.999  1.000   1.000  1.000  1.000    0.986  0.886  0.961  0.961
Var 4         0.999  1.000   1.000  1.000  1.000    0.986  0.886  0.961  0.960
Var 5         1.000  1.000   1.000  1.000  1.000    0.987  0.886  0.961  0.961
Var 6         0.986  0.986   0.986  0.986  0.987    1.000  0.949  0.992  0.998
Var 7         0.886  0.886   0.886  0.886  0.886    0.949  1.000  0.978  0.961
Var 8         0.960  0.961   0.961  0.961  0.961    0.992  0.978  1.000  0.988
Var 9         0.961  0.960   0.961  0.960  0.961    0.984  0.961  0.988  1.000

TABLE-3: PEARSON'S CORRELATION COEFFICIENT

Pearson's     Var 1  Var 2   Var 3  Var 4  Var 5    Var 6  Var 7  Var 8  Var 9
Coefficient   PRG    PLASMA  BP     THICK  INSULIN  BMI    DPF    AGE    RESPONSE

Var 1         1.000  0.954   0.965  0.963  0.976    0.956  0.781  0.956  0.961
Var 2         0.954  1.000   0.986  0.979  0.977    0.920  0.779  0.909  0.922
Var 3         0.965  0.986   1.000  0.985  0.985    0.938  0.803  0.933  0.935
Var 4         0.963  0.979   0.985  1.000  0.962    0.913  0.777  0.909  0.920
Var 5         0.976  0.977   0.985  0.962  1.000    0.963  0.809  0.957  0.952
Var 6         0.956  0.920   0.938  0.913  0.963    1.000  0.889  0.976  0.968
Var 7         0.781  0.779   0.803  0.777  0.809    0.889  1.000  0.851  0.876
Var 8         0.956  0.909   0.933  0.909  0.957    0.976  0.851  1.000  0.981
Var 9         0.961  0.922   0.935  0.920  0.952    0.968  0.876  0.981  1.000

TABLE-4: YULE'S COEFFICIENTS FOR AGE BELOW 22 YEARS

AGE                  Diabetic     Nondiabetic   Total

Age below 22 years   (AB)    5    (αB)   59     (B)   64
Not below 22 years   (Aβ)  263    (αβ)  441     (β)  704
Total                (A)   268    (α)   500     (N)  768

AB = 5, αB = 59, Aβ = 263, αβ = 441

TABLE-5: YULE'S COEFFICIENTS FOR AGE ABOVE 40 YEARS

AGE                  Diabetic   Nondiabetic   Total

Age above 40 years        107            98     205
Not above 40 years        161           402     563
Total                     268           500     768

TABLE-6: MEAN AND PROBABILITY OF EACH FACTOR IN PIMA INDIAN DATA

PARAMETERS     X1     X2     X3    X4     X5     X6     X7      X8     X9

MEAN           3.84   120.8  69.1  20.5   79.9   31.9   0.47    33.2   0.348
PROBABILITIES  0.005  0.158  0.09  0.027  0.104  0.042  0.0006  0.043  0.004
```
COPYRIGHT 2006 A.K. Sharma, Ed & Pub
No portion of this article can be reproduced without the express written permission from the copyright holder.

Jamuna, R.; Meena, K. Bio Science Research Bulletin - Biological Sciences, Jul 1, 2006.