# Mathematical analysis in the prediction of Diabetes Mellitus in Pima Indian population.

1. INTRODUCTION

The work aims at a clinical study of actual patients who acquire diabetes mellitus through hereditary factors as well as phenotypic characteristics. Data (http://www.diabetes.org ...) were collected from 768 females of Pima Indian origin who were tested for diabetes mellitus, of whom 268 tested positive. The eight major factors considered responsible for diabetes were measured experimentally: PRG (number of times pregnant), PLASMA (plasma glucose concentration), BP (diastolic blood pressure), THICK (triceps skin fold thickness), INSULIN (two-hour serum insulin), BMI (body mass index, weight/height^2), PEDIGREE (diabetes pedigree function, DPF) and AGE (in years). The outcome variable is RESPONSE (1: diabetic, 0: non-diabetic). A statistical model (Richard and Lippmann 1991) has been developed as explained below. Conclusions are drawn about the dominance of each factor in the cause (http://www.diabetes.ca) of the disorder, and statistical coefficients are calculated to substantiate them.

2. MATERIALS AND METHODS

SAMPLE DATA (http://niddk.nith.gov): FROM PIMA INDIAN DIABETES DATABASE (768) FROM

2.1 ASSOCIATION OF ATTRIBUTES

In the theory of attributes, two attributes are said to be associated if they appear together in a larger number of cases than expected. When two attributes are present or absent together in the data and the actual frequency exceeds the expected frequency, the association is positive (http://www.methodisthealth.com). If the actual frequency is less than the expected frequency, the association is negative, and if there is no association the attributes are said to be independent. Yule's coefficient of association Y is tested on the data: Y = +1 indicates perfect positive association and Y = -1 indicates perfect negative association (Pillai and Bagavathi 2001).

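Yule's coefficient is

$$Y = \frac{(AB)(\alpha\beta) - (A\beta)(\alpha B)}{(AB)(\alpha\beta) + (A\beta)(\alpha B)}$$

where A denotes diabetic, $\alpha$ non-diabetic, B the age attribute being present and $\beta$ the age attribute being absent.

For age below 22 years (TABLE-4), Y = -0.7511. The attack of diabetes is therefore negatively associated with females of age below 22 years.

For age above 40 years (TABLE-5), Y = 0.4632. Y is positive, indicating that diabetes is positively associated with age above 40 years.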
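As a quick check, the two coefficients can be recomputed from the 2x2 frequencies given in TABLE-4 and TABLE-5. The following is a minimal sketch; the function name `yule_coefficient` and its argument names are illustrative choices, not part of the original work.

```python
def yule_coefficient(ab, alpha_b, a_beta, alpha_beta):
    """Yule's coefficient of association for a 2x2 table of attribute frequencies.

    ab         : attribute present, diabetic        (AB)
    alpha_b    : attribute present, non-diabetic    (alpha B)
    a_beta     : attribute absent, diabetic         (A beta)
    alpha_beta : attribute absent, non-diabetic     (alpha beta)
    """
    numerator = ab * alpha_beta - a_beta * alpha_b
    denominator = ab * alpha_beta + a_beta * alpha_b
    if denominator == 0:
        raise ValueError("coefficient is undefined for this table")
    return numerator / denominator


# Frequencies taken from TABLE-4 (age below 22 years) and TABLE-5 (age above 40 years)
y_below_22 = yule_coefficient(5, 59, 263, 441)
y_above_40 = yule_coefficient(107, 98, 161, 402)

print(f"Age below 22 years: Y = {y_below_22:.3f}")  # about -0.751
print(f"Age above 40 years: Y = {y_above_40:.3f}")  # about 0.463
```

The values agree with the reported coefficients to three decimal places; the fourth decimal depends on whether the result is rounded or truncated.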

2.3 POSTERIOR PROBABILITY METHOD (Hung et al 1996)

NULL HYPOTHESIS:

"All the eight factors contribute to the existence of diabetes, and they contribute equally to the cause of diabetes."

Let $P(X_1) \times P(X_2) \times \cdots \times P(X_8) = P(X_1 \cap X_2 \cap X_3 \cap \cdots \cap X_8)$. $P(E_1/A)$ is the posterior probability (Richard and Lippmann 1991) that a patient is diabetic due to one of the factors. Let $E_1, E_2, E_3, \ldots, E_8$ be a set of independent events, namely the factors PRG, PLASMA, BP, THICK, INSULIN, BMI, DPF and AGE, with RESPONSE as the outcome. First, the mean and probability of each event are calculated.

Let A be the main event under consideration (diabetic or not). By the Bayesian principle of conditional probability (Lippmann 1987):

$$P(E_1/A) = \frac{P(E_1)\,P(A/E_1)}{\sum_{i=1}^{8} P(E_i)\,P(A/E_i)}$$

The probabilities $P(X_i)$, i.e. $P_i$, are: $P_1 = 0.0106$, $P_2 = 0.3367$, $P_3 = 0.1923$, $P_4 = 0.0570$, $P_5 = 0.2224$, $P_6 = 0.0891$, $P_7 = 0.00128$, $P_8 = 0.0923$.

Product: $P(X_1) \times P(X_2) \times P(X_3) \times \cdots \times P(X_8) = 5.959 \times 10^{-10} \approx 0$

Sum: $P(X_1) + P(X_2) + P(X_3) + \cdots + P(X_8) = 0.99$

Since $P[(E_1/A) \cap (E_2/A) \cap \cdots \cap (E_8/A)] = 0$, no individual has all 8 factors. Every factor is therefore independently a positive contributor to the diabetes disorder, which proves the null hypothesis.
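The Bayes computation used in this section can be sketched as follows. The prior probabilities are the $P_i$ values listed above. The likelihoods P(A/E_i) are not reported in the text, so the numbers used for them here are purely hypothetical placeholders, and the names `posteriors`, `priors` and `likelihoods` are illustrative.

```python
def posteriors(priors, likelihoods):
    """Return P(E_i/A) for every event E_i using Bayes' rule."""
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    # Joint terms P(E_i) * P(A/E_i)
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)  # denominator: sum over all events
    if total == 0:
        raise ValueError("all joint terms are zero; posterior is undefined")
    return [j / total for j in joint]


factors = ["PRG", "PLASMA", "BP", "THICK", "INSULIN", "BMI", "DPF", "AGE"]

# P(E_i): the probabilities P_1 ... P_8 listed in the text
priors = [0.0106, 0.3367, 0.1923, 0.0570, 0.2224, 0.0891, 0.00128, 0.0923]

# P(A/E_i): HYPOTHETICAL values, not taken from the paper, for illustration only
likelihoods = [0.30, 0.50, 0.35, 0.30, 0.40, 0.45, 0.35, 0.40]

post = posteriors(priors, likelihoods)
for name, value in zip(factors, post):
    print(f"P({name}/A) = {value:.4f}")
```

Whatever likelihoods are supplied, the posteriors sum to 1 because the denominator normalises the joint terms; with equal likelihoods the posteriors reduce to the normalised priors.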

3. RESULTS & CONCLUSIONS

The work aimed at a clinical study of actual patients acquiring diabetes mellitus, in order to form a mathematical model based on both hereditary factors and phenotypic characteristics. TABLE-2 and TABLE-3 give the correlation of each variable with every other variable. In particular, the correlation of each variable with variable 9 (diabetic response) is around 0.9 or close to unity, which shows that

* Each factor contributes to the cause of diabetes.

* Each factor is related to every other factor, which shows that all 8 parameters are interdependent in causing diabetes mellitus.

* The maximum correlation was found between variable 8 (AGE) and the diabetic response. Hence age is a major factor in causing diabetes.

* No individual has all 8 factors, so every factor is independently a positive contributor to the diabetes disorder, which proves our null hypothesis.

From Yule's coefficient, the attack of diabetes is negatively associated with females of age below 22 years and positively associated with age above 40 years.
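The text does not state how the Pearson and Spearman coefficients in TABLE-2 and TABLE-3 were computed. The sketch below shows the standard definitions only, applied to the six sample records of TABLE-1, so its output is not expected to reproduce the tabulated values. The helper names `pearson`, `average_ranks` and `spearman` are illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Columns of the six sample records shown in TABLE-1
plasma = [148, 85, 183, 89, 137, 116]
age = [50, 31, 32, 21, 33, 30]
response = [1, 0, 1, 0, 1, 0]

for name, column in (("PLASMA", plasma), ("AGE", age)):
    print(name, "vs RESPONSE:",
          f"Pearson = {pearson(column, response):.3f},",
          f"Spearman = {spearman(column, response):.3f}")
```

Applying the same two functions to the full 768-record data set, column by column, would give a 9 x 9 matrix of the kind shown in the tables.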

ACKNOWLEDGEMENTS

We are grateful to our Diabetologist at Trichy and to Prof. Dr. RM. Pitchappan, Senior Professor and Head, Department of Immunology, School of Biological Sciences, Centre for Excellence in Genomic Sciences, Madurai Kamaraj University, Madurai, and consultant, All India Institute of Medical Sciences, for support and encouragement.

REFERENCES

(1.) Gupta, S.C. and Kapoor, V.K. 2001. Fundamentals of Applied Statistics. Sultan Chand & Sons, third edition, Jan 2001.

(2.) http://www.diabetes.org Home page for the American Diabetes Association.

(3.) http://www.diabetes.ca/about diabetes/index.html Contains facts about type 1 and type 2 diabetes.

(4.) http://www.methodisthealth.com/diabetes/1.signs.html Statistics about type 1 diabetes.

(5.) http://www.niddk.nih.gov/ Home page for the National Institute of Diabetes and Digestive and Kidney Diseases.

(6.) Hung, M.S., Hu, M.Y., Shankar, M.S. and Patuwo, B.E. 1996. Estimating posterior probabilities in classification problems with neural networks. International Journal of Computational Intelligence and Organization.

(7.) Knowler, W.C., Pettitt, D.J., Savage, P.J. and Bennett, P.H. 1981. Diabetes incidence in Pima Indians: Contributions of obesity and parental diabetes. American Journal of Epidemiology, 113 (2): 144-156.

(8.) Lippmann, R.P. 1987. An introduction to computing with neural nets. IEEE ASSP Magazine 4:2-22.

(9.) Pillai, R.S.N. and Bagavathi, V. 2001. Statistics. S. Chand & Company Ltd, 2001.

(10.) Richard, M.D. and Lippmann, R. 1991. Neural network classifiers estimate Bayesian a posteriori probabilities. Neural Computation, 3: 461-483.

* R. Jamuna and ** K. Meena

* PG Department of Computer Science, Seethalakshmi Ramasami College, Trichy 620 002, India. e-mail: rjamuna2002@yahoo.co.in

** PG & Research Department of Computer Science, Srimathi Indira Gandhi College, Trichy 620 002.

TABLE-1: NATIONAL INSTITUTE OF DIABETES AND DIGESTIVE AND KIDNEY DISEASES

| PRG | PLASMA | BP | THICK | INSULIN | BMI | DPF | AGE | RESPONSE |
|---|---|---|---|---|---|---|---|---|
| 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 | 1 |
| 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 | 0 |
| 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 | 1 |
| 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 | 0 |
| 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 | 1 |
| 5 | 116 | 74 | 0 | 0 | 25.6 | 0.201 | 30 | 0 |

CORRELATION COEFFICIENTS AMONG ATTRIBUTES (http://www.diabetes.org)

TABLE-2: SPEARMAN'S CORRELATION COEFFICIENT (Gupta and Kapoor 2001)

| Spearman's Coefficient | Var 1 PRG | Var 2 PLASMA | Var 3 BP | Var 4 THICK | Var 5 INSULIN | Var 6 BMI | Var 7 DPF | Var 8 AGE | Var 9 RESPONSE |
|---|---|---|---|---|---|---|---|---|---|
| Var 1 | 1.000 | 1.000 | 0.999 | 0.999 | 1.000 | 0.986 | 0.886 | 0.960 | 0.961 |
| Var 2 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.986 | 0.886 | 0.961 | 0.960 |
| Var 3 | 0.999 | 1.000 | 1.000 | 1.000 | 1.000 | 0.986 | 0.886 | 0.961 | 0.961 |
| Var 4 | 0.999 | 1.000 | 1.000 | 1.000 | 1.000 | 0.986 | 0.886 | 0.961 | 0.960 |
| Var 5 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.987 | 0.886 | 0.961 | 0.961 |
| Var 6 | 0.986 | 0.986 | 0.986 | 0.986 | 0.987 | 1.000 | 0.949 | 0.992 | 0.998 |
| Var 7 | 0.886 | 0.886 | 0.886 | 0.886 | 0.886 | 0.949 | 1.000 | 0.978 | 0.961 |
| Var 8 | 0.960 | 0.961 | 0.961 | 0.961 | 0.961 | 0.992 | 0.978 | 1.000 | 0.988 |
| Var 9 | 0.961 | 0.960 | 0.961 | 0.960 | 0.961 | 0.984 | 0.961 | 0.988 | 1.000 |

TABLE-3: PEARSON'S CORRELATION COEFFICIENT

| Pearson's Coefficient | Var 1 PRG | Var 2 PLASMA | Var 3 BP | Var 4 THICK | Var 5 INSULIN | Var 6 BMI | Var 7 DPF | Var 8 AGE | Var 9 RESPONSE |
|---|---|---|---|---|---|---|---|---|---|
| Var 1 | 1.000 | 0.954 | 0.965 | 0.963 | 0.976 | 0.956 | 0.781 | 0.956 | 0.961 |
| Var 2 | 0.954 | 1.000 | 0.986 | 0.979 | 0.977 | 0.920 | 0.779 | 0.909 | 0.922 |
| Var 3 | 0.965 | 0.986 | 1.000 | 0.985 | 0.985 | 0.938 | 0.803 | 0.933 | 0.935 |
| Var 4 | 0.963 | 0.979 | 0.985 | 1.000 | 0.962 | 0.913 | 0.777 | 0.909 | 0.920 |
| Var 5 | 0.976 | 0.977 | 0.985 | 0.962 | 1.000 | 0.963 | 0.809 | 0.957 | 0.952 |
| Var 6 | 0.956 | 0.920 | 0.938 | 0.913 | 0.963 | 1.000 | 0.889 | 0.976 | 0.968 |
| Var 7 | 0.781 | 0.779 | 0.803 | 0.777 | 0.809 | 0.889 | 1.000 | 0.851 | 0.876 |
| Var 8 | 0.956 | 0.909 | 0.933 | 0.909 | 0.957 | 0.976 | 0.851 | 1.000 | 0.981 |
| Var 9 | 0.961 | 0.922 | 0.935 | 0.920 | 0.952 | 0.968 | 0.876 | 0.981 | 1.000 |

TABLE-4: YULE'S COEFFICIENTS FOR AGE BELOW 22 YEARS

| AGE | Diabetic | Nondiabetic | Total |
|---|---|---|---|
| Age below 22 years | (AB) 5 | (αB) 59 | (B) 64 |
| Not below 22 years | (Aβ) 263 | (αβ) 441 | (β) 704 |
| Total | (A) 268 | (α) 500 | (N) 768 |

AB = 5, αB = 59, Aβ = 263, αβ = 441

TABLE-5: YULE'S COEFFICIENTS FOR AGE ABOVE 40 YEARS

| AGE | Diabetic | Nondiabetic | Total |
|---|---|---|---|
| Age above 40 years | 107 | 98 | 205 |
| Not above 40 years | 161 | 402 | 563 |
| Total | 268 | 500 | 768 |

TABLE-6: MEAN AND PROBABILITY OF EACH FACTOR IN PIMA INDIAN DATA

| PARAMETERS | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 |
|---|---|---|---|---|---|---|---|---|---|
| MEAN | 3.84 | 120.8 | 69.1 | 20.5 | 79.9 | 31.9 | 0.47 | 33.2 | 0.348 |
| PROBABILITIES | 0.005 | 0.158 | 0.09 | 0.027 | 0.104 | 0.042 | 0.0006 | 0.043 | 0.004 |


Author: Jamuna, R.; Meena, K.

Publication: Bio Science Research Bulletin -Biological Sciences

Date: Jul 1, 2006
