Diabetes mellitus is a chronic disease that occurs when the pancreas can no longer produce
enough insulin or when the body cannot effectively use the insulin it produces. Long-term
hyperglycemia in diabetes causes chronic damage to, and dysfunction of, various tissues,
especially the eyes, kidneys, heart, blood vessels, and nerves.
Diabetes is now a major public health challenge worldwide. This paper shows how medical data
can be used with machine learning tools to predict whether an individual has diabetes.
The final model, extreme gradient boosting, suggests that the five most important features
associated with a high predicted probability of diabetes are ‘DiabetesPedigreeFunction’,
‘Pregnancies’, ‘BMI’, ‘Glucose’, and ‘Insulin’.
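
To illustrate how a boosted model can rank features, the following is a minimal sketch, not the model used in this paper. It assumes a toy gradient-boosting classifier built from one-split decision stumps with plain gradient steps (XGBoost itself uses deeper trees, second-order steps, and regularization), and it runs on synthetic data in which only the column labeled ‘Glucose’ drives the label. The function names make_synthetic, fit_stump, fit_boosting, and predict_proba are hypothetical, and the printed importances say nothing about real patients.

```python
import math
import random

# Three column names taken from the text; the data itself is synthetic.
FEATURES = ["Glucose", "BMI", "Insulin"]


def make_synthetic(n=200, seed=0):
    """Synthetic stand-in data. Assumption: only the first column affects the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in FEATURES]
        X.append(row)
        y.append(1 if row[0] + 0.3 * rng.gauss(0.0, 1.0) > 0 else 0)
    return X, y


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Return the single split that most reduces the squared error of the residuals."""
    n = len(X)
    total = sum(residuals)
    best = None  # (gain, feature index, threshold, left mean, right mean)
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            # Squared-error reduction achieved by splitting at "value <= lo".
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
            if best is None or gain > best[0]:
                best = (gain, j, lo, left_sum / k, right_sum / (n - k))
    return best


def fit_boosting(X, y, n_rounds=30, learning_rate=0.3):
    """Boost stumps on the log-loss gradient; accumulate split gain per feature."""
    scores = [0.0] * len(X)
    stumps = []
    importance = [0.0] * len(X[0])
    for _ in range(n_rounds):
        # Negative gradient of the log loss with respect to the raw score.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        gain, j, thr, left, right = stump
        importance[j] += gain
        stumps.append((j, thr, learning_rate * left, learning_rate * right))
        for i, row in enumerate(X):
            scores[i] += learning_rate * (left if row[j] <= thr else right)
    total_gain = sum(importance) or 1.0
    return stumps, [g / total_gain for g in importance]


def predict_proba(stumps, row):
    """Predicted probability of the positive class for one row."""
    return sigmoid(sum(l if row[j] <= t else r for j, t, l, r in stumps))


X, y = make_synthetic()
stumps, importance = fit_boosting(X, y)
ranking = sorted(zip(FEATURES, importance), key=lambda p: p[1], reverse=True)
for name, share in ranking:
    print(f"{name}: {share:.3f}")
```

Importance here is each feature's share of the total split gain, which is one of several common definitions. Rankings of this kind describe what the model relies on for prediction and should not be read as evidence of causation.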