Application of Machine Learning Models for Patients Health Insurance Cost Prediction
Abstract
This study examines the use of machine learning models to predict health insurance costs from personal characteristics. The variables in the dataset include age, sex, BMI, number of children, smoking status, and region. Several machine learning methods, including Linear Regression, Random Forest, and Gradient Boosting, were compared on how accurately they estimate insurance costs. The dataset was preprocessed by scaling numerical features and encoding categorical variables, and the regression models were then trained and evaluated with k-fold cross-validation. Performance was measured with the coefficient of determination (R²), mean absolute error (MAE), and root mean squared error (RMSE). In the experiments, Gradient Boosting outperformed both Random Forest and Linear Regression.
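The evaluation protocol described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' code. The function names, the contiguous (unshuffled) fold layout, the mean-predictor baseline, and the sample charge values are all assumptions made for illustration. The abstract does not state the value of k or whether folds were shuffled.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average of |y - y_hat|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def kfold_indices(n_samples, k):
    """Yield (train_idx, test_idx) for k contiguous folds of near-equal size.

    Assumption: no shuffling; real experiments usually shuffle first.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

def cv_mae_mean_baseline(y, k):
    """Average test-fold MAE of a baseline that predicts the training mean.

    The baseline is a stand-in for a real regressor; it only shows how
    each fold is held out once while the rest is used for fitting.
    """
    fold_scores = []
    for train_idx, test_idx in kfold_indices(len(y), k):
        train_mean = sum(y[i] for i in train_idx) / len(train_idx)
        y_test = [y[i] for i in test_idx]
        fold_scores.append(mae(y_test, [train_mean] * len(y_test)))
    return sum(fold_scores) / len(fold_scores)

# Hypothetical insurance charges and predictions, for illustration only.
y_true = [1000.0, 2000.0, 3000.0, 4000.0]
y_pred = [1100.0, 1900.0, 3200.0, 3800.0]

print("MAE :", mae(y_true, y_pred))
print("RMSE:", round(rmse(y_true, y_pred), 2))
print("R2  :", round(r2(y_true, y_pred), 4))
print("2-fold baseline MAE:", cv_mae_mean_baseline(y_true, 2))
```

Lower MAE and RMSE and a higher R² indicate a better fit. RMSE penalizes large individual errors more heavily than MAE, which is one reason to report both for skewed targets such as insurance charges.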
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.