Application of Machine Learning Models for Patients Health Insurance Cost Prediction

Dr. Annwesha Banerjee Majumder; Dr. Sumit Das; Aniruddha Biswas; Trisita Ghosh; Raj Poddar; Suchetana Chakraborty

doi:10.35940/ijsce.D3685.15040925

Published: 30-09-2025

Keywords:

Gradient Boosting, Linear Regression, Mean Squared Error, Random Forest, Root Mean Squared Error

Dr. Annwesha Banerjee Majumder

Assistant Professor, Department of Information Technology, JIS College of Engineering, Kalyani (West Bengal), India.

Dr. Sumit Das

Associate Professor, Department of Information Technology, JIS College of Engineering, Kalyani (West Bengal), India.

Aniruddha Biswas

Assistant Professor, Department of Information Technology, JIS College of Engineering, Kalyani (West Bengal), India.

Trisita Ghosh

Assistant Professor, Department of Information Technology, JIS College of Engineering, Kalyani (West Bengal), India.

Raj Poddar

Department of Information Technology, JIS College of Engineering, Kalyani (West Bengal), India.

Suchetana Chakraborty

Department of Information Technology, JIS College of Engineering, Kalyani (West Bengal), India.

Abstract

This study examines the use of machine learning models to forecast health insurance costs from personal characteristics. The dataset included demographic variables such as age, sex, BMI, number of children, smoking status, and region. Several machine learning methods, including Random Forest, Gradient Boosting, and Linear Regression, were compared on how well they estimated insurance costs. After the dataset was preprocessed by scaling numerical features and encoding categorical variables, k-fold cross-validation was used to train and evaluate the regression models. Performance was assessed with the coefficient of determination (R²), mean absolute error (MAE), and root mean squared error (RMSE). In the experiments, Gradient Boosting outperformed both Random Forest and Linear Regression.
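The evaluation procedure described in the abstract (k-fold cross-validation scored with MAE, RMSE, and R²) can be sketched with the Python standard library alone. This is a minimal illustration, not the authors' code: the toy ages and charges are invented, a one-feature least-squares line stands in for the paper's models, the folds are contiguous rather than shuffled, and all function names are hypothetical.

```python
import math

# Hypothetical toy data, invented for illustration (not from the paper).
AGES = [19, 23, 28, 33, 37, 41, 46, 52, 58, 63]
CHARGES = [2100.0, 2900.0, 3600.0, 4800.0, 5300.0,
           6400.0, 7200.0, 8700.0, 9600.0, 11000.0]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def kfold_indices(n, k):
    """Yield (train_idx, test_idx) for k contiguous, unshuffled folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x (stand-in model)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def cross_validate(x, y, k):
    """Average MAE, RMSE, and R^2 over k folds of the one-feature fit."""
    totals = {"mae": 0.0, "rmse": 0.0, "r2": 0.0}
    for train, test in kfold_indices(len(x), k):
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        y_true = [y[i] for i in test]
        y_pred = [a + b * x[i] for i in test]
        totals["mae"] += mae(y_true, y_pred)
        totals["rmse"] += rmse(y_true, y_pred)
        totals["r2"] += r2(y_true, y_pred)
    return {name: total / k for name, total in totals.items()}


scores = cross_validate(AGES, CHARGES, k=5)
print({name: round(value, 3) for name, value in scores.items()})
```

Comparing several models under this scheme would mean running the same folds for each one and comparing the averaged scores. Lower MAE and RMSE and higher R² indicate a better fit.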

Issue

Vol. 15 No. 4 (2025): Volume-15 Issue-4, September 2025

Section

Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

How to Cite

[1]

Dr. Annwesha Banerjee Majumder, Dr. Sumit Das, Aniruddha Biswas, Trisita Ghosh, Raj Poddar, and Suchetana Chakraborty, “Application of Machine Learning Models for Patients Health Insurance Cost Prediction”, IJSCE, vol. 15, no. 4, pp. 11–16, Sep. 2025, doi: 10.35940/ijsce.D3685.15040925.
