Enhancing Accuracy of MBTI Personality Prediction Using Deep Ensemble Models and Data Augmentation Techniques

Devraj Patel
Dr. Sunita Vikrant Dhavale
Dr. Bhushan B. Mhetre

Abstract

Predicting personality traits from text has broad applications in fields such as recruitment, job performance analysis, adaptive learning, and personalised systems. Although traditional psychological assessments are widely used, they can be subjective and are impractical for large-scale deployment because they require the physical presence of a psychologist. This study presents an automated personality prediction model that uses text data. To address class imbalance, a significant factor that degrades model performance on the personality text dataset, we implement a two-tier oversampling strategy. The primary contribution of this study is a systematic evaluation of deep learning architectures, including Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) networks, and Bidirectional LSTMs (BiLSTM), for Myers-Briggs Type Indicator (MBTI) prediction. We also explore ensemble learning approaches that combine separable CNN, LeNet-5, LSTM, and BiLSTM models, further improving prediction accuracy and generalisation. The experimental results show that integrating the proposed oversampling technique with the ensemble learning framework achieves an accuracy exceeding 87% and outperforms previous models based solely on a single architecture or on machine learning methods. The proposed method enables large-scale personality assessments to be deployed anywhere, at any time, reducing the need for the physical presence of psychologists.
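The two ideas the abstract combines, balancing the classes before training and pooling the outputs of several models, can be illustrated with a minimal sketch. The abstract does not describe the two-tier oversampling strategy or how the ensemble merges its members' predictions, so the code below substitutes two generic stand-ins: plain random duplication of minority-class samples and soft voting (averaging class probabilities). The function names `random_oversample` and `soft_vote`, the toy labels, and the probability values are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class samples until every class matches the largest.

    Assumption: this is simple random oversampling, used only as a stand-in
    for the paper's two-tier strategy, which the abstract does not specify.
    """
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    target = max(len(items) for items in by_class.values())
    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Draw extra samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


def soft_vote(prob_rows):
    """Average per-model class probabilities; return (arg-max index, mean).

    Assumption: soft voting is one common way to combine models; the
    abstract does not say which combination rule the authors use.
    """
    n_models = len(prob_rows)
    n_classes = len(prob_rows[0])
    mean = [sum(row[c] for row in prob_rows) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: mean[c])
    return best, mean


# Toy data: three posts labelled "I" and one labelled "E" (hypothetical).
posts = ["post a", "post b", "post c", "post d"]
types = ["I", "I", "I", "E"]
balanced_posts, balanced_types = random_oversample(posts, types)
print(Counter(balanced_types))

# Hypothetical probabilities for classes [I, E] from three separate models.
model_outputs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
predicted_index, mean_probs = soft_vote(model_outputs)
print(predicted_index, mean_probs)
```

In this toy run the first model favours class 0 while the other two favour class 1, and the averaged probabilities select class 1. This shows how an ensemble can override a single dissenting model.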

How to Cite

[1]
Devraj Patel, Dr. Sunita Vikrant Dhavale, and Dr. Bhushan B. Mhetre, “Enhancing Accuracy of MBTI Personality Prediction Using Deep Ensemble Models and Data Augmentation Techniques”, IJRTE, vol. 14, no. 5, pp. 8–18, Jan. 2026, doi: 10.35940/ijrte.D8315.14050126.

