Real-time Face Identification from Video in an Uncontrolled Environment using CNN

Patel Bhautika R.; Desai Apurva A

doi:10.35940/ijitee.A3710.15030226


Published: 28-02-2026

Keywords:

Video-Based Face Recognition, Deep Learning, Convolutional Neural Network, Multi-Task Cascaded Convolutional Neural Network

Patel Bhautika R.

Assistant Professor, Smt. Tanuben & Dr. Manubhai Trivedi College of Information Science, Surat (Gujarat), India.

https://orcid.org/0000-0003-4617-4172

Desai Apurva A

Professor and Head, Department of Computer Science, Veer Narmad South Gujarat University, Surat (Gujarat), India.

Abstract

Fraud and security breaches are a persistent problem in modern life. Biometric characteristics such as the eyes, face, fingers, and palms are used to address them. Among these, face recognition is considered one of the least intrusive methods and is frequently used to identify or verify an individual. It is one of the most successful applications of computer vision and has received considerable attention in recent years. Deep learning networks have achieved state-of-the-art performance in still-image-based face recognition. Video-based face recognition is a harder task because of variations in video quality, pose, occlusion, and illumination, and because it entails processing a large volume of data. We address these challenges by developing an efficient deep learning model that is trained, tested, and evaluated on the YouTube Face Dataset, which is designed for unconstrained face recognition in videos. A deep learning face detection algorithm, the Multi-task Cascaded Convolutional Neural Network (MTCNN), is employed to detect and localise faces in videos, and a convolutional neural network (CNN) performs feature extraction and face recognition. The proposed model achieves a test accuracy of 93.11% on the YouTube Face Dataset. The aim of this work is to improve face recognition accuracy in the presence of intra-video variations.
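As a rough illustration of the detection stage, the original MTCNN design prunes overlapping candidate face boxes with non-maximum suppression (NMS) between its cascaded networks. The sketch below shows only that pruning step in plain Python. The box format `(x1, y1, x2, y2, score)`, the function names, the sample candidates, and the 0.5 overlap threshold are assumptions chosen for illustration; they are not taken from the paper, and this is not the authors' implementation.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2, score)
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes: List[Box], iou_threshold: float = 0.5) -> List[Box]:
    """Keep the highest-scoring box among boxes that overlap heavily."""
    kept: List[Box] = []
    # Visit candidates from most to least confident.
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        # Keep a box only if it does not overlap a kept box too much.
        if all(iou(box, k) <= iou_threshold for k in kept):
            kept.append(box)
    return kept


# Made-up candidates: two near-duplicate detections of one face, plus one other face.
candidates: List[Box] = [
    (10, 10, 50, 50, 0.90),
    (12, 12, 52, 52, 0.80),
    (100, 100, 140, 140, 0.70),
]
faces = non_max_suppression(candidates)
```

Here the second candidate overlaps the first well above the threshold and is discarded, so `faces` holds two boxes. In a full pipeline the surviving boxes would be cropped from each frame and passed to the recognition CNN.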

Issue

Vol. 15 No. 3 (2026): Volume-15 Issue-3, February 2026

Section

Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

CC-BY-NC-ND 4.0

How to Cite

[1]

Patel Bhautika R. and Desai Apurva A, “Real-time Face Identification from Video in an Uncontrolled Environment using CNN”, IJITEE, vol. 15, no. 3, pp. 1–7, Feb. 2026, doi: 10.35940/ijitee.A3710.15030226.

