Forensic Analysis of Deepfake Audio Detection

Dr. Girija Chiddarwar
Nayan Bansal
Nikhilesh Sakhare
Sushanth Bangera
Sakshi Pawar

Abstract

The rise of deepfake audio technologies poses significant challenges to authenticity verification, necessitating effective detection methods. Traditional techniques, such as manual forensic analysis, basic machine learning approaches, speech-to-text conversion, and Short-Time Fourier Transform (STFT) analysis, have been employed to identify manipulated audio. However, these methods often fall short due to their time-consuming nature, inability to handle complex sequential data, and susceptibility to high-quality synthetic audio. This paper presents an approach that leverages Long Short-Term Memory (LSTM) networks and Mel-Frequency Cepstral Coefficients (MFCCs) for deepfake audio detection. By harnessing the power of deep learning, LSTMs can effectively capture temporal dependencies in audio data, allowing the identification of subtle anomalies that indicate manipulation. MFCCs provide robust audio features that align closely with human auditory perception, enhancing the model's sensitivity to synthetic alterations. Additionally, our methodology incorporates enhanced preprocessing techniques to ensure high-quality input data, further improving detection accuracy. The proposed system demonstrates a significant advancement in deepfake audio detection, providing a more reliable solution against increasingly sophisticated audio manipulations.
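The MFCC feature-extraction stage described above follows a standard pipeline: frame the waveform, window each frame, take its power spectrum, pool through a triangular mel filterbank, apply a log, and decorrelate with a DCT. The sketch below is an illustrative NumPy reconstruction of that generic pipeline, not the authors' implementation; all parameter values (sample rate, frame size, hop, filter counts) are assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, n_fft=512, hop=256, n_mels=26, n_ceps=13):
    # Frame the signal and apply a Hamming window to reduce spectral leakage
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] for i in range(n_frames)])
    frames = frames * np.hamming(n_fft)
    # Power spectrum of each frame (rfft gives n_fft // 2 + 1 bins)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular mel filterbank spanning 0 Hz .. Nyquist
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / max(right - center, 1)
    # Log mel energies, then a DCT-II to decorrelate -> cepstral coefficients
    log_mel = np.log(power @ fbank.T + 1e-10)
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), (2 * n + 1) / (2 * n_mels)))
    return log_mel @ dct.T  # shape: (n_frames, n_ceps)
```

The resulting (frames x coefficients) matrix is exactly the kind of sequential representation an LSTM consumes frame by frame, which is why MFCCs pair naturally with the recurrent model described in the abstract.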



How to Cite

[1]
Dr. Girija Chiddarwar, Nayan Bansal, Nikhilesh Sakhare, Sushanth Bangera, and Sakshi Pawar, "Forensic Analysis of Deepfake Audio Detection," IJRTE, vol. 14, no. 2, pp. 32-37, Jul. 2025, doi: 10.35940/ijrte.B8262.14020725.