Comparative Analysis of Deep fake Video and Audio Detection

Harish Chandra Prasad; Arti Gautam Dinker

doi:10.70454/JRIST.2025.10102

Authors

Harish Chandra Prasad, School of Information & Communication Technology, Gautam Buddha University, Greater Noida, Uttar Pradesh, India
Dr. Arti Gautam Dinker, School of Information & Communication Technology, Gautam Buddha University, Greater Noida, Uttar Pradesh, India

DOI:

https://doi.org/10.70454/JRIST.2025.10102

Keywords:

Deepfake Detection, ResNet50, EfficientNet B0, MFCC, Random Forest Classifier

Abstract

Deepfake ("deep learning + fake" = DF) refers to forged videos and audio generated using AI algorithms. While deepfakes can be a source of entertainment, they can also be harmful in various ways. The manipulation of both audio and video for harmful purposes has been a concern for more than 10 years. The ability to detect these videos and audio recordings with AI detectors is a motivating factor in achieving the best results for the project. This paper contains a comparative study of the existing research on deepfake issues, showcasing accuracy, F1 score, bar plots, and graphs of the same. While several approaches and datasets are available for exploring deepfake videos, audio deepfakes have been relatively neglected. In this work, we propose the idea of joint deepfake video and audio detection using a hybrid deep learning model ensembling ResNet50 and EfficientNet B0. The dataset comprising real and synthetic voice recordings was selected from the SceneFake repository on Kaggle. Key audio features, including Mel-frequency cepstral coefficients (MFCCs), spectral centroid, chroma, zero-crossing rate, and root mean square energy (RMSE), were extracted using the librosa library. A Random Forest classifier was trained to detect audio deepfakes, while the DFDV dataset was used to extract facial frames from videos.
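
Two of the audio features named in the abstract, zero-crossing rate and root mean square energy, are simple enough to illustrate directly. The following is a minimal sketch in plain Python, not the authors' pipeline: the abstract states that librosa was used (which provides `librosa.feature.zero_crossing_rate` and `librosa.feature.rms`), and librosa's framing and padding defaults differ from this simplified version. The helper names, the frame and hop lengths, and the synthetic 440 Hz tone are all assumptions made for illustration.

```python
import math


def frame_signal(samples, frame_length, hop_length):
    """Split a list of samples into overlapping frames (no padding)."""
    return [
        samples[start:start + frame_length]
        for start in range(0, len(samples) - frame_length + 1, hop_length)
    ]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def rms_energy(frame):
    """Root mean square of the samples in one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def summarize(samples, frame_length=400, hop_length=200):
    """Mean ZCR and mean RMS over all frames, as a small feature vector.

    frame_length and hop_length are illustrative values, not taken
    from the paper.
    """
    frames = frame_signal(samples, frame_length, hop_length)
    zcr = [zero_crossing_rate(f) for f in frames]
    rms = [rms_energy(f) for f in frames]
    return [sum(zcr) / len(zcr), sum(rms) / len(rms)]


# Synthetic 440 Hz tone sampled at 16 kHz, standing in for a real recording.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
features = summarize(tone)
```

In a setup like the one the abstract describes, per-recording summaries of this kind (alongside MFCC, spectral centroid, and chroma statistics) would form the rows of the feature matrix passed to a classifier such as a random forest.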

