DEEP LEARNING IN IMAGE RECOGNITION: A COMPARATIVE REVIEW OF ARCHITECTURES AND MODELS

Alladi Deekshith

Abstract

Deep learning has revolutionized image recognition, providing state-of-the-art performance across applications ranging from medical diagnostics to autonomous vehicles. This comparative review explores the evolution of deep learning architectures and models used in image recognition. We categorize and analyze prominent architectures, including Convolutional Neural Networks (CNNs), Residual Networks (ResNets), Inception Networks, and more recent developments such as Vision Transformers (ViTs). The review highlights the key features, strengths, and limitations of each architecture and discusses their performance on standard benchmark datasets such as ImageNet, CIFAR-10, and MNIST. Additionally, we examine the impact of transfer learning, data augmentation, and regularization techniques on model performance. By synthesizing current research, this review aims to provide insights into selecting appropriate architectures for specific image recognition tasks and to identify future research directions that could enhance the capabilities of deep learning models in this domain.

How to Cite
[1]
Alladi Deekshith, “DEEP LEARNING IN IMAGE RECOGNITION: A COMPARATIVE REVIEW OF ARCHITECTURES AND MODELS”, IEJRD - International Multidisciplinary Journal, vol. 8, no. 6, p. 7, Dec. 2023.
