2021 6th International Conference on Communication and Electronics Systems (ICCES) | 2021

Image Captioning Using Inception V3 Transfer Learning Model


Abstract


With the rapid growth of artificial intelligence in recent years, image captioning has attracted the interest of many researchers and has become a fascinating and challenging problem. Image captioning, which automatically generates natural-language descriptions from the content of an image, is a critical component of scene understanding that combines computer vision and natural language processing. This paper uses several NLP techniques to interpret an image and describe its meaning in a natural language such as English. The proposed Inception V3 image caption generator model uses CNN (Convolutional Neural Network) and LSTM (Long Short-Term Memory) units. The InceptionV3 model was pretrained on the ImageNet dataset, which covers 1000 classes, and was imported directly from the Keras applications module. The last classification layer is removed from the InceptionV3 model to obtain a feature vector of dimension (1343,). An embedding matrix is used to capture relationships among vocabulary words. An embedding is a linear transformation of the original space into a real-valued space in which important relationships are preserved. Image captioning is widely used and is important, for example, in implementing human-computer interaction.
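
The role of the embedding matrix can be illustrated with a minimal sketch. The abstract does not give the vocabulary, the embedding dimension, or the matrix values, so the five-word vocabulary, the 5 x 3 matrix, and the helper names (`one_hot`, `linear_map`, `embed`) below are hypothetical and chosen only for illustration. The sketch shows that looking up a word's row in the matrix gives the same result as applying the matrix as a linear map to the word's one-hot vector.

```python
# Hypothetical toy vocabulary; a real one would be built from the training captions.
vocab = ["<start>", "a", "dog", "runs", "<end>"]
word_to_index = {word: i for i, word in enumerate(vocab)}

# Hypothetical 5 x 3 embedding matrix: one row of real values per vocabulary word.
# In practice these values are learned or loaded from pretrained word vectors.
embedding_matrix = [
    [0.0, 0.0, 0.0],
    [0.1, 0.3, -0.2],
    [0.7, -0.5, 0.4],
    [-0.3, 0.8, 0.6],
    [0.2, 0.2, 0.2],
]


def one_hot(index, size):
    """Return the one-hot vector of the given size with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def linear_map(vector, matrix):
    """Multiply a row vector by a matrix, i.e. compute vector @ matrix."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [
        sum(vector[r] * matrix[r][c] for r in range(n_rows))
        for c in range(n_cols)
    ]


def embed(word):
    """Look up the embedding of a word by selecting its row in the matrix."""
    return embedding_matrix[word_to_index[word]]


# The row lookup and the linear map of the one-hot vector agree.
dog_one_hot = one_hot(word_to_index["dog"], len(vocab))
dog_by_matmul = linear_map(dog_one_hot, embedding_matrix)
dog_by_lookup = embed("dog")
```

In a Keras model this lookup is normally handled by an embedding layer, and the resulting word vectors are what the LSTM units consume at each step of the caption.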

Pages 1103-1108
DOI 10.1109/ICCES51350.2021.9489111
Language English
Journal 2021 6th International Conference on Communication and Electronics Systems (ICCES)
