Archive | 2021

VISION AID: Scene Recognition Through Caption Generation Using Deep Learning

Abstract

Visually impaired individuals rely heavily on their other senses, such as hearing and touch, to comprehend the world around them. It is very difficult for a visually impaired individual to perceive objects without touching them, yet there are times when physical contact between the individual and the object is risky or even deadly. This paper presents a real-time object recognition application to aid the visually impaired. A camera-linked mobile phone with a systematised orientation provides the input to a computing device that performs real-time object detection. The proposed system utilises a convolutional neural network (CNN) to recognise objects it has been trained on in the captured imagery, and a recurrent neural network (RNN) with long short-term memory (LSTM) to generate captions. A caption dataset is used to train the captioning model. After training, these neural models can generate captions describing the objects. The network output is then conveyed to visually impaired users in audio form by converting the generated captions to audio. Experimental results on the MS-COCO dataset show that our design outperforms the state of the art.
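The pipeline described in the abstract (image features from a CNN, word-by-word caption generation from an LSTM decoder, then conversion to audio) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `generate_caption` and `describe_scene`, the `encode`/`step`/`speak` interfaces, the greedy decoding strategy, and the toy stand-ins are all assumptions made for illustration.

```python
START, END = "<start>", "<end>"

# Hypothetical interfaces (assumptions, not taken from the paper):
#   encode(image)                   -> image features, e.g. from a CNN
#   step(features, previous, state) -> (next_token, new_state), e.g. one LSTM step
#   speak(text)                     -> hands the caption to a text-to-speech engine


def generate_caption(image, encode, step, max_len=20):
    """Greedy decoding: emit one token at a time until END or max_len."""
    features = encode(image)
    token, state, words = START, None, []
    for _ in range(max_len):
        token, state = step(features, token, state)
        if token == END:
            break
        words.append(token)
    return " ".join(words)


def describe_scene(image, encode, step, speak, max_len=20):
    """Caption the image, pass the caption to the speech callback, return it."""
    caption = generate_caption(image, encode, step, max_len)
    speak(caption)
    return caption


# Toy stand-ins so the sketch runs without any trained model.
def toy_encode(image):
    # Average each pixel row; a real encoder would be a trained CNN.
    return [sum(row) / len(row) for row in image]


CANNED = ["a", "dog", "on", "grass", END]


def toy_step(features, previous, state):
    # Replays a fixed token sequence; a real decoder would be a trained LSTM.
    index = 0 if state is None else state
    return CANNED[index], index + 1


spoken = []  # stands in for an audio device
caption = describe_scene([[0, 1], [1, 0]], toy_encode, toy_step, spoken.append)
```

In a real system the toy functions would be replaced by a trained CNN encoder, a trained LSTM decoder, and a text-to-speech engine; the control flow around them would stay broadly the same.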

Volume None

Pages 473-481

DOI 10.1007/978-981-33-4543-0_51

Language English

Journal None
