Verma et al., 2024 - Google Patents

Automatic image caption generation using deep learning

Document ID: 2300853860263171345
Author: Verma A, Yadav A, Kumar M, Yadav D
Publication year: 2024
Publication venue: Multimedia Tools and Applications

Snippet

Image captioning is an interesting and challenging task with applications in diverse domains such as image retrieval, organizing and locating images of users' interest, etc. It has huge potential for replacing manual caption generation for images and is especially suitable for …
Full text available at www.researchsquare.com (PDF).

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30634 Querying
    • G06F17/30657 Query processing
    • G06F17/3066 Query translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2705 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30861 Retrieval from the Internet, e.g. browsers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30017 Multimedia data retrieval; Retrieval of more than one type of audiovisual media
    • G06F17/30023 Querying
    • G06F17/30038 Querying based on information manually generated or based on information not derived from the media content, e.g. tags, keywords, comments, usage information, user ratings
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/27 Automatic analysis, e.g. parsing
    • G06F17/2765 Recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/21 Text processing
    • G06F17/22 Manipulating or registering by use of codes, e.g. in sequence of text characters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30781 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F17/30784 Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/20 Handling natural language data
    • G06F17/28 Processing or translating of natural language
    • G06F17/2809 Data driven translation
    • G06F17/2827 Example based machine translation; Alignment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/62 Methods or arrangements for recognition using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06K RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K9/00 Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    • G06K9/36 Image preprocessing, i.e. processing the image information without deciding about the identity of the image
    • G06K9/46 Extraction of features or characteristics of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00 Subject matter not provided for in other groups of this subclass

Similar Documents

Publication Publication Date Title
Li et al. Visual to text: Survey of image and video captioning
Li et al. Know more say less: Image captioning based on scene graphs
CN108628828B (en) Combined extraction method based on self-attention viewpoint and holder thereof
CN112000818B (en) Text and image-oriented cross-media retrieval method and electronic device
Gupta et al. Integration of textual cues for fine-grained image captioning using deep CNN and LSTM
Xiao et al. Dense semantic embedding network for image captioning
Verma et al. Automatic image caption generation using deep learning
Biswas et al. Towards explanatory interactive image captioning using top-down and bottom-up features, beam search and re-ranking
Li et al. Bundled object context for referring expressions
Salur et al. A soft voting ensemble learning-based approach for multimodal sentiment analysis
Su et al. Hierarchical deep neural network for image captioning
Guo et al. Implicit discourse relation recognition via a BiLSTM-CNN architecture with dynamic chunk-based max pooling
CN114818717A (en) Chinese named entity recognition method and system fusing vocabulary and syntax information
Huang et al. An effective multimodal representation and fusion method for multimodal intent recognition
Merkx et al. Learning semantic sentence representations from visually grounded language without lexical knowledge
Pande et al. Development and deployment of a generative model-based framework for text to photorealistic image generation
CN115730232A (en) Topic-correlation-based heterogeneous graph neural network cross-language text classification method
Dai et al. Visual relationship detection based on bidirectional recurrent neural network
Cao et al. Visual question answering research on multi-layer attention mechanism based on image target features
Xie et al. Extractive text-image summarization with relation-enhanced graph attention network
Sharma et al. Graph neural network-based visual relationship and multilevel attention for image captioning
Liu et al. A multimodal approach for multiple-relation extraction in videos
Al-Shamayleh et al. A comprehensive literature review on image captioning methods and metrics based on deep learning technique
Zhang et al. Chinese-English mixed text normalization
Zhan et al. Improving offline handwritten Chinese text recognition with glyph-semanteme fusion embedding