Adversarial image caption generator network
Dehaqi et al., 2021
- Document ID
- 14874347876408816400
- Author
- Dehaqi A
- Seydi V
- Madadi Y
- Publication year
- 2021
- Publication venue
- SN Computer Science
Snippet
Image captioning is the task of generating a description of an image, which requires recognizing the important attributes in the image and the relationships between them. The task also requires generating semantically and syntactically correct sentences. Most image captioning models are based …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/27—Automatic analysis, e.g. parsing
- G06F17/2705—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/3061—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/68—Methods or arrangements for recognition using electronic means using sequential comparisons of the image signals with a plurality of references in which the sequence of the image signals or the references is relevant, e.g. addressable memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding or deleting nodes or connections, pruning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computer systems utilising knowledge based models
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
Similar Documents
Publication | Title |
---|---|
CN110737801B (en) | Content classification method, apparatus, computer device, and storage medium |
Kang et al. | Convolve, attend and spell: An attention-based sequence-to-sequence model for handwritten word recognition |
CN110188358B (en) | Training method and device for natural language processing model |
Karpathy et al. | Deep visual-semantic alignments for generating image descriptions |
CN110750959A (en) | Text information processing method, model training method and related device |
CN110704601A (en) | Method for solving video question-answering task requiring common knowledge by using problem-knowledge guided progressive space-time attention network |
CN114676234A (en) | Model training method and related equipment |
CN114818691A (en) | Article content evaluation method, device, equipment and medium |
CN112309528B (en) | Medical image report generation method based on visual question-answering method |
CN111985243B (en) | Emotion model training method, emotion analysis device and storage medium |
CN111145914B (en) | Method and device for determining text entity of lung cancer clinical disease seed bank |
Halvardsson et al. | Interpretation of Swedish sign language using convolutional neural networks and transfer learning |
Huang et al. | C-RNN: a fine-grained language model for image captioning |
Faiyaz Khan et al. | Improved Bengali image captioning via deep convolutional neural network based encoder-decoder model |
Liu et al. | Sign language recognition from digital videos using feature pyramid network with detection transformer |
Keren et al. | Deep learning for multisensorial and multimodal interaction |
CN116385937B (en) | Method and system for solving video question and answer based on multi-granularity cross-mode interaction framework |
Suresh et al. | Image captioning encoder–decoder models using CNN-RNN architectures: A comparative study |
Ferlitsch | Deep Learning Patterns and Practices |
CN112132075B (en) | Method and medium for processing image-text content |
Dehaqi et al. | Adversarial image caption generator network |
Tannert et al. | FlowchartQA: the first large-scale benchmark for reasoning over flowcharts |
Alwaneen et al. | Stacked dynamic memory-coattention network for answering why-questions in Arabic |
CN115759262A (en) | Visual common sense reasoning method and system based on knowledge perception attention network |
Kyaw et al. | Automated recognition of Myanmar sign language using deep learning module |