Abstract
This paper outlines Computational Knowledge Vision, a framework that combines structured knowledge with visual models. In image and visual perception research, a visual model's capacity for understanding and reasoning often determines whether it performs well in complex scenarios. We survey the current mainstream vision models for visual perception and then propose the concept and basic framework of Computational Knowledge Vision, which extends knowledge engineering methodology to computer vision. We first review prior work related to Computational Knowledge Vision from the perspective of the connectionist and symbolist schools, discussing neural network models, meta-learning models, graph models, and Transformer models in detail. We then present a basic framework for Computational Knowledge Vision whose essential techniques are structured knowledge, knowledge projection, and conditional feedback. The goal of the framework is to give visual models the abilities of representation, understanding, and reasoning. Finally, we describe in-depth work on Computational Knowledge Vision and its extensions to other fields.
Discover the latest articles, news and stories from top researchers in related subjects.References
Achille A, Lam M, Tewari R, Ravichandran A, Maji S, Fowlkes CC, Soatto S, Perona P (2019) Task2vec: task embedding for meta-learning. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 6430–6439
Adey P, Shayer M (1988) Strategies for meta-learning in physics. Phys Educ 23(2):97
Ainslie J, Ontanon S, Alberti C, Cvicek V, Fisher Z, Pham P, Ravula A, Sanghai S, Wang Q, Yang L (2020) ETC: encoding long and structured inputs in transformers. In: Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP), Association for computational linguistics, Online, pp 268–284. https://doi.org/10.18653/v1/2020.emnlp-main.19. https://www.aclweb.org/anthology/2020.emnlp-main.19
Antol S, Agrawal A, Lu J, Mitchell M, Batra D, Zitnick CL, Parikh D (2015) Vqa: visual question answering. In: Proceedings of the IEEE international conference on computer vision (ICCV)
Araya D (2013) Thinking forward: Conrad wolfram on the computational knowledge economy. E-Learn Digit Media 10(3):324–327
Arditi A, Legge G, Granquist C, Gage R, Clark D (2021) Reduced visual acuity is mirrored in low vision imagery. Br J Psychol 112:611
Aristotle A (1995) The art of rhetoric, trans. John Henry Freese, Loeb Classical Library
Arjovsky M, Chintala S, Bottou L (2017) Wasserstein generative adversarial networks. In: Precup D, Teh YW (eds) Proceedings of the 34th international conference on machine learning. PMLR, International Convention Centre, Sydney, Australia, Proceedings of machine learning research, vol 70, pp 214–223, http://proceedings.mlr.press/v70/arjovsky17a.html
Babak Z, Quoc KT (2021) Deep learning-based pupil model predicts time and spectral dependent light responses. Sci Rep (Nature Publisher Group) 11(1):1–16
Bae H, Kim SJ, Kim CE (2021) Lessons from deep neural networks for studying the coding principles of biological neural networks. Front Syst Neurosci 14:103
Barbu A, Mayo D, Alverio J, Luo W, Wang C, Gutfreund D, Tenenbaum J, Katz B (2019) Objectnet: a large-scale bias-controlled dataset for pushing the limits of object recognition models. In: Wallach H, Larochelle H, Beygelzimer A, d’Alché-Buc F, Fox E, Garnett R (eds) Advances in neural information processing systems, Curran Associates, Inc., vol 32, pp 9453–9463. https://proceedings.neurips.cc/paper/2019/file/97af07a14cacba681feacf3012730892-Paper.pdf
Barsalou LW et al (1999) Perceptual symbol systems. Behav Brain Sci 22(4):577–660
Beltagy I, Peters ME, Cohan A (2020) Longformer: the long-document transformer. arXiv:2004.05150
Bengio Y (2019) From system 1 deep learning to system 2 deep learning. In: Proceedings of thirty-third conference on neural information processing systems
Bengio Y (2020a) Deep learning for system 2 processing. http://www.iro.umontreal.ca/~bengioy/AAAI-9feb2020.pdf
Bengio Y (2020b) Priors for semantic variables. https://www.ias.edu/video/machinelearning/2020/0723-YoshuaBengio
Bengio Y, Ducharme R, Vincent P, Janvin C (2003) A neural probabilistic language model. J Mach Learn Res 3:1137–1155
Bensusan H, Giraud-Carrier CG, Kennedy CJ (2000) A higher-order approach to meta-learning. ILP Work-in-progress reports 35
Bhatnagar G, Wu QJ, Raman B (2013) Discrete fractional wavelet transform and its application to multiple encryption. Inf Sci 223:297–316. https://doi.org/10.1016/j.ins.2012.09.053
Biggs JB (1985) The role of metalearning in study processes. Br J Educ Psychol 55(3):185–212
Bordes A, Weston J, Collobert R, Bengio Y (2011) Learning structured embeddings of knowledge bases. In: Proceedings of the AAAI conference on artificial intelligence, vol 25
Bordes A, Usunier N, Garcia-Duran A, Weston J, Yakhnenko O (2013) Translating embeddings for modeling multi-relational data. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ (eds) Advances in neural information processing systems, Curran Associates, Inc., vol 26, pp 2787–2795. https://proceedings.neurips.cc/paper/2013/file/1cecc7a77928ca8133fa24680a88d2f9-Paper.pdf
Bourlard H, Kamp Y (1989) Auto-association by multilayer perceptrons and singular value decomposition. Biol Cybern 59(4):291–294
Brady M (1984) Artificial intelligence and robotics, pp 47–63
Bronskill J, Gordon J, Requeima J, Nowozin S, Turner R (2020) TaskNorm: rethinking batch normalization for meta-learning. In: III HD, Singh A (eds) Proceedings of the 37th international conference on machine learning, PMLR, Proceedings of machine learning research, vol 119, pp 1153–1164. http://proceedings.mlr.press/v119/bronskill20a.html
Bruna J, Zaremba W, Szlam A, LeCun Y (2013) Spectral networks and locally connected networks on graphs. arXiv:1312.6203
Buchanan BG (2005) A (very) brief history of artificial intelligence. AI Mag 26(4):53–53
Cai H, Zheng VW, Chang KC (2018) A comprehensive survey of graph embedding: problems, techniques, and applications. IEEE Trans Knowl Data Eng 30(9):1616–1637. https://doi.org/10.1109/TKDE.2018.2807452
Carion N, Massa F, Synnaeve G, Usunier N, Kirillov A, Zagoruyko S (2020) End-to-end object detection with transformers. In: Vedaldi A, Bischof H, Brox T, Frahm JM (eds) Computer vision - ECCV 2020. Springer International Publishing, Cham, pp 213–229
Chan PK, Stolfo SJ (1993) Experiments on multistrategy learning by meta-learning. In: Proceedings of the second international conference on information and knowledge management, pp 314–323
Chao WL, Ye HJ, Zhan DC, Campbell M, Weinberger KQ (2020) Revisiting meta-learning as supervised learning. arXiv:2002.00573
Chaum D, Rivest RL, Sherman AT (1983) Advances in cryptology. Springer, New York
Chen T, Lin L, Chen R, Wu Y, Luo X (2018) Knowledge-embedded representation learning for fine-grained image recognition. In: Proceedings of the 27th international joint conference on artificial intelligence. AAAI Press, IJCAI’18, pp 627–634
Child R, Gray S, Radford A, Sutskever I (2019) Generating long sequences with sparse transformers. arXiv:1904.10509
Choi E, Bahadori MT, Song L, Stewart WF, Sun J (2017) Gram: graph-based attention model for healthcare representation learning. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, association for computing machinery, New York, KDD ’17, pp 787–795. https://doi.org/10.1145/3097983.3098126
Choromanski K, Likhosherstov V, Dohan D, Song X, Gane A, Sarlos T, Hawkins P, Davis J, Mohiuddin A, Kaiser L, et al. (2020) Rethinking attention with performers. arXiv:2009.14794
Cini F, Ortenzi V, Corke P, Controzzi M (2019) On the choice of grasp type and location when handing over an object. Sci Robot 4(27):eaau9757. https://doi.org/10.1126/scirobotics.aau9757
Collins H (2010) Tacit and explicit knowledge. University of Chicago Press, Chicago
Cooper SB (2003) Computability theory. CRC Press, Boca Raton
Crevier D, Lepage R (1997) Knowledge-based image understanding systems: a survey. Comput Vis Image Underst 67(2):161–185. https://doi.org/10.1006/cviu.1996.0520
Cunha T, Soares C, de Carvalho AC (2018) Metalearning and recommender systems: a literature review and empirical study on the algorithm selection problem for collaborative filtering. Inf Sci 423:128–144
Dai Z, Yang Z, Yang Y, Carbonell J, Le Q, Salakhutdinov R (2019) Transformer-XL: Attentive language models beyond a fixed-length context. In: Proceedings of the 57th annual meeting of the association for computational linguistics, association for computational linguistics, Florence, pp 2978–2988. https://doi.org/10.18653/v1/P19-1285
Defferrard M, Bresson X, Vandergheynst P (2016) Convolutional neural networks on graphs with fast localized spectral filtering. arXiv:1606.09375
Denzler A, Kaufmann M (2017) Toward granular knowledge analytics for data intelligence: Extracting granular entity-relationship graphs for knowledge profiling. In: 2017 IEEE international conference on big data (Big Data), pp 923–928. https://doi.org/10.1109/BigData.2017.8258010
Descartes R, Haldane ES, Ross GRT (1993) Meditations on first philosophy in focus. Psychology Press, Hove
Devlin J, Chang MW, Lee K, Toutanova K (2018) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the north american chapter of the association for computational linguistics: human language technologies, vol 1 (Long and Short Papers), Association for Computational Linguistics, Minneapolis, Minnesota, pp 4171–4186. https://doi.org/10.18653/v1/N19-1423
Edmonds M, Gao F, Liu H, Xie X, Qi S, Rothrock B, Zhu Y, Wu YN, Lu H, Zhu SC (2019) A tale of two explanations: enhancing human trust by explaining robot behavior. Sci Robot 4(37):eaay4663
ElBedwehy MN, Ghoneim ME, Hassanien AE, Azar AT (2014) A computational knowledge representation model for cognitive computers. Neural Comput Appl 25(7):1517–1534. https://doi.org/10.1007/s00521-014-1614-0
Enderton HB (2010) Computability theory: an introduction to recursion theory. Academic Press, Cambridge
Feigenbaum E (2003) Some challenges and grand challenges for computational intelligence. J ACM 50:32–40
Feigenbaum E, McCorduck P (1983) The fifth generation: artificial intelligence and Japan’s computer challenge to the world. Addison-Wesley Longman Publishing Co., Boston
Feigenbaum EA (1961) Soviet cybernetics and computer sciences. IRE Trans Electr Comput EC 10(4):759–776. https://doi.org/10.1109/TEC.1961.5219285
Feigenbaum EA (1977) The art of artificial intelligence. 1. Themes and case studies of knowledge engineering. Tech. rep., Stanford Univ CA Dept of Computer Science
Feigenbaum EA (1992) Expert systems: principles and practice
Feng Y, Chen J, Yang Z, Song X, Chang Y, He S, Xu E, Zhou Z (2021) Similarity-based meta-learning network with adversarial domain adaptation for cross-domain fault identification. Knowl-Based Syst 217:106829. https://doi.org/10.1016/j.knosys.2021.106829
Ferryman JM, Maybank SJ, Worrall AD (2000) Visual surveillance for moving vehicles. Int J Comput Vis 37(2):187–197
Fred A, Dietz JL, Liu K, Filipe J (2020) Knowledge discovery, knowledge engineering and knowledge management. Springer, New York
Fukushima K, Miyake S, Ito T (1983) Neocognitron: a neural network model for a mechanism of visual pattern recognition. IEEE Trans Syst Man Cybern SMC 13(5):826–834. https://doi.org/10.1109/TSMC.1983.6313076
Gallese V, Lakoff G (2005) The brain’s concepts: the role of the sensory-motor system in conceptual knowledge. Cogn Neuropsychol 22(3–4):455–479
Gibson JJ (1977a) The concept of affordances. Perceiving, acting, and knowing 1
Gibson JJ (1977b) The theory of affordances. Hilldale, USA 1(2):67–82
Gilmer J, Schoenholz SS, Riley PF, Vinyals O, Dahl GE (2017) Neural message passing for quantum chemistry. In: Proceedings of the international conference on machine learning. PMLR, pp 1263–1272
Glass GV (1976) Primary, secondary, and meta-analysis of research. Educ Res 5(10):3–8
Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press. http://www.deeplearningbook.org
Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial networks. http://arxiv.org/abs/1406.2661
Graves A, Mohamed A, Hinton G (2013) Speech recognition with deep recurrent neural networks. In: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp 6645–6649, https://doi.org/10.1109/ICASSP.2013.6638947
Grier DA (2013) Edward feigenbaum. IEEE Ann Hist Comput 35(4):74–81. https://doi.org/10.1109/MAHC.2013.49
Grover A, Leskovec J (2016) Node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. Association for Computing Machinery, New York. KDD ’16, pp 855–864. https://doi.org/10.1145/2939672.2939754
Guo J, Lu S, Cai H, Zhang W, Yu Y, Wang J (2018) Long text generation via adversarial training with leaked information. In: Proceedings of the AAAI conference on artificial intelligence 32(1) https://ojs.aaai.org/index.php/AAAI/article/view/11957
Hafed ZM, Levine MD (2001) Face recognition using the discrete cosine transform. Int J Comput Vision 43(3):167–188
Hamilton WL, Ying R, Leskovec J (2017) Inductive representation learning on large graphs. http://arxiv.org/abs/1706.02216
Hassabis D, Kumaran D, Summerfield C, Botvinick M (2017) Neuroscience-inspired artificial intelligence. Neuron 95(2):245–258
Hasselt Hv, Guez A, Silver D (2016) Deep reinforcement learning with double q-learning. In: Proceedings of the thirtieth AAAI conference on artificial intelligence. AAAI Press, AAAI’16, pp 2094–2100
Hasson U, Nastase SA, Goldstein A (2020) Direct fit to nature: an evolutionary perspective on biological and artificial neural networks. Neuron 105(3):416–434. https://doi.org/10.1016/j.neuron.2019.12.002. https://www.sciencedirect.com/science/article/pii/S089662731931044X
Haugeland J (1989) Artificial intelligence: The very idea. MIT press
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp 770–778. https://doi.org/10.1109/CVPR.2016.90
He Y, Yan R, Fragkiadaki K, Yu SI (2020) Epipolar transformers. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR)
Henaff M, Bruna J, LeCun Y (2014) Deep convolutional networks on graph-structured data. http://arxiv.org/abs/1506.05163
Hendler J, Mulvehill AM (2016) Social machines: the coming collision of artificial intelligence, social networking, and humanity. Apress
Hinton GE (2009) Deep belief networks. Scholarpedia 4(5):5947
Hinton GE, et al. (1986) Learning distributed representations of concepts. In: Proceedings of the eighth annual conference of the cognitive science society, Amherst, MA, vol 1, p 12
Ho J, Kalchbrenner N, Weissenborn D, Salimans T (2019) Axial attention in multidimensional transformers. http://arxiv.org/abs/1912.12180
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
Höllerer MA, van Leeuwen T, Jancsary D, Meyer RE, Andersen TH, Vaara E (2019) Visual and multimodal research in organization and management studies. Routledge, London
Honavar V (1995) Symbolic artificial intelligence and numeric artificial neural networks: towards a resolution of the dichotomy, Springer US, Boston, pp 351–388. https://doi.org/10.1007/978-0-585-29599-2_11
Hong Y, Li Q, Ciao D, Huang S, Zhu SC (2021a) Learning by fixing:solving math word problems with weak supervision. In: Proceedings of the thirty-fifth AAAI conference on artificial intelligence
Hong Y, Li Q, Gong R, Ciao D, Huang S, Zhu SC (2021b) Smart: a situation model for algebra story problems via attributed grammar. In: Proceedings of the thirty-fifth AAAI conference on artificial intelligence, AAAI-21
Hopfield JJ (1982) Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci 79(8):2554–2558. https://doi.org/10.1073/pnas.79.8.2554
Hospedales T, Antoniou A, Micaelli P, Storkey A (2020) Meta-learning in neural networks: a survey. http://arxiv.org/abs/2004.05439
Høye TT, Ärje J, Bjerge K, Hansen OLP, Iosifidis A, Leese F, Mann HMR, Meissner K, Melvad C, Raitoharju J (2021) Deep learning and computer vision will transform entomology. Proc Natl Acad Sci 118(2). https://doi.org/10.1073/pnas.2002545117,
Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 2261–2269. https://doi.org/10.1109/CVPR.2017.243
Huang Q, Yang L, Huang H, Wu T, Lin D (2020) Caption-supervised face recognition: training a state-of-the-art face model without manual annotation. In: Vedaldi A, Bischof H, Brox T, Frahm JM (eds) Computer vision-ECCV 2020. Springer International Publishing, Cham, pp 139–155
Huang TJ (2017) Imitating the brain with neurocomputer a new way towards artificial general intelligence. Int J Autom Comput 14(5):520–531
Huisman M, van Rijn JN, Plaat A (2020) A survey of deep meta-learning. http://arxiv.org/abs/2010.03522
Hulme PE (2014) Bridging the knowing–doing gap: know-who, know-what, know-why, know-how and know-when. Wiley Online Library
Iglesias A, del Castillo M, Serrano J, Oliva J (2012) A computational knowledge-based model for emulating human performance in the iowa gambling task. Neural Netw 33:168–180. https://doi.org/10.1016/j.neunet.2012.05.008
Jiang X, Yu J, Qin Z, Zhuang Y, Zhang X, Hu Y, Wu Q (2020) Dualvd: an adaptive dual encoding model for deep visual understanding in visual dialogue. In: Proceedings of the AAAI conference on artificial intelligence 34(07):11125–11132. https://doi.org/10.1609/aaai.v34i07.6769. https://ojs.aaai.org/index.php/AAAI/article/view/6769
Johnson M (2008) The meaning of the body: aesthetics of human understanding. University of Chicago Press, Chicago
Joshi C (2020) Transformers are graph neural networks. The Gradient
Kahneman D (2011) Thinking, fast and slow. Macmillan, London
Kambhampati S (2021) Polanyi’s revenge and ai’s new romance with tacit knowledge. Commun ACM 64(2):31–32. https://doi.org/10.1145/3446369
Karras T, Aila T, Laine S, Lehtinen J (2018) Progressive growing of GANs for improved quality, stability, and variation. In: Proceedings of the international conference on learning representations. https://openreview.net/forum?id=Hk99zCeAb
Katharopoulos A, Vyas A, Pappas N, Fleuret F (2020) Transformers are rnns: Fast autoregressive transformers with linear attention. In: Proceedings of the international conference on machine learning (ICML)
Kinderkhedia M (2019) Learning representations of graph data–a survey. http://arxiv.org/abs/1906.02989
Kingma DP, Welling M (2013) Auto-encoding variational bayes. http://arxiv.org/abs/1312.6114
Kipf TN, Welling M (2016) Semi-supervised classification with graph convolutional networks. http://arxiv.org/abs/1609.02907
Kitaev N, Kaiser L, Levskaya A (2020) Reformer: the efficient transformer. In: Proceedings of the international conference on learning representations. https://openreview.net/forum?id=rkgNKkHtvB
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Pereira F, Burges CJC, Bottou L, Weinberger KQ (eds) Advances in neural information processing systems. Curran Associates, Inc., vol 25, pp 1097–1105. https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
Lamb L, Garcez A, Gori M, Prates M, Avelar P, Vardi M (2020) Graph neural networks meet neural-symbolic computing: a survey and perspective. http://arxiv.org/abs/2003.00330
Layer A (2017) Computer networking: a top down approach
Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In: Proceedings of the international conference on machine learning. PMLR, pp 1188–1196
Le Cacheux Y, Popescu A, Le Borgne H (2020) Webly supervised semantic embeddings for large scale zero-shot learning. In: Proceedings of the Asian conference on computer vision (ACCV)
Le Cun Y, Boser B, Denker JS, Henderson D, Howard RE, Hubbard W, Jackel LD (1989) Handwritten digit recognition with a back-propagation network. In: Proceedings of the 2nd international conference on neural information processing systems. MIT Press, Cambridge, NIPS’89, pp 396–404
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
Lee J, Lee Y, Kim J, Kosiorek A, Choi S, Teh YW (2019) Set transformer: A framework for attention-based permutation-invariant neural networks. In: Chaudhuri K, Salakhutdinov R (eds) Proceedings of the 36th international conference on machine learning, PMLR, Proceedings of machine learning research, vol 97, pp 3744–3753. http://proceedings.mlr.press/v97/lee19d.html
Lemke C, Budka M, Gabrys B (2015) Metalearning: a survey of trends and technologies. Artif Intell Rev 44(1):117–130
Li G, Zhu X, Zeng Y, Wang Q, Lin L (2019) Semantic relationships guided representation learning for facial action unit recognition. In: Proceedings of the AAAI conference on artificial intelligence vol 33(01), pp 8594–8601. https://doi.org/10.1609/aaai.v33i01.33018594. https://ojs.aaai.org/index.php/AAAI/article/view/4879
Li L, Lin YL, Zheng NN, Wang FY, Liu Y, Cao D, Wang K, Huang WL (2018) Artificial intelligence test: a case study of intelligent vehicles. Artif Intell Rev 50(3):441–465. https://doi.org/10.1007/s10462-018-9631-5
Li L, Wang X, Wang K, Lin Y, Xin J, Chen L, Xu L, Tian B, Ai Y, Wang J, Cao D, Liu Y, Wang C, Zheng N, Wang FY (2019b) Parallel testing of vehicle intelligence via virtual-real interaction. Sci Robot 4(28) https://doi.org/10.1126/scirobotics.aaw4106. https://robotics.sciencemag.org/content/4/28/eaaw4106
Li L, Zheng N, Wang F (2020) A theoretical foundation of intelligence testing and its application for intelligent vehicles. In: Proceedings of the IEEE transactions on intelligent transportation systems, pp 1–10. https://doi.org/10.1109/TITS.2020.2991039
Li Q, Huang S, Hong Y, Chen Y, Wu YN, Zhu SC (2020a) Closed loop neural-symbolic learning via integrating neural perception, grammar parsing, and symbolic reasoning. In: Proceedings of the international conference on machine learning (ICML)
Li Q, Peng X, Cao L, Du W, Xing H, Qiao Y, Peng Q (2020) Product image recognition with guidance learning and noisy supervision. Comput Vis Image Underst 196:102963. https://doi.org/10.1016/j.cviu.2020.102963. https://www.sciencedirect.com/science/article/pii/S1077314220300436
Li Q, Gkoumas D, Lioma C, Melucci M (2021) Quantum-inspired multimodal fusion for video sentiment analysis. Inf Fus 65:58–71
Li Z, Wallace E, Shen S, Lin K, Keutzer K, Klein D, Gonzalez J (2020c) Train big, then compress: rethinking model size for efficient training and inference of transformers. In: III HD, Singh A (eds) Proceedings of the 37th international conference on machine learning, PMLR, Proceedings of machine learning research, vol 119, pp 5958–5968. http://proceedings.mlr.press/v119/li20m.html
Lillicrap TP, Hunt JJ, Pritzel A, Heess N, Erez T, Tassa Y, Silver D, Wierstra D (2015) Continuous control with deep reinforcement learning. http://arxiv.org/abs/1509.02971
Lim EH, Liu JN, Lee RS (2013) Knowledge seeker-ontology modelling for information search and management. Springer, Cham
Lin Y, Liu Z, Sun M, Liu Y, Zhu X (2015) Learning entity and relation embeddings for knowledge graph completion. In: Proceedings of the AAAI conference on artificial intelligence, vol 29
Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, van der Laak JA, van Ginneken B, Sánchez CI (2017) A survey on deep learning in medical image analysis. Med. Image Anal 42:60–88. https://doi.org/10.1016/j.media.2017.07.005. https://www.sciencedirect.com/science/article/pii/S1361841517301135
Liu JNK, He Y, Lim EHY, Wang X (2013) A new method for knowledge and information management domain ontology graph model. IEEE Trans Syst Man Cybern Syst 43(1):115–127. https://doi.org/10.1109/TSMCA.2012.2196431
Liu L, Ouyang W, Wang X, Fieguth P, Chen J, Liu X, Pietikäinen M (2020) Deep learning for generic object detection: a survey. Int J Comput Vis 128(2):261–318
Liu L, Wang B, Kuang Z, Xue JH, Chen Y, Yang W, Liao Q, Zhang W (2021) Gendet: Meta learning to generate detectors from few shots. In: Proceedings of the IEEE transactions on neural networks and learning systems ,pp 1–13. https://doi.org/10.1109/TNNLS.2021.3053005
Liu PJ, Saleh M, Pot E, Goodrich B, Sepassi R, Kaiser L, Shazeer N (2018) Generating wikipedia by summarizing long sequences. In: Proceedings of the international conference on learning representations. https://openreview.net/forum?id=Hyg0vbWC-
Liu Y, Cheng M, Hu X, Bian J, Zhang L, Bai X, Tang J (2019) Richer convolutional features for edge detection. IEEE Trans Pattern Anal Mach Intell 41(8):1939–1946. https://doi.org/10.1109/TPAMI.2018.2878849
Liu Z, Chen C, Wang J, Huang Y, Hu J, Wang Q (2020) Owl eyes: spotting ui display issues via visual understanding. In: 2020 35th IEEE/ACM international conference on automated software engineering (ASE), pp 398–409
Lonergan B (1992) Insight: a study of human understanding, vol 3. University of Toronto Press, Toronto
Lu C, Krishna R, Bernstein M, Fei-Fei L (2016) Visual relationship detection with language priors. In: Proceedings of European conference on computer vision. Springer, pp 852–869
Luo A, Li X, Yang F, Jiao Z, Cheng H (2020) Webly-supervised learning for salient object detection. Pattern Recogn 103:107308. https://doi.org/10.1016/j.patcog.2020.107308
Maudsley DB (1980) A theory of meta-learning and principles of facilitation: an organismic perspective
McCarthy J, Minsky ML, Rochester N, Shannon CE (2006) A proposal for the Dartmouth summer research project on artificial intelligence. AI Mag 27(4):12. https://doi.org/10.1609/aimag.v27i4.1904
Mei T, Zhang W, Yao T (2020) Vision and language: from visual perception to content creation. APSIPA Trans Signal Inf Process. https://doi.org/10.1017/ATSIP.2020.10
Melamud O, Goldberger J, Dagan I (2016) context2vec: learning generic context embedding with bidirectional LSTM. In: Proceedings of The 20th SIGNLL conference on computational natural language learning. Association for Computational Linguistics, Berlin, pp 51–61. https://doi.org/10.18653/v1/K16-1006. https://www.aclweb.org/anthology/K16-1006
Mikolov T, Karafiát M, Burget L, Černockỳ J, Khudanpur S (2010) Recurrent neural network based language model. In: Proceedings of the Eleventh annual conference of the international speech communication association
Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Proceedings of the advances in neural information processing systems, pp 3111–3119
Min W, Tian Y, Huang Z, Cheng WH, El Saddik A (2020) Urban multimedia computing: emerging methods in multimedia computing for urban data analysis and applications. IEEE Multimed 27(3):8–11. https://doi.org/10.1109/MMUL.2020.3017877
Minsky M (1988) Society of mind. Simon and Schuster, New York
Minsky M (2007) The emotion machine: commonsense thinking, artificial intelligence, and the future of the human mind. Simon and Schuster, New York
Mitchell J, Bowers JS (2020) Harnessing the symmetry of convolutions for systematic generalisation. In: Proceedings of the 2020 international joint conference on neural networks (IJCNN), pp 1–8. https://doi.org/10.1109/IJCNN48605.2020.9207183
Mnih V, Kavukcuoglu K, Silver D, Graves A, Antonoglou I, Wierstra D, Riedmiller M (2013) Playing atari with deep reinforcement learning. http://arxiv.org/abs/1312.5602
Mnih V, Badia AP, Mirza M, Graves A, Lillicrap T, Harley T, Silver D, Kavukcuoglu K (2016) Asynchronous methods for deep reinforcement learning. In: Balcan MF, Weinberger KQ (eds) Proceedings of the 33rd international conference on machine learning. PMLR, New York. Proceedings of machine learning research, vol 48, pp 1928–1937. http://proceedings.mlr.press/v48/mniha16.html
Parisotto E, Song F, Rae J, Pascanu R, Gulcehre C, Jayakumar S, Jaderberg M, Kaufman RL, Clark A, Noury S, Botvinick M, Heess N, Hadsell R (2020) Stabilizing transformers for reinforcement learning. In: III HD, Singh A (eds) Proceedings of the 37th international conference on machine learning. PMLR, Proceedings of machine learning research, vol 119, pp 7487–7498. http://proceedings.mlr.press/v119/parisotto20a.html
Parmar N, Vaswani A, Uszkoreit J, Kaiser L, Shazeer N, Ku A, Tran D (2018) Image transformer. In: Dy J, Krause A (eds) Proceedings of the 35th international conference on machine learning, PMLR, Stockholmsmässan, Stockholm Sweden, Proceedings of machine learning research, vol 80, pp 4055–4064. http://proceedings.mlr.press/v80/parmar18a.html
Patel VL, Arocha JF, Kaufman DR (1999) Expertise and tacit knowledge in medicine. Tacit knowledge in professional practice: researcher and practitioner perspectives, pp 75–99
Pearl J, Mackenzie D (2018) The book of why: the new science of cause and effect, 1st edn. Basic Books Inc, New York
Peng H (2021) A brief survey of associations between meta-learning and general AI. http://arxiv.org/abs/2101.04283
Perozzi B, Al-Rfou R, Skiena S (2014) Deepwalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining. Association for Computing Machinery, New York, KDD ’14, pp 701–710. https://doi.org/10.1145/2623330.2623732
Peters ME, Neumann M, Iyyer M, Gardner M, Clark C, Lee K, Zettlemoyer L (2018) Deep contextualized word representations. In: Proceedings of the of NAACL
Pouyanfar S, Sadiq S, Yan Y, Tian H, Tao Y, Reyes MP, Shyu ML, Chen SC, Iyengar SS (2018) A survey on deep learning: algorithms, techniques, and applications. ACM Comput Surv 51(5). https://doi.org/10.1145/3234150
Powell G (1980) A meta-analysis of the effects of imposed and induced imagery upon word recall
Qiu J, Dong Y, Ma H, Li J, Wang K, Tang J (2018) Network embedding as matrix factorization: Unifying deepwalk, line, pte, and node2vec. In: Proceedings of the eleventh ACM international conference on web search and data mining, pp 459–467
Qiu J, Dong Y, Ma H, Li J, Wang C, Wang K, Tang J (2019) Netsmf: large-scale network embedding as sparse matrix factorization. In: Proceedings of the world wide web conference, pp 1509–1520
Qiu J, Ma H, Levy O, Yih Wt, Wang S, Tang J (2020) Blockwise self-attention for long document understanding. In: Proceedings of the findings of the association for computational linguistics: EMNLP 2020. Association for Computational Linguistics, Online, pp 2555–2565. https://doi.org/10.18653/v1/2020.findings-emnlp.232. https://www.aclweb.org/anthology/2020.findings-emnlp.232
Radford A, Metz L, Chintala S (2015) Unsupervised representation learning with deep convolutional generative adversarial networks. http://arxiv.org/abs/1511.06434
Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I (2018) Language models are unsupervised multitask learners. OpenAI blog 1(8):9
Rae JW, Potapenko A, Jayakumar SM, Hillier C, Lillicrap TP (2020) Compressive transformers for long-range sequence modelling. In: Proceedings of the international conference on learning representations. https://openreview.net/forum?id=SylKikSYDH
Rao Y, Lu J, Zhou J (2019) Learning discriminative aggregation network for video-based face recognition and person re-identification. Int J Comput Vis 127(6):701–718
Rid T (2016) Rise of the machines: a cybernetic history. WW Norton & Company, Manhattan
Ritter S, Wang J, Kurth-Nelson Z, Jayakumar S, Blundell C, Pascanu R, Botvinick M (2018) Been there, done that: meta-learning with episodic recall. In: Proceedings of the international conference on machine learning. PMLR, pp 4354–4363
Robič B (2015) The foundations of computability theory. Springer, Cham
Rosenblatt F (1958) The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev 65(6):386–408
Roy A, Saffar M, Vaswani A, Grangier D (2020) Efficient content-based sparse attention with routing transformers. arXiv:2003.05997
Rumelhart DE, Hinton GE, Williams RJ (1985) Learning internal representations by error propagation
Sabour S, Frosst N, Hinton GE (2017) Dynamic routing between capsules. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems. Curran Associates, Inc., vol 30, pp 3856–3866
Santoro A, Bartunov S, Botvinick M, Wierstra D, Lillicrap T (2016) Meta-learning with memory-augmented neural networks. In: Proceedings of the international conference on machine learning. PMLR, pp 1842–1850
Sato R (2020) A survey on the expressive power of graph neural networks. arXiv:2003.04078
Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G (2008) The graph neural network model. IEEE Trans Neural Netw 20(1):61–80
Schaal S (1999) Is imitation learning the route to humanoid robots? Trends Cogn Sci 3(6):233–242
Schank RC, Abelson RP (2013) Scripts, plans, goals, and understanding: an inquiry into human knowledge structures. Psychology Press, Hove
Semmlow JL, Griffel B (2014) Biosignal and medical image processing. CRC Press, Boca Raton
Shen S, Yao Z, Gholami A, Mahoney M, Keutzer K (2020a) PowerNorm: rethinking batch normalization in transformers. In: III HD, Singh A (eds) Proceedings of the 37th international conference on machine learning, PMLR, proceedings of machine learning research, vol 119, pp 8741–8751. http://proceedings.mlr.press/v119/shen20e.html
Shen Y, Ji R, Chen Z, Hong X, Zheng F, Liu J, Xu M, Tian Q (2020b) Noise-aware fully webly supervised object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR)
Shi S, Chen H, Ma W, Mao J, Zhang M, Zhang Y (2020) Neural logic reasoning. Association for Computing Machinery, New York, pp 1365–1374
Silver D, Huang A, Maddison CJ, Guez A, Sifre L, Van Den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M et al (2016) Mastering the game of go with deep neural networks and tree search. Nature 529(7587):484–489
Silver D, Schrittwieser J, Simonyan K, Antonoglou I, Huang A, Guez A, Hubert T, Baker L, Lai M, Bolton A et al (2017) Mastering the game of go without human knowledge. Nature 550(7676):354–359
Simon HA, Newell A (1971) Human problem solving: the state of the theory in 1970. Am Psychol 26(2):145
Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: Proceedings of the International conference on learning representations
Smolensky P (1986) Information processing in dynamical systems: foundations of harmony theory. Colorado Univ at Boulder Dept of Computer Science, Tech. rep, Boulder
Socher R, Chen D, Manning CD, Ng A (2013) Reasoning with neural tensor networks for knowledge base completion. In: Advances in neural information processing systems, Citeseer, pp 926–934
Sodhro AH, Luo Z, Sodhro GH, Muzamal M, Rodrigues JJ, de Albuquerque VHC (2019) Artificial intelligence based QOS optimization for multimedia communication in IOV systems. Future Gener Comput Syst 95:667–680. https://doi.org/10.1016/j.future.2018.12.008
Solvi C, Gutierrez Al-Khudhairy S, Chittka L (2020) Bumble bees display cross-modal object recognition between visual and tactile senses. Science 367(6480):910–912. https://doi.org/10.1126/science.aay8064
Stanley KO, Clune J, Lehman J, Miikkulainen R (2019) Designing neural networks through neuroevolution. Nat Mach Intell 1(1):24–35
Stewart R, Ermon S (2017) Label-free supervision of neural networks with physics and domain knowledge. In: Proceedings of the thirty-first AAAI conference on artificial intelligence. AAAI Press, AAAI’17, pp 2576–2582
Sutskever I, Vinyals O, Le QV (2014) Sequence to sequence learning with neural networks. In: Proceedings of the 27th international conference on neural information processing systems, vol 2. MIT Press, Cambridge. NIPS’14, pp 3104–3112
Synakowski S, Feng Q, Martinez A (2021) Adding knowledge to unsupervised algorithms for the recognition of intent. Int J Comput Vis. https://doi.org/10.1007/s11263-020-01404-0
Szegedy C, Liu W, Jia Y, Sermanet P, Reed SE, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2014) Going deeper with convolutions. CoRR abs/1409.4842. http://arxiv.org/abs/1409.4842
Szeliski R (2010) Computer vision: algorithms and applications. Springer Science & Business Media, Cham
Szeliski R (2021) Computer vision: algorithms and applications, 2nd edn. Springer Science & Business Media, Cham
Tang J, Qu M, Wang M, Zhang M, Yan J, Mei Q (2015) LINE: large-scale information network embedding. In: Proceedings of the international world wide web conferences steering committee. Republic and Canton of Geneva, CHE, pp 1067–1077. https://doi.org/10.1145/2736277.2741093
Tay Y, Bahri D, Metzler D, Juan DC, Zhao Z, Zheng C (2020a) Synthesizer: rethinking self-attention in transformer models. arXiv:2005.00743
Tay Y, Bahri D, Yang L, Metzler D, Juan DC (2020b) Sparse sinkhorn attention
Testa M, Altarelli G (2000) Weaving the web-the original design and ultimate destiny of the world wide. CERN Courier p 37
Tranel D, Damasio H, Damasio AR (1997) A neural basis for the retrieval of conceptual knowledge. Neuropsychologia 35(10):1319–1327
Turing AM (1950) Computing machinery and intelligence. Mind 59(October):433–60. https://doi.org/10.1093/mind/LIX.236.433
Uppal S, Bhagat S, Hazarika D, Majumdar N, Poria S, Zimmermann R, Zadeh A (2020) Emerging trends of multimodal research in vision and language. arXiv:2010.09522
VanLehn K (1996) Conceptual and meta learning during coached problem solving. In: Proceedings of international conference on intelligent tutoring systems, Springer, pp 29–47
Vanschoren J (2018) Meta-learning: a survey. arXiv:1810.03548
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Lu, Polosukhin I (2017) Attention is all you need. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds) Advances in neural information processing systems. Curran Associates, Inc., vol 30, pp 5998–6008. https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
Veličković P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y (2017) Graph attention networks. arXiv:1710.10903
Vilalta R, Drissi Y (2002) A perspective view and survey of meta-learning. Artif Intell Rev 18(2):77–95
Vincent P, Larochelle H, Bengio Y, Manzagol PA (2008) Extracting and composing robust features with denoising autoencoders. In: Proceedings of the 25th international conference on Machine learning, pp 1096–1103
Von Neumann J, Kurzweil R (2012) The computer and the brain. Yale University Press, London
Vyas A, Katharopoulos A, Fleuret F (2020) Fast transformers with clustered attention
Wang F (1993) A knowledge-based vision system for detecting land changes at urban fringes. IEEE Trans Geosci Remote Sens 31(1):136–145
Wang F (2007) Toward a paradigm shift in social computing: the ACP approach. IEEE Intell Syst 22(5):65–67. https://doi.org/10.1109/MIS.2007.4338496
Wang H, Zhang C, Wang W, Hu X, Xu F (2014) Human-centric computational knowledge environment for complex or ill-structured problem solving. In: Proceedings of 2014 IEEE international conference on systems, man, and cybernetics (SMC), pp 2940–2945. https://doi.org/10.1109/SMC.2014.6974377
Wang J, Cheng R, Liao PC (2021) Trends of multimodal neural engineering study: a bibliometric review. Arch Comput Methods Eng 28:1–15
Wang K, Gou C, Zheng N, Rehg JM, Wang FY (2017) Parallel vision for perception and understanding of complex scenes: methods, framework, and perspectives. Artif Intell Rev 48(3):299–329. https://doi.org/10.1007/s10462-017-9569-z
Wang Q, Liu X, Liu W, Liu A, Liu W, Mei T (2020) Metasearch: incremental product search via deep meta-learning. IEEE Trans Image Process 29:7549–7564. https://doi.org/10.1109/TIP.2020.3004249
Wang S, Li B, Khabsa M, Fang H, Ma H (2020a) Linformer: self-attention with linear complexity. arXiv:2006.04768
Wang S, Yang Y, Sun J, Xu Z (2021) Variational hyperadam: a meta-learning approach to network training. IEEE Trans Pattern Anal Mach Intell 01:1–1. https://doi.org/10.1109/TPAMI.2021.3061581
Wang X, Zhu W, Tian Y, Gao W (2020b) Multimedia intelligence: when multimedia meets artificial intelligence. Association for Computing Machinery, New York, pp 4775–4776. https://doi.org/10.1145/3394171.3418547
Wang Z, Zhang J, Feng J, Chen Z (2014) Knowledge graph embedding by translating on hyperplanes. In: AAAI, pp 1112–1119
Wang Z, Schaul T, Hessel M, Hasselt H, Lanctot M, Freitas N (2016) Dueling network architectures for deep reinforcement learning. In: Balcan MF, Weinberger KQ (eds) Proceedings of The 33rd international conference on machine learning. PMLR, New York, Proceedings of machine learning research, vol 48, pp 1995–2003. http://proceedings.mlr.press/v48/wangf16.html
Wolf T, Debut L, Sanh V, Chaumond J, Delangue C, Moi A, Cistac P, Rault T, Louf R, Funtowicz M, Davison J, Shleifer S, von Platen P, Ma C, Jernite Y, Plu J, Xu C, Le Scao T, Gugger S, Drame M, Lhoest Q, Rush A (2020) Transformers: state-of-the-art natural language processing. In: Proceedings of the 2020 conference on empirical methods in natural language processing: system demonstrations. Association for Computational Linguistics, Online, pp 38–45. https://doi.org/10.18653/v1/2020.emnlp-demos.6
Wu L, Mo L, Wang R (2005) What is situation model: propositional symbol or perceptual symbol? Adv Psychol Sci 13(04):479–487
Wu X, He R, Hu Y, Sun Z (2020) Learning an evolutionary embedding via massive knowledge distillation. Int J Comput Vis 128(8):2089–2106
Wu Z, Pan S, Chen F, Long G, Zhang C, Philip SY (2020b) A comprehensive survey on graph neural networks. In: Proceedings of the IEEE transactions on neural networks and learning systems
Xia T, Wang Y, Tian Y, Chang Y (2021) Using prior knowledge to guide bert’s attention in semantic textual matching tasks. arXiv:2102.10934
Xiao H, Huang M, Hao Y, Zhu X (2015) Transa: an adaptive approach for knowledge graph embedding. arXiv:1509.05490
Xiao H, Huang M, Zhu X (2016) Transg: a generative model for knowledge graph embedding. In: Proceedings of the 54th annual meeting of the association for computational linguistics, vol 1, pp 2316–2325. Long Papers
Yang GR, Wang XJ (2020) Artificial neural networks for neuroscientists: a primer. Neuron 107(6):1048–1070. https://doi.org/10.1016/j.neuron.2020.09.005
Yang H, Chen W, Yf Hao (2020) Supply chain partnership, inter-organizational knowledge trading and enterprise innovation performance: the theoretical and empirical research in project-based supply chain. Soft Comput 24(9):6433–6444. https://doi.org/10.1007/s00500-019-04548-5
Yang J, Chen W, Feng L, Yan X, Zheng H, Zhang W (2020b) Webly supervised image classification with metadata: Automatic noisy label correction via visual-semantic graph. In: Proceedings of the 28th ACM international conference on multimedia. Association for Computing Machinery, New York. MM ’20, pp 83–91. https://doi.org/10.1145/3394171.3413952
Yang Z, Liu S, Hu H, Wang L, Lin S (2019) Reppoints: point set representation for object detection. In: Proceedings of the IEEE/CVF international conference on computer vision (ICCV)
Yang Z, Ding M, Zhou C, Yang H, Zhou J, Tang J (2020c) Understanding negative sampling in graph representation learning. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery& data mining. Association for Computing Machinery, New York, KDD ’20, pp 1666–1676. https://doi.org/10.1145/3394486.3403218
Yao H, Wei Y, Huang J, Li Z (2019) Hierarchically structured meta-learning. In: Proceedings of the international conference on machine learning, PMLR, pp 7045–7054
Yin W (2020) Meta-learning for few-shot natural language processing: a survey. arXiv:2007.09604
Yoon J, Kim T, Dia O, Kim S, Bengio Y, Ahn S (2018) Bayesian model-agnostic meta-learning. In: Proceedings of the 32nd international conference on neural information processing systems, pp 7343–7353
Yu L, Zhang W, Wang J, Yu Y (2017) Seqgan: sequence generative adversarial nets with policy gradient. In: Proceedings of the AAAI conference on artificial intelligence, vol 31
Yu X, Gao Y, Xiong S, Yuan X (2019) Multiscale contour steered region integral and its application for cultivar classification. IEEE Access 7:69087–69100. https://doi.org/10.1109/ACCESS.2019.2918263
Yu X, Xiong S, Gao Y, Yuan X (2019b) Contour covariance: a fast descriptor for classification. In: Proceedings of 2019 IEEE international conference on image processing (ICIP), pp 569–573. https://doi.org/10.1109/ICIP.2019.8803806
Yu X, Zhao Y, Gao Y, Xiong S, Yuan X (2020) Patchy image structure classification using multi-orientation region transform. In: Proceedings of the AAAI conference on artificial intelligence vol 34, Issue 07, pp 12741–12748. https://doi.org/10.1609/aaai.v34i07.6968. https://ojs.aaai.org/index.php/AAAI/article/view/6968
Yuan H, Yu H, Gui S, Ji S (2020) Explainability in graph neural networks: a taxonomic survey. arXiv:2012.15445
Zador AM (2019) A critique of pure learning and what artificial neural networks can learn from animal brains. Nat Commun 10(1):1–7
Zaheer M, Guruganesh G, Dubey A, Ainslie J, Alberti C, Ontanon S, Pham P, Ravula A, Wang Q, Yang L, et al. (2020) Big bird: transformers for longer sequences. arXiv:2007.14062
Zandi B, Khanh TQ (2021) Deep learning-based pupil model predicts time and spectral dependent light responses. Sci Rep 11(1):1–16
Zhang C, Yang Z, He X, Deng L (2020) Multimodal intelligence: representation learning, information fusion, and applications. IEEE J Sel Top Signal Process 14(3):478–493. https://doi.org/10.1109/JSTSP.2020.2987728
Zhang N (2017) A brief history of artificial intelligence. Posts & Telecom Press, Beijing
Zhang Q, Yang LT, Chen Z, Li P (2018) A survey on deep learning for big data. Inf Fusion 42:146–157
Zhang Q, Yang LT, Chen Z, Li P (2018) A survey on deep learning for big data. Inf Fusion 42:146–157. https://doi.org/10.1016/j.inffus.2017.10.006
Zhang X, Zhao J, LeCun Y (2015) Character-level convolutional networks for text classification. In: Cortes C, Lawrence N, Lee D, Sugiyama M, Garnett R (eds) Advances in neural information processing systems. Curran Associates, Inc., vol 28, pp 649–657. https://proceedings.neurips.cc/paper/2015/file/250cf8b51c773f3f8dc8b4be867a9a02-Paper.pdf
Yj Zhang (2021) Handbook of image engineering. Springer, Cham
Zhang Z, Zhu Y, Zhu SC (2020) Graph-based hierarchical knowledge representation for robot task transfer from virtual to physical world. In: IROS
Zheng NN (2019) The new era of artificial intelligence. Chin J Intell Sci Technol 1(1):1. https://doi.org/10.11959/j.issn.2096-6652.201914
Zheng W, Wang FY, Wang K (2017) An ACP-based approach to color image encryption using DNA sequence operation and hyper-chaotic system. In: Proceedings of the 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp 461–466. https://doi.org/10.1109/SMC.2017.8122648
Zheng W, Yan L, Gou C, Wang FY (2018) Deep forest with local experts based on ELM for pedestrian detection. In: Hong R, Cheng WH, Yamasaki T, Wang M, Ngo CW (eds) Advances in multimedia information processing-PCM 2018. Springer International Publishing, Cham, pp 803–814
Zheng W, Yan L, Gou C, Wang FY (2019a) Differential-evolution-based generative adversarial networks for edge detection. In: Proceedings of the 2019 IEEE/CVF international conference on computer vision (ICCV), pp 2999–3008. https://doi.org/10.1109/ICCV.2019.00362
Zheng W, Yan L, Gou C, Wang FY (2019b) Forest representation learning with multiscale contour feature learning for leaf cultivar classification. In: Proceedings of the 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp 774–777. https://doi.org/10.1109/BIBM47256.2019.8983276
Zheng W, Yan L, Gou C, Wang FY (2019c) Guided cycleGAN via semi-dual optimal transport for photo-realistic face super-resolution. In: Proceedings of the 2019 IEEE international conference on image processing (ICIP), pp 2851–2855. https://doi.org/10.1109/ICIP.2019.8803393
Zheng W, Yan L, Gou C, Wang FY (2019d) Software defect prediction model based on improved deep forest and autoencoder by forest. In: SEKE, pp 419–540
Zheng W, Yan L, Gou C, Wang FY (2019) Unsupervised data augmentation for improving traffic sign recognition. In: Nayak AC, Sharma A (eds) PRICAI 2019: trends in artificial intelligence. Springer International Publishing, Cham, pp 297–306
Zheng W, Yan L, Gou C, Zhang W, Wang F (2019) A relation network embedded with prior features for few-shot caricature recognition. In: Proceedings of the 2019 IEEE international conference on multimedia and expo (ICME), pp 1510–1515. https://doi.org/10.1109/ICME.2019.00261
Zheng W, Gou C, Wang FY (2020) A novel approach inspired by optic nerve characteristics for few-shot occluded face recognition. Neurocomputing 376:25–41. https://doi.org/10.1016/j.neucom.2019.09.045
Zheng W, Wang FY, Gou C (2020) Nonparametric different-feature selection using wasserstein distance. In: Proceedings of the 2020 IEEE 32nd International conference on tools with artificial intelligence (ICTAI), pp 982–988. https://doi.org/10.1109/ICTAI50040.2020.00153
Zheng W, Wang K, Wang FY (2020) A novel background subtraction algorithm based on parallel vision and bayesian GANs. Neurocomputing 394:178–200. https://doi.org/10.1016/j.neucom.2019.04.088
Zheng W, Yan L, Gou C, Wang F (2020b) JND-GAN: human-vision-systems inspired generative adversarial networks for image-to-image translation. In: Giacomo GD, Catalá A, Dilkina B, Milano M, Barro S, Bugarín A, Lang J (eds) ECAI 2020 - 24th European conference on artificial intelligence, 29 Aug–8 Sept 2020. Santiago de Compostela, Spain, August 29–September 8, 2020 - Including 10th conference on prestigious applications of artificial intelligence (PAIS 2020), IOS Press, Frontiers in Artificial Intelligence and Applications, vol 325, pp 2816–2823. https://doi.org/10.3233/FAIA200423
Zheng W, Yan L, Gou C, Wang FY (2020c) Federated meta-learning for fraudulent credit card detection. In: Bessiere C (ed) Proceedings of the twenty-ninth international joint conference on artificial intelligence, IJCAI-20, international joint conferences on artificial intelligence organization, pp 4654–4660. special Track on AI in FinTech
Zheng W, Yan L, Gou C, Wang FY (2020d) Graph attention model embedded with multi-modal knowledge for depression detection. In: Proceedings of 2020 IEEE international conference on multimedia and expo (ICME), pp 1–6. https://doi.org/10.1109/ICME46284.2020.9102872
Zheng W, Yan L, Gou C, Wang FY (2020) Learning from the guidance: knowledge embedded meta-learning for medical visual question answering. In: Yang H, Pasupa K, Leung ACS, Kwok JT, Chan JH, King I (eds) Neural information processing. Springer International Publishing, Cham, pp 194–202
Zheng W, Yan L, Gou C, Wang FY (2020f) Learning from the Past: meta-continual learning with knowledge embedding for jointly sketch, cartoon, and caricature face recognition. Association for Computing Machinery, New York, pp 736–743. https://doi.org/10.1145/3394171.3413892
Zheng W, Yan L, Gou C, Wang FY (2020g) Learning to classify: a flow-based relation network for encrypted traffic classification. Association for Computing Machinery, New York, pp 13–22. https://doi.org/10.1145/3366423.3380090
Zheng W, Yan L, Gou C, Wang FY (2020) A relation hashing network embedded with prior features for skin lesion classification. In: Martel AL, Abolmaesumi P, Stoyanov D, Mateus D, Zuluaga MA, Zhou SK, Racoceanu D, Joskowicz L (eds) Medical image computing and computer assisted intervention - MICCAI 2020. Springer International Publishing, Cham, pp 115–123
Zheng W, Yan L, Gou C, Wang FY (2020i) Webly supervised knowledge embedding model for visual reasoning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (CVPR)
Zheng W, Wang K, Wang FY (2021) Gan-based key secret-sharing scheme in blockchain. IEEE Trans Cybern 51(1):393–404. https://doi.org/10.1109/TCYB.2019.2963138
Zheng W, Yan L, Gou C, Wang FY (2021) Fighting fire with fire: a spatial-frequency ensemble relation network with generative adversarial learning for adversarial image classification. Int J Intell Syst. https://doi.org/10.1002/int.22372
Zheng W, Yan L, Gou C, Wang FY (2021) KM\(^4\): visual reasoning via knowledge embedding memory model with mutual modulation. Inf Fusion 67:14–28. https://doi.org/10.1016/j.inffus.2020.10.007
Zheng W, Yan L, Gou C, Wang FY (2021) Learning from the negativity: deep negative correlation meta-learning for adversarial image classification. In: Lokoč J, Skopal T, Schoeffmann K, Mezaris V, Li X, Vrochidis S, Patras I (eds) MultiMedia modeling. Springer International Publishing, Cham, pp 531–540
Zhong N, Weihrauch K (2003) Computability theory of generalized functions. J ACM 50(4):469–505. https://doi.org/10.1145/792538.792542
Zhou Z, Liu S (2021) Machine learning. Springer, Singapore
Zhu W, Wang X, Gao W (2020) Multimedia intelligence: when multimedia meets artificial intelligence. IEEE Trans Multimed 22(7):1823–1835. https://doi.org/10.1109/TMM.2020.2969791
Zhu Y, Gao T, Fan L, Huang S, Edmonds M, Liu H, Gao F, Zhang C, Qi S, Wu YN, Tenenbaum JB, Zhu SC (2020) Dark, beyond deep: a paradigm shift to cognitive AI with humanlike common sense. Engineering 6(3):310–345. https://doi.org/10.1016/j.eng.2020.01.011
Zikria YB, Afzal MK, Kim SW (2020) Internet of multimedia things (iomt): opportunities, challenges and solutions. Sensors 20(8):2334. https://doi.org/10.3390/s20082334
Acknowledgements
This work is supported in part by the National Key R&D Program of China (2020YFB1600400 and 2018AAA0101502), in part by the Key Research and Development Program of Guangzhou (202007050002), and in part by the National Natural Science Foundation of China (61806198, U1811463).
Author information
Contributions
WZ: Conceptualization, Methodology, Software, Validation, Investigation, Writing - Original Draft, Writing - Review & Editing; LY: Methodology, Investigation, Writing - Original Draft; CG: Conceptualization, Resources, Writing - Review & Editing, Project administration, Funding acquisition; FW: Resources, Writing - Review & Editing, Supervision, Funding acquisition.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
About this article
Cite this article
Zheng, W., Yan, L., Gou, C. et al. Computational knowledge vision: paradigmatic knowledge based prescriptive learning and reasoning for perception and vision. Artif Intell Rev 55, 5917–5952 (2022). https://doi.org/10.1007/s10462-022-10166-9
DOI: https://doi.org/10.1007/s10462-022-10166-9