Computer Science (计算机科学), 2020, Vol. 47, Issue 3: 174-181. doi: 10.11896/jsjkx.190800040
WANG Xiao-ming (王晓明), ZHAO Xin-bo (赵歆波)
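Abstract: Eye movements during reading reflect human cognitive processes. Reading eye-movement data are important foundational data for cognitive psychology, applied linguistics, computer science, and related fields, yet China has relatively little such foundational data for reading eye-movement research. To address this situation, the paper first introduces the background against which eye-tracking corpora of reading emerged and the relevant literature from China and abroad. It then describes the content of these corpora and the eye-movement measures they use, from the perspective of the low-level and high-level visual factors that affect eye movements in reading; the measures include single fixation duration, first fixation duration, gaze duration, total fixation time, number of regressions out, and number of regressions in. It also analyzes three advantages that corpus-based reading eye-movement research has over traditional reading eye-movement research. It then surveys several influential eye-tracking corpora of reading already built abroad, covering eye-movement measure variables, corpus size, corpus content, corpus language, number of participants, participant characteristics, and recording equipment, as a reference for reading eye-movement researchers. Regarding applications of eye-tracking corpora, the paper reviews the main studies carried out in cognitive psychology, applied linguistics, computer science, and related fields, focusing on typical corpus-based studies in three areas of computer science: computational models of eye movements, natural language processing, and pattern recognition. Regarding the construction and application of Chinese reading eye-tracking corpora, it describes the current state of related research in China, analyzes why China lacks foundational eye-movement data, and proposes countermeasures and suggestions at three levels: the state, research institutions, and individual researchers.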
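The word-level measures listed in the abstract can be illustrated with a minimal Python sketch. This is not the paper's procedure: the input format (a time-ordered list of `(word_index, duration_ms)` pairs), the function name `word_measures`, and the dictionary keys are all hypothetical, and the definitions follow one common convention in the reading literature, in which a word's first pass is its first contiguous run of fixations and counts only if no later word has been fixated yet. Individual corpora may define these measures differently.

```python
from collections import defaultdict


def word_measures(fixations):
    """Compute per-word reading measures from a fixation sequence.

    fixations: list of (word_index, duration_ms) tuples in temporal order.
    This input format is an assumption made for illustration only.
    """
    measures = defaultdict(lambda: {
        "first_fixation": None,   # first fixation duration (first pass only)
        "single_fixation": None,  # set only if first pass has exactly one fixation
        "gaze": 0,                # sum of first-pass fixations (0 if none)
        "total": 0,               # sum of all fixations on the word
        "regressions_out": 0,     # leftward saccades leaving this word
        "regressions_in": 0,      # leftward saccades landing on this word
    })
    first_pass_runs = {}  # word -> durations of its first-pass run
    furthest = -1         # rightmost word index fixated so far
    prev_word = None
    open_run = None       # word whose first-pass run is still in progress

    for word, dur in fixations:
        m = measures[word]
        m["total"] += dur
        # A move to an earlier word counts as a regression.
        if prev_word is not None and word < prev_word:
            measures[prev_word]["regressions_out"] += 1
            m["regressions_in"] += 1
        if word != prev_word:
            # Entering a different word ends any first-pass run in progress.
            open_run = None
            # A first pass starts only if no later word was fixated before.
            if word not in first_pass_runs and word > furthest:
                first_pass_runs[word] = []
                open_run = word
        if open_run == word:
            first_pass_runs[word].append(dur)
        furthest = max(furthest, word)
        prev_word = word

    for word, run in first_pass_runs.items():
        m = measures[word]
        m["first_fixation"] = run[0]
        m["gaze"] = sum(run)
        if len(run) == 1:
            m["single_fixation"] = run[0]
    return dict(measures)


# Hypothetical scanpath: word 2 is skipped at first, then reached by a regression.
example = [(0, 200), (1, 180), (1, 120), (3, 250), (2, 210), (4, 190)]
result = word_measures(example)
```

In this sketch a word that is first reached by a regression (word 2 above) has no first-pass measures, which is why its first fixation duration stays `None` while its total fixation time is still counted.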