LREC 2012 Proceedings

TOPICS: Browse articles of the conference sorted by topic


A
Acquisition	Corpus based Semi-Automatic Extraction of Persian Compound Verbs and their Relations A Phonemic Corpus of Polish Child-Directed Speech Affective Common Sense Knowledge Acquisition for Sentiment Analysis Leveraging the Wisdom of the Crowds for the Acquisition of Multilingual Language Resources A Classification of Adjectives for Polarity Lexicons Enhancement MULTIPHONIA: a MULTImodal database of PHONetics teaching methods in classroom InterActions. A large scale annotated child language construction database Building Large Monolingual Dictionaries at the Leipzig Corpora Collection: From 100 to 200 Languages Highlighting relevant concepts from Topic Signatures Customizable SCF Acquisition in Italian PEARL: ProjEction of Annotations Rule Language, a Language for Projecting (UIMA) Annotations over RDF Knowledge Bases Unsupervised acquisition of concatenative morphology Automatic lexical semantic classification of nouns Source-Language Dictionaries Help Non-Expert Users to Enlarge Target-Language Dictionaries for Machine Translation A contrastive review of paraphrase acquisition techniques Assessing Crowdsourcing Quality through Objective Tasks Morphosyntactic Analysis of the CHILDES and TalkBank Corpora A light way to collect comparable corpora from the Web Adaptive Dictionary for Bilingual Lexicon Extraction from Comparable Corpora Boosting the Coverage of a Semantic Lexicon by Automatically Extracted Event Nominalizations A methodology for the extraction of information about the usage of formulaic expressions in scientific texts Diversifiable Bootstrapping for Acquiring High-Coverage Paraphrase Resource Extending the adverbial coverage of a French morphological lexicon Building a learner corpus
Anaphora, Coreference	Creating a Coreference Resolution System for Polish QurAna: Corpus of the Quran annotated with Pronominal Anaphora The Use of Parallel and Comparable Data for Analysis of Abstract Anaphora in German and English Interplay of Coreference and Discourse Relations: Discourse Connectives with a Referential Component Discourse-level Annotation over Europarl for Machine Translation: Connectives and Pronouns A Portuguese-Spanish Corpus Annotated for Subject Realization and Referentiality Announcing Prague Czech-English Dependency Treebank 2.0 Coreference in Spoken vs. Written Texts: a Corpus-based Analysis Annotating Near-Identity from Coreference Disagreements The REX corpora: A collection of multimodal corpora of referring expressions in collaborative problem solving dialogues This also affects the context - Errors in extraction based summaries Annotation of anaphoric relations and topic continuity in Japanese conversation Domain-specific vs. Uniform Modeling for Coreference Resolution
Authoring tools, proofing	Incorporating an Error Corpus into a Spellchecker for Maltese Fast Labeling and Transcription with the Speechalyzer Toolkit Risk Analysis and Prevention: LELIE, a Tool dedicated to Procedure and Requirement Authoring Acquisition of Syntactic Simplification Rules for French A Framework for Evaluating Text Correction Similarity Ranking as Attribute for Machine Learning Approach to Authorship Identification Holaaa!! writin like u talk is kewl but kinda hard 4 NLP Spell Checking in Spanish: The Case of Diacritic Accents DramaBank: Annotating Agency in Narrative Discourse Developing Partially-Transcribed Speech Corpus from Edited Transcriptions

C
Cognitive methods	Pedagogical stances and their multimodal signals. Word Sense Inventories by Non-Experts. Pursing power in Arabic on-line discussion forums Reclassifying subcategorization frames for experimental analysis and stimulus generation Assigning Connotation Values to Events A Repository of Rules and Lexical Resources for Discourse Structure Analysis: the Case of Explanation Structures From keystrokes to annotated process data: Enriching the output of Inputlog with linguistic information LIE: Leadership, Influence and Expertise A large scale annotated child language construction database Sense Meets Nonsense - a dual-layer Danish speech corpus for perception studies German and English Treebanks and Lexica for Tree-Adjoining Grammars Is it Useful to Support Users with Lexical Resources? A User Study. Evaluating Hebbian Self-Organizing Memories for Lexical Representation and Access Corpus Annotation as a Scientific Task Translog-II: a Program for Recording User Activity Data for Empirical Reading and Writing Research Polish Multimodal Corpus ― a collection of referential gestures
Controlled languages	Risk Analysis and Prevention: LELIE, a Tool dedicated to Procedure and Requirement Authoring Conventional Orthography for Dialectal Arabic English to Indonesian Transliteration to Support English Pronunciation Practice CLCM - A Linguistic Resource for Effective Simplification of Instructions in the Crisis Management Domain and its Evaluations
Corpus (creation, annotation, etc.)	Alignment-based reordering for SMT Annotations for Power Relations on Email Threads Foundations of a Multilayer Annotation Framework for Twitter Communications During Crisis Events An audiovisual political speech analysis incorporating eye-tracking and perception data Word Sense Inventories by Non-Experts. PAMOCAT: Automatic retrieval of specified postures Constructing a Question Corpus for Textual Semantic Relations Matching Cultural Heritage items to Wikipedia ATLIS: Identifying Locational Information in Text Automatically A Speech and Gesture Spatial Corpus in Assisted Living 3rd party observer gaze as a continuous measure of dialogue flow Project FLY: a multidisciplinary project within Linguistics Pursing power in Arabic on-line discussion forums The Dependency-Parsed FrameNet Corpus Incorporating an Error Corpus into a Spellchecker for Maltese Building a 70 billion word corpus of English from ClueWeb Semantic Annotations in Japanese FrameNet: Comparing Frames in Japanese and English Building a Resource of Patterns Using Semantic Types A data and analysis resource for an experiment in text mining a collection of micro-blogs on a political topic. 
AWATIF: A Multi-Genre Corpus for Modern Standard Arabic Subjectivity and Sentiment Analysis Causal analysis of task completion errors in spoken music retrieval interactions Investigating Engagement - intercultural and technological aspects of the collection, analysis, and use of the Estonian Multiparty Conversational video data Corpus+WordNet thesaurus generation for ontology enriching LDC Forced Aligner Assessing the Comparability of News Texts Boosting statistical tagger accuracy with simple rule-based grammars A New Twitter Verb Lexicon for Natural Language Processing A Corpus for Research on Deliberation and Debate Annotating progressive aspect constructions in the spoken section of the British National Corpus Annotating Spatial Containment Relations Between Events Annotating Agreement and Disagreement in Threaded Discussion NeoTag: a POS Tagger for Grammatical Neologism Detection Using DiAML and ANVIL for multimodal dialogue annotations Annotated Corpora for Word Alignment between Japanese and English and its Evaluation with MAP-based Word Aligner An Annotated Corpus of Film Dialogue for Learning and Characterizing Character Style SPPAS: a tool for the phonetic segmentation of speech Twenty Years of Language Resource Development and Distribution: A Progress Report on LDC Activities A Phonemic Corpus of Polish Child-Directed Speech The KIT Lecture Corpus for Speech Translation Orthographic Transcription: which enrichment is required for phonetization? 
The Role of Model Testing in Standards Development: The Case of ISO-Space Automatic Speech Recognition on a Firefighter TETRA Broadcast Channel Ubiquitous Usage of a Broad Coverage French Corpus: Processing the Est Republicain corpus A High-Quality Web Corpus of Czech QurAna: Corpus of the Quran annotated with Pronominal Anaphora MLSA ― A Multi-layered Reference Corpus for German Sentiment Analysis Versatile Speech Databases for High Quality Synthesis for Basque A Gold Standard for Relation Extraction in the Food Domain WebAnnotator, an Annotation Tool for Web Pages SUMAT: Data Collection and Parallel Corpus Compilation for Machine Translation of Subtitles Automatic Annotation and Manual Evaluation of the Diachronic German Corpus TüBa-D/DC Grammatical Error Annotation for Korean Learners of Spoken English The Use of Parallel and Comparable Data for Analysis of Abstract Anaphora in German and English Light Verb Constructions in the SzegedParalellFX English--Hungarian Parallel Corpus CoALT: A Software for Comparing Automatic Labelling Tools Balanced data repository of spontaneous spoken Czech The coding and annotation of multimodal dialogue acts DutchSemCor: Targeting the ideal sense-tagged corpus Extracting Directional and Comparable Corpora from a Multilingual Corpus for Translation Studies QurSim: A corpus for evaluation of relatedness in short texts Automatic annotation of head velocity and acceleration in Anvil Building a multilingual parallel corpus for human users EmpaTweet: Annotating and Detecting Emotions on Twitter The BladeMistress Corpus: From Talk to Action in Virtual Worlds Event Nominals: Annotation Guidelines and a Manually Annotated Corpus in French The Parallel-TUT: a multilingual and multiformat treebank AnIta: a powerful morphological analyser for Italian CAT: the CELCT Annotation Tool ROMBAC: The Romanian Balanced Annotated Corpus A French Fairy Tale Corpus syntactically and semantically annotated ConanDoyle-neg: Annotation of negation cues and 
their scope in Conan Doyle stories GerNED: A German Corpus for Named Entity Disambiguation A voting scheme to detect semantic underspecification Interplay of Coreference and Discourse Relations: Discourse Connectives with a Referential Component Robust clause boundary identification for corpus annotation NKI-CCRT Corpus - Speech Intelligibility Before and After Advanced Head and Neck Cancer Treated with Concomitant Chemoradiotherapy PaCo2: A Fully Automated tool for gathering Parallel Corpora from the Web Making Ellipses Explicit in Dependency Conversion for a German Treebank Open-Source Boundary-Annotated Corpus for Arabic Speech and Language Processing Identifying equivalents of specialized verbs in a bilingual comparable corpus of judgments: A frame-based methodology TimeBankPT: A TimeML Annotated Corpus of Portuguese Korp ― the corpus infrastructure of Språkbanken Further Developments in Treebank Error Detection Using Derivation Trees MULTIPHONIA: a MULTImodal database of PHONetics teaching methods in classroom InterActions. 
Discourse-level Annotation over Europarl for Machine Translation: Connectives and Pronouns Logical metonymies and qualia structures: an annotated database of logical metonymies for German HunOr: A Hungarian―Russian Parallel Corpus Introducing the Swedish Kelly-list, a new lexical e-resource for Swedish A Cross-Lingual Dictionary for English Wikipedia Concepts Language Richness of the Web Feature Discovery for Diachronic Register Analysis: a Semi-Automatic Approach DSim, a Danish Parallel Corpus for Text Simplification Propbank-Br: a Brazilian Treebank annotated with semantic role labels A Universal Part-of-Speech Tagset A large scale annotated child language construction database Parallel Aligned Treebanks at LDC: New Challenges Interfacing Existing Infrastructures Linguistic Resources for Entity Linking Evaluation: from Monolingual to Cross-lingual Multimodal Corpus of Multi-party Conversations in Second Language A Curated Database for Linguistic Research: The Test Case of Cimbrian Varieties Experiences in Resource Generation for Machine Translation through Crowdsourcing Building a synchronous corpus of acoustic and 3D facial marker data for adaptive audio-visual speech synthesis AVATecH ― automated annotation through audio and video analysis An Empirical Study of the Occurrence and Co-Occurrence of Named Entities in Natural Language Corpora Iula2Standoff: a tool for creating standoff documents for the IULACT Temporal Annotation: A Proposal for Guidelines and an Experiment with Inter-annotator Agreement Introducing the Reference Corpus of Contemporary Portuguese Online Rule-Based Detection of Clausal Coordinate Ellipsis Evolution of Event Designation in Media: Preliminary Study Building Large Monolingual Dictionaries at the Leipzig Corpora Collection: From 100 to 200 Languages Sense Meets Nonsense - a dual-layer Danish speech corpus for perception studies The acquisition and dialog act labeling of the EDECAN-SPORTS corpus A Parameterized and 
Annotated Spoken Dialog Corpus of the CMU Let's Go Bus Information System An Analysis (and an Annotated Corpus) of User Responses to Machine Translation Output A Basic Language Resource Kit for Persian Re-ordering Source Sentences for SMT Extended Named Entities Annotation on OCRed Documents: From Corpus Constitution to Evaluation Campaign Joint Grammar and Treebank Development for Mandarin Chinese with HPSG A tree is a Baum is an árbol is a sach'a: Creating a trilingual treebank Investigating Verbal Intelligence Using the TF-IDF Approach Diachronic Changes in Text Complexity in 20th Century English Language: An NLP Approach An English-Portuguese parallel corpus of questions: translation guidelines and application in SMT SMALLWorlds -- Multilingual Content-Controlled Monologues A database of semantic clusters of verb usages Annotating dropped pronouns in Chinese newswire text Building a Corpus of Indefinite Uses Annotated with Fine-grained Semantic Functions Mining Hindi-English Transliteration Pairs from Online Hindi Lyrics The FAUST Corpus of Adequacy Assessments for Real-World Machine Translation Output Annotating Story Timelines as Temporal Dependency Structures A PropBank for Portuguese: the CINTIL-PropBank DeCour: a corpus of DEceptive statements in Italian COURts Irish Treebanking and Parsing: A Preliminary Evaluation Comparing performance of different set-covering strategies for linguistic content optimization in speech corpora Dysarthric Speech Database for Development of QoLT Software Technology CLTC: A Chinese-English Cross-lingual Topic Corpus Improving corpus annotation productivity: a method and experiment with interactive tagging Semantic Relations Established by Specialized Processes Expressed by Nouns and Verbs: Identification in a Corpus by means of Syntactico-semantic Annotation The Language Library: supporting community effort for collective resource production The Australian National Corpus: National Infrastructure for Language Resources A 
Grammar-informed Corpus-based Sentence Database for Linguistic and Computational Studies ELAN development, keeping pace with communities' needs Revealing Contentious Concepts Across Social Groups Cost and Benefit of Using WordNet Senses for Sentiment Analysis Rembrandt - a named-entity recognition framework Texto4Science: a Quebec French Database of Annotated Short Text Messages Collecting and Analysing Chats and Tweets in SoNaR Prediction of Non-Linguistic Information of Spontaneous Speech from the Prosodic Annotation: Evaluation of the X-JToBI system Temporal Tagging on Different Domains: Challenges, Strategies, and Gold Standards HamleDT: To Parse or Not to Parse? The Icelandic Parsed Historical Corpus (IcePaHC) Empty Argument Insertion in the Hindi PropBank A Richly Annotated, Multilingual Parallel Corpus for Hybrid Machine Translation The goo300k corpus of historical Slovene Inforex -- a web-based tool for text corpus management and semantic annotation A new semantically annotated corpus with syntactic-semantic and cross-lingual senses Massively Increasing TIMEX3 Resources: A Transduction Approach Prague Dependency Style Treebank for Tamil A Distributed Resource Repository for Cloud-Based Machine Translation Treebanking by Sentence and Tree Transformation: Building a Treebank to support Question Answering in Portuguese Parallel Data, Tools and Interfaces in OPUS SemSim: Resources for Normalized Semantic Similarity Computation Using Lexical Networks Annotating and Learning Morphological Segmentation of Egyptian Colloquial Arabic Kitten: a tool for normalizing HTML and extracting its textual content A Portuguese-Spanish Corpus Annotated for Subject Realization and Referentiality A Galician Syntactic Corpus with Application to Intonation Modeling A Reference Dependency Bank for Analyzing Complex Predicates The Influence of Corpus Quality on Statistical Measurements on Language Resources Annotating Qualia Relations in Italian and French Complex Nominals Terra: a 
Collection of Translation Error-Annotated Corpora Speech and Language Resources for LVCSR of Russian Automatic word alignment tools to scale production of manually aligned parallel texts Developing and evaluating an emergency scenario dialogue corpus A Framework for Evaluating Text Correction Large Scale Lexical Analysis NTUSocialRec: An Evaluation Dataset Constructed from Microblogs for Recommendation Applications in Social Networks Creation and use of Language Resources in a Question-Answering eHealth System Building and Exploiting a Corpus of Dialog Interactions between French Speaking Virtual and Human Agents Collection of a Large Database of French-English SMT Output Corrections Announcing Prague Czech-English Dependency Treebank 2.0 First Results in a Study Evaluating Pre-annotation and Correction Propagation for Machine-Assisted Syriac Morphological Analysis A Corpus of Spontaneous Multi-party Conversation in Bosnian Serbo-Croatian and British English French and German Corpora for Audience-based Text Type Classification The IULA Treebank Modality in Text: a Proposal for Corpus Annotation Rule-based Entity Recognition and Coverage of SNOMED CT in Swedish Clinical Text The Herme Database of Spontaneous Multimodal Human-Robot Dialogues EXMARaLDA and the FOLK tools ― two toolsets for transcribing and annotating spoken language A review corpus annotated for negation, speculation and their scope Developing a large semantically annotated corpus Collection of a corpus of Dutch SMS RIDIRE-CPI: an Open Source Crawling and Processing Infrastructure for Supervised Web-Corpora Building Analyzing the Impact of Prevalence on the Evaluation of a Manual Annotation Campaign LAST MINUTE: a Multimodal Corpus of Speech-based User-Companion Interactions Semantic annotation of French corpora: animacy and verb semantic classes A contrastive review of paraphrase acquisition techniques Expanding Arabic Treebank to Speech: Results from Broadcast News Typing Race Games as a Method to 
Create Spelling Error Corpora A Search Tool for FrameNet Constructicon Corpus Annotation as a Scientific Task DBpedia: A Multilingual Cross-domain Knowledge Base Designing a search interface for a Spanish learner spoken corpus: the end-user's evaluation Design and compilation of a specialized Spanish-German parallel corpus Conventional Orthography for Dialectal Arabic The annotation of the C-ORAL-BRASIL oral through the implementation of the Palavras Parser Bulgarian X-language Parallel Corpus The MASC Word Sense Corpus A Multilingual Natural Stress Emotion Database The Twins Corpus of Museum Visitor Questions Development of a Web-Scale Chinese Word N-gram Corpus with Parts of Speech Information Medical Term Extraction in an Arabic Medical Corpus Annotation of response tokens and their triggering expressions in Japanese multi-party conversations Method for Collection of Acted Speech Using Various Situation Scripts The Minho Quotation Resource Translog-II: a Program for Recording User Activity Data for Empirical Reading and Writing Research Morphosyntactic Analysis of the CHILDES and TalkBank Corpora Challenges in the development of annotated corpora of computer-mediated communication in Indian Languages: A Case of Hindi Annotating Football Matches: Influence of the Source Medium on Manual Annotation The C-ORAL-BRASIL I: Reference Corpus for Spoken Brazilian Portuguese Coreference in Spoken vs. 
Written Texts: a Corpus-based Analysis Towards Fully Automatic Annotation of Audio Books for TTS Centroids: Gold standards with distributional variation Multimodal Behaviour and Feedback in Different Types of Interaction Multimedia database of the cultural heritage of the Balkans ANALEC: a New Tool for the Dynamic Annotation of Textual Data Feedback in Nordic First-Encounters: a Comparative Study Annotating Opinions in German Political News MultiUN v2: UN Documents with Multilingual Alignments The CONCISUS Corpus of Event Summaries IDENTIC Corpus: Morphologically Enriched Indonesian-English Parallel Corpus The Joy of Parallelism with CzEng 1.0 LAMP: A Multimodal Web Platform for Collaborative Linguistic Analysis The Polish Sejm Corpus Creating and Curating a Cross-Language Person-Entity Linking Collection A corpus of general and specific sentences from news Annotation Trees: LDC's customizable, extensible, scalable, annotation infrastructure Irony and Sarcasm: Corpus Generation and Analysis Using Crowdsourcing YADAC: Yet another Dialectal Arabic Corpus Hindi Subjective Lexicon: A Lexical Resource for Hindi Adjective Polarity Classification Annotating Near-Identity from Coreference Disagreements The REX corpora: A collection of multimodal corpora of referring expressions in collaborative problem solving dialogues Brand Pitt: A Corpus to Explore the Art of Naming Evaluating automatic cross-domain Dutch semantic role annotation Syntactic annotation of spontaneous speech: application to call-center conversation data Korean Children's Spoken English Corpus and an Analysis of its Pronunciation Variability DECODA: a call-centre human-human spoken conversation corpus The Trilingual ALLEGRA Corpus: Presentation and Possible Use for Lexicon Induction Intelligibility assessment in forensic applications Spontaneous Speech Corpora for language learners of Spanish, Chinese and Japanese TED-LIUM: an Automatic Speech Recognition dedicated corpus The SYNC3 Collaborative Annotation 
Tool Automatic Translation of Scientific Documents in the HAL Archive The REPERE Corpus : a multimodal corpus for person recognition Efficient Dependency Graph Matching with the IMS Open Corpus Workbench Croatian Dependency Treebank: Recent Development and Initial Experiments A Treebank-driven Creation of an OntoValence Verb lexicon for Bulgarian Customization of the Europarl Corpus for Translation Studies A Parallel Corpus of Music and Lyrics Annotated with Emotions Creation of a bottom-up corpus-based ontology for Italian Linguistics Rhetorical Move Detection in English Abstracts: Multi-label Sentence Classifiers and their Annotated Corpora Polish Multimodal Corpus ― a collection of referential gestures DEGELS1: A comparable corpus of French Sign Language and co-speech gestures Semi-Automatic Sign Language Corpora Annotation using Lexical Representations of Signs Expanding Parallel Resources for Medium-Density Languages for Free Beyond SoNaR: towards the facilitation of large corpus building efforts A GUI to Detect and Correct Errors in Hindi Dependency Treebank Iterative Refinement and Quality Checking of Annotation Guidelines ― How to Deal Effectively with Semantically Sloppy Named Entity Types, such as Pathological Phenomena Annotating Factive Verbs Annotating Errors in a Hungarian Learner Corpus Specifying Treebanks, Outsourcing Parsebanks: FinnTreeBank 3 Romanian TimeBank: An Annotated Parallel Corpus for Temporal Information Chinese Whispers: Cooperative Paraphrase Acquisition The Nordic Dialect Corpus The WeSearch Corpus, Treebank, and Treecache -- A Comprehensive Sample of User-Generated Content Yes we can!? 
Annotating English modal verbs Building a Multimodal Laughter Database for Emotion Recognition Towards Emotion and Affect Detection in the Multimodal LAST MINUTE Corpus Rapid creation of large-scale corpora and frequency dictionaries Adaptive Dictionary for Bilingual Lexicon Extraction from Comparable Corpora Prosomarker: a prosodic analysis tool based on optimal pitch stylization and automatic syllabification METU Turkish Discourse Bank Browser Evaluating Multi-focus Natural Language Queries over Data Services Development and Application of a Cross-language Document Comparability Metric Document Attrition in Web Corpora: an Exploration A Repository of Data and Evaluation Resources for Natural Language Generation The Quaero Evaluation Initiative on Term Extraction Italian and Spanish Null Subjects. A Case Study Evaluation in an MT Perspective. DGT-TM: A freely available Translation Memory in 22 languages Simplified guidelines for the creation of Large Scale Dialectal Arabic Annotations A Tool/Database Interface for Multi-Level Analyses A Corpus of Scientific Biomedical Texts Spanning over 168 Years Annotated for Uncertainty New language resources for the Pashto language Extending the MPC corpus to Chinese and Urdu - A Multiparty Multi-Lingual Chat Corpus for Modeling Social Phenomena in Language CALBC: Releasing the Final Corpora Getting more data -- Schoolkids as annotators Building Large Corpora from the Web Using a New Efficient Tool Chain RWTH-PHOENIX-Weather: A Large Vocabulary Sign Language Recognition and Translation Corpus Annotated Bibliographical Reference Corpora in Digital Humanities Corpus of Children Voices for Mid-level Markers and Affect Bursts Analysis Fivehundredmillionandone Tokens. Loading the AAC Container with Text Resources for Text Studies. 
The I3MEDIA speech database: a trilingual annotated corpus for the analysis and synthesis of emotional speech DramaBank: Annotating Agency in Narrative Discourse JRC Eurovoc Indexer JEX - A freely available multi-label categorisation tool Designing French Tale Corpora for Entertaining Text To Speech Synthesis Le Petit Prince in UNL Creating HAVIC: Heterogeneous Audio Visual Internet Collection Multi-Layer Discourse Annotation of a Dutch Text Corpus The Language Archive ― a new hub for language resources A Holistic Approach to Bilingual Sentence Fragment Extraction from Comparable Corpora Evaluation of Discourse Relation Annotation in the Hindi Discourse Relation Bank From Grammar Rule Extraction to Treebanking: A Bootstrapping Approach ULex: new data models and a mobile environment for corpus enrichment. UniDic for Early Middle Japanese: a Dictionary for Morphological Analysis of Classical Japanese An Annotation Scheme for Quantifier Scope Disambiguation A generic formalism to represent linguistic corpora in RDF and OWL/DL A Concise Query Language with Search and Transform Operations for Corpora with Multiple Levels of Annotation Large aligned treebanks for syntax-based machine translation Collecting and Using Comparable Corpora for Statistical Machine Translation The Netlog Corpus. 
A Resource for the Study of Flemish Dutch Internet Language Clause-based Discourse Segmentation of Arabic Texts Building Japanese Predicate-argument Structure Corpus using Lexical Conceptual Structure An Examination of Cross-Cultural Similarities and Differences from Social Media Data with respect to Language Use Latvian and Lithuanian Named Entity Recognition with TildeNER Collecting humorous expressions from a community-based question-answering-service corpus The Political Speech Corpus of Bulgarian A Database of Attribution Relations A Mandarin-English Code-Switching Corpus KPWr: Towards a Free Corpus of Polish Structural alignment of plain text books Turkish Paraphrase Corpus Resource Evaluation for Usable Speech Interfaces: Utilizing Human-Human Dialogue GATEtoGerManC: A GATE-based Annotation Pipeline for Historical German PET: a Tool for Post-editing and Assessing Machine Translation Enriching the ISST-TANL Corpus with Semantic Frames Construction of the Turkish National Corpus (TNC) Building a learner corpus

D
Dialogue	Annotations for Power Relations on Email Threads A Speech and Gesture Spatial Corpus in Assisted Living 3rd party observer gaze as a continuous measure of dialogue flow Pursing power in Arabic on-line discussion forums Causal analysis of task completion errors in spoken music retrieval interactions A Corpus for Research on Deliberation and Debate Using DiAML and ANVIL for multimodal dialogue annotations An Annotated Corpus of Film Dialogue for Learning and Characterizing Character Style Constructive Interaction for Talking about Interesting Topics Adaptive Speech Understanding for Intuitive Model-based Spoken Dialogues A Tool for Extracting Conversational Implicatures The coding and annotation of multimodal dialogue acts Using multimodal resources for explanation approaches in intelligent systems Multimodal Corpus of Multi-party Conversations in Second Language Evaluation of Online Dialogue Policy Learning Techniques The acquisition and dialog act labeling of the EDECAN-SPORTS corpus Relating Dominance of Dialogue Participants with their Verbal Intelligence Scores Developing and evaluating an emergency scenario dialogue corpus Building and Exploiting a Corpus of Dialog Interactions between French Speaking Virtual and Human Agents A Corpus of Spontaneous Multi-party Conversation in Bosnian Serbo-Croatian and British English The Herme Database of Spontaneous Multimodal Human-Robot Dialogues ISO 24617-2: A semantically-based standard for dialogue annotation Practical Evaluation of Human and Synthesized Speech for Virtual Human Dialogue Systems The Twins Corpus of Museum Visitor Questions Annotation of response tokens and their triggering expressions in Japanese multi-party conversations The Minho Quotation Resource The REX corpora: A collection of multimodal corpora of referring expressions in collaborative problem solving dialogues Evaluation of the KomParse Conversational Non-Player Characters in a Commercial Virtual World Leveraging study of robustness and 
portability of spoken language understanding systems across languages and domains: the PORTMEDIA corpora Annotation of anaphoric relations and topic continuity in Japanese conversation Resource Evaluation for Usable Speech Interfaces: Utilizing Human-Human Dialogue
Digital libraries	Matching Cultural Heritage items to Wikipedia First Steps towards the Semi-automatic Development of a Wordformation-based Lexicon of Latin A Curated Database for Linguistic Research: The Test Case of Cimbrian Varieties A tool for enhanced search of multilingual digital libraries of e-journals A Repository for the Sustainable Management of Research Data Proper Language Resource Centers From medical language processing to BioNLP domain A Graphical Citation Browser for the ACL Anthology Document Attrition in Web Corpora: an Exploration META-SHARE v2: An Open Network of Repositories for Language Resources including Data and Tools Accessing and standardizing Wiktionary lexical entries for the translation of labels in Cultural Heritage taxonomies Annotated Bibliographical Reference Corpora in Digital Humanities LDC Language Resource Database: Building a Bibliographic Database
Discourse annotation, representation and processing	Building a database of French frozen adverbial phrases 3rd party observer gaze as a continuous measure of dialogue flow Project FLY: a multidisciplinary project within Linguistics AWATIF: A Multi-Genre Corpus for Modern Standard Arabic Subjectivity and Sentiment Analysis Investigating Engagement - intercultural and technological aspects of the collection, analysis, and use of the Estonian Multiparty Conversational video data A Corpus for Research on Deliberation and Debate Annotating Agreement and Disagreement in Threaded Discussion An Annotated Corpus of Film Dialogue for Learning and Characterizing Character Style German Verb Patterns and Their Implementation in an Electronic Dictionary TIMEN: An Open Temporal Expression Normalisation Resource DISLOG: A logic-based language for processing discourse structures A Repository of Rules and Lexical Resources for Discourse Structure Analysis: the Case of Explanation Structures A Tool for Extracting Conversational Implicatures Automatic annotation of head velocity and acceleration in Anvil Comparison between two models of language for the automatic phonetic labeling of an undocumented language of the South-Asia: the case of Mo Piu Interplay of Coreference and Discourse Relations: Discourse Connectives with a Referential Component Discourse-level Annotation over Europarl for Machine Translation: Connectives and Pronouns Feature Discovery for Diachronic Register Analysis: a Semi-Automatic Approach A topologic view of Topic and Focus marking in Italian Improving the Recall of a Discourse Parser by Constraint-based Postprocessing A Corpus-based Study of the German Recipient Passive A Parameterized and Annotated Spoken Dialog Corpus of the CMU Let's Go Bus Information System Annotating dropped pronouns in Chinese newswire text Capturing syntactico-semantic regularities among terms: An application of the FrameNet methodology to terminology Annotating Story Timelines as 
Temporal Dependency Structures DeCour: a corpus of DEceptive statements in Italian COURts MISTRAL+: A Melody Intonation Speaker Tonal Range semi-automatic Analysis using variable Levels Annotating a corpus of human interaction with prosodic profiles ― focusing on Mandarin repair/disfluency ELAN development, keeping pace with communities' needs Revealing Contentious Concepts Across Social Groups Alternative Lexicalizations of Discourse Connectives in Czech Massively Increasing TIMEX3 Resources: A Transduction Approach Developing a large semantically annotated corpus Annotation Facilities for the Reliable Analysis of Human Motion Annotation of response tokens and their triggering expressions in Japanese multi-party conversations ANALEC: a New Tool for the Dynamic Annotation of Textual Data METU Turkish Discourse Bank Browser A Graphical Citation Browser for the ACL Anthology An empirical resource for discovering cognitive principles of discourse organisation: the ANNODIS corpus Annotation of anaphoric relations and topic continuity in Japanese conversation DramaBank: Annotating Agency in Narrative Discourse Multi-Layer Discourse Annotation of a Dutch Text Corpus Evaluation of Discourse Relation Annotation in the Hindi Discourse Relation Bank Clause-based Discourse Segmentation of Arabic Texts Domain-specific vs. Uniform Modeling for Coreference Resolution A Database of Attribution Relations
Document Classification, Text categorisation	Statistical Section Segmentation in Free-Text Clinical Records NLP Challenges for Eunomos a Tool to Build and Manage Legal Knowledge Building a 70 billion word corpus of English from ClueWeb Bootstrapping Sentiment Labels For Unannotated Documents With Polarity PageRank Measuring Interlanguage: Native Language Identification with L1-influence Metrics QurSim: A corpus for evaluation of relatedness in short texts Quantising Opinions for Political Tweets Analysis Mapping WordNet synsets to Wikipedia articles “Vreselijk mooi!” (terribly beautiful): A Subjectivity Lexicon for Dutch Adjectives. Investigating Verbal Intelligence Using the TF-IDF Approach DeCour: a corpus of DEceptive statements in Italian COURts Correlation between Similarity Measures for Inter-Language Linked Wikipedia Articles Is it Useful to Support Users with Lexical Resources? A User Study. French and German Corpora for Audience-based Text Type Classification Similarity Ranking as Attribute for Machine Learning Approach to Authorship Identification Coreference in Spoken vs. Written Texts: a Corpus-based Analysis Irony and Sarcasm: Corpus Generation and Analysis Using Crowdsourcing Effects of Document Clustering in Modeling Wikipedia-style Term Descriptions Rhetorical Move Detection in English Abstracts: Multi-label Sentence Classifiers and their Annotated Corpora Assessing Divergence Measures for Automated Document Routing in an Adaptive MT System JRC Eurovoc Indexer JEX - A freely available multi-label categorisation tool Unsupervised document zone identification using probabilistic graphical models The Netlog Corpus. A Resource for the Study of Flemish Dutch Internet Language Extending the EmotiNet Knowledge Base to Improve the Automatic Detection of Implicitly Expressed Emotions from Text The Political Speech Corpus of Bulgarian

E
Emotion Recognition/Generation	AWATIF: A Multi-Genre Corpus for Modern Standard Arabic Subjectivity and Sentiment Analysis “You Seem Aggressive!” Monitoring Anger in a Practical Application Learning Sentiment Lexicons in Spanish Assigning Connotation Values to Events Mining Sentiment Words from Microblogs for Predicting Writer-Reader Emotion Transition MLSA ― A Multi-layered Reference Corpus for German Sentiment Analysis Affective Common Sense Knowledge Acquisition for Sentiment Analysis EmpaTweet: Annotating and Detecting Emotions on Twitter Quantising Opinions for Political Tweets Analysis A Classification of Adjectives for Polarity Lexicons Enhancement SentiSense: An easily scalable concept-based affective lexicon for sentiment analysis “Vreselijk mooi!” (terribly beautiful): A Subjectivity Lexicon for Dutch Adjectives. Visualizing Sentiment Analysis on a User Forum LAST MINUTE: a Multimodal Corpus of Speech-based User-Companion Interactions Method for Collection of Acted Speech Using Various Situation Scripts Annotating Opinions in German Political News Irony and Sarcasm: Corpus Generation and Analysis Using Crowdsourcing Brand Pitt: A Corpus to Explore the Art of Naming A Parallel Corpus of Music and Lyrics Annotated with Emotions Building a Multimodal Laughter Database for Emotion Recognition Towards Emotion and Affect Detection in the Multimodal LAST MINUTE Corpus The I3MEDIA speech database: a trilingual annotated corpus for the analysis and synthesis of emotional speech A hierarchical approach with feature selection for emotion recognition from speech An Examination of Cross-Cultural Similarities and Differences from Social Media Data with respect to Language Use Extending the EmotiNet Knowledge Base to Improve the Automatic Detection of Implicitly Expressed Emotions from Text Fine-grained German Sentiment Analysis on Social Media
Endangered languages	Building a Basque-Chinese Dictionary by Using English as Pivot A Rule-based Morphological Analyzer for Murrinh-Patha Free/Open Source Shallow-Transfer Based Machine Translation for Spanish and Aragonese Building Large Monolingual Dictionaries at the Leipzig Corpora Collection: From 100 to 200 Languages A tree is a Baum is an árbol is a sach'a: Creating a trilingual treebank Irish Treebanking and Parsing: A Preliminary Evaluation The Trilingual ALLEGRA Corpus: Presentation and Possible Use for Lexicon Induction Glottolog/Langdoc:Increasing the visibility of grey literature for low-density languages Rapid creation of large-scale corpora and frequency dictionaries Measuring the Divergence of Dependency Structures Cross-Linguistically to Improve Syntactic Projection Algorithms
Evaluation methodologies	Same domain different discourse style - A case study on Language Resources for data-driven Machine Translation Building a database of French frozen adverbial phrases Assessing the Comparability of News Texts Parsing Any Domain English text to CoNLL dependencies The IWSLT 2011 Evaluation Campaign on Automatic Talk Translation WebAnnotator, an Annotation Tool for Web Pages A new dynamic approach for lexical networks evaluation QurSim: A corpus for evaluation of relatedness in short texts Eye Tracking as a Tool for Machine Translation Error Analysis Can Statistical Post-Editing with a Small Parallel Corpus Save a Weak MT Engine? EVALIEX ― A Proposal for an Extended Evaluation Methodology for Information Extraction Systems BLEU Evaluation of Machine-Translated English-Croatian Legislation Predicting Phrase Breaks in Classical and Modern Standard Arabic Text Involving Language Professionals in the Evaluation of Machine Translation A New Method for Evaluating Automatically Learned Terminological Taxonomies Cross-lingual studies of ASR errors: paradigms for perceptual evaluations Temporal Annotation: A Proposal for Guidelines and an Experiment with Inter-annotator Agreement Evaluating Machine Reading Systems through Comprehension Tests Pragmatic identification of the witness sets A good space: Lexical predictors in word space evaluation Automatic MT Error Analysis: Hjerson Helping Addicter Extended Named Entities Annotation on OCRed Documents: From Corpus Constitution to Evaluation Campaign Investigating Verbal Intelligence Using the TF-IDF Approach Relating Dominance of Dialogue Participants with their Verbal Intelligence Scores Cost and Benefit of Using WordNet Senses for Sentiment Analysis An Adaptive Framework for Named Entity Combination Prediction of Non-Linguistic Information of Spontaneous Speech from the Prosodic Annotation: Evaluation of the X-JToBI system Correlation between Similarity Measures for Inter-Language Linked Wikipedia 
Articles HamleDT: To Parse or Not to Parse? A Rough Set Formalization of Quantitative Evaluation with Ambiguity The Influence of Corpus Quality on Statistical Measurements on Language Resources Terra: a Collection of Translation Error-Annotated Corpora Identifying Nuggets of Information in GALE Distillation Evaluation A Framework for Evaluating Text Correction NTUSocialRec: An Evaluation Dataset Constructed from Microblogs for Recommendation Applications in Social Networks Evaluating and improving syntactic lexica by plugging them within a parser Towards a methodology for automatic identification of hypernyms in the definitions of large-scale dictionary Analyzing the Impact of Prevalence on the Evaluation of a Manual Annotation Campaign Evaluation of Unsupervised Information Extraction Practical Evaluation of Human and Synthesized Speech for Virtual Human Dialogue Systems Corpus Annotation as a Scientific Task Assessing Crowdsourcing Quality through Objective Tasks Págico: Evaluating Wikipedia-based information retrieval in Portuguese Annotating Football Matches: Influence of the Source Medium on Manual Annotation Centroids: Gold standards with distributional variation Using semi-experts to derive judgments on word sense alignment: a pilot study Designing an Evaluation Framework for Spoken Term Detection and Spoken Document Retrieval at the NTCIR-9 SpokenDoc Task On the practice of error analysis for machine translation evaluation Evaluation of the KomParse Conversational Non-Player Characters in a Commercial Virtual World Error profiling for evaluation of machine-translated text: a Polish-English case study Adapting and evaluating a generic term extraction tool VERTa: Linguistic features in MT evaluation Evaluating Query Languages for a Corpus Processing System Two Phase Evaluation for Selecting Machine Translation Services Development and Application of a Cross-language Document Comparability Metric The Quaero Evaluation Initiative on Term Extraction Summarizing 
a multimodal set of documents in a Smart Room Creating HAVIC: Heterogeneous Audio Visual Internet Collection CLCM - A Linguistic Resource for Effective Simplification of Instructions in the Crisis Management Domain and its Evaluations Improving K-Nearest Neighbor Efficacy for Farsi Text Classification LG-Eval: A Toolkit for Creating Online Language Evaluation Experiments Collaborative Development and Evaluation of Text-processing Workflows in a UIMA-supported Web-based Workbench International Multicultural Name Matching Competition: Design, Execution, Results, and Lessons Learned Rapidly Testing the Interaction Model of a Pronunciation Training System via Wizard-of-Oz Resource Evaluation for Usable Speech Interfaces: Utilizing Human-Human Dialogue

G
Grammar and Syntax	An Open Source Persian Computational Grammar Reclassifying subcategorization frames for experimental analysis and stimulus generation Annotating progressive aspect constructions in the spoken section of the British National Corpus DISLOG: A logic-based language for processing discourse structures Towards an LFG parser for Polish: An exercise in parasitic grammar development Automatic Annotation and Manual Evaluation of the Diachronic German Corpus TüBa-D/DC Grammatical Error Annotation for Korean Learners of Spoken English The Parallel-TUT: a multilingual and multiformat treebank ConanDoyle-neg: Annotation of negation cues and their scope in Conan Doyle stories Robust clause boundary identification for corpus annotation Further Developments in Treebank Error Detection Using Derivation Trees Acquisition of Syntactic Simplification Rules for French A Corpus-based Study of the German Recipient Passive Joint Grammar and Treebank Development for Mandarin Chinese with HPSG Wordnet Based Lexicon Grammar for Polish Combining Language Resources Into A Grammar-Driven Swedish Parser A PropBank for Portuguese: the CINTIL-PropBank Irish Treebanking and Parsing: A Preliminary Evaluation German and English Treebanks and Lexica for Tree-Adjoining Grammars Alternative Lexicalizations of Discourse Connectives in Czech HamleDT: To Parse or Not to Parse? 
The Icelandic Parsed Historical Corpus (IcePaHC) Prague Dependency Style Treebank for Tamil Treebanking by Sentence and Tree Transformation: Building a Treebank to support Question Answering in Portuguese A Galician Syntactic Corpus with Application to Intonation Modeling A Reference Dependency Bank for Analyzing Complex Predicates German "nach"-Particle Verbs in Semantic Theory and Corpus Data Announcing Prague Czech-English Dependency Treebank 2.0 Evaluating and improving syntactic lexica by plugging them within a parser Expanding Arabic Treebank to Speech: Results from Broadcast News A treebank-based study on the influence of Italian word order on parsing performance A Search Tool for FrameNet Constructicon A Morphological Analyzer For Wolof Using Finite-State Techniques Efficient Dependency Graph Matching with the IMS Open Corpus Workbench Croatian Dependency Treebank: Recent Development and Initial Experiments Spell Checking in Spanish: The Case of Diacritic Accents Example-Based Treebank Querying Annotating Errors in a Hungarian Learner Corpus Text Simplification Tools for Spanish CLIMB grammars: three projects using metagrammar engineering From Grammar Rule Extraction to Treebanking: A Bootstrapping Approach Large aligned treebanks for syntax-based machine translation Measuring the Divergence of Dependency Structures Cross-Linguistically to Improve Syntactic Projection Algorithms An implementation of a Latvian resource grammar in Grammatical Framework

H
Handwriting recognition	Linguistic Resources for Handwriting Recognition and Translation Evaluation A methodology for the extraction of information about the usage of formulaic expressions in scientific texts

I
Information Extraction, Information Retrieval	Foundations of a Multilayer Annotation Framework for Twitter Communications During Crisis Events Corpus based Semi-Automatic Extraction of Persian Compound Verbs and their Relations Statistical Section Segmentation in Free-Text Clinical Records NLP Challenges for Eunomos a Tool to Build and Manage Legal Knowledge Tree-Structured Named Entity Recognition on OCR Data: Analysis, Processing and Results Dependency parsing for interaction detection in pharmacogenomics Building a Resource of Patterns Using Semantic Types A data and analysis resource for an experiment in text mining a collection of micro-blogs on a political topic. A New Twitter Verb Lexicon for Natural Language Processing Annotating Spatial Containment Relations Between Events Aleda, a free large-scale entity database for French Automatic Speech Recognition on a Firefighter TETRA Broadcast Channel Large Scale Semantic Annotation, Indexing and Search at The National Archives TIMEN: An Open Temporal Expression Normalisation Resource A Gold Standard for Relation Extraction in the Food Domain Learning Categories and their Instances by Contextual Features Textual Characteristics for Language Engineering A Survey of Text Mining Architectures and the UIMA Standard Automatic annotation of head velocity and acceleration in Anvil Detecting Reduplication in Videos of American Sign Language EmpaTweet: Annotating and Detecting Emotions on Twitter EVALIEX ― A Proposal for an Extended Evaluation Methodology for Information Extraction Systems Event Nominals: Annotation Guidelines and a Manually Annotated Corpus in French Task-Driven Linguistic Analysis based on an Underspecified Features Representation GerNED: A German Corpus for Named Entity Disambiguation Distractorless Authorship Verification Automatically Extracting Procedural Knowledge from Instructional Texts using Natural Language Processing Challenges in the Knowledge Base Population Slot Filling Task A Cross-Lingual 
Dictionary for English Wikipedia Concepts Linguistic Resources for Entity Linking Evaluation: from Monolingual to Cross-lingual SUTime: A library for recognizing and normalizing time expressions AVATecH ― automated annotation through audio and video analysis Chinese Characters Mapping Table of Japanese, Traditional Chinese and Simplified Chinese Evolution of Event Designation in Media: Preliminary Study Biomedical Chinese-English CLIR Using an Extended CMeSH Resource to Expand Queries A good space: Lexical predictors in word space evaluation Semi-Supervised Technical Term Tagging With Minimal User Feedback Relating Dominance of Dialogue Participants with their Verbal Intelligence Scores Diachronic Changes in Text Complexity in 20th Century English Language: An NLP Approach Expertise Mining for Enterprise Content Management Comparing performance of different set-covering strategies for linguistic content optimization in speech corpora CLTC: A Chinese-English Cross-lingual Topic Corpus Revealing Contentious Concepts Across Social Groups Temporal Tagging on Different Domains: Challenges, Strategies, and Gold Standards Constructing Large Proposition Databases Semantic Role Labeling with the Swedish FrameNet A Resource-light Approach to Phrase Extraction for English and German Documents from the Patent Domain and User Generated Content Identifying Nuggets of Information in GALE Distillation Evaluation Creation and use of Language Resources in a Question-Answering eHealth System Using Wikipedia to Validate the Terminology found in a Corpus of Basic Textbooks The SERENOA Project: Multidimensional Context-Aware Adaptation of Service Front-Ends An Evaluation of the Effect of Automatic Preprocessing on Syntactic Parsing for Biomedical Relation Extraction Federated Search: Towards a Common Search Infrastructure A review corpus annotated for negation, speculation and their scope Evaluating the Impact of Phrase Recognition on Concept Tagging Evaluation of Unsupervised 
Information Extraction Extraction of unmarked quotations in Newspapers Ontoterminology: How to unify terminology and ontology into a single paradigm Págico: Evaluating Wikipedia-based information retrieval in Portuguese Addressing polysemy in bilingual lexicon extraction from comparable corpora Applying Random Indexing to Structured Data to Find Contextually Similar Words The CONCISUS Corpus of Event Summaries Building and Exploring Semantic Equivalences Resources Creating and Curating a Cross-Language Person-Entity Linking Collection The TARSQI Toolkit YADAC: Yet another Dialectal Arabic Corpus Supervised Topical Key Phrase Extraction of News Stories using Crowdsourcing, Light Filtering and Co-reference Normalization From medical language processing to BioNLP domain Designing an Evaluation Framework for Spoken Term Detection and Spoken Document Retrieval at the NTCIR-9 SpokenDoc Task Effects of Document Clustering in Modeling Wikipedia-style Term Descriptions Evaluation of a Complex Information Extraction Application in Specific Domain The WeSearch Corpus, Treebank, and Treecache -- A Comprehensive Sample of User-Generated Content Evaluating Query Languages for a Corpus Processing System Identification of Manner in Bio-Events A Corpus of Scientific Biomedical Texts Spanning over 168 Years Annotated for Uncertainty Corpus of Children Voices for Mid-level Markers and Affect Bursts Analysis Creating a Data Collection for Evaluating Rich Speech Retrieval A hierarchical approach with feature selection for emotion recognition from speech Combining Formal Concept Analysis and semantic information for building ontological structures from texts : an exploratory study Collecting humorous expressions from a community-based question-answering-service corpus Structural alignment of plain text books Dealing with unknown words in statistical machine translation

K
Knowledge Discovery/Representation	Building a Resource of Patterns Using Semantic Types Representing General Relational Knowledge in ConceptNet 5 Logic Based Methods for Terminological Assessment Affective Common Sense Knowledge Acquisition for Sentiment Analysis A new dynamic approach for lexical networks evaluation A Tool for Extracting Conversational Implicatures Polaris: Lymba's Semantic Parser Using multimodal resources for explanation approaches in intelligent systems Automatically Extracting Procedural Knowledge from Instructional Texts using Natural Language Processing Feature Discovery for Diachronic Register Analysis: a Semi-Automatic Approach Concept-based Selectional Preferences and Distributional Representations from Wikipedia Articles An ontological approach to model and query multimodal concurrent linguistic annotations Constructing Large Proposition Databases Towards a methodology for automatic identification of hypernyms in the definitions of large-scale dictionary Associative and Semantic Features Extracted From Web-Harvested Corpora A treebank-based study on the influence of Italian word order on parsing performance Ontoterminology: How to unify terminology and ontology into a single paradigm Detection of Peculiar Word Sense by Distance Metric Learning with Labeled Examples YADAC: Yet another Dialectal Arabic Corpus Knowledge-Rich Context Extraction and Ranking with KnowPipe Irregularity Detection in Categorized Document Corpora Spell Checking for Chinese W-PhAMT: A web tool for phonetic multilevel timeline visualization Application of a Semantic Search Algorithm to Semi-Automatic GUI Generation Identification of Manner in Bio-Events The KnowledgeStore: an Entity-Based Storage System Le Petit Prince in UNL Recognition of Polish Derivational Relations Based on Supervised Learning Scheme Combining Formal Concept Analysis and semantic information for building ontological structures from texts : an exploratory study RELcat: a Relation Registry for ISOcat data categories A disambiguation resource extracted from Wikipedia for semantic annotation

L
Language Identification	Language Richness of the Web FreeLing 3.0: Towards Wider Multilinguality KALAKA-2: a TV Broadcast Speech Database for the Recognition of Iberian Languages in Clean and Noisy Environments Development of a Web-Scale Chinese Word N-gram Corpus with Parts of Speech Information Simplified guidelines for the creation of Large Scale Dialectal Arabic Annotations Using the International Standard Language Resource Number: Practical and Technical Aspects An Analytical Model of Language Resource Sustainability
Language modelling	An Open Source Persian Computational Grammar Boosting statistical tagger accuracy with simple rule-based grammars MLSA ― A Multi-layered Reference Corpus for German Sentiment Analysis Measuring Interlanguage: Native Language Identification with L1-influence Metrics DISLOG: A logic-based language for processing discourse structures Corpus-based Referring Expressions Generation Portuguese Text Generation from Large Corpora LIE: Leadership, Influence and Expertise Item Development and Scoring for Japanese Oral Proficiency Testing Using Verb Subcategorization for Word Sense Disambiguation Rule-Based Detection of Clausal Coordinate Ellipsis Sense Meets Nonsense - a dual-layer Danish speech corpus for perception studies Concept-based Selectional Preferences and Distributional Representations from Wikipedia Articles Word Alignment for English-Turkish Language Pair Dbnary: Wiktionary as a LMF based Multilingual RDF network Improving corpus annotation productivity: a method and experiment with interactive tagging The goo300k corpus of historical Slovene Unsupervised acquisition of concatenative morphology Speech and Language Resources for LVCSR of Russian Arabic Word Generation and Modelling for Spell Checking Multimodal Behaviour and Feedback in Different Types of Interaction Feedback in Nordic First-Encounters: a Comparative Study Suffix Trees as Language Models The Romanian Neuter Examined Through A Two-Gender N-Gram Classification System Spell Checking for Chinese Semi-Automatic Sign Language Corpora Annotation using Lexical Representations of Signs Specifying Treebanks, Outsourcing Parsebanks: FinnTreeBank 3 The New IDS Corpus Analysis Platform: Challenges and Prospects Simplified guidelines for the creation of Large Scale Dialectal Arabic Annotations Extending the MPC corpus to Chinese and Urdu - A Multiparty Multi-Lingual Chat Corpus for Modeling Social Phenomena in Language Linguistic Analysis Processing Line for Bulgarian CLIMB grammars: three projects using metagrammar engineering A platform-independent user-friendly dictionary from Italian to LIS
Lexicon, lexical database	Corpus based Semi-Automatic Extraction of Persian Compound Verbs and their Relations Word Sense Inventories by Non-Experts. Building a fine-grained subjectivity lexicon from a web corpus Building a database of French frozen adverbial phrases Constraint Based Description of Polish Multiword Expressions The Common Orthographic Vocabulary of the Portuguese Language: a set of open lexical resources for a pluricentric language The Dependency-Parsed FrameNet Corpus Automatic Translation of Scholarly Terms into Patent Terms Using Synonym Extraction Techniques Semantic Annotations in Japanese FrameNet: Comparing Frames in Japanese and English Generation of Verbal Stems in Derivationally Rich Language Corpus+WordNet thesaurus generation for ontology enriching A New Twitter Verb Lexicon for Natural Language Processing Learning Sentiment Lexicons in Spanish Logic Based Methods for Terminological Assessment NeoTag: a POS Tagger for Grammatical Neologism Detection Assigning Connotation Values to Events Aleda, a free large-scale entity database for French Cleaning noisy wordnets Wordnet extension made simple: A multilingual lexicon-based approach using wiki resources Building a Basque-Chinese Dictionary by Using English as Pivot Mining Sentiment Words from Microblogs for Predicting Writer-Reader Emotion Transition German Verb Patterns and Their Implementation in an Electronic Dictionary Bootstrapping Sentiment Labels For Unannotated Documents With Polarity PageRank Towards an LFG parser for Polish: An exercise in parasitic grammar development First Steps towards the Semi-automatic Development of a Wordformation-based Lexicon of Latin Representing the Translation Relation in a Bilingual Wordnet AnIta: a powerful morphological analyser for Italian Automatic classification of German "an" particle verbs A Classification of Adjectives for Polarity Lexicons Enhancement Mapping WordNet synsets to Wikipedia articles SentiSense: An easily scalable concept-based affective lexicon for sentiment analysis Identifying equivalents of specialized verbs in a bilingual comparable corpus of judgments: A frame-based methodology Towards a richer wordnet representation of properties The open lexical infrastructure of Språkbanken Turk Bootstrap Word Sense Inventory 2.0: A Large-Scale Resource for Lexical Substitution PoliMorf: a (not so) new open morphological dictionary for Polish Introducing the Swedish Kelly-list, a new lexical e-resource for Swedish Propbank-Br: a Brazilian Treebank annotated with semantic role labels Constructing a Class-Based Lexical Dictionary using Interactive Topic Models Dictionary Look-up with Katakana Variant Recognition Two Database Resources for Processing Social Media English Text Multilingual Central Repository version 3.0 A New Method for Evaluating Automatically Learned Terminological Taxonomies Adding Morpho-semantic Relations to the Romanian Wordnet The Rocky Road towards a Swedish FrameNet - Creating SweFN An Empirical Study of the Occurrence and Co-Occurrence of Named Entities in Natural Language Corpora “Vreselijk mooi!” (terribly beautiful): A Subjectivity Lexicon for Dutch Adjectives. 
A proposal for improving WordNet Domains Free/Open Source Shallow-Transfer Based Machine Translation for Spanish and Aragonese Wordnet Based Lexicon Grammar for Polish A database of semantic clusters of verb usages Capturing syntactico-semantic regularities among terms: An application of the FrameNet methodology to terminology Highlighting relevant concepts from Topic Signatures Extending a wordnet framework for simplicity and scalability A Framework for Spelling Correction in Persian Language Using Noisy Channel Model Dbnary: Wiktionary as a LMF based Multilingual RDF network Customizable SCF Acquisition in Italian Statistical Evaluation of Pronunciation Encoding Semantic Relations Established by Specialized Processes Expressed by Nouns and Verbs: Identification in a Corpus by means of Syntactico-semantic Annotation German and English Treebanks and Lexica for Tree-Adjoining Grammars Texto4Science: a Quebec French Database of Annotated Short Text Messages Linguistic knowledge for specialized text production Empty Argument Insertion in the Hindi PropBank Visualizing Sentiment Analysis on a User Forum Semantic Role Labeling with the Swedish FrameNet Developing LMF-XML Bilingual Dictionaries for Colloquial Arabic Dialects UBY-LMF -- A Uniform Model for Standardizing Heterogeneous Lexical-Semantic Resources in ISO-LMF Applying cross-lingual WSD to wordnet development Annotating Qualia Relations in Italian and French Complex Nominals Large Scale Lexical Analysis Automatic lexical semantic classification of nouns In the same boat and other idiomatic seafaring expressions Evaluating and improving syntactic lexica by plugging them within a parser Collaborative semantic editing of linked data lexica Extraction of unmarked quotations in Newspapers Association Norms of German Noun Compounds Word Sketches for Turkish The MASC Word Sense Corpus Representation of linguistic and domain knowledge for second language learning in virtual worlds Addressing polysemy in bilingual 
lexicon extraction from comparable corpora Automatically Generated Online Dictionaries Automatic Extraction and Evaluation of Arabic LFG Resources The Minho Quotation Resource LexIt: A Computational Resource on Italian Argument Structure Applying Random Indexing to Structured Data to Find Contextually Similar Words Using semi-experts to derive judgments on word sense alignment: a pilot study Hindi Subjective Lexicon: A Lexical Resource for Hindi Adjective Polarity Classification Knowledge-Rich Context Extraction and Ranking with KnowPipe The Trilingual ALLEGRA Corpus: Presentation and Possible Use for Lexicon Induction A Treebank-driven Creation of an OntoValence Verb lexicon for Bulgarian NgramQuery - Smart Information Extraction from Google N-gram using External Resources Adapting and evaluating a generic term extraction tool Legal electronic dictionary for Czech A Corpus of Scientific Biomedical Texts Spanning over 168 Years Annotated for Uncertainty Visualizing word senses in WordNet Atlas Boosting the Coverage of a Semantic Lexicon by Automatically Extracted Event Nominalizations ”Rendering Endangered Lexicons Interoperable through Standards Harmonization”: the RELISH project Empirical Comparisons of MASC Word Sense Annotations Identifying Word Translations from Comparable Documents Without a Seed Lexicon Detecting Japanese Compound Functional Expressions using Canonical/Derivational Relation ULex: new data models and a mobile environment for corpus enrichment. UniDic for Early Middle Japanese: a Dictionary for Morphological Analysis of Classical Japanese A platform-independent user-friendly dictionary from Italian to LIS Mapping WordNet to the Kyoto ontology Tools for plWordNet Development. 
Presentation and Perspectives Recognition of Polish Derivational Relations Based on Supervised Learning Scheme Building Japanese Predicate-argument Structure Corpus using Lexical Conceptual Structure Extending the adverbial coverage of a French morphological lexicon Reconstructing the Diachronic Morphology of Romanian from Dictionary Citations A Fast, Memory Efficient, Scalable and Multilingual Dictionary Retriever A disambiguation resource extracted from Wikipedia for semantic annotation Fine-grained German Sentiment Analysis on Social Media
LR Infrastructures and Architectures	The Common Orthographic Vocabulary of the Portuguese Language: a set of open lexical resources for a pluricentric language Building Synthetic Voices in the META-NET Framework Representing General Relational Knowledge in ConceptNet 5 The META-SHARE Language Resources Sharing Infrastructure: Principles, Challenges, Solutions Smooth Sailing for STEVIN Tackling interoperability issues within UIMA workflows A High-Quality Web Corpus of Czech Towards a comprehensive open repository of Polish language resources Textual Characteristics for Language Engineering A Survey of Text Mining Architectures and the UIMA Standard Cloud Logic Programming for Integrating Language Technology Resources Aspects of a Legal Framework for Language Resource Management Korp ― the corpus infrastructure of Språkbanken The open lexical infrastructure of Språkbanken The Rocky Road towards a Swedish FrameNet - Creating SweFN A Basic Language Resource Kit for Persian The Language Library: supporting community effort for collective resource production The Australian National Corpus: National Infrastructure for Language Resources A Grammar-informed Corpus-based Sentence Database for Linguistic and Computational Studies PEARL: ProjEction of Annotations Rule Language, a Language for Projecting (UIMA) Annotations over RDF Knowledge Bases A Scalable Architecture For Web Deployment of Spoken Dialogue Systems Semantic metadata mapping in practice: the Virtual Language Observatory Recent Developments in CLARIN-NL A Distributed Resource Repository for Cloud-Based Machine Translation Developing LMF-XML Bilingual Dictionaries for Colloquial Arabic Dialects Parallel Data, Tools and Interfaces in OPUS A Metadata Editor to Support the Description of Linguistic Resources A Repository for the Sustainable Management of Research Data UBY-LMF -- A Uniform Model for Standardizing Heterogeneous Lexical-Semantic Resources in ISO-LMF German "nach"-Particle Verbs in Semantic Theory 
and Corpus Data Federated Search: Towards a Common Search Infrastructure EXMARaLDA and the FOLK tools ― two toolsets for transcribing and annotating spoken language Dynamic web service deployment in a cloud environment Towards a User-Friendly Platform for Building Language Resources based on Web Services Proper Language Resource Centers RIDIRE-CPI: an Open Source Crawling and Processing Infrastructure for Supervised Web-Corpora Building Typing Race Games as a Method to Create Spelling Error Corpora Standardizing a Component Metadata Infrastructure Citing on-line Language Resources The C-ORAL-BRASIL I: Reference Corpus for Spoken Brazilian Portuguese On Using Linked Data for Language Resource Sharing in the Long Tail of the Localisation Market LAMP: A Multimodal Web Platform for Collaborative Linguistic Analysis An NLP Curator (or: How I Learned to Stop Worrying and Love NLP Pipelines) Statistical Machine Translation without Source-side Parallel Corpus Using Word Lattice and Phrase Extension The SYNC3 Collaborative Annotation Tool ELRA in the heart of a cooperative HLT world Using Language Resources in Humanities research Integrating NLP Tools in a Distributed Environment: A Case Study Chaining a Tagger with a Dependency Parser Glottolog/Langdoc:Increasing the visibility of grey literature for low-density languages Creation of an Open Shared Language Resource Repository in the Nordic and Baltic Countries Beyond SoNaR: towards the facilitation of large corpus building efforts Example-Based Treebank Querying The LRE Map. 
Harmonising Community Descriptions of Resources The New IDS Corpus Analysis Platform: Challenges and Prospects Evaluating Query Languages for a Corpus Processing System Towards automation in using multi-modal language resources: compatibility and interoperability for multi-modal features in Kachako A Tool/Database Interface for Multi-Level Analyses Linguistic Analysis Processing Line for Bulgarian The KnowledgeStore: an Entity-Based Storage System Classifying Standard Linguistic Processing Functionalities based on Fundamental Data Operation Types Linguagrid: a network of Linguistic and Semantic Services for the Italian Language. The Language Archive ― a new hub for language resources Ontologies of Linguistic Annotation: Survey and perspectives A generic formalism to represent linguistic corpora in RDF and OWL/DL RELcat: a Relation Registry for ISOcat data categories International Multicultural Name Matching Competition: Design, Execution, Results, and Lessons Learned GATEtoGerManC: A GATE-based Annotation Pipeline for Historical German Building a learner corpus
LR national/international projects, organizational/policy issues	The Common Orthographic Vocabulary of the Portuguese Language: a set of open lexical resources for a pluricentric language Building Synthetic Voices in the META-NET Framework Smooth Sailing for STEVIN Towards a comprehensive open repository of Polish language resources Balanced data repository of spontaneous spoken Czech Aspects of a Legal Framework for Language Resource Management PoliMorf: a (not so) new open morphological dictionary for Polish Introducing the Swedish Kelly-list, a new lexical e-resource for Swedish The Language Library: supporting community effort for collective resource production The Australian National Corpus: National Infrastructure for Language Resources Texto4Science: a Quebec French Database of Annotated Short Text Messages Recent Developments in CLARIN-NL Proper Language Resource Centers Medical Term Extraction in an Arabic Medical Corpus Web Service integration platform for Polish linguistic resources The Polish Sejm Corpus ELRA in the heart of a cooperative HLT world Creation of an Open Shared Language Resource Repository in the Nordic and Baltic Countries Beyond SoNaR: towards the facilitation of large corpus building efforts The LRE Map. Harmonising Community Descriptions of Resources Romanian TimeBank: An Annotated Parallel Corpus for Temporal Information Legal electronic dictionary for Czech The FLaReNet Strategic Language Resource Agenda Fivehundredmillionandone Tokens. Loading the AAC Container with Text Resources for Text Studies. The Open Linguistics Working Group Collecting and Using Comparable Corpora for Statistical Machine Translation Enriching the ISST-TANL Corpus with Semantic Frames

M
Machine Translation, SpeechToSpeech Translation	Alignment-based reordering for SMT Same domain different discourse style - A case study on Language Resources for data-driven Machine Translation Tajik-Farsi Persian Transliteration Using Statistical Machine Translation Assessing the Comparability of News Texts A finite-state morphological transducer for Kyrgyz Annotated Corpora for Word Alignment between Japanese and English and its Evaluation with MAP-based Word Aligner The KIT Lecture Corpus for Speech Translation The IWSLT 2011 Evaluation Campaign on Automatic Talk Translation Building a Basque-Chinese Dictionary by Using English as Pivot SUMAT: Data Collection and Parallel Corpus Compilation for Machine Translation of Subtitles Eye Tracking as a Tool for Machine Translation Error Analysis Can Statistical Post-Editing with a Small Parallel Corpus Save a Weak MT Engine? PaCo2: A Fully Automated tool for gathering Parallel Corpora from the Web BLEU Evaluation of Machine-Translated English-Croatian Legislation Parallel Aligned Treebanks at LDC: New Challenges Interfacing Existing Infrastructures Experiences in Resource Generation for Machine Translation through Crowdsourcing Involving Language Professionals in the Evaluation of Machine Translation Chinese Characters Mapping Table of Japanese, Traditional Chinese and Simplified Chinese Free/Open Source Shallow-Transfer Based Machine Translation for Spanish and Aragonese Automatic MT Error Analysis: Hjerson Helping Addicter An Analysis (and an Annotated Corpus) of User Responses to Machine Translation Output Re-ordering Source Sentences for SMT An English-Portuguese parallel corpus of questions: translation guidelines and application in SMT The FAUST Corpus of Adequacy Assessments for Real-World Machine Translation Output Word Alignment for English-Turkish Language Pair PEXACC: A Parallel Sentence Mining Algorithm from Comparable Corpora Evaluating Appropriateness Of System Responses In A Spoken CALL Game A Richly 
Annotated, Multilingual Parallel Corpus for Hybrid Machine Translation A Distributed Resource Repository for Cloud-Based Machine Translation Terra: a Collection of Translation Error-Annotated Corpora Automatic word alignment tools to scale production of manually aligned parallel texts In the same boat and other idiomatic seafaring expressions Collection of a Large Database of French-English SMT Output Corrections Arabic-Segmentation Combination Strategies for Statistical Machine Translation Source-Language Dictionaries Help Non-Expert Users to Enlarge Target-Language Dictionaries for Machine Translation A light way to collect comparable corpora from the Web MultiUN v2: UN Documents with Multilingual Alignments The Joy of Parallelism with CzEng 1.0 Suffix Trees as Language Models Statistical Machine Translation without Source-side Parallel Corpus Using Word Lattice and Phrase Extension Automatic Translation of Scientific Documents in the HAL Archive On the practice of error analysis for machine translation evaluation Error profiling for evaluation of machine-translated text: a Polish-English case study Expanding Parallel Resources for Medium-Density Languages for Free VERTa: Linguistic features in MT evaluation Development and Application of a Cross-language Document Comparability Metric Italian and Spanish Null Subjects. A Case Study Evaluation in an MT Perspective. 
DGT-TM: A freely available Translation Memory in 22 languages New language resources for the Pashto language Assessing Divergence Measures for Automated Document Routing in an Adaptive MT System Identifying bilingual Multi-Word Expressions for Statistical Machine Translation Identifying Word Translations from Comparable Documents Without a Seed Lexicon A Study of Word-Classing for MT Reordering Large aligned treebanks for syntax-based machine translation Dealing with unknown words in statistical machine translation PET: a Tool for Post-editing and Assessing Machine Translation The ML4HMT Workshop on Optimising the Division of Labour in Hybrid Machine Translation
Metadata	“You Seem Aggressive!” Monitoring Anger in a Practical Application Fast Labeling and Transcription with the Speechalyzer Toolkit Introducing the Reference Corpus of Contemporary Portuguese Online Collecting and Analysing Chats and Tweets in SoNaR Semantic metadata mapping in practice: the Virtual Language Observatory A Richly Annotated, Multilingual Parallel Corpus for Hybrid Machine Translation A Metadata Editor to Support the Description of Linguistic Resources Collection of a corpus of Dutch SMS Standardizing a Component Metadata Infrastructure Bulgarian X-language Parallel Corpus Challenges in the development of annotated corpora of computer-mediated communication in Indian Languages: A Case of Hindi Iterative Refinement and Quality Checking of Annotation Guidelines ― How to Deal Effectively with Semantically Sloppy Named Entity Types, such as Pathological Phenomena The LRE Map. Harmonising Community Descriptions of Resources META-SHARE v2: An Open Network of Repositories for Language Resources including Data and Tools The KnowledgeStore: an Entity-Based Storage System The Open Linguistics Working Group LDC Language Resource Database: Building a Bibliographic Database The META-SHARE Metadata Schema for the Description of Language Resources
Morphology	Constraint Based Description of Polish Multiword Expressions Generation of Verbal Stems in Derivationally Rich Language A finite-state morphological transducer for Kyrgyz NeoTag: a POS Tagger for Grammatical Neologism Detection First Steps towards the Semi-automatic Development of a Wordformation-based Lexicon of Latin A Rule-based Morphological Analyzer for Murrinh-Patha The Impact of Automatic Morphological Analysis & Disambiguation on Dependency Parsing of Turkish AnIta: a powerful morphological analyser for Italian PoliMorf: a (not so) new open morphological dictionary for Polish Word Alignment for English-Turkish Language Pair Unsupervised acquisition of concatenative morphology Annotating and Learning Morphological Segmentation of Egyptian Colloquial Arabic Arabic-Segmentation Combination Strategies for Statistical Machine Translation First Results in a Study Evaluating Pre-annotation and Correction Propagation for Machine-Assisted Syriac Morphological Analysis Source-Language Dictionaries Help Non-Expert Users to Enlarge Target-Language Dictionaries for Machine Translation Evaluating Hebbian Self-Organizing Memories for Lexical Representation and Access A Morphological Analyzer For Wolof Using Finite-State Techniques Arabic Word Generation and Modelling for Spell Checking Automatic Extraction and Evaluation of Arabic LFG Resources IDENTIC Corpus: Morphologically Enriched Indonesian-English Parallel Corpus The Romanian Neuter Examined Through A Two-Gender N-Gram Classification System Expanding Parallel Resources for Medium-Density Languages for Free Annotating Errors in a Hungarian Learner Corpus Analyzing and Aligning German compound nouns Joint Segmentation and POS Tagging for Arabic Using a CRF-based Classifier Recognition of Polish Derivational Relations Based on Supervised Learning Scheme The Netlog Corpus. 
A Resource for the Study of Flemish Dutch Internet Language Reconstructing the Diachronic Morphology of Romanian from Dictionary Citations Construction of the Turkish National Corpus (TNC)
Multilinguality	Alignment-based reordering for SMT Tajik-Farsi Persian Transliteration Using Statistical Machine Translation An Open Source Persian Computational Grammar Automatic Translation of Scholarly Terms into Patent Terms Using Synonym Extraction Techniques Building Text-to-Speech Systems for Resource Poor Languages Representing General Relational Knowledge in ConceptNet 5 Learning Sentiment Lexicons in Spanish Unsupervised Word Sense Disambiguation with Multilingual Representations Measuring Interlanguage: Native Language Identification with L1-influence Metrics Light Verb Constructions in the SzegedParalellFX English--Hungarian Parallel Corpus Extracting Directional and Comparable Corpora from a Multilingual Corpus for Translation Studies Representing the Translation Relation in a Bilingual Wordnet Building a multilingual parallel corpus for human users BiBiKit - A Bilingual Bimodal Reading and Writing Tool for Sign Language Users The Parallel-TUT: a multilingual and multiformat treebank Identifying equivalents of specialized verbs in a bilingual comparable corpus of judgments: A frame-based methodology HunOr: A Hungarian―Russian Parallel Corpus Language Richness of the Web A Universal Part-of-Speech Tagset Parallel Aligned Treebanks at LDC: New Challenges Interfacing Existing Infrastructures Linguistic Resources for Entity Linking Evaluation: from Monolingual to Cross-lingual Multimodal Corpus of Multi-party Conversations in Second Language Multilingual Central Repository version 3.0 Cross-lingual studies of ASR errors: paradigms for perceptual evaluations Chinese Characters Mapping Table of Japanese, Traditional Chinese and Simplified Chinese Biomedical Chinese-English CLIR Using an Extended CMeSH Resource to Expand Queries A tree is a Baum is an árbol is a sach'a: Creating a trilingual treebank SMALLWorlds -- Multilingual Content-Controlled Monologues Mining Hindi-English Transliteration Pairs from Online Hindi Lyrics A tool for enhanced search of 
multilingual digital libraries of e-journals Dbnary: Wiktionary as a LMF based Multilingual RDF network CLTC: A Chinese-English Cross-lingual Topic Corpus Correlation between Similarity Measures for Inter-Language Linked Wikipedia Articles FreeLing 3.0: Towards Wider Multilinguality A new semantically annotated corpus with syntactic-semantic and cross-lingual senses A Resource-light Approach to Phrase Extraction for English and German Documents from the Patent Domain and User Generated Content Discovering Missing Wikipedia Inter-language Links by means of Cross-lingual Word Sense Disambiguation French and German Corpora for Audience-based Text Type Classification Bulgarian X-language Parallel Corpus Automatically Generated Online Dictionaries Feedback in Nordic First-Encounters: a Comparative Study MultiUN v2: UN Documents with Multilingual Alignments The CONCISUS Corpus of Event Summaries Knowledge-Rich Context Extraction and Ranking with KnowPipe Customization of the Europarl Corpus for Translation Studies DEGELS1: A comparable corpus of French Sign Language and co-speech gestures Spell Checking in Spanish: The Case of Diacritic Accents Adaptive Dictionary for Bilingual Lexicon Extraction from Comparable Corpora Linguistic Resources for Handwriting Recognition and Translation Evaluation Italian and Spanish Null Subjects. A Case Study Evaluation in an MT Perspective. 
DGT-TM: A freely available Translation Memory in 22 languages Analyzing and Aligning German compound nouns Accessing and standardizing Wiktionary lexical entries for the translation of labels in Cultural Heritage taxonomies Extending the MPC corpus to Chinese and Urdu - A Multiparty Multi-Lingual Chat Corpus for Modeling Social Phenomena in Language BUCEADOR, a multi-language search engine for digital libraries CLIMB grammars: three projects using metagrammar engineering Le Petit Prince in UNL Identifying bilingual Multi-Word Expressions for Statistical Machine Translation Identifying Word Translations from Comparable Documents Without a Seed Lexicon A Mandarin-English Code-Switching Corpus A Fast, Memory Efficient, Scalable and Multilingual Dictionary Retriever Measuring the Divergence of Dependency Structures Cross-Linguistically to Improve Syntactic Projection Algorithms An implementation of a Latvian resource grammar in Grammatical Framework The ML4HMT Workshop on Optimising the Division of Labour in Hybrid Machine Translation
Multimedia Document Processing	PAMOCAT: Automatic retrieval of specified postures AVATecH ― automated annotation through audio and video analysis Comparing computer vision analysis of signed language video with motion capture recordings The REPERE Corpus : a multimodal corpus for person recognition A Parallel Corpus of Music and Lyrics Annotated with Emotions BUCEADOR, a multi-language search engine for digital libraries Summarizing a multimodal set of documents in a Smart Room Creating HAVIC: Heterogeneous Audio Visual Internet Collection Creating a Data Collection for Evaluating Rich Speech Retrieval
MultiWord Expressions & Collocations	Constraint Based Description of Polish Multiword Expressions Annotated Corpora for Word Alignment between Japanese and English and its Evaluation with MAP-based Word Aligner German Verb Patterns and Their Implementation in an Electronic Dictionary Light Verb Constructions in the SzegedParalellFX English--Hungarian Parallel Corpus Wordnet Based Lexicon Grammar for Polish Linguistic knowledge for specialized text production Automatic word alignment tools to scale production of manually aligned parallel texts Measuring the compositionality of NV expressions in Basque by means of distributional similarity techniques Association Norms of German Noun Compounds Evaluating the Impact of External Lexical Resources into a CRF-based Multiword Segmenter and Part-of-Speech Tagger Supervised Topical Key Phrase Extraction of News Stories using Crowdsourcing, Light Filtering and Co-reference Normalization Statistical Machine Translation without Source-side Parallel Corpus Using Word Lattice and Phrase Extension Evaluation of Classification Algorithms and Features for Collocation Extraction in Croatian Analyzing and Aligning German compound nouns Using Noun Similarity to Adapt an Acceptability Measure for Persian Light Verb Constructions Identifying bilingual Multi-Word Expressions for Statistical Machine Translation Automatic Term Recognition Needs Multiple Evidence Detecting Japanese Compound Functional Expressions using Canonical/Derivational Relation

N
Named Entity recognition	Foundations of a Multilayer Annotation Framework for Twitter Communications During Crisis Events Tree-Structured Named Entity Recognition on OCR Data: Analysis, Processing and Results Aleda, a free large-scale entity database for French Learning Categories and their Instances by Contextual Features HunOr: A Hungarian―Russian Parallel Corpus An Empirical Study of the Occurrence and Co-Occurrence of Named Entities in Natural Language Corpora Extended Named Entities Annotation on OCRed Documents: From Corpus Constitution to Evaluation Campaign Rembrandt - a named-entity recognition framework An Adaptive Framework for Named Entity Combination Temporal Tagging on Different Domains: Challenges, Strategies, and Gold Standards The ETAPE corpus for the evaluation of speech-based TV content processing in the French language Rule-based Entity Recognition and Coverage of SNOMED CT in Swedish Clinical Text Evaluating the Impact of Phrase Recognition on Concept Tagging Centroids: Gold standards with distributional variation Iterative Refinement and Quality Checking of Annotation Guidelines ― How to Deal Effectively with Semantically Sloppy Named Entity Types, such as Pathological Phenomena Evaluation of a Complex Information Extraction Application in Specific Domain CALBC: Releasing the Final Corpora Latvian and Lithuanian Named Entity Recognition with TildeNER KPWr: Towards a Free Corpus of Polish
Natural Language Generation	Generation of Verbal Stems in Derivationally Rich Language Corpus-based Referring Expressions Generation Portuguese Text Generation from Large Corpora SemScribe: Natural Language Generation for Medical Reports DSim, a Danish Parallel Corpus for Text Simplification Representation of linguistic and domain knowledge for second language learning in virtual worlds A Repository of Data and Evaluation Resources for Natural Language Generation Unsupervised document zone identification using probabilistic graphical models LG-Eval: A Toolkit for Creating Online Language Evaluation Experiments

O
Ontologies	NLP Challenges for Eunomos a Tool to Build and Manage Legal Knowledge Corpus+WordNet thesaurus generation for ontology enriching Reclassifying subcategorization frames for experimental analysis and stimulus generation SemScribe: Natural Language Generation for Medical Reports Adaptive Speech Understanding for Intuitive Model-based Spoken Dialogues A French Fairy Tale Corpus syntactically and semantically annotated Constructing a Class-Based Lexical Dictionary using Interactive Topic Models Multilingual Central Repository version 3.0 Adding Morpho-semantic Relations to the Romanian Wordnet Biomedical Chinese-English CLIR Using an Extended CMeSH Resource to Expand Queries A proposal for improving WordNet Domains An ontological approach to model and query multimodal concurrent linguistic annotations PEARL: ProjEction of Annotations Rule Language, a Language for Projecting (UIMA) Annotations over RDF Knowledge Bases The IMAGACT Cross-linguistic Ontology of Action. A new infrastructure for natural language disambiguation Rule-based Entity Recognition and Coverage of SNOMED CT in Swedish Clinical Text Towards a methodology for automatic identification of hypernyms in the definitions of large-scale dictionary Collaborative semantic editing of linked data lexica Ontoterminology: How to unify terminology and ontology into a single paradigm DBpedia: A Multilingual Cross-domain Knowledge Base Representation of linguistic and domain knowledge for second language learning in virtual worlds Applying Random Indexing to Structured Data to Find Contextually Similar Words A Treebank-driven Creation of an OntoValence Verb lexicon for Bulgarian Creation of a bottom-up corpus-based ontology for Italian Linguistics Boosting the Coverage of a Semantic Lexicon by Automatically Extracted Event Nominalizations Mapping WordNet to the Kyoto ontology Ontologies of Linguistic Annotation: Survey and perspectives A generic formalism to represent linguistic corpora in RDF and OWL/DL 
Combining Formal Concept Analysis and semantic information for building ontological structures from texts : an exploratory study Extending the EmotiNet Knowledge Base to Improve the Automatic Detection of Implicitly Expressed Emotions from Text
Other	An audiovisual political speech analysis incorporating eye-tracking and perception data Project FLY: a multidisciplinary project within Linguistics Effort of Genre Variation and Prediction of System Performance Parsing Any Domain English text to CoNLL dependencies Smooth Sailing for STEVIN A Gold Standard for Relation Extraction in the Food Domain Body-conductive acoustic sensors in human-robot communication From keystrokes to annotated process data: Enriching the output of Inputlog with linguistic information SemScribe: Natural Language Generation for Medical Reports Grammatical Error Annotation for Korean Learners of Spoken English Eye Tracking as a Tool for Machine Translation Error Analysis Cloud Logic Programming for Integrating Language Technology Resources Aspects of a Legal Framework for Language Resource Management Towards a richer wordnet representation of properties Recognition of Nonmanual Markers in American Sign Language (ASL) Using Non-Parametric Adaptive 2D-3D Face Tracking Acquisition of Syntactic Simplification Rules for French Evolution of Event Designation in Media: Preliminary Study Dysarthric Speech Database for Development of QoLT Software Technology Cost and Benefit of Using WordNet Senses for Sentiment Analysis A Corpus of Spontaneous Multi-party Conversation in Bosnian Serbo-Croatian and British English Using Wikipedia to Validate the Terminology found in a Corpus of Basic Textbooks An Evaluation of the Effect of Automatic Preprocessing on Syntactic Parsing for Biomedical Relation Extraction A review corpus annotated for negation, speculation and their scope Collection of a corpus of Dutch SMS Assessing Crowdsourcing Quality through Objective Tasks Medical Term Extraction in an Arabic Medical Corpus Challenges in the development of annotated corpora of computer-mediated communication in Indian Languages: A Case of Hindi Multimodal Behaviour and Feedback in Different Types of Interaction An NLP Curator (or: How I Learned to Stop 
Worrying and Love NLP Pipelines) Brand Pitt: A Corpus to Explore the Art of Naming Evaluating automatic cross-domain Dutch semantic role annotation Spontaneous Speech Corpora for language learners of Spanish, Chinese and Japanese Strategies to Improve a Speaker Diarisation Tool On the practice of error analysis for machine translation evaluation NgramQuery - Smart Information Extraction from Google N-gram using External Resources DEGELS1: A comparable corpus of French Sign Language and co-speech gestures The FLaReNet Strategic Language Resource Agenda Building a Multimodal Laughter Database for Emotion Recognition A Repository of Data and Evaluation Resources for Natural Language Generation Visualizing word senses in WordNet Atlas Classifying Standard Linguistic Processing Functionalities based on Fundamental Data Operation Types On the Way to a Legal Sharing of Web Applications in NLP Multi-Layer Discourse Annotation of a Dutch Text Corpus Evaluation of Discourse Relation Annotation in the Hindi Discourse Relation Bank Collecting humorous expressions from a community-based question-answering-service corpus Turkish Paraphrase Corpus Using a Goodness Measurement for Domain Adaptation: A Case Study on Chinese Word Segmentation Fine-grained German Sentiment Analysis on Social Media

P
Parsing	Effort of Genre Variation and Prediction of System Performance Parsing Any Domain English text to CoNLL dependencies Ubiquitous Usage of a Broad Coverage French Corpus: Processing the Est Republicain corpus Towards an LFG parser for Polish: An exercise in parasitic grammar development The Impact of Automatic Morphological Analysis & Disambiguation on Dependency Parsing of Turkish Task-Driven Linguistic Analysis based on an Underspecified Features Representation Making Ellipses Explicit in Dependency Conversion for a German Treebank Using Verb Subcategorization for Word Sense Disambiguation Rule-Based Detection of Clausal Coordinate Ellipsis A Basic Language Resource Kit for Persian Combining Language Resources Into A Grammar-Driven Swedish Parser The Icelandic Parsed Historical Corpus (IcePaHC) Prague Dependency Style Treebank for Tamil A Reference Dependency Bank for Analyzing Complex Predicates Expanding Arabic Treebank to Speech: Results from Broadcast News A treebank-based study on the influence of Italian word order on parsing performance The annotation of the C-ORAL-BRASIL oral through the implementation of the Palavras Parser Automatic Extraction and Evaluation of Arabic LFG Resources MaltOptimizer: A System for MaltParser Optimization Croatian Dependency Treebank: Recent Development and Initial Experiments Integrating NLP Tools in a Distributed Environment: A Case Study Chaining a Tagger with a Dependency Parser The WeSearch Corpus, Treebank, and Treecache -- A Comprehensive Sample of User-Generated Content From Grammar Rule Extraction to Treebanking: A Bootstrapping Approach
Part of speech tagging	Boosting statistical tagger accuracy with simple rule-based grammars Automatic Annotation and Manual Evaluation of the Diachronic German Corpus TüBa-D/DC The Impact of Automatic Morphological Analysis & Disambiguation on Dependency Parsing of Turkish Building a multilingual parallel corpus for human users BiBiKit - A Bilingual Bimodal Reading and Writing Tool for Sign Language Users ROMBAC: The Romanian Balanced Annotated Corpus A Universal Part-of-Speech Tagset A Curated Database for Linguistic Research: The Test Case of Cimbrian Varieties Improving corpus annotation productivity: a method and experiment with interactive tagging MISTRAL+: A Melody Intonation Speaker Tonal Range semi-automatic Analysis using variable Levels The goo300k corpus of historical Slovene First Results in a Study Evaluating Pre-annotation and Correction Propagation for Machine-Assisted Syriac Morphological Analysis The annotation of the C-ORAL-BRASIL oral through the implementation of the Palavras Parser Evaluating the Impact of External Lexical Resources into a CRF-based Multiword Segmenter and Part-of-Speech Tagger Morphosyntactic Analysis of the CHILDES and TalkBank Corpora Lemmatising Serbian as Category Tagging with Bidirectional Sequence Classification Integrating NLP Tools in a Distributed Environment: A Case Study Chaining a Tagger with a Dependency Parser Joint Segmentation and POS Tagging for Arabic Using a CRF-based Classifier UniDic for Early Middle Japanese: a Dictionary for Morphological Analysis of Classical Japanese A Study of Word-Classing for MT Reordering Using a Goodness Measurement for Domain Adaptation: A Case Study on Chinese Word Segmentation Construction of the Turkish National Corpus (TNC)
Person Identification	GerNED: A German Corpus for Named Entity Disambiguation Distractorless Authorship Verification Creating and Curating a Cross-Language Person-Entity Linking Collection The REPERE Corpus : a multimodal corpus for person recognition Strategies to Improve a Speaker Diarisation Tool A Database of Attribution Relations International Multicultural Name Matching Competition: Design, Execution, Results, and Lessons Learned
Phonetic Databases, Phonology	Building Text-to-Speech Systems for Resource Poor Languages SPPAS: a tool for the phonetic segmentation of speech A Phonemic Corpus of Polish Child-Directed Speech Comparison between two models of language for the automatic phonetic labeling of an undocumented language of the South-Asia: the case of Mo Piu Two Database Resources for Processing Social Media English Text A topologic view of Topic and Focus marking in Italian SUTAV: A Turkish Audio-Visual Database Korean Children's Spoken English Corpus and an Analysis of its Pronunciation Variability English to Indonesian Transliteration to Support English Pronunciation Practice ULex: new data models and a mobile environment for corpus enrichment.
Profiling	Joint Grammar and Treebank Development for Mandarin Chinese with HPSG Error profiling for evaluation of machine-translated text: a Polish-English case study
Prosody	An audiovisual political speech analysis incorporating eye-tracking and perception data Predicting Phrase Breaks in Classical and Modern Standard Arabic Text Open-Source Boundary-Annotated Corpus for Arabic Speech and Language Processing MULTIPHONIA: a MULTImodal database of PHONetics teaching methods in classroom InterActions. A topologic view of Topic and Focus marking in Italian Annotating a corpus of human interaction with prosodic profiles ― focusing on Mandarin repair/disfluency Prediction of Non-Linguistic Information of Spontaneous Speech from the Prosodic Annotation: Evaluation of the X-JToBI system A Galician Syntactic Corpus with Application to Intonation Modeling Prosomarker: a prosodic analysis tool based on optimal pitch stylization and automatic syllabification Evaluating expressive speech synthesis from audiobook corpora for conversational phrases Designing French Tale Corpora for Entertaining Text To Speech Synthesis

Q
Question Answering	Constructing a Question Corpus for Textual Semantic Relations Constructive Interaction for Talking about Interesting Topics Evaluating Machine Reading Systems through Comprehension Tests An English-Portuguese parallel corpus of questions: translation guidelines and application in SMT Treebanking by Sentence and Tree Transformation: Building a Treebank to support Question Answering in Portuguese Identifying Nuggets of Information in GALE Distillation Evaluation Creation and use of Language Resources in a Question-Answering eHealth System Págico: Evaluating Wikipedia-based information retrieval in Portuguese Building and Exploring Semantic Equivalences Resources Evaluating Multi-focus Natural Language Queries over Data Services

S
Semantics	Matching Cultural Heritage items to Wikipedia The Dependency-Parsed FrameNet Corpus Tree-Structured Named Entity Recognition on OCR Data: Analysis, Processing and Results Semantic Annotations in Japanese FrameNet: Comparing Frames in Japanese and English Investigating Engagement - intercultural and technological aspects of the collection, analysis, and use of the Estonian Multiparty Conversational video data Annotating Spatial Containment Relations Between Events The Role of Model Testing in Standards Development: The Case of ISO-Space A Repository of Rules and Lexical Resources for Discourse Structure Analysis: the Case of Explanation Structures A Comparative Evaluation of Word Sense Disambiguation Algorithms for German A new dynamic approach for lexical networks evaluation Polaris: Lymba's Semantic Parser Automatic classification of German "an" particle verbs LIE: Leadership, Influence and Expertise CAT: the CELCT Annotation Tool A French Fairy Tale Corpus syntactically and semantically annotated ConanDoyle-neg: Annotation of negation cues and their scope in Conan Doyle stories A voting scheme to detect semantic underspecification SentiSense: An easily scalable concept-based affective lexicon for sentiment analysis Towards a richer wordnet representation of properties Turk Bootstrap Word Sense Inventory 2.0: A Large-Scale Resource for Lexical Substitution Logical metonymies and qualia structures: an annotated database of logical metonymies for German Propbank-Br: a Brazilian Treebank annotated with semantic role labels SUTime: A library for recognizing and normalizing time expressions Adding Morpho-semantic Relations to the Romanian Wordnet The Rocky Road towards a Swedish FrameNet - Creating SweFN Evaluating Machine Reading Systems through Comprehension Tests A proposal for improving WordNet Domains Pragmatic identification of the witness sets Concept-based Selectional Preferences and Distributional Representations from Wikipedia Articles
Annotating dropped pronouns in Chinese newswire text Building a Corpus of Indefinite Uses Annotated with Fine-grained Semantic Functions Capturing syntactico-semantic regularities among terms: An application of the FrameNet methodology to terminology The FAUST Corpus of Adequacy Assessments for Real-World Machine Translation Output Annotating Story Timelines as Temporal Dependency Structures A PropBank for Portuguese: the CINTIL-PropBank Highlighting relevant concepts from Topic Signatures Extending a wordnet framework for simplicity and scalability Semantic Relations Established by Specialized Processes Expressed by Nouns and Verbs: Identification in a Corpus by means of Syntactico-semantic Annotation Alternative Lexicalizations of Discourse Connectives in Czech The IMAGACT Cross-linguistic Ontology of Action. A new infrastructure for natural language disambiguation Empty Argument Insertion in the Hindi PropBank Constructing Large Proposition Databases Semantic Role Labeling with the Swedish FrameNet Applying cross-lingual WSD to wordnet development Annotating Qualia Relations in Italian and French Complex Nominals Is it Useful to Support Users with Lexical Resources? A User Study. 
In the same boat and other idiomatic seafaring expressions German "nach"-Particle Verbs in Semantic Theory and Corpus Data Measuring the compositionality of NV expressions in Basque by means of distributional similarity techniques Modality in Text: a Proposal for Corpus Annotation ISO 24617-2: A semantically-based standard for dialogue annotation Developing a large semantically annotated corpus Associative and Semantic Features Extracted From Web-Harvested Corpora Semantic annotation of French corpora: animacy and verb semantic classes Detection of Peculiar Word Sense by Distance Metric Learning with Labeled Examples Association Norms of German Noun Compounds LexIt: A Computational Resource on Italian Argument Structure Building and Exploring Semantic Equivalences Resources The TARSQI Toolkit Supervised Topical Key Phrase Extraction of News Stories using Crowdsourcing, Light Filtering and Co-reference Normalization Annotating Near-Identity from Coreference Disagreements Evaluating automatic cross-domain Dutch semantic role annotation Polish Multimodal Corpus ― a collection of referential gestures Annotating Factive Verbs Chinese Whispers: Cooperative Paraphrase Acquisition Yes we can!? Annotating English modal verbs Evaluation of Classification Algorithms and Features for Collocation Extraction in Croatian Using Noun Similarity to Adapt an Acceptability Measure for Persian Light Verb Constructions Empirical Comparisons of MASC Word Sense Annotations Mapping WordNet to the Kyoto ontology Tools for plWordNet Development. Presentation and Perspectives An Annotation Scheme for Quantifier Scope Disambiguation Diversifiable Bootstrapping for Acquiring High-Coverage Paraphrase Resource Building Japanese Predicate-argument Structure Corpus using Lexical Conceptual Structure Enriching the ISST-TANL Corpus with Semantic Frames
Semantic Web	Large Scale Semantic Annotation, Indexing and Search at The National Archives Expertise Mining for Enterprise Content Management Collaborative semantic editing of linked data lexica DBpedia: A Multilingual Cross-domain Knowledge Base Holaaa!! writin like u talk is kewl but kinda hard 4 NLP On Using Linked Data for Language Resource Sharing in the Long Tail of the Localisation Market Creation of a bottom-up corpus-based ontology for Italian Linguistics Glottolog/Langdoc:Increasing the visibility of grey literature for low-density languages Application of a Semantic Search Algorithm to Semi-Automatic GUI Generation The Open Linguistics Working Group A disambiguation resource extracted from Wikipedia for semantic annotation
Sign Language Recognition/Generation	Detecting Reduplication in Videos of American Sign Language Recognition of Nonmanual Markers in American Sign Language (ASL) Using Non-Parametric Adaptive 2D-3D Face Tracking Comparing computer vision analysis of signed language video with motion capture recordings Resource production of written forms of Sign Languages by a user-centered editor, SWift (SignWriting improved fast transcriber) Semi-Automatic Sign Language Corpora Annotation using Lexical Representations of Signs RWTH-PHOENIX-Weather: A Large Vocabulary Sign Language Recognition and Translation Corpus A platform-independent user-friendly dictionary from Italian to LIS
Speech Recognition/Understanding	Automatic Speech Recognition on a Firefighter TETRA Broadcast Channel Body-conductive acoustic sensors in human-robot communication Adaptive Speech Understanding for Intuitive Model-based Spoken Dialogues Cross-lingual studies of ASR errors: paradigms for perceptual evaluations The acquisition and dialog act labeling of the EDECAN-SPORTS corpus Evaluating Appropriateness Of System Responses In A Spoken CALL Game A Scalable Architecture For Web Deployment of Spoken Dialogue Systems Speech and Language Resources for LVCSR of Russian The ETAPE corpus for the evaluation of speech-based TV content processing in the French language A Corpus for a Gesture-Controlled Mobile Spoken Dialogue System Intelligibility assessment in forensic applications TED-LIUM: an Automatic Speech Recognition dedicated corpus Leveraging study of robustness and portability of spoken language understanding systems across languages and domains: the PORTMEDIA corpora The DISCO ASR-based CALL system: practicing L2 oral skills and beyond Using an ASR database to design a pronunciation evaluation system in Basque Developing Partially-Transcribed Speech Corpus from Edited Transcriptions
Speech resource/database	A Speech and Gesture Spatial Corpus in Assisted Living Causal analysis of task completion errors in spoken music retrieval interactions LDC Forced Aligner “You Seem Aggressive!” Monitoring Anger in a Practical Application Annotating progressive aspect constructions in the spoken section of the British National Corpus The KIT Lecture Corpus for Speech Translation The IWSLT 2011 Evaluation Campaign on Automatic Talk Translation Versatile Speech Databases for High Quality Synthesis for Basque CoALT: A Software for Comparing Automatic Labelling Tools Balanced data repository of spontaneous spoken Czech NKI-CCRT Corpus - Speech Intelligibility Before and After Advanced Head and Neck Cancer Treated with Concomitant Chemoradiotherapy Item Development and Scoring for Japanese Oral Proficiency Testing Building a synchronous corpus of acoustic and 3D facial marker data for adaptive audio-visual speech synthesis A Parameterized and Annotated Spoken Dialog Corpus of the CMU Let's Go Bus Information System SMALLWorlds -- Multilingual Content-Controlled Monologues Dysarthric Speech Database for Development of QoLT Software Technology Annotating a corpus of human interaction with prosodic profiles ― focusing on Mandarin repair/disfluency SUTAV: A Turkish Audio-Visual Database KALAKA-2: a TV Broadcast Speech Database for the Recognition of Iberian Languages in Clean and Noisy Environments Developing and evaluating an emergency scenario dialogue corpus The ETAPE corpus for the evaluation of speech-based TV content processing in the French language LAST MINUTE: a Multimodal Corpus of Speech-based User-Companion Interactions Designing a search interface for a Spanish learner spoken corpus: the end-user's evaluation The Twins Corpus of Museum Visitor Questions Method for Collection of Acted Speech Using Various Situation Scripts The C-ORAL-BRASIL I: Reference Corpus for Spoken Brazilian Portuguese Towards Fully Automatic Annotation of Audio Books for TTS 
Multimedia database of the cultural heritage of the Balkans LAMP: A Multimodal Web Platform for Collaborative Linguistic Analysis Syntactic annotation of spontaneous speech: application to call-center conversation data Korean Children's Spoken English Corpus and an Analysis of its Pronunciation Variability Designing an Evaluation Framework for Spoken Term Detection and Spoken Document Retrieval at the NTCIR-9 SpokenDoc Task TED-LIUM: an Automatic Speech Recognition dedicated corpus Building Text-To-Speech Voices in the Cloud Leveraging study of robustness and portability of spoken language understanding systems across languages and domains: the PORTMEDIA corpora W-PhAMT: A web tool for phonetic multilevel timeline visualization The Nordic Dialect Corpus Towards Emotion and Affect Detection in the Multimodal LAST MINUTE Corpus The DISCO ASR-based CALL system: practicing L2 oral skills and beyond Using an ASR database to design a pronunciation evaluation system in Basque New language resources for the Pashto language Corpus of Children Voices for Mid-level Markers and Affect Bursts Analysis Evaluating expressive speech synthesis from audiobook corpora for conversational phrases The I3MEDIA speech database: a trilingual annotated corpus for the analysis and synthesis of emotional speech Creating a Data Collection for Evaluating Rich Speech Retrieval A Mandarin-English Code-Switching Corpus Developing Partially-Transcribed Speech Corpus from Edited Transcriptions
Speech Synthesis	Building Synthetic Voices in the META-NET Framework Building Text-to-Speech Systems for Resource Poor Languages Versatile Speech Databases for High Quality Synthesis for Basque Building a synchronous corpus of acoustic and 3D facial marker data for adaptive audio-visual speech synthesis Practical Evaluation of Human and Synthesized Speech for Virtual Human Dialogue Systems Building Text-To-Speech Voices in the Cloud Evaluating expressive speech synthesis from audiobook corpora for conversational phrases Designing French Tale Corpora for Entertaining Text To Speech Synthesis
Standards for LRs	Using DiAML and ANVIL for multimodal dialogue annotations The Role of Model Testing in Standards Development: The Case of ISO-Space The coding and annotation of multimodal dialogue acts ROMBAC: The Romanian Balanced Annotated Corpus Iula2Standoff: a tool for creating standoff documents for the IULACT Temporal Annotation: A Proposal for Guidelines and an Experiment with Inter-annotator Agreement Semantic metadata mapping in practice: the Virtual Language Observatory UBY-LMF -- A Uniform Model for Standardizing Heterogeneous Lexical-Semantic Resources in ISO-LMF Modality in Text: a Proposal for Corpus Annotation Federated Search: Towards a Common Search Infrastructure EXMARaLDA and the FOLK tools ― two toolsets for transcribing and annotating spoken language ISO 24617-2: A semantically-based standard for dialogue annotation Towards a User-Friendly Platform for Building Language Resources based on Web Services Design and compilation of a specialized Spanish-German parallel corpus Conventional Orthography for Dialectal Arabic Standardizing a Component Metadata Infrastructure Citing on-line Language Resources On Using Linked Data for Language Resource Sharing in the Long Tail of the Localisation Market The Polish Sejm Corpus Creation of an Open Shared Language Resource Repository in the Nordic and Baltic Countries Romanian TimeBank: An Annotated Parallel Corpus for Temporal Information Towards automation in using multi-modal language resources: compatibility and interoperability for multi-modal features in Kachako Accessing and standardizing Wiktionary lexical entries for the translation of labels in Cultural Heritage taxonomies An Analytical Model of Language Resource Sustainability “Rendering Endangered Lexicons Interoperable through Standards Harmonization”: the RELISH project Ontologies of Linguistic Annotation: Survey and perspectives RELcat: a Relation Registry for ISOcat data categories GATEtoGerManC: A GATE-based Annotation Pipeline for Historical German
Statistical and machine learning methods	Effort of Genre Variation and Prediction of System Performance Creating a Coreference Resolution System for Polish Evaluating the Similarity Estimator component of the TWIN Personality-based Recommender System Bootstrapping Sentiment Labels For Unannotated Documents With Polarity PageRank DutchSemCor: Targeting the ideal sense-tagged corpus The BladeMistress Corpus: From Talk to Action in Virtual Worlds Comparison between two models of language for the automatic phonetic labeling of an undocumented language of the South-Asia: the case of Mo Piu Task-Driven Linguistic Analysis based on an Underspecified Features Representation Distractorless Authorship Verification Predicting Phrase Breaks in Classical and Modern Standard Arabic Text TimeBankPT: A TimeML Annotated Corpus of Portuguese Constructing a Class-Based Lexical Dictionary using Interactive Topic Models Recognition of Nonmanual Markers in American Sign Language (ASL) Using Non-Parametric Adaptive 2D-3D Face Tracking Evaluation of Online Dialogue Policy Learning Techniques Re-ordering Source Sentences for SMT Semi-Supervised Technical Term Tagging With Minimal User Feedback A database of semantic clusters of verb usages PEXACC: A Parallel Sentence Mining Algorithm from Comparable Corpora Visualizing Sentiment Analysis on a User Forum Towards Automatic Gesture Stroke Detection Annotating and Learning Morphological Segmentation of Egyptian Colloquial Arabic Evaluating Hebbian Self-Organizing Memories for Lexical Representation and Access Evaluating the Impact of External Lexical Resources into a CRF-based Multiword Segmenter and Part-of-Speech Tagger Similarity Ranking as Attribute for Machine Learning Approach to Authorship Identification Suffix Trees as Language Models The Romanian Neuter Examined Through A Two-Gender N-Gram Classification System Automatic Translation of Scientific Documents in the HAL Archive Irregularity Detection in Categorized Document Corpora 
Lemmatising Serbian as Category Tagging with Bidirectional Sequence Classification MaltOptimizer: A System for MaltParser Optimization Customization of the Europarl Corpus for Translation Studies Rhetorical Move Detection in English Abstracts: Multi-label Sentence Classifiers and their Annotated Corpora Evaluating Multi-focus Natural Language Queries over Data Services Evaluation of Classification Algorithms and Features for Collocation Extraction in Croatian RWTH-PHOENIX-Weather: A Large Vocabulary Sign Language Recognition and Translation Corpus Joint Segmentation and POS Tagging for Arabic Using a CRF-based Classifier Using Noun Similarity to Adapt an Acceptability Measure for Persian Light Verb Constructions Improving K-Nearest Neighbor Efficacy for Farsi Text Classification An Annotation Scheme for Quantifier Scope Disambiguation A hierarchical approach with feature selection for emotion recognition from speech Using a Goodness Measurement for Domain Adaptation: A Case Study on Chinese Word Segmentation The ML4HMT Workshop on Optimising the Division of Labour in Hybrid Machine Translation
Summarisation	A good space: Lexical predictors in word space evaluation Effects of Document Clustering in Modeling Wikipedia-style Term Descriptions Text Simplification Tools for Spanish This also affects the context - Errors in extraction based summaries Summarizing a multimodal set of documents in a Smart Room

T
Text mining	Dependency parsing for interaction detection in pharmacogenomics Evaluating the Similarity Estimator component of the TWIN Personality-based Recommender System Mining Sentiment Words from Microblogs for Predicting Writer-Reader Emotion Transition Large Scale Semantic Annotation, Indexing and Search at The National Archives QurAna: Corpus of the Quran annotated with Pronominal Anaphora Learning Categories and their Instances by Contextual Features The BladeMistress Corpus: From Talk to Action in Virtual Worlds Leveraging the Wisdom of the Crowds for the Acquisition of Multilingual Language Resources Quantising Opinions for Political Tweets Analysis Open-Source Boundary-Annotated Corpus for Arabic Speech and Language Processing Automatically Extracting Procedural Knowledge from Instructional Texts using Natural Language Processing A Cross-Lingual Dictionary for English Wikipedia Concepts Mining Hindi-English Transliteration Pairs from Online Hindi Lyrics Expertise Mining for Enterprise Content Management A Framework for Spelling Correction in Persian Language Using Noisy Channel Model SemSim: Resources for Normalized Semantic Similarity Computation Using Lexical Networks An Evaluation of the Effect of Automatic Preprocessing on Syntactic Parsing for Biomedical Relation Extraction Associative and Semantic Features Extracted From Web-Harvested Corpora Evaluation of Unsupervised Information Extraction From medical language processing to BioNLP domain Irregularity Detection in Categorized Document Corpora Spell Checking for Chinese Identification of Manner in Bio-Events CALBC: Releasing the Final Corpora Annotated Bibliographical Reference Corpora in Digital Humanities Unsupervised document zone identification using probabilistic graphical models A Holistic Approach to Bilingual Sentence Fragment Extraction from Comparable Corpora Automatic Term Recognition Needs Multiple Evidence Improving K-Nearest Neighbor Efficacy for Farsi Text Classification 
Collaborative Development and Evaluation of Text-processing Workflows in a UIMA-supported Web-based Workbench
Textual Entailment and Paraphrasing	Constructing a Question Corpus for Textual Semantic Relations Automatic Translation of Scholarly Terms into Patent Terms Using Synonym Extraction Techniques Logical metonymies and qualia structures: an annotated database of logical metonymies for German DSim, a Danish Parallel Corpus for Text Simplification A contrastive review of paraphrase acquisition techniques Annotating Factive Verbs Chinese Whispers: Cooperative Paraphrase Acquisition Diversifiable Bootstrapping for Acquiring High-Coverage Paraphrase Resource Turkish Paraphrase Corpus
Tools, systems, applications	Same domain different discourse style - A case study on Language Resources for data-driven Machine Translation Tajik-Farsi Persian Transliteration Using Statistical Machine Translation Statistical Section Segmentation in Free-Text Clinical Records ATLIS: Identifying Locational Information in Text Automatically Incorporating an Error Corpus into a Spellchecker for Maltese Building a 70 billion word corpus of English from ClueWeb A data and analysis resource for an experiment in text mining a collection of micro-blogs on a political topic. Creating a Coreference Resolution System for Polish LDC Forced Aligner A finite-state morphological transducer for Kyrgyz Evaluating the Similarity Estimator component of the TWIN Personality-based Recommender System Annotating Agreement and Disagreement in Threaded Discussion Logic Based Methods for Terminological Assessment Fast Labeling and Transcription with the Speechalyzer Toolkit SPPAS: a tool for the phonetic segmentation of speech Orthographic Transcription: which enrichment is required for phonetization? 
A High-Quality Web Corpus of Czech TIMEN: An Open Temporal Expression Normalisation Resource Risk Analysis and Prevention: LELIE, a Tool dedicated to Procedure and Requirement Authoring WebAnnotator, an Annotation Tool for Web Pages Towards a comprehensive open repository of Polish language resources Constructive Interaction for Talking about Interesting Topics Portuguese Text Generation from Large Corpora SUMAT: Data Collection and Parallel Corpus Compilation for Machine Translation of Subtitles From keystrokes to annotated process data: Enriching the output of Inputlog with linguistic information Polaris: Lymba's Semantic Parser CoALT: A Software for Comparing Automatic Labelling Tools Textual Characteristics for Language Engineering A Survey of Text Mining Architectures and the UIMA Standard A Rule-based Morphological Analyzer for Murrinh-Patha Representing the Translation Relation in a Bilingual Wordnet Can Statistical Post-Editing with a Small Parallel Corpus Save a Weak MT Engine? 
Detecting Reduplication in Videos of American Sign Language EVALIEX ― A Proposal for an Extended Evaluation Methodology for Information Extraction Systems BiBiKit - A Bilingual Bimodal Reading and Writing Tool for Sign Language Users Leveraging the Wisdom of the Crowds for the Acquisition of Multilingual Language Resources Using multimodal resources for explanation approaches in intelligent systems CAT: the CELCT Annotation Tool Robust clause boundary identification for corpus annotation NKI-CCRT Corpus - Speech Intelligibility Before and After Advanced Head and Neck Cancer Treated with Concomitant Chemoradiotherapy PaCo2: A Fully Automated tool for gathering Parallel Corpora from the Web Making Ellipses Explicit in Dependency Conversion for a German Treebank Item Development and Scoring for Japanese Oral Proficiency Testing Further Developments in Treebank Error Detection Using Derivation Trees Dictionary Look-up with Katakana Variant Recognition SUTime: A library for recognizing and normalizing time expressions Two Database Resources for Processing Social Media English Text Experiences in Resource Generation for Machine Translation through Crowdsourcing An Oral History Annotation Tool for INTER-VIEWs Comparing computer vision analysis of signed language video with motion capture recordings Automatic MT Error Analysis: Hjerson Helping Addicter Semi-Supervised Technical Term Tagging With Minimal User Feedback Diachronic Changes in Text Complexity in 20th Century English Language: An NLP Approach Combining Language Resources Into A Grammar-Driven Swedish Parser An ontological approach to model and query multimodal concurrent linguistic annotations Extending a wordnet framework for simplicity and scalability Comparing performance of different set-covering strategies for linguistic content optimization in speech corpora PEXACC: A Parallel Sentence Mining Algorithm from Comparable Corpora A Framework for Spelling Correction in Persian Language Using Noisy Channel Model 
Customizable SCF Acquisition in Italian Statistical Evaluation of Pronunciation Encoding MISTRAL+: A Melody Intonation Speaker Tonal Range semi-automatic Analysis using variable Levels A Grammar-informed Corpus-based Sentence Database for Linguistic and Computational Studies ELAN development, keeping pace with communities' needs Resource production of written forms of Sign Languages by a user-centered editor, SWift (SignWriting improved fast transcriber) Rembrandt - a named-entity recognition framework An Adaptive Framework for Named Entity Combination Linguistic knowledge for specialized text production FreeLing 3.0: Towards Wider Multilinguality Inforex -- a web-based tool for text corpus management and semantic annotation Massively Increasing TIMEX3 Resources: A Transduction Approach Towards Automatic Gesture Stroke Detection Developing LMF-XML Bilingual Dictionaries for Colloquial Arabic Dialects SemSim: Resources for Normalized Semantic Similarity Computation Using Lexical Networks A Resource-light Approach to Phrase Extraction for English and German Documents from the Patent Domain and User Generated Content Kitten: a tool for normalizing HTML and extracting its textual content A Metadata Editor to Support the Description of Linguistic Resources A Repository for the Sustainable Management of Research Data Large Scale Lexical Analysis NTUSocialRec: An Evaluation Dataset Constructed from Microblogs for Recommendation Applications in Social Networks Automatic lexical semantic classification of nouns Building and Exploiting a Corpus of Dialog Interactions between French Speaking Virtual and Human Agents Arabic-Segmentation Combination Strategies for Statistical Machine Translation Using Wikipedia to Validate the Terminology found in a Corpus of Basic Textbooks The SERENOA Project: Multidimensional Context-Aware Adaptation of Service Front-Ends The Herme Database of Spontaneous Multimodal Human-Robot Dialogues
RIDIRE-CPI: an Open Source Crawling and Processing Infrastructure for Supervised Web-Corpora Building A Search Tool for FrameNet Constructicon Extraction of unmarked quotations in Newspapers A Morphological Analyzer For Wolof Using Finite-State Techniques Word Sketches for Turkish Development of a Web-Scale Chinese Word N-gram Corpus with Parts of Speech Information Annotation Facilities for the Reliable Analysis of Human Motion Arabic Word Generation and Modelling for Spell Checking Automatically Generated Online Dictionaries Citing on-line Language Resources Translog-II: a Program for Recording User Activity Data for Empirical Reading and Writing Research Holaaa!! writin like u talk is kewl but kinda hard 4 NLP Towards Fully Automatic Annotation of Audio Books for TTS Multimedia database of the cultural heritage of the Balkans ANALEC: a New Tool for the Dynamic Annotation of Textual Data Annotating Opinions in German Political News IDENTIC Corpus: Morphologically Enriched Indonesian-English Parallel Corpus Web Service integration platform for Polish linguistic resources The TARSQI Toolkit Annotation Trees: LDC's customizable, extensible, scalable, annotation infrastructure An NLP Curator (or: How I Learned to Stop Worrying and Love NLP Pipelines) Hindi Subjective Lexicon: A Lexical Resource for Hindi Adjective Polarity Classification Intelligibility assessment in forensic applications Spontaneous Speech Corpora for language learners of Spanish, Chinese and Japanese The SYNC3 Collaborative Annotation Tool Lemmatising Serbian as Category Tagging with Bidirectional Sequence Classification Efficient Dependency Graph Matching with the IMS Open Corpus Workbench Strategies to Improve a Speaker Diarisation Tool MaltOptimizer: A System for MaltParser Optimization Evaluation of the KomParse Conversational Non-Player Characters in a Commercial Virtual World Using Language Resources in Humanities research NgramQuery - Smart Information Extraction from Google N-gram using External Resources
Adapting and evaluating a generic term extraction tool A GUI to Detect and Correct Errors in Hindi Dependency Treebank Example-Based Treebank Querying Evaluation of a Complex Information Extraction Application in Specific Domain Text Simplification Tools for Spanish Specifying Treebanks, Outsourcing Parsebanks: FinnTreeBank 3 W-PhAMT: A web tool for phonetic multilevel timeline visualization The Nordic Dialect Corpus This also affects the context - Errors in extraction based summaries Rapid creation of large-scale corpora and frequency dictionaries Prosomarker: a prosodic analysis tool based on optimal pitch stylization and automatic syllabification The DISCO ASR-based CALL system: practicing L2 oral skills and beyond METU Turkish Discourse Bank Browser The New IDS Corpus Analysis Platform: Challenges and Prospects Application of a Semantic Search Algorithm to Semi-Automatic GUI Generation Two Phase Evaluation for Selecting Machine Translation Services A Graphical Citation Browser for the ACL Anthology Service Composition Scenarios for Task-Oriented Translation Towards automation in using multi-modal language resources: compatibility and interoperability for multi-modal features in Kachako META-SHARE v2: An Open Network of Repositories for Language Resources including Data and Tools A Tool/Database Interface for Multi-Level Analyses Using an ASR database to design a pronunciation evaluation system in Basque BUCEADOR, a multi-language search engine for digital libraries Getting more data -- Schoolkids as annotators Visualizing word senses in WordNet Atlas Building Large Corpora from the Web Using a New Efficient Tool Chain Assessing Divergence Measures for Automated Document Routing in an Adaptive MT System Linguagrid: a network of Linguistic and Semantic Services for the Italian Language.
JRC Eurovoc Indexer JEX - A freely available multi-label categorisation tool “Rendering Endangered Lexicons Interoperable through Standards Harmonization”: the RELISH project The Language Archive ― a new hub for language resources A Holistic Approach to Bilingual Sentence Fragment Extraction from Comparable Corpora A methodology for the extraction of information about the usage of formulaic expressions in scientific texts Tools for plWordNet Development. Presentation and Perspectives A Concise Query Language with Search and Transform Operations for Corpora with Multiple Levels of Annotation Collecting and Using Comparable Corpora for Statistical Machine Translation Clause-based Discourse Segmentation of Arabic Texts Latvian and Lithuanian Named Entity Recognition with TildeNER LG-Eval: A Toolkit for Creating Online Language Evaluation Experiments Collaborative Development and Evaluation of Text-processing Workflows in a UIMA-supported Web-based Workbench A Fast, Memory Efficient, Scalable and Multilingual Dictionary Retriever Rapidly Testing the Interaction Model of a Pronunciation Training System via Wizard-of-Oz An implementation of a Latvian resource grammar in Grammatical Framework Dealing with unknown words in statistical machine translation PET: a Tool for Post-editing and Assessing Machine Translation
Topic detection & tracking	An Examination of Cross-Cultural Similarities and Differences from Social Media Data with respect to Language Use
Typological databases	Building a Corpus of Indefinite Uses Annotated with Fine-grained Semantic Functions

U
Usability, user satisfaction	BLEU Evaluation of Machine-Translated English-Croatian Legislation Involving Language Professionals in the Evaluation of Machine Translation An Analysis (and an Annotated Corpus) of User Responses to Machine Translation Output Resource production of written forms of Sign Languages by a user-centered editor, SWift (SignWriting improved fast transcriber) Inforex -- a web-based tool for text corpus management and semantic annotation Towards Automatic Gesture Stroke Detection Designing a search interface for a Spanish learner spoken corpus: the end-user's evaluation Service Composition Scenarios for Task-Oriented Translation English to Indonesian Transliteration to Support English Pronunciation Practice Fivehundredmillionandone Tokens. Loading the AAC Container with Text Resources for Text Studies. Rapidly Testing the Interaction Model of a Pronunciation Training System via Wizard-of-Oz

V
Validation of LRs	The Use of Parallel and Comparable Data for Analysis of Abstract Anaphora in German and English Extracting Directional and Comparable Corpora from a Multilingual Corpus for Translation Studies Event Nominals: Annotation Guidelines and a Manually Annotated Corpus in French Automatic classification of German "an" particle verbs TimeBankPT: A TimeML Annotated Corpus of Portuguese Statistical Evaluation of Pronunciation Encoding A Portuguese-Spanish Corpus Annotated for Subject Realization and Referentiality The Influence of Corpus Quality on Statistical Measurements on Language Resources KALAKA-2: a TV Broadcast Speech Database for the Recognition of Iberian Languages in Clean and Noisy Environments Analyzing the Impact of Prevalence on the Evaluation of a Manual Annotation Campaign Annotating Football Matches: Influence of the Source Medium on Manual Annotation A GUI to Detect and Correct Errors in Hindi Dependency Treebank Document Attrition in Web Corpora: an Exploration Building Large Corpora from the Web Using a New Efficient Tool Chain CLCM - A Linguistic Resource for Effective Simplification of Instructions in the Crisis Management Domain and its Evaluations A Concise Query Language with Search and Transform Operations for Corpora with Multiple Levels of Annotation

W
Web Services	Cloud Logic Programming for Integrating Language Technology Resources Korp ― the corpus infrastructure of Språkbanken The open lexical infrastructure of Språkbanken Introducing the Reference Corpus of Contemporary Portuguese Online An Oral History Annotation Tool for INTER-VIEWs A tool for enhanced search of multilingual digital libraries of e-journals Collecting and Analysing Chats and Tweets in SoNaR Recent Developments in CLARIN-NL The SERENOA Project: Multidimensional Context-Aware Adaptation of Service Front-Ends Dynamic web service deployment in a cloud environment Towards a User-Friendly Platform for Building Language Resources based on Web Services Typing Race Games as a Method to Create Spelling Error Corpora Word Sketches for Turkish Web Service integration platform for Polish linguistic resources Annotation Trees: LDC's customizable, extensible, scalable, annotation infrastructure ELRA in the heart of a cooperative HLT world Building Text-To-Speech Voices in the Cloud Using Language Resources in Humanities research Two Phase Evaluation for Selecting Machine Translation Services Service Composition Scenarios for Task-Oriented Translation Linguistic Analysis Processing Line for Bulgarian Classifying Standard Linguistic Processing Functionalities based on Fundamental Data Operation Types Linguagrid: a network of Linguistic and Semantic Services for the Italian Language. On the Way to a Legal Sharing of Web Applications in NLP LDC Language Resource Database: Building a Bibliographic Database
Word Sense Disambiguation	ATLIS: Identifying Locational Information in Text Automatically Unsupervised Word Sense Disambiguation with Multilingual Representations Cleaning noisy wordnets Wordnet extension made simple: A multilingual lexicon-based approach using wiki resources A Comparative Evaluation of Word Sense Disambiguation Algorithms for German DutchSemCor: Targeting the ideal sense-tagged corpus A voting scheme to detect semantic underspecification Mapping WordNet synsets to Wikipedia articles Turk Bootstrap Word Sense Inventory 2.0: A Large-Scale Resource for Lexical Substitution Using Verb Subcategorization for Word Sense Disambiguation The IMAGACT Cross-linguistic Ontology of Action. A new infrastructure for natural language disambiguation A new semantically annotated corpus with syntactic-semantic and cross-lingual senses Applying cross-lingual WSD to wordnet development Discovering Missing Wikipedia Inter-language Links by means of Cross-lingual Word Sense Disambiguation Evaluating the Impact of Phrase Recognition on Concept Tagging Detection of Peculiar Word Sense by Distance Metric Learning with Labeled Examples The MASC Word Sense Corpus Addressing polysemy in bilingual lexicon extraction from comparable corpora A light way to collect comparable corpora from the Web Using semi-experts to derive judgments on word sense alignment: a pilot study Yes we can!? Annotating English modal verbs Empirical Comparisons of MASC Word Sense Annotations Detecting Japanese Compound Functional Expressions using Canonical/Derivational Relation KPWr: Towards a Free Corpus of Polish

Powered by ELDA © 2012 ELDA/ELRA