CorefMT

Annotated corpora used in experiments in papers:

https://aclweb.org/anthology/E/E17/E17-2104.pdf

@inproceedings{zora136447,
       booktitle = {15th Conference of the European Chapter of the Association for Computational Linguistics},
           month = {April},
           title = {Co-reference Resolution of Elided Subjects and Possessive Pronouns in Spanish-English Statistical Machine Translation},
          author = {Annette Rios and Don Tuggener},
       publisher = {Association for Computational Linguistics},
            year = {2017},
           pages = {657--662},
             url = {http://dx.doi.org/10.5167/uzh-136447}
}

and

https://aclweb.org/anthology/E/E17/E17-2100.pdf

@inproceedings{zora136594,
       booktitle = {15th Conference of the European Chapter of the Association for Computational Linguistics},
           month = {April},
           title = {Machine Translation of Spanish Personal and Possessive Pronouns Using Anaphora Probabilities},
          author = {Ngoc Quang Luong and Andrei Popescu-Belis and Annette Rios and Don Tuggener},
       publisher = {Association for Computational Linguistics},
            year = {2017},
           pages = {631--636},
             url = {http://dx.doi.org/10.5167/uzh-136594}
}

Annotated News Commentary Corpus (v11) see: http://opus.lingfil.uu.se/download.php?f=News-Commentary11/News-Commentary11.tar.gz

Morphological analysis: FreeLing (http://nlp.lsi.upc.edu/freeling/)
Tagging: Wapiti (https://wapiti.limsi.fr/)
Parsing: MaltParser (http://www.maltparser.org/)
Co-Reference: Corzu: (http://www.cl.uzh.ch/de/research/completed-research/coreferenceresolution.html)

Contents:

es-en: pure text, sentence aligend
es-en-trees: English text, Spanish dependency trees in Moses XML (binarized)
es-en-posscoref: pure text, Spanish with dummies for null subjects and annotated possessive pronoun 'su/sus' and annotated relative pronoun 'que'
es-en-posscoref-trees: English text, Spanish dependency trees in Moses XML, with dummies for null subjects and annotated possessive pronoun 'su/sus' and annotated relative pronoun 'que'

nc11.es.coref.conll.tar.bz2

annotated Spanish corpus, conll with entities

nc11.es.coref.chains.tar.bz2

*.mables: extracted markables
*.mables.chains: co-reference chains
*.ante_scores: scores of possible antecedent for each pronoun

The format of the mentions is as follows: [MentionID, SentenceID, MentionStartToken, MentionEndToken, PoS, Person, Gender, Number, Gram. Funct., Animate, Dependency Head Token, Gov. Verb, NE class, *, Head lemma]

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
es-en-posscoref-trees		es-en-posscoref-trees
es-en-posscoref		es-en-posscoref
es-en-trees		es-en-trees
es-en		es-en
nc11_es_coref_chains		nc11_es_coref_chains
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CorefMT

About

Releases 1

Packages

Languages

a-rios/CorefMT

Folders and files

Latest commit

History

Repository files navigation

CorefMT

About

Topics

Resources

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages