Mirella De Sisto

Tilburg University, Cognitive Science and Artificial Intelligence, Assistant Professor

Conference Presentations by Mirella De Sisto

Challenges with Sign Language Datasets for Sign Language Recognition and Translation

by Vincent Vandeghinste and Mirella De Sisto

LREC2022 Proceedings, 2022

Sign Languages (SLs) are the primary means of communication for at least half a million people in Europe alone. However, the development of SL recognition and translation tools is slowed down by a series of obstacles concerning resource scarcity and standardization issues in the available data. The former challenge relates to the volume of data available for machine learning as well as the time required to collect and process new data. The latter obstacle is linked to the variety of the data, i.e., annotation formats are not unified and vary amongst different resources. The available data formats are often not suitable for machine learning, obstructing the provision of automatic tools based on neural models. In the present paper, we give an overview of these challenges by comparing various SL corpora and SL machine learning datasets. Furthermore, we propose a framework to address the lack of standardization at format level, unify the available resources and facilitate SL research for different languages. Our framework takes ELAN files as inputs and returns textual and visual data ready to train SL recognition and translation models. We present a proof of concept, training neural translation models on the data produced by the proposed framework.

Rhymes in Spanish romances as evidence of internally layered ternary feet

The birth of the iamb in Early Renaissance Low Countries

25/01/17

The rise of a morphological distinction: converging metaphony and RF in Airolano

24/06/2015. Talk at the Dialect Meeting 2015 and CIDSM X, Leiden University Centre for Linguisti...

Papers by Mirella De Sisto

XSL-HoReCo and GoSt-ParC-Sign: Two New Signed Language - Written Language Parallel Corpora

Linköping electronic conference proceedings, Jul 9, 2024

Metronome: tracing variation in poetic meters via local sequence alignment

arXiv (Cornell University), Apr 26, 2024

PoeTree. Poetry Treebanks in Czech, English, French, German, Hungarian, Italian, Portuguese, Russian and Spanish

Zenodo (CERN European Organization for Nuclear Research), Oct 16, 2023

Understanding poetry using natural language processing tools: a survey

Digital scholarship in the humanities, Feb 7, 2024

The Development of a Poetic Tradition. A Study of a Dutch Renaissance Poetry Corpus

Studia Metrica et Poetica, Sep 10, 2023

Report on Europe's Sign Languages (ELE D1.40)

Zenodo (CERN European Organization for Nuclear Research), Jun 9, 2023

Predicting Perceptual Centers Located at Vowel Onset in German Speech Using Long Short-Term Memory Networks

INTERSPEECH 2023

Tailoring Domain Adaptation for Machine Translation Quality Estimation

arXiv (Cornell University), Apr 18, 2023

Microvariation in the second form of the infinitive in Campania

Isogloss, Mar 14, 2024

Modelli di demarcazione della metà verso nel metro rinascimentale romanzo

ILLA - Nuove Ricerche Umanistiche, 2021

Rhymes in Spanish Romances as Evidence of Internally Layered Ternary Feet

28th Manchester Phonology Meeting, May 1, 2021

The interaction between phonology and metre: Approaches to Romance and West Germanic Renaissance metre

The prosodification of possessive enclitics in Airola and Boiano

Moderna Språk, 2020

In southern Italian dialects, possessives have an enclitic variant typically associated with kinship nouns (Rohlfs 1967, Sotiri 2007, Ledgeway 2009, D’Alessandro & Migliori 2017) (e.g. [ˈfratə-mə] 'brother my'). The most common strategies for avoiding violations of the three-syllable window are avoiding the enclitic form of the possessive or shifting stress, as in Lucanian (e.g. [ˌiennəˈru-mə] cf. [ˈiennərə], Lüdtke 1979:31). In the dialects of Airola and Boiano, a different strategy is attested: with proparoxytonic nouns (e.g. [ˈjennərə] 'son-in-law' in both varieties and [ˈsɔtʃəra]/[ˈswotʃəra] 'mother/father-in-law' in Boiano), the last unstressed syllable of the host is deleted (e.g. [ˈjennə-mə], [ˈsɔtʃə-mə], [ˈswotʃə-mə]). We claim that possessive enclitics in Airola and Boiano are internal clitics, that is, they amalgamate with the prosodic word that contains the host noun. We further propose that both proparoxytonic stress and the three-syllable window derive from internally layered ternary feet (Martínez-Paricio 2013). These feet need to be aligned with the right edge of their containing prosodic word. When a possessive enclitic is incorporated, the optimal strategy to comply with this alignment requirement is to build an internally layered ternary foot and delete the last syllable of the host noun, stress shift being excluded.

OntoPoetry : Postdata Ontology for poetry domain

The idiosyncrasy of literary studies has been an obstacle to their technological improvement for years, especially when it comes to representing their knowledge in a machine-readable format. The richness, variety, and differing perspectives that scholars bring to their studies make this task a highly complex challenge. This complexity is even more noticeable in the poetry genre, where each poetic tradition has independently developed its analytical terminology and methodology. In this work, we have addressed the construction of a poetry ontology to express scholars' knowledge spread out in isolated databases or works. The OntoPoetry ontology has been developed following the NeOn methodology, and it has been structured in three modules: a) core, b) poetic analysis and c) transmission, covering the essential aspects of a literary study of poetry. The OntoPoetry core module has been aligned with the FRBRoo ontology, guaranteeing its interoperability. This paper is focused on the description of the core module, its ...

Towards an Ontology for European Poetry

The main aim of the Poetry Standardization and Linked Open Data Project, POSTDATA, is to provide the means for researchers on European poetry to publish and consume semantically enriched data. Thus, developing a poetry ontology is a pillar of its semantic domain. This ontology tries to enhance interoperability in the European poetry community and capture the European poetry domain knowledge.

PoetryLab. An Open Source Toolkit for the Analysis of Spanish Poetry Corpora

The study of the poetic features of text, especially their rhythmic structure when forming verses, pertains to the different traditions, whose scholars established the rules that might govern poetry. Within this context, the POSTDATA Project formalized a network of ontologies able to express any poetic expression and its analysis at the European level, enabling scholars all over Europe to interchange their data using Linked Open Data. However, varied research interests result in corpora that might not share the same facets of an analysis. To alleviate this concern and foster the completeness of the interchanged corpora, our team set out to build a software toolkit to assist in the analysis of poetry. This paper introduces PoetryLab, an extensible open source toolkit for syllabification, scansion (extraction of stress patterns), enjambment detection (syntactical units split across two lines), rhyme detection, and historical named entity recognition for Spanish poetry. Our toolkit achieve...

A Hybrid Approach to Stanza Classification in Spanish Poetry

The creation and analysis of poetry have commonly been carried out by hand, with only a few computer-assisted approaches appearing over the years. In the Spanish context, the promise of machine learning is starting to pan out in specific tasks such as metrical annotation and rhythm extraction. Among the possible tasks that comprise the analysis of a poem, identifying the type of a stanza remains underexplored. The classification of the inner structures of verses on which a poem is built is an especially relevant task for poetry studies, since it complements the structural information of a poem. In this work, we analyzed different computational approaches to stanza classification in the Spanish poetic tradition. We collected a corpus of 5005 stanzas of 46 different types, and created a baseline expert system on a set of rules defined by poetry scholars. We show that this task continues to be hard for computer systems even when leveraging the best performing embeddings. However, ...

The Birth of the Iamb in Early Renaissance Low Countries

Metaphony and Raddoppiamento Fonosintattico in plural nouns in the dialect of Airola

Approaches to Metaphony in the Languages of Italy

Defining meaningful units. Challenges in sign segmentation and segment-meaning mapping (short paper)

This paper addresses the tasks of sign segmentation and segment-meaning mapping in the context of sign language (SL) recognition. It aims to give an overview of the linguistic properties of SL, such as coarticulation and simultaneity, which make these tasks complex. A better understanding of SL structure is the necessary ground for the design and development of SL recognition and segmentation methodologies, which are fundamental for machine translation of these languages. Based on this preliminary exploration, a proposal for mapping segments to meaning in the form of an agglomerate of lexical and non-lexical information is introduced.

The interaction between phonology and metre. Approaches to Romance and West-Germanic Renaissance metre

LOT Publications, 2020

This dissertation investigates the interface between phonological and metrical structure. The interaction between phonology and metrics is explored from two perspectives: one looks at poetic aspects as evidence for phonological characteristics; the other explores to what extent phonology conditions the development of poetic tradition and by what means the metrical template is filled by phonological material. The case study is Renaissance metre and its implementation in a set of Romance and West-Germanic languages. A comparison of the different ways in which the same source metre was incorporated in various European poetic traditions sheds light on the role played by phonology in the process of adaptation. When a metre is borrowed, it needs to be adapted to the metrical structure which mirrors the phonology of the recipient language. In particular, the metrical template selects a macroparameter based on the macroparameter selected by phonology. The phonological macroparameter defines which prosodic domain (i.e. phrase or word) plays a prominent role in the language; consequently, metrics selects which of its layers (i.e. colon or foot) is going to play a prominent role in the poetic form. In addition, this work argues that the relationship between the two structures is bidirectional: on the one hand, phonology sees metrical structure and fills it with its elements; on the other hand, the metrical structure can stretch the possibilities of phonological material. The interaction is based on a series of matches and mismatches between the two structures, in a game of tension managed by metrics.

Tackling the Toolkit: Plotting Poetry through Computational Literary Studies

by Petr Plecháč, Robert Kolár, Anne-Sophie Bories, Jakub Říha, Jan Macutek, Helena Bermúdez Sabel, Laura Hernández-Lorenzo, Mirella De Sisto, Szilvia Maróthy, Levente Selaf, and Anastasia Belousova

In Tackling the Toolkit, we focus on the methodological innovations, challenges, obstacles and even shortcomings associated with applying quantitative methods to poetry specifically and poetics more broadly. Using tools including natural language processing, web ontologies, similarity detection devices and machine learning, our contributors explore not only metres, stanzas, stresses and rhythms but also genres, subgenres, lexical material and cognitive processes. Whether they are testing old theories and laws, making complex concepts machine-readable or developing new lines of textual analysis, their works challenge standard descriptions of norms and variations.

Quantitative Approaches to Versification

by Petr Plecháč, Helena Bermúdez Sabel, Robert Kolár, Anastasia Belousova, James K Tauber, Mirella De Sisto, Kristina V Litvintseva, Andrew Cooper, Vera Polilova (Вера Полилова), Ksenia Tveryanovich, Александр Костюк, and Igor Pilshchikov

This volume presents a wide range of quantitative approaches to versification. It comprises various methodological perspectives ranging from simple descriptive statistics to advanced machine learning methods (such as support vector machines, random forests or neural networks) as well as material covering a large span of time and languages: from very ancient versifications (Sumerian, Akkadian, Hittite; Ancient Greek), through medieval (Old English, Old Icelandic, Old Saxon) and Renaissance verse to modern experiments (free verse, concrete poetry); from English and Russian through Spanish and German to Portuguese and Catalan. Not only written, but also spoken poetry has been analyzed.

Complementary Distribution of Metaphony and Raddoppiamento Fonosintattico in plural nouns in Airolano

In the southern Italian dialect of Airola (Campania), feminine plural and masculine plural are distinguished by means of two phonological processes: metaphony and Raddoppiamento Fonosintattico (RF henceforth). They appear to be in complementary distribution and to create gender distinction in the plural of nouns; in fact, metaphony takes place in masculine plural forms, while RF marks feminine plural ones. Therefore, two distinct phenomena, one phonological, namely metaphony, and one phono-syntactic, namely RF, interact within plural noun formation. These two processes, which developed separately, have acquired, synchronically speaking, the value of a gender distinction.
Metaphony is a well-known phenomenon of Italian dialects, which consists in the raising or diphthongization of a stressed vowel under the influence of a non-adjacent following high vowel (Rohlfs 1966, Fanciullo 1994, Ledgeway 2009, Maiden 2010). In the dialect of Airola, it only affects mid vowels, namely /ɔ, o, e, ɛ/, and its attestation is not limited to the nominal class; it occurs, in fact, in various word categories, such as adjectives, verbs and possessive pronouns.
RF is an external sandhi phenomenon which consists in the gemination of a word-initial consonant under the influence of a preceding word (Rohlfs 1970, Leone 1984, Loporcaro 1997, Borrelli 2002). In Airolano, RF is lexically triggered, unlike the RF attested in Standard Italian, which is stress-induced.
The aim of this thesis is to describe the two phenomena, metaphony and RF, in Airolano and to give an analysis of them in order to explain their division of labor. To do so, the processes are first analyzed separately. Then, a unified analysis is elaborated, aiming to shed some light on the difference between genders in the plural of nouns.
The analysis of the two phenomena will be based on data from Airolano that were collected in December 2013 and April 2014 by the author. Ten informants were selected and classified into four different age groups. All the recordings were subsequently transcribed in IPA, and they appear in this form in the text. The full set of data is stored in the Italian Dialect archive of Leiden University.
