Rulemaking process by the U.S. Environmental Protection Agency (online and offline submissions) | English | argument components: main root (claim) and subroot (sub-claim or main-support); relations: support, opposition and restate | Kwon et al. [45] | (1) + (2) + (3) | n-grams, subjectivity score, structural properties, cue phrases, named entities, sentiment features, topic information | SVM |
Kwon & Hovy [44] | (1) + (2) + (3) | n-grams, subjectivity score, structural properties, cue phrases, topic information | SVM, boosting |
User comments from U.S. eRulemaking online platform RegulationRoom.org (two rules: Airline Passenger Rights and Home Mortgage Consumer Protection) | English | proposition types: unverifiable, verifiable experiential and verifiable non-experiential | Park & Cardie [64] | (2) | n-grams, core clause tags, part-of-speech information, sentiment and emotion cues, speech events, imperative expressions, tense, pronouns | SVM |
Park et al. [67] | (2) | n-grams, lexicon-based features, part-of-speech information, emotion cues, tense, pronouns | CRF |
Guggilla et al. [33] | (2) | embeddings (word2vec, dependency, factual) | CNN, LSTM |
Cornell eRulemaking Corpus – CDCP [66]: User comments on Consumer Debt Collection Practices rule from RegulationRoom.org | English | proposition types: fact, testimony, value, policy and reference; relations: evidence and reason | Niculae et al. [62] | Joint model for (2) and (3) | lexical information (e.g., n-grams, word embeddings and dependency information), lexicon-based features, structural properties, context information, syntactic properties (e.g., part-of-speech and tense), discourse properties | SVM, RNN, linear structured SVM, structured RNN |
Galassi et al. [27] | Joint model for (2) and (3) | word embeddings, structural properties | deep network without residual network block, deep residual network |
Cocarascu et al. [18] | (3) | word embeddings, sentiment features, syntactic features, textual entailment | SVM, RF, GRU, Attention, BERT |
Falk & Lapesa [25] | (2); focus on testimony | n-grams, surface features, syntactic features, textual complexity features, sentiment/polarity features | RF, FeedforwardNN, BERT |
Regulation Room Divisiveness Corpus – User comments on Airline Passenger Rights rule from RegulationRoom.org | English | relations: pro-arguments, con-arguments and rephrases of argument | Konat, Lawrence et al. [41] | (3) | - | Two graph theoretical measures for divisiveness |
eRulemaking_Controversy Corpus – User comments on Airline Passenger Rights rule from RegulationRoom.org | English | relations: pro-arguments and con-arguments | Lawrence et al. [48] | (3) | word features and grammatical features, e.g., discourse indicators and syntactic structure of an argument | semantic similarity, SVM, NB, rule-based classifier, a graph theoretical measure for centrality |
Various user comments from U.S. eRulemaking online platform regulations.gov (annotated semi-automatically) | English | 4 generic argument types: opposition (explicit, likely), support (explicit, likely) + 12 specific argument types: burdensome, not sufficient type, lacks flexibility, conflicting interest, disputed information, legal challenge, overreach, requests clarification, seeks exclusion, lacks clarity, too broad, too narrow | Eidelman & Grom [23] | (1) + (2) | n-grams, word embeddings | LR, fastText |
THF Airport ArgMining Corpus – German-language dataset from an online citizen participation process on the restructuring of a former airport site | German | Argument components: Claim, premise and major position | Liebeck et al. [51] | (1) + (2) | n-grams, part-of-speech information, dependency information, structural properties | SVM, RF, k-NN |
Argument components: Claim (pro/contra), premise and major position | Liebeck [50] | (1) + (2) | n-grams, word embeddings, part-of-speech information, dependency information, named entities, structural properties, topic information, sentiment features | SVM, RF, k-NN, CNN, LSTM, BiLSTM |
Multiple transportation-related public participation processes (online platforms and survey data) | German | Argument components: Premise and major position | Romberg & Conrad [73] | (1) + (2) | n-grams, word embeddings, part-of-speech information, dependency information | SVM, fastText, ECGA, BERT |
Argument concreteness: high, intermediate and low | Romberg [72] | (2) | n-grams, text length (in tokens) | LR, SVM, RF, BERT |
Online civic discussion data about the city of Nagoya | Japanese | Argument components: claim and premise; relations: inner-post and inter-post | Morio & Fujita [59] | (1), (2) and (3) | n-grams, part-of-speech information, structural properties | SVM |
Morio & Fujita [60] | Joint model for (1), (2) and (3) | sequence of sentence representations | SVM, RF, LR, STagBiLSTMs, PN, PCPA |
Citizen contributions of the 2016 Chilean constitutional process (local on-site events) | Spanish | Argument components: policy, fact and value | Fierro et al. [26] | (2) | n-grams, word embeddings, part-of-speech information | SVM, RF, LR, fastText, deep averaging networks |
Giannakopoulos et al. [28] | (2) | word embeddings | CNN, BiGRUs, Attention |