Harvey Mudd College at SemEval-2019 Task 4: The D.X. Beaumont Hyperpartisan News Detector

Evan Amason, Jake Palanker, Mary Clare Shen, Julie Medero

Abstract

We use the 600 hand-labeled articles from SemEval Task 4 to hand-tune a classifier with 3,000 features for the Hyperpartisan News Detection task. Our final system uses features based on bag-of-words (BoW), analysis of the article title, language complexity, and simple sentiment analysis in a naive Bayes classifier. We trained our final system on the 600,000 articles labeled by publisher. Our final system has an accuracy of 0.653 on the hand-labeled test set. The most effective features are the Automated Readability Index and the presence of certain words in the title. This suggests that hyperpartisan writing uses a distinct writing style, especially in the title.
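
As an illustration of the language-complexity feature named in the abstract, the following is a minimal sketch of the standard Automated Readability Index formula, ARI = 4.71 × (characters / words) + 0.5 × (words / sentences) − 21.43. The abstract does not describe how the authors tokenized text, so the whitespace word split, the punctuation-based sentence split, and the function name `automated_readability_index` are assumptions made here for illustration only.

```python
import re


def automated_readability_index(text: str) -> float:
    """Return the Automated Readability Index of `text`.

    Assumptions (not taken from the paper): words are whitespace-separated
    tokens, sentences end at runs of '.', '!' or '?', and only alphanumeric
    characters count toward the character total.
    """
    words = text.split()
    # Keep only non-empty fragments between sentence-ending punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0  # Undefined for empty input; 0.0 is an arbitrary fallback.
    chars = sum(1 for ch in text if ch.isalnum())
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


if __name__ == "__main__":
    title = "The cat sat. The dog ran."
    print(round(automated_readability_index(title), 2))
```

A score like this could serve as one numeric input alongside bag-of-words and title features; how the authors binned or combined it within their naive Bayes classifier is not stated in the abstract.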

Anthology ID: S19-2166
Volume: Proceedings of the 13th International Workshop on Semantic Evaluation
Month: June
Year: 2019
Address: Minneapolis, Minnesota, USA
Editors: Jonathan May, Ekaterina Shutova, Aurelie Herbelot, Xiaodan Zhu, Marianna Apidianaki, Saif M. Mohammad
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 967–970
URL: https://aclanthology.org/S19-2166
DOI: 10.18653/v1/S19-2166
Cite (ACL): Evan Amason, Jake Palanker, Mary Clare Shen, and Julie Medero. 2019. Harvey Mudd College at SemEval-2019 Task 4: The D.X. Beaumont Hyperpartisan News Detector. In Proceedings of the 13th International Workshop on Semantic Evaluation, pages 967–970, Minneapolis, Minnesota, USA. Association for Computational Linguistics.
Cite (Informal): Harvey Mudd College at SemEval-2019 Task 4: The D.X. Beaumont Hyperpartisan News Detector (Amason et al., SemEval 2019)
PDF: https://aclanthology.org/S19-2166.pdf
