Identifying Parties in Manifestos and Parliament Speeches

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

This paper addresses differences in the word use of two left-wing and two right-wing Danish parties, and how these differences,
which reflect some of the basic stances of the parties, can be used to automatically identify the party of politicians from their speeches.
In the first study, the most frequent and characteristic lemmas in the manifestos of the political parties, as well as their language
complexity, are analysed. The analysis shows, inter alia, that the most frequently occurring lemmas in the manifestos reflect either
the ideology or the position of the parties towards specific subjects, confirming for Danish the findings of earlier studies of English and German
manifestos. Subsequently, we scaled up our analysis by applying NLP methods to the transcribed speeches of members of the same parties
in the Parliament (Hansards), and trained machine learning algorithms to determine to what extent the party of the politicians can be predicted from their speeches. The speeches are a subset of the Danish Parliament corpus 2009–2017. The best classification experiments gave a weighted F1-score of 0.57. This is significantly better than the results of the majority classifier (weighted F1-score = 0.11) and than chance. The results show that the party of the politicians can be distinguished from their speeches in nearly 60% of the cases, even when they debate the same subjects and thus often use the same terminology. In the future, we will include the subject of the speeches in the prediction experiments.
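The comparison between a trained classifier and the majority classifier rests on the weighted F1-score, which averages the per-class F1 values with each class weighted by its number of true instances. The following is a minimal sketch of that metric and of a majority baseline, not the authors' implementation. The function names (`weighted_f1`, `majority_predictions`) and the party labels `A` to `D` are hypothetical, and the toy labels are illustrative only; they do not reproduce the figures reported above.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of the per-class F1 scores."""
    support = Counter(y_true)          # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1      # weight each class by its support
    return score


def majority_predictions(y_train, n):
    """Predict the most frequent training label for all n test items."""
    label = Counter(y_train).most_common(1)[0][0]
    return [label] * n


# Hypothetical toy labels (assumption): four parties, one of them dominant.
y_true = ["A", "A", "A", "A", "B", "B", "C", "D"]
baseline = majority_predictions(y_true, len(y_true))
baseline_f1 = weighted_f1(y_true, baseline)
perfect_f1 = weighted_f1(y_true, y_true)
```

Because the majority classifier scores zero F1 on every class except the most frequent one, its weighted F1 stays low whenever the classes are reasonably balanced, which is why it serves as a lower reference point for the trained models.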
Original language: English
Title of host publication: Creating, Using and Linking of Parliamentary Corpora with Other Types of Political Discourse (ParlaCLARIN II): LREC2020 Workshop PARLACLARIN 2
Editors: Darja Fiser, Maria Eskevich, Franciska de Jong
Publisher: European Language Resources Association
Publication date: 2020
Pages: 51-57
ISBN (Print): 9791095546474
ISBN (Electronic): 9791095546474
Publication status: Published - 2020


ID: 241213825