Workshop Programme

CPSS-2021, September 6


The workshop proceedings and slides can be found here: proceedings & slides

9:00 – 10:00
Welcome & Invited Talk I

  • Title: Comparability and interoperability of parliamentary corpora: Easier said than done
    Tomaž Erjavec, Jožef Stefan Institute, Ljubljana
    abstract & slides

10:00 – 11:00
Oral Session 1

  • Small data problems in political research: a critical replication study (Hugo de Vos, Suzan Verberne)
  • Frame detection in German political discourses: How far can we go without large-scale manual corpus annotation? (Qi Yu and Anselm Fliethmann) slides
  • Share and shout: Discovering proto-slogans in online political communities (Irene Russo, Gloria Comandini, Tommaso Caselli, Viviana Patti)

11:00 – 11:15
Coffee Break

11:15 – 12:15
Twin Panel

  • EUINACTION (Nikoleta Yordanova, University of Leiden & Goran Glavaš, University of Mannheim)
  • MARDY (Sebastian Haunss, University of Stuttgart & Jonas Kuhn, University of Stuttgart)

12:15 – 13:15
Lunch Break

13:15 – 14:15
Oral Session 2

  • Application of the interactive Leipzig Corpus Miner as a generic research platform for the use in the social
    sciences (Christian Kahmann, Andreas Niekler, Gregor Wiedemann)
  • Textual contexts for “Democracy”: Using topic- and word-models for exploring Swedish government official reports (Magnus Ahltorp, Luise Dürlich, Maria Skeppstedt) slides
  • Detecting policy fields in German parliamentary materials with Heterogeneous Information Networks and node embeddings (Alexander Brand, Wolf J. Schünemann, Tim König and Tanja Preböck) slides
  • A semi-supervised approach to classifying political agenda issues (Tim Kreutz, Walter Daelemans)

14:15 – 15:00
Invited Talk II

  • Title: Coding and Mining Arguments for a Better Democracy
    Katharina Esau, University of Düsseldorf
    abstract & slides

15:00 – 15:15
Coffee Break

15:15 – 16:15
Poster Session

  • Room 1 Predicting Policy Domains from Party Manifestos with BERT and Convolutional Neural Networks (Allison Koh, Daniel Kai Sheng Boey, Hannah Béchara)
    abstract & poster
  • Room 2 UNSC-NE: A Named Entity Extension to the UN Security Council Debates Corpus (Luis Glaser, Ronny Patz, Manfred Stede)
    abstract & poster
  • Room 3 LegalEGPD: Legal Element Classification on German Parliamentary Debates (Christopher Klamm, Martin Hock)
    abstract & poster
  • Room 4 The role of interjections in Austrian parliamentary debates (Klaus Hofmann, Tanja Wissik)
    abstract & poster
  • Room 5 Lexical convergence and divergence in Austrian parliamentary debates: a network-based approach (Anna Marakasova, Klaus Hofmann, Andreas Baumann, Julia Neidhardt, Tanja Wissik)
    abstract & poster
  • Room 6 Crime reports and the electoral success of radical right parties (cancelled) (Uwe Remer, Raphael Heiberger, Marius Kaffai)
    abstract & poster
  • Room 7 Representing Political Frames with Sentence Transformers - Transfer Learning with Frame Centroids (Moritz Laurer)
    abstract & poster
  • Room 8 Politics in between crises. A political and textual comparative analysis of budgetary speeches and expenditure (Dario Del Fante, Alice Cavalieri)
    abstract & poster

16:15 – 17:00
Invited Talk III

  • Title: Using NLP to Track Health Implications of Climate Change
    Slava Jankin, Hertie School of Governance, Berlin
    abstract & slides

17:00 – 17:45
Open Discussion
CPSS future activities & organisation of a shared task


Invited Talk Abstracts

Invited Talk I:
Tomaž Erjavec, Jožef Stefan Institute, Ljubljana
Title: Comparability and interoperability of parliamentary corpora: Easier said than done
Abstract:

The talk presents the ParlaMint corpora containing transcriptions of the sessions of 17 European national parliaments with half a billion words. The corpora are uniformly encoded, contain rich meta-data about the 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project's GitHub repository, the complete corpora are deposited on the CLARIN.SI repository under the CC BY license, and available through its NoSketch Engine and KonText concordancers for exploration and analysis. The corpora are the result of the CLARIN ParlaMint project (2019-2021), and the talk presents the project, the corpus compilation workflow, the Parla-CLARIN-based encoding of the corpora and their distribution. We concentrate on the most difficult aspect of the project, which was the goal to make the corpora interoperable while at the same time having a large number of partners each one in charge of producing their own corpus. slides


Invited Talk II:
Katharina Esau, University of Düsseldorf
Title: Coding and Mining Arguments for a Better Democracy
Abstract:

The increased desire of citizens to participate in political processes has prompted numerous state-organized participation procedures in the last two decades (e.g., citizens’ assemblies, deliberative forums). Such attempts seem particularly promising at the local level of politics where, if successful and well designed, publicly expressed opinions and local knowledge of citizens can be incorporated into decision-making. Against the theoretical background of deliberative democracy, asynchronous online discussions offer a desirable infrastructure for a reasoned public sphere. As such these platforms are of great interest to investigate. However, successful platforms attract large numbers of participants and produce large amounts of text data and that quickly becomes difficult to manage manually. Therefore, automated techniques are invaluable when it comes to analysing citizens’ contributions and informing decision makers. The talk presents a method mix combining both manual content analysis and automated techniques. It shows that this can be a fruitful approach for the extraction of argument components and other discussion elements, such as emotions and narratives, from user content. To illustrate this, the methodology and results of a semi-automated content analysis that examined one participation platform (Tempelhofer Feld, Berlin) will be presented. The annotation tool BRAT will be briefly demonstrated and the possibilities for relational coding and analysis of text data explained. Throughout, the challenges and opportunities of interdisciplinary collaboration between social sciences and computer science will be addressed. slides


Invited Talk III:
Slava Jankin, Hertie School of Governance, Berlin
Title: Using NLP to Track Health Implications of Climate Change
Abstract:

Climate change is undermining the foundations of good health; threatening the food we eat, the air we breathe, and the hospitals and clinics we depend on. However, the response to climate change could be the greatest global health opportunity of the 21st century. The Lancet Countdown: Tracking Progress on Health and Climate Change brings together 35 leading academic institutions and UN agencies from every continent to monitor this transition from threat to opportunity. We track annual indicators of progress, empowering the health profession and supporting policymakers to accelerate their response. In the talk we discuss the application of natural language processing to develop and track a set of Lancet Countdown indicators, focusing on political speech, corporate commitments, social media, and parliamentary debates. We also discuss challenges in establishing attributional, causal links between statements about health and climate change.

slides


Poster abstracts

  • Lexical convergence and divergence in Austrian parliamentary debates: a network-based approach
    Anna Marakasova, Klaus Hofmann, Andreas Baumann, Julia Neidhardt, Tanja Wissik
    Abstract:

    Parliamentary debates are a key source for studying political discourse. Ostensibly, debates have the function to discuss the merits of legislative proposals and governmental policies. From a sociology-of-politics perspective, however, debates are at least equally important for developing the image of a party or of individual politicians in contrast to their political opponents. We investigate some of the dynamics at work in the debates of the Austrian Parliament, especially focusing on topical and discursive unity and divergence within and across parties. Our analysis is based on the Corpus of Austrian Parliamentary Records (ParlAT). We construct similarity-based network representations of politicians’ typical lexical repertoires in every year to model the topical and discursive patterns in their speeches. The discourse patterns of the parliament members are compared by calculating the Canberra distances between their networks, and generalized additive models are applied to identify diachronic trends. Our experiments show that similarity between politicians’ discourse patterns fluctuates over time. For instance, the distance between opposition parties decreased while the distance between government parties increased. However, there is a slight overall trend towards more convergence, between and within parties.
    poster



  • Predicting Policy Domains from Party Manifestos with BERT and Convolutional Neural Networks
    Allison Koh, Daniel Kai Sheng Boey and Hannah Béchara
    Abstract:

    Hand-labeled political texts are often required in empirical studies on party systems, coalition building, agenda setting, and many other areas in political science research. While hand-labeling remains the standard procedure for analyzing political texts, it can be slow and expensive, and subject to human error and disagreement. Recent studies in the field have leveraged supervised machine learning techniques to automate the labeling process of electoral programs, debate motions, and other relevant documents. We build on current approaches to label shorter texts and phrases in party manifestos using a pre-existing coding scheme developed by political scientists for classifying texts by policy domain and policy preference. Using labels and data compiled by the Manifesto Project, we make use of the state-of-the-art Bidirectional Encoder Representations from Transformers (BERT) in conjunction with Convolutional Neural Networks (CNN) and Gated Recurrent Units (GRU) to seek out the best model architecture for policy domain and policy preference classification. We find that our proposed BERT-CNN model outperforms other approaches for the task of classifying statements from English language party manifestos by major policy domain.
    poster



  • The role of interjections in Austrian parliamentary debates
    Klaus Hofmann and Tanja Wissik
    Abstract:

    Interjections and heckling are an integral part of parliamentary discourse that is often overlooked. We investigate the distribution and usage of interjections in the Austrian Parliament. In particular, we ask whether interjections are employed asymmetrically by various groups of MPs and to what extent they differ with respect to style and function. Our analysis is based on the Corpus of Austrian Parliamentary Records (ParlAT). We combine the linguistic data with lexical ratings of abstractness (concreteness, imageability) and emotion (arousal, valence) to characterize the style and communicative functions of interjections along those dimensions. In terms of analytical methodology, we rely on regression modeling using R. Our results confirm that interjections are very unevenly distributed among MPs: (a) female members are much less likely to utter interjections than their male colleagues; (b) right-wing parties are more likely to use interjections than liberal and left-leaning parties; (c) members of the opposition are more prone to verbal interjections than members of governing parties; and (d) the relative incidence of interjections varies considerably between legislative periods, even when confounds such as gender and party membership are controlled for. In terms of style and function, the analysis suggests that (a) liberal parties’ interjections use language that is more abstract, more imageable, more positive, and less arousing; (b) the interjections from opposition parties are less positive and more arousing than those from coalition parties; (c) women’s interjections are more abstract and more positive than those from their male colleagues.
    poster



  • Legal Element Classification on German Parliamentary Debates
    Christopher Klamm and Martin Hock
    Abstract:

    Parliamentary debates provide a broad overview of statements for supporting or opposing the use of force by a state. If a state backs its practice by referring to a legal concept or the legal elements of that concept, the existence of a rule of customary international law (CIL) may be assumed. Traditionally, however, parliamentary debates have not been used as a source of CIL. We address this research gap with a joint approach that combines methods from political science, legal studies and natural language processing in order to ascertain the existence of CIL regarding the legal concepts of humanitarian intervention and responsibility to protect. We introduce a new framework to tackle the task of automatic legal elements classification to analyse the use of force in German parliamentary debates.
    poster



  • Representing Political Topics with Sentence Transformers - Transfer Learning with Topic Centroids
    Moritz Laurer
    Abstract:

    In the past years, several researchers have leveraged the Manifesto Corpus to train machine learning classifiers to classify texts in predefined political categories. Some papers have also applied these classifiers to texts from a different domain – an approach called transfer learning. While most recent papers use softmax classifiers, this paper proposes an alternative approach to classifying texts into the Manifesto categories, which might be more suitable for transfer learning settings: centroid classification with sentence transformers. ‘Topic centroids’ created with sentence transformers have important advantages over softmax classifiers in real-world transfer learning settings: They address the transfer learning challenge of different label spaces in source and target data Ys != Yt. First, when applied to another domain, softmax classifiers are forced to only classify texts into the classes they have been trained on. With centroid classification, on the other hand, any sentence can be compared to topic centroids and the comparison will return a low similarity score if the sentence is unrelated. Second, the centroids created with sentence transformers are modular. Third, sentence transformers can be used for multi-label classification, even if the training data is only annotated with single labels. If a sentence is close to the centroid of two topics, the sentence can be attributed to both topics based on a manually defined threshold. This approach is particularly useful for social science datasets like the Manifesto Corpus which suffers from noisy and overlapping labels, given the high complexity of the labelling scheme.
    poster



  • UNSC-NE: A Named Entity Extension to the UN Security Council Debates Corpus
    Luis Glaser, Ronny Patz, Manfred Stede
    Abstract:

    We present the Named Entity (NE) add-on to the previously published United Nations Security Council (UNSC) Debates corpus (Schoenfeld et al., 2019). Starting from the argument that the annotated classes in Named Entity Recognition (NER) pipelines offer a tagset that is too limited for relevant research questions in political science, we employ Named Entity Linking (NEL), using DBpedia-spotlight to produce the UNSC-NE corpus add-on. The validity of the tagging and the potential for future research are then discussed in the context of UNSC debates on Women, Peace and Security (WPS).
    poster



  • Crime reports and the electoral success of radical right parties
    Uwe Remer, Raphael Heiberger, Marius Kaffai
    Abstract:

    Rising public approval and electoral gains for radical right parties and populist movements contest liberal democracies all over Europe (Mudde 2007; Norris & Inglehart 2019: 9; Guth & Nelsen 2021). By positioning themselves as law and order parties together with a pronounced framing of the migration crisis in terms of security threats, radical right parties aim to obtain issue ownership on immigration and link it with crime (Mudde 2007: 146; Dinas & van Spanje 2011: 661; Arzheimer 2018: 151). The strategic use of agenda setting and priming to combine these issues is a main part of the electoral strategy of populist radical right parties (Arzheimer 2018: 157). On the empirical side, the findings are mixed and scarce. Neither the effect of immigration nor the effect of crime on the electoral success of radical right parties is uncontested (Coffé et al. 2007; Mudde 2007: 224; Smith 2010; Dinas & van Spanje 2011; Arzheimer 2018: 156–157; Dennison 2020: 398; Deiss-Helbig & Remer 2021). Potential sources for the inconclusiveness of these findings are differences in scale and level of aggregation, heterogeneous operationalization of the theoretical constructs (Kaufmann & Goodwin 2018; Deiss-Helbig & Remer 2021: 3), and the complex interaction between the variables at play (Dinas & van Spanje 2011). Our contribution connects to this research puzzle. We ask whether and how crime has an influence on the success of populist radical right parties [and how this effect is moderated by the local presence of immigrants]. Based on previous research, we assume that immigration evokes a perceived threat within parts of the electorate (Deiss-Helbig & Remer 2021) which leads to increased vote proportions of populist radical right parties (Green et al. 2016).
We extend the state of research as we study the proposed effects at the local level for several elections at three levels of the political system over six years: national, state, and local elections in the state of Baden-Württemberg, Germany. Besides official criminal statistics, we utilize original data to measure crime with a resolution down to the municipal level. The analysis is based on a corpus of nearly 500,000 police press reports published since 2015 by police departments in the state of Baden-Württemberg, Germany. To be able to match the crime reports with the official electoral results at the municipal level, the documents are first geolocated. In a second step, human coders annotate a sample of the corpus for text classification. The labeled data are then used for supervised machine learning to classify the documents regarding the reported crime. The crimes that are identified by the classified documents are aggregated at the level of municipal administrative units. With this measure of crime prevalence at the local level, we are able to test the local influence of reported crime on the local vote shares of populist radical right parties and its interaction with immigration. As controls, we account for potential confounders like official crime statistics, urbanization, and wealth. Prospectively, we hope to be able to differentiate between different types of crime. Our preliminary results reveal a heterogeneous relationship between reported crime and votes for radical right parties.
    poster



  • Politics in between crises. A political and textual comparative analysis of budgetary speeches and expenditure.
    Alice Cavalieri, Dario Del Fante
    Abstract:

    European countries have recently been heavily hit by two dramatic crises (i.e., the great recession and the Covid-19 pandemic) which have severely disrupted national and even supranational economic policies. From an agenda-setting perspective, the moment of crisis, by concentrating the attention on the trending topic, overpowers the usual political dynamics, transforming political actors from agenda-setters to agenda-takers. In ‘normal’ times, namely in those periods when there is no sudden necessity to immediately respond to an external shock, budget changes instead tend to be very small and largely determined by previous years’ spending choices. To analyze budgetary changes over time, scholars have long used the annual percentage change as the dependent variable, considering either the total expenditure or the expenditure in single budget categories. However, this method neglects a fundamental aspect, namely the complementary nature of spending allocation across budget functions. Plainly, when the government decides to increase expenditure for a certain budget category, a parallel reduction of expenditure in another budget category follows. As spending across budget categories adds up to 100% of total spending, budget trade-offs can be treated as a compositional dependent variable. Using public expenditure data from the Eurostat database, which splits expenditure into 10 macro categories, we engage in a preliminary analysis of budgetary trade-offs across spending categories focusing on Italy and the UK between 2013 and 2019, the period in between two crises when, we suppose, governments had more chances to steer the allocation of expenditure according to their ideological and/or strategic considerations. The choice of this time frame is also driven by methodological reasons, as we use for the very first time the ParlaMint dataset which, for now, encompasses the period from 2013 onwards.
ParlaMint is a linguistically annotated corpus of parliamentary debates across European countries, financed by CLARIN ERIC, the research infrastructure for language as social and cultural data, which aims to convert existing contemporary multilingual and diverse cross-national parliamentary data into comparable and interpretable resources. This dataset is meant to provide one of the main, if not the most important, independent variable(s) explaining budgetary trade-offs. More precisely, starting from the available data, we build a sub-corpus of debates about the budget, then we create additional sub-corpora referring to specific budget categories and verify whether the emphasis during parliamentary debates leads to swings in the allocation of expenditure. Including the analysis of parliamentary debates in the study of budgetary changes is crucial, as political parties exploit the parliamentary arena to strategically emphasize their policy position and because it is in the parliament that political conflicts unfold and that both majority and opposition parties can shape the government’s decisions. The combined analysis of parliamentary debates and of the final expenditure, carried out by adding quantitative text analysis techniques to the analysis of expenditure trade-offs, despite its preliminary stage, helps to better grasp the dynamics to which public budgeting is subject and constitutes a very promising avenue for future research for both political science and linguistics scholars.
    poster
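
Illustrative sketch (not taken from any of the posters): the abstract "Representing Political Topics with Sentence Transformers" above describes centroid classification, where a sentence is compared to one centroid per topic and assigned every topic whose similarity exceeds a manually defined threshold. The minimal Python example below shows that idea under stated assumptions: the 3-dimensional vectors are hand-made stand-ins for sentence-transformer embeddings, and the names build_centroids and classify, the topic labels, and the threshold of 0.6 are hypothetical choices made only for this illustration.

```python
import numpy as np


def unit(vector):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    vector = np.asarray(vector, dtype=float)
    return vector / np.linalg.norm(vector)


def build_centroids(embeddings, labels):
    """Average the normalized embeddings of each label into one centroid."""
    centroids = {}
    for label in sorted(set(labels)):
        members = [unit(e) for e, l in zip(embeddings, labels) if l == label]
        centroids[label] = unit(np.mean(members, axis=0))
    return centroids


def classify(embedding, centroids, threshold=0.6):
    """Return every label whose centroid is at least `threshold` similar.

    An empty list means the sentence matches no known topic; more than
    one label means it sits close to several centroids (multi-label).
    """
    query = unit(embedding)
    scores = {label: float(query @ c) for label, c in centroids.items()}
    assigned = sorted(label for label, s in scores.items() if s >= threshold)
    return assigned, scores


# Toy "embeddings" (assumption: real ones would come from a sentence encoder).
train_vectors = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1],   # economy
                 [0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]   # environment
train_labels = ["economy", "economy", "environment", "environment"]

centroids = build_centroids(train_vectors, train_labels)

single, _ = classify([1.0, 0.0, 0.0], centroids)     # close to one centroid
both, _ = classify([1.0, 1.0, 0.0], centroids)       # between two centroids
unrelated, _ = classify([0.0, 0.0, 1.0], centroids)  # far from every centroid

print(single, both, unrelated)
```

Because each centroid is stored separately, a topic can be added or removed without retraining the others, which is one reading of the modularity the abstract mentions; a softmax classifier, by contrast, would always pick one of its trained classes.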



