German Elections 2017, Hertie Instructors Predict The Outcome

Mark Kayser, Professor of Applied Methods and Comparative Politics of the Hertie School of Governance, and Simon Munzert, Lecturer in Political Data Science of the Hertie School of Governance, have constructed forecasting models to predict the outcome of the 2017 German Federal Elections. Dynamic versions of each have been made open to the public and can be found online at Signal & Rauschen and Zweitstimme.org. Jonathan Rummel, an Executive Editor of the Governance Post, sat down to hear about their models and the “in-house” competition to see whose will be more accurate.

Forecasts as of September 21, 2017:

Kayser & Leininger model with Signal & Rauschen forecast: CDU/CSU 36.6%, SPD 22.5%, Die Linke 9.3%, Bündnis 90/Die Grünen 8.3%, FDP 8.8% and AfD 9.9%.

Munzert forecast: CDU/CSU 36.7%, SPD 22.7%, Die Linke 9.5%, Bündnis 90/Die Grünen 7.8%, FDP 9.1% and AfD 9.9%.

TGP: You’ve both developed insightful models to predict the upcoming German elections, can you go into a bit more detail on the various components of your respective models?

M: Ours at zweitstimme.org is a hybrid model. We work from the idea that both short-term and long-term components help us forecast the election. The structural component takes historical data from all Bundestag elections since 1949: vote results, standings before the elections, whether the chancellor is a member of that party, and variables along those lines. This data gives us an early forecast about 200 days before the election, which then serves as an anchor for the rest of the model.

M: You can think of that as the starting point for combining the data sets, with the polling data coming in after that. Both carry a degree of uncertainty, so you fit the model by running hundreds of thousands of simulations to identify a distribution of possible outcomes. Each simulated outcome is a possible election result, and together they let us draw inferences about the probability of certain events, such as whether a given party will enter parliament, or whether parties A and B could form a coalition given their vote shares.
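
As a rough illustration of the simulation idea described here, the following Python sketch turns a set of point forecasts into event probabilities. It is not the zweitstimme.org model: it swaps in independent normal draws around the Signal & Rauschen figures quoted above, and the standard deviation `ASSUMED_SD`, the function names and the chosen events are all assumptions made for the example.

```python
import random

# Point forecasts (percent) as quoted in the article for the
# Kayser & Leininger / Signal & Rauschen model; "Other" is the remainder.
FORECAST = {
    "CDU/CSU": 36.6, "SPD": 22.5, "Linke": 9.3,
    "Gruene": 8.3, "FDP": 8.8, "AfD": 9.9,
}
FORECAST["Other"] = 100.0 - sum(FORECAST.values())

ASSUMED_SD = 1.5   # hypothetical forecast uncertainty, in percentage points
THRESHOLD = 5.0    # five-percent hurdle for entering the Bundestag


def simulate_once(rng):
    """Draw one possible election outcome, rescaled so shares sum to 100."""
    draws = {p: max(rng.gauss(mu, ASSUMED_SD), 0.0) for p, mu in FORECAST.items()}
    total = sum(draws.values())
    return {p: 100.0 * v / total for p, v in draws.items()}


def has_majority(outcome, coalition):
    """True if the coalition holds more than half of the vote share
    won by parties above the threshold (a simplification of seat allocation)."""
    seated = {p: v for p, v in outcome.items() if p != "Other" and v >= THRESHOLD}
    coalition_share = sum(seated.get(p, 0.0) for p in coalition)
    return coalition_share > 0.5 * sum(seated.values())


def event_probabilities(n_sims=20000, seed=2017):
    """Estimate each event's probability as the share of simulations where it occurs."""
    rng = random.Random(seed)
    counts = {"afd_beats_linke": 0, "fdp_enters": 0, "cdu_fdp_majority": 0}
    for _ in range(n_sims):
        outcome = simulate_once(rng)
        counts["afd_beats_linke"] += outcome["AfD"] > outcome["Linke"]
        counts["fdp_enters"] += outcome["FDP"] >= THRESHOLD
        counts["cdu_fdp_majority"] += has_majority(outcome, ("CDU/CSU", "FDP"))
    return {event: n / n_sims for event, n in counts.items()}


if __name__ == "__main__":
    for event, prob in event_probabilities().items():
        print(f"{event}: {prob:.2f}")
```

The point of the sketch is only the last step: once many plausible outcomes have been simulated, the probability of any event is the fraction of simulations in which it happens. A real model would also account for correlations between parties and for uncertainty that shrinks as election day approaches.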

TGP: And how about yours Professor Kayser?

K: Arndt Leininger and I built our structural ‘Länder model’, named after the Länder (states) in Germany. Essentially, our structural model operates under the logic that there are fundamental empirical regularities that help you predict future elections. Basically, you can learn from past elections. The primary idea is that if you combine that with polling data, you can increase predictability and improve upon methods that rely solely on aggregated polls.

K: We’ve now joined our model with Signal & Rauschen to serve as their anchor, while they integrate the polls over time. In that sense, it is similar to Simon’s. What remains different is that ours is based on German state-level elections, which are scattered throughout the calendar. As there have been many recent state-level elections, we use that historical data to predict the federal election outcome within each state and then aggregate the predicted party vote shares up to the federal level. This increases our sample size and statistical power.
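
The aggregation step described here can be sketched as a vote-weighted average of state-level predictions. This is a minimal illustration, not the authors' code: the function name, the two placeholder states and all numbers below are hypothetical.

```python
def aggregate_to_federal(state_shares, state_votes):
    """Combine predicted party shares per state into national shares,
    weighting each state by its expected number of valid votes."""
    total_votes = sum(state_votes.values())
    national = {}
    for state, shares in state_shares.items():
        weight = state_votes[state] / total_votes
        for party, share in shares.items():
            national[party] = national.get(party, 0.0) + weight * share
    return national


if __name__ == "__main__":
    # Hypothetical predictions (percent) and expected turnout for two made-up states.
    state_shares = {
        "State A": {"Party X": 40.0, "Party Y": 60.0},
        "State B": {"Party X": 20.0, "Party Y": 80.0},
    }
    state_votes = {"State A": 1_000_000, "State B": 3_000_000}
    print(aggregate_to_federal(state_shares, state_votes))
```

Weighting by expected votes rather than taking a plain average matters because the states differ greatly in size; a large state should move the national figure more than a small one.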

TGP: Right, what led you and Arndt to decide on using state-level data? That’s rather uncommon in Germany, is it not?

K: Actually, this is something that couldn’t really be done prior to this election because of the Überhangmandate (overhang mandates), the rules for which were recently changed. The Überhangmandate system allowed a deviation from proportional distribution: a party could be allotted more seats in parliament than it was proportionally entitled to if it won more constituencies than that entitlement covered. This was recently overturned by the German constitutional court. Because the allocation of seats in the Bundestag is now fully proportional to the national second-ballot vote share, simply aggregating the predicted votes from each state is not a problem.

M: Right, so the new rule means that Überhangmandate can still occur but will be compensated by additional mandates for other parties. What this means is that the Bundestag will grow, and its size is a new quantity of interest in itself. How big will the new Bundestag become?

TGP: That is fascinating because it seems that most forecasters are pretty confident with their results at this time. With that confidence in mind, what would you say has been the most difficult thing to account for?

M: That’s right. At the moment, both pundits and experts say the race isn’t about who is going to be the largest party in parliament, but who will be the opposition leader or the third-largest party. That is an interesting race because for the first time there is a rightist, populist party with a real chance of becoming the largest opposition party in German politics.

K: Exactly, for us this was difficult because our model uses historical data from Länder elections and the AfD didn’t exist for many of these. So, we essentially counted them as part of the “Other” category. An interesting thing to note is that the main opposition party is usually recognized as the biggest party not in government. That could be the AfD in Germany, which means there are certain parliamentary privileges that accrue to that party. Right now, this is one of the more interesting aspects of the German elections, as there is a very close race between Die Linke and the AfD.

TGP: On that point, what would you say were some particularly noteworthy aspects of the modelling itself?

M: Well, we’ve both had our experiences with forecasting, so one exciting piece was being able to forecast the vote share of the AfD despite working with limited historical data. And what’s always interesting in the German election setting is working with a multi-party system, as you can’t apply some of the more 1-to-1 methods that are applicable to elections like those in the US and Great Britain. It’s always difficult to think about the compositional nature of the data or the breaks in your historical data, such as pre-1990 elections vs post-1990 elections and whether they should be treated in the same manner or not.

K: That might be one of the main points for both of our models. Since we aren’t just averaging polls and are using a structural component as an anchor or prior, we are really making a statement that historical patterns do matter. It shows that there are long-standing historical and empirical regularities in the data that will be reflected in the outcomes. In a sense, there is a theoretical component that drives German elections that is completely absent in poll aggregates. That allows us to learn from outcomes in our models and apply them in the future.

M: As political scientists, we rarely have the chance to produce forecasts of political outcomes and participate in that realm. And in a more general sense, it is satisfying to be able to communicate these results of our discipline to the broad public. Both of our models seem to do that through the online portals and I hope that is valuable to people.

K: I’d like to emphasize that because, when it comes to election forecasting, what we are doing is probably one of the most difficult exercises that you can imagine in social science. If you think about political science, how many predictive models do you really see? That is something that most of our theory and empirical testing is far away from. This is a real exception.

TGP: Since your models do differ and there’s a friendly, in-house competition, how are you judging the victor?

M: That’s a good question. I think, as a scientist, there are really two benchmarks. One is root mean squared error, or a comparable measure that quantifies the deviation of the forecast from the final results. The other is how well we were able to calibrate our models in terms of uncertainty. It’s not like we have hundreds and hundreds of cases to train with, so getting the issue of uncertainty right is important. And I’d also consider a model to be successful if it helps people form a more reasonable perception of uncertainty in something like the German elections, in particular since the precision of both polls and forecasting models has been overestimated by the public. But this is obviously difficult to assess.
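
The two benchmarks mentioned here can be written down in a few lines of Python. This is a sketch under stated assumptions: the function names are ours, interval coverage is used as one simple stand-in for calibration, and the result figures and intervals in the usage example are made-up placeholders, not the actual election outcome or either model's published uncertainty.

```python
import math


def rmse(forecast, result):
    """Root mean squared error between forecast and result shares (percentage points)."""
    squared_errors = [(forecast[p] - result[p]) ** 2 for p in result]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def interval_coverage(intervals, result):
    """Fraction of parties whose result falls inside the forecast interval."""
    hits = sum(low <= result[p] <= high for p, (low, high) in intervals.items())
    return hits / len(intervals)


if __name__ == "__main__":
    # Forecasts as quoted in the article (September 21, 2017).
    kayser_leininger = {"CDU/CSU": 36.6, "SPD": 22.5, "Linke": 9.3,
                        "Gruene": 8.3, "FDP": 8.8, "AfD": 9.9}
    munzert = {"CDU/CSU": 36.7, "SPD": 22.7, "Linke": 9.5,
               "Gruene": 7.8, "FDP": 9.1, "AfD": 9.9}
    # HYPOTHETICAL placeholder result, for illustration only.
    placeholder_result = {"CDU/CSU": 35.0, "SPD": 23.0, "Linke": 9.0,
                          "Gruene": 8.0, "FDP": 9.5, "AfD": 10.5}
    print("Kayser & Leininger RMSE:", round(rmse(kayser_leininger, placeholder_result), 2))
    print("Munzert RMSE:", round(rmse(munzert, placeholder_result), 2))
    # HYPOTHETICAL intervals: each point forecast plus or minus 2 points.
    intervals = {p: (v - 2.0, v + 2.0) for p, v in munzert.items()}
    print("Coverage:", interval_coverage(intervals, placeholder_result))
```

A lower RMSE means the point forecasts were closer to the result; coverage checks whether the stated uncertainty was about right. With only six parties in a single election, both numbers are noisy, which echoes the point about having few cases to learn from.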

K: One of the obstacles we face is that this is more complicated than, say, a two-party system. We are looking at what may very well be a six-party parliament. So substantively what matters is the ranking; for example, will the AfD be bigger than Die Linke? But when they’re both so close together and within each other’s confidence intervals, there’s a good amount of randomness.

K: Honestly, I think it’s remarkable that so much of this is happening at the Hertie School. There’s a large academic community in Germany, but despite all the statistics and political science departments in the country, so much of this is happening at this small private university in Berlin.

M: I’d definitely agree with that. In the Spring 2017 semester I taught an election forecasting seminar, and many of the students came up with some very creative models for forecasting many different elections, not just the German one. It is truly a fascinating time to be at the Hertie School.

Mark Kayser is Professor of Applied Methods and Comparative Politics at the Hertie School of Governance. His research primarily focuses on elections and political economy. Kayser’s major projects centre on partisan asymmetries in electoral accountability, media reporting on the economy, and the effect of electoral competitiveness on incumbent behaviour.

Simon Munzert is a Lecturer in Political Data Science at the Hertie School of Governance. His research interests include public opinion, political representation, and the use of new media in politics. He is the principal investigator of an international cooperation project funded by the VolkswagenStiftung entitled “Paying Attention to Attention: Media Exposure and Opinion Formation in an Age of Information Overload.”

Jonathan Rummel is a soon-to-be Master of Public Policy graduate of the Hertie School. His work focuses on data visualization and the intersection of climate policy, energy transition and global mobility. He is also addicted to Ultimate Frisbee.