Almost half of the total seats in the German Bundestag are awarded through first-past-the-post elections at the electoral-district level. However, many election forecasting models do not take this into account. In this paper we present an approach to predicting candidate-vote shares at the district level in German federal elections. To that end, we combine the national-level election prediction model from zweitstimme.org with two district-level prediction models, a linear regression and an artificial neural network, both of which use the same candidate and district characteristics for their predictions. All data in our approach are publicly available prior to the respective election; thus, our model yields real forecasts. The model is therefore able to provide valuable information to running candidates and the interested public in future elections. Moreover, our prediction results are also relevant for substantive research: with the aid of the resulting odds of winning, better measures can be created to characterize the competitiveness of an electoral district and the expected closeness of electoral-district elections, which can influence political behaviour. Furthermore, the prediction allows empirical statements to be made about the expected size of the Bundestag as well as the composition of its personnel.
We offer a dynamic Bayesian forecasting model for multi-party elections. It combines data from published pre-election public opinion polls with information from fundamentals-based forecasting models. The model accounts for the multi-party nature of the setting and allows statements to be made about the probability of other quantities of interest, such as the probability of a plurality of votes for a party or of a majority for certain coalitions in parliament. We present results from two ex ante forecasts of elections that took place in 2017 and show that the model outperforms fundamentals-based forecasting models in terms of accuracy and the calibration of uncertainty. Provided that historical and current polling data are available, the model can be applied to any multi-party setting.
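As a rough illustration of how probabilities for events such as a plurality or a coalition majority can be read off a set of simulated vote-share draws, consider the following minimal sketch. It is not the model described above: the party labels, the Dirichlet parameters, the 5% threshold, and the helper names `dirichlet_draw`, `prob_plurality`, and `prob_coalition_majority` are all assumptions made for illustration, and the draws are generated from made-up values rather than from any real forecast.

```python
import random


def dirichlet_draw(rng, alphas):
    """Draw one vector of shares summing to 1 (Dirichlet via normalized gammas)."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]


def prob_plurality(draws, party):
    """Fraction of draws in which `party` has the largest vote share."""
    hits = sum(1 for d in draws if max(d, key=d.get) == party)
    return hits / len(draws)


def prob_coalition_majority(draws, coalition, threshold=0.05):
    """Fraction of draws in which the coalition holds more than half of the
    shares of all parties at or above `threshold` (an assumed, simplified
    stand-in for a seat majority in parliament)."""
    hits = 0
    for d in draws:
        seated = {p: s for p, s in d.items() if s >= threshold}
        total = sum(seated.values())
        coalition_share = sum(seated.get(p, 0.0) for p in coalition)
        if total > 0 and coalition_share > total / 2:
            hits += 1
    return hits / len(draws)


# Hypothetical parties and made-up Dirichlet parameters; in practice the draws
# would come from the forecasting model's simulated vote shares.
rng = random.Random(1)
parties = ["A", "B", "C", "D"]
alphas = [35.0, 30.0, 20.0, 15.0]
draws = [dict(zip(parties, dirichlet_draw(rng, alphas))) for _ in range(2000)]

p_plur = {p: prob_plurality(draws, p) for p in parties}
p_coal = prob_coalition_majority(draws, ["A", "C"])
print(p_plur, p_coal)
```

Because every quantity is computed draw by draw, the uncertainty in the simulated vote shares carries over directly to the event probabilities.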
The introduction of new "machine learning" methods and terminology to political science complicates the interpretation of results. This is even more the case when a single term, such as cross-validation, can mean very different things. We find different meanings of cross-validation in applied political science work. In the context of predictive modeling, cross-validation can be used to obtain an estimate of true error or as a procedure for model tuning. Using a single cross-validation procedure to obtain an estimate of the true error and to tune the model at the same time leads to serious misreporting of performance measures. We demonstrate the severe consequences of this problem with a series of experiments. We also observe this problematic usage of cross-validation in applied research. We look at Muchlinski et al. (2016) on the prediction of civil war onsets to illustrate how the problematic use of cross-validation can affect applied work. Applying cross-validation correctly, we are unable to reproduce their findings. We encourage researchers in predictive modeling to be especially mindful when applying cross-validation.
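The distinction between the two uses of cross-validation can be sketched in a few lines. The example below is an illustration under stated assumptions, not the experiments reported in the paper: it uses a small hand-written k-nearest-neighbours classifier, synthetic features with purely random labels (so the true error rate is 0.5), and hypothetical helper names such as `cv_error`, `tune`, and `nested_cv_error`. The "naive" figure reuses one cross-validation both to pick the tuning parameter and to report the error; the nested version tunes only inside each outer training fold and evaluates on the held-out outer fold.

```python
import random


def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (binary labels 0/1)."""
    nearest = sorted(
        train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x))
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def error_rate(train, test, k):
    """Share of misclassified test points."""
    wrong = sum(knn_predict(train, x, k) != y for x, y in test)
    return wrong / len(test)


def folds(data, n_folds):
    """Yield (train, test) splits; every n_folds-th point forms the test fold."""
    for i in range(n_folds):
        test = data[i::n_folds]
        train = [p for j, p in enumerate(data) if j % n_folds != i]
        yield train, test


def cv_error(data, k, n_folds):
    """Plain cross-validated error for a fixed tuning parameter k."""
    errs = [error_rate(tr, te, k) for tr, te in folds(data, n_folds)]
    return sum(errs) / len(errs)


def tune(data, grid, n_folds):
    """Pick the k with the lowest cross-validated error on `data`."""
    return min(grid, key=lambda k: cv_error(data, k, n_folds))


def nested_cv_error(data, grid, n_folds):
    """Outer loop estimates error; tuning happens only on outer training data."""
    errs = []
    for train, test in folds(data, n_folds):
        best_k = tune(train, grid, n_folds)
        errs.append(error_rate(train, test, best_k))
    return sum(errs) / len(errs)


# Synthetic data: labels are independent of the features (assumption).
rng = random.Random(0)
data = [([rng.gauss(0, 1) for _ in range(5)], rng.randint(0, 1)) for _ in range(60)]
grid = [1, 3, 5, 7, 9]

naive = min(cv_error(data, k, 5) for k in grid)  # tuning and reporting on the same folds
nested = nested_cv_error(data, grid, 5)          # tuning kept separate from evaluation
print(naive, nested)
```

Since the naive figure is the minimum over several noisy error estimates, it tends to be optimistic; the nested estimate does not select on the folds it reports.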
We present results of an ex-ante forecast of party-specific vote shares at the German Federal Election 2017. To that end, we combine data from published trial heat polls with structural information. The model accounts for the multi-party nature of the setting and allows statements to be made about the probability of certain events, such as a plurality of votes for a party or a majority for coalition options in parliament. The forecasts of our model are continuously updated on the platform zweitstimme.org. The value of our approach goes beyond the realm of academia: we equip journalists, political pundits, and ordinary citizens with information that can help them make sense of the parties’ latent support and ultimately make better-informed voting decisions.