Correctly predicting civil war onset can save many lives. In a recent paper, Muchlinski et al. (2016) compare the predictive performance of several logit model specifications with that of Random Forest. They find that their Random Forest approach outperforms all logit specifications both in-sample and out-of-sample. However, we show that the apparent superiority of their Random Forest model is an artifact of methodological flaws. We take this opportunity to correct these flaws and provide a checklist for better model comparisons. To move the field forward and to save lives, we need to be sure that new models actually perform better.