Bio

I am a Postdoctoral Researcher at the Statistical Methods Unit of the Institute for Employment Research (IAB) in Nuremberg, Germany, and at the Chair for Statistics and Data Science in Social Sciences and the Humanities at LMU Munich.

My research focuses on privacy-preserving AI and synthetic data generation. I completed my PhD (“Generative Adversarial Nets for Social Scientists”) at the University of Mannheim in 2023, supervised by Prof. Thomas Gschwend, Ph.D. and Prof. Dr. Frauke Kreuter. Previously, I held research positions at Boston University’s Computer Science Department (2021-2023) and the Simons Institute for the Theory of Computing at UC Berkeley (2019).

I believe that translating methods from computer science to social science requires careful evaluation, not blind adoption of hyped techniques. Newer and more complex methods are not automatically better; rigorous benchmarking reveals their strengths and limitations. My dissertation exemplifies this: while introducing GANs to social scientists, I also demonstrated that GAN-based multiple imputation methods fail to meet the standards required for valid statistical inference in most scenarios.

I collaborate with the U.S. Census Bureau and the German Federal Statistical Office (Destatis) on privacy-preserving synthetic data for sensitive administrative datasets. I have published in leading venues across disciplines, including ICLR, PNAS, the Harvard Data Science Review, and Political Analysis.

As a co-founder of and contributor to zweitstimme.org, I co-built a platform that communicates scientific election forecasts for German federal elections to a broad audience. The project has been covered by major media outlets, including Zeit Online, Tagesspiegel, and the Washington Post.

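Current Affiliations

  • Institute for Employment Research (IAB), Nuremberg - Postdoctoral Researcher, Statistical Methods Unit
  • LMU Munich - Postdoctoral Researcher, Chair for Statistics and Data Science in Social Sciences and the Humanities
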
Previous Positions

  • Boston University - Postdoctoral Researcher, Computer Science Department (2021-2023)
  • Simons Institute for the Theory of Computing, UC Berkeley - Research Stay (2019)

Research Interests

  • Differential Privacy
  • Privacy-Preserving Synthetic Data
  • Trustworthy AI
  • Machine Learning & Deep Learning
  • Generative Adversarial Networks (GANs)
  • Multiple Imputation & Missing Data
  • Election Forecasting
  • Reproducibility & Research Methods

Software

  • RGAN - Generative Adversarial Networks in R (8,300+ CRAN downloads)
  • MIBench - First standardized benchmark for multiple imputation algorithms
  • Post-GAN Boosting - Replication code for ICLR 2021 paper

Education

  • Ph.D. - Graduate School of Economic and Social Sciences, University of Mannheim (2023)
    • Thesis: “Generative Adversarial Nets for Social Scientists”
    • Supervisors: Prof. Thomas Gschwend, Ph.D. and Prof. Dr. Frauke Kreuter
  • M.A. in Political Science - University of Mannheim (2016)
  • B.A. in Governance and Public Policy - University of Passau (2013)

Frequently Asked Questions

What is your research focus?

I specialize in privacy-preserving AI and synthetic data generation. My work develops methods that allow organizations to share and analyze sensitive data while protecting individual privacy through techniques like differential privacy and generative models.

What is differential privacy?

Differential privacy is a mathematical framework that provides formal, quantifiable privacy guarantees. It ensures that the output of a data analysis changes only slightly whether or not any single individual’s data is included, so the output reveals very little about whether a specific person is in the dataset. This makes it possible to learn aggregate patterns while protecting individual information.
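
A minimal sketch of the idea, using the Laplace mechanism on a counting query. This is an illustration only, not code from any of the projects listed here: the function names, the toy data, and the choice of epsilon are all assumptions, and a production system would need a vetted library rather than naive floating-point noise.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return an epsilon-differentially private count (hypothetical helper).

    A counting query has sensitivity 1: adding or removing one person
    changes the true count by at most 1, so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical toy data: ages of survey respondents.
ages = [23, 35, 41, 29, 62, 57, 33, 48]
rng = random.Random(0)  # fixed seed only to make the illustration repeatable
noisy = private_count(ages, lambda a: a >= 40, epsilon=1.0, rng=rng)
print(f"noisy count of respondents aged 40+: {noisy:.2f}")
```

Smaller values of epsilon add more noise and give stronger privacy. Larger values give more accurate answers with weaker guarantees.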

How can synthetic data protect privacy?

Synthetic data protects privacy by generating new data points that preserve the statistical properties of the original data without containing real individuals’ information. When combined with differential privacy guarantees, synthetic data can enable data sharing for research and policy analysis while limiting the risk that individuals are re-identified.

What are Generative Adversarial Networks (GANs)?

GANs are a class of machine learning models where two neural networks compete: a generator creates synthetic data while a discriminator tries to distinguish synthetic from real data. This adversarial process produces increasingly realistic synthetic data. My dissertation “Generative Adversarial Nets for Social Scientists” evaluated their applications and limitations for social science research.
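
The competition between the two networks can be written as a minimax game. The formula below is the standard objective from the original GAN formulation, given as background rather than as the specific variant used in the dissertation. Here G is the generator, D is the discriminator, p_data is the distribution of the real data, and p_z is the noise distribution the generator samples from.

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The discriminator is trained to increase V by assigning high probability to real data and low probability to generated data, while the generator is trained to decrease V by producing samples the discriminator cannot tell apart from real ones.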

Who do you collaborate with?

I collaborate with the U.S. Census Bureau and the German Federal Statistical Office (Destatis) on privacy-preserving synthetic data methods for sensitive administrative datasets. I also work with researchers at IAB (Institute for Employment Research) and LMU Munich.

Contact