Causal discovery in the presence of latent variables


A fundamental problem in extracting causal information from large-scale remote sensing time series datasets is that important confounding drivers are not observed. From a theoretical point of view, this problem can be addressed by causal discovery algorithms such as Fast Causal Inference (FCI), which can in many cases decide whether an estimated link is (directly or indirectly) causal or due to an unobserved confounder. However, such algorithms are currently not well adapted to large-scale climate time series with high autocorrelation and time lags. We will investigate and develop new methodologies for handling unobserved confounders. To this end, causal discovery methods suitable for large-scale linear and nonlinear time series datasets will be extended using ideas from FCI. Using the causality benchmark database (initiated by partners DLR and UVEG), the method will be systematically evaluated and improved. The new technique will then be applied to simulated aerosol-cloud interactions using large-eddy simulations with the ICON-HAM model, together with partner UOXF. By systematically including and excluding selected variables, this approach will make it possible to test the effect of unobserved variables on estimates of aerosol-cloud interactions. Finally, the method will be applied to the A-Train (MODIS/CALIOP/CloudSat) satellite datasets, together with other relevant climate observations, to investigate aerosol effects on cloud albedo and the associated radiative effects. In a secondment at Amazon, the student will apply the new method to further datasets.
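The idea of including and excluding a variable to test its effect can be illustrated with a small toy experiment. The sketch below is not FCI and not the method to be developed in the project; it is a minimal linear example under assumed settings. The variable names (`u`, `x`, `y`), the coefficients, and the helper functions `simulate` and `partial_corr` are all hypothetical. A driver `u` influences two autocorrelated series `x` and `y` that have no causal link between them, and the partial correlation of `x` and `y` given their own past is computed once with `u` hidden and once with `u` observed.

```python
import numpy as np


def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing out the columns of z."""
    design = np.column_stack([np.ones(len(x)), z])  # add an intercept column
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return np.corrcoef(res_x, res_y)[0, 1]


def simulate(n_steps=5000, seed=0):
    """Toy system (assumed coefficients): u drives x and y at lag 1; x and y are not linked."""
    rng = np.random.default_rng(seed)
    u = np.zeros(n_steps)
    x = np.zeros(n_steps)
    y = np.zeros(n_steps)
    for t in range(1, n_steps):
        u[t] = 0.5 * u[t - 1] + rng.normal()
        x[t] = 0.6 * x[t - 1] + 0.8 * u[t - 1] + rng.normal()
        y[t] = 0.6 * y[t - 1] + 0.8 * u[t - 1] + rng.normal()
    return u, x, y


u, x, y = simulate()

# Condition on the lagged values of x and y to account for autocorrelation.
past = np.column_stack([x[:-1], y[:-1]])

# Case 1: u is excluded (unobserved), so x and y remain spuriously dependent.
pc_hidden = partial_corr(x[1:], y[1:], past)

# Case 2: u is included (observed), so the dependence should largely vanish.
pc_observed = partial_corr(x[1:], y[1:], np.column_stack([past, u[:-1]]))

print(f"u excluded: partial correlation = {pc_hidden:.3f}")
print(f"u included: partial correlation = {pc_observed:.3f}")
```

In this toy setting the dependence between `x` and `y` is clearly nonzero when `u` is left out and close to zero when `u` is conditioned on. A conditional-independence test alone cannot tell such a spurious link from a causal one when `u` is unavailable, which is the gap that FCI-type algorithms aim to close.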


Jakob Runge (supervisor), Joachim Denzler (co-supervisor), Philip Stier (co-supervisor), Javier Gonzalez Hernandez (Amazon, non-academic advisor)


3 months at University of Oxford & 6 months at Amazon Cambridge

Enrolment in Doctoral degree

PhD in Data Science at University of Jena
