Seminar on Statistics and Data Science

This seminar series is organized by the research group in statistics and features talks on advances in methods of data analysis, statistical theory, and their applications. The speakers are external guests as well as researchers from other groups at TUM. All talks in the seminar series are listed in the Munich Mathematical Calendar.

The seminar takes place in room BC1 2.01.10 under the current rules and simultaneously via Zoom. To stay up to date on upcoming presentations, please join our mailing list. You will receive an email to confirm your subscription.

Zoom link

Join the seminar. Please use your real name when entering the session. The session will start roughly 10 minutes prior to the talk.


Upcoming talks

05.06.2023 15:30 Fang Han (University of Washington, Seattle): Chatterjee's rank correlation: what is new?

This talk will provide an overview of the recent progress made in exploring Sourav Chatterjee's newly introduced rank correlation. The objective is to elaborate on its practical utility and present several new findings pertaining to (a) the asymptotic normality and limiting variance of Chatterjee's rank correlation, (b) its statistical efficiency for testing independence, and (c) the issue of its bootstrap inconsistency. Notably, the presentation will reveal that Chatterjee's rank correlation is root-n consistent and asymptotically normal, but bootstrap inconsistent, which is an unusual phenomenon in the literature.
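For readers unfamiliar with the coefficient, the following is a minimal sketch of how Chatterjee's rank correlation is commonly computed when the data contain no ties: sort the pairs by x, take the ranks r_i of the corresponding y values, and evaluate 1 - 3 * sum |r_{i+1} - r_i| / (n^2 - 1). The function name `chatterjee_xi` and the decision to reject tied data are assumptions made for this illustration only; they are not taken from the talk.

```python
def chatterjee_xi(x, y):
    """Chatterjee's rank correlation for paired samples without ties.

    Illustrative sketch only: the name and the tie handling (ties are
    rejected) are assumptions, not part of the talk abstract.
    """
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if n < 2:
        raise ValueError("at least two observations are required")
    if len(set(x)) != n or len(set(y)) != n:
        raise ValueError("this sketch assumes no ties in x or y")

    # Reorder the y values according to increasing x.
    order = sorted(range(n), key=lambda i: x[i])
    y_by_x = [y[i] for i in order]

    # Rank of each y value among all y values (1 = smallest).
    rank_of = {value: k + 1 for k, value in enumerate(sorted(y_by_x))}
    ranks = [rank_of[value] for value in y_by_x]

    # Sum of absolute differences between consecutive ranks.
    total = sum(abs(ranks[i + 1] - ranks[i]) for i in range(n - 1))
    return 1.0 - 3.0 * total / (n * n - 1)


if __name__ == "__main__":
    xs = list(range(1, 11))
    ys = [v * v for v in xs]  # y is a monotone function of x
    print(chatterjee_xi(xs, ys))
```

For a strictly monotone relationship the formula reduces to (n - 2) / (n + 1), so the value approaches 1 only as n grows. Unlike Pearson's or Spearman's coefficient, the statistic is not symmetric in x and y.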

07.06.2023 12:15 Manfred Denker (Penn State University): Monte Carlo estimation of multiple stochastic integrals and its statistical applications

Multiple stochastic integrals with respect to Brownian motion are a classical topic, while their version with respect to stable processes has attracted only minor interest. Their distributions can be simulated using U-statistics. This will be discussed in the first part of the talk. On the other hand, this representation allows for statistical applications for observations with slowly decaying tail distributions. I shall present some simulations and give an application from neuroscience.

07.06.2023 13:15 Alexis Derumigny (Delft University of Technology): t.b.a.


14.06.2023 13:15 Marcel Wienöbst (Universität zu Lübeck): Linear-Time Algorithms for Front-Door Adjustment in Causal Graphs

Causal effect estimation from observational data is a fundamental task in empirical sciences. It becomes particularly challenging when unobserved confounders are involved in a system. This presentation provides an introduction to front-door adjustment – a classic technique which, using observed mediators, allows one to identify causal effects even in the presence of unobserved confounding. Focusing on the algorithmic aspects, this talk presents recent results for finding front-door adjustment sets in time linear in the size of the causal graph. Link to technical report:

22.06.2023 12:15 Harry Joe (University of British Columbia): Vine copula regression for observational studies

If explanatory variables and a response variable of interest are simultaneously observed, then multivariate models based on vine pair-copula constructions can be fit, from which inferences are based on the conditional distribution of the response variable given the explanatory variables. For applications, there are things to consider when implementing this idea. Topics include: (a) inclusion of categorical predictors; (b) right-censored response variable; (c) for a pair with one ordinal and one continuous variable, diagnostics for copula choice and assessing fit of copula; (d) use of empirical beta copula; (e) performance metrics for prediction/classification and sensitivity to choice of vine structure and pair-copulas on edges of vine; (f) weighted log-likelihood for ordinal response variable; (g) comparisons with linear regression methods.

Previous talks

within the last 90 days

23.05.2023 13:15 Jakob Runge (German Aerospace Center, Jena/Technische Universität Berlin): Causal inference for data-driven science

Machine learning excels in learning associations and patterns from data and is increasingly adopted in the natural, life, and social sciences, as well as engineering. However, many relevant research questions about such complex systems are inherently causal and machine learning alone is not designed to answer them. At the same time there often exists ample theoretical and empirical knowledge in the application domains. In this talk, I will briefly outline causal inference as a powerful framework providing the theoretical foundations to combine data and machine learning models with qualitative domain assumptions to quantitatively answer causal questions. I will discuss challenges ahead and selected application scenarios to spark interest in integrating causal thinking into data-driven science.

Short bio: Jakob Runge has headed the Causal Inference group at the German Aerospace Center’s Institute of Data Science in Jena since 2017 and has been guest professor of computer science at TU Berlin since 2021. His group develops theory, methods, and accessible software for causal inference on time series data inspired by challenges in various application domains. Jakob studied physics at Humboldt University Berlin and finished his PhD project at the Potsdam Institute for Climate Impact Research in 2014. For his studies he was funded by the German National Foundation (Studienstiftung) and his thesis was awarded the Carl-Ramsauer prize by the Berlin Physical Society. In 2014 he won a $200,000 Fellowship Award in Studying Complex Systems by the James S. McDonnell Foundation, and he was at the Grantham Institute, Imperial College London, from 2016 to 2017. In 2020 he won an ERC Starting Grant with his interdisciplinary project CausalEarth. He provides Tigramite, a time series analysis Python module for causal inference. For more details, see:

23.05.2023 14:15 Andrew McCormack (Duke University): Aspects of Information Geometry and Efficiency for Kronecker Covariances

The Kronecker covariance structure for array data posits that the covariances along comparable modes, such as rows and columns, of an array are similar. For example, when modelling a multivariate time series, it might be assumed that each individual series follows the same AR process, up to changes in scale, while at each particular timepoint the observations across series have the same correlation structure. Over and above being a plausible model for many types of data, the Kronecker covariance assumption is especially useful in high-dimensional settings, where unconstrained covariance matrix estimates are typically unstable. In this talk we explore the information geometric aspects of the estimation of Kronecker covariance matrices. The asymptotic properties of two estimators, the maximum likelihood estimator and an estimator based on partial traces, are contrasted. It is shown that the partial trace estimator is inefficient, where the relative performance of this estimator can be quantified in terms of a principal angle between tangent spaces. This principal angle can be related to the eigenvalues of the underlying Kronecker covariance matrix. By defining a rescaled version of the partial trace operator, an asymptotically efficient correction to the partial trace estimator is proposed. This estimator has a closed-form expression and also has a useful equivariance property. An orthogonal parameterization of the collection of Kronecker covariances is subsequently motivated by the rescaled partial trace estimator. Orthogonal parameterizations imply that the components of the parameterization are asymptotically independent, which in the Kronecker case has implications for tests concerning row and column covariances.

10.05.2023 12:15 Gernot Müller (Universität Augsburg): New challenges in electricity price modeling

Renewable energies, in particular wind and solar power, have become responsible for a large part of the variation in electricity prices in the past years. Moreover, traders are increasingly interested in models which can be used to forecast the day-ahead/intraday price spread. In this talk we shed some light on new models for electricity prices using continuous autoregressive processes. In addition, we discuss intraday price modeling and spread forecasting based on Bayesian statistics and artificial intelligence.

For talks held more than 90 days ago, please have a look at the Munich Mathematical Calendar (filter: "Oberseminar Statistics and Data Science").