04.06.2025 12:15 Gilles Blanchard (Université Paris-Saclay, FR): Estimating a large number of high-dimensional vector means
The problem of simultaneously estimating multiple means from independent samples has a long history in statistics, from the seminal works of Stein and Robbins in the 1950s and Efron and Morris in the 1970s up to the present day. This setting can also be seen as an (extremely stylized) instance of the "personalized federated learning" problem, where each user has their own data and target (the mean of their personal distribution) but potentially wants to share some relevant information with "similar" users (though there is no information available a priori about which users are "similar"). In this talk I will concentrate on contributions to the high-dimensional case, where the samples and their means belong to R^d with "large" d.
We consider a weighted aggregation scheme of the empirical means of each sample, and study the possible improvement in quadratic risk over the simple empirical means. To make the stylized problem closer to challenges encountered in practice, we allow (a) full heterogeneity of sample sizes, (b) zero a priori knowledge of the structure of the mean vectors, and (c) unknown and possibly heterogeneous sample covariances.
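As a rough sketch of the aggregation scheme (the notation below is only illustrative and not taken from the talk): writing \bar X_b for the empirical mean of sample b, built from N_b observations in R^d, the aggregated estimator of the mean \mu_b is a weighted combination of all empirical means, and its quadratic risk is compared with that of \bar X_b alone:
\[ \bar X_b = \frac{1}{N_b}\sum_{i=1}^{N_b} X_{b,i}, \qquad \hat\mu_b = \sum_{c=1}^{B} w_{bc}\,\bar X_c, \qquad \sum_{c=1}^{B} w_{bc} = 1, \]
\[ \mathbb{E}\!\left[\,\|\hat\mu_b - \mu_b\|^2\,\right] \quad \text{versus} \quad \mathbb{E}\!\left[\,\|\bar X_b - \mu_b\|^2\,\right]. \]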
We focus on the role of the effective dimension of the data from a "dimensional asymptotics" point of view, highlighting that the risk improvement of the proposed method satisfies an oracle inequality approaching an adaptive (minimax in a suitable sense) improvement as the effective dimension grows large.
(This is joint work with Jean-Baptiste Fermanian and Hannah Marienwald.)
10.06.2025 16:00 Prof. Johannes Maly: Analyzing the implicit regularization of Gradient Descent
Gradient descent (GD) and its variants are vital ingredients in neural network training. It is widely believed that the impressive generalization performance of trained models is partially due to some form of implicit bias of GD towards specific minima of the loss landscape. In this talk, we will review and discuss approaches to rigorously identify and analyze implicit regularization of GD in simplified training settings. We furthermore provide evidence suggesting that a single implicit bias is not sufficient to explain the effectiveness of GD in training tasks.
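A standard textbook example of such an implicit bias (included here for illustration; it is not a result specific to this talk): for an underdetermined least-squares problem with a consistent linear system, gradient descent initialized at the origin stays in the row span of the data matrix and, for a sufficiently small step size, converges to the minimum-norm interpolating solution:
\[ x_{k+1} = x_k - \eta\,\nabla L(x_k), \qquad L(x) = \tfrac{1}{2}\|Ax - b\|^2, \]
\[ x_0 = 0 \;\Longrightarrow\; x_k \in \operatorname{range}(A^{\top}) \text{ for all } k, \qquad x_k \longrightarrow \arg\min\{\|x\|_2 : Ax = b\}. \]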
17.06.2025 16:30 Eyal Neuman (Imperial College London): Stochastic Graphon Games with Memory
We study finite-player dynamic stochastic games with heterogeneous interactions and non-Markovian linear-quadratic objective functionals. We derive the Nash equilibrium explicitly by converting the first-order conditions into a coupled system of stochastic Fredholm equations, which we solve in terms of operator resolvents. When the agents' interactions are modeled by a weighted graph, we formulate the corresponding non-Markovian continuum-agent game, where interactions are modeled by a graphon. We also derive the Nash equilibrium of the graphon game explicitly by first reducing the first-order conditions to an infinite-dimensional coupled system of stochastic Fredholm equations, then decoupling it using the spectral decomposition of the graphon operator, and finally solving it in terms of operator resolvents. Moreover, we show that the Nash equilibria of finite-player games on graphs converge to those of the graphon game as the number of agents increases. This holds both when a given graph sequence converges to the graphon in the cut norm and when the graph sequence is sampled from the graphon. We also bound the convergence rate, which depends on the cut norm in the former case and on the sampling method in the latter. Finally, we apply our results to various stochastic games with heterogeneous interactions, including systemic risk models with delays and stochastic network games.
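For orientation, a schematic (deterministic, scalar) Fredholm equation of the second kind of the type mentioned above, together with its resolvent-form solution, reads as follows (my notation, not the speaker's):
\[ u(t) + \int_0^T K(t,s)\,u(s)\,ds = f(t), \qquad t \in [0,T], \]
\[ u(t) = f(t) - \int_0^T R(t,s)\,f(s)\,ds, \]
where the resolvent kernel R of K satisfies R + KR = K; in the games above, the analogous equations are stochastic and coupled across agents, and the spectral decomposition of the graphon operator is what allows them to be decoupled.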
______________________________
Invited by Prof. Alexander Kalinin
23.06.2025 16:30 Jae Youn Ahn (Ewha Womans University, Korea): Interpretable Generalized Coefficient Models Integrating Deep Neural Networks within a State-Space Framework for Insurance Credibility
Credibility methods in insurance provide a linear approximation, formulated as a weighted average of claim history, making them highly interpretable for estimating the predictive mean of the a posteriori rate. In this presentation, we extend the credibility method to a generalized coefficient regression model, where credibility factors—interpreted as regression coefficients—are modeled as flexible functions of claim history. This extension, structurally similar to the attention mechanism, enhances both predictive accuracy and interpretability. A key challenge in such models is the potential issue of non-identifiability, where credibility factors may not be uniquely determined. Without ensuring the identifiability of the generalized coefficients, their interpretability remains uncertain. To address this, we first introduce a state-space model (SSM) whose predictive mean has a closed-form expression. We then extend this framework by incorporating neural networks, allowing the predictive mean to be expressed in a closed-form representation of generalized coefficients. We demonstrate that this model guarantees the identifiability of the generalized coefficients. As a result, the proposed model not only offers flexible estimates of future risk—matching the expressive power of neural networks—but also ensures an interpretable representation of credibility factors, with identifiability rigorously established. This presentation is based on joint work with Mario Wuethrich (ETH Zurich) and Hong Beng Lim (Chinese University of Hong Kong).
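For context, the classical (Bühlmann) credibility premium is a fixed convex combination of the policyholder's observed average claim and the collective mean, whereas a generalized coefficient model of the kind described above lets the coefficients themselves depend on the claim history (the display below is a schematic in my notation, not the speakers' exact model):
\[ \hat\mu_{\mathrm{cred}} = Z\,\bar Y + (1 - Z)\,\mu_0, \qquad Z = \frac{n}{n + k}, \]
\[ \hat\mu_{\mathrm{gen}} = \alpha_0(Y_1,\dots,Y_n) + \sum_{t=1}^{n} \alpha_t(Y_1,\dots,Y_n)\,Y_t, \]
where \bar Y is the average of the n observed claims, \mu_0 the collective (prior) mean, k the credibility constant, and the \alpha_t are flexible, e.g. neural-network-parameterized, functions of the history whose identifiability is the central concern.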
23.06.2025 16:30 Adam Waterbury (Denison University): Large Deviations for Empirical Measures of Self-Interacting Markov Chains
Self-interacting Markov chains arise in a range of models and applications. For example, they can be used to approximate the quasi-stationary distributions of irreducible Markov chains and to model random walks with edge or vertex reinforcement. The term self-interacting Markov chain is something of a misnomer, as such processes interact with their full path history at each time instant, and therefore are non-Markovian. Under conditions on the self-interaction mechanism, we establish a large deviation principle for the empirical measure of self-interacting chains on finite spaces. In this setting, the rate function takes a strikingly different form than the classical Donsker-Varadhan rate function associated with the empirical measure of a Markov chain; the rate function for self-interacting chains is typically non-convex and is given through a dynamical variational formula with an infinite horizon discounted objective function.
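For reference, the two classical objects contrasted in the abstract (notation mine): the empirical measure of the chain and, for an irreducible Markov chain with transition kernel P on a finite state space, the Donsker-Varadhan rate function
\[ L_n = \frac{1}{n}\sum_{k=0}^{n-1} \delta_{X_k}, \qquad I_{\mathrm{DV}}(\nu) = \sup_{u > 0} \sum_{x} \nu(x)\,\log\frac{u(x)}{(Pu)(x)}, \]
whereas for self-interacting chains the rate function is instead characterized through a dynamical variational formula with an infinite-horizon discounted objective over the evolution of the empirical measure.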