10.06.2025 16:00 Prof. Johannes Maly: Analyzing the Implicit Regularization of Gradient Descent
Gradient descent (GD) and its variants are vital ingredients in neural network training. It is widely believed that the impressive generalization performance of trained models is partially due to some form of implicit bias of GD towards specific minima of the loss landscape. In this talk, we will review and discuss approaches to rigorously identify and analyze implicit regularization of GD in simplified training settings. We furthermore provide evidence suggesting that a single implicit bias is not sufficient to explain the effectiveness of GD in training tasks.
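As a textbook illustration of implicit bias (not taken from the talk itself), consider a single linear equation with two unknowns. Every point on the line a·x = b minimizes the loss, yet GD started at zero converges to the minimum-norm solution, while a different initialization selects a different minimizer. The equation, step size, and step count below are illustrative assumptions.

```python
def gradient_descent(a, b, x0, lr=0.1, steps=200):
    """Run GD on the loss 0.5 * (a.x - b)^2 for a single linear equation."""
    x = list(x0)
    for _ in range(steps):
        residual = sum(ai * xi for ai, xi in zip(a, x)) - b
        # The gradient is residual * a, so every update moves along a.
        x = [xi - lr * residual * ai for xi, ai in zip(x, a)]
    return x


# One equation, two unknowns: infinitely many exact minimizers.
a, b = [1.0, 2.0], 5.0

x_zero = gradient_descent(a, b, x0=[0.0, 0.0])   # approximately [1.0, 2.0]
x_other = gradient_descent(a, b, x0=[1.0, 0.0])  # approximately [1.8, 1.6]

# The minimum-norm solution of a.x = b is a * b / ||a||^2.
norm_sq = sum(ai * ai for ai in a)
x_min_norm = [ai * b / norm_sq for ai in a]      # [1.0, 2.0]
```

Both runs reach zero loss, but only the zero-initialized run matches `x_min_norm`: the iterates never leave the starting point plus the span of `a`, which is what pins down the selected minimizer in this toy setting.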
17.06.2025 16:30 Eyal Neuman (Imperial College London): Stochastic Graphon Games with Memory
We study finite-player dynamic stochastic games with heterogeneous interactions and non-Markovian linear-quadratic objective functionals. We derive the Nash equilibrium explicitly by converting the first-order conditions into a coupled system of stochastic Fredholm equations, which we solve in terms of operator resolvents. When the agents' interactions are modeled by a weighted graph, we formulate the corresponding non-Markovian continuum-agent game, where interactions are modeled by a graphon. We also derive the Nash equilibrium of the graphon game explicitly by first reducing the first-order conditions to an infinite-dimensional coupled system of stochastic Fredholm equations, then decoupling it using the spectral decomposition of the graphon operator, and finally solving it in terms of operator resolvents. Moreover, we show that the Nash equilibria of finite-player games on graphs converge to those of the graphon game as the number of agents increases. This holds both when a given graph sequence converges to the graphon in the cut norm and when the graph sequence is sampled from the graphon. We also bound the convergence rate, which depends on the cut norm in the former case and on the sampling method in the latter. Finally, we apply our results to various stochastic games with heterogeneous interactions, including systemic risk models with delays and stochastic network games.
______________________________
Invited by Prof. Alexander Kalinin
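The abstract mentions graph sequences sampled from a graphon without naming the sampling method. One standard construction, sketched here as an assumption, draws a uniform latent position for each agent and connects agents i and j independently with probability W(u_i, u_j). The product graphon W(x, y) = xy and the function names are hypothetical choices for illustration.

```python
import random


def sample_graph_from_graphon(graphon, n, seed=0):
    """Sample an undirected simple graph on n nodes from a graphon W."""
    rng = random.Random(seed)
    # Each node receives an independent uniform latent position in [0, 1].
    latent = [rng.random() for _ in range(n)]
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Connect i and j with probability W(u_i, u_j).
            if rng.random() < graphon(latent[i], latent[j]):
                adj[i][j] = adj[j][i] = 1
    return latent, adj


def product_graphon(x, y):
    """Illustrative graphon W(x, y) = x * y (an assumption, not from the talk)."""
    return x * y


n_nodes = 200
latent, adj = sample_graph_from_graphon(product_graphon, n_nodes)

# Fraction of ordered node pairs that are connected; the expected value
# for this graphon is the integral of x * y over the unit square, 0.25.
edge_density = sum(map(sum, adj)) / (n_nodes * (n_nodes - 1))
```

The resulting adjacency matrix is symmetric with an empty diagonal, and its edge density fluctuates around 0.25 for this choice of W.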
23.06.2025 12:15 Qingqing Zhai (Shanghai University, CN): Modeling Complex System Deterioration: From Unit Degradation to Networked Recurrent Failures
This presentation addresses statistical challenges in modeling the deterioration of complex systems, spanning from individual unit degradation to interdependent network failures. First, we introduce statistical degradation data modeling using stochastic processes. Then, we shift to modeling recurrent failures in large-scale infrastructure networks (e.g., water distribution systems). Motivated by 16 years of Scottish Water pipe failure data, we propose the novel Network Gamma-Poisson Autoregressive NHPP (GPAN) model. This two-layer framework captures temporal dynamics via Non-Homogeneous Poisson Processes (NHPPs) with node-specific frailties and spatial dependencies through a gamma-Poisson autoregressive scheme structured by the network's Directed Acyclic Graph (DAG). To overcome computational intractability, a scalable sum-product algorithm based on factor graphs and message passing is developed for efficient inference, enabling application to networks with tens of thousands of nodes. We demonstrate how this approach provides accurate failure predictions, identifies high-risk clusters, and supports operational management and risk assessment. The methodologies presented offer powerful tools for reliability analysis across diverse engineering contexts, from product lifespan prediction to critical infrastructure resilience.
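For readers unfamiliar with NHPPs, the following minimal sketch simulates recurrent failures of a single unit by thinning: candidate times are drawn from a homogeneous Poisson process at a dominating rate and each is kept with probability intensity(t) / intensity_max. This is not the GPAN model. The power-law intensity, its parameters, and all names are assumptions made for illustration.

```python
import random


def simulate_nhpp(intensity, intensity_max, horizon, seed=0):
    """Simulate NHPP event times on (0, horizon] by thinning.

    Requires intensity(t) <= intensity_max for all t in (0, horizon].
    """
    rng = random.Random(seed)
    events, t = [], 0.0
    while True:
        # Next candidate from a homogeneous process with rate intensity_max.
        t += rng.expovariate(intensity_max)
        if t > horizon:
            return events
        # Keep the candidate with probability intensity(t) / intensity_max.
        if rng.random() * intensity_max <= intensity(t):
            events.append(t)


# Hypothetical power-law intensity: scale * shape * t^(shape - 1).
SCALE, SHAPE, HORIZON = 0.5, 2.0, 10.0


def power_law_intensity(t):
    return SCALE * SHAPE * t ** (SHAPE - 1.0)


# The intensity is increasing for SHAPE > 1, so its maximum is at the horizon.
failure_times = simulate_nhpp(
    power_law_intensity, power_law_intensity(HORIZON), HORIZON
)

# Expected number of failures is the integrated intensity SCALE * HORIZON^SHAPE.
expected_failures = SCALE * HORIZON ** SHAPE  # 50.0
```

With an increasing intensity the simulated failures cluster toward the end of the observation window, which is the qualitative behavior of an aging unit.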
23.06.2025 16:30 Jae Youn Ahn (Ewha Womans University, Korea): Interpretable Generalized Coefficient Models Integrating Deep Neural Networks within a State-Space Framework for Insurance Credibility
Credibility methods in insurance provide a linear approximation, formulated as a weighted average of claim history, making them highly interpretable for estimating the predictive mean of the a posteriori rate. In this presentation, we extend the credibility method to a generalized coefficient regression model, where credibility factors—interpreted as regression coefficients—are modeled as flexible functions of claim history. This extension, structurally similar to the attention mechanism, enhances both predictive accuracy and interpretability. A key challenge in such models is the potential issue of non-identifiability, where credibility factors may not be uniquely determined. Without ensuring the identifiability of the generalized coefficients, their interpretability remains uncertain. To address this, we first introduce a state-space model (SSM) whose predictive mean has a closed-form expression. We then extend this framework by incorporating neural networks, allowing the predictive mean to be expressed in a closed-form representation of generalized coefficients. We demonstrate that this model guarantees the identifiability of the generalized coefficients. As a result, the proposed model not only offers flexible estimates of future risk—matching the expressive power of neural networks—but also ensures an interpretable representation of credibility factors, with identifiability rigorously established. This presentation is based on joint work with Mario Wuethrich (ETH Zurich) and Hong Beng Lim (Chinese University of Hong Kong).
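The weighted average of claim history mentioned at the start of the abstract can be illustrated with the classical Bühlmann form, in which the credibility factor Z = n / (n + k) weights the policyholder's own mean against the collective mean. This is only the baseline that the talk generalizes, not the proposed model. The claim figures and the value of k are made up for illustration.

```python
def credibility_premium(claims, collective_mean, k):
    """Return (Z, premium) for the classical Buhlmann credibility formula.

    Z = n / (n + k) is the weight on the individual claim history, where n is
    the number of observed periods and k is the credibility parameter
    (assumed known here; in practice it is estimated from portfolio data).
    """
    n = len(claims)
    z = n / (n + k)
    individual_mean = sum(claims) / n
    premium = z * individual_mean + (1.0 - z) * collective_mean
    return z, premium


# Hypothetical policyholder with three years of claims.
z, premium = credibility_premium(
    claims=[120.0, 80.0, 100.0], collective_mean=150.0, k=3.0
)
# z == 0.5 and premium == 125.0: halfway between the individual mean (100)
# and the collective mean (150).
```

In this baseline Z depends only on the number of observations; the generalized coefficient model described above instead lets the credibility factors vary with the claim history itself.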
23.06.2025 16:30 Adam Waterbury (Denison University): Large Deviations for Empirical Measures of Self-Interacting Markov Chains
Self-interacting Markov chains arise in a range of models and applications. For example, they can be used to approximate the quasi-stationary distributions of irreducible Markov chains and to model random walks with edge or vertex reinforcement. The term self-interacting Markov chain is something of a misnomer, as such processes interact with their full path history at each time instant and are therefore non-Markovian. Under conditions on the self-interaction mechanism, we establish a large deviation principle for the empirical measure of self-interacting chains on finite spaces. In this setting, the rate function takes a strikingly different form from the classical Donsker-Varadhan rate function associated with the empirical measure of a Markov chain; the rate function for self-interacting chains is typically non-convex and is given through a dynamical variational formula with an infinite-horizon discounted objective function.
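One well-known self-interacting scheme for approximating a quasi-stationary distribution can be sketched as follows: the chain moves according to a sub-stochastic kernel, and whenever it would be absorbed it restarts from a state drawn from its own empirical occupation measure. The 2x2 kernel below is a made-up example, and the talk's conditions on the interaction mechanism are not reproduced here.

```python
import random


def qsd_by_self_interaction(sub_matrix, steps, seed=0):
    """Approximate a quasi-stationary distribution with a self-interacting chain.

    sub_matrix[i][j] is the probability of moving from transient state i to
    transient state j; the missing mass in each row is the absorption probability.
    """
    rng = random.Random(seed)
    n = len(sub_matrix)
    counts = [0] * n
    state = 0
    for _ in range(steps):
        counts[state] += 1
        u = rng.random()
        cumulative = 0.0
        next_state = None
        for j in range(n):
            cumulative += sub_matrix[state][j]
            if u < cumulative:
                next_state = j
                break
        if next_state is None:
            # Absorption: restart from the empirical measure of the past path.
            # This dependence on the full history makes the process non-Markovian.
            next_state = rng.choices(range(n), weights=counts)[0]
        state = next_state
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical kernel on two transient states; each row leaks 0.2 to absorption.
SUB_MATRIX = [[0.5, 0.3],
              [0.2, 0.6]]

empirical_measure = qsd_by_self_interaction(SUB_MATRIX, steps=100_000)
```

For this kernel the quasi-stationary distribution is (0.4, 0.6), the normalized left eigenvector for the leading eigenvalue 0.8, so the empirical measure should land close to those values for a long run.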