21.05.2025 12:15 Michael Muma (TU Darmstadt): The T-Rex Selector: Fast High-Dimensional Variable Selection with False Discovery Rate Control
Providing guarantees on the reproducibility of discoveries is essential when drawing inferences from high-dimensional data. Such data is common in numerous scientific domains. In biomedicine, for example, it is imperative to reliably detect the genes that are truly associated with the survival time of patients diagnosed with a certain type of cancer; in finance, one aims to determine a sparse portfolio that reliably tracks an index. This talk introduces the Terminating-Random Experiments (T-Rex) selector, a fast multivariate variable selection framework for high-dimensional data. The T-Rex selector provably controls a user-defined target false discovery rate (FDR) while maximizing the number of selected variables. It scales to settings with millions of variables. Its computational complexity is linear in the number of variables, making it more than two orders of magnitude faster than existing approaches such as the model-X knockoff methods. An easy-to-use open-source R package, TRexSelector, implements the T-Rex selector and is available on CRAN. The talk focuses on high-dimensional linear regression models, but it also describes extensions to principal component analysis (PCA) and Gaussian graphical models (GGMs).
Source
23.07.2025 12:15 Oezge Sahin (TU Delft, NL): t.b.a.
t.b.a.
Source