Robust mean change point testing in high-dimensional data with heavy tails

Abstract

We study a mean change point testing problem for high-dimensional data with exponentially or polynomially decaying tails. In each case, depending on the $\ell_0$-norm of the mean change vector, we separately consider dense and sparse regimes. For the first time in the change point literature, we characterise the boundary between the dense and sparse regimes under these two tail conditions, and we propose novel testing procedures that attain optimal rates in each of the four regimes up to a poly-iterated logarithmic factor. By comparison with previous results under Gaussian assumptions, our results quantify the cost of heavy-tailedness on the fundamental difficulty of change point testing problems for high-dimensional data.
To be specific, when the error vectors follow sub-Weibull distributions, a CUSUM-type statistic is shown to achieve a minimax testing rate up to a factor of $\sqrt{\log\log(8n)}$. When the error distributions have polynomially decaying tails, admitting bounded $\alpha$-th moments for some $\alpha \geq 4$, we introduce a median-of-means-type test statistic that achieves a near-optimal testing rate in both dense and sparse regimes. In the sparse regime, we further propose a computationally efficient test that achieves exact optimality. Surprisingly, our investigation of the even more challenging case of $2 \leq \alpha < 4$ unveils a new phenomenon: the minimax testing rate has no sparse regime, i.e. testing sparse changes is information-theoretically as hard as testing dense changes. This phenomenon implies a phase transition of the minimax testing rates at $\alpha = 4$.
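
For readers unfamiliar with the two building blocks named above, the following is a minimal sketch of a textbook CUSUM contrast and a coordinate-wise median-of-means estimate. It is not the test procedure of the paper: the function names `cusum` and `median_of_means`, the normalisation, and the blocking scheme are illustrative assumptions, and the thresholds and aggregation steps that define the actual tests are omitted.

```python
import numpy as np


def cusum(x, t):
    """Standard CUSUM contrast at split point t for an (n, p) array x.

    Returns a length-p vector comparing the first t rows with the
    remaining n - t rows. Assumed textbook normalisation, not
    necessarily the one used in the paper.
    """
    n = x.shape[0]
    left = x[:t].sum(axis=0)
    right = x[t:].sum(axis=0)
    return (np.sqrt((n - t) / (n * t)) * left
            - np.sqrt(t / (n * (n - t))) * right)


def median_of_means(x, n_blocks):
    """Coordinate-wise median of block means for an (n, p) array x.

    Rows are split into n_blocks consecutive blocks (hypothetical
    blocking scheme); a single extreme row can corrupt at most one
    block mean, so the median stays stable.
    """
    blocks = np.array_split(x, n_blocks, axis=0)
    means = np.stack([b.mean(axis=0) for b in blocks])
    return np.median(means, axis=0)


# A mean shift of size 2 halfway through four observations.
shifted = np.array([[0.0], [0.0], [2.0], [2.0]])
contrast = cusum(shifted, 2)

# One gross outlier among otherwise constant observations.
contaminated = np.array([[1.0], [1.0], [1.0], [1.0], [1000.0], [1.0]])
robust_mean = median_of_means(contaminated, 3)
```

In this sketch the contrast is zero for constant data and grows in magnitude with the size of the shift, while the median-of-means estimate ignores the single outlier that would dominate an ordinary sample mean.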

Publication
arXiv preprint, arXiv:2305.18987
Yudong Chen
LSE Fellow in Statistics

My research interests include changepoint detection, high-dimensional statistics, robust statistics, online algorithms and machine learning.