Seminar: "High-dimensional banking data: cluster analysis and outlier detection"

On Friday, 31 March 2023, at 12:00 in Room 1 of the Palazzo delle Scienze, Prof. Matteo Farnè (University of Bologna, Department of Statistical Sciences) will give a seminar entitled "High-dimensional banking data: cluster analysis and outlier detection".

The abstract of the talk follows:

Abstract: In this talk, we describe two statistical methodologies tailored to high-dimensional banking data. First, we investigate the credit risk in the loan portfolio of banks following different business models. We develop a data-driven methodology for identifying the business models of the 365 largest European banks that is suitable for very granular harmonised supervisory data. Our dataset allows us to take into account the full range of the activities in which banks are involved. The proposed method, a trimmed version of factorial k-means, optimally combines data clustering, dimensionality reduction and outlier detection, building upon the clustering algorithm proposed by Vichi and Kiers (2001), enhanced by a procedure to detect ‘outlier’ banks. We identify four business models and exclude as ‘outliers’ banks that follow idiosyncratic business models. Furthermore, empirical evidence is provided that banks following different business models differ significantly with respect to the credit risk they undertake in their loan portfolios. Traditional commercial banks are characterized by the lowest levels of credit risk, while the loan portfolios of securities-holding banks are riskier than those of the other banks.
Second, we present a methodology, called ROBOUT, to identify outliers conditional on a set of linearly related predictors, retrieved from a large number of variables. In particular, ROBOUT is able to identify observations with outlying conditional mean or variance when the dataset contains bad leverage outliers in the predictors, highly correlated variables, and a large dimension compared to the sample size. ROBOUT entails a preliminary robust imputation procedure that prevents bad leverage outliers from corrupting predictor recovery, a selection stage of the statistically relevant predictors (via cross-validated LASSO-penalized Huber or LAD loss), the estimation of a robust regression model based on the selected predictors (via LTS or MM regression), and a criterion to identify conditional outliers. We conduct a comprehensive simulation study in which the different variants of the proposed algorithm are tested under a wide range of perturbation scenarios. The ROBOUT combination formed by LASSO-penalized Huber loss and MM regression turns out to be the best in terms of predictor retrieval and conditional outlier detection under the above-described perturbed conditions, also compared to existing integrated methodologies like SPARSE-LTS and RLARS. The proposed methodology is finally applied to granular balance sheet data collected by the European Central Bank.
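
For readers unfamiliar with conditional outlier detection, the final two stages described in the abstract (robust regression on the selected predictors, then a flagging criterion) can be illustrated with a toy sketch. This is not the ROBOUT implementation: it assumes a single, already selected predictor, skips the imputation and LASSO selection stages, and swaps in a plain Huber fit by iteratively reweighted least squares in place of LTS or MM regression. All function names, the tuning constant 1.345, the cutoff 2.5, and the data are illustrative assumptions.

```python
from statistics import median


def mad_scale(values):
    """Robust scale: median absolute deviation, rescaled for Gaussian data."""
    values = list(values)
    med = median(values)
    return 1.4826 * median(abs(v - med) for v in values)


def weighted_line(x, y, w):
    """Weighted least-squares fit of y = a + b * x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def huber_fit(x, y, c=1.345, n_iter=50):
    """Huber regression via iteratively reweighted least squares (assumed stand-in)."""
    a, b = weighted_line(x, y, [1.0] * len(x))  # start from ordinary least squares
    for _ in range(n_iter):
        res = [yi - a - b * xi for xi, yi in zip(x, y)]
        s = mad_scale(res) or 1e-12  # guard against a zero scale
        # Huber weights: 1 inside the threshold, shrinking outside it.
        w = [1.0 if abs(r) <= c * s else c * s / abs(r) for r in res]
        a, b = weighted_line(x, y, w)
    return a, b


def flag_conditional_outliers(x, y, cutoff=2.5):
    """Indices whose robustly scaled residual exceeds the (assumed) cutoff."""
    a, b = huber_fit(x, y)
    res = [yi - a - b * xi for xi, yi in zip(x, y)]
    s = mad_scale(res) or 1e-12
    return [i for i, r in enumerate(res) if abs(r) / s > cutoff]


# Synthetic data: a linear relation with small deterministic noise,
# plus one observation (index 7) shifted far from its conditional mean.
x = [float(i) for i in range(20)]
y = [2.0 + 0.5 * xi + 0.1 * ((i * 7) % 5 - 2) for i, xi in enumerate(x)]
y[7] += 10.0

flagged = flag_conditional_outliers(x, y)
slope = huber_fit(x, y)[1]
print(flagged, round(slope, 3))
```

The point of the sketch is only that an observation can be unremarkable in each variable taken alone and still be outlying given the predictor; a Huber fit like this one does not protect against bad leverage outliers, which is the case the methodology in the abstract is designed for.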

Contact persons:
Prof. Di Mari, Prof. Drago, Prof. Giarlotta

________________________________________
Category:
Seminars
Publication date:
Thursday, 30 March 2023