On the Consistency of a Random Forest Algorithm in the Presence of Missing Entries

Abstract

This paper tackles the problem of constructing a non-parametric predictor when the latent variables are given with incomplete information. A convenient predictor for this task is the random forest algorithm in conjunction with the so-called CART criterion. The proposed technique enables a partial imputation of the missing values in the data set in a way that suits both a consistent estimator of the regression function and a partial recovery of the missing values. A proof of the consistency of the random forest estimator is given in the case where each latent variable is missing completely at random (MCAR).

This is an original manuscript of an article published by Taylor & Francis in Journal of Nonparametric Statistics on 06 June 2023, available at: http://www.tandfonline.com/doi/full/10.1080/10485252.2023.2219783.

To cite this article:
Irving Gómez-Méndez & Emilien Joly (2023). On the consistency of a random forest algorithm in the presence of missing entries, Journal of Nonparametric Statistics.
