Some Remarks on the Data Imputation Using “missForest” Method

Misztal, Małgorzata

dc.contributor.author	Misztal, Małgorzata
dc.date.accessioned	2015-06-23T12:40:07Z
dc.date.available	2015-06-23T12:40:07Z
dc.date.issued	2013
dc.identifier.issn	0208-6018
dc.identifier.uri	http://hdl.handle.net/11089/10081
dc.description.abstract	Missing data are common in practical applications of statistical methods, and imputation is a general approach to analysing incomplete data sets. Stekhoven and Bühlmann (2012) proposed an iterative imputation method, called “missForest”, based on Random Forests (Breiman 2001) to cope with missing values. This paper gives a short description of “missForest” and compares it with selected missing-data techniques by artificially simulating different proportions and mechanisms of missing data in complete data sets from the UCI repository of machine learning databases.	pl_PL
dc.description.abstract	Stekhoven and Bühlmann (2012) proposed a new iterative imputation method, called “missForest”, based on Breiman's Random Forests method (2001). This article discusses the “missForest” method and compares it with several selected techniques for handling missing data. To this end, a simulation approach was used, generating different proportions and mechanisms of missing data in data sets drawn mainly from the machine learning database repository at the University of California, Irvine.	pl_PL
dc.language.iso	en	pl_PL
dc.publisher	Wydawnictwo Uniwersytetu Łódzkiego	pl_PL
dc.relation.ispartofseries	Acta Universitatis Lodziensis, Folia Oeconomica;285
dc.subject	missing values	pl_PL
dc.subject	single and multiple imputation	pl_PL
dc.subject	random forests	pl_PL
dc.subject	missForest	pl_PL
dc.title	Some Remarks on the Data Imputation Using “missForest” Method	pl_PL
dc.title.alternative	Kilka uwag o imputacji danych z wykorzystaniem metody "missforest"	pl_PL
dc.type	Article	pl_PL
dc.page.number	[169]-179	pl_PL
dc.contributor.authorAffiliation	University of Lodz, Department of Statistical Methods	pl_PL
dc.references	Allison P. D. (2002), Missing data, Series: Quantitative Applications in the Social Sciences 07–136, SAGE Publications, Thousand Oaks, London, New Delhi	pl_PL
dc.references	Blake C., Keogh E., Merz C. J. (1988), UCI Repository of Machine Learning Datasets, Department of Information and Computer Science, University of California, Irvine	pl_PL
dc.references	Breiman L. (2001), Random Forests, “Machine Learning” 45(1): 5–32	pl_PL
dc.references	Little R. J. A., Rubin D. B. (2002), Statistical Analysis with Missing Data, Second Edition, Wiley, New Jersey	pl_PL
dc.references	Oba S., Sato M., Takemasa I., Monden M., Matsubara K., Ishii S. (2003), A Bayesian Missing Value Estimation Method for Gene Expression Profile Data, “Bioinformatics” 19(16): 2088–2096	pl_PL
dc.references	Städler N., Bühlmann P. (2010), Pattern Alternating Maximization Algorithm for High-Dimensional Missing Data, arXiv preprint arXiv:1005.0366	pl_PL
dc.references	Stekhoven D. J., Bühlmann P. (2012), MissForest – Nonparametric Missing Value Imputation for Mixed-Type Data, “Bioinformatics” 28(1): 112–118	pl_PL
dc.references	Troyanskaya O., Cantor M., Sherlock G., Brown P., Hastie T., Tibshirani R., Botstein D., Altman R. (2001), Missing Value Estimation Methods for DNA Microarrays, “Bioinformatics” 17(6): 520–525	pl_PL
dc.references	van Buuren S., Groothuis-Oudshoorn K. (2011), MICE: Multivariate Imputation by Chained Equations in R, “Journal of Statistical Software” 45(3): 1–67	pl_PL

Files in this item

Name: 18-misztal.pdf
Size: 426.6 KB
Format: PDF

This item appears in the following Collection(s)

Acta Universitatis Lodziensis. Folia Oeconomica nr 285/2013 [28]
