dc.contributor.author | Gatnar, Eugeniusz | |
dc.date.accessioned | 2015-04-02T10:38:46Z | |
dc.date.available | 2015-04-02T10:38:46Z | |
dc.date.issued | 2009 | |
dc.identifier.issn | 0208-6018 | |
dc.identifier.uri | http://hdl.handle.net/11089/7664 | |
dc.description.abstract | The multiple-model approach (model aggregation, model fusion) is most commonly used in classification and regression. In this approach, K component (single) models C_1(x), C_2(x), ..., C_K(x) are combined into one global model (ensemble) C^*(x), for example by majority voting:

C^*(x) = \arg\max_y \left\{ \sum_{k=1}^{K} I(C_k(x) = y) \right\} \quad (1)

Tumer and Ghosh (1996) proved that the classification error of the ensemble C^*(x) depends on the diversity of the ensemble members; in other words, the higher the diversity of the component models, the lower the classification error of the combined model. Since several diversity measures for classifier ensembles have been proposed so far, in this paper we compare the ability of selected diversity measures to predict the accuracy of classifier ensembles. | pl_PL |
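A minimal sketch of the majority-voting combination in equation (1), assuming the K component models are available as plain Python callables; the names majority_vote and ensemble are this example's own and do not come from the paper:

    from collections import Counter

    def majority_vote(classifiers, x):
        # Equation (1): C*(x) = arg max_y sum_{k=1}^{K} I(C_k(x) = y)
        votes = Counter(clf(x) for clf in classifiers)
        label, _count = votes.most_common(1)[0]  # the label with the most votes wins
        return label

    # Toy usage: three threshold "classifiers" voting on a scalar input.
    ensemble = [lambda x: int(x > 0), lambda x: int(x > 1), lambda x: int(x > -1)]
    print(majority_vote(ensemble, 0.5))  # two of the three vote 1, so this prints 1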
dc.description.abstract | The multiple-model approach (model aggregation), applied most often in discriminant and regression analysis, consists in combining M component models C_1(x), ..., C_M(x) into one global model C^*(x):

C^*(x) = \arg\max_y \left\{ \sum_{m=1}^{M} I(C_m(x) = y) \right\}

Tumer and Ghosh (1996) proved that the classification error of the aggregated model C^*(x) depends on the degree of similarity (diversity) of the component models. In other words, the most accurate model C^*(x) is composed of models that are as dissimilar to one another as possible, i.e. models that classify the same objects in entirely different ways. Several measures for assessing the similarity (diversity) of component models in the multiple-model approach have been proposed in the literature. The paper discusses the relationship between the known diversity measures and the classification error of the aggregated model. | pl_PL |
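As a hedged illustration of one pairwise diversity measure from the cited literature, the sketch below computes Yule's Q statistic as used by Kuncheva and Whitaker (2003); the function name and toy data are this example's own assumptions:

    def q_statistic(pred_a, pred_b, truth):
        # Yule's Q for a classifier pair (Kuncheva and Whitaker, 2003):
        # Q = (N11*N00 - N01*N10) / (N11*N00 + N01*N10), where N11 counts objects
        # both classifiers get right, N00 both get wrong, N10/N01 exactly one right.
        n11 = n00 = n10 = n01 = 0
        for a, b, y in zip(pred_a, pred_b, truth):
            ca, cb = a == y, b == y
            if ca and cb:
                n11 += 1
            elif ca:
                n10 += 1
            elif cb:
                n01 += 1
            else:
                n00 += 1
        denom = n11 * n00 + n01 * n10
        return (n11 * n00 - n01 * n10) / denom if denom else 0.0

    # Two models that err on *different* objects are highly diverse (Q = -1 here).
    print(q_statistic([1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 1]))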
dc.description.sponsorship | The task "Digitization and making available in the Digital Repository of the University of Łódź the collection of scientific journals published by the University of Łódź", no. 885/P-DUN/2014, was co-financed from the funds of the Ministry of Science and Higher Education (MNiSW) allocated to science-dissemination activities | pl_PL |
dc.language.iso | en | pl_PL |
dc.publisher | Wydawnictwo Uniwersytetu Łódzkiego | pl_PL |
dc.relation.ispartofseries | Acta Universitatis Lodziensis. Folia Oeconomica;225 | |
dc.subject | Multiple-model approach | pl_PL |
dc.subject | Model fusion | pl_PL |
dc.subject | Classifier ensemble | pl_PL |
dc.subject | Diversity measures | pl_PL |
dc.title | Measures of Diversity and the Classification Error in the Multiple-model Approach | pl_PL |
dc.title.alternative | Miary zróżnicowania modeli a błąd klasyfikacji w podejściu wielomodelowym | pl_PL |
dc.type | Article | pl_PL |
dc.page.number | [101]-109 | pl_PL |
dc.contributor.authorAffiliation | Katowice University of Economics, Chair of Statistics | pl_PL |
dc.references | Breiman L. (1996), Bagging predictors, “Machine Learning”, 24, 123-140. | |
dc.references | Breiman L. (1998), Arcing classifiers, “Annals of Statistics”, 26, 801-849. | |
dc.references | Breiman L. (1999), Using adaptive bagging to debias regressions, Technical Report 547, Department of Statistics, University of California, Berkeley. | |
dc.references | Breiman L. (2001), Random forests, “Machine Learning”, 45, 5-32. | |
dc.references | Cunningham P., Carney J. (2000), Diversity versus quality in classification ensembles based on feature selection, [in:] Proceedings of European Conference on Machine Learning, LNCS, vol. 1810, Springer, Berlin, 109-116. | |
dc.references | Dietterich T., Bakiri G. (1995), Solving multiclass learning problems via error-correcting output codes, “Journal of Artificial Intelligence Research”, 2, 263-286. | |
dc.references | Fleiss J.L. (1981), Statistical methods for rates and proportions, John Wiley and Sons, New York. | |
dc.references | Freund Y., Schapire R.E. (1997), A decision-theoretic generalization of on-line learning and an application to boosting, “Journal of Computer and System Sciences”, 55, 119-139. | |
dc.references | Gatnar E. (2001), Nonparametric method for classification and regression, PWN, Warszawa (in Polish). | |
dc.references | Gatnar E. (2005), A diversity measure for tree-based classifier ensembles, [in:] Data analysis and decision support, eds D. Baier, R. Decker, L. Schmidt-Thieme, Springer-Verlag, Heidelberg-Berlin, 30-38. | |
dc.references | Giacinto G., Roli F. (2001), Design of effective neural network ensembles for image classification purposes, “Image and Vision Computing”, 19, 699-707. | |
dc.references | Hansen L.K., Salamon P. (1990), Neural network ensembles, “IEEE Transactions on Pattern Analysis and Machine Intelligence”, 12, 993-1001. | |
dc.references | Ho T.K. (1998), The random subspace method for constructing decision forests, “IEEE Transactions on Pattern Analysis and Machine Intelligence”, 20, 832-844. | |
dc.references | Kuncheva L., Whitaker C., Shipp C.A., Duin R. (2000), Is independence good for combining classifiers?, [in:] Proceedings of the 15th International Conference on Pattern Recognition, Barcelona, Spain, 168-171. | |
dc.references | Kuncheva L., Whitaker C. (2003), Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy, “Machine Learning”, 51, 181-207. | |
dc.references | Margineantu D.D., Dietterich T.G. (1997), Pruning adaptive boosting, [in:] Proceedings of the 14th International Conference on Machine Learning, Morgan Kaufmann, San Mateo, 211-218. | |
dc.references | Oza N.C., Tumer K. (1999), Dimensionality reduction through classifier ensembles, Technical Report NASA-ARC-IC-1999-126, Computational Sciences Division, NASA Ames Research Center. | |
dc.references | Partridge D., Krzanowski W.J. (1997), Software diversity: practical statistics for its measurement and exploitation, “Information and Software Technology”, 39, 707-717. | |
dc.references | Partridge D., Yates W.B. (1996), Engineering multiversion neural-net systems, “Neural Computation”, 8, 869-893. | |
dc.references | Sharkey A., Sharkey N. (1997), Diversity, selection, and ensembles of artificial neural nets, [in:] Neural Networks and their applications, NEURAP-97, 205-212. | |
dc.references | Skalak D.B. (1996), The sources of increased accuracy for two proposed boosting algorithms, [in:] Proceedings of the American Association for Artificial Intelligence AAAI-96, Morgan Kaufmann, San Mateo. | |
dc.references | Tumer K., Ghosh J. (1996), Analysis of decision boundaries in linearly combined neural classifiers, “Pattern Recognition”, 29, 341-348. | |
dc.references | Wolpert D. (1992), Stacked generalization, “Neural Networks”, 5, 241-259. | |