Feature Selection and the Chessboard Problem

Kubus, Mariusz

dc.contributor.author	Kubus, Mariusz
dc.date.accessioned	2015-11-26T15:52:23Z
dc.date.available	2015-11-26T15:52:23Z
dc.date.issued	2015
dc.identifier.issn	0208-6018
dc.identifier.uri	http://hdl.handle.net/11089/14486
dc.description.abstract	Feature selection methods are usually classified into three groups: filters, wrappers, and embedded methods. A second important classification criterion is whether feature relevance is evaluated individually or multivariately. The chessboard problem is an illustrative example in which two variables that have no individual influence on the dependent variable can nevertheless be essential for separating the classes. Classifiers that handle such a data structure well are sensitive to irrelevant variables: their generalization error increases with the number of noisy variables. We discuss feature selection methods in the context of a chessboard-like data structure with numerous irrelevant variables.	pl_PL
dc.description.abstract	The article discusses the search aspect of feature selection methods. It uses the chessboard example known from the literature, in which variables that individually have no discriminative power (their distributions are identical across classes) can span a space in which the classes are well separable. Generalizing this example, a data set with a three-dimensional chessboard structure and noise variables was generated, and feature selection methods were then verified on it. The possibility of using cluster analysis as a tool supporting the discrimination stage was also considered.	pl_PL
dc.language.iso	en	pl_PL
dc.publisher	Wydawnictwo Uniwersytetu Łódzkiego	pl_PL
dc.relation.ispartofseries	Acta Universitatis Lodziensis. Folia Oeconomica;311
dc.subject	chessboard problem	pl_PL
dc.subject	feature selection	pl_PL
dc.subject	feature relevance	pl_PL
dc.subject	problem szachownicy (chessboard problem)	pl_PL
dc.subject	selekcja zmiennych (feature selection)	pl_PL
dc.subject	ważność zmiennych (feature relevance)	pl_PL
dc.title	Feature Selection and the Chessboard Problem	pl_PL
dc.title.alternative	Selekcja zmiennych a problem szachownicy	pl_PL
dc.type	Article	pl_PL
dc.rights.holder	© Copyright by Uniwersytet Łódzki, Łódź 2015	pl_PL
dc.page.number	[17]-25	pl_PL
dc.contributor.authorAffiliation	Department of Mathematics and Applied Computer Science, Opole University of Technology.	pl_PL
dc.identifier.eissn	2353-7663
dc.identifier.doi	10.18778/0208-6018.311.03
dc.relation.volume	1	pl_PL
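
The chessboard structure described in the abstract can be illustrated with a small simulation. The sketch below is not the article's experiment (which, according to the abstract, uses a three-dimensional chessboard); it assumes a two-dimensional 4x4 board, uses absolute Pearson correlation as a stand-in for an individual (univariate) filter, and uses a 1-nearest-neighbour rule as a stand-in for a classifier that copes with the chessboard but is sensitive to noise variables. All function names, sample sizes, and the number of noise variables are illustrative assumptions.

```python
import math
import random


def make_chessboard(n, n_noise, cells=4, seed=0):
    """Draw n points; the class is the parity of the chessboard cell of (x1, x2)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        relevant = [rng.uniform(0, cells), rng.uniform(0, cells)]
        noise = [rng.uniform(0, cells) for _ in range(n_noise)]
        X.append(relevant + noise)
        y.append((int(relevant[0]) + int(relevant[1])) % 2)
    return X, y


def abs_correlation(column, y):
    """Absolute Pearson correlation between one feature and the 0/1 class label."""
    n = len(y)
    mx, my = sum(column) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(column, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in column))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return abs(cov / (sx * sy))


def one_nn_accuracy(X_train, y_train, X_test, y_test, features):
    """Accuracy of a 1-nearest-neighbour rule restricted to the given feature indices."""
    hits = 0
    for x, label in zip(X_test, y_test):
        nearest = min(
            range(len(X_train)),
            key=lambda i: sum((X_train[i][j] - x[j]) ** 2 for j in features),
        )
        hits += y_train[nearest] == label
    return hits / len(y_test)


# Two relevant variables (indices 0 and 1) plus eight noise variables (assumed sizes).
X, y = make_chessboard(n=600, n_noise=8)
X_train, y_train, X_test, y_test = X[:400], y[:400], X[400:], y[400:]

# Individual evaluation: every variable, relevant or not, is scored on its own.
scores = [abs_correlation([row[j] for row in X], y) for j in range(len(X[0]))]

# Multivariate evaluation: the pair (x1, x2) versus the full set with noise variables.
acc_pair = one_nn_accuracy(X_train, y_train, X_test, y_test, features=[0, 1])
acc_all = one_nn_accuracy(X_train, y_train, X_test, y_test, features=range(len(X[0])))

print("individual scores of x1, x2:", scores[:2])
print("1-NN accuracy on (x1, x2):", acc_pair)
print("1-NN accuracy on all variables:", acc_all)
```

In this setup the class is independent of each relevant variable taken alone, so a univariate score cannot distinguish x1 and x2 from the noise variables, while the pair evaluated jointly separates the classes; adding the noise variables to the distance degrades the nearest-neighbour rule.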

Files in this item

Name: 3-Kubus.pdf
Size: 314.1KB
Format: PDF

This item appears in the following collections

Acta Universitatis Lodziensis. Folia Oeconomica nr 311(1)/2015