Near-Boundary Data Selection for Fast Support Vector Machines

Doosung Hwang; Daewon Kim

Published: Mar 1, 2012

Keywords:

Support Vector Machine; Nearest Neighbor Rule; Tomek Link; Data Selection

Doosung Hwang

Department of Computer Science, Dankook University

Daewon Kim

Department of Multimedia Engineering, Dankook University

Abstract

Support Vector Machines (SVMs) have become more popular than other algorithms for pattern classification. The learning phase of an SVM involves finding the subset of informative training examples (i.e., support vectors) that defines the decision boundary. These support vectors tend to lie close to the learned boundary. In terms of the nearest neighbor property, the neighbors of a support vector tend to be more heterogeneous in class label than those of a non-support vector. In this paper, we propose a data selection method based on a geometrical analysis of the relationship between nearest neighbors and boundary examples. We evaluate the proposed method on real-world problems in terms of generalization performance, data reduction rate, training time, and the number of support vectors. The results show that, compared to the standard SVM, the proposed method drastically reduces both training data size and training time without significantly impairing generalization performance.
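
The neighbor-heterogeneity idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a plain k-nearest-neighbor rule with Euclidean distance, and the function name `select_near_boundary`, the parameter `k`, and the toy data are hypothetical choices made only for illustration. A point is kept when at least one of its k nearest neighbors carries a different class label.

```python
import math

def select_near_boundary(X, y, k=2):
    """Return indices of points that have at least one differently
    labelled point among their k nearest neighbours (illustrative rule,
    assumed here; not taken from the paper)."""
    selected = []
    for i, xi in enumerate(X):
        # Distances from point i to every other point, nearest first.
        dists = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )
        neighbour_labels = [y[j] for _, j in dists[:k]]
        # Heterogeneous neighbourhood: some neighbour has another label.
        if any(label != y[i] for label in neighbour_labels):
            selected.append(i)
    return selected

# Toy 1-D data: class 0 at 0..3, class 1 at 4..7.
X = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (5.0,), (6.0,), (7.0,)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

boundary = select_near_boundary(X, y, k=2)
print(boundary)  # [3, 4]
```

Only the two points adjacent to the class boundary survive, so an SVM trained on the selected subset would see far fewer examples. The brute-force neighbor search is quadratic in the number of points; a real implementation would likely use a spatial index.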

How to Cite

Hwang, D., & Kim, D. (2012). Near-Boundary Data Selection for Fast Support Vector Machines. Malaysian Journal of Computer Science, 25(1), 23–37. Retrieved from http://jice.um.edu.my/index.php/MJCS/article/view/6588

Issue

Vol. 25 No. 1 (2012): Malaysian Journal of Computer Science

Section

Articles
