Review on support vector machines based feature selection
WU Qing, FU Yanlin
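Abstract:
Support vector machines (SVMs) are mainly used to solve classification and regression problems. SVM-based feature selection can effectively remove irrelevant and redundant features, so that the model is built on a new, smaller data set, which improves the efficiency and generalization performance of the SVM. This review examines how feature selection methods are categorized from the perspectives of evaluation criterion, search strategy, and supervision information. It then discusses three kinds of SVM-based feature selection methods, namely Wrapper, Embedded, and Filter-Wrapper, and explores future development trends of SVM-based feature selection.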
Keywords: support vector machine; pattern recognition; feature selection
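Foundation: National Natural Science Foundation of China (51875457); Key Research and Development Program of Shaanxi Province (2018GY-018); Xi'an Science and Technology Plan Project (2020KJRC0109)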
Author: WU Qing, FU Yanlin
DOI: 10.13682/j.issn.2095-6533.2020.05.003