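Abstract:
When a support vector machine (SVM) is trained on imbalanced sample data, the optimal separating hyperplane shifts, which reduces the generalization ability of the classification model. To address this problem, a hyperplane correction method based on the average distribution ratio of Fisher within-class scatter is proposed. First, the normal vector of the separating hyperplane is obtained by SVM training on the sample data. Next, the Fisher within-class scatter of each of the two classes is computed along the direction of this normal vector to evaluate how the two classes are distributed. Finally, the position of the optimal separating hyperplane is corrected according to the average distribution ratio, which is derived from the within-class scatter together with the number of samples in each class. Experimental results on benchmark datasets show that the method improves the recognition rate of the minority class when an SVM classification model is applied to imbalanced datasets, and thereby helps improve the generalization ability of the model.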
CLC number: TP391.4
Document code: A
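4 Conclusion
By introducing the concept of within-class scatter from Fisher discriminant analysis, this paper analyzed how the samples of each class are distributed in SVM training, and corrected the optimal separating hyperplane of the SVM according to the average distribution ratio determined by the within-class scatter, in order to improve the generalization ability of the SVM classification model. The method addresses not only the problems that differences in sample counts cause during training, but also the problems caused by uneven sample distributions, so it adapts well to different data. It provides algorithmic support for applying SVM to practical imbalanced-data problems in which the recognition rate of minority-class samples matters most, such as fault diagnosis, health assessment, and credit risk assessment. Future work will attempt to apply the SVM with the corrected optimal separating hyperplane to problems of this kind.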
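The correction summarized above can be illustrated with a minimal sketch. This excerpt does not give the exact formula for the average distribution ratio, so the sketch makes several assumptions that are not taken from the paper: the SVM is linear with normal vector `w` and bias `b` in canonical form (margin planes at decision values +1 and -1), the "average spread" of a class is the square root of its within-class scatter along `w` divided by its sample count, and the boundary is moved between the two margin planes in proportion to these spreads. The function names are hypothetical.

```python
import math

def project(points, w):
    """Project each sample onto the unit vector along the normal w."""
    norm = math.sqrt(sum(c * c for c in w))
    return [sum(xi * wi for xi, wi in zip(x, w)) / norm for x in points]

def within_class_scatter(points, w):
    """Fisher within-class scatter of one class along the direction of w."""
    proj = project(points, w)
    mean = sum(proj) / len(proj)
    return sum((p - mean) ** 2 for p in proj)

def average_spread(points, w):
    # ASSUMPTION: scatter is averaged over the sample count so that a class
    # with more samples is not treated as "wider" merely because it is larger.
    return math.sqrt(within_class_scatter(points, w) / len(points))

def corrected_bias(w, b, pos, neg):
    """Return a shifted bias b' for the decision function w.x + b'.

    ASSUMPTION: the new boundary sits at decision value f0 between the
    margin planes f = +1 and f = -1, with (1 - f0) : (f0 + 1) equal to
    spread(pos) : spread(neg). This is an illustrative rule, not the
    paper's formula.
    """
    s_pos = average_spread(pos, w)
    s_neg = average_spread(neg, w)
    total = s_pos + s_neg
    if total == 0.0:
        return b  # both classes collapse to a point along w; keep the boundary
    f0 = (s_neg - s_pos) / total
    return b - f0  # w.x + b' >= 0 is equivalent to w.x + b >= f0

# Toy data: the positive class is widely spread, the negative class is compact.
w, b = (1.0, 0.0), 0.0
pos = [(1.0, 0.0), (3.0, 0.0), (5.0, 0.0)]
neg = [(-1.0, 0.0), (-1.5, 0.0), (-2.0, 0.0)]
b_new = corrected_bias(w, b, pos, neg)
print(b_new)
```

In this toy case the positive class is four times as spread out as the negative class along `w`, so the bias moves from 0 to approximately 0.6 and the boundary shifts toward the compact class. When the two spreads are equal, the bias is left unchanged.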