
Android Malware Detection Based on a Feature Generation Method

Information Technology and Network Security, 2020, Issue 11

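Feng Yao, Wang Jinshuang, Zhang Xuetao

Institute of Command and Control Engineering, Army Engineering University, Nanjing, Jiangsu 210001, China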

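Abstract: To address the heavy reliance of traditional feature engineering on expert experience and manual effort, this paper studies an Android malware detection method based on feature generation. Building on ExploreKit, the automated feature generation method from UC Berkeley, a large number of candidate features are computed from the original features; the performance of each candidate is predicted from its meta-features and used to evaluate and rank the candidates; and a greedy algorithm then selects the new features that improve model performance. Multiple kinds of features, including sensitive APIs and dangerous permissions, are extracted from APKs, filtered by information gain, and fed into the feature generation framework, with C4.5, SVM, and random forest serving as classification models. Experiments show that the method reduces the error rate by 24.6% on average, reaches an accuracy of 96.5%, and achieves an area under the curve (AUC) of 0.99.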

Keywords: malware detection; feature engineering; feature generation; ExploreKit

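CLC number: TP393
Document code: A
DOI: 10.19358/j.issn.2096-5133.2020.11.002
Citation format: Feng Yao, Wang Jinshuang, Zhang Xuetao. Android malware detection based on feature generation method[J]. Information Technology and Network Security, 2020, 39(11): 8-13.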

Android malware detection based on feature generation method

Feng Yao, Wang Jinshuang, Zhang Xuetao

Institute of Command Control Engineering, Army Engineering University, Nanjing 210001, China

Abstract: To reduce the expert experience and manpower required by traditional feature engineering, this paper proposes an Android malware detection method based on feature generation. ExploreKit is adopted as the automated feature generation method, which improves the performance of Android malware detection models by generating and selecting new features. A large number of candidate features are obtained by computing on the initial features. The performance of each candidate feature is predicted and evaluated according to its meta-features, and a greedy algorithm selects the new features that improve model performance. The time spent on feature generation is reduced by filtering the initial features input to ExploreKit and limiting the number of generated features. Many features, such as sensitive APIs and dangerous permissions, were extracted from APKs. With C4.5, SVM, and random forest as the classification models, the experiments show that the error rate of Android malware detection decreases by 24.6% on average, the accuracy reaches 96.5%, and the AUC reaches 0.99.

Key words: malware detection; feature engineering; feature generation; ExploreKit

0 Introduction

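Machine learning is widely used in Android malware detection, and feature engineering is a key step in machine-learning-based Android malware detection. The features currently in use are mainly static features and dynamic features. However, the feature engineering process depends heavily on expert experience, and the candidate feature set can only be settled after repeated trial and tuning.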

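To address the heavy demand of traditional feature engineering for expert experience and manual effort, this paper proposes an Android malware detection method based on feature generation. The method extracts multiple categories of features, performs automated feature generation based on UC Berkeley's ExploreKit^[1], and selects the new features that improve model performance, achieving good detection performance.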
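The generate, rank, and greedily select loop described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and not ExploreKit itself. Several things here are assumptions made for illustration: the feature names (`perm_sms`, `api_send`, `perm_camera`), the AND/XOR combination operators, the parameters `top_k` and `max_new`, and the toy data. Two deliberate simplifications are also swapped in: candidates are ranked by information gain rather than by ExploreKit's meta-feature-based performance predictor, and the model is a OneR-style single-feature classifier scored on training error rather than C4.5, SVM, or random forest.

```python
import math
from itertools import combinations


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / n
        result -= p * math.log2(p)
    return result


def information_gain(column, labels):
    """H(labels) - H(labels | column) for one discrete feature column."""
    n = len(labels)
    conditional = 0.0
    for value in set(column):
        subset = [y for x, y in zip(column, labels) if x == value]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional


def generate_candidates(features):
    """Combine every pair of binary features with AND and XOR (illustrative operators)."""
    candidates = {}
    for (name_a, col_a), (name_b, col_b) in combinations(features.items(), 2):
        candidates[f"{name_a}_AND_{name_b}"] = [a & b for a, b in zip(col_a, col_b)]
        candidates[f"{name_a}_XOR_{name_b}"] = [a ^ b for a, b in zip(col_a, col_b)]
    return candidates


def one_rule_error(features, labels):
    """Training error of a OneR-style stand-in classifier: the best single
    feature, predicting the majority label for each of its values."""
    best = 1.0
    for column in features.values():
        wrong = 0
        for value in set(column):
            subset = [y for x, y in zip(column, labels) if x == value]
            wrong += len(subset) - max(subset.count(v) for v in set(subset))
        best = min(best, wrong / len(labels))
    return best


def greedy_feature_generation(features, labels, top_k=2, max_new=2):
    """Filter originals by information gain, generate candidates, then
    greedily keep each candidate that lowers the stand-in model's error."""
    ranked = sorted(features, key=lambda f: information_gain(features[f], labels),
                    reverse=True)
    selected = {name: features[name] for name in ranked[:top_k]}
    candidates = generate_candidates(selected)
    # Assumption: rank candidates by information gain instead of a
    # meta-feature-based predictor.
    order = sorted(candidates, key=lambda c: information_gain(candidates[c], labels),
                   reverse=True)
    best_error = one_rule_error(selected, labels)
    added = []
    for name in order:
        if len(added) >= max_new:
            break
        trial = dict(selected)
        trial[name] = candidates[name]
        error = one_rule_error(trial, labels)
        if error < best_error:  # keep only candidates that improve the model
            selected, best_error = trial, error
            added.append(name)
    return selected, added, best_error


# Toy data (hypothetical): an app is labeled malicious (1) only when it both
# holds the SMS permission and calls the sensitive send API.
demo_features = {
    "perm_sms":    [0, 0, 0, 0, 1, 1, 1, 1],
    "api_send":    [0, 0, 1, 1, 0, 0, 1, 1],
    "perm_camera": [0, 1, 0, 1, 0, 1, 0, 1],
}
demo_labels = [0, 0, 0, 0, 0, 0, 1, 1]

selected, added, error = greedy_feature_generation(demo_features, demo_labels)
print(added, error)
```

On this toy data the information gain filter drops `perm_camera`, and the only generated feature that lowers the stand-in model's error is the conjunction of the two remaining features. Limiting `top_k` and `max_new` mirrors, in a very reduced form, the idea of bounding the input features and the number of generated features to keep generation time manageable.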

For the full text of this article, please download: http://www.chinaaet.com/resource/share/2000003054

