2021 7th IEEE Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS) | 2021
Webshell Detection Technology Based on Deep Learning
Abstract
In this paper, we use Long Short-Term Memory (LSTM) recurrent neural networks, a deep learning technique, to detect Webshells: trojan scripts written by attackers that pose serious security risks to web servers. We apply deep learning to automatically extract features from the opcode sequences of malicious PHP code and to train a classification model. Specifically, we compile PHP files into opcode sequences and build an LSTM-based Webshell detection model that includes word embedding conversion and, in some variants, a multi-layer LSTM structure. The trained single-layer model achieves over 95% accuracy in detecting Webshells. Adding layers, however, reduces accuracy: the double-layer model reaches 93%, and the triple-layer model falls below 90%. More layers are therefore not always better, probably because of vanishing gradients or overfitting in the deeper models. These results suggest that a single-layer model may perform best for Webshell detection.
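To make the preprocessing step concrete, the following is a minimal sketch of how opcode sequences might be turned into fixed-length integer sequences before word embedding. It is not the authors' implementation: the helper names (build_vocab, encode), the padding and unknown tokens, the maximum length, and the toy opcode sequences are all assumptions chosen for illustration.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # hypothetical special tokens, not from the paper


def build_vocab(sequences, min_count=1):
    """Map each opcode name to an integer id; 0 is padding, 1 is unknown."""
    counts = Counter(op for seq in sequences for op in seq)
    vocab = {PAD: 0, UNK: 1}
    for op, n in sorted(counts.items()):  # sorted so ids are deterministic
        if n >= min_count:
            vocab[op] = len(vocab)
    return vocab


def encode(seq, vocab, max_len):
    """Convert opcode names to ids, truncating or right-padding to max_len."""
    ids = [vocab.get(op, vocab[UNK]) for op in seq[:max_len]]
    return ids + [vocab[PAD]] * (max_len - len(ids))


# Toy opcode sequences (illustrative only, not taken from the paper's dataset).
samples = [
    "INIT_FCALL SEND_VAR DO_ICALL RETURN".split(),
    "FETCH_R FETCH_DIM_R INCLUDE_OR_EVAL RETURN".split(),
]
vocab = build_vocab(samples)
encoded = [encode(s, vocab, max_len=6) for s in samples]
```

In a full pipeline, integer sequences of this kind would typically be passed to an embedding layer, whose output feeds the LSTM and a final binary classifier; those components are omitted here because the abstract does not specify their configuration.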