Shengfeng Tian | Researchain

Publication

Featured research published by Shengfeng Tian.

Expert Systems With Applications | 2009

Feature selection for text classification with Naïve Bayes

Jingnian Chen; Houkuan Huang; Shengfeng Tian; Youli Qu

As an important preprocessing technology in text classification, feature selection can improve the scalability, efficiency and accuracy of a text classifier. In general, a good feature selection method should consider domain and algorithm characteristics. Because the Naive Bayesian classifier is simple, efficient, and highly sensitive to feature selection, research on feature selection designed specifically for it is significant. This paper presents two feature evaluation metrics for the Naive Bayesian classifier applied to multi-class text datasets: Multi-class Odds Ratio (MOR) and Class Discriminating Measure (CDM). Text classification experiments with Naive Bayesian classifiers were carried out on two multi-class text collections. The results indicate that CDM and MOR select features clearly better than other feature selection approaches.

European Symposium on Research in Computer Security | 2008

Online Risk Assessment of Intrusion Scenarios Using D-S Evidence Theory

C. P. Mu; Xiangjun Li; Houkuan Huang; Shengfeng Tian

In this paper, an online risk assessment model based on D-S evidence theory is presented. The model can quantify the risk caused by an intrusion scenario in real time and provide an objective evaluation of the target security state. The results of the online risk assessment show a clear and concise picture of both the intrusion progress and the target security state. The model makes full use of available information from both IDS alerts and protected targets. As a result, it can deal with uncertainty and subjectivity very well in its evaluation process. In IDAM&IRS, the model serves as the foundation for intrusion response decision-making.
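The abstract does not say how evidence is fused, so the following is only a minimal sketch of Dempster's rule of combination, the standard fusion step in D-S evidence theory. The two-hypothesis frame (`attack`, `safe`), the mass values, and the names `combine`, `alert_a`, and `alert_b` are illustrative assumptions, not taken from the paper.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1],
    and the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # mass landing on contradictory pairs
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize by the mass that was not lost to conflict.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


# Hypothetical frame of discernment: the target is under attack or safe.
ATTACK = frozenset({"attack"})
SAFE = frozenset({"safe"})
EITHER = ATTACK | SAFE  # ignorance: mass not committed to one hypothesis

# Made-up evidence from two alerts.
alert_a = {ATTACK: 0.6, EITHER: 0.4}
alert_b = {ATTACK: 0.7, SAFE: 0.1, EITHER: 0.2}
fused = combine(alert_a, alert_b)
```

In this toy case both sources lean towards `attack`, so the fused belief in `attack` ends up higher than either input mass, while the leftover ignorance mass on `EITHER` shrinks.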

Expert Systems With Applications | 2010

An ontology-based intrusion alerts correlation system

Wan Li; Shengfeng Tian

Alert correlation techniques effectively improve the quality of alerts reported by intrusion detection systems, and are sufficient to support rapid identification of ongoing attacks or to predict an intruder's next likely goal. In our previous work, an alert correlation approach based on our XSWRL ontology was proposed. This paper focuses on how to develop the intrusion alert correlation system according to that approach. First, the multi-agent system architecture consisting of agents and sensors is shown. The sensors collect security-relevant information, and the agents process the information. Then we present each module of the system in detail. The State Sensor collects information about the security state, and the Local State Agent and Center State Agent preprocess the security state information and convert it to ontology. The Attack Sensor collects information about attacks, and the Local Alert Agent and Center Alert Agent preprocess the alert information and convert it to ontology. The Attack Correlator correlates the attacks and outputs the attack sessions.

Knowledge-Based Systems | 2008

A selective Bayes Classifier for classifying incomplete data based on gain ratio

Jingnian Chen; Houkuan Huang; Fengzhan Tian; Shengfeng Tian

Real data sets are often incomplete for various reasons. Although numerous classification algorithms have been proposed, most of them deal with complete data, so methods of constructing classifiers for incomplete data deserve more attention. By analyzing the main methods of processing incomplete data for classification, this paper presents a selective Bayes Classifier for classifying incomplete data with a simpler formula for computing gain ratio. The proposed algorithm needs none of the assumptions about data sets that previous methods of processing incomplete data in classification require. Experiments on 12 benchmark incomplete data sets show that this method can greatly improve the accuracy of classification. Furthermore, it can sharply reduce the number of attributes and so greatly simplify the data sets and classifiers.
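For orientation, here is a minimal sketch of the textbook gain ratio (information gain divided by split information) on complete, discrete data. It is not the paper's simplified formula for incomplete data, which the abstract does not give. The function names and toy columns are illustrative assumptions.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(attribute, labels):
    """Textbook gain ratio of one discrete attribute with respect to the class."""
    total = len(labels)
    # Entropy of the class that remains after splitting on the attribute.
    conditional = 0.0
    for value in set(attribute):
        subset = [y for x, y in zip(attribute, labels) if x == value]
        conditional += len(subset) / total * entropy(subset)
    info_gain = entropy(labels) - conditional
    split_info = entropy(attribute)
    # A constant attribute carries no split information; score it as 0.
    return info_gain / split_info if split_info > 0 else 0.0


# Toy data: the first attribute determines the class, the second does not.
labels = [0, 0, 1, 1]
informative = ["a", "a", "b", "b"]
uninformative = ["a", "b", "a", "b"]
```

A selective classifier in this spirit would rank attributes by such a score and keep only the top-ranked ones; how the paper handles missing values in the computation is not covered by this sketch.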

Expert Systems With Applications | 2010

Feature selection for SVM via optimization of kernel polarization with Gaussian ARD kernels

Tinghua Wang; Houkuan Huang; Shengfeng Tian; Jianfeng Xu

Feature selection aims at determining the subset of available features that is most discriminative and informative for data analysis. This paper presents an effective feature selection method for support vector machines (SVMs). Unlike the traditional combinatorial search approach, feature selection is translated into SVM model selection, which has been well studied. In more detail, the basic idea of this method is to tune the hyperparameters of the Gaussian Automatic Relevance Determination (ARD) kernels via optimization of kernel polarization, and then to rank all features in decreasing order of importance so that the more relevant features can be identified. We test the proposed method on several UCI machine learning benchmark examples and show that it can dramatically reduce the number of features and that it outperforms SVMs trained on features selected by correlation coefficient and SVMs trained on all features.
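The idea can be illustrated with a small sketch. It assumes one common parameterization of the Gaussian ARD kernel, k(x, z) = exp(-Σ_k θ_k (x_k - z_k)²), and the usual definition of kernel polarization, Σ_i Σ_j y_i y_j k(x_i, x_j) with labels in {+1, -1}. The paper's exact scaling and its optimizer are not stated in the abstract, and the toy data and function names are hypothetical.

```python
from math import exp


def ard_kernel(x, z, theta):
    """Gaussian ARD kernel: one non-negative scale parameter per feature."""
    return exp(-sum(t * (a - b) ** 2 for t, a, b in zip(theta, x, z)))


def polarization(X, y, theta):
    """Kernel polarization: sum over all ordered pairs of y_i * y_j * k(x_i, x_j)."""
    return sum(
        yi * yj * ard_kernel(xi, xj, theta)
        for xi, yi in zip(X, y)
        for xj, yj in zip(X, y)
    )


# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
y = [1, 1, -1, -1]

weight_on_signal = polarization(X, y, theta=(1.0, 0.0))
weight_on_noise = polarization(X, y, theta=(0.0, 1.0))
```

Putting the scale on the informative feature gives the larger polarization here, which is why maximizing polarization over `theta` and then sorting features by their tuned scale can serve as a relevance ranking.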

Neurocomputing | 2007

Sequence-similarity kernels for SVMs to detect anomalies in system calls

Shengfeng Tian; Shaomin Mu; Chuanhuan Yin

In intrusion detection systems (IDSs), short sequences of system calls executed by running programs can be used as evidence to detect anomalies. In this paper, one-class support vector machines (SVMs) using sequence-similarity kernels are adopted as the anomaly detectors. An edit distance-based kernel and a common subsequence-based kernel are proposed to utilize the sequence information in the detection. Algorithms for efficient computation of the kernels are derived with the techniques of dynamic programming and bit-parallelism. The experimental results indicate that the proposed kernels can significantly outperform the standard RBF kernel.
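As a rough illustration of the dynamic-programming ingredient, the sketch below computes the Levenshtein edit distance between two system-call sequences and turns it into a similarity score with exp(-γ·d). That exponential form, the `gamma` value, and the example traces are assumptions: the abstract does not give the paper's kernel definition, and a similarity built this way is not guaranteed to be positive semidefinite.

```python
from math import exp


def edit_distance(s, t):
    """Levenshtein distance via dynamic programming, one row at a time."""
    previous = list(range(len(t) + 1))  # distance from the empty prefix of s
    for i, a in enumerate(s, start=1):
        current = [i]
        for j, b in enumerate(t, start=1):
            current.append(min(
                previous[j] + 1,             # delete a from s
                current[j - 1] + 1,          # insert b into s
                previous[j - 1] + (a != b),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def similarity(s, t, gamma=0.5):
    """Hypothetical similarity score: decays with the edit distance."""
    return exp(-gamma * edit_distance(s, t))


# Made-up short traces of system calls.
normal = ("open", "read", "write", "close")
observed = ("open", "read", "close")
score = similarity(normal, observed)
```

The row-by-row table needs time proportional to the product of the two sequence lengths; the bit-parallel techniques mentioned in the abstract are a way to speed up this kind of computation and are not shown here.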

International Conference on Natural Computation | 2005

Applying genetic programming to evolve learned rules for network anomaly detection

Chuanhuan Yin; Shengfeng Tian; Houkuan Huang; Jun He

The DARPA/MIT Lincoln Laboratory off-line intrusion detection evaluation data set is the most widely used public benchmark for testing intrusion detection systems. However, the presence of simulation artifact attributes makes many attacks in this data set easy to detect. To eliminate their influence on intrusion detection, we simply omit these attributes in both training and testing. We also present a GP-based rule learning approach for detecting network attacks. GP is used to evolve new rules from the initial learned rules through genetic operations. Our results show that the GP-based rule learning approach outperforms the original rule learning algorithm, detecting 84 of 148 attacks at 100 false alarms despite the absence of several simulation artifact attributes.

Neurocomputing | 2008

High-order Markov kernels for intrusion detection

Chuanhuan Yin; Shengfeng Tian; Shaomin Mu

In intrusion detection systems, sequences of system calls executed by running programs can be used as evidence to detect anomalies. A Markov chain is often adopted as the model in such detection systems. A high-order Markov chain model is well suited to detection, but as the order of the chain increases, the number of model parameters grows exponentially and rapidly becomes too large to be estimated efficiently. In this paper, one-class support vector machines (SVMs) using high-order Markov kernels are adopted as the anomaly detectors. This approach solves the problem of the high-dimensional parameter space. Furthermore, a fast algorithm based on a suffix tree is presented for computing Markov kernels in linear time. Experimental results show that the SVM with Markov kernels can produce good detection performance at low computational cost.

IEEE International Conference on Fuzzy Systems | 2002

An efficient optimality test for the fuzzy c-means algorithm

Jian Yu; Houkuan Huang; Shengfeng Tian

Bezdek et al. (1987) proved that the fuzzy c-means algorithm (FCM) converges to either a local minimum or a saddle point. However, there is no easy way to judge whether a solution of the FCM is a local minimum. In this paper, the Hessian matrix of a reduced objective function of the FCM is obtained and analyzed. Based on this study, a new optimality test for fixed points of the FCM is given, and its efficacy is verified by the examples in the paper. Moreover, a new stopping criterion for the FCM is also proposed.
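For readers unfamiliar with the FCM, the sketch below alternates the two standard updates (memberships, then centers) on one-dimensional data. It does not implement the Hessian-based optimality test or the stopping criterion proposed in the paper. The fixed iteration count, the fuzzifier `m=2.0`, the toy data, and the function name `fcm` are illustrative assumptions.

```python
def fcm(points, centers, m=2.0, iterations=50):
    """Alternate the standard FCM membership and center updates on 1-D data."""
    memberships = []
    for _ in range(iterations):
        memberships = []
        for x in points:
            distances = [abs(x - v) for v in centers]
            if 0.0 in distances:
                # A point sitting exactly on a center belongs fully to it.
                hit = distances.index(0.0)
                row = [1.0 if i == hit else 0.0 for i in range(len(centers))]
            else:
                power = 2.0 / (m - 1.0)
                row = [
                    1.0 / sum((d_i / d_j) ** power for d_j in distances)
                    for d_i in distances
                ]
            memberships.append(row)
        # Each center becomes the mean of the points weighted by membership^m.
        centers = [
            sum(u[i] ** m * x for u, x in zip(memberships, points))
            / sum(u[i] ** m for u in memberships)
            for i in range(len(centers))
        ]
    # The memberships returned are those computed in the final iteration,
    # that is, just before the last center update.
    return centers, memberships


# Toy data with two obvious groups and deliberately poor starting centers.
points = [0.0, 0.1, 0.2, 4.9, 5.0, 5.1]
centers, memberships = fcm(points, centers=[1.0, 4.0])
```

The loop stops after a fixed number of iterations and reaches a fixed point of the updates; deciding whether such a fixed point is a local minimum rather than a saddle point is the question the paper's optimality test addresses.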

Computational Intelligence and Security | 2005

Intrusion detection alert verification based on multi-level fuzzy comprehensive evaluation

Chengpo Mu; Houkuan Huang; Shengfeng Tian

Alert verification is a process that compares the information referred to by an alert with the configuration and topology information of its target system in order to determine whether the alert is relevant to that system. It can reduce false positive alerts and irrelevant alerts. The paper presents an alert verification approach based on multi-level fuzzy comprehensive evaluation. Our experiments show that it is effective in reducing false and irrelevant alerts. The algorithm can deal with uncertainty better than other alert verification approaches. The relevance score vectors obtained from the algorithm facilitate the formulation of fine-grained, flexible security policies and further alert processing.
