Encroachment into the World of Intrusion Detection Systems: Guidelines to Use Machine Learning for Anomaly Based Network Intrusion Detection
517: Information Assurance (IA) and the Secure Development Lifecycle (SDL)
Instructor: Marc J Dupis
An intrusion detection system (IDS) is software or hardware that analyzes a host system or a network for abnormal activity or rule violations. Any identified abnormal activity or rule violation is either reported to a system administrator or logged as an intrusion attempt, and the system administrator then takes the appropriate action. The techniques commonly used for network intrusion detection are signature-based detection and anomaly-based detection. Anomaly-based intrusion detection is a preferred and widely used technique. This paper discusses the use of anomaly-based intrusion detection systems and guidelines for using machine learning techniques in intrusion detection. A strength of machine learning techniques is that, in the training phase, they learn to recognize activity that is similar to something seen in the past. In this paper, I study the use of the anomaly-based technique, the different machine learning methods used in intrusion detection systems, and the problems commonly faced when using machine learning in intrusion detection. The study helps to identify principles and guidelines for quality research, development, and implementation of anomaly-based machine learning intrusion detection systems.
Keywords: intrusion detection system, machine learning, anomaly-based intrusion detection
Many studies have surveyed intrusion detection methods and their effective use, but very few focus on practical guidelines for using them in an operational environment, especially for anomaly-based intrusion detection. There is therefore a need to study and analyze existing and newly emerging methods in anomaly-based network intrusion detection. Throughout the discussion, I focus on anomaly-based network intrusion detection systems.
Intrusion detection system. An intrusion detection system (IDS) is software or hardware that analyzes a host system or a network for abnormal activity or rule violations. Any identified abnormal activity or rule violation is either reported to a system administrator or logged as an intrusion attempt. Compared to a firewall, an intrusion detection system is more dynamic in detecting malicious attacks. Figure 1 shows the block diagram of a functional IDS model. The IDS software or hardware monitors network traffic or a host system. Its analysis engine uses a knowledge base to identify suspicious events and generates an alert in response.
Figure 1: Functional block diagram of an intrusion detection system
The current approach to detect intrusion. Three main intrusion detection techniques are in use.
Signature or misuse based detection technique. This detection technique is based on previously stored threat information. For example, most antivirus software uses predefined attack definitions, which allow it to recognize the same virus when it attacks again in the future.
Anomaly detection technique. An anomaly-based detection system creates a baseline for network, system, or program behavior. Any event that deviates from the baseline is characterized as a possible attack on the system or network.
Hybrid detection technique. This technique uses both signature-based and anomaly-based detection techniques to detect malicious attacks.
Flaws and general challenges in intrusion detection.
1. In today’s information era, a vast amount of information is generated every minute. With emerging IoT environments, we face cybersecurity challenges like never before. Handling such large and complex volumes of data securely in a highly connected world is difficult.
2. An anomaly detection system learns behavior without manual intervention, so a system that learns malicious behavior as normal behavior can itself become a threat.
3. In signature-based detection techniques, defining signatures and updating systems is costly. The system also remains susceptible to new attacks and to attacks whose signatures vary slightly from known ones.
4. A high volume of false positive reports can lead to high costs, financial and otherwise.
II Machine learning and data mining in intrusion detection system.
Machine learning can be defined as the ability of a machine to learn from events and improve its performance on an assigned task, i.e., learning from experience and applying it in a real-world scenario. Machine learning systems change their behavior based on newly learned information. Machine learning uses different algorithms, supervised or unsupervised, both of which build on the basics of classification. The efficiency of the classification contributes to the performance of the system. An anomaly-based detection system creates a baseline for network, host system, or program behavior. Any event that deviates from the baseline is characterized as a possible attack on the system or network, and an alert is generated. The basic assumption underlying any anomaly detection system, that malicious activity exhibits characteristics not observed for normal usage, was first presented by Denning in her seminal work on the host-based IDES system (D. E. Denning, 1987).
“The observation that machine learning works much better for such true classification problems then leads to the conclusion that anomaly detection is, in fact, better suited for finding variations of known attacks, rather than previously unknown malicious activity” (Robin Sommer, Vern Paxson, 2010). In such an environment, one can train the model with examples of the attacks as they are known and with usual background network traffic, and so reach a much more dependable decision process (Robin Sommer, Vern Paxson, 2010).
Anomaly detection using machine learning methods for network intrusion detection.
Statistical anomaly detection. In this method, the system monitors network activity and learns the pattern of the network traffic. The more stable the traffic pattern, the more accurate the result. The system applies a statistical algorithm to the monitored information: it inspects each packet on the network, compares it with the stored pattern, and computes an anomaly score. If the anomaly score exceeds a certain threshold value, it generates an alert.
The benefit of a statistical intrusion detection system is its capability to detect previously unseen (zero-day) attacks, because it does not require prior knowledge of the attack. The disadvantage of this technique is that an attacker can gradually train the system to accept anomalous behavior as normal behavior.
Also, the assumption that network traffic is stable is hard to satisfy in real-world scenarios, which may reduce the system’s efficiency in detecting malicious attacks or increase false positives.
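The baseline-and-threshold idea described above can be illustrated with a minimal sketch. It assumes a single traffic feature (packets per second), a z-score as the anomaly score, and a threshold of 3 standard deviations; the function names and sample values are hypothetical and are not taken from any particular IDS.

```python
import statistics

def fit_baseline(samples):
    """Learn mean and standard deviation from traffic assumed to be normal."""
    return statistics.mean(samples), statistics.stdev(samples)

def anomaly_score(value, baseline):
    """Absolute z-score: distance from the baseline mean in standard deviations."""
    mean, stdev = baseline
    if stdev == 0:
        # Perfectly constant baseline: any deviation at all is maximally unusual.
        return 0.0 if value == mean else float("inf")
    return abs(value - mean) / stdev

def is_anomalous(value, baseline, threshold=3.0):
    """Raise an alert when the score exceeds the (assumed) threshold."""
    return anomaly_score(value, baseline) > threshold

# Hypothetical packets-per-second counts from a quiet training window.
training = [100, 104, 98, 101, 97, 103, 99, 102]
baseline = fit_baseline(training)

# Score three new observations and keep the ones that trigger an alert.
alerts = [v for v in [101, 250, 96] if is_anomalous(v, baseline)]
```

The sketch also shows the weakness noted above: if an attacker slowly inflates the training window, the learned mean and deviation grow with it and the same burst no longer crosses the threshold.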
Bayesian network. Heckerman stated that “A Bayesian network is a model that encodes probabilistic relationships among variables of interest. This technique is generally used for intrusion detection in combination with statistical schemes, a procedure that yields several advantages (Heckerman D. A, 1995) including the capability of encoding interdependencies between variables, and of predicting events, as well as the ability to incorporate both prior knowledge and data”. Yet, as mentioned by Kruegel C., Mutz D., Robertson W., and Valeur F, (2003), “a serious disadvantage of using Bayesian networks is that their results are similar to those derived from threshold-based systems, while a considerably higher computational effort is required”.
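To make the combination of prior knowledge and observed data concrete, the following is a minimal sketch of a three-node Bayesian network in which an unobserved Attack node influences two observed nodes, PortScan and FailedLogin. The structure and every probability in it are assumptions chosen for illustration, not values from the cited work.

```python
# Hypothetical conditional probability tables for the network
# Attack -> PortScan and Attack -> FailedLogin.
P_ATTACK = 0.01                          # prior knowledge: P(Attack)
P_SCAN = {True: 0.7, False: 0.05}        # P(PortScan = yes | Attack)
P_LOGIN = {True: 0.6, False: 0.1}        # P(FailedLogin = yes | Attack)

def posterior_attack(scan, failed_login):
    """P(Attack | evidence), computed by enumerating both values of Attack."""
    def joint(attack):
        prior = P_ATTACK if attack else 1 - P_ATTACK
        p_scan = P_SCAN[attack] if scan else 1 - P_SCAN[attack]
        p_login = P_LOGIN[attack] if failed_login else 1 - P_LOGIN[attack]
        return prior * p_scan * p_login
    return joint(True) / (joint(True) + joint(False))

both_seen = posterior_attack(scan=True, failed_login=True)
none_seen = posterior_attack(scan=False, failed_login=False)
```

With these assumed numbers, observing both indicators raises the probability of an attack from the 1% prior to roughly 46%, while observing neither lowers it well below the prior.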
Principal component analysis (PCA). This is another statistical method used for intrusion detection in high-speed and distributed networks. Principal component analysis is widely used to deal with the challenge of complex, high-dimensional data sets in large networks. M.-L. Shyu, S.-C. Chen, K. Sarinnapakorn, and L. Chang (2003) proposed an anomaly detection scheme in which this technique is used to reduce the dimensionality of the inspection data and to build a classifier that is a function of the principal components.
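As a simplified sketch of the idea, and not the classifier of Shyu et al., the code below finds only the first principal component of some normal traffic (using power iteration) and scores a new point by its distance from that component. The two features, the sample values, and all function names are assumptions made for illustration.

```python
import math

def mean_center(rows):
    """Subtract the per-feature mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return means, [[r[j] - means[j] for j in range(d)] for r in rows]

def first_component(centered, iterations=100):
    """Approximate the leading principal component by power iteration."""
    d = len(centered[0])
    # Scatter matrix; its scale does not change the direction of the component.
    cov = [[sum(r[i] * r[j] for r in centered) for j in range(d)] for i in range(d)]
    vec = [1.0] * d
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec

def residual(row, means, component):
    """Distance between a point and its projection onto the component."""
    x = [row[j] - means[j] for j in range(len(row))]
    proj = sum(a * b for a, b in zip(x, component))
    return math.sqrt(sum((a - proj * b) ** 2 for a, b in zip(x, component)))

# Hypothetical normal traffic: (packets sent, kilobytes sent) per connection.
TRAINING = [(10, 20), (20, 41), (30, 59), (40, 80), (50, 100)]
means, centered = mean_center(TRAINING)
component = first_component(centered)

normal_residual = residual((25, 50), means, component)  # follows the usual ratio
odd_residual = residual((25, 5), means, component)      # breaks the usual ratio
```

A point that respects the learned relationship between the features has a residual near zero, while a point that violates it has a large one; a threshold on the residual then plays the role of the anomaly threshold.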
Markov models. Markov models have also been widely used for anomaly-based network intrusion detection. In network intrusion detection, the inspection of network packets has led to the use of Markov models in some cases (Yeung DY, Ding Y, 2003).
In several situations, the model obtained for the target machine has given effective results, whereas in Bayesian networks the results depend on the behavior of the target system.
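A first-order Markov chain over observed events is one simple way to sketch this approach. The example below assumes sequences of TCP-style event labels, estimates transition probabilities from normal sequences with Laplace smoothing, and scores a new sequence by its average negative log-probability. The event names and helper functions are hypothetical.

```python
import math

def train_transitions(sequences):
    """Count how often each event is followed by each other event."""
    counts = {}
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts.setdefault(a, {})
            counts[a][b] = counts[a].get(b, 0) + 1
    return counts

def transition_prob(counts, a, b, vocab_size, alpha=1.0):
    """Laplace-smoothed estimate of P(b | a), so unseen transitions are not zero."""
    row = counts.get(a, {})
    return (row.get(b, 0) + alpha) / (sum(row.values()) + alpha * vocab_size)

def avg_surprise(counts, seq, vocab_size):
    """Mean negative log-probability per transition; higher means less familiar."""
    steps = list(zip(seq, seq[1:]))
    total = sum(math.log(transition_prob(counts, a, b, vocab_size)) for a, b in steps)
    return -total / len(steps)

# Hypothetical event vocabulary and a normal connection pattern.
VOCAB = ["SYN", "SYN-ACK", "ACK", "DATA", "FIN", "RST"]
normal_flow = ["SYN", "SYN-ACK", "ACK", "DATA", "FIN"]
model = train_transitions([normal_flow] * 20)

normal_score = avg_surprise(model, normal_flow, len(VOCAB))
odd_score = avg_surprise(model, ["SYN", "RST", "SYN", "RST"], len(VOCAB))
```

The unfamiliar sequence receives a much higher score than the trained pattern, and an alert threshold can be placed between the two.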
Data mining based intrusion detection. Different data mining techniques, such as genetic algorithms, fuzzy logic, and clustering, are also used in anomaly-based network intrusion detection systems.
Neural networks. The majority of the research approaches try to simulate brain function to identify the sequence of instructions for a network. A neural network comprises a group of highly interconnected processing elements that converts a set of inputs into a set of outputs. It does not use prior knowledge to identify malicious behavior. The drawback of this technique is that it requires substantial resources and computation power.
Fuzzy logic techniques. In this method, reasoning is approximate rather than precisely deduced from classical predicate logic. It is derived from fuzzy set theory. Fuzzy logic techniques are used in the field of anomaly detection primarily because the features to be measured can be treated as fuzzy variables (Bridges and Vaughn, 2000, 14). This type of processing considers a reading normal if it lies within a given interval (Dickerson, 2000). Fuzzy logic is effective against port scans and probes. Its major drawback is the high resource consumption of the computation.
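The notion of a reading being normal to a degree, rather than strictly inside or outside an interval, can be sketched with a trapezoidal membership function. The feature (connections per minute) and the four breakpoints below are assumptions chosen only to illustrate the idea.

```python
def trapezoid(x, a, b, c, d):
    """Membership rises from a to b, equals 1 between b and c, falls from c to d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def normal_membership(rate):
    """Degree to which a connection rate counts as normal (hypothetical breakpoints)."""
    return trapezoid(rate, 5, 20, 50, 80)

def anomaly_degree(rate):
    """Complement of the normal membership: 0 is fully normal, 1 is fully anomalous."""
    return 1.0 - normal_membership(rate)

typical = anomaly_degree(30)      # well inside the normal interval
borderline = anomaly_degree(65)   # on the falling edge of the trapezoid
extreme = anomaly_degree(100)     # outside the interval entirely
```

Instead of a hard cut-off, the borderline reading receives an intermediate degree, which later rules can combine with other fuzzy variables before an alert is raised.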
Clustering and outlier detection. Clustering is the method of identifying patterns in unlabeled data. This method requires very little training, so the time an anomaly-based intrusion detection system needs to learn the behavior is reduced. After the clusters are formed, every new data point is assigned to the appropriate cluster based on its closeness to the cluster’s representative point (Portnoy et al., 2001). Clustering and outlier methods are used in the current IDS field (Barnett and Lewis, 1994; Sequeira and Zaki, 2002), with variations based on how the question “Is the isolated outlier an anomaly?” is answered. For instance, the KNN (k-nearest neighbor) approach (Liao and Vemuri, 2002) uses the Euclidean distance to define the membership of data points in a given cluster.
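A distance-based outlier score gives a minimal sketch of the nearest-neighbor idea. It assumes two numeric features per connection and scores a point by its mean Euclidean distance to its k nearest reference points; it is an illustration, not the method of Liao and Vemuri, and the data are hypothetical. It requires Python 3.8 or later for math.dist.

```python
import math

def knn_score(point, reference, k=3):
    """Mean Euclidean distance from point to its k nearest reference points."""
    distances = sorted(math.dist(point, r) for r in reference)
    return sum(distances[:k]) / k

# Hypothetical feature vectors of connections observed during normal operation.
REFERENCE = [(1, 1), (1, 2), (2, 1), (2, 2), (1.5, 1.5)]

inlier_score = knn_score((1.5, 1.4), REFERENCE)   # sits inside the cluster
outlier_score = knn_score((9, 9), REFERENCE)      # isolated from the cluster
```

Whether the isolated point is reported as an anomaly then depends on the threshold applied to the score, which is exactly the question quoted above.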
Genetic algorithms. Genetic algorithms are adaptive heuristic search algorithms that use randomized search to solve optimization problems. They belong to the class of evolutionary computation, which is inspired by evolutionary biology. Genetic algorithms constitute another type of machine learning-based method, capable of deriving classification rules (Li, 2004) and of choosing suitable features or a minimal set of parameters for the search method (Bridges and Vaughn, 2000).
This technique provides flexible and robust global search in different directions without prior information about the network’s behavior. Like other data mining techniques, this method also consumes a large amount of computational resources.
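The selection, crossover, and mutation steps can be sketched on a deliberately small problem: evolving a single threshold for the rule "value > threshold means attack". This is far simpler than the rule sets derived by Li; the labelled samples, population size, and mutation scale are all assumptions made for illustration.

```python
import random

# Hypothetical labelled samples: (connections per minute, is_attack).
SAMPLES = [(10, 0), (12, 0), (15, 0), (18, 0), (20, 0),
           (60, 1), (70, 1), (80, 1), (90, 1)]

def fitness(threshold):
    """Fraction of samples classified correctly by the rule 'value > threshold'."""
    correct = sum((value > threshold) == bool(label) for value, label in SAMPLES)
    return correct / len(SAMPLES)

def evolve(generations=30, size=20, seed=0):
    """Evolve a threshold; returns the best one and the best fitness per generation."""
    rng = random.Random(seed)
    population = [rng.uniform(0, 100) for _ in range(size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: size // 2]       # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < size:
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2                  # crossover: blend two parents
            child += rng.gauss(0, 5)             # mutation: small random shift
            children.append(child)
        population = parents + children
    return max(population, key=fitness), history

best_threshold, history = evolve()
```

Because the fitter half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next. The cost noted above is also visible: every generation re-evaluates the fitness of the whole population against the data.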
III Why is it hard to apply machine learning solutions to intrusion detection?
Some inherent characteristics of intrusion detection make it difficult to apply machine learning solutions. These characteristics and challenges are discussed below.
Correct classification model. If a machine learning based model is to be proposed, an effective classification model is of utmost importance.
High false positive rate in intrusion detection. Axelsson notes that attack detection efficacy is lowered by a high rate of false positive reports (Axelsson S, 2000). This problem is generally attributed to the lack of good studies on the nature of intrusion events. It calls for the exploration and development of new, accurate processing schemes, as well as better-structured approaches to modeling network systems.
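The effect of false positives on detection efficacy can be illustrated with Bayes’ rule, in the spirit of the base-rate argument in the cited Axelsson paper. The numbers below are hypothetical and serve only as a worked example.

```python
def alert_precision(base_rate, detection_rate, false_positive_rate):
    """Bayes' rule: probability that an alert corresponds to a real intrusion."""
    true_alerts = base_rate * detection_rate
    false_alerts = (1 - base_rate) * false_positive_rate
    return true_alerts / (true_alerts + false_alerts)

# Hypothetical figures: 1 in 10,000 events is an intrusion, the detector
# catches 99% of intrusions and misfires on 1% of benign events.
precision = alert_precision(1e-4, 0.99, 0.01)
```

Under these assumptions only about 1 alert in 100 points to a real intrusion, even though the detector looks accurate on paper, because benign events vastly outnumber intrusions.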
Assessing the results. IDSs become hard to implement in the absence of proper metrics and assessment methods, as well as an overall framework for assessing and comparing different IDS techniques (Stolfo SJ, Fan W., 2000).
Cost involved is high. “Low throughput and high cost, mainly due to the high data rates (Gbps) that characterize current wideband transmission technologies” (Stolfo SJ, Fan W., 2000). The cost of dealing with false positives, together with the additional resources and computing power needed to analyze the data, becomes additional overhead when implementing machine learning based anomaly network intrusion detection systems.
Difficulty in implementation. Axelsson (Axelsson S, 1998) reports that many IDS systems perform poorly in protecting themselves from malicious attacks. Since different techniques for evading IDS are described in the literature (Ptacek and Newsham, 2003), substantial effort is still needed to advance intrusion detection technology.
IV Guidelines to use machine learning for intrusion detection.
Intrusion detection techniques are continuously evolving. The following guidelines help to improve the performance of a model in real-world scenarios.
Realizing the threat model. Before deciding on or developing an anomaly-based intrusion detector, one should decide the framework in which the model will be used.
i) Think from the attacker’s perspective and come up with a list of possible intrusion attacks and their impact on the business.
ii) If an attack goes unnoticed, what will be the possible impact?
iii) The functionality of the model depends heavily on the network size as well as the domain in which it will be implemented, such as academia, a corporate business, or a small business.
Defining the scope.
i) It is essential to decide the scope of the system and, based on that, assess which machine learning based tool and model best suit the requirement.
ii) Define the required features of the model and tool.
When deciding on a specific machine learning algorithm, one should have a sound reason why that choice will perform well in the envisioned situation, not just on a mathematical basis but bearing in mind domain-specific characteristics. As discussed by Duda et al. (as cited in Robin Sommer, Vern Paxson, 2010), there are “no context-independent . . . reasons to favor one learning . . . method over another”; they call this the “no free lunch theorem”.
Reducing the cost. The more efficient the anomaly-based model is at detecting malicious activity, the lower the cost in terms of finances, human resources, reputation, and function. Reducing false positive reports is therefore important and should be kept a priority while developing the system.
Also, how accurately one defines the scope and understands the threat model determines the objective of the anomaly detection model and helps to decide the tolerable level of false positives. All this analysis reduces cost in each of the areas mentioned above. One must also consider the diversity of network traffic and its effect on learning the pattern.
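One way to reason about the tolerable level of false positives is to put an assumed price on each kind of error and compare detectors by expected cost. The sketch below does this for two hypothetical detectors; the event volume, error rates, and per-error costs are invented for illustration and would have to come from the threat model of the actual domain.

```python
def expected_cost(events, base_rate, detection_rate, false_positive_rate,
                  cost_false_alarm, cost_missed_attack):
    """Expected cost of false alarms plus missed attacks over a batch of events."""
    missed = events * base_rate * (1 - detection_rate)
    false_alarms = events * (1 - base_rate) * false_positive_rate
    return missed * cost_missed_attack + false_alarms * cost_false_alarm

# Hypothetical setting: one million events, 1 in 10,000 malicious,
# 10 cost units to triage a false alarm, 5,000 for a missed attack.
SETTING = dict(events=1_000_000, base_rate=1e-4,
               cost_false_alarm=10, cost_missed_attack=5000)

# Detector A is more sensitive; detector B raises far fewer false alarms.
cost_a = expected_cost(detection_rate=0.99, false_positive_rate=0.01, **SETTING)
cost_b = expected_cost(detection_rate=0.90, false_positive_rate=0.001, **SETTING)
```

With these assumed prices the less sensitive detector B is cheaper overall, because the triage cost of A’s false alarms outweighs the attacks B misses. A different threat model, with a much higher cost per missed attack, would reverse the ranking.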
Evaluate the anomaly detection system. A major concern in using machine learning based techniques is the ability to assess the effectiveness of the model. Answering questions about the limitations of the system, the percentage of malicious activity it can and cannot detect, its performance, its cost, and its impact helps to arrive at a concrete operational model for a particular domain.
Also, comparing the results with those of already available anomaly-based network intrusion detection systems gives a better basis for evaluating the system.
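The percentage of malicious activity a system does and does not detect can be summarized with a few standard measures computed from labelled test traffic. The sketch below assumes binary labels (1 for malicious, 0 for benign) and that the test set contains at least one example of each class and at least one alert; the sample labels are hypothetical.

```python
def evaluate(predicted, actual):
    """Return detection rate, false positive rate, and precision for binary labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # attacks correctly flagged
    fp = sum(1 for p, a in pairs if p and not a)      # benign events flagged
    fn = sum(1 for p, a in pairs if not p and a)      # attacks missed
    tn = sum(1 for p, a in pairs if not p and not a)  # benign events passed
    return {
        "detection_rate": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
        "precision": tp / (tp + fp),
    }

# Hypothetical ground truth and detector output for ten events.
actual = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
report = evaluate(predicted, actual)
```

Computing the same measures on the same test traffic for an existing system gives the comparison suggested above.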
This paper discusses anomaly-based network intrusion detection technology in detail. It first studies older techniques together with emerging machine learning and data mining based anomaly intrusion detection models, giving an overview of these models, their advantages, and their drawbacks. It then discusses the challenges of using machine learning based techniques in anomaly-based intrusion detection systems and offers guidelines that can be used when employing these models. No encryption or cybersecurity tool is a silver bullet against attacks, but each is one step toward a better defense, and new intrusion detection techniques and models will keep being developed. From the study of various research papers on anomaly-based intrusion detection systems using machine learning, it can be concluded that challenges in identifying new attacks continue to exist and that research will continue to evolve to overcome them.
(Axelsson S, 2000) Axelsson S. The base-rate fallacy and its implications for the difficulty of intrusion detection. ACM Transactions on Information and System Security 2000; 3:186–205.
(Axelsson S, 1998) Axelsson S. Research in intrusion detection systems: a survey. Technical report. Chalmers University of Technology. Goteborg 1998.
(Bridges and Vaughn, 2000) Bridges S.M., Vaughn R.B. Fuzzy data mining and genetic algorithms applied to intrusion detection. In: Proceedings of the National Information Systems Security Conference; 2000. pp. 13–31.
(Kruegel C., Mutz D., Robertson W., Valeur F., 2003) Kruegel C., Mutz D., Robertson W., Valeur F. Bayesian event classification for intrusion detection. In: Proceedings of the 19th Annual Computer Security Applications Conference; 2003.
(D. E. Denning, 1987) D. E. Denning. An Intrusion-Detection Model. IEEE Transactions on Software Engineering; 1987. vol. 13. no. 2. pp. 222–232.
(Dickerson J.E., 2000) Dickerson J.E., Fuzzy network profiling for intrusion detection. In: Proceedings of the 19th International Conference of the North American Fuzzy Information Processing Society (NAFIPS); 2000. pp. 301–306.
(Liao and Vemuri, 2002) Y. Liao, V.R. Vemuri. Use of K-nearest neighbor classifier for intrusion detection. Computers & Security, 21 (2002), pp. 439–448.
(M.-L. Shyu, S.-C. Chen, K. Sarinnapakorn, L. Chang, 2003) M.-L. Shyu, S.-C. Chen, K. Sarinnapakorn, L. Chang. A novel anomaly detection scheme based on principal component classifier. In: Proceedings of the IEEE Foundations and New Directions of Data Mining Workshop, Melbourne, FL, USA, 2003, pp. 172–179.
(Portnoy et al., 2001) Portnoy L., Eskin E., Stolfo S.J. Intrusion detection with unlabeled data using clustering. In: Proceedings of The ACM Workshop on Data Mining Applied to Security; 2001
(Ptacek and Newsham, 2003) T. Ptacek, T. Newsham. Insertion, evasion and denial of service: eluding network intrusion detection, Secure Networks (2003)
(Robin Sommer, Vern Paxson, 2010) Robin Sommer, Vern Paxson. Outside the Closed World: On Using Machine Learning For Network Intrusion Detection. IEEE Symposium on Security and Privacy, 2010.
(Stolfo SJ, Fan W., 2000) Stolfo SJ, Fan W. Cost-based modeling for fraud and intrusion detection: results from the JAM project. DARPA Information Survivability Conference & Exposition 2000:130–44.
(V. Barnett, T. Lewis, 1994) V. Barnett, T. Lewis. Outliers in statistical data. Wiley (1994). ISBN 9780471930945.
(W. Li, 2004) W. Li. Using genetic algorithm for network intrusion detection. C.S.G. Department of Energy (2004), pp. 1–8.
(Yeung DY, Ding Y, 2003) Yeung DY, Ding Y. Host-based intrusion detection using dynamic and static behavioral models. Pattern Recognition 2003;36(1): 229–43.