Anomaly Detection in DNS Query Logs using Improved Binary Black Hole Optimization Algorithm

Domain Name System (DNS) logs offer a unique perspective on how domain names are used by both legitimate and anomalous clients. Rather than analyzing DNS queries in the traditional manner, this work designs and develops a binding approach based on the Improved Binary Black Hole Optimization Algorithm (IBBHOA) for feature selection in a support vector machine (SVM) classifier. First, the unremitting (continuous-valued) black hole optimization algorithm is described. Next, an improved binary black hole optimization algorithm is presented. The binding approach based on IBBHOA for feature selection in the SVM classifier is then carried out. The SVM classifier is chosen to characterize DNS lookup behavior by means of log mining. DNS query logs are obtained from datasets from various sources. Feature selection is performed first, and the SVM classifier is then used to classify anomalous behavior. The performance metrics chosen for comparison are the anomalous behaviors detected, the DNS failed requests, the time taken for feature selection and the time taken for classification.


Literature Review
Passive testing (PT), or offline analysis of network protocols, has reached maturity since its first application to network fault detection by Boloutas et al. [2]. The main idea in PT is to model the protocol as a finite state machine (FSM), an extended finite state machine (EFSM) [3], and/or communicating finite state machines (CFSM) [4], [5], and then to test network traces against the model to check conformance and performance. Several related works detect anomalies by monitoring a DNS server. White et al. showed that DNS can be used for monitoring anomalies in a network [6]. They relied on the correlation between DNS traffic and the traffic of other protocols in an enterprise network, and detected the propagation of worms. However, their method requires the traffic of the other protocols and cannot detect anomalies without it. Musashi et al. and their successors proposed host anomaly detection that monitors only DNS servers [7], [8]. Ren et al. presented Flying Term, a new perceptually motivated visual metaphor for visualizing the dynamic nature of DNS queries [9]. Hadi et al. offered a comprehensive review of network security visualization and provided a taxonomy in the form of five use-case classes encompassing nearly all recent works in this area [10]. Schonewille and Helmond's research was a first look at the usability of DNS traffic and logs for detecting malicious network activity; they showed that bots can be detected from DNS information gathered from the network by placing counters and triggers on specific events in the data analysis [11]. David and Paul considered three classes of DNS traffic (canonical, overload and unwanted) and showed preliminary results on how DNS analysis could be coupled with general network traffic monitoring to provide a useful perspective for network management and operations [12].
Kirkpatrick et al. introduced a method for clustering misconfigured DNS sources [13]. Using machine learning methods, they analyzed 24 h of DNS requests collected at the A-root DNS server. Their research provided preliminary results that were validated through discussion with DNS system operators. Shan et al. proposed an interactive visual analysis system for DNS log files to intuitively detect anomalies in DNS query logs [14]. Albrecht-Buehler used motion to visualize trends among text-theme relationships and allowed user interaction with the temporal controls and theme relations [15]. Brandes et al. used animation to illustrate the dynamics of international political and military conflicts [16]. Pieter applied visual analytics to a large set of DNS packet captures to examine the ways in which authoritative name servers were abused for denial-of-service attacks [17]; several tools were developed to identify patterns in DNS queries and responses. Yu presented a visualization tool for analyzing, capturing and responding to the distributed denial-of-service attack known as the DNS amplification attack [18]. Born's study provided both quantitative analysis and visual aids that allowed the user to make determinations about the legitimacy of DNS traffic [19].

Proposed Work
The proposed work introduces a binding approach based on IBBHOA for feature selection in an SVM classifier. First, the unremitting (continuous-valued) black hole optimization algorithm is described. Next, an improved binary black hole optimization algorithm is presented. The binding approach based on IBBHOA for feature selection in the SVM classifier is then carried out. The SVM classifier is chosen to characterize DNS lookup behavior by means of log mining. DNS query logs are obtained from datasets from various sources. Feature selection is performed first, and the SVM classifier is then used to classify anomalous behavior. The performance metrics chosen for comparison are the anomalous behaviors detected, the DNS failed requests, the time taken for feature selection and the time taken for classification.

Unremitting Black Hole Optimization Algorithm (UBHOA)
The black hole optimization algorithm is a robust stochastic optimization technique based on simulating the behavior of a black hole in outer space. The steps below explain how UBHOA is derived from the black hole phenomenon.
Step 1: Outer space is full of known and unknown stars. In real space a black hole is formed by the collapse of an individual star, so UBHOA begins with a population of stars placed at random in the search space. Each star has a fitness value, which is evaluated by the fitness function to be optimized. The star with the best fitness value is selected as the black hole.
Step 2: In real space, a black hole is an object of extreme density with an intense gravitational attraction, which pulls the stars around it with great force. UBHOA follows the same behavior: by Eq. (1), all the stars move toward the black hole.
Step 3: The spherical boundary of a black hole in outer space is known as the event horizon, and its radius is called the Schwarzschild radius. The red circle in Fig. 1 shows the event horizon of the black hole. In real space the Schwarzschild radius is computed by Eq. (2); in UBHOA it is computed by Eq. (3).
Step 4: Because of the extreme density and strong gravitational attraction of the black hole, a star that crosses the event horizon is swallowed by the black hole and disappears. At the event horizon the escape velocity equals the speed of light, so nothing can escape from within it. In UBHOA, the Euclidean distance between the black hole and each star is computed. If this distance is less than the Schwarzschild radius, the star is replaced with a fresh star at a random location in the search space.
Step 5: In UBHOA, if a star reaches a location with lower cost than the black hole, their locations are exchanged.
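The five steps can be illustrated with a short Python sketch. Eqs. (1) to (3) are not reproduced in this text, so the sketch assumes the forms commonly used for the black hole algorithm: each star moves by x_i(t+1) = x_i(t) + rand × (x_BH − x_i(t)), and the event horizon radius is the black hole's fitness divided by the sum of all fitness values. The function name `ubhoa_minimize`, the search bounds and the sphere objective are illustrative choices, not part of the original work.

```python
import math
import random


def ubhoa_minimize(cost, dim, n_stars=10, n_iter=100, lo=-5.0, hi=5.0, seed=0):
    """Continuous black hole search (illustrative sketch, minimization)."""
    rng = random.Random(seed)

    def new_star():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    # Step 1: random population; the lowest-cost star becomes the black hole.
    stars = [new_star() for _ in range(n_stars)]
    costs = [cost(s) for s in stars]
    bh = min(range(n_stars), key=costs.__getitem__)

    for _ in range(n_iter):
        for i in range(n_stars):
            if i == bh:
                continue
            # Step 2 (assumed form of Eq. 1): move the star toward the black hole.
            stars[i] = [x + rng.random() * (b - x)
                        for x, b in zip(stars[i], stars[bh])]
            costs[i] = cost(stars[i])
            # Step 5: a star that beats the black hole takes over its role.
            if costs[i] < costs[bh]:
                bh = i
        # Step 3 (assumed form of Eq. 3): radius = f_BH / sum of all fitness values.
        total = sum(costs)
        radius = costs[bh] / total if total > 0 else 0.0
        # Step 4: stars inside the event horizon are replaced by fresh random stars.
        for i in range(n_stars):
            if i != bh and math.dist(stars[i], stars[bh]) < radius:
                stars[i] = new_star()
                costs[i] = cost(stars[i])
        # A fresh star may itself be better than the current black hole.
        bh = min(range(n_stars), key=costs.__getitem__)
    return stars[bh], costs[bh]


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


best, best_cost = ubhoa_minimize(sphere, dim=3)
```

Because the black hole itself is never moved, the best cost can only stay the same or decrease from one iteration to the next.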

The Proposed Improved Binary Black Hole Optimization Algorithm (IBBHOA)
UBHOA was originally developed for unremitting (continuous) valued spaces. However, in many discrete combinatorial optimization problems, such as feature selection, the values are not continuous numbers but discrete binary integers. For this reason, we introduce a binary version of UBHOA, referred to as IBBHOA. Binarization techniques fall into two groups: two-step binarization and unremitting-binary (continuous-to-binary) operator transformation. The proposed technique belongs to the first group, in which the operators are left unmodified and only two steps are added after the unremitting iteration. To solve the feature selection problem, the search space is modeled as a d-dimensional Boolean lattice in which the i-th star moves. Since the problem is whether or not to select a given feature, each element of a star's position takes only the value 1 or 0. A transfer function is therefore needed to force the stars to move in a binary space. Transfer functions define the probability of changing an element of the position from 0 to 1 and vice versa. In the proposed approach, the hyperbolic tangent function is used to update the position of the stars, as in Eqs. (4) and (5).

$$ S\big(x_{id}(t+1)\big) = \left| \tanh\big(x_{id}(t+1)\big) \right| \tag{4} $$

$$ x_{id}(t+1) = \begin{cases} 1, & \text{if } S\big(x_{id}(t+1)\big) > rand \\ 0, & \text{otherwise} \end{cases} \tag{5} $$

where rand is a uniform random number between 0 and 1. In Eq. (5), a fixed threshold of 0.6 can be used instead of rand. IBBHOA requires only the number of stars to be set. The proposed algorithm does not suffer from difficulties found in some other optimization algorithms, such as a slow convergence rate and the need to tune several parameters. Compared with other optimization algorithms, IBBHOA is easier to implement, depends on a single configuration parameter, requires much less memory, and converges more rapidly.
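As an illustration of Eqs. (4) and (5), the following Python sketch applies the hyperbolic tangent transfer function to a continuous position and thresholds the result. The function names `transfer` and `binarize` and the optional `threshold` argument are illustrative, not taken from the original work.

```python
import math
import random


def transfer(x):
    """Eq. (4): the score |tanh(x)|, which lies in [0, 1]."""
    return abs(math.tanh(x))


def binarize(position, rng=random, threshold=None):
    """Eq. (5): map a continuous position to a 0/1 feature mask.

    With threshold=None a fresh uniform random number is drawn for each
    dimension; passing threshold=0.6 uses the fixed cut-off instead.
    """
    bits = []
    for x in position:
        r = rng.random() if threshold is None else threshold
        bits.append(1 if transfer(x) > r else 0)
    return bits


mask = binarize([0.0, 3.0, -3.0, 0.5], threshold=0.6)  # -> [0, 1, 1, 0]
```

Elements far from zero in either direction are mapped to 1 with high probability, while an element at exactly zero is always mapped to 0.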

The proposed Binding approach based on IBBHOA for Feature Selection in SVM Classifier
At the beginning of IBBHOA, the positions of the initial population of stars are initialized randomly. Each star encodes a candidate feature subset as a bit string whose length equals the total number of features in the dataset of interest. In this binary encoding, a bit of one means the feature is chosen and a bit of zero means it is not. As in other optimization algorithms, the fitness value of each star is calculated by an evaluator.
When two stars have identical fitness values, the one with the smaller number of features is chosen as the best star (black hole). The procedure stops once the stopping criterion (maximum number of iterations) is met. IBBHOA is run for 25 iterations with a population of 10 stars. At the end of the IBBHOA wrapper-based feature selection (FS) algorithm, the star with the best performance is selected; its position gives the selected features. To avoid reporting chance results and to ensure an impartial comparison of classification performance, the evaluation of the SVM classifier on the features selected by each optimization algorithm is repeated 100 times. SVM is a supervised machine learning classifier used for categorization. It finds the best possible surface to separate the positive samples from the negative samples. In text classification, SVM has been reported to perform comparatively better than Naive Bayes (NB) and maximum-entropy classifiers.
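The wrapper loop described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the star movement and event horizon radius use the forms commonly given for the black hole algorithm, and `evaluate` is a hypothetical caller-supplied evaluator (for example, the cross-validated accuracy of an SVM trained on the selected features) that returns a non-negative score where higher is better. The toy evaluator exists only to make the sketch runnable.

```python
import math
import random


def ibbhoa_select(evaluate, n_features, n_stars=10, n_iter=25, seed=0):
    """Wrapper feature selection sketch; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)

    def random_mask():
        return [rng.randint(0, 1) for _ in range(n_features)]

    def rank(mask, fit):
        # Higher fitness wins; on a tie the mask with fewer features wins.
        return (fit, -sum(mask))

    def best_index():
        return max(range(n_stars), key=lambda i: rank(stars[i], fits[i]))

    stars = [random_mask() for _ in range(n_stars)]
    fits = [evaluate(m) for m in stars]
    bh = best_index()

    for _ in range(n_iter):
        for i in range(n_stars):
            if i == bh:
                continue
            # Continuous move toward the black hole (assumed update rule) ...
            moved = [x + rng.random() * (b - x)
                     for x, b in zip(stars[i], stars[bh])]
            # ... followed by the two binarization steps of Eqs. (4) and (5).
            stars[i] = [1 if abs(math.tanh(v)) > rng.random() else 0
                        for v in moved]
            fits[i] = evaluate(stars[i])
            if rank(stars[i], fits[i]) > rank(stars[bh], fits[bh]):
                bh = i
        # Stars inside the event horizon are replaced by random masks.
        total = sum(fits)
        radius = fits[bh] / total if total > 0 else 0.0
        for i in range(n_stars):
            if i != bh and math.dist(stars[i], stars[bh]) < radius:
                stars[i] = random_mask()
                fits[i] = evaluate(stars[i])
        bh = best_index()
    return stars[bh], fits[bh]


def toy_evaluate(mask):
    """Hypothetical evaluator: only features 0 and 2 carry information."""
    return (mask[0] + mask[2]) / 2


selected, fitness = ibbhoa_select(toy_evaluate, n_features=6)
```

The defaults of 10 stars and 25 iterations follow the parameter values stated in the text. Repeating the call with different `seed` values gives the kind of repeated evaluation described above.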
The fundamental aim of SVM training is to find a maximum-margin hyperplane that solves the classification task. There are infinitely many possible boundaries that separate the two classes. To choose the best one, it is important to prefer the decision boundary with the maximum margin to the nearest points of both classes. A boundary that lies close to the points of one class is more likely to make prediction errors, whereas a maximum-margin boundary is less likely to do so. This part of the research uses a simplified SVM (SSVM) that is capable of multi-class classification and performs two roles: it first builds a model from the training data set and then uses that model to predict the labels of a testing data set. The SVM procedure includes the following steps.
Step 1: Transform the data to the format of an SVM package.
Step 2: Conduct simple scaling on the data.
Step 4: Features are selected based on IBBHOA to train the whole training dataset.
Step 5: Test with the testing dataset.
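The scaling and feature masking steps can be sketched in Python as follows. Min-max scaling to [0, 1] is an assumption, since the text does not say which scaling is used, and the helper names `min_max_scale` and `apply_mask` are illustrative.

```python
def min_max_scale(rows):
    """Scale every column linearly to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
        for row in rows
    ]


def apply_mask(rows, mask):
    """Keep only the columns whose bit in the feature mask is 1."""
    return [[v for v, keep in zip(row, mask) if keep] for row in rows]


raw = [[0, 10, 5], [5, 20, 5], [10, 30, 5]]
scaled = min_max_scale(raw)               # [[0.0, 0.0, 0.0], [0.5, 0.5, 0.0], [1.0, 1.0, 0.0]]
reduced = apply_mask(scaled, [1, 0, 1])   # [[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]]
```

In practice the column minima and ranges would be computed on the training set and reused unchanged for the testing set, so that both are scaled consistently.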
After preprocessing, the above procedure is carried out to train the SVM. The basic form of the features and their classification is given by

$$ \Phi : D_S \times C_S \rightarrow \{P, N\} \tag{6} $$

where $D_S$ is a set of documents and $C_S$ is a set of categories. If $\Phi(D_{Si}, C_{Sj}) = P$, then $D_{Si}$ is a positive member of $C_{Sj}$; if $\Phi(D_{Si}, C_{Sj}) = N$, then $D_{Si}$ is a negative member of $C_{Sj}$. The SSVM method assigns a positive value (+1) to the data points that best fit the category and a negative value (−1) to the rest. Furthermore, a non-linear mapping function $\phi$ carries the training data from the input space $R^N$ into a feature space $R^F$. Hence, optimization is needed in order to segregate the dataset.
Kernel functions provide additional decision functions when the data are not linearly separable. The kernel functions considered are the polynomial function and the Gaussian radial basis function (RBF). RBF kernels are used in our system to build the SVM models, which then predict the labels of the testing data set. When the representation points of the feature vectors lie on a 1D line and cannot be separated by a linear hyperplane, the system first uses a kernel function that maps the points into a feature space and then separates them by a hyperplane. The kernel function that does this is $k(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$, the inner product of the mapped points. In addition, the polynomial kernel function maps the points into 2D by raising them to the power of two.
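The kernel idea can be made concrete with a small Python sketch. The explicit feature map x → (x, x²), the toy points and the `gamma` parameter are illustrative assumptions, chosen only to show how 1D points that no single threshold can separate become linearly separable after the mapping.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian RBF kernel exp(-gamma * ||x - y||^2) on two vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def phi(x):
    """Explicit degree-2 feature map for a scalar: x -> (x, x^2)."""
    return (x, x * x)


def poly_kernel(x, y):
    """Kernel induced by phi: k(x, y) = phi(x) . phi(y) = x*y + (x*y)^2."""
    return x * y + (x * y) ** 2


# The inner pair (class +1) lies between the outer pair (class -1),
# so no single threshold on x separates the two classes in 1D.
points = [-2.0, -1.0, 1.0, 2.0]
labels = [-1, +1, +1, -1]

# After mapping to 2D, the second coordinate x^2 separates them at 2.5.
predicted = [+1 if phi(x)[1] < 2.5 else -1 for x in points]
```

The kernel lets the SVM work with the inner products `poly_kernel(x, y)` directly, without ever constructing `phi(x)`; the RBF kernel applies the same idea with an implicit feature space.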

Results and Discussions
DNS traffic normally uses the User Datagram Protocol (UDP) or the Transmission Control Protocol (TCP) on port 53 [20]. Almost all DNS communication is carried over UDP, which is the default protocol used by resolvers, i.e. the applications that communicate with DNS servers on behalf of other applications that need to resolve a DNS query. TCP was originally used for zone transfers, but RFC 1123 [21] expanded its use as a backup transport when the answer is larger than 512 octets. In that case, the first UDP DNS response carries only a partial answer, and the truncation bit is set so that the resolver can repeat the query over TCP. RFC 2671, "EDNS0" [22], on the other hand, defined an OPT pseudo resource record that allows UDP DNS messages to be larger than 512 octets. As a result, nearly all of today's DNS traffic uses UDP as its transport protocol. Sorting the obtained records by different criteria is used to detect unusual records or activities; for example, searching for records with low TTL values is generally useful for detecting fast-flux domains. The top 10 wireless client IP addresses and their details are shown in Table 1. Table 2 lists, for the top 10 destination domains, the queries per day, the domain name, the number of anomalous requests received and the number of failed requests. Table 3 shows the detection accuracy for anomalous requests sent from the client side. From Table 3 it is evident that the proposed IBBHOA-based SVM detects more anomalous behavior in wireless clients than PCA-SVM. Likewise, Table 4 shows that the proposed IBBHOA-based SVM detects more anomalous behavior in domain name servers than PCA-SVM. Table 5 gives the time taken for the feature selection task and for the classification task. On these measures the proposed mechanism outperforms the existing PCA-SVM approach.
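The kind of log mining described in this section (ranking clients by query volume, counting failed requests and searching for low-TTL answers) can be sketched in Python as follows. The record layout with the keys `client`, `qname`, `rcode` and `ttl`, the sample records and the 300-second TTL threshold are all hypothetical and are not taken from the dataset used in this work.

```python
from collections import Counter


def summarize(records, ttl_threshold=300):
    """Return top clients, failed-request counts and low-TTL names.

    Each record is a dict with the hypothetical keys
    client, qname, rcode and ttl (ttl may be None for failed lookups).
    """
    queries = Counter()
    failures = Counter()
    low_ttl = set()
    for rec in records:
        queries[rec["client"]] += 1
        if rec["rcode"] != "NOERROR":
            failures[rec["client"]] += 1
        elif rec["ttl"] < ttl_threshold:
            # Very short TTLs are one indicator of fast-flux domains.
            low_ttl.add(rec["qname"])
    return queries.most_common(10), dict(failures), sorted(low_ttl)


sample = [
    {"client": "10.0.0.1", "qname": "a.example", "rcode": "NOERROR", "ttl": 3600},
    {"client": "10.0.0.1", "qname": "ff.example", "rcode": "NOERROR", "ttl": 30},
    {"client": "10.0.0.1", "qname": "nx.example", "rcode": "NXDOMAIN", "ttl": None},
    {"client": "10.0.0.2", "qname": "a.example", "rcode": "NOERROR", "ttl": 3600},
]
top_clients, failed, suspicious = summarize(sample)
```

Counts such as these could serve as input features for the feature selection and classification stages, although the exact feature set used in this work is not specified here.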

Conclusion
This research work aims to detect anomalies in DNS query logs. The data are obtained from the CAIDA data server [23]. To that end, the proposed work introduces a binding approach based on IBBHOA for feature selection in an SVM classifier. First, the unremitting (continuous-valued) black hole optimization algorithm is described. Next, an improved binary black hole optimization algorithm is presented. The binding approach based on IBBHOA for feature selection in the SVM classifier is then carried out. The SVM classifier is chosen to characterize DNS lookup behavior by means of log mining. Feature selection is performed first, and the SVM classifier is then used to classify anomalous behavior. The performance metrics chosen for comparison are the anomalous behaviors detected, the time taken for feature selection and the time taken for classification. The results show that the proposed mechanism outperforms the existing one.