Liver disorder diagnosis using linear, nonlinear and decision tree classification algorithms

In India and across the globe, liver disease is a serious concern in medicine. It is therefore essential to use classification algorithms to assess the disease and improve the efficiency of medical diagnosis, which in turn leads to appropriate and timely treatment. This study implemented several classification algorithms, namely linear discriminant analysis (LDA), diagonal linear discriminant analysis (DLDA), quadratic discriminant analysis (QDA), diagonal quadratic discriminant analysis (DQDA), naive Bayes (NB), feed-forward neural network (FFNN) and classification and regression tree (CART), in an attempt to enhance the diagnostic accuracy for liver disorder and to reduce the inefficiencies caused by false diagnosis. The results showed that CART was the best model, achieving higher diagnostic accuracy than LDA, DLDA, QDA, DQDA, NB and FFNN. FFNN ranked second and performed better than the remaining classifiers. The evaluation indicates that the accuracy of a classification algorithm depends on the type and features of the dataset. For the given dataset, the decision tree classifier CART outperformed all the linear and nonlinear classifiers. It also showed the capability to assist clinicians in determining the existence of liver disorder, attaining better diagnosis and avoiding delay in treatment.


INTRODUCTION
The liver is one of the most vital organs and the largest internal organ in the human body. It carries out several metabolic functions such as producing bile, making certain proteins for blood clotting, filtering blood, helping in fat digestion, decomposing red blood cells and, most prominently, detoxifying harmful chemicals [1]. Liver disease is defined as the improper functioning of these complex metabolic functions, which leads to serious health ramifications. Liver disease can be acute (short-term) or chronic (long-term) and can put life at risk [2]. It is generally caused by excess accumulation of fat, inherited disorders, virus-damaged hepatocytes, bacteria or fungi, contaminated food and excessive consumption of alcohol or drugs [3][4][5]. The disease may progress in severity from viral hepatitis infection in an otherwise healthy individual to cirrhosis and, more seriously, to liver cancer. Its wide and hidden presence worldwide makes it a serious concern in medicine. Liver disorders have been persistently listed among the top ten fatal diseases around the globe, costing millions of lives every year.
The ability of the liver to function normally even when partially damaged masks the early presence of disease, which makes it more alarming because by the time it is detected the liver may have suffered significant or permanent damage. This indicates that early diagnosis of liver disorder is crucial so that timely treatment can take place [2,3]. Clinical interpretation of a collection of symptoms, risk factors, laboratory test results and other vital examination figures is a highly demanding task in medical diagnosis. The task becomes even more complex if the existing figures are fuzzy. It stretches the decision time of even experienced clinicians, and novice physicians may need years to gain substantial expertise in analyzing the uncertain medical records of patients. Moreover, an accurate diagnosis is still not guaranteed, as humans are prone to errors for reasons such as heavy clinical workload or poor health. Hence, to interpret multifaceted datasets, to compensate for clinical inexperience and to reduce the evaluation time, computer-aided systems are built using a variety of intelligent classification algorithms for liver disorder diagnosis.
In addition to individual classifiers, hybridization of classification algorithms has also been widely employed. Integration of an artificial neural network (ANN) with case-based reasoning (CBR) was used to examine the existence of liver disorders and to determine their types [4]. A combination of ANN and decision tree was used by Calisir and Dogantekin (2011) to diagnose hepatitis [35], by Bologna (2003) to diagnose liver disorders [36] and by Hashem et al. (2012) to predict the degree of liver fibrosis in patients with chronic hepatitis C [37]. Artificial immune system (AIS) with fuzzy logic (FL) was used to classify liver disorders [38] and by Mezyk and Unold (2011) to assess the prediction accuracy for liver disorders [39]. CBR with genetic algorithm (GA) was used by Park et al. (2011) to find the total misclassification cost of CSCBR in hepatitis patients' records [40]. FL-GA was used by Wang et al. (1998) [41] and Chowdhury et al. (2007) [42], AIS-ANN-FL was used by Kahramanli and Allahverdi (2009) [43] and ANN-CBR with rule-based reasoning (RBR) was used by Obot and Uzoka (2009) [44] to diagnose hepatitis. ANN-FL integration was deployed by Dogantekin et al. (2009) to diagnose hepatitis [45], in [46] to deal with the class imbalance problem in medical datasets and to enhance classification accuracy, by Ceylan et al. (2011) to diagnose liver cirrhosis [47], and by Comak et al. [48][49][50][51][52].
In recent years, medical diagnostic systems have been widely used in hospitals and have comprehensively assisted physicians in analyzing patients' therapeutic history. Large data centers have been created with hardware and software technologies for efficiently storing medical records in great volume. For experimentation and learning, classification algorithms are applied to these records, which can be quickly retrieved at any time with the help of computer processing systems. Although the literature shows that the most popular and widely applied classifiers are nonlinear, the others also have their own significance in providing comprehensive information, depending on the scale and diversity of the data. Each classifier follows distinctive steps for data processing and computation, which makes the classifiers differ in the results they produce. This study accordingly contributes to the liver disorder diagnosis process by shortening the diagnosis time through the use of distinct linear, nonlinear and decision tree classification algorithms. These algorithms help physicians to evaluate complex cases that are otherwise hard to perceive. The classifiers include linear discriminant analysis (LDA), diagonal linear discriminant analysis (DLDA), quadratic discriminant analysis (QDA), diagonal quadratic discriminant analysis (DQDA), naive Bayes (NB), feed-forward neural network (FFNN), and classification and regression tree (CART). Linear classification includes LDA and DLDA, nonlinear classification includes QDA, DQDA, NB and FFNN, and decision tree classification includes CART.
The remainder of the paper is organized as follows. Section 2 presents the methodologies, with a description of the techniques used. Section 3 discusses the experimental results. Finally, the conclusion is drawn in Section 4.

METHODOLOGIES
Clinicians certainly play a decisive role in medical diagnosis and treatment. However, the deployment of classification algorithms enhances diagnostic efficiency and helps physicians make sound judgments on the presence of sickness. Therefore, the study deployed a variety of classifiers to diagnose liver disorder and evaluated their performance to find the best one. The steps involved in finding the best prediction model are shown in Figure 1. The classifiers implemented include linear discriminant analysis (LDA), diagonal linear discriminant analysis (DLDA), quadratic discriminant analysis (QDA), diagonal quadratic discriminant analysis (DQDA), naive Bayes (NB), feed-forward neural network (FFNN), and classification and regression tree (CART), which are introduced as follows.
LDA is a classification method based on the covariance matrix, originally developed by R. A. Fisher in 1936. It searches for a linear combination of variables that best separates two classes. The variables are the predictors and the classes are the actual targets in numerical form [53,54]. LDA handles unequal within-class frequencies by maximizing the ratio of between-class variance to within-class variance when drawing the decision region between the given classes. For example, assume that the dataset has $X$ classes, $\mu_j$ is the mean vector of class $j$ and $N_j$ is the number of samples within class $j$, where $j = 1, 2, \ldots, X$. In the standard formulation, the overall mean and the two scatter matrices are
$$\mu = \frac{1}{N} \sum_{j=1}^{X} N_j \mu_j, \qquad N = \sum_{j=1}^{X} N_j$$
$$M_a = \sum_{j=1}^{X} \sum_{x \in \text{class } j} (x - \mu_j)(x - \mu_j)^{T}$$
$$M_b = \sum_{j=1}^{X} N_j (\mu_j - \mu)(\mu_j - \mu)^{T}$$
where $N$ is the total number of samples, $M_a$ is the within-class scatter matrix, $M_b$ is the between-class scatter matrix and $\mu$ is the mean of the entire dataset. DLDA is a variant of linear discriminant analysis in which the covariance matrix, assumed equal across groups, is further restricted to be diagonal. QDA is a more general version of LDA for heterogeneous variance-covariance matrices. It calculates a quadratic score function for each group. This function depends on the population mean vector and the variance-covariance matrix of the jth group. The parameters are estimated by maximizing the joint likelihood of the features and their classes. DQDA, in turn, is the variant of quadratic discriminant analysis in which all off-diagonal elements of the group covariance matrices are set to zero [55]. The NB classifier assumes class conditional independence, which means that the effect of the value of a feature (a) on a given class (t) is independent of the values of the other features [56]. NB is based on Bayes' theorem, which describes how to calculate the posterior probability P(t|a) from P(t), P(a) and P(a|t):
$$P(t \mid a) = \frac{P(a \mid t)\, P(t)}{P(a)}$$
$$P(t \mid a_1, \ldots, a_n) \propto P(a_1 \mid t) \times P(a_2 \mid t) \times \cdots \times P(a_n \mid t) \times P(t)$$
where P(t|a) is the class probability given the feature (the posterior), P(t) is the class prior probability, P(a) is the feature prior probability and P(a|t) is the feature probability given the class. The probabilities of all n attributes are multiplied because of the class conditional independence assumption. First, an occurrence table for each feature is constructed against the class. The occurrence tables are then transformed into likelihood tables, which are used in the naive Bayes equation to calculate the posterior probability for each class. Among all the classes, the one with the maximum posterior probability becomes the predicted output.
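The occurrence-table procedure described above can be illustrated with a minimal sketch for categorical features. The function names and the six toy records are hypothetical and are not taken from the study; no smoothing is applied, which is an assumption of this sketch.

```python
from collections import Counter, defaultdict


def fit_naive_bayes(rows, labels):
    """Build the class counts and the per-feature occurrence tables."""
    class_counts = Counter(labels)
    # occurrence[i][t][v] = number of class-t samples whose feature i equals v
    occurrence = defaultdict(lambda: defaultdict(Counter))
    for row, t in zip(rows, labels):
        for i, value in enumerate(row):
            occurrence[i][t][value] += 1
    return class_counts, occurrence


def posterior(row, class_counts, occurrence):
    """Return P(t | a) for every class t, normalised to sum to 1."""
    total = sum(class_counts.values())
    scores = {}
    for t, n_t in class_counts.items():
        score = n_t / total  # class prior P(t)
        for i, value in enumerate(row):
            # likelihood P(a_i | t), read from the occurrence table
            score *= occurrence[i][t][value] / n_t
        scores[t] = score
    evidence = sum(scores.values())  # P(a) under the independence assumption
    if evidence == 0:
        # Feature values never seen with any class: fall back to the priors.
        return {t: n_t / total for t, n_t in class_counts.items()}
    return {t: s / evidence for t, s in scores.items()}


def predict(row, class_counts, occurrence):
    """Pick the class with the maximum posterior probability."""
    post = posterior(row, class_counts, occurrence)
    return max(post, key=post.get)


# Hypothetical toy records: (bilirubin level, elevated enzyme flag)
rows = [("high", "yes"), ("high", "no"), ("low", "yes"),
        ("low", "no"), ("high", "no"), ("low", "no")]
labels = ["diseased", "diseased", "diseased", "normal", "normal", "normal"]

counts, tables = fit_naive_bayes(rows, labels)
post = posterior(("high", "no"), counts, tables)
prediction = predict(("high", "no"), counts, tables)
```

For the query record, the unnormalised scores are 1/9 for the diseased class and 1/6 for the normal class, so the normal class has the larger posterior and is returned as the prediction.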
ANN-based models have wide applicability in medical diagnosis. The workflow consists of selecting data, creating and training a network, validating and testing against the targets, and evaluating the performance using confusion matrices and mean squared error. In this study, a feed-forward neural network (FFNN) with sigmoid hidden and output neurons was trained as a backpropagation network (BPN) using the scaled conjugate gradient algorithm. FFNN is a biologically inspired classification algorithm based on supervised learning. The hidden layer consisted of eight neurons placed in parallel, each performing a weighted summation of its inputs and then passing the result through a sigmoid nonlinear transfer function [57]. Each input is connected to a neuron by a weight, and the weighted sum of inputs calculated by a neuron is called its activation. Nonlinear output neurons were found advantageous for classification. The backpropagation algorithm applied for training minimizes the cost function, which is the mean squared difference between the actual and desired output values, through a gradient descent technique. Its structure is mainly based on batch learning. The study uses one hidden layer because too many hidden layers were found to generate poor results. Suitable values for the learning rate, momentum coefficient and transfer function interval were used to obtain efficient classification results and to reach optimal convergence.
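As an illustration of the weighted summation and sigmoid transfer described above, the following is a minimal sketch of one forward pass through a network with ten inputs, eight hidden neurons and one output neuron, together with the mean squared error. The weights are random and untrained, the input vector is hypothetical, and the scaled conjugate gradient training step is not reproduced here.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass through a single-hidden-layer sigmoid network."""
    # Each hidden neuron forms a weighted sum of the inputs (its activation)
    # and passes it through the sigmoid transfer function.
    hidden = [sigmoid(sum(w * xi for w, xi in zip(weights, x)) + b)
              for weights, b in zip(w_hidden, b_hidden)]
    # The single output neuron is also sigmoid, so the result lies in (0, 1).
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def mean_squared_error(targets, outputs):
    """Average of the squared differences between desired and actual outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_in, n_hidden = 10, 8  # ten inputs, eight hidden neurons, one output
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
            for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
b_out = 0.0

sample = [0.5] * n_in  # hypothetical, already-scaled input vector
output = forward(sample, w_hidden, b_hidden, w_out, b_out)
```

Training would adjust `w_hidden`, `b_hidden`, `w_out` and `b_out` to reduce `mean_squared_error` over the training records; any optimizer used for that purpose in a reimplementation is a separate design choice.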
CART is one of the key methods of data mining and has been prominent in the field of advanced analytics. It is a nonparametric method that automatically performs variable selection. It improves performance by revealing the important relationships among features in the dataset and representing them in the form of a tree [4]. It can easily handle both categorical and numerical variables. It works in three parts: building the maximum tree (tree growing), selecting the right-sized tree (tree pruning) and classifying the testing data using the built tree. The tree was grown by splitting the learning data using the Gini impurity criterion. In the growing stage, the training samples were split recursively, down to the last observations, until the Gini diversity index was minimized in each terminal node. The impurity function used by the Gini splitting rule is as follows.

$$i(n) = \sum_{x \neq y} p(x \mid n)\, p(y \mid n) \qquad (7)$$
where $n$ is a node, $x$, $y$ are class labels and $p(y \mid n)$ is the conditional probability of observing a sample from class $y$ at node $n$. After minimization, the optimal tree is selected by the tree pruning procedure. The Gini splitting criterion $\Delta i(s, n)$ is defined as follows.
$$\Delta i(s, n) = i(n) - p_L\, i(n_L) - p_R\, i(n_R)$$
where $s$ is a candidate split at node $n$, $n_L$ and $n_R$ are the left and right child nodes, and $p_L$ and $p_R$ are the probabilities of the left child node and right child node respectively. The cost-complexity function $R_\alpha(N)$ finds the optimal trade-off between the misclassification error and tree complexity:
$$R_\alpha(N) = R(N) + \alpha\, |\tilde{N}|$$
where $R(N)$ is the misclassification rate of tree $N$, $|\tilde{N}|$ is the number of terminal nodes in the tree and $\alpha$ is the complexity measure. CART methodology uses surrogate splits to deal with missing data in the attributes; however, if the dependent attribute of a subject is missing, or all the attributes of a subject are missing, then that sample is ignored.
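The Gini impurity and the impurity decrease used to choose a split can be sketched as follows. This is a minimal illustration on a single numeric feature with hypothetical values; the function names are assumptions of this sketch, and pruning and surrogate splits are not covered.

```python
from collections import Counter


def gini_impurity(labels):
    """i(n) = sum over x != y of p(x|n) p(y|n), which equals 1 - sum_y p(y|n)^2."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())


def gini_gain(parent, left, right):
    """Impurity decrease of a split: i(n) - p_L * i(n_L) - p_R * i(n_R)."""
    p_left = len(left) / len(parent)
    p_right = len(right) / len(parent)
    return (gini_impurity(parent)
            - p_left * gini_impurity(left)
            - p_right * gini_impurity(right))


def best_threshold(values, labels):
    """Scan thresholds on one numeric feature and return (threshold, gain)."""
    best = (None, 0.0)
    # The largest value is skipped because it would leave the right child empty.
    for threshold in sorted(set(values))[:-1]:
        left = [t for v, t in zip(values, labels) if v <= threshold]
        right = [t for v, t in zip(values, labels) if v > threshold]
        gain = gini_gain(labels, left, right)
        if gain > best[1]:
            best = (threshold, gain)
    return best


# Hypothetical feature values with class 1 (diseased) and class 2 (normal)
values = [1, 2, 3, 4]
labels = [1, 1, 2, 2]
split = best_threshold(values, labels)
```

On this toy input the parent impurity is 0.5, and splitting at `value <= 2` yields two pure children, so the whole impurity is removed by that split.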

EXPERIMENTAL RESULTS
For experimental evaluation, the Indian Liver Patient Dataset was taken from the UCI (University of California at Irvine) machine learning repository. The dataset is multivariate and includes 10 attributes, 2 classes and 583 samples. The attributes are age, gender, total bilirubin, direct bilirubin, albumin and globulin ratio, alkaline phosphatase, albumin, alanine aminotransferase, aspartate aminotransferase and total proteins. The two classes are categorized as normal and diseased. Among the 583 instances, 416 are liver patients and 167 are healthy individuals. These records were collected from the north east of Andhra Pradesh, India. Each line in the data file constitutes the record of a single male or female individual. In total there are 441 male and 142 female records. The basic attributes/indices for liver disorder are described in Table 1.
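As a quick consistency check on the reported counts, the short sketch below verifies that the class and gender breakdowns both add up to 583 and computes the accuracy of a trivial rule that labels every record as diseased. The baseline is an added point of reference for reading accuracy figures on this imbalanced dataset, not a result reported by the study.

```python
n_samples = 583
n_patients, n_healthy = 416, 167
n_male, n_female = 441, 142

# Both breakdowns should add up to the reported sample count.
assert n_patients + n_healthy == n_samples
assert n_male + n_female == n_samples

# Accuracy of a trivial rule that labels every record as "diseased".
majority_baseline = max(n_patients, n_healthy) / n_samples
print(f"majority-class baseline: {majority_baseline:.1%}")
```

The baseline comes to roughly 71.4%, since about seven in ten records belong to the diseased class.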
In general, the performance of a classifier depends upon the structure of the dataset. The experiments showed that LDA, DLDA, QDA, DQDA and NB did not produce adequate results. ANN performed better than these algorithms and CART was superior to all. In the ANN-based model, a three-layered FFNN was deployed with an input layer of ten inputs, one hidden layer and one output layer. Initially, several numbers of nodes (4, 6, 8, 16, 17, 21) were tested in the hidden layer. The range of epochs was set from 10 to 1000. Based on the output results, the best architecture was 10-8-1, that is, 10 neurons in the input layer, 8 neurons in the hidden layer and 1 neuron in the output layer, reached with 27 epochs. Mean squared error (MSE) and receiver operating characteristic (ROC) curves of the training, validation and testing data are presented in Figures 2, 3, 4 and 5 for examining the convergence of the architecture, and the overall ROC is presented in Figure 6. MSE is obtained by finding the differences between the desired and actual outputs, squaring them and averaging over all classes and the internal validation. The diagnostic accuracy rate of the best FFNN architecture was 75.90%.
CART works competently for classification with two output classes, as it follows a strict binary tree structure in which each split yields two nodes. The binary splitting procedure is applied recursively until further division is impossible. In the proposed CART model, the samples were first split into training and testing groups. Training data was used for building the CART model and testing data was used for examining its performance. The built model extracted rules from the health examination data and classified each record into the diagnosed class (class 1) for patients suffering from liver disorder or the normal class (class 2) for healthy individuals. Once the optimal tree was built, each terminal node was associated with a set of rules.
The optimal decision tree and the set of classification rules extracted from the optimal tree built by the CART model are given in Figure 7 and Table 2 respectively. On the testing data, the validation showed a diagnostic accuracy rate of 84.22%. To validate the proposed classification algorithms, the selected dataset was partitioned into two parts (training set and testing set) for LDA, DLDA, QDA, DQDA, NB and CART and into three parts (training set, validation set and testing set) for ANN. Partitioning was done using the holdout cross-validation method in order to minimize the potential bias of samples. Seventy percent of the data was used for training and thirty percent for testing, to evaluate and compare the diagnostic accuracy rates of the classifiers. The training and testing data remained the same for all classifiers except ANN, which used sixty percent for training, twenty percent for validation and twenty percent for testing. Partitioning also helped in estimating misclassification probabilities. In order to identify the most efficient predictive model for liver disorder diagnosis, the experimental results of the classifiers (LDA, DLDA, QDA, DQDA, NB, ANN and CART) were compared with each other. Table 2 shows the achieved diagnostic accuracy rates; the CART model takes the lead, followed by FFNN as the runner-up. The literature shows that the performance of a classification algorithm differs from one data structure to another. For instance, Yildirim (2003) found the performance of an ANN-based model better than that of decision trees, naive Bayes and Bayesian networks, while Floares (2009) developed a decision tree based model which was superior to SVMs, Bayesian networks and various neural network architectures. This study also examined a number of data mining methods for attaining adequate results in diagnosing liver disorder.
For example, a variety of ANN models with different numbers of hidden nodes and learning parameters were examined to select the best architecture. CART also achieved significant results and was considered an optimal classifier for assisting physicians, as it forms a path through clinical rules to conclude whether an individual is sick or healthy. Its structure is simple to interpret, which makes complex clinical correlations easy to understand.
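The holdout partitioning used in this section (70/30 for most classifiers, 60/20/20 for ANN) can be sketched as follows. The function name, the fixed seed and the rounding of part sizes are assumptions of this sketch; the study does not state the exact number of records in each part.

```python
import random


def holdout_split(records, fractions, seed=0):
    """Shuffle once, then cut the records into consecutive parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    parts, start = [], 0
    for fraction in fractions[:-1]:
        size = int(round(fraction * len(shuffled)))
        parts.append(shuffled[start:start + size])
        start += size
    parts.append(shuffled[start:])  # the last part takes whatever remains
    return parts


records = list(range(583))  # stand-in indices for the 583 samples

# Two-way split for LDA, DLDA, QDA, DQDA, NB and CART
train, test = holdout_split(records, (0.7, 0.3))

# Three-way split for the ANN
ann_train, ann_val, ann_test = holdout_split(records, (0.6, 0.2, 0.2))
```

With this rounding rule the two-way split holds 408 training and 175 testing records. Reusing the same seed keeps the training and testing parts identical across classifiers, which matches the requirement that they be compared on the same data.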

CONCLUSION
Liver disease is one of the major causes of mortality in India as well as around the world. Its wide and hidden presence makes it a serious concern across medicine. It has been consistently listed among the top ten fatal diseases around the globe, costing millions of lives every year. The lack of timely diagnosis and appropriate treatment is visible in the registered cases of liver disorders in hospitals. Accurate assessment is therefore essential to save human lives. Analysis and interpretation of a collection of symptoms, risk factors, laboratory test results and other vital examination figures is a highly demanding task in medical diagnosis and becomes more complex if the figures are fuzzy. It stretches the decision time of even experienced clinicians. Moreover, novice physicians may need years to gain substantial expertise in analyzing the complex and uncertain examination data of patients. An accurate diagnosis is still not guaranteed, as humans are prone to errors for reasons such as heavy clinical workload or poor health.
Therefore, to interpret multifaceted datasets, to compensate for clinical inexperience and to reduce the time and effort needed, the study deployed a number of linear, nonlinear and decision tree classification algorithms and presented a predictive model for liver disorder diagnosis. These algorithms include LDA, DLDA, QDA, DQDA, NB, ANN and CART, of which CART was found superior in terms of accuracy rate. Apart from giving the best performance, it also built a set of rules that provide valuable insight into the relationships between the predictor attributes and the target attribute for the diagnosis. The implementation of such diagnostic system models has brought a major transformation to the field of information retrieval, and the medical domain has also been widely affected by this change. Intelligent classification algorithms enable these diagnostic systems to work like the human brain. A number of authors have written about the role of computational intelligence in medicine. Although disease diagnosis primarily relies on the physician's clinical experience, computational intelligence does help in making appropriate judgments. There is considerable scope for future research in identifying efficient classification algorithms by changing the structure of the dataset and increasing the number of samples in it. The number of attributes can also be extended to find decisive correlations between them. Two or more classification algorithms can also be integrated to refine and diversify the achieved results.