Multi-task Deep Learning Model for Classifying MRI Images of AD/MCI Patients

Accurate diagnosis of Alzheimer's disease (AD) and of its prodromal stage, Mild Cognitive Impairment (MCI), plays a vital role in preventing the progression of both conditions. In view of that, this paper proposes a multi-kernel classifier model for AD/MCI patients that uses a noninvasive imaging technique. The proposed model combines four techniques within a deep learning framework: PCA, stability selection, multi-task deep models with dropout, and AD/MCI diagnosis with a kernel SVM. The proposed approach is evaluated on the real-world ADNI (Alzheimer's Disease Neuroimaging Initiative) dataset, and its results are analyzed.


II. RELATED WORKS
This section presents related works that cover a wide range of approaches and techniques, applied to different datasets and achieving different accuracy levels.
Xiaohong W. Gao et al. [5] introduced a machine learning approach (CNN) for distinguishing Alzheimer's disease, lesions, and normal ageing using 2D and 3D scans (282 subjects: AD-51, lesions-118, normal healthy subjects-117). These scans have a large slice thickness of 5 mm in the depth direction. An accuracy of 86.8% was achieved. Francesco Carlo Morabito et al. [6] discussed how different biomarkers yield promising, accurate outcomes and how different imaging modalities give different views of brain function for diagnosing AD/MCI with a CNN approach (119 dementia patients: 63 AD subjects, 56 MCI subjects). For example, EEG signals are used to distinguish the EEG pattern of AD from that of the prodromal stage of dementia. The CNN approach attained an accuracy of 95%.
Only a few research studies focus on multimodal neuroimaging for AD/MCI classification. For example, Ehsan Hosseini-Asl et al. [7] applied a deep learning approach (3D CNN) to structural MRI scans. The proposed method uses a 3D ACNN classifier to detect AD accurately from structural MRI (210 subjects from ADNI). The ACNN classifier reached an accuracy of 95%.
Saman Sarraf et al. [8] discussed the use of resting-state functional magnetic resonance imaging (rs-fMRI) data and fMRI data to recognize AD during a clinical scan. The proposed method combines SVM and CNN approaches (ADNI: 24 female and 19 male subjects). An accuracy of 96.85% was achieved.
Saman Sarraf et al. [9] introduced a machine learning approach (CNN) with SVM to distinguish Alzheimer's disease from normal healthy subjects using a single imaging modality (ADNI, age group > 75). An accuracy of 98.84% was achieved.
Sigi Liu et al. [10] introduced a stacked autoencoder (SAE) with a softmax layer for diagnosing AD and MCI in the early stages. The proposed method incorporates an optimized graph cut algorithm, single-kernel SVM (SKSVM), and multi-kernel SVM (MKSVM). It is a semi-supervised method, which allows unlimited use of unlabeled data samples (ADNI, 311 subjects: 65 AD, 67 cMCI, 102 ncMCI, 77 normal controls). The SAE with softmax layer achieved an accuracy of 87.76%. Chen Zu et al. [11] developed two main components, multimodal classification and multitask feature selection, for AD/MCI patients. A multi-kernel SVM is used to classify the multimodal image data for the final classification (ADNI, 202 participants: 51 AD, 99 MCI, and 52 normal controls). The multi-kernel SVM method shows an accuracy of 95.95%.
Heung-Il Suk et al. [12] introduced a deep learning approach based on a stacked auto-encoder (SAE) for AD/MCI diagnosis. The SAE model makes it possible to find the best parameters fitted to the sample data (ADNI: 51 AD patients, 99 MCI patients, and 52 HC subjects). The method uses the feature representation learned by the SAE for brain disease classification. An accuracy of 98.8% was achieved.
The works discussed above use machine learning and deep learning approaches (CNN, SVM & CNN, SKSVM, and MKSVM) with single and multiple imaging modalities. These classifier approaches have some drawbacks: the need for larger datasets, increased memory cost, data redundancy, privacy of the original data, scalability, and efficiency. To overcome these disadvantages, the following approaches are proposed: PCA; ridge regression, the lasso method, and Elastic Net in stability selection; deep learning with dropout; and the additional clinical scores MoCA, BNT, and CFT, to achieve better accuracy.

III. DESIGN OF PROPOSED FRAMEWORK
This section describes the method of the proposed work. Principal Component Analysis (PCA) is used for dimensionality reduction and feature extraction from the preprocessed data. Stability selection is used to identify and preserve commonly found dominant features. Ridge and lasso are well-known techniques in stability selection, where a large number of features are generated. The Elastic Net method is particularly useful when the feature variables are highly correlated. In deep learning, the selected features are processed with the dropout technique, which is used to enhance the generalization capability of the model. Finally, a support vector machine (SVM) is used to classify AD/MCI [15].
In support of these techniques, complementary clinical scores are provided to achieve better diagnostic accuracy. The complementary clinical scores are generated by clinical experts, since no single test proves that a person has Alzheimer's. Five additional clinical scores are used, namely the Mini-Mental State Examination (MMSE) [16], the ADAS-CogIRT [17] [18], BNT, MoCA, and CFT [19]. When the first and second of these methodologies are incorporated simultaneously, fewer data sets and shorter trial durations are required. The information from the scores is used to support the AD diagnosis. The main concept behind this approach is to structure deep learning as a multi-task learning (MTL) framework. The proposed work consists of multiple components: PCA, stability selection, unsupervised feature learning, multi-task deep learning, and kernel SVM, as shown in Fig 2. In the following subsections, each of these components is explained in detail.

A. Principal Component Analysis
Principal Component Analysis is an efficient technique for processing medical images to identify diseases. The main objective of PCA is to rotate the coordinate axes of the sample data so that the first axis aligns with the direction of maximum variance. For each subsequent axis, the direction of next-largest variance is identified for feature selection. PCA derives uncorrelated variables from a set of features. In effect, it re-expresses the variables along directions in which the required components can be seen clearly. The first principal component (PC-1) captures the largest variance in the original dataset. PCA has been widely used for tasks such as prediction, redundancy detection and removal, extraction of required variables, and data compression.
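The rotation described above can be illustrated with a minimal sketch. It assumes NumPy and uses synthetic data in place of MRI features; the function name `pca` and all sizes are illustrative choices, not part of the proposed framework.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto the top principal components."""
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions,
    # ordered by decreasing singular value (and therefore decreasing variance).
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    scores = X_centered @ components.T
    return scores, components, explained_variance[:n_components]

# Synthetic stand-in for a feature matrix (100 samples, 5 correlated features).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))
scores, components, variance = pca(X, n_components=3)
```

The projected variables in `scores` are mutually uncorrelated, and the first column corresponds to PC-1, the direction of largest variance.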

B. Stability Selection
PCA is applied first to reduce the dimensionality. Stability selection is then applied for feature selection. Widely accepted methods used in stability selection are ridge regression, the lasso [20], and Elastic Net. Ridge regression is applied to find the covariance value of the given data. It starts with the smallest value among the PCs and proceeds toward the diagonal elements of the matrix. One of the main goals of using ridge regression is to select an exact value. The feature values obtained from ridge regression are fed to the lasso for further extraction [21]. The lasso identifies the top-level features for AD/MCI diagnosis, for example (a1, a2, a3, a4, a5), while also trying to reduce the cost function for feature selection. The lasso then picks the distinct features (PCs) from PCA, s = [s1, s2, s3, ..., st]^T. The idea behind this technique is to improve the selection by repeating the procedure multiple times on the data; the lasso step above is repeated 50 times. We then apply the Elastic Net technique to handle regularization and high-dimensional data. From a group of correlated variables, the lasso tends to select only one covariate, whereas Elastic Net can select several.
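The repeated lasso step can be sketched as follows. This is only an illustration under stated assumptions: it covers the repeated-lasso part (not the ridge or Elastic Net stages), draws a random half of the samples in each of the 50 rounds, and uses a hand-written coordinate-descent lasso so that only NumPy is needed. The names `lasso_cd` and `stability_selection`, the penalty `alpha`, and the 0.6 frequency threshold are hypothetical choices, and the data are synthetic.

```python
import numpy as np

def lasso_cd(X, y, alpha, n_iter=100):
    """Lasso by cyclic coordinate descent: min (1/2n)||y - Xw||^2 + alpha*||w||_1."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    residual = y.copy()  # equals y - X @ w for w = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = X[:, j] @ residual / n + col_sq[j] * w[j]
            # Soft-thresholding sets weak coefficients exactly to zero.
            w_new = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
            residual += X[:, j] * (w[j] - w_new)
            w[j] = w_new
    return w

def stability_selection(X, y, alpha=0.1, n_rounds=50, threshold=0.6, seed=0):
    """Keep features that the lasso selects in at least `threshold` of the rounds."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_rounds):
        idx = rng.choice(n, size=n // 2, replace=False)  # random half of the samples
        Xs = X[idx] - X[idx].mean(axis=0)
        ys = y[idx] - y[idx].mean()
        counts += np.abs(lasso_cd(Xs, ys, alpha)) > 1e-8
    frequency = counts / n_rounds
    return frequency, np.flatnonzero(frequency >= threshold)

# Synthetic example: only features 0 and 3 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 10))
y = 2.0 * X[:, 0] - 3.0 * X[:, 3] + 0.1 * rng.normal(size=120)
frequency, selected = stability_selection(X, y)
```

Features that survive across most subsamples are treated as stable; features picked only occasionally are discarded.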

C. Multi-task Learning
Multi-task learning is mainly used to improve performance by learning several related tasks simultaneously from the given information. Baseline assessments, namely MMSE, ADAS-CogIRT, BNT, MoCA, and CFT, are used to help identify AD/MCI patients. These five tasks contribute collaboratively to the final outcome.
The Mini-Mental State Examination (MMSE) is one of the most widely used clinical tests for assessing AD patients. ADAS-Cog has low sensitivity for detecting AD. To achieve higher sensitivity, ADAS-CogIRT (the Alzheimer's Disease Assessment Scale-Cognitive subscale scored with item response theory (IRT) modeling) has been implemented.
The Boston Naming Test (BNT) involves showing patients line drawings of objects and asking them to name the objects. The objects are of the kind frequently encountered in day-to-day life. The score from the test is taken as an input.
The Category Fluency Test (CFT) is a test of semantic memory (i.e., mother tongue and verbal fluency). The participant is asked to name exemplars from a given semantic category.
The Montreal Cognitive Assessment (MoCA) is designed as a fast screening tool for mild cognitive impairment. It involves a number of tests that assess different cognitive domains (i.e., attention and concentration, language, memory, visuoconstructional skills, executive function, calculation, orientation, and thinking).
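One common way to realize such a multi-task model is a shared hidden layer with one output head for the diagnostic label and one for the five clinical scores, trained on a joint loss. The sketch below shows only the forward pass and the joint loss, not training, and it is an assumption about the architecture rather than the exact network used here. Layer sizes, the `score_weight` parameter, and the synthetic data are hypothetical.

```python
import numpy as np

def init_params(n_features, n_hidden, n_classes, n_scores, rng):
    """Small random weights for one shared layer and two task-specific heads."""
    return {
        "W_shared": rng.normal(scale=0.1, size=(n_features, n_hidden)),
        "b_shared": np.zeros(n_hidden),
        "W_class": rng.normal(scale=0.1, size=(n_hidden, n_classes)),
        "b_class": np.zeros(n_classes),
        "W_score": rng.normal(scale=0.1, size=(n_hidden, n_scores)),
        "b_score": np.zeros(n_scores),
    }

def forward(params, X):
    """Shared representation feeding a classification head and a score head."""
    hidden = np.tanh(X @ params["W_shared"] + params["b_shared"])
    logits = hidden @ params["W_class"] + params["b_class"]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp_logits = np.exp(logits)
    probs = exp_logits / exp_logits.sum(axis=1, keepdims=True)
    scores = hidden @ params["W_score"] + params["b_score"]
    return probs, scores

def multitask_loss(probs, scores, labels, target_scores, score_weight=1.0):
    """Cross-entropy on the label plus weighted squared error on the clinical scores."""
    n = labels.shape[0]
    cross_entropy = -np.log(probs[np.arange(n), labels]).mean()
    mse = ((scores - target_scores) ** 2).mean()
    return cross_entropy + score_weight * mse

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 20))              # 8 synthetic subjects, 20 selected features
labels = rng.integers(0, 2, size=8)       # e.g. 0 = NC, 1 = AD
target_scores = rng.normal(size=(8, 5))   # standardized MMSE, ADAS-CogIRT, BNT, MoCA, CFT
params = init_params(20, 16, 2, 5, rng)
probs, scores = forward(params, X)
loss = multitask_loss(probs, scores, labels, target_scores)
```

Because both heads read the same hidden layer, gradients from the clinical-score task would shape the representation used for diagnosis, which is the intended benefit of learning the tasks jointly.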

D. Adaptive Dropout Technique
Dropout is a simple and very effective way to regularize a deep neural network model, and it is a very successful technique for training deep learning models. It can handle models with thousands or millions of parameters within minutes. It counters weight co-adaptation by randomly dropping out some units in the model during training [24]. An adaptive dropout technique is applied in this proposal to improve the diagnosis of AD/MCI patients.
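The basic mechanism can be sketched with standard "inverted" dropout. Note that this is the plain variant with a fixed drop probability, not the adaptive variant named in the heading; the function name and the 0.5 drop probability are illustrative assumptions.

```python
import numpy as np

def dropout(activations, drop_prob, rng, training=True):
    """Inverted dropout: zero random units and rescale the survivors."""
    if not training or drop_prob == 0.0:
        return activations  # at test time the layer is the identity
    keep_prob = 1.0 - drop_prob
    mask = rng.random(activations.shape) < keep_prob
    # Dividing by keep_prob keeps the expected activation unchanged.
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
h = np.ones((1000, 50))                            # stand-in hidden activations
dropped = dropout(h, drop_prob=0.5, rng=rng)       # training-time behavior
unchanged = dropout(h, drop_prob=0.5, rng=rng, training=False)
```

Since a different random subset of units is removed on every training step, no unit can rely on the presence of any particular other unit.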

E. Support Vector Machine
SVM plays a vital role in machine learning. It is a supervised learning model whose associated learning algorithms analyze data for classification and regression analysis [25]. The SVM cannot be used within the deep learning model itself. Here, only the baseline method is available for further selection with the SVM steps. A linear kernel will not improve the classification accuracy because the scalar method is in dual form. All the experiments were carried out with a three-hidden-layer model.
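To make the role of a nonlinear kernel concrete, the following sketch trains a kernel SVM on synthetic data that no linear boundary can separate. It is an illustration only: the solver is kernelized Pegasos (a stochastic sub-gradient method without a bias term), swapped in here because it needs nothing beyond NumPy; the RBF kernel, `gamma`, `lam`, and the toy data are all assumptions and do not reflect the settings used in the experiments.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dists = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def kernel_pegasos(K, y, lam=0.01, n_steps=2000, seed=0):
    """Kernelized Pegasos for a kernel SVM; K is the Gram matrix, y is in {-1, +1}."""
    rng = np.random.default_rng(seed)
    alpha = np.zeros(len(y))
    for t in range(1, n_steps + 1):
        i = rng.integers(len(y))
        margin = y[i] * (K[i] @ (alpha * y)) / (lam * t)
        if margin < 1.0:  # hinge loss is active, so sample i gains weight
            alpha[i] += 1.0
    return alpha

def predict(K_test_train, alpha, y_train):
    """Sign of the kernel expansion sum_j alpha[j] * y[j] * k(x_j, x)."""
    return np.sign(K_test_train @ (alpha * y_train))

# Toy data: one class near the origin, the other on a surrounding ring.
rng = np.random.default_rng(1)
inner = rng.normal(scale=0.3, size=(40, 2))
angles = rng.uniform(0.0, 2.0 * np.pi, size=40)
ring = 3.0 * np.column_stack([np.cos(angles), np.sin(angles)])
X = np.vstack([inner, ring])
y = np.concatenate([np.ones(40), -np.ones(40)])

K = rbf_kernel(X, X, gamma=0.5)
alpha = kernel_pegasos(K, y)
train_accuracy = (predict(K, alpha, y) == y).mean()
```

A multi-kernel variant would replace `K` with a weighted sum of several Gram matrices, one per modality or feature group, and leave the solver unchanged.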

IV. EXPERIMENTAL ANALYSIS
A. Dataset
The Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset is used for training and testing the proposed multi-task deep model.
The subjects fall into three main categories: MRI scans from 51 AD patients, 52 normal healthy controls (NC), and 99 MCI patients [43 MCI patients who converted to AD (MCI.C) and 56 MCI patients who did not convert to AD within 18 months (MCI.NC)]. The detailed information acquired from the ADNI dataset is used in the analysis of AD/MCI, with the aim of improving patients' quality of life. The MRI scans were acquired using 1.5 T scanners and were downloaded from the ADNI site (adni.loni.usc.edu) in Digital Imaging and Communications in Medicine (DICOM) format. The dataset contains five additional clinical scores (MMSE, ADAS-CogIRT, BNT, MoCA, CFT) for every patient. The 3-D MRI images are preprocessed with skull stripping, cerebellum removal, anterior commissure correction, and spatial normalization.

B. Experiments
The aim of the proposed work is to identify AD/MCI patients from their brain image data. To identify AD/MCI, normal healthy images are used as a reference to differentiate AD/MCI patients from healthy subjects. This is done with four different combinations: (i) AD versus normal healthy control (AD/NC), (ii) MCI versus healthy control (MCI/NC), (iii) MCI patients versus AD patients (MCI/AD), and (iv) MCI non-converted versus MCI converted (MCI.NC/MCI.C). In the first combination, one AD brain image and one normal brain image are loaded from the set of images. The clinical experts then provide the clinical scores for the loaded images. Based on these clinical scores and brain images, the variation between the two images is calculated by applying the four techniques one by one. First, PCA is applied, and the output of PCA is given as input to stability selection. In stability selection, the covariance value of the given data is identified. Next, in the multi-task stage, the clinical scores are added and the dropout technique is used to remove the smaller units. Finally, the SVM classifier is used to classify the images. The same procedure is followed for the other three combinations, and a comparative table is drawn up to show the results.
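As a quick check on the size of each binary task, the four combinations can be enumerated from the group sizes reported in the Dataset subsection. The helper and variable names below are illustrative only.

```python
# Group sizes as reported for the ADNI subset used in this work.
GROUP_SIZES = {"AD": 51, "NC": 52, "MCI.C": 43, "MCI.NC": 56}

# MCI is the union of converters and non-converters.
MERGED = {"MCI": ("MCI.C", "MCI.NC")}

def group_size(name):
    """Number of subjects in a group, expanding merged groups such as MCI."""
    members = MERGED.get(name, (name,))
    return sum(GROUP_SIZES[m] for m in members)

# The four binary classification tasks.
TASKS = [("AD", "NC"), ("MCI", "NC"), ("MCI", "AD"), ("MCI.NC", "MCI.C")]

task_sizes = {f"{a}/{b}": group_size(a) + group_size(b) for a, b in TASKS}
for task, n_subjects in task_sizes.items():
    print(task, n_subjects)
```

The conversion task (MCI.NC/MCI.C) has the fewest subjects, 99 in total.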

C. Comparison method
The proposed framework is compared with a baseline method. The baseline method has the same components as the proposed work, excluding the deep learning concepts, and is evaluated with more than two imaging modalities for diagnosing AD/MCI patients. The accuracy of the proposed work is tested with a single imaging modality and is compared with the individual components and the baseline, as shown in Table 1. Table 1 summarizes the overall performance of the proposed system. In the proposed method, each component (PCA, stability selection, and dropout) contributed to higher accuracy, and the overall accuracy achieved was higher than that of the baseline method.

D. Results
In the conversion diagnosis (MCI.C versus MCI.NC), the PCA component slightly degraded the proposed method (i.e., from 58.8% to 57.9%). However, the method still performs considerably better than the baseline method (58.8% versus 50.6%).
Comparing all the components, it is evident that dropout is the most important one in deep learning. This is because, without dropout, deep learning did not have much impact relative to the baseline method (67.65% versus 69.42% in terms of average accuracy).

V. CONCLUSION
In recent years, deep learning has become an emerging area in the healthcare industry. An improved model with a multi-kernel support vector machine (MKSVM) for multi-task AD diagnosis is proposed to increase accuracy with high true positive and true negative values. The proposed model uses four techniques: Principal Component Analysis (PCA), Stability Selection (SS), dropout, and multi-task deep learning with SVM. A good accuracy level is achieved with the help of the dropout technique introduced in deep learning, using MRI images of AD, MCI, and HC subjects. Future work will extend this by evaluating multimodal images using a multi-kernel classifier approach.