An Implementation of RCA to Find High Accuracy & Least Error Rate Using Weka Tool over the Various Data Sets

Abstract: Data mining is an emerging research area for solving a range of problems, and classifying datasets is one of its main applications. Data mining refers to extracting knowledge from large volumes of records. In this implementation, several datasets are compared using RCA (Reliability Classification Algorithm). Classification assigns each item in a dataset to one of a predefined set of classes. The datasets were collected from the education, health care and agriculture sectors, among others. The current study is designed to evaluate the performance of the decision tree Reliability Classification Algorithm using the freely available data mining tool WEKA over the different datasets. A reliability classification algorithm is chosen for several datasets on the basis of high classification accuracy and minimum error rate. This work was carried out as a performance evaluation of RCA. The results in this paper indicate that RCA performs well over various datasets, better than other classifiers.

of patient treatment and to save human resources. There are various data mining techniques [11] such as Association, Classification, Clustering, Neural Network and Regression.

II. RELATED WORK
A. Improved J48 Classification: Prediction of Diabetes
In this paper, Gaganjot Kaur and Amit Chhabra [3] discuss an efficient data mining procedure for predicting diabetes from patient medical records using an improved J48 classifier. It increases the accuracy rate on the collected datasets and reduces the classification errors produced on the diabetes datasets. The implementation uses WEKA as an API from Matlab. The experimental results show that the improved J48 predicts diabetes effectively from medical records, achieving an accuracy of up to 99.87% in comparison with existing classification methods.

B. Using Various Classification Techniques on Healthcare Datasets to Find Performance Analysis
In this work, Shelly Gupta, Dharminder Kumar and Anand Sharma [1] focus on achieving the highest classification accuracy and the lowest error rate over healthcare datasets. They use three machine learning tools, namely WEKA, Tanagra and Clementine. With these tools, the authors carry out a performance analysis of various decision tree algorithms, kNN, SVM, NB, MLP and CART on particular datasets. The outcome on a specific dataset depends on the nature of its attributes and on its size. In the results, two prediction methods, SVM and kNN, are closely matched, with high accuracy rates (96.74% and 97.28%).

C. Estimation of missing values using decision tree approach
Gimpy, Dr. Rajan Vohra and Minakshi [5] discuss in detail the problem of missing values in prepared datasets. They estimate missing data in university student records using the C4.5/J48 classification algorithm, implemented with the data mining tool WEKA. The input dataset is in MS Excel format and is converted to .csv format for WEKA. The input dataset is evaluated using a confusion matrix, which contains information about the actual and predicted classifications produced by a J48 classification system. The J48 algorithm is applied, and accuracy is calculated for both the incomplete data and the imputed data. As a result, accuracy is greater for the imputed dataset than for the incomplete dataset.

D. Decision Tree Approach to Detect Characteristic of Bt Cotton Base on Soil Micro Nutrient
In this paper, Youvrajsinh Chauhan and Jignesh Vania [7] aim to improve crop production and identify crop disease with the help of soil data gathered across a large number of crop fields in an agricultural setting. The approach uses J48 classification to support decision making in agriculture, with the data mining and machine learning tool WEKA used to predict results from the dataset. It gives more accurate predictions of soil fertility. The soil and crop data calculated for Bt Cotton give differing ranges of values that indicate whether yellow crop disease is present. The paper reports that the proposed algorithm gives 83.74% correctly classified results, and that the J48 algorithm provides the highest rate of correct results, 91.90%, for predicting crop production and identifying crop disease.

E. To Predict Slow Learners in Education Sector Based Data Mining Classification Algorithm
This paper focuses on identifying slow learners among students [9] and presenting them through a predictive data mining model built with classification algorithms such as MLP, NB, SMO, J48 and REPTree, using the open source machine learning tool WEKA. The study processes a high school dataset of 152 records from educational data mining. The dataset is passed to the attribute evaluator, and the data are declared as a set of variables. The author then applies the classification algorithms and compares their output, and also ranks the variables using the Ranker search method in WEKA. The dataset is tested with the five classification algorithms, each of which yields an accuracy result. Finally, the author finds that the MLP (Multilayer Perceptron) technique performs best, with an accuracy of 75%. The performance of MLP is therefore higher than that of the other classification algorithms.
III. METHODOLOGY
A. WEKA Tool
WEKA (Waikato Environment for Knowledge Analysis) [12] is a software package developed at the University of Waikato in New Zealand, originally for classifying data gathered from agricultural fields. Weka supports the standard data mining tasks: data preprocessing, classification, clustering, association, regression and feature selection. It is open source software that is readily available on the web.
In Weka, datasets should be arranged in the ARFF file format. Weka also provides file converters, which the Weka Explorer uses automatically if it does not recognize a given file as an ARFF file. The Classify tab in the Weka Explorer is used to classify data. Weka offers a wide range of classifiers, grouped into categories such as bayes, functions and trees.
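To make the ARFF layout concrete, the following is a minimal sketch of converting CSV text into ARFF text. It is an illustration only, not part of the paper's workflow: the function name `csv_to_arff` and the sample weather data are hypothetical, every column is assumed to be nominal, and attribute names and values are assumed to contain no spaces or commas (which would require quoting).

```python
import csv
import io


def csv_to_arff(csv_text, relation):
    """Convert CSV text with a header row into ARFF text.

    Assumption: every column is nominal; "?" marks a missing value.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]

    lines = [f"@relation {relation}", ""]
    for col, name in enumerate(header):
        # A nominal attribute lists its distinct values inside braces.
        values = sorted({row[col] for row in data if row[col] != "?"})
        lines.append(f"@attribute {name} {{{','.join(values)}}}")

    lines += ["", "@data"]
    lines += [",".join(row) for row in data]
    return "\n".join(lines) + "\n"


# Hypothetical sample data, used only to show the resulting layout.
sample = "outlook,windy,play\nsunny,false,no\nrainy,true,no\novercast,false,yes\n"
arff_text = csv_to_arff(sample, "weather")
print(arff_text)
```

The output starts with an `@relation` line, declares one `@attribute` per column, and lists the records after `@data`.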
The process for applying classification methods to a dataset and obtaining the end result in Weka is as follows:
• Process 1: Load the input dataset and convert it to the specified file format.
• Process 2: Apply the Reliability Classification Algorithm to the collected dataset.
• Process 3: Note the degree of accuracy given by RCA and the time required for execution.
• Process 4: Record the accuracy obtained by the Reliability Classification Algorithm for the particular dataset.
The experiments are conducted on a system with an Intel Core processor, 2 GB of DDR3 memory and a 500 GB HDD. Each experiment is conducted 3 times, and the average accuracy and time are recorded.
• Choose any suitable test option.
• Click the Start button; the output will be displayed.
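The protocol of repeating each experiment three times and averaging accuracy and time can be sketched as follows. This is a hedged illustration, not the authors' setup: `majority_class_accuracy` is a hypothetical stand-in for an actual classifier run, and the label list is toy data.

```python
import time
from collections import Counter
from statistics import mean


def majority_class_accuracy(labels):
    """Hypothetical stand-in for a classifier run.

    Predicts the most common label for every record and returns the accuracy.
    """
    _, count = Counter(labels).most_common(1)[0]
    return count / len(labels)


def run_repeated(experiment, labels, repeats=3):
    """Run `experiment` `repeats` times; return (average accuracy, average seconds)."""
    accuracies, durations = [], []
    for _ in range(repeats):
        start = time.perf_counter()
        accuracies.append(experiment(labels))
        durations.append(time.perf_counter() - start)
    return mean(accuracies), mean(durations)


# Toy labels, for illustration only.
labels = ["pass", "pass", "fail", "pass"]
avg_accuracy, avg_seconds = run_repeated(majority_class_accuracy, labels)
print(f"average accuracy: {avg_accuracy:.2%}, average time: {avg_seconds:.6f} s")
```

Averaging over several runs mainly smooths the timing measurement; for a deterministic classifier on a fixed dataset the accuracy is the same in every run.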

B. RCA-Reliability Classification Algorithm
A Reliability Classification decision tree classifies a given data sample through successive stages of decisions that lead to a final decision. Such a sequence of decisions is represented as a tree structure. The tree structure is used to classify unknown data records.
All datasets studied are of the categorical type, so continuous data are not examined at this stage. The algorithm does, however, leave room for adaptation to include this capability. The algorithm will be tested for verification purposes.
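How a tree classifies a record through successive decisions on categorical attributes can be sketched as below. This is a generic decision tree traversal under stated assumptions, not the RCA implementation: the tree, its attribute names and its class labels are hypothetical.

```python
# Hypothetical tree: internal nodes are dicts naming the attribute to test,
# with one branch per categorical value; leaves are class-label strings.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "windy",
            "branches": {"true": "no", "false": "yes"},
        },
        "overcast": "yes",
        "rainy": "no",
    },
}


def classify(node, record):
    """Walk from the root to a leaf, taking one decision per level."""
    while isinstance(node, dict):
        value = record[node["attribute"]]
        node = node["branches"][value]
    return node


prediction = classify(tree, {"outlook": "sunny", "windy": "false"})
print(prediction)  # yes
```

Because every attribute is categorical, each internal node needs only a lookup on the attribute's value; supporting continuous attributes would require threshold tests at the nodes instead.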
In Weka, the implementation of a particular learning algorithm is encapsulated in a class, which may depend on other classes for some of its functionality. The RCA class builds a tree-structured model. Each time Weka executes RCA, it creates an instance of this class by allocating memory for building and storing a decision tree classifier. The algorithm, the classifier it builds, and a procedure for outputting the classifier are all part of that instantiation of the RCA class.
Larger programs are usually split into more than one class. The RCA class does not itself contain the code for building the tree structure. It holds references to instances of other classes that do most of the work. When there are many classes, as in the Weka software, they become difficult to comprehend and navigate.

C. Datasets
We used three datasets in this paper. The details of each dataset are shown in Table 1. The datasets used for the tests come from the UCI Machine Learning Repository. Since we are dealing with classification tasks, we selected datasets whose class values are nominal. The selection further depended on dataset size, since larger datasets generally mean higher confidence. We chose different kinds of datasets because we also wanted to test whether the performance of an algorithm depends on the kind of dataset used.

Datasets
IV. RESULT
A confusion matrix can be used to assess the quality of a classifier. Consider RCA running on a dataset in WEKA: if the dataset has three classes, we obtain a 3x3 confusion matrix. All the datasets are applied to RCA for classification. The model is built from the training data and then used to classify the training dataset, and the number of correctly classified instances is observed. Applied to the datasets, RCA gives good experimental results, with high accuracy and a low error rate. The confusion matrix helps us derive various evaluation measures such as accuracy, recall and precision. For the Education dataset, the accuracy parameters are shown in Table 2 and Fig 2. RCA provides good accuracy for this dataset.
Various decision tree algorithms can be used for prediction and classification on different datasets. This study showed that the Reliability Classification Algorithm gives 83.74% accuracy; hence it can be used as a base learner. The resulting prediction model helps to improve the prediction and classification of data.
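The evaluation measures derived from a confusion matrix can be illustrated with a short sketch. The 3x3 matrix below is made-up example data, not a result from this study, and the function name `evaluate` is hypothetical; rows are assumed to be actual classes and columns predicted classes.

```python
def evaluate(matrix):
    """Return accuracy, error rate, and per-class precision and recall.

    matrix[i][j] = number of instances of actual class i predicted as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))  # diagonal = correct predictions
    accuracy = correct / total

    precision, recall = [], []
    for k in range(n):
        predicted_k = sum(matrix[i][k] for i in range(n))  # column sum
        actual_k = sum(matrix[k])                          # row sum
        precision.append(matrix[k][k] / predicted_k if predicted_k else 0.0)
        recall.append(matrix[k][k] / actual_k if actual_k else 0.0)

    return accuracy, 1 - accuracy, precision, recall


# Made-up 3x3 confusion matrix for a three-class problem.
example = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
accuracy, error_rate, precision, recall = evaluate(example)
print(f"accuracy={accuracy:.4f} error rate={error_rate:.4f}")
print("precision per class:", precision)
print("recall per class:", recall)
```

Here 23 of the 30 instances lie on the diagonal, so the accuracy is 23/30 and the error rate is 7/30.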
V. CONCLUSION
This research studied various datasets using the data mining toolkit Weka to find high accuracy and a low error rate. After analyzing the results, we found that the tree model can be generated in very little time. Weka is very efficient at generating decision trees. In terms of classifier applicability, we conclude that Weka is better in its ability to run the classifier and in its error rate. Weka is also faster than other tree generation tools, as its internal structure is organized in columns in memory. Through this study, we conclude that Weka is a good tool for our proposed Reliability Classification Algorithm to predict the data. We also found that the Reliability Classification Algorithm works well in decision tree induction. In future, we can apply this algorithm to more data and larger sets of patient records, educational records and agriculture records to produce better results than other classification algorithms.