Performance Evaluation and Prediction of Parallel Big Data using MOA

— Massive data is now being compiled and examined at explosive volumes. The vast volume of data streams produced by modern technological sources has grown at an incredible pace, straining both processing capacity and data curation. Traditional event-driven simulation models can no longer cope with such huge data, yet data-driven decision-making now pervades society, business applications, and scientific work. The costs of data collection, redundant storage, and processing improvements are rising for big data applications. In addition, there are technical challenges such as data inconsistency, redundancy, privacy, scalability, time series, and irrelevant data [1]. Parallel processing is an essential method, particularly for large-scale and time-series experiments. In this work, simulation results for several datasets fit the prediction results well. Speedup and cost-effectiveness analyses are also considered as performance metrics.

service at each individual processing unit. The first step is to identify a partitioning that exposes the available concurrency. Routing a subtask to a processor determines how tasks are transferred across the parallel processing units according to the queuing discipline, and thus governs load balancing in each direction. The latter can be inferred by the compiler.
III. PREDICTION METHOD

A queuing network involves several servers that offer some kind of service to incoming customers. Customers who find all servers busy must wait for their turn in lines attached to the servers; such waiting-line systems are called queuing systems. There are numerous examples that can be modeled as queuing systems, such as communication networks, banking services, computer systems, and production systems. A queuing model can be characterized by six components. First, the customer arrival process is described by the interarrival time and its distribution function; in common practice, customers arrive in a Poisson fashion (exponential interarrival times). Customers may arrive individually or in bulk; an example of bulk arrivals is the immigration office at a border, where the passports of an arriving group of tourists must be checked one by one. Second, customers may behave by waiting patiently in line or by leaving impatiently after a certain while. Call centers, for example, face customers who hang up when they cannot wait for the next available operator, whether or not they try calling again. Third, the service time is in general exponentially distributed and independent of the arrival process, although it may depend on queue size; for instance, the time spent serving an individual customer at a bank may increase when the number of waiting customers grows too large. Fourth, the service discipline explains how each customer receives service, one by one or in bulk; common patterns are defined by the order in which customers are served, such as first come first served, last come first served, random order, priorities, or time-sharing. Fifth, the service capacity can be either a single server or several servers handling the customers. Last, the waiting room may limit the number of customers that can wait in the queue; in a computer network, for example, only a limited number of packets can be buffered at a router. Choosing the proper buffer space is therefore essential in network design.
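As a minimal illustration of these components, the sketch below simulates an M/M/1 queue (Poisson arrivals, a single exponential server, first-come-first-served discipline). The rates and customer count are illustrative choices, not values from this paper.

```python
import random

def simulate_mm1(lam, mu, n_customers, seed=0):
    """Sketch of an M/M/1 queue: Poisson arrivals (rate lam),
    exponential service (rate mu), first-come-first-served."""
    rng = random.Random(seed)
    arrival = 0.0        # arrival time of the current customer
    depart_prev = 0.0    # departure time of the previous customer
    total_wait = 0.0
    for _ in range(n_customers):
        arrival += rng.expovariate(lam)            # exponential interarrival
        start = max(arrival, depart_prev)          # wait if server is busy
        depart_prev = start + rng.expovariate(mu)  # exponential service time
        total_wait += start - arrival
    return total_wait / n_customers                # mean waiting time in queue

# With lam = 0.8 and mu = 1.0, queueing theory predicts a mean wait
# of lam / (mu * (mu - lam)) = 4.0; the simulation should be close.
print(simulate_mm1(0.8, 1.0, 200_000))
```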
The analytical model used for prediction in this research is now introduced. First, the big data is decomposed into n autonomous data sets, which are called tasks. The parallel processing system comprises n+2 servers: a server for partitioning (PS), a server for merging (MS), and n parallel processing devices. Big data arrives at PS, which has an exponential service time. At PS, the big data spawns n tasks of even size; the split tasks are referred to as siblings. All siblings proceed directly, with zero delay, to the parallel processing facility, as shown in the figure, and are processed autonomously. Assigning priorities to tasks can lower the processing time on the parallel system, as presented in [10], but it subsequently leads to a longer reassembly time. As soon as a sibling finishes processing in the parallel network, it proceeds to MS, where it must wait in the buffer until all siblings have completed; once all siblings are complete, they are reassembled into the big data. At this point, the execution process may keep iterating. The period from the big data entering PS to the big data leaving MS is referred to as the residual time (RT), as given in Eq. (1). The prediction method introduced in [4] is applied to collect performance measurements such as processing time and residual time; these results are used for comparison in the subsequent section of the paper. The M/M/n system depicted in Fig. 2 is applicable for the employment of MOA. Big data arrives with exponential interarrival times (rate λ), and the service times are exponentially distributed (rate μ). The big data splitter based on [3] is measured and assumed to take an average time B, and the merger based on the same work is assumed to take an average time M. Results from the MOA simulation will be compared with those from the prediction.
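The split-process-merge pipeline just described can be sketched as follows. Since merging must wait for the slowest sibling, one pass costs the partition time B plus the maximum of the n sibling service times plus the merge time M. The values of B, M, μ, and n here are illustrative placeholders, not measurements from the paper.

```python
import random

def residual_time(n, B, M, mu, rng):
    """One pass of the split-process-merge pipeline:
    RT = B + max(S_1, ..., S_n) + M, where each sibling's service
    time S_i is an independent exponential draw with rate mu."""
    siblings = [rng.expovariate(mu) for _ in range(n)]
    return B + max(siblings) + M

rng = random.Random(1)
n = 8
samples = [residual_time(n, B=0.5, M=0.5, mu=1.0, rng=rng)
           for _ in range(100_000)]
mean_rt = sum(samples) / len(samples)
# E[max of n iid Exp(mu)] is H_n / mu (the n-th harmonic number), so
# E[RT] = B + H_n/mu + M; for n = 8 that is about 0.5 + 2.72 + 0.5.
print(mean_rt)
```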
RT = B + max(S_1, S_2, ..., S_n) + M (1)

where B is the average time for partitioning, S_n is the time spent at the n-th server on the n-th task, and M is the average time for reassembly after the completion of all tasks. Five datasets are simulated on n parallel servers. Each dataset is initially divided into n tasks according to the targeted number of parallel processors. The residual time calculation takes splitting time and reassembly time into account in both simulation and prediction; splitting is computed based on the GSplitter developed by [3]. For reference, the residual time is given by Eq. (1). The comparison between simulation and prediction results executed on one up to twenty processing units (PUs) is shown in Table 1, which lists the residual time (RT) from simulation against the RT results from the prediction method. The prediction results are close to the simulation results; prediction can therefore approximate the immediate results of simulation while consuming less energy, time, and budget than the former method. Speedup efficiency is analyzed using the metrics from simulation; the speedup improvement is achieved by the configuration set for the experiment. Simulations are performed on a parallel system with n processors. Let RT_n represent the residual time that elapses from the beginning of partitioning to the completion of reassembly, where n is the total number of processing devices, and let RT_1 denote the run time of the calculation on a single processor; the intercommunication overhead from synchronization is included in RT_n. Speedup is the speed gain achieved by parallel processing compared with a single processor: Speedup_n = RT_1 / RT_n. Plotted against the degree of parallelism n, the speedup metric follows a regression line, and the simulation results track the prediction results head to head. The speedup efficiency demonstrates the benefits of parallel processing.
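The speedup definition above can be computed directly from residual times. The residual-time table below is a hypothetical placeholder for illustration only, not the values of Table 1.

```python
def speedup(rt_table, n):
    """Speedup_n = RT_1 / RT_n: gain of n processors over one."""
    return rt_table[1] / rt_table[n]

# Hypothetical residual times (seconds) indexed by processor count;
# illustrative values, not the paper's measurements.
rt = {1: 100.0, 2: 52.0, 4: 27.0, 8: 15.0, 16: 9.0, 20: 7.8}

for n in sorted(rt):
    print(f"n={n:2d}  speedup={speedup(rt, n):.2f}")
```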
The cost-effectiveness figure is the next step of the analysis: a price-performance metric relating the cost of the processing units to the benefit achieved. Comparison results between simulation and prediction are summarized in Table 2; both sets of results are close.
Speedup metrics have been visualized for better comprehension in Fig. 4. Speedup increases linearly with the number of parallel processing units in the MOA simulation. Further optimization analysis can build on these collected data; an optimization of image processing using a microwave technique, for instance, was presented in [9]. The cost-effectiveness experiment is an economic analysis that relates the cost of the investment to the performance obtained from different experimental configurations; it is used in engineering to decide which configuration is most worth the investment. It is normally expressed as the ratio of the performance achieved by parallel processing to the number of processing units: Cost-effectiveness_n = performance of n processors / n. Cost-effectiveness analysis is practical for the development and strategic planning of various organizations and is applied in several settings. When purchasing computer networks, for instance, the network architecture is evaluated not only on equipment cost but also on factors such as speedup, transaction processing rate, and bandwidth. If one supplier's network performance is inferior to a competitor's but considerably cheaper and friendlier to use, management may choose the inferior one on cost-effectiveness grounds; conversely, if the price difference is negligible but one supplier offers higher bandwidth, more processing power, and better speedup, management may select that supplier by the same cost-effectiveness reasoning. The cost-effectiveness results are visualized in Fig. 5. The results show that a higher number of processors is generally more cost-effective; however, with both four and eight processing units, cost-effectiveness drops to approximately 80%. This decreasing tendency appears in both the simulation and the prediction results.
The drop may be caused by characteristics of the particular datasets that do not favor partitioning into exactly four or eight tasks.
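Following the ratio above, cost-effectiveness (parallel efficiency) can be derived directly from the speedup values. The speedups below are hypothetical, chosen merely to reproduce the kind of dip at four and eight processors described in the text.

```python
def cost_effectiveness(speedup_n, n):
    """Cost-effectiveness_n = performance of n processors / n
    (1.0 would mean perfectly linear scaling)."""
    return speedup_n / n

# Hypothetical speedups illustrating a dip at n = 4 and n = 8,
# similar in shape to the behaviour described (not measured values).
speedups = {1: 1.0, 2: 1.9, 4: 3.2, 8: 6.4, 16: 14.4, 20: 18.0}

for n in sorted(speedups):
    ce = cost_effectiveness(speedups[n], n)
    print(f"n={n:2d}  cost-effectiveness={ce:.2f}")
```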

V. CONCLUSIONS

This research focuses on conquering the computing load of big data curation. For curating a big file, which contains mostly statistical data, smaller tasks matched to the available parallel processing power are recommended. Thus the big data is first partitioned into independent tasks, with the division time taken into account. The experiments scale from 1 up to 20 processing units with large data volumes. Next, parallel processing is applied before the parallel results are reassembled, and the merging time is likewise considered. To obtain the performance of the parallel processing system, simulation is used on five different datasets. In addition, to avoid the computing burden of simulation, a prediction method is proposed. The comparison shows that the prediction method matches the simulation closely, and the prediction and simulation results are presented head to head. Future work includes optimization based on cost-effectiveness metrics and simulation with different configuration settings for machine learning as well as deep learning.