Solar Radiation Prediction for Dimensioning Photovoltaic Systems Using Artificial Neural Networks

This paper presents a solar radiation prediction model for dimensioning photovoltaic generation systems on the Atlantic Coast of Colombia, using artificial neural networks. The municipality of "El Carmen de Bolivar", located in this region, is presented as a case study. To obtain the model, the daily average data of temperature, relative humidity and solar radiation from the last ten years, reported by weather stations in this city, were used. Six neural networks were designed with six variants of input variables (temperature, humidity and month) and the output variable (solar radiation). The best result was obtained using all input variables. In the training process, the correlation index (R) between the solar radiation estimated by the model and the recorded data was 0.8. In validation, the correlation index was 0.77.
Keywords: modelling, prediction, solar radiation, artificial neural networks, photovoltaic generation systems

daily maximum and minimum air temperature and saturation vapor pressures at the maximum and minimum temperature, providing good results.
In [2], a study carried out in Barranquilla, Colombia is reported. Several statistical models are analyzed to determine the meteorological variables that are correlated with solar radiation. The analyzed variables are temperature, relative humidity and sunshine hours. The best correlation was obtained with relative humidity, while the rest of the variables behave almost constantly throughout the year.
Solar radiation can also be estimated by using artificial neural networks (ANN). This artificial intelligence technique has been used in many engineering applications [12], [13]. Its application to the prediction of solar radiation is based on the collection of historical data and on network training processes, which allow predictions for the design of photovoltaic generation systems. The scientific literature reports the development of several ANN models, and in most cases this technique is shown to be superior to statistical methods.
In [14], several ANN-based solar radiation estimation models reported in the literature were applied. The results show that the accuracy of the outcomes depends on the combination of input parameters, training algorithms and architecture configuration. In [15], a prediction model of the global solar radiation distribution on horizontal surfaces is presented. The designed model processes the correlation between weather conditions, the duration of daylight and the peak value of radiation. As in [7], it is shown that the effectiveness of the results is highly dependent on the location of the weather station.
In [16], it is shown that good results in the prediction of global solar radiation (GSR) can be obtained using temperature and humidity data as input. In [17], the development of an ANN model is described for estimating daily monthly diffuse solar radiation from the data of 9 stations with different climatic conditions, distributed in different geographic locations in China. The results of that study demonstrated the ability of the model to generalize to the rest of China. In [18], a neural network model is presented that uses the interrelation of direct, diffuse and global solar radiation in India, developing an algorithm that estimates these radiations from clear sky conditions and from deviations due to random weather events.
In [19] six ANN models were developed for the estimation and modeling of daily global solar radiation. The input data were global radiation, diffuse radiation, air temperature and relative humidity. It was shown that results can be obtained with good accuracy using the sunshine duration and air temperature. In [20] an ANN is developed to estimate the hourly solar radiation in the province of Cordoba, Argentina. This model uses as input data the temperature, relative humidity, wind speed and rain.
There are several works in Colombia that use ANN for prediction of meteorological variables. However, in [21], [22] solar radiation is not studied, while in [23] a model for predicting solar radiation is proposed, but applied in a location outside Colombia.
This work proposes an ANN model for daily solar radiation prediction in El Carmen de Bolivar, on the Colombian Atlantic coast. The proposed model uses the daily average data of temperature, relative humidity and solar radiation from the last ten years, reported by weather stations in the city. The month of the year is also used as an input variable. Before the design of the neural networks, an analysis of the variability of the studied parameters and of the correlation between them is presented.
The contribution of this paper is a forecast analysis in a region where such a study has never been done, which constitutes a starting point for the projection of photovoltaic power plants in Colombia. The remainder of the paper is organized as follows. Materials and methods are specified in Section II. Section III shows the results obtained from the ANN process and their analysis. Finally, some conclusions are presented.

II. MATERIALS AND METHODS
The study is divided into two main stages. In the first stage, a statistical analysis of the behavior of the variables is carried out and the correlation between them is determined. The second stage is the development of six neural networks, one for each of six combinations of the input variables (temperature, humidity and month), with solar radiation as the output variable. To select the most appropriate network, the behavior of each network is analyzed during the training and validation phases.

A. Stage 1
In stage 1, the following steps are executed:
• Behavior analysis of the average daily temperature, humidity and solar radiation per month.
• Determination of the statistical parameters of the average daily temperature, humidity and radiation per month.
• Behavior analysis of the average daily temperature, humidity and solar radiation.
• Determination of the statistical parameters of the average daily temperature, humidity and radiation.
• Correlation index (R) analysis of radiation vs temperature, radiation vs humidity, and humidity vs temperature.

B. Stage 2
Six neural networks were developed in this stage, with the input and output combinations shown in Table 1. The networks were developed with the "nntool" graphical interface of Matlab. The same properties were used for all networks, in order to have a common reference for comparing their performance and selecting the most appropriate one. The applied properties were obtained from a sensitivity analysis and follow criteria reported in the specialized literature [15], [18], [24]. The properties of the networks are presented in Table 2. The architecture of the developed neural networks is shown in Fig. (1); the networks differ only in the number of input variables. Networks 1 and 3 have one input variable. Networks 2, 4 and 5 have two input variables, while network 6 has three input variables. The design of the neural networks was conducted in two phases: training and validation. In the training phase, 70% of the data were used, while the validation phase was carried out with the remaining 30%. This data selection method was applied in [24].
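The two-phase design described above can be illustrated with a minimal Python sketch. It is not the "nntool" configuration used in this work: the synthetic data, the min-max scaling, the random shuffling before the 70/30 split, the single hidden layer of 8 tanh units, the learning rate and the plain gradient-descent training are all assumptions made only for illustration, and the properties actually applied are those of Table 2.

```python
import numpy as np


def minmax(a):
    """Scale each column of a 2-D array to the range [0, 1]."""
    lo, hi = a.min(axis=0), a.max(axis=0)
    return (a - lo) / (hi - lo)


def split_70_30(n_samples, rng):
    """Shuffled index arrays: 70 % for training, 30 % for validation."""
    order = rng.permutation(n_samples)
    n_train = int(round(0.7 * n_samples))
    return order[:n_train], order[n_train:]


def init_params(n_inputs, n_hidden, rng):
    """Small random weights: one tanh hidden layer, one linear output."""
    return {
        "W1": rng.normal(0.0, 0.5, (n_inputs, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, 1)),
        "b2": np.zeros(1),
    }


def forward(params, x):
    """Return hidden activations and network output."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    return h, h @ params["W2"] + params["b2"]


def mse(params, x, t):
    """Mean squared error between network output and targets."""
    return float(np.mean((forward(params, x)[1] - t) ** 2))


def train(params, x, t, lr=0.05, epochs=3000):
    """Full-batch gradient descent on the mean squared error."""
    for _ in range(epochs):
        h, out = forward(params, x)
        g_out = 2.0 * (out - t) / len(x)            # dLoss/dOutput
        g_h = (g_out @ params["W2"].T) * (1.0 - h ** 2)  # back through tanh
        params["W2"] -= lr * (h.T @ g_out)
        params["b2"] -= lr * g_out.sum(axis=0)
        params["W1"] -= lr * (x.T @ g_h)
        params["b1"] -= lr * g_h.sum(axis=0)
    return params


# Synthetic stand-in for the daily records (NOT the weather station data).
rng = np.random.default_rng(0)
n = 400
temperature = rng.uniform(24.0, 32.0, n)      # hypothetical range
humidity = rng.uniform(60.0, 95.0, n)         # hypothetical range
month = rng.integers(1, 13, n).astype(float)  # 1..12
radiation = (0.1 * temperature - 0.05 * humidity
             + np.sin(np.pi * month / 12.0) + rng.normal(0.0, 0.2, n))

# Three inputs, as in network 6; one output.
X = minmax(np.column_stack([temperature, humidity, month]))
y = minmax(radiation.reshape(-1, 1))

train_idx, valid_idx = split_70_30(n, rng)
params = init_params(X.shape[1], 8, rng)
initial_train_mse = mse(params, X[train_idx], y[train_idx])
params = train(params, X[train_idx], y[train_idx])
final_train_mse = mse(params, X[train_idx], y[train_idx])
valid_mse = mse(params, X[valid_idx], y[valid_idx])
print(f"train mse: {final_train_mse:.4f}  validation mse: {valid_mse:.4f}")
```

Dropping columns from `X` gives the one-input and two-input variants, which is how the six networks differ from each other.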
The most appropriate network was selected using two criteria: the mean squared error (mse) determined during the training process of each network, and the correlation index (R) between the radiation values obtained by the network model and the recorded values. These criteria were applied in [16] and [19].
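The two selection criteria can be computed as in the following sketch. It assumes that mse denotes the mean squared error and that R is the Pearson correlation coefficient; the function names, the tie-breaking rule and the sample values are hypothetical and are not taken from the recorded data.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between recorded and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def correlation_index(y_true, y_pred):
    """Pearson correlation coefficient R between recorded and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def select_best(y_true, predictions):
    """Pick the candidate with the highest R; ties go to the lowest mse.

    `predictions` maps a network name to its predicted values.
    """
    scores = {
        name: (correlation_index(y_true, p), mean_squared_error(y_true, p))
        for name, p in predictions.items()
    }
    best = max(scores, key=lambda k: (scores[k][0], -scores[k][1]))
    return best, scores


# Hypothetical recorded radiation and two hypothetical candidate networks.
recorded = [4.0, 5.0, 6.0, 5.5, 4.5]
candidates = {
    "network_a": [4.2, 4.9, 5.8, 5.6, 4.4],
    "network_b": [5.0, 4.5, 5.2, 4.8, 5.1],
}
best, scores = select_best(recorded, candidates)
print(best, scores[best])
```

Ranking first by R and then by mse is only one possible rule; the two criteria can also be inspected side by side.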
III. RESULTS AND DISCUSSION
"El Carmen de Bolivar" is a city located in the Department of Bolivar, on the Colombian Atlantic Coast, 114 km southeast of Cartagena de Indias. In this study, the daily average data of temperature, relative humidity and solar radiation from the last ten years were used. These data were collected from a weather station close to the site where the installation of a photovoltaic power plant is projected. The weather station equipment is Colombia (IDEAM). Thus it meets the conditions suggested in [7] and [14], which recommend carrying out prediction studies near the measuring points of the meteorological variables. The results of the two stages described above are presented below.

A. Stage 1
Fig. (2) shows the monthly variation of the average daily temperature, humidity and solar radiation. For each month, this average was determined as the sum of the daily values of the parameter divided by the number of days in the month. Table 3 shows the minimum, maximum, average and standard deviation corresponding to these graphs. The standard deviation values in Table 3 indicate that the parameters vary little, with humidity showing the greatest variation. This is also evident in the study reported in [2] for the city of Barranquilla, which belongs to the same region. Although temperatures are generally high, Fig. (2-a) shows that the highest values are recorded from June to August, while temperatures are lower from December to February. Regarding humidity, Fig. (2-b) shows that the lowest values are recorded from February to May, while in the remaining months the values remain almost constant. According to Fig. (2-c), solar radiation is highest in June and July and lowest in November.
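The monthly averaging described above (sum of the daily values divided by the number of days) can be sketched as follows. The sketch assumes that records from different years are pooled by calendar month, which the text does not state explicitly. The function name and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def monthly_means(daily_records):
    """Average daily values per calendar month.

    `daily_records` is an iterable of (date, value) pairs. Each monthly
    mean is the sum of the daily values divided by the number of days
    that contributed to that month.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for day, value in daily_records:
        totals[day.month] += value
        counts[day.month] += 1
    return {m: totals[m] / counts[m] for m in sorted(totals)}


# Hypothetical records: 60 consecutive days starting on 1 January 2020,
# using the day of the month as a stand-in for the measured value.
start = date(2020, 1, 1)
records = [(start + timedelta(days=i), float((start + timedelta(days=i)).day))
           for i in range(60)]
means = monthly_means(records)
print(means)
```

The same function can be applied separately to the temperature, humidity and radiation series.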
Due to the observed influence of month in the behavior of meteorological variables, it is decided to incorporate it as one of the input variables in the development of neural networks.
Fig. (3) shows the daily average records of the variables under study from the last ten years, and Table 4 shows the corresponding statistical parameters. The annual average solar radiation is 1371.5 kWh/m2, which represents a high solar energy potential [11]. As in the previous per-month analysis, Table 4 shows that humidity is the most variable parameter. Fig. (3) shows instability in the values of all three variables. Temperature follows a defined pattern, set by the months of higher and lower temperature indicated above. For humidity and radiation, no clear pattern of variation can be identified. To identify whether the meteorological variables under study are related, a correlation analysis between them was performed. Fig. (4) shows the correlation of radiation vs temperature, radiation vs humidity, and humidity vs temperature.
In Fig. (4), the low values of the correlation coefficient (R) for radiation vs temperature (0.004), radiation vs humidity (-0.4), and humidity vs temperature (0.1) show that there is no simple relationship between these variables. This behavior is also evident in [19]. For this reason, artificial neural networks are a useful tool for obtaining a model that estimates solar radiation from other meteorological variables, such as humidity and temperature, together with the month of the year.

B. Stage 2
Table 5 shows, for each network, the mean squared error (mse) obtained during the training stage and the correlation index (R) between the radiation obtained from the neural network model and the recorded radiation sample. From Fig. (5), Fig. (6) and Table 5, it can be concluded that the best trained network, and therefore the one that provides the best results, is network 6, with a correlation index of 0.8 and a mean squared error of 0.000203. These results are considered good, given that no data were filtered. Comparing network 5 with network 6 shows that including the month in the model gave better results. As shown in Fig. (7), network 6 also provides good results for the data that were not used in the training process, which validates its applicability. The model can therefore be used for dimensioning photovoltaic systems and for analyzing possible generation scenarios.

IV. CONCLUSIONS
Photovoltaic power generation is an alternative with great potential in Colombia as a means of covering energy deficits caused by natural phenomena such as "El Nino", which significantly affects systems that depend heavily on hydropower.
This work shows the great potential offered by the Atlantic coast, where one of its areas, the city of "El Carmen de Bolivar", has a capacity of 1371.5 kWh/m2 per year and a daily average of 4.81 kWh/m2.
The evaluated models showed the importance of considering the month as an element that complements the meteorological variables and thus improves the quality of the obtained model.
The obtained model, with a correlation index of 0.8 during training and 0.77 in validation, allows photovoltaic systems to be dimensioned and their daily, monthly and annual generation capacity to be predicted with good accuracy for different scenarios of temperature and humidity.
The results show that the tool used in this work can be generalized to areas with a tropical climate, such as the Colombian Atlantic coast, in which the installation of photovoltaic systems is projected. This is further facilitated by the small number of weather variables required and by the similarity in climate characteristics.