An Intelligent Model for Internet Advertising Selection Based on User-Profile

With the growing use of internet advertising, the display error rate has increased as well. An example of a display error is an advertisement that does not match the user's demand or is displayed wrongly. The most important problem in marketing and advertising is deciding reliably whether a displayed advertisement is true (valid) or false. To cope with this problem, advertising is personalized according to users' profiles and behavior so that the appropriate internet advertising is selected and each user receives her/his preferred advertising. In this study, we presented a new profile together with internet advertising from an online bookstore to students and gathered their responses. We then applied a decision tree, a data mining technique, and modeled two separate datasets in two states: with a profile and without a profile. The results obtained for both datasets revealed that the user profile strongly influences the correct classification of internet advertising.

be targeted based on the interests of visitors. This is possible because users of a virtual community are identified by their login and provide personal information (age, gender, education, etc.) as well as behavioral information (sending or receiving invitations, comments, and times of use). Access to behavioral information gives online social networks a certain competitive advantage over other web portals. In the present research, we first investigate the potential advantage of behavioral data mining for managing web banner marketing campaigns. Then, we select the most appropriate data mining techniques for this particular issue.
The main problem is the optimization of banner advertising campaigns in marketing by targeting the proper users and maximizing the response rate, measured by the number of clicks. Response analysis and marketing campaign optimization have been widely discussed in data mining textbooks [12,13] and, more recently, in the context of online social networks [14].

Class Imbalance Problem
The class imbalance problem refers to a situation in which the number of objects of one class (a category of the dependent variable) is markedly smaller than the number of objects of the other classes. This problem is particularly important in response analysis, in which the number of customer reactions (in this case, clicks on a banner) is significantly smaller than the number of messages (displays). In marketing, the problem arises in churn models and customer acquisition; in other fields, it arises in fraud detection, medical diagnosis and so forth. There are two main approaches to coping with this problem [15]: changing the structure of the learning sample (sampling techniques) and cost-sensitive algorithms. For strongly imbalanced classes, researchers propose one-class learning [16]. The imbalance arises because gathering information about the other class is sometimes difficult and because some domains are imbalanced by nature. In such cases, building a classifier using only the items that belong to one class is sometimes successful. Some authors [17] distinguish cost-sensitive learning from ensemble classifiers, i.e. those based on the bootstrap procedure (bagging and random forests). Although this approach can include cost-sensitive learning algorithms, they are based on the CART algorithm [18] (classification and regression trees) and employ misclassification costs and prior probabilities.

Sampling Techniques for Imbalanced Datasets
Up-sampling (or over-sampling) replicates items that belong to the minority class. The replication can be random, direct, or performed by generating synthetic cases, e.g. with the SMOTE algorithm [19]. Down-sampling (under-sampling or down-sizing) decreases the number of cases that belong to the majority class. Sometimes, over-represented cases regarded as redundant samples [20] are omitted based on Tomek links [21].
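
The two random sampling strategies described above can be illustrated with a minimal sketch. It assumes a two-class label list and covers only random replication and random removal; SMOTE and Tomek links are not implemented here. The function names and the toy labels (8 displays without a click, 2 clicks) are hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def split_indices(labels, minority):
    """Return (minority_idx, majority_idx) for a two-class label list."""
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    return minority_idx, majority_idx


def random_oversample(labels, minority, seed=0):
    """Up-sampling: replicate random minority items until the classes are balanced."""
    rng = random.Random(seed)
    minority_idx, majority_idx = split_indices(labels, minority)
    n_extra = max(0, len(majority_idx) - len(minority_idx))
    extra = [rng.choice(minority_idx) for _ in range(n_extra)]
    return majority_idx + minority_idx + extra


def random_undersample(labels, minority, seed=0):
    """Down-sampling: keep a random subset of the majority class, without replacement."""
    rng = random.Random(seed)
    minority_idx, majority_idx = split_indices(labels, minority)
    n_keep = min(len(minority_idx), len(majority_idx))
    return rng.sample(majority_idx, n_keep) + minority_idx


# Hypothetical toy data: 0 = display without click, 1 = click.
labels = [0] * 8 + [1] * 2
over = random_oversample(labels, minority=1)
under = random_undersample(labels, minority=1)
print(Counter(labels[i] for i in over))   # class counts after up-sampling
print(Counter(labels[i] for i in under))  # class counts after down-sampling
```

Both functions return row indices rather than copies of the data, so the same selection can be applied to the feature rows and to the labels.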

Cost-sensitive Learning
Cost-sensitive learning is another approach that helps overcome the class imbalance problem. The purpose of building such classifiers is to increase the accuracy of predicting cases that belong to the given class. To do so, researchers allocate different costs to different kinds of misclassification. The authors of [22] identified two categories of cost-sensitive learning. One is a set of direct algorithms, such as cost-sensitive decision trees; the other is cost-sensitive meta-learning methods, including CSC (cost-sensitive classification), ET (empirical threshold) and cost-sensitive naive Bayes. The two categories differ in how they handle biased data when defining misclassification costs.
For example, TN stands for true negative; that is, an object that belongs to the negative class has been classified as negative. Likewise, TP (true positive) is a positive object classified as positive, FN (false negative) is a positive object classified as negative, and FP (false positive) is a negative object classified as positive. Since TN and TP are correct classifications, costs are allocated only to FN and FP. When building a classifier for a dichotomous dependent variable, researchers usually focus on the positive class; therefore, the cost of FN should be higher than that of FP.
In other words, it is very important to decrease the misclassification error for the positive class. If a higher cost is allocated to FN, the classifier is discouraged from classifying a positive object as negative. [23] emphasizes that costs need not be considered only in monetary terms.
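
One common way to see the effect of unequal FN and FP costs is the minimum-expected-cost decision rule, sketched below. This is an illustration under stated assumptions, not the procedure used in the study: TP and TN are assumed to cost nothing, the classifier is assumed to output a probability for the positive class, and the cost values (FP = 1, FN = 4) are hypothetical.

```python
def cost_threshold(cost_fp, cost_fn):
    """Probability above which predicting 'positive' has the lower expected cost.

    Assumes zero cost for correct classifications (TP and TN).
    """
    return cost_fp / (cost_fp + cost_fn)


def predict_positive(p_positive, cost_fp, cost_fn):
    """Return True when labelling the object positive is the cheaper decision."""
    expected_cost_if_negative = p_positive * cost_fn        # risk of an FN
    expected_cost_if_positive = (1 - p_positive) * cost_fp  # risk of an FP
    return expected_cost_if_positive < expected_cost_if_negative


# Hypothetical costs: a missed positive (FN) is four times as costly as an FP.
print(cost_threshold(cost_fp=1, cost_fn=4))         # threshold drops below 0.5
print(predict_positive(0.3, cost_fp=1, cost_fn=4))  # unequal costs
print(predict_positive(0.3, cost_fp=1, cost_fn=1))  # equal costs
```

With equal costs the rule reduces to the usual 0.5 threshold; raising the FN cost lowers the threshold, so more objects are assigned to the positive class.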

Classification Methods
Data mining models such as a single classification tree (CART algorithm), RF (random forest) and gradient tree boosting are widely used in marketing to evaluate selection. All these methods can employ misclassification costs and prior probabilities. [24] proposed CART, a recursive partitioning algorithm. This algorithm builds a classification tree when the dependent variable is nominal and a regression tree when the dependent variable is continuous. The purpose here is to predict customers' responses, which means developing a classification model. A tree model can be presented graphically or as a set of if-then rules.
The ability to visualize the model is a very important advantage of this approach in marketing. Prediction is an important task for marketing managers, but knowledge is vital in this area. Although CART was proposed about 30 years ago, features such as prior probabilities and misclassification costs make it useful in cost-sensitive learning.

Advertising design
We select a number of online bookstore advertisements. Using experts' opinions, we determine marketing and advertising principles based on references [1] and [10] to provide the content for each of the mentioned features. Table 1 shows the extracted advertising features together with the selected content.

Profile design
The next and most important stage of feature extraction is the users' profile. Most studies have used the standard profile items available on sites, such as job, gender, age, education, field of study, and so forth [7]. Since personalization is performed based on the users' profile, it is necessary to consider additional items to increase accuracy. To this end, we gather two items from various articles and add them to our profile [4], [7]. The first item is the number of times a person announces an advertisement false after receiving it. This item differs between individuals: one person may announce an internet advertisement false at the first display, whereas another may announce the same advertisement false only after many displays. The main reason for placing this item in the users' profile is gray internet advertising. The second item, which is asked in the users' profile, is the ratio of selection errors that the person finds tolerable.
In fact, different persons have different behavioral features. Some state that none of their valid internet advertising should be wrongly classified as false and, in return, accept receiving some false displays each day. In contrast, others are not willing to receive any false display, even if some of their valid internet advertising is wrongly classified as false. Individuals can therefore be distinguished behaviorally using these two items. This part corresponds to the first step, i.e. extracting features from the internet advertising. Table 2 shows the predictive accuracy coefficient of the model, based on the predictor features in the target variable row and the evaluative confusion matrix.

Statistical Population
Since our purpose is to refine and classify internet advertising, the content of the advertising must suit the statistical population. To conduct the study carefully, we first define our statistical population and then provide the content of the internet advertising. The statistical population consists of the academic community and students. Therefore, it is necessary to select an area about which the population is informed and in which it is interested. Accordingly, we select online bookstore advertising as the statistical sample. An internet advertising sample is obtained from the Cartesian product of the values presented in Table 1, such that 150 = 5 * 5 * 3 * ... internet advertising frames are obtained. The profile has 10 items, and each item can take a different value. Each record of the simulated advertising includes the users' profile, the 150 designed internet advertisements and a response label. After being designed on the web, the simulated advertising items were answered by 150 students. After cleaning, the responses of 98 people were used (60 females and 38 males). We then randomly split the gathered data and, to evaluate the framework, use the following two datasets:
The first type dataset: 2500 internet advertisements, of which 1800 are false displays.
The second type dataset: 8000 internet advertisements, of which 1700 are false displays.
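
The class balance of the two datasets follows directly from the counts above and can be checked with a few lines of arithmetic. The sketch uses only the reported totals; the helper name false_display_share is hypothetical.

```python
def false_display_share(total, false_displays):
    """Fraction of a dataset's advertisements that are false displays."""
    return false_displays / total


# Counts as reported for the two datasets.
datasets = {"first": (2500, 1800), "second": (8000, 1700)}

for name, (total, false_displays) in datasets.items():
    share = false_display_share(total, false_displays)
    print(f"{name} dataset: {share:.2%} false, {1 - share:.2%} valid")
```

False displays are the majority class in the first dataset and the minority class in the second, so the two datasets are imbalanced in opposite directions.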

User Behavior Classification
The final objective of the present research is to map customers into two groups: avoidant (a user who leaves the page) and non-avoidant (a user who stays on the page). In this study, "object" refers to the customer and "class" refers to the customer's persistence or avoidance.

Constructing Decision Tree
The patterns extracted from a decision tree model take the form of sequences of if-then rules, which provides an efficient basis for formulating marketing strategies for each customer class according to its demographic and behavioral features. Therefore, the decision tree has been selected as the most suitable alternative for the purpose of the study.
A decision tree can be constructed using various algorithms. We have used the CHAID algorithm to construct the decision tree model. This algorithm organizes the internal nodes of the tree according to the strength of association between each feature and the target variable. To create leaf nodes, given that the target variable of the research model is discrete, qualitative and binary, we have used the test of independence between the target variable and each observed feature to assign each observation to one of the two classes of the target variable (avoidant and persistent) based on its features. To perform this test, we have formed a contingency table for each feature. In these tables, the rows correspond to the categories of the feature and the two columns correspond to the categories of the target variable. We have computed the test statistic using the following formula:
$$\chi^2 = \sum_{i=1}^{R} \sum_{j=1}^{C} \frac{(O_{ij} - e_{ij})^2}{e_{ij}}$$
where O_{ij} indicates the observed frequency of the cell located in row i and column j, e_{ij} indicates the expected frequency of the cell located in row i and column j, R indicates the number of table rows and C indicates the number of table columns.
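
As an illustration of the independence test on a contingency table, the sketch below computes the chi-square statistic for one feature. It assumes the standard expected frequency, e_ij = (row total * column total) / N, which the text does not state explicitly, and it assumes every expected frequency is positive. The function name and the example counts are hypothetical and are not data from the study.

```python
def chi_square_statistic(observed):
    """Chi-square statistic of independence for an R x C table of observed counts.

    Expected frequencies use the standard formula
    e_ij = row_total_i * column_total_j / N (an assumption of this sketch).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n
            statistic += (o_ij - e_ij) ** 2 / e_ij
    return statistic


# Hypothetical table: rows are two categories of a profile feature,
# columns are the target classes (avoidant, persistent).
table = [[30, 10],
         [20, 40]]
stat = chi_square_statistic(table)
dof = (len(table) - 1) * (len(table[0]) - 1)  # (R - 1) * (C - 1)
print(round(stat, 4), dof)
```

A larger statistic for a given number of degrees of freedom indicates a stronger association between the feature and the target variable, which is the kind of ordering CHAID relies on when choosing splits.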

Results
After implementing the constructed model, we proceed to the fourth step, in which we compare and evaluate the results obtained from the implementation. Figures 2 and 3 show the results obtained from the model for both datasets.
In this table, we use common data mining criteria for evaluation and comparison. To avoid limiting the comparison and evaluation to accuracy alone, and to take the two types of error into account, we also employ criteria such as FP Rate, Spam Recall and Spam Precision. The counts are defined as follows:
a: a false display which has been predicted as a false display
d: a valid advertisement which has been predicted as a valid advertisement
b: a false display which has been predicted as a valid advertisement (FN)
c: a valid advertisement which has been predicted as a false display (FP)
Accuracy = (a + d) / (a + d + b + c) (1)
Accuracy = 1 - Error Rate (2)
FP Rate = c / (d + c) (3)
Spam Recall = a / (b + a) (4)
Table 3 to Table 11 measure the model prediction accuracy for each of the two states of the target variable in the confusion matrix and evaluate the model predictions overall.
Group one includes free users, 90% of whom leave the page after seeing advertising. Group two includes free students, whose exit rate is half that of the previous group.
Comparing Tables 8 and 9, we observe that academic users show much less avoidance. Comparing Tables 10 and 11, we observe that users with higher educational degrees show much less avoidance.
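
The criteria above can be computed directly from the four counts a, b, c and d, as in the following sketch. Accuracy, Error Rate, FP Rate and Spam Recall follow the definitions given in the text. The text names Spam Precision without giving its formula, so the standard definition a / (a + c) is assumed here. The function name and the example counts are hypothetical and are not results from the study.

```python
def evaluate(a, b, c, d):
    """Evaluation criteria from confusion-matrix counts.

    a: false display predicted as false display
    b: false display predicted as valid advertisement (FN)
    c: valid advertisement predicted as false display (FP)
    d: valid advertisement predicted as valid advertisement
    """
    accuracy = (a + d) / (a + d + b + c)
    return {
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        "fp_rate": c / (d + c),
        "spam_recall": a / (b + a),
        # Assumed standard definition; the formula is not given in the text.
        "spam_precision": a / (a + c),
    }


# Hypothetical counts, for illustration only.
for name, value in evaluate(a=80, b=20, c=10, d=90).items():
    print(f"{name}: {value:.3f}")
```

Reporting FP Rate and Spam Recall next to accuracy makes both error types visible, which accuracy alone can hide when the classes are imbalanced.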

Conclusion
Generally, few studies have been conducted on refining and classifying internet advertising in marketing and advertising. Therefore, the present research attempted to create a personalized advertising selector that estimates the importance of internet advertising and classifies users with respect to their behavior and profile. Classifying and refining through the users' profile not only increases accuracy but also decreases FP and FN errors. To implement the model, we used two separate datasets. In the feature selection step, our two proposed items ranked second and fourth. Each dataset was run with a profile, with an incomplete profile and without a profile. Then, comparing the evaluation criteria and the results obtained from the two separate datasets, we found that classifying internet advertising with a profile yields the highest accuracy. The other criteria in this comparison showed that the increase in accuracy is accompanied by a decrease in FP and FN errors. Decreasing these errors does not incur extra costs for advertising companies, and users also receive their preferred internet advertising. In other words, a degree of compatibility is created among the advertising selector, advertising companies and users' interests.