A STUDY ON SNA: TEXT MINING USING ACADEMIC SOCIAL NETWORKS

Social networks allow knowledge to be shared from one person to another, and they now support communication among vast numbers of people. This paper focuses on knowledge sharing in the research community. We use a dataset of 2,092,356 research articles and 80,242,869 citations among researchers from various domains. We study knowledge diffusion in the research community, in heterogeneous as well as homogeneous systems, and measure the strength of the research spectrum and of authors' contributions.

[Figure: Layout of the text mining process]

II. LITERATURE SURVEY
For academic search, several research issues have been intensively investigated, for example expert finding and association search. Expert finding is one of the most important issues in mining social networks. For example, both Nie et al. and Balog et al. [4] propose extended language models to address the expert finding problem. Since 2005, the Text REtrieval Conference (TREC) has provided a platform, the Enterprise Search Track, for researchers to empirically assess their methods for expert finding. Association search aims at finding connections between people. For example, the ReferralWeb system helps people search and explore social networks on the Web. Adamic and Adar have investigated the problem of association search in email networks. However, existing work mainly focuses on how to find connections between people and ignores how to rank the associations found. In addition, a few systems have been developed for academic search, such as scholar.google.com, libra.msra.cn, citeseer.ist.psu, and Rexa.info. Although much work has been done, to the best of our knowledge the issues we focus on in this work (i.e., profile extraction, name disambiguation, and academic network modeling) have not been sufficiently investigated. Our system addresses all of these problems holistically.

III. MATERIALS AND METHODS
Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well suited for developing new machine learning schemes, in particular for educational purposes and research. Advantages of Weka include:
• Free availability under the GNU General Public License.
• Portability, since it is fully implemented in the Java programming language and thus runs on almost any modern computing platform.
• A comprehensive collection of data preprocessing and modeling techniques.
• Ease of use due to its graphical user interfaces.
Weka supports several standard data mining tasks, more specifically data preprocessing, clustering, classification, regression, visualization, and feature selection. All of Weka's techniques are predicated on the assumption that the data is available as one flat file or relation, where each data point is described by a fixed number of attributes (normally numeric or nominal attributes, but some other attribute types are also supported). Weka provides access to SQL databases using Java Database Connectivity and can process the result returned by a database query. It is not capable of multi-relational data mining, but there is separate software for converting a collection of linked database tables into a single table that is suitable for processing using Weka. [4] Another important area that is currently not covered by the algorithms included in the Weka distribution is sequence modeling.
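To make the "one flat file or relation" assumption concrete, the following minimal sketch writes a small table in ARFF, Weka's native text format. The helper name `to_arff`, the attribute names, and the sample row are hypothetical and chosen only for illustration; the sketch also assumes that nominal values contain no spaces or commas.

```python
def to_arff(relation, attributes, rows):
    """Serialize rows as ARFF text: one relation, a fixed attribute list per row.

    attributes is a list of (name, kind) pairs, where kind is "numeric",
    "string", or a list of the allowed nominal values.
    """
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        if isinstance(kind, list):
            # Nominal attribute: enumerate the allowed values in braces.
            kind = "{" + ",".join(kind) + "}"
        lines.append(f"@attribute {name} {kind}")
    lines += ["", "@data"]
    for row in rows:
        if len(row) != len(attributes):
            raise ValueError("every row needs exactly one value per attribute")
        cells = []
        for (_, kind), value in zip(attributes, row):
            if kind == "string":
                # Quote free text and escape backslashes and single quotes.
                escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
                cells.append(f"'{escaped}'")
            else:
                cells.append(str(value))
        lines.append(",".join(cells))
    return "\n".join(lines) + "\n"


# Hypothetical attributes and data, for illustration only.
example_attributes = [
    ("title", "string"),
    ("num_refs", "numeric"),
    ("size_class", ["small", "medium", "large"]),
]
example_rows = [("A hypothetical paper title", 3, "small")]
example_arff = to_arff("papers", example_attributes, example_rows)
```

Each data row carries exactly one value per declared attribute, which is the fixed-width, single-table shape that Weka's algorithms expect.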
In this paper we experiment with several classification methods to find the one that gives the best accuracy on our dataset. We apply classifiers from the following Weka categories: bayes, meta, misc, rules, and trees.
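The comparison described above can be sketched as follows: compute the accuracy of each classifier's predictions against the true labels and keep the highest. This is only an illustration in plain Python, not the Weka evaluation itself; the function names and the toy labels are hypothetical, and the evaluation protocol (for example, train/test split or cross-validation) is not stated in this paper and is left out. ZeroR, which always predicts the majority class, is included as a baseline.

```python
from collections import Counter


def accuracy(predicted, actual):
    """Fraction of instances whose predicted label equals the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def zero_r(train_labels):
    """ZeroR baseline: always predict the most frequent training label."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return lambda instances: [majority] * len(instances)


def best_classifier(predictions_by_name, actual):
    """Return (name, accuracy) for the classifier with the highest accuracy."""
    scores = {
        name: accuracy(predicted, actual)
        for name, predicted in predictions_by_name.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy labels, invented for illustration; they are not results from the paper.
toy_actual = ["small", "small", "large", "medium"]
toy_predictions = {
    "ZeroR": zero_r(["small", "small", "large"])(toy_actual),
    "hypothetical_model": ["small", "small", "large", "large"],
}
toy_best = best_classifier(toy_predictions, toy_actual)
```

Any classifier worth recommending should beat the ZeroR baseline, which is why it is useful to keep it in the comparison.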

IV. EXPERIMENTS AND RESULTS
Here we apply the algorithms that are suitable for the dataset in order to find the best algorithm for text mining of the academic social network dataset. We organized the collected records into three folders (large, medium, and small) in two steps.
1. Using random selection, we drew 15,000 records from the full dataset of 2,092,356 records. The records were selected with the rand() function in Excel. Only then did we move on to forming the three folders.
2. We applied a text preprocessing algorithm to divide the records into three folders: large, medium, and small.
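The two steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the procedure actually used: Python's `random.sample` stands in for the Excel rand() selection, records are assumed to begin at a line starting with `#index`, and the size of a record is the number of its lines starting with `#%`, with the thresholds taken from the pseudocode for this step. Counts of exactly 0, 5, or 10 are not covered by the stated conditions; here they fall through to the large folder, which is an assumption. All function names are hypothetical, and the sketch does not enforce any per-folder quota.

```python
import random


def split_records(raw_text):
    """Split a dataset dump into records, each starting at a '#index' line."""
    records, current = [], []
    for line in raw_text.splitlines():
        if line.startswith("#index") and current:
            records.append("\n".join(current))
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current))
    return records


def count_refs(record_text):
    """Count the lines in one record that start with the '#%' marker."""
    return sum(1 for line in record_text.splitlines() if line.startswith("#%"))


def size_class(n_refs):
    """Map a '#%' count to a folder name using the pseudocode's thresholds."""
    if 0 < n_refs < 5:
        return "small"
    if 5 < n_refs < 10:
        return "medium"
    # Assumption: every other count (including 0, 5, and 10) goes to 'large'.
    return "large"


def sample_and_partition(records, k, seed=0):
    """Draw k records at random (step 1), then bin them by size (step 2)."""
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    chosen = rng.sample(records, k)
    folders = {"small": [], "medium": [], "large": []}
    for record in chosen:
        folders[size_class(count_refs(record))].append(record)
    return folders
```

For the setting described in the text, `k` would be 15000; writing each record in `folders` to its own text file is omitted here.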
After this step, each folder contains 5,000 individual text files; each text file is one record.

Pseudocode: text preprocessing
    /* Conditions for placing each record in a folder:
       #index is the starting field of each record.
       size is the number of #% fields in the record.
       0 < size < 5  : small folder
       5 < size < 10 : medium folder
       10 < size     : large folder */
    fileopen(a_i)
    size = 0
    for each #% in a_i
        size = size + 1
    if (0 < size < 5) then
        store(a_i) in small directory
    else if (5 < size < 10) then
        store(a_i) in medium directory
    else
        store(a_i) in large directory
Here, a_i denotes each record.

In this work we evaluate the following classifiers: BayesNet, NaiveBayes, AttributeSelectedClassifier, Dagging, DecisionStump, JRip, ZeroR, J48, HyperPipes, and ComplementNaiveBayes. From their results we identify the classifier with the highest accuracy, which we recommend for our further research on this dataset. After applying all of the bayes, misc, meta, rules, and trees classifiers, we select the most appropriate algorithm for the dataset. This decision is based on the graphical representation of the result tables.

V. CONCLUSION
This paper focuses on the text mining process for academic social networks. Among the classifiers compared, J48 shows the highest accuracy on the academic social network dataset. The approach can be extended to datasets from other domains, and also to other classifiers.