Authors: Okfalisa, Mustakim, Ikbal Gazalba, Nurul Gayatri Indah Reza

Published in: 2017 2nd International Conferences on Information Technology, Information Systems and Electrical Engineering (ICITISEE)

Abstract:

Data mining is the process of extracting information from a database that is not directly visible. It is predicted to become a highly revolutionary branch of science over the next decade. One data mining technique is classification, and the most popular classification technique is K-Nearest Neighbor (KNN). There is also the Modified K-Nearest Neighbor (MKNN) classification algorithm, which is derived from KNN. In this paper we compare the KNN and MKNN algorithms on the task of classifying data from the Conditional Cash Transfer Implementation Unit (Unit Pelaksana Program Keluarga Harapan), which consists of 7395 records. The comparison is based on the accuracy of both algorithms. Before classification, K-Fold Cross Validation was used to search for the optimal data model; the best model was found on cross 2, with an accuracy of 93.945%. This model supplies the training and testing data used to evaluate KNN and MKNN. Classification accuracy was computed according to the rules of the confusion matrix. The highest accuracy of KNN was 94.95%, with an average accuracy across the tests of 93.94%; the highest accuracy of MKNN was 99.51%, with an average accuracy across the tests of 99.20%. In almost all tests, from the first to the tenth, MKNN achieved a better accuracy value than KNN. It can be concluded that the MKNN algorithm classifies more accurately than the KNN algorithm, leaving aside other aspects such as computation, time efficiency, and algorithm effectiveness.
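The evaluation pipeline described in the abstract (KNN classification, K-Fold Cross Validation, and accuracy taken from a confusion matrix) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the Euclidean distance, the majority vote, the round-robin fold assignment, the default values of `k` and `n_folds`, and all function names are assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=3):
    """Plain KNN: majority vote among the k training points closest to the query."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: euclidean(train_X[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy_from_confusion(matrix):
    """Accuracy = correctly classified records (the diagonal) / all records."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def k_fold_accuracies(X, y, n_folds=10, k=3):
    """Return one accuracy per fold; record i goes to fold i % n_folds (assumed split)."""
    labels = sorted(set(y))
    accuracies = []
    for fold in range(n_folds):
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        y_true = [y[i] for i in test_idx]
        y_pred = [knn_predict(train_X, train_y, X[i], k) for i in test_idx]
        matrix = confusion_matrix(y_true, y_pred, labels)
        accuracies.append(accuracy_from_confusion(matrix))
    return accuracies
```

Calling `k_fold_accuracies(X, y)` on a list of feature vectors and a list of labels gives one accuracy per fold, and the fold with the highest value would play the role that "cross 2" plays in the paper.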

Conclusion:

The comparative analysis of MKNN and KNN was carried out to determine how accurately each algorithm classifies, and to find the optimal data pattern from k-fold cross validation for use as training and testing data. As shown in Table 3 and Figure 2, good data modeling affects accuracy. Accuracy was calculated according to the rules of the Confusion Matrix. The best data model in this paper was found on cross 2, with an accuracy of 93.945%. This model was used as the training and testing data for KNN and MKNN in order to compare their accuracy. The highest accuracy of KNN was 94.95%, with an average accuracy across the tests of 93.94%, while the highest accuracy of MKNN was 99.51%, with an average accuracy across the tests of 99.20%. It can therefore be said that the MKNN algorithm performs better in terms of accuracy, with a difference in accuracy of 5-7%.
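For readers unfamiliar with MKNN, the sketch below shows how it is commonly formulated in the literature: each training point first receives a validity score (the fraction of its own nearest neighbours that share its label), and each of the query's neighbours then votes with weight validity / (distance + alpha). This summary does not describe the exact variant the authors used, so the formulation, the value `alpha=0.5`, the neighbourhood size `h`, and all function names here are assumptions made for illustration.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_indices(points, query, count, exclude=None):
    """Indices of the `count` points closest to `query`, optionally skipping one index."""
    candidates = [i for i in range(len(points)) if i != exclude]
    return sorted(candidates, key=lambda i: euclidean(points[i], query))[:count]


def validity(train_X, train_y, h=3):
    """Validity of each training point: share of its h nearest neighbours with the same label.

    Assumes at least two training points, so every point has a neighbour.
    """
    scores = []
    for i, x in enumerate(train_X):
        neighbours = nearest_indices(train_X, x, h, exclude=i)
        same = sum(1 for j in neighbours if train_y[j] == train_y[i])
        scores.append(same / len(neighbours))
    return scores


def mknn_predict(train_X, train_y, valid, query, k=3, alpha=0.5):
    """Weighted vote over the k nearest neighbours; weight = validity / (distance + alpha)."""
    totals = defaultdict(float)
    for i in nearest_indices(train_X, query, k):
        totals[train_y[i]] += valid[i] / (euclidean(train_X[i], query) + alpha)
    return max(totals, key=totals.get)
```

The validity scores depend only on the training data, so they can be computed once with `validity(train_X, train_y)` and reused for every query. A training point surrounded by points of another class gets a low validity and contributes little to the vote, which is the usual explanation for MKNN being less sensitive to noisy records than plain KNN.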
