Author : Mustakim, Nurul Gayatri Indah Reza, Rice Novita, Oktaf Brillian Kharisma, Rian Vebrianto, Suwanto Sanjaya, Hasbullah, Tuti Andriani, Wardani Purnama Sari, Yulia Novita, Robbi Rahim

Publish : The 1st Workshop on Environmental Science, Society, and Technology. IOP Publishing. Journal of Physics: Conference Series 1363 (2019)

Abstract :


Social media platforms such as Twitter are among the most common channels of communication. Every tweet contains data, such as text, that can be collected and processed into information. Processed tweet data reveals trends that are useful in fields such as education, economics, and politics, and extracting such trends is the task of text mining. Text mining techniques are needed to find interesting patterns and trends in Twitter text on topics related to Pilkada Pekanbaru 2017 (the 2017 Pekanbaru mayoral election). This research clusters Twitter text data using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. Several experiments were run with different Eps and MinPts parameters on 2,184 text records that had passed through several stages: cleaning, duplicate removal, and pre-processing such as stemming and stopword removal. Based on the highest average Silhouette Index (SI), Eps = 0.1 and MinPts = 10, with SI = 0.413, were chosen as parameters, forming 31 clusters. By frequency of word occurrence in the clusters, the most frequent word is “kpu”, followed by “firdaus”, “kota”, “pasang”, and “ayat”. The candidate pair that appears most often in the cluster results is Firdaus-Ayat, and in the Pilkada 2017 results Firdaus-Ayat was elected Mayor and Vice Mayor of Pekanbaru.
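To make the roles of Eps and MinPts concrete, here is a minimal pure-Python sketch of DBSCAN. It is an illustration, not the authors' implementation: the cosine distance, the toy term-count vectors, and the names `dbscan`, `region_query`, and `cosine_distance` are all assumptions, since the abstract does not state the distance metric or the vectorization used. The toy run uses MinPts = 3 rather than the paper's 10 because it has only seven points.

```python
from math import sqrt

NOISE = -1  # label for points that belong to no cluster


def cosine_distance(a, b):
    """Return 1 - cosine similarity (assumed metric, not from the paper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def region_query(points, i, eps, dist):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]


def dbscan(points, eps, min_pts, dist=cosine_distance):
    """Return one cluster label per point; NOISE (-1) marks outliers."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps, dist)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster_id += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: join, do not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps, dist)
            if len(j_neighbors) >= min_pts:  # j is also a core point
                queue.extend(j_neighbors)
    return labels


# Hypothetical term-count vectors over a four-word toy vocabulary.
points = [
    [3, 1, 0, 0], [3, 2, 0, 0], [4, 1, 0, 0],  # one dense group
    [0, 0, 2, 3], [0, 0, 3, 3], [0, 0, 2, 4],  # a second dense group
    [1, 1, 1, 1],                              # isolated point
]
print(dbscan(points, eps=0.1, min_pts=3))  # [0, 0, 0, 1, 1, 1, -1]
```

A smaller Eps or a larger MinPts makes it harder for a point to qualify as a core point, so more points end up labeled as noise.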

Conclusion :

From clustering experiments using different combinations of Eps and MinPts for the DBSCAN algorithm, Eps = 0.1 and MinPts = 10 were chosen based on the average Silhouette Index (SI) validation. These parameters produce 31 clusters, with cluster -1 labeled as noise, and an SI value of 0.413. The most frequent word in the clusters is “kpu”, followed by “firdaus”, “kota”, “pasang”, and “ayat”. The candidate pair that appears most often in the cluster results is Firdaus-Ayat, and in the Pilkada 2017 results Firdaus-Ayat was elected Mayor and Vice Mayor of Pekanbaru with 33.07% of the votes. The social media trend around Pilkada Pekanbaru reflects the actual outcome of the Pekanbaru mayoral election for the 2017-2022 period, so it can be concluded that the pattern found on Twitter is similar to the pattern in Pekanbaru society. Furthermore, this research can be improved by optimizing the cleaning stage, the pre-processing stages of text mining, and the determination of the Eps and MinPts parameters in order to obtain optimal results. It could also be combined with other algorithms, such as association rules to find correlations between words in each cluster, and classification algorithms to analyze the sentiment of the words.
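The parameter choice described above rests on comparing average Silhouette Index values across candidate clusterings. The sketch below shows one way such a score could be computed in pure Python. It makes several assumptions that the source does not confirm: Euclidean distance, noise points (label -1) excluded from the average, and a score of 0 for single-member clusters. The function name `silhouette_index` and the toy data are hypothetical.

```python
from math import dist as euclidean  # Python 3.8+

NOISE = -1  # label for points that belong to no cluster


def silhouette_index(points, labels, dist=euclidean):
    """Mean silhouette over all non-noise points (assumed convention)."""
    clusters = {}
    for idx, label in enumerate(labels):
        if label != NOISE:
            clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    scores = []
    for label, members in clusters.items():
        for i in members:
            if len(members) == 1:
                scores.append(0.0)  # common convention for singletons
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(dist(points[i], points[j]) for j in members if j != i)
            a /= len(members) - 1
            # b: lowest mean distance to the members of any other cluster
            b = min(
                sum(dist(points[i], points[j]) for j in other) / len(other)
                for other_label, other in clusters.items()
                if other_label != label
            )
            denom = max(a, b)
            scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)


# Toy 2-D points; the last one is treated as noise and ignored.
points = [(0, 0), (0, 1), (5, 0), (5, 1), (9, 9)]
candidates = {
    "compact": [0, 0, 1, 1, NOISE],  # groups nearby points together
    "mixed": [0, 1, 0, 1, NOISE],    # pairs up distant points
}
best = max(candidates, key=lambda k: silhouette_index(points, candidates[k]))
print(best)  # compact
```

In a parameter sweep, each (Eps, MinPts) pair would yield one labeling, and the pair with the highest score would be kept, which mirrors the selection procedure summarized in the conclusion.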
