Adzkia Nur Nasution; Ardilla Syafitri Lubis; Keysa Shifa Adwitia Sitepu; Rezkya Nadilla Putri; Arnita Arnita

doi:10.32832/jurma.v9i1.2501

Adzkia Nur Nasution Universitas Islam Negeri Sumatera Utara
Ardilla Syafitri Lubis Universitas Negeri Medan
Keysa Shifa Adwitia Sitepu Universitas Negeri Medan
Rezkya Nadilla Putri Universitas Negeri Medan
Arnita Arnita Universitas Negeri Medan

Abstract

This study aims to identify iris flower species and detect outliers using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) clustering method on the Google Colab platform. The data were taken from the Iris dataset in the UCI Machine Learning Repository, which covers three species (Setosa, Versicolor, and Virginica) with attributes such as sepal and petal length and width. The DBSCAN workflow in this study comprises data preprocessing, parameter determination, model building, and visualization of the clustering results. DBSCAN was chosen because it can detect outliers and does not require the number of clusters to be specified in advance, which makes it effective for irregular data. The results show that DBSCAN separated the data into three groups and clearly identified the outliers. Cluster 0 contains all Setosa data, while cluster 1 consists of Versicolor and Virginica data. The group labeled -1 contains the data points treated as outliers, which suggests that some specimens have unusual characteristics. In conclusion, DBSCAN is effective at grouping iris flower data by density and at detecting data points that deviate from the rest.
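The abstract does not reproduce the authors' code or parameter values, so the following is only a minimal pure-Python sketch of how density-based clustering assigns cluster labels and the noise label -1 without being told the number of clusters. The points are made-up (petal length, petal width) pairs, not rows from the Iris dataset, and the function names `region_query` and `dbscan` and the values `eps=0.4` and `min_samples=3` are assumptions chosen for illustration.

```python
from math import dist

NOISE = -1  # label used for points that belong to no cluster


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster_id += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster_id
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: reachable, not core
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is core, keep expanding
    return labels


# Hypothetical toy data: two dense groups and one isolated point.
points = [
    (1.4, 0.2), (1.5, 0.2), (1.3, 0.3), (1.6, 0.2),  # dense group A
    (4.5, 1.5), (4.7, 1.4), (4.6, 1.6), (4.8, 1.5),  # dense group B
    (3.0, 2.5),                                      # isolated point
]

labels = dbscan(points, eps=0.4, min_samples=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The two dense groups receive labels 0 and 1, and the isolated point keeps the label -1 because fewer than `min_samples` points lie within `eps` of it. On Google Colab the same idea is usually run through a library such as scikit-learn's `DBSCAN` class, which takes `eps` and `min_samples` and also reports noise as -1, although the abstract does not state which implementation the authors used.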

Flower Species Grouping to Find Out Outliers Using DBSCAN Clustering on Google Colab

Abstract