Flower Species Grouping to Find Out Outliers Using DBSCAN Clustering on Google Colab

  • Adzkia Nur Nasution Universitas Negeri Medan
  • Ardilla Syafitri Lubis Universitas Negeri Medan
  • Keysa Shifa Adwitia Sitepu Universitas Negeri Medan
  • Rezkya Nadilla Putri Universitas Negeri Medan
  • Arnita Arnita Universitas Negeri Medan
Keywords: DBSCAN Clustering, Flower Species Clustering, Google Colab, Outlier Detection

Abstract

This study aims to group iris flower species and detect outliers using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) method on the Google Colab platform. The data come from the Iris dataset in the UCI Machine Learning Repository, which covers three species (Setosa, Versicolor, and Virginica) described by the length and width of their sepals and petals. The DBSCAN workflow in this study consists of data pre-processing, parameter determination, model building, and visualization of the clustering results. DBSCAN was chosen because it can detect outliers and does not require the number of clusters to be specified in advance, which makes it effective for data with irregularly shaped groups. The results show that DBSCAN assigned the data to three labels and identified the outliers clearly. Cluster 0 contains all Setosa data, while cluster 1 consists of Versicolor and Virginica data. The label -1 marks data points treated as outliers, which suggests that some specimens have unusual characteristics. In conclusion, the DBSCAN method is effective at grouping iris flower data by density and at detecting points that deviate from the rest.
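The labeling scheme described in the abstract (clusters numbered from 0, with -1 for outliers) can be illustrated with a minimal, standard-library sketch of DBSCAN. This is not the authors' code: the function name `dbscan`, the sample points, and the values of `eps` and `min_samples` are all assumptions made for illustration, and the points are invented (petal length, petal width) pairs rather than rows from the Iris dataset. In a Colab notebook one would more likely use a library implementation such as scikit-learn's `sklearn.cluster.DBSCAN`.

```python
from math import dist

NOISE = -1  # label for points that belong to no dense region


def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # All points within eps of point i (the point itself included).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, does not expand
                continue
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is also core: keep growing
    return labels


# Hypothetical (petal length, petal width) values in cm, for illustration only.
points = [
    (1.4, 0.2), (1.5, 0.2), (1.3, 0.3), (1.6, 0.2),              # dense group
    (4.5, 1.5), (4.7, 1.4), (4.9, 1.5), (5.1, 1.8), (4.6, 1.3),  # dense group
    (3.0, 2.5),                                                  # isolated point
]
labels = dbscan(points, eps=0.5, min_samples=3)  # assumed parameter values
print(labels)
```

Note that the number of clusters is never passed in; it follows from `eps` and `min_samples`, which is why parameter determination is a separate step in the workflow described above.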

Published
2025-06-04
Section
Articles