Application of Soft-Clustering Analysis Using Expectation Maximization Algorithms on Gaussian Mixture Model

Muthahharah, Andi Shahifah and Tiro, Muhammad Arif and Aswi, Aswi (2022) Application of Soft-Clustering Analysis Using Expectation Maximization Algorithms on Gaussian Mixture Model. Jurnal Varian, 6 (1). pp. 71-80. ISSN 2581-2017

Abstract

Soft clustering has received far less research attention than hard clustering, yet soft-clustering algorithms are important for solving complex clustering problems. One soft-clustering method is the Gaussian Mixture Model (GMM). This study aims to determine the number of clusters formed by the GMM method. The data are synthetic water quality indicators obtained from the Kaggle website. The stages of the analysis are: imputing Not Available (NA) values, checking the data distribution, conducting a normality test, standardizing the data, and estimating the parameters with the Expectation Maximization (EM) algorithm. The best number of clusters is the one with the largest value of the Bayesian Information Criterion (BIC). The results showed that the best number of clusters was 3. Cluster 1 consisted of 1110 observations in the low-quality category, cluster 2 consisted of 499 observations in the medium-quality category, and cluster 3 consisted of 1667 observations in the high-quality category. The results suggest that the GMM method groups observations correctly when the variables used are generally normally distributed. The method can be applied to real data in which the variables are either normally distributed or a mixture of Gaussian and non-Gaussian distributions.
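The stages listed in the abstract (imputation, standardization, EM estimation, BIC-based selection) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single variable rather than the multivariate water quality data, mean imputation (the abstract does not name the imputation method), quantile-based starting values, and the "larger is better" form of BIC, 2 log L - p log n, which is consistent with the abstract's choice of the largest BIC. All function names (impute_mean, standardize, e_step, fit_gmm, bic, select_k) and the toy data are hypothetical.

```python
import math
import random

def impute_mean(xs):
    """Replace missing values (None) with the mean of the observed values (assumed method)."""
    seen = [x for x in xs if x is not None]
    mean = sum(seen) / len(seen)
    return [mean if x is None else x for x in xs]

def standardize(xs):
    """Rescale to zero mean and unit sample standard deviation."""
    mean = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (len(xs) - 1))
    return [(x - mean) / sd for x in xs]

def e_step(xs, weights, mus, variances):
    """Return responsibilities (soft memberships) and the log-likelihood."""
    resp, loglik = [], 0.0
    for x in xs:
        dens = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, mus, variances)]
        total = max(sum(dens), 1e-300)  # guard against log(0) on underflow
        loglik += math.log(total)
        resp.append([d / total for d in dens])
    return resp, loglik

def fit_gmm(xs, k, max_iter=200, tol=1e-8, var_floor=1e-6):
    """Fit a one-dimensional k-component Gaussian mixture with EM."""
    n = len(xs)
    ordered = sorted(xs)
    mus = [ordered[int((j + 0.5) * n / k)] for j in range(k)]  # quantile start
    grand_mean = sum(xs) / n
    variances = [sum((x - grand_mean) ** 2 for x in xs) / n] * k
    weights = [1.0 / k] * k
    prev = -math.inf
    for _ in range(max_iter):
        resp, loglik = e_step(xs, weights, mus, variances)
        if abs(loglik - prev) < tol:  # stop when the log-likelihood stalls
            break
        prev = loglik
        for j in range(k):  # M-step: re-estimate each component
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            spread = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(spread, var_floor)  # avoid collapsing variances
    resp, loglik = e_step(xs, weights, mus, variances)  # match final parameters
    return weights, mus, resp, loglik

def bic(loglik, k, n):
    """BIC in the 'larger is better' form: 2*logL - p*log(n)."""
    n_params = 3 * k - 1  # k means + k variances + (k - 1) free weights
    return 2 * loglik - n_params * math.log(n)

def select_k(xs, candidates=range(1, 6)):
    """Fit one mixture per candidate k and return the k with the largest BIC."""
    scores = {k: bic(fit_gmm(xs, k)[3], k, len(xs)) for k in candidates}
    return max(scores, key=scores.get), scores

if __name__ == "__main__":
    random.seed(0)
    # Toy data: three well-separated groups, one value marked as missing.
    raw = [random.gauss(c, 0.5) for c in (-5.0, 0.0, 5.0) for _ in range(100)]
    raw[10] = None
    xs = standardize(impute_mean(raw))
    best_k, scores = select_k(xs)
    print("selected number of clusters:", best_k)
```

Each row of the responsibilities returned by fit_gmm sums to 1 and gives an observation's degree of membership in every cluster, which is what distinguishes this soft assignment from a hard one. A multivariate analysis like the one in the study would additionally need full covariance matrices per component, which this one-dimensional sketch leaves out.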

Item Type:	Article
Subjects:	FMIPA > STATISTIKA - (S1) FMIPA KARYA ILMIAH DOSEN Universitas Negeri Makassar > KARYA ILMIAH DOSEN
Divisions:	KOLEKSI KARYA ILMIAH UPT PERPUSTAKAAN UNM MENURUT FAKULTAS > KARYA ILMIAH DOSEN KARYA ILMIAH DOSEN
Depositing User:	Dr. Aswi Aswi
Date Deposited:	03 Apr 2023 07:07
Last Modified:	14 Apr 2023 01:47
URI:	http://eprints.unm.ac.id/id/eprint/27557
