K-means Clustering in SAS

Clustering of wheat kernel features using K-Means method
Grouping observations into clusters requires an algorithm that computes the distance between each pair of observations. Euclidean and Manhattan distances are the classical measures for this analysis. Clustering can be either hard or soft, depending on how the observations are assigned to clusters. In the hard method, each data point is assigned to exactly one cluster; in this case, each wheat kernel falls into one of three groups: Kama, Rosa, or Canadian. The soft method instead gives each data point a probability of belonging to each cluster, e.g. the probability that a wheat kernel belongs to the Kama cluster rather than another.
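To make the two distance measures concrete, here is a minimal sketch in Python (chosen only to keep the example self-contained; it is not the SAS code used for the analysis). The function names and the two sample points are hypothetical and are not taken from the wheat kernel data.

```python
import math

def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """City-block distance: sum of the absolute differences per feature."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical two-feature observations (illustration only, not real kernels)
p = (0.0, 0.0)
q = (3.0, 4.0)

print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7.0
```

The two measures agree on which points are identical but can rank neighbours differently, so the choice of distance can change which cluster an observation ends up in.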
K-means Clustering: It is an unsupervised, iterative learning algorithm that creates clusters of data elements based on their similarity. The algorithm is used to find patterns in the data. K-means does, however, require the number of clusters to extract from the data, k, to be chosen in advance. The algorithm groups the observations from the data randomly into k groups su…
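Since the description above is cut off, the following is only a rough sketch of how such an iteration is commonly written. It assumes the standard scheme: start from a random partition into k groups, compute each group's centroid, reassign every observation to its nearest centroid by squared Euclidean distance, and repeat until nothing changes. The function name `kmeans`, its parameters, and the toy points are hypothetical; this is Python for illustration, not the SAS procedure.

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Sketch of k-means starting from a random partition (assumed scheme)."""
    rng = random.Random(seed)
    dim = len(points[0])

    # Random initial grouping; cycling 0..k-1 keeps every group non-empty
    labels = [i % k for i in range(len(points))]
    rng.shuffle(labels)

    centroids = []
    for _ in range(iters):
        # Update step: each centroid is the mean of its current members
        centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if not members:
                # Assumption: re-seed an empty cluster with a random point
                members = [rng.choice(points)]
            centroids.append(
                tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
            )

        # Assignment step: nearest centroid by squared Euclidean distance
        new_labels = [
            min(range(k),
                key=lambda c: sum((p[d] - centroids[c][d]) ** 2 for d in range(dim)))
            for p in points
        ]

        if new_labels == labels:  # no observation moved, so we have converged
            break
        labels = new_labels

    return labels, centroids

# Toy data: two well-separated groups (made up, not the wheat kernel data)
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, k=2)
print(labels)
```

The cluster numbers themselves are arbitrary; only the grouping matters. For the wheat kernels one would set k = 3 to match the Kama, Rosa, and Canadian varieties.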