This workflow reads in a dataset containing houses listed for sale and uses K-Means Clustering from Apache Spark ML to group listings.
Below is the workflow for creating a K-Means model for clustering the houses. It does the following:
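Reads data from a sample dataset.
Prints the data that was read.
Assembles the feature columns into a feature vector for the model.
Splits the dataset into subsets.
Performs K-Means clustering on the assembled features.
Uses the fitted model to predict a cluster for each house.
Prints the prediction result.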
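
The steps above can be sketched in plain Python to show what each stage does. This is an illustration only, not the Spark ML API and not the workflow's actual implementation: the column names, the sample listings, the 75/25 split fraction, and every helper function are hypothetical. In Spark ML, the assembling and clustering stages would typically correspond to VectorAssembler and KMeans.

```python
import random

# Hypothetical sample listings; the real dataset's columns are not given in the text.
LISTINGS = [
    {"price": 120.0, "sqft": 800.0, "bedrooms": 2.0},
    {"price": 130.0, "sqft": 850.0, "bedrooms": 2.0},
    {"price": 125.0, "sqft": 820.0, "bedrooms": 2.0},
    {"price": 135.0, "sqft": 900.0, "bedrooms": 2.0},
    {"price": 480.0, "sqft": 2900.0, "bedrooms": 5.0},
    {"price": 500.0, "sqft": 3000.0, "bedrooms": 5.0},
    {"price": 510.0, "sqft": 3100.0, "bedrooms": 5.0},
    {"price": 495.0, "sqft": 2950.0, "bedrooms": 5.0},
]
FEATURE_COLUMNS = ["price", "sqft", "bedrooms"]  # assumed feature columns


def assemble(row, columns):
    """Combine the chosen columns into one feature vector (a tuple)."""
    return tuple(row[c] for c in columns)


def split(rows, fraction, seed=0):
    """Shuffle the rows and split them into two subsets."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(centers, point):
    """Return the index of the nearest cluster center."""
    return min(range(len(centers)), key=lambda i: squared_distance(centers[i], point))


def fit_kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: assign points to the nearest center, then recompute centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[predict(centers, p)].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


# Read, print, assemble, split, cluster, predict, print.
for row in LISTINGS:
    print(row)
vectors = [assemble(row, FEATURE_COLUMNS) for row in LISTINGS]
fit_rows, predict_rows = split(vectors, 0.75)
centers = fit_kmeans(fit_rows, k=2)
predictions = [(v, predict(centers, v)) for v in predict_rows]
for vector, cluster in predictions:
    print(vector, "-> cluster", cluster)
```

The cluster indices themselves carry no meaning; only which listings share an index matters. The features here are left unscaled, so the column with the largest range (square footage in this sample) dominates the distance.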