dc.description.abstract | Clustering is an attempt to form groups of similar objects, and it is a powerful tool for
discovering valuable underlying patterns in data. When clustering high-dimensional
data, algorithms can suffer from the curse of dimensionality: the data becomes sparse
as the number of dimensions grows, which can lead to poor clustering
performance. Dimensionality reduction methods (DRMs) are designed to help alleviate this issue. In a time series, which is a temporally ordered set of points, each consecutive point
in time can be considered a dimension, so time series count as high-dimensional data.
Time-Series K-Means (TSK-Means) with Dynamic Time Warping (DTW) is an algorithm
that has proven successful for clustering time series. However, TSK-Means is computationally complex and might require substantial training time due to the potentially high
dimensionality of time series.
This thesis studies the clustering of time-series data collected from temperature sensors installed in refrigerators, and tries to make it less computationally complex by using the
DRMs Principal Component Analysis (PCA), Time-Series Autoencoder (TSA), and Self-Organizing Maps (SOM). We combine these methods with three clustering algorithms, namely K-Means, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and Agglomerative Hierarchical Clustering (AHC), to find potentially valuable patterns in the provided data. The clusters and patterns were evaluated on a theoretical and
practical level with regard to pattern recognition and detection in the domain
of refrigerator temperature monitoring and logging. This is an effort to improve refrigerator
maintenance, quality assurance, and deviation management, and to potentially reduce food
loss.
The results indicate that TSK-Means outperforms every other combination of DRMs and
clustering algorithms when it comes to detecting patterns in the data, despite being more
computationally complex. Nevertheless, the use of DRMs simplified the clustering of
time series and allowed the K-Means algorithm to detect patterns more efficiently than the
TSK-Means algorithm. The clusters and patterns that were discovered seem promising for
deviation management and refrigerator quality assurance. | |
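
As an illustration of why DTW-based clustering is costly, the following is a minimal sketch of a DTW distance between two series. It is not the thesis's implementation: the function name `dtw_distance`, the squared pointwise cost, and the sample temperature readings are assumptions made for this example. The nested loops make each comparison quadratic in the series length, and a DTW-based K-Means has to repeat that comparison for every series-centroid pair in every iteration.

```python
import math


def dtw_distance(a, b):
    """Illustrative DTW distance between two numeric sequences.

    Uses a squared pointwise cost and returns the square root of the
    minimal cumulative cost over all monotonic alignments.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


# Hypothetical refrigerator readings in degrees Celsius: the second series
# shows the same rise as the first, only delayed by one time step.
reference = [4.0, 4.0, 6.0, 8.0]
delayed = [4.0, 4.0, 4.0, 6.0, 8.0]
shift_tolerant = dtw_distance(reference, delayed)
```

Because the warping path may match one point to several points of the other series, the delayed rise is aligned with the reference at zero cost, which a pointwise Euclidean comparison could not express for series of unequal length.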