Efficient Time Series Clustering: A Distance-Based Feature Engineering Framework with Minimal Hyperparameter Tuning

dc.contributor.advisor: Khreich, Wael
dc.contributor.author: El Ajouz, Marwa
dc.contributor.commembers: Nasr, Walid
dc.contributor.commembers: Olleik, Majd
dc.contributor.commembers: Taleb, Sirine
dc.contributor.degree: MSBA
dc.contributor.department: Suliman S. Olayan School of Business
dc.contributor.faculty: Suliman S. Olayan School of Business
dc.contributor.institution: American University of Beirut
dc.date: 2024
dc.date.accessioned: 2024-09-05T09:41:53Z
dc.date.available: 2024-09-05T09:41:53Z
dc.date.issued: 2024-09-04T21:00:00Z
dc.date.submitted: 2024-09-03T21:00:00Z
dc.description.abstract: Time series clustering is a critical tool for extracting valuable insights from time series data. However, it is challenging because of the unique properties of time series, such as noise and data shifts. One major challenge lies in selecting an appropriate distance measure for the clustering algorithm, a choice that significantly affects overall clustering performance. This research introduces an improved time series clustering approach based on a novel feature extraction technique built on an enhanced vector-based distance measure. Our feature extraction process, named DBFE, converts time series data into distance-based feature vectors using the enhanced distance measure, which is both efficient and hyperparameter-free and remains robust to noise, outliers, and simple shifts in the data. Experimental results show that the proposed approach improves clustering performance compared to state-of-the-art methods. When tested on 22 time series datasets and compared with traditional clustering approaches, clustering on DBFE features produced better results on 18 datasets and equivalent results on two, and failed on only two, one of which is not suitable for clustering and the other of which is too small to support a reliable evaluation. DBFE has also been extended to multivariate data and is therefore suitable for a wider range of time series applications in domains such as medicine, finance, and marketing. By applying this enhanced clustering approach, researchers could more accurately discover patterns, detect anomalies, and recognize dynamic changes in data.
dc.identifier.uri: http://hdl.handle.net/10938/24578
dc.language.iso: en
dc.subject.keywords: Distance-Based Feature Extraction (DBFE)
dc.subject.keywords: Time series
dc.subject.keywords: Clustering
dc.subject.keywords: MPdist
dc.subject.lcsh: Time-series analysis
dc.subject.lcsh: Cluster analysis--Data processing
dc.subject.lcsh: Data mining
dc.subject.lcsh: Expert systems (Computer science)
dc.title: Efficient Time Series Clustering: A Distance-Based Feature Engineering Framework with Minimal Hyperparameter Tuning
dc.type: Thesis
local.AUBID: 201803182
