Warped K-Means: An algorithm to cluster sequentially-distributed data

UPV authors
Year
Journal INFORMATION SCIENCES

Abstract

Many devices, such as motion sensors, e-pens, and eye trackers, generate large amounts of data that follow some sort of sequentiality, and often these data need to be compressed for classification, storage, and/or retrieval tasks. Traditional clustering algorithms can be used for this purpose, but unfortunately they do not cope with the sequential information implicitly embedded in such data. Thus, we revisit the well-known K-means algorithm and provide a general method to properly cluster sequentially-distributed data. We present Warped K-Means (WKM), a multi-purpose partitional clustering procedure that minimizes the sum of squared error criterion, while imposing a hard sequentiality constraint in the classification step. We illustrate the properties of WKM in three applications, one being the segmentation and classification of human activity. WKM outperformed five state-of-the-art clustering techniques at simplifying data trajectories, achieving a recognition accuracy of nearly 97%, an improvement of around 66% over its peers. Moreover, this improvement came with a reduction in computational cost of more than one order of magnitude.
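
To make the idea of minimizing the sum of squared error under a hard sequentiality constraint more concrete, the following is a minimal sketch and not the WKM procedure itself. It assumes that every cluster must be a contiguous run of samples and uses a simplified heuristic: start from equal-length segments and shift each boundary by one sample whenever that lowers the total squared error. The names `segment_sse` and `sequential_kmeans` are hypothetical and chosen only for illustration.

```python
import numpy as np


def segment_sse(X, start, end):
    """Sum of squared distances of X[start:end] to the segment mean."""
    seg = X[start:end]
    return float(((seg - seg.mean(axis=0)) ** 2).sum())


def sequential_kmeans(X, k, max_iter=100):
    """Illustrative contiguity-constrained clustering (assumed heuristic).

    X is an (n, d) array whose rows are in sequence order.
    Returns the k + 1 segment boundaries and one label per sample.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= number of samples")

    # Equal-length initial segments; cluster j covers X[b[j]:b[j + 1]].
    b = np.array([i * n // k for i in range(k + 1)])

    for _ in range(max_iter):
        moved = False
        for j in range(1, k):  # interior boundaries only
            best = segment_sse(X, b[j - 1], b[j]) + segment_sse(X, b[j], b[j + 1])
            for step in (-1, 1):
                cand = b[j] + step
                # Both neighbouring clusters must stay non-empty.
                if cand <= b[j - 1] or cand >= b[j + 1]:
                    continue
                cost = segment_sse(X, b[j - 1], cand) + segment_sse(X, cand, b[j + 1])
                if cost < best - 1e-12:  # accept only strict improvements
                    b[j] = cand
                    best = cost
                    moved = True
        if not moved:
            break

    labels = np.repeat(np.arange(k), np.diff(b))
    return b, labels


# Toy 1-D trajectory with three plateaus of unequal length.
X = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 5.3, 10.0, 10.1]).reshape(-1, 1)
boundaries, labels = sequential_kmeans(X, k=3)
print(boundaries, labels)
```

Because samples can only move between adjacent clusters, the labels are always non-decreasing along the sequence, which is the property that unconstrained K-means does not guarantee.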