Abstract
Many devices, such as motion sensors, e-pens, and eye trackers, generate large amounts of
sequentially ordered data, and these data often need to be compressed for
classification, storage, and/or retrieval tasks. Traditional clustering algorithms can be used
for this purpose, but unfortunately they do not cope with the sequential information
implicitly embedded in such data. Thus, we revisit the well-known K-means algorithm
and provide a general method to properly cluster sequentially distributed data. We present
Warped K-Means (WKM), a multi-purpose partitional clustering procedure that minimizes
the sum-of-squared-errors criterion while imposing a hard sequentiality constraint in the
classification step. We illustrate the properties of WKM in three applications, one being
the segmentation and classification of human activity. WKM outperformed five state-of-the-art
clustering techniques at simplifying data trajectories, achieving a recognition accuracy
of nearly 97%, an improvement of around 66% over its peers. Moreover, this
improvement came with a reduction in the computational cost of more than one order of
magnitude.
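
To make the idea of a hard sequentiality constraint concrete, the following is a minimal sketch, not the WKM procedure itself. It assumes that each cluster must be a contiguous run of samples and that the sum of squared errors can be reduced by a simple local search that shifts one cluster boundary by one sample at a time. The function names (`segment_sse`, `total_sse`, `sequential_clusters`) and the toy data are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

Point = Sequence[float]


def segment_sse(points: List[Point]) -> float:
    """Sum of squared distances from each point to the segment mean."""
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / n for d in range(dim)]
    return sum((p[d] - mean[d]) ** 2 for p in points for d in range(dim))


def total_sse(points: List[Point], starts: List[int]) -> float:
    """Total SSE of the contiguous segments that begin at the indices in `starts`."""
    ends = starts[1:] + [len(points)]
    return sum(segment_sse(points[s:e]) for s, e in zip(starts, ends))


def sequential_clusters(points: List[Point], k: int, max_iter: int = 100) -> List[int]:
    """Return the start index of each of k contiguous clusters.

    Illustrative local search (an assumption, not the published algorithm):
    begin with near-equal segments, then move a boundary by one sample
    whenever doing so lowers the total SSE and leaves no cluster empty.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(points)")
    starts = [(j * n) // k for j in range(k)]
    best = total_sse(points, starts)
    for _ in range(max_iter):
        improved = False
        for j in range(1, k):  # the first boundary is fixed at index 0
            for step in (-1, 1):
                candidate = starts[:]
                candidate[j] += step
                lower = starts[j - 1]
                upper = starts[j + 1] if j + 1 < k else n
                if not lower < candidate[j] < upper:
                    continue  # this move would empty a neighbouring cluster
                sse = total_sse(points, candidate)
                if sse < best - 1e-12:
                    starts, best, improved = candidate, sse, True
        if not improved:
            break
    return starts


# Toy 1-D sequence with three plateaus of lengths 4, 3, and 5.
data = [(0.0,)] * 4 + [(5.0,)] * 3 + [(10.0,)] * 5
boundaries = sequential_clusters(data, k=3)
print(boundaries)  # [0, 4, 7]
```

Because every cluster is a contiguous segment, the result is fully described by k start indices, and only samples next to a boundary can change cluster. A local search of this kind can stop at a local minimum of the SSE, so the sketch should be read as an illustration of the constraint rather than as a reference implementation.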