A comparative study of codification techniques for clustering heart disease database

UPV authors
Year
Journal Biomedical Signal Processing and Control

Abstract

This paper compares various proposals for codifying the categorical attributes of a heart disease database so that numerical clustering algorithms can be applied to it. An approach for codifying categorical attributes based on polar coordinates is proposed and compared with other codifications and with methods for clustering mixed databases found in the literature. Our proposal has several advantages: it is relatively easy to understand and apply; the increase in the length of the input matrix is not excessively large; and the error introduced by the codification remains under control. In this case the proposed codification has been combined with the well-known k-means algorithm and has shown very good performance on a heart disease benchmark database. © 2010 Elsevier Ltd. All rights reserved.
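
To make the idea of a polar-coordinate codification concrete, here is a minimal Python sketch. The abstract does not give the exact mapping used in the paper, so everything here is an assumption for illustration: each of the k categories of an attribute is placed at an equally spaced angle on the unit circle and stored as its (cos, sin) pair. The function names polar_encode and one_hot_encode and the CHEST_PAIN category list are hypothetical. One-hot coding is included only as a familiar point of comparison for the number of columns added.

```python
import math

def polar_encode(value, categories):
    """Map a categorical value to a point on the unit circle.

    Assumption for illustration only: category i of k is placed at
    angle 2*pi*i/k. The paper's actual mapping may differ.
    """
    k = len(categories)
    angle = 2.0 * math.pi * categories.index(value) / k
    return (math.cos(angle), math.sin(angle))

def one_hot_encode(value, categories):
    """Standard one-hot coding: one column per category."""
    return tuple(1.0 if c == value else 0.0 for c in categories)

# Hypothetical categorical attribute with four levels.
CHEST_PAIN = ["typical", "atypical", "non-anginal", "asymptomatic"]

polar_row = polar_encode("atypical", CHEST_PAIN)      # always 2 columns
one_hot_row = one_hot_encode("atypical", CHEST_PAIN)  # k columns

print(len(polar_row), len(one_hot_row))
```

Under this assumed mapping, every categorical attribute contributes two numerical columns regardless of how many categories it has, whereas one-hot coding contributes one column per category. The encoded rows can then be passed to any numerical clustering routine such as k-means.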