Handwriting word recognition using windowed Bernoulli HMMs

Autores UPV


idden Markov Models (HMMs) are now widely used for off-line handwriting recognition in many lan- guages. As in speech recognition, they are usually built from shared, embedded HMMs at symbol level, where state-conditional probability density functions in each HMM are modeled with Gaussian mixtures. In contrast to speech recognition, however, it is unclear which kind of features should be used and, indeed, very different features sets are in use today. Among them, we have recently proposed to directly use columns of raw, binary image pixels, which are directly fed into embedded Bernoulli (mixture) HMMs, that is, embedded HMMs in which the emission probabilities are modeled with Bernoulli mix- tures. The idea is to by-pass feature extraction and to ensure that no discriminative information is filtered out during feature extraction, which in some sense is integrated into the recognition model. In this work, column bit vectors are extended by means of a sliding window of adequate width to better capture image context at each horizontal position of the word image. Using these windowed Bernoulli mixture HMMs, good results are reported on the well-known IAM and RIMES databases of Latin script, and in particular, state-of-the-art results are provided on the IfN/ENIT database of Arabic handwritten words.