On the Use of PU Learning for Quality Flaw Prediction in Wikipedia

UPV authors
Journal: CLEF Conference on Multilingual and Multimodal Information Access Evaluation


In this article we describe a new approach to quality flaw prediction in Wikipedia. The partially supervised method studied, called PU Learning, has been successfully applied to classification tasks on traditional corpora such as Reuters-21578 and 20-Newsgroups. To the best of our knowledge, this is the first time it has been applied in this domain. Throughout this paper, we describe how the original PU Learning approach was evaluated for assessing quality flaws, and the modifications introduced to obtain a quality flaw predictor that achieved the best F1 scores in the task "Quality Flaw Prediction in Wikipedia" of the PAN challenge.