Cross-Language Source Code Re-Use Detection

Abstract

Treating source code as a piece of text with its own syntax and formal structure, we aim to apply text re-use detection models to source code. In this paper we compare models for measuring cross-language similarity that do not rely on external resources (cross-language character n-grams, pseudo-cognateness, and word count ratio) against corpus-dependent models: cross-language explicit semantic analysis and cross-language alignment-based similarity analysis.