Compiling texts for a specialized corpus in the biochemistry domain: theoretical and methodological aspects

Journal: Procedia - Social and Behavioral Sciences


Nowadays it is practically unthinkable to carry out a linguistic study without resorting to a corpus. Depending on the type of study we wish to perform, we compile a set of texts according to pre-established criteria (the type of documents selected, such as popular science texts, research articles, notes or course syllabi; the authors responsible for the contents; etc.) so that the resulting linguistic study is reliable and of good quality. This paper gives an account of the compilation of specialized German-language texts in the field of biochemistry and describes the creation of the corpus. The fact that most biochemistry texts are now published in English, and that biochemistry has become a multidisciplinary field, makes it difficult to compile texts and consequently complicates the design of the corpus itself. Basing our work on the conceptual structure of the domain under study, we define our project and apply a set of criteria intended to ensure that the textual corpus is representative of the selected sub-field and that it subsequently facilitates the extraction of specialist terminology (Cabré, 1999; Adelstein, 2004).