LitPC: A set of tools for building parallel corpora from literary works

Antoni Oliver, Sergi Álvarez

Research output: Book chapter, Research, Peer-reviewed

Abstract

In this paper, we describe the LitPC toolkit, a variety of tools and methods designed for the quick and effective creation of parallel corpora derived from literary works. This toolkit can be a useful resource due to the scarcity of curated parallel texts for this domain. We also feature a case study describing the creation of a Russian-English parallel corpus based on the literary works by Leo Tolstoy. Furthermore, an augmented version of this corpus is used to both train and assess neural machine translation systems specifically adapted to the author's style.

Original language: English
Title of host publication: CTT 2024 - 1st Workshop on Creative-text Translation and Technology, Proceedings
Editors: Bram Vanroy, Marie-Aude Lefer, Lieve Macken, Paola Ruffo
Publisher: European Association for Machine Translation
Pages: 21-31
Number of pages: 11
ISBN (electronic): 9781068690730
Publication status: Published - 2024

Publication series

Name: CTT 2024 - 1st Workshop on Creative-text Translation and Technology, Proceedings
