On the use of a fuzzy classifier to speed up the Sp-ToBI labeling of the Glissando Spanish corpus

David Escudero-Mancebo, Lourdes Aguilar-Cuevas, César González-Ferreras, Yurena Gutierrez-Gonzalez, Valentín Cardenoso-Payo

Research output: Book/Report, Proceeding, Research, peer-review

1 Citation (Scopus)

Abstract

In this paper, we present the application of a novel automatic prosodic labeling methodology for speeding up the manual labeling of the Glissando corpus (Spanish read news items). The methodology is based on the use of soft classification techniques. The output of the automatic system consists of a set of label candidates per word. The number of predicted candidates depends on the degree of certainty assigned by the classifier to each of the predictions. The manual transcriber checks each set of predictions and selects the correct label. We describe the fundamentals of the fuzzy classification tool and its training with a corpus labeled with Sp-ToBI labels. Results show a clear coherence between the most confused labels in the output of the automatic classifier and the most confused labels detected in inter-transcriber consistency tests. More importantly, in a preliminary test, the real time ratio of the labeling process was 1:66 when the template of predictions was used and 1:80 when it was not.
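
The abstract does not say how the number of candidates per word is derived from the classifier's certainty, so the following is only a minimal sketch of one plausible rule: keep the top-ranked labels until their normalized membership degrees reach a coverage threshold. The function names, the `coverage` parameter, and the example membership values are assumptions made for illustration, not the authors' implementation. The second helper only restates the arithmetic behind the reported ratios, assuming 1:66 and 1:80 mean 66 and 80 units of labeling time per unit of speech.

```python
from typing import Dict, List


def candidate_labels(memberships: Dict[str, float], coverage: float = 0.9) -> List[str]:
    """Return the shortest list of top-ranked labels whose normalized
    membership degrees sum to at least `coverage` (hypothetical rule)."""
    if not memberships:
        return []
    total = sum(memberships.values())
    if total <= 0:
        raise ValueError("membership degrees must sum to a positive value")

    # Rank labels from the most to the least certain prediction.
    ranked = sorted(memberships.items(), key=lambda kv: kv[1], reverse=True)

    selected: List[str] = []
    cumulative = 0.0
    for label, degree in ranked:
        selected.append(label)
        cumulative += degree / total
        if cumulative >= coverage:
            break
    return selected


def relative_time_saving(with_template: float = 66.0, without_template: float = 80.0) -> float:
    """Fraction of labeling time saved when the prediction template is used."""
    return (without_template - with_template) / without_template


# Illustrative membership degrees for two words (made-up values).
confident_word = {"L+H*": 0.95, "L*+H": 0.03, "H*": 0.02}
uncertain_word = {"L+H*": 0.50, "L*+H": 0.45, "H*": 0.05}

print(candidate_labels(confident_word))   # one candidate: the classifier is certain
print(candidate_labels(uncertain_word))   # two candidates: the transcriber chooses
print(relative_time_saving())             # (80 - 66) / 80
```

Under this rule a confident prediction yields a single candidate, while an ambiguous one yields a short list for the transcriber to choose from. With the ratios reported in the abstract, the template corresponds to a saving of 14/80, or 17.5%, of the labeling time in that preliminary test.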

Original language: American English
Number of pages: 8
ISBN (Electronic): 9782951740884
Publication status: Published - 2014

Publication series

Name: Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014

Keywords

  • Fuzzy classifier
  • Prosodic labeling
  • Sp-ToBI

Cite this

Escudero-Mancebo, D., Aguilar-Cuevas, L., González-Ferreras, C., Gutierrez-Gonzalez, Y., & Cardenoso-Payo, V. (2014). On the use of a fuzzy classifier to speed up the Sp-ToBI labeling of the Glissando Spanish corpus. (Proceedings of the 9th International Conference on Language Resources and Evaluation, LREC 2014).