Novel clustering selection criterion for fast binary key speaker diarization

Héctor Delgado, Xavier Anguera, Corinne Fredouille, Javier Serrano

Research output: Contribution to journal › Article › Research › peer-review

3 Citations (Scopus)

Abstract

Speaker diarization has become an important building block in many speech-related systems. Given the rapid growth of audiovisual media, fast systems are required to process large amounts of data in a reasonable time. In this regard, the recently proposed speaker diarization system based on binary key speaker modeling provides a very fast alternative to state-of-the-art systems at the cost of a slight decrease in performance. This decrease is mainly due to drawbacks in the final clustering selection algorithm, which is far from returning the best clustering the system is actually able to generate. In addition, we have identified parts of our system that can be sped up further. This paper addresses both issues, first by lightening the processing at the main identified bottleneck, and second by proposing an alternative clustering selection technique capable of providing near-optimum clustering outputs. Experimental results on the REPERE test database validate the effectiveness of the proposed improvements, with a relative performance gain of 20% and execution times of 0.037 xRT (where xRT is the real-time factor).
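To put the reported speed in concrete terms, the sketch below uses the usual definition of the real-time factor, processing time divided by audio duration. The function names are hypothetical and chosen for illustration only; the abstract reports just the 0.037 xRT figure.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Real-time factor (xRT): processing time divided by audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def processing_time(xrt: float, audio_seconds: float) -> float:
    """Processing time in seconds implied by a given xRT and audio duration."""
    return xrt * audio_seconds


# One hour of audio at the reported 0.037 xRT takes roughly 133 seconds.
seconds_for_one_hour = processing_time(0.037, 3600.0)
```

A value below 1 xRT means the system runs faster than real time; at 0.037 xRT, processing takes under 4% of the audio's duration.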

Original language: American English
Pages (from-to): 3091-3095
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2015-January
Publication status: Published - 1 Jan 2015

Keywords

  • Binary key
  • Cosine distance
  • Elbow criterion
  • Speaker diarization
  • Within-class sum of squares
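The keywords point to a clustering selection based on the within-class sum of squares (WCSS), cosine distance, and an elbow criterion. The abstract does not give the exact procedure, so the following is only a generic sketch of that idea, not the authors' implementation. It assumes WCSS is computed from cosine distances to cluster centroids and that the elbow is the point farthest from the straight line joining the first and last WCSS values. All function names are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def wcss(points, labels):
    """Sum of squared cosine distances from each point to its cluster centroid.

    Assumption: the centroid is the arithmetic mean of the cluster members.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(m[i] for m in members) / len(members) for i in range(dim)]
        total += sum(cosine_distance(m, centroid) ** 2 for m in members)
    return total


def elbow_index(values):
    """Index of the value farthest from the line joining the first and last points.

    values[i] is the WCSS of the i-th candidate clustering, ordered by
    increasing number of clusters. Axes are used unscaled in this sketch.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three candidate clusterings")
    x1, y1 = 0, values[0]
    x2, y2 = n - 1, values[-1]
    norm = math.hypot(x2 - x1, y2 - y1)
    best_index, best_distance = 0, -1.0
    for i, y in enumerate(values):
        distance = abs((y2 - y1) * i - (x2 - x1) * y + x2 * y1 - y2 * x1) / norm
        if distance > best_distance:
            best_index, best_distance = i, distance
    return best_index


# Illustrative WCSS curve (made-up numbers) for clusterings with 1..6 clusters.
example_curve = [100.0, 40.0, 10.0, 8.0, 7.0, 6.0]
selected = elbow_index(example_curve)  # index 2, i.e. the 3-cluster solution
```

The farthest-point rule is one of several common ways to locate an elbow. Because the axes are left unscaled here, the selected point can shift if the WCSS values are rescaled, so a real system would likely normalize both axes first.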
