
Showing 1–12 of 12 results for author: Landini, F

Searching in archive eess.
  1. arXiv:2406.07816  [pdf, other]

    eess.AS cs.CL cs.SD

    Spoof Diarization: "What Spoofed When" in Partially Spoofed Audio

    Authors: Lin Zhang, Xin Wang, Erica Cooper, Mireia Diez, Federico Landini, Nicholas Evans, Junichi Yamagishi

    Abstract: This paper defines Spoof Diarization as a novel task in the Partial Spoof (PS) scenario. It aims to determine what spoofed when, which includes not only locating spoof regions but also clustering them according to different spoofing methods. As a pioneering study in spoof diarization, we focus on defining the task, establishing evaluation metrics, and proposing a benchmark model, namely the Counte…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  2. arXiv:2402.19325  [pdf, other]

    cs.SD eess.AS

    Do End-to-End Neural Diarization Attractors Need to Encode Speaker Characteristic Information?

    Authors: Lin Zhang, Themos Stafylakis, Federico Landini, Mireia Diez, Anna Silnova, Lukáš Burget

    Abstract: In this paper, we apply the variational information bottleneck approach to end-to-end neural diarization with encoder-decoder attractors (EEND-EDA). This allows us to investigate what information is essential for the model. EEND-EDA utilizes attractors, vector representations of speakers in a conversation. Our analysis shows that, attractors do not necessarily have to contain speaker characteristi…

    Submitted 20 June, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted to Odyssey 2024. This arXiv version includes an appendix for more visualizations. Code: https://github.com/BUTSpeechFIT/EENDEDA_VIB

  3. arXiv:2312.04324  [pdf, other]

    eess.AS cs.SD

    DiaPer: End-to-End Neural Diarization with Perceiver-Based Attractors

    Authors: Federico Landini, Mireia Diez, Themos Stafylakis, Lukáš Burget

    Abstract: Until recently, the field of speaker diarization was dominated by cascaded systems. Due to their limitations, mainly regarding overlapped speech and cumbersome pipelines, end-to-end models have gained great popularity lately. One of the most successful models is end-to-end neural diarization with encoder-decoder based attractors (EEND-EDA). In this work, we replace the EDA module with a Perceiver-…

    Submitted 1 June, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted by IEEE/ACM Transactions on Audio, Speech, and Language Processing

  4. arXiv:2310.02732  [pdf, ps, other]

    eess.AS cs.SD

    Discriminative Training of VBx Diarization

    Authors: Dominik Klement, Mireia Diez, Federico Landini, Lukáš Burget, Anna Silnova, Marc Delcroix, Naohiro Tawara

    Abstract: Bayesian HMM clustering of x-vector sequences (VBx) has become a widely adopted diarization baseline model in publications and challenges. It uses an HMM to model speaker turns, a generatively trained probabilistic linear discriminant analysis (PLDA) for speaker distribution modeling, and Bayesian inference to estimate the assignment of x-vectors to speakers. This paper presents a new framework fo…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: Submitted to ICASSP 2024

  5. arXiv:2309.08377  [pdf, other]

    eess.AS cs.CL cs.SD

    DiaCorrect: Error Correction Back-end For Speaker Diarization

    Authors: Jiangyu Han, Federico Landini, Johan Rohdin, Mireia Diez, Lukas Burget, Yuhang Cao, Heng Lu, Jan Cernocky

    Abstract: In this work, we propose an error correction framework, named DiaCorrect, to refine the output of a diarization system in a simple yet effective way. This method is inspired by error correction techniques in automatic speech recognition. Our model consists of two parallel convolutional encoders and a transform-based decoder. By exploiting the interactions between the input recording and the initia…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: Submitted to ICASSP 2024

  6. arXiv:2305.13580  [pdf, other]

    eess.AS cs.SD

    Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization

    Authors: Marc Delcroix, Naohiro Tawara, Mireia Diez, Federico Landini, Anna Silnova, Atsunori Ogawa, Tomohiro Nakatani, Lukas Burget, Shoko Araki

    Abstract: Combining end-to-end neural speaker diarization (EEND) with vector clustering (VC), known as EEND-VC, has gained interest for leveraging the strengths of both methods. EEND-VC estimates activities and speaker embeddings for all speakers within an audio chunk and uses VC to associate these activities with speaker identities across different chunks. EEND-VC generates thus multiple streams of embeddi…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted at Interspeech 2023

  7. arXiv:2211.06750  [pdf, other]

    eess.AS cs.SD

    Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization

    Authors: Federico Landini, Mireia Diez, Alicia Lozano-Diez, Lukáš Burget

    Abstract: End-to-end diarization presents an attractive alternative to standard cascaded diarization systems because a single system can handle all aspects of the task at once. Many flavors of end-to-end models have been proposed but all of them require (so far non-existing) large amounts of annotated data for training. The compromise solution consists in generating synthetic data and the recently proposed…

    Submitted 24 February, 2023; v1 submitted 12 November, 2022; originally announced November 2022.

    Comments: Accepted by ICASSP 2023

  8. arXiv:2204.00890  [pdf, other]

    eess.AS cs.SD

    From Simulated Mixtures to Simulated Conversations as Training Data for End-to-End Neural Diarization

    Authors: Federico Landini, Alicia Lozano-Diez, Mireia Diez, Lukáš Burget

    Abstract: End-to-end neural diarization (EEND) is nowadays one of the most prominent research topics in speaker diarization. EEND presents an attractive alternative to standard cascaded diarization systems since a single system is trained at once to deal with the whole diarization problem. Several EEND variants and approaches are being proposed, however, all these models require large amounts of annotated d…

    Submitted 25 June, 2022; v1 submitted 2 April, 2022; originally announced April 2022.

    Comments: Accepted at Interspeech 2022

  9. arXiv:2012.14952  [pdf, other]

    eess.AS cs.SD

    Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks

    Authors: Federico Landini, Ján Profant, Mireia Diez, Lukáš Burget

    Abstract: The recently proposed VBx diarization method uses a Bayesian hidden Markov model to find speaker clusters in a sequence of x-vectors. In this work we perform an extensive comparison of performance of the VBx diarization with other approaches in the literature and we show that VBx achieves superior performance on three of the most popular datasets for evaluating diarization: CALLHOME, AMI and DIHAR…

    Submitted 29 December, 2020; originally announced December 2020.

    Comments: Submitted to Computer Speech and Language, Special Issue on Separation, Recognition, and Diarization of Conversational Speech

  10. arXiv:2010.11718  [pdf, ps, other]

    eess.AS cs.SD

    Analysis of the BUT Diarization System for VoxConverse Challenge

    Authors: Federico Landini, Ondřej Glembek, Pavel Matějka, Johan Rohdin, Lukáš Burget, Mireia Diez, Anna Silnova

    Abstract: This paper describes the system developed by the BUT team for the fourth track of the VoxCeleb Speaker Recognition Challenge, focusing on diarization on the VoxConverse dataset. The system consists of signal pre-processing, voice activity detection, speaker embedding extraction, an initial agglomerative hierarchical clustering followed by diarization using a Bayesian hidden Markov model, a reclust…

    Submitted 9 February, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

    Comments: Accepted to ICASSP 2021

  11. arXiv:2002.11356  [pdf, ps, other]

    eess.AS

    BUT System for the Second DIHARD Speech Diarization Challenge

    Authors: Federico Landini, Shuai Wang, Mireia Diez, Lukáš Burget, Pavel Matějka, Kateřina Žmolíková, Ladislav Mošner, Anna Silnova, Oldřich Plchot, Ondřej Novotný, Hossein Zeinali, Johan Rohdin

    Abstract: This paper describes the winning systems developed by the BUT team for the four tracks of the Second DIHARD Speech Diarization Challenge. For tracks 1 and 2 the systems were mainly based on performing agglomerative hierarchical clustering (AHC) of x-vectors, followed by another x-vector clustering based on Bayes hidden Markov model and variational Bayes inference. We provide a comparison of the im…

    Submitted 26 February, 2020; originally announced February 2020.

  12. arXiv:1910.08847  [pdf, ps, other]

    eess.AS

    BUT System Description for DIHARD Speech Diarization Challenge 2019

    Authors: Federico Landini, Shuai Wang, Mireia Diez, Lukáš Burget, Pavel Matějka, Kateřina Žmolíková, Ladislav Mošner, Oldřich Plchot, Ondřej Novotný, Hossein Zeinali, Johan Rohdin

    Abstract: This paper describes the systems developed by the BUT team for the four tracks of the second DIHARD speech diarization challenge. For tracks 1 and 2 the systems were based on performing agglomerative hierarchical clustering (AHC) over x-vectors, followed by the Bayesian Hidden Markov Model (HMM) with eigenvoice priors applied at x-vector level followed by the same approach applied at frame level.…

    Submitted 19 October, 2019; originally announced October 2019.