What is: End-to-End Neural Diarization?
Source | End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-label Classification |
Year | 2020 |
Data Source | CC BY-SA - https://paperswithcode.com |
End-to-End Neural Diarization (EEND) is a speaker diarization method in which a single neural network directly outputs diarization results for a multi-speaker recording. To make the model end-to-end, speaker diarization is formulated as a multi-label classification problem: for each frame, the network predicts, independently for each speaker, whether that speaker is active. Because the order of speakers in the output is arbitrary, a permutation-free objective function is used to directly minimize diarization errors. Since several speakers can be marked active in the same frame, EEND explicitly handles speaker overlap during both training and inference. The model can be adapted to real conversations simply by training it on multi-speaker recordings with the corresponding speaker segment labels.
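
To make the multi-label formulation and the permutation-free objective concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the function names `binary_cross_entropy` and `permutation_free_loss`, the toy arrays, and the brute-force search over all speaker permutations are hypothetical choices made for readability. The network itself is omitted; `probs` stands in for its per-frame, per-speaker sigmoid outputs.

```python
import itertools

import numpy as np


def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross entropy over all frames and speakers."""
    probs = np.clip(probs, eps, 1.0 - eps)  # avoid log(0)
    return float(
        -(labels * np.log(probs) + (1.0 - labels) * np.log(1.0 - probs)).mean()
    )


def permutation_free_loss(probs, labels):
    """Return the lowest loss over all speaker orderings of the labels.

    probs, labels: arrays of shape (frames, speakers).
    Brute force over permutations is an assumption of this sketch and
    is only practical for a small number of speakers.
    """
    num_speakers = labels.shape[1]
    best_loss, best_perm = float("inf"), None
    for perm in itertools.permutations(range(num_speakers)):
        loss = binary_cross_entropy(probs, labels[:, list(perm)])
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Toy reference labels: 4 frames, 2 speakers.
# Frame 2 is overlapped speech (both active); frame 3 is silence.
labels = np.array([[1, 0], [0, 1], [1, 1], [0, 0]], dtype=float)

# Hypothetical model outputs whose speaker columns are swapped
# relative to the reference labels.
probs = np.array([[0.1, 0.9], [0.8, 0.2], [0.9, 0.9], [0.1, 0.1]])

loss, perm = permutation_free_loss(probs, labels)
print(perm, loss)
```

In this toy case the swapped ordering gives a lower loss than the identity ordering, so the model is not penalized for assigning speakers to output columns in a different order from the reference. The overlapped frame is just a row with two active labels, which is why no separate overlap handling is needed.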