Speaker Separation and Speaker Identification with Python

Author: 木道寻08 | 2024-08-08 21:00:24


Separating Speakers in Speech Audio

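Scenario:

A single audio recording contains several speakers, and the goal is to separate out what each person said.
Voice embeddings are already available for some known people. For each separated segment, compute the cosine distance between its embedding and each known embedding; the known person with the smallest cosine distance is taken as the speaker.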
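To make the distance criterion concrete, here is a minimal pure-Python sketch of cosine distance, the same quantity that scipy.spatial.distance.cosine returns (1 minus the cosine of the angle between two vectors). The three-dimensional vectors are made-up stand-ins for illustration; real voice embeddings have far more dimensions.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b); values near 0 mean 'same direction'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Toy "embeddings" (hypothetical values, not produced by any model).
enrolled = [1.0, 2.0, 3.0]
same_voice = [2.0, 4.0, 6.0]    # same direction as `enrolled`, different magnitude
other_voice = [3.0, -1.5, 0.0]  # orthogonal to `enrolled` (dot product is 0)

d_same = cosine_distance(enrolled, same_voice)    # very close to 0
d_other = cosine_distance(enrolled, other_voice)  # very close to 1
```

Because the measure depends only on direction, scaling an embedding does not change the result, which is why the first pair comes out at a distance of essentially zero.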

Installation:

pip install pyannote.audio
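After installation, the diarization pipeline and the embedding extractor still have to be loaded. The sketch below shows one way this might look. The model names "pyannote/speaker-diarization-3.1" and "pyannote/embedding", the helper name load_models, and the hf_token parameter are assumptions, since the post does not say which models it used. Both models are gated on Hugging Face, so an access token is needed, and the use_auth_token keyword shown here matches the 3.x releases; the keyword name has changed between releases, so check the documentation for the version you installed.

```python
def load_models(hf_token):
    """Load a diarization pipeline and an embedding extractor (hypothetical helper).

    hf_token: a Hugging Face access token for an account that has accepted
    the user conditions of both models.
    """
    # Imported inside the function because pyannote.audio pulls in torch,
    # which is slow to import.
    from pyannote.audio import Inference, Model, Pipeline

    # Assumed model names; the post does not state which ones it relies on.
    pipeline = Pipeline.from_pretrained(
        "pyannote/speaker-diarization-3.1", use_auth_token=hf_token
    )
    embedding_model = Model.from_pretrained(
        "pyannote/embedding", use_auth_token=hf_token
    )
    # window="whole" produces one embedding for the whole input (or crop)
    # instead of one embedding per sliding window.
    inference = Inference(embedding_model, window="whole")
    return pipeline, inference
```

The two returned objects correspond to the pipeline and inference names used in the listing that follows.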


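# -*- coding: utf-8 -*-

import torch
from pyannote.audio import Model, Pipeline, Inference
from pyannote.core import Segment
from scipy.spatial.distance import cosine


def extract_speaker_embedding(pipeline, audio_file, speaker_label):
    """Return the embedding of the first turn labelled speaker_label, or None if there is none."""
    # Note: `inference` is expected to be a module-level pyannote.audio Inference object.
    # Run speaker diarization over the whole file.
    diarization = pipeline(audio_file)
    speaker_embedding = None
    for turn, _, label in diarization.itertracks(yield_label=True):
        if label == speaker_label:
            # Embed only the audio inside this speaker turn.
            segment = Segment(turn.start, turn.end)
            speaker_embedding = inference.crop(audio_file, segment)
            break
    return speaker_embedding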
 
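# For a given audio file, extract voiceprint embeddings and compare them with those in the speaker database
def recognize_speaker(pipeline, audio_file):
    diarization = pipeline(audio_file)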
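
As a minimal sketch of the comparison step described in the scenario, the function below maps each diarization label to the closest known speaker. It is not the author's implementation: the function name recognize_speakers, its parameters, and the toy names "alice" and "bob" are all hypothetical. The pyannote-specific parts are passed in as callables so the matching logic can run without any audio or models, and the toy run uses squared Euclidean distance purely as a stand-in.

```python
def recognize_speakers(turns, embed, known, distance):
    """Map each diarization label to the name of the closest known speaker.

    turns    : iterable of (start, end, label) tuples
    embed    : callable (start, end) -> embedding vector
    known    : dict mapping speaker name -> enrolled embedding
    distance : callable (vec_a, vec_b) -> float, smaller means more similar
    """
    result = {}
    for start, end, label in turns:
        if label in result:
            continue  # like the listing above, use only the first turn per label
        embedding = embed(start, end)
        # Pick the known speaker whose embedding has the smallest distance.
        result[label] = min(known, key=lambda name: distance(known[name], embedding))
    return result


# Toy run with stand-in data (hypothetical values, no audio involved).
toy_turns = [(0.0, 2.0, "SPEAKER_00"), (2.0, 5.0, "SPEAKER_01"), (5.0, 6.0, "SPEAKER_00")]
toy_embeddings = {(0.0, 2.0): [1.0, 0.0], (2.0, 5.0): [0.0, 1.0]}
toy_known = {"alice": [0.9, 0.1], "bob": [0.2, 0.8]}


def squared_euclidean(a, b):
    """Stand-in distance for the toy run; real use would pass a cosine distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


assignment = recognize_speakers(
    toy_turns, lambda s, e: toy_embeddings[(s, e)], toy_known, squared_euclidean
)
```

With pyannote, turns could be built as [(t.start, t.end, label) for t, _, label in diarization.itertracks(yield_label=True)], embed as a small wrapper around inference.crop(audio_file, Segment(start, end)), and distance as scipy.spatial.distance.cosine. Since that scipy function expects 1-D vectors, the embeddings may need flattening (for example with numpy's ravel()) if the returned array has an extra leading dimension.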
