Bimodal cross-corpus speech emotion recognition

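  • Abstract (translated from Chinese): Bimodal cross-corpus speech emotion recognition (SER) has received little study, and cross-corpus emotion recognition tends to overlook emotion-discriminative features while over-reducing the differences between datasets. The YouTube dataset serves as the source data and the Interactive Emotional Dyadic Motion Capture database (IEMOCAP) as the target data. For both source and target data, the Opensmile toolbox extracts speech features, which are fed into a CNN and a bidirectional long short-term memory network (BLSTM) to extract higher-level features; the text modality is the transcript of the speech signal. First, Bidirectional Encoder Representations from Transformers (BERT) vectorizes the text and a BLSTM extracts text features; a modality-invariance loss is then designed to form a common representation space for the two modalities. To address cross-corpus SER, a common subspace of the source and target data is learned by jointly optimizing linear discriminant analysis (LDA), maximum mean discrepancy (MMD), graph embedding (GE), and label regression (LSR). To preserve emotion-discriminative features, an emotion-discriminative loss is combined with MMD+GE+LDA+LSR. An SVM classifier performs the final emotion classification on the transferred common subspace. Experimental results on IEMOCAP show that this method outperforms other advanced cross-corpus and bimodal SER methods.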

     

    Abstract: In speech emotion recognition (SER), a heterogeneity gap exists between different modalities, and most cross-corpus SER uses only the audio modality. Both issues were addressed simultaneously. YouTube datasets were selected as source data and the Interactive Emotional Dyadic Motion Capture database (IEMOCAP) as target data. The Opensmile toolbox was used to extract speech features from both source and target data, and the extracted features were input into a Convolutional Neural Network (CNN) and a bidirectional long short-term memory network (BLSTM) to extract higher-level speech features. The text modality was the transcript of the speech signals. First, a BLSTM was adopted to extract text features from text vectorized by Bidirectional Encoder Representations from Transformers (BERT); a modality-invariance loss was then designed to form a common representation space for the two modalities. To solve the cross-corpus SER problem, a common subspace of source and target data was learned by jointly optimizing Linear Discriminant Analysis (LDA), Maximum Mean Discrepancy (MMD), Graph Embedding (GE), and Label Smoothing Regularization (LSR). To preserve emotion-discriminative features, an emotion-aware center loss was combined with MMD+GE+LDA+LSR. An SVM classifier performed the final emotion classification on the transferred common subspace. The experimental results on IEMOCAP showed that this method outperformed other state-of-the-art cross-corpus and bimodal SER methods.
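
    The MMD term named in the abstract measures how far apart the source and target feature distributions are. The abstract does not specify the kernel or the estimator, so the following is only a minimal sketch under assumptions: an RBF kernel, the biased estimator of squared MMD, and hypothetical names (`rbf_kernel`, `mmd2`, `gamma`) that do not come from the paper.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """RBF kernel matrix between rows of a and rows of b (assumed kernel choice)."""
    # Squared Euclidean distance between every row of a and every row of b.
    sq = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq, 0.0))


def mmd2(source, target, gamma=0.05):
    """Biased estimate of squared MMD between two feature sets.

    source, target: arrays of shape (n_samples, n_features).
    gamma: RBF bandwidth parameter (hypothetical default, not from the paper).
    """
    k_ss = rbf_kernel(source, source, gamma).mean()
    k_tt = rbf_kernel(target, target, gamma).mean()
    k_st = rbf_kernel(source, target, gamma).mean()
    return k_ss + k_tt - 2.0 * k_st


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.normal(0.0, 1.0, size=(200, 8))        # stand-in for source features
    tgt_same = rng.normal(0.0, 1.0, size=(200, 8))   # same distribution
    tgt_shift = rng.normal(2.0, 1.0, size=(200, 8))  # shifted distribution
    print("same distribution:   ", mmd2(src, tgt_same))
    print("shifted distribution:", mmd2(src, tgt_shift))
```

    In a subspace-learning setting such as the one described, a quantity like this would be evaluated on the projected source and target features and minimized together with the other terms; how the paper weights or combines those terms is not stated in the abstract.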

     
