Uploaded 2018-01-03 (Hebei)

Boosting framework based multi-modal emotion recognition detection methods fusing vision and speech

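ZHANG Fen (张芬)
School of Information Science and Technology, Chengdu University of Technology (成都理工大学信息科学与技术学院)

Abstract: Emotion recognition is an important foundation of intelligent human-computer interaction. It draws on computer science, linguistics, psychology and other fields, and is an active research topic in pattern recognition and image processing. Accordingly, two effective methods for multi-modal emotion recognition that fuse vision and speech are proposed on the basis of the Boosting framework. The first method uses the coupled HMM (CHMM) as the model-level fusion technique for the audio and video streams, trains it with an improved expectation-maximization algorithm that concentrates on samples that are hard to recognize (that is, samples carrying more information), and applies the AdaBoost framework to the coupled HMM training process, yielding an overall AdaBoost-CHMM classifier. The second method builds a multi-layer Boosted HMM (MBHMM) classifier in which the data streams of three modalities (facial expression, shoulder movement and speech) are each assigned to one layer of the classifier; when the overall classifier of the current layer is trained, it focuses on the samples that the overall classifier of the previous layer found hard to recognize, so that the complementary nature of the features across modalities is fully exploited. Experimental results verify the effectiveness of both methods.

Keywords: emotion recognition; expression recognition; Boosting method; emotion database

About the author: ZHANG Fen (born 1975), female, master's degree, associate professor. Her research covers computer graphics and image processing, databases and related applications.

Received: 2017-05-09

Funding: supported by the Sichuan Province Software Engineering Excellent Engineer Quality Engineering Project (11100-14Z00327)

Boosting framework based multi-modal emotion recognition detection methods fusing vision and speech

ZHANG Fen
School of Information Science and Technology, Chengdu University of Technology

Abstract: As an important basis of intelligent human-computer interaction, emotion recognition technology relates to computer science, linguistics, psychology and other research fields, and is a research hotspot in the fields of pattern recognition and image processing. Based on the Boosting framework, two effective multi-modal emotion recognition methods fusing vision and speech are proposed. In the first method, the coupled hidden Markov model (coupled HMM) is taken as the model-layer fusion technology for the audio and video streams, and an improved expectation maximization algorithm is used to train it, with emphasis on learning the samples that are difficult to recognize; the AdaBoost framework is applied to the training process of the coupled HMM to obtain the overall AdaBoost-CHMM classifier. In the second method, the multi-layer Boosted HMM (MBHMM) classifier is constructed, and the data streams of the facial expression, shoulder movement and speech modalities are each applied to one layer of the classifier. During training, the current layer's overall classifier will focus on the sample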