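Real-Time Fixed-Point DSP Implementation of a Connected-Digit Speech Recognition System
ZHOU Yan, ZHANG You-chun, WANG Lei
(School of Information Engineering, China University of Geosciences, Wuhan 430074, Hubei, China)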
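Abstract: A speaker-independent connected-digit speech recognition system is implemented for a variety of connected-digit speech signals, based on the TMS320C5x Evaluation Module (EVM) and the fixed-point digital signal processor ADSP2181. Building on an analysis of the continuous-density hidden Markov model (CDHMM), the system uses LPC cepstral coefficients, LPC delta cepstral coefficients, the normalized energy coefficient, and its delta coefficient as the speech feature vector. Training and recognition use the Viterbi algorithm and the Baum-Welch re-estimation algorithm, and the recognition algorithm is implemented on the ADSP2181, which effectively improves the recognition rate of the system. The time required by each stage of the implementation is given, and the effect of different speech feature parameters on the recognition rate is compared. The implementation focuses on noise robustness, fixed-point real-time operation, and identifying a speaker from connected digit strings. Experimental results show that the system performs satisfactorily in an ordinary environment, with a correct recognition rate of 93.2%, providing an important technical route toward practical use.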
Key words: hidden Markov model; fixed-point real-time recognition; cepstral coefficients; delta cepstral coefficients
CLC number: TN912.3  Document code: A  Article ID: 1811-8755(2004)0685
Connected Digits Speech Recognition System Based on DSP Real-Time Implementation
ZHOU Yan, ZHANG You-chun, WANG Lei
(China University of Geosciences, Wuhan 430074, China)
Abstract: A speaker-independent speech recognition system for connected digits was developed based on the TMS320C5x Evaluation Module (EVM). On the basis of an analysis of the continuous density hidden Markov model (CDHMM), cepstrum coefficients, derivative coefficients of the cepstrum, the normalized log energy coefficient, and the derivative coefficient of log energy were used as the feature vector for speech recognition. The system uses the Viterbi and Baum-Welch re-estimation algorithms for training and recognition, which improves the recognition accuracy. The recognition algorithm was implemented on the ADSP2181 fixed-point DSP, the time required by each stage is reported, and recognition rates are compared for different speech feature parameters. In the implementation, noise robustness and real-time performance were given particular attention. Experimental results show that the recognition rate of this system reaches 93.2%, which provides an important technical route toward practical use.
Key words: hidden Markov model; speaker recognition; cepstrum coefficients; derivative coefficients of cepstrum
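Today's society is in the age of digital information: credit card numbers, telephone voice dialing, personal ID numbers, electronic passwords, and the like are all numeric in nature, so connected-digit speech recognition has become an extremely important task in speech recognition. On the one hand, automatic connected-digit speech recognition can recognize digit strings spoken by the user and offers the most natural, flexible, and economical human-machine interface, effectively solving the large-volume data entry problems encountered in military and civilian applications. On the other hand, with the growing spread of telephone networks, automatic connected-digit speech recognition can be used for telephone census-taking, remote stock trading, and remote authentication of various numbers.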
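Taking into account the response time, robustness, noise immunity, and training time of speech recognition, this paper implements an automatic fixed-point real-time recognition system for connected-digit speech.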