<p><b>Chinese text: 3276 characters</b></p>
<p><b>Appendix</b></p>
<p><b>English original:</b></p>
<p>Chinese Journal of Electronics</p>
<p>Vol. 15, No. 3, July 2006</p>
<p>A Speaker-Independent Continuous Speech Recognition System Using Biomimetic Pattern Recognition</p>
<p>WANG Shoujue and QIN Hong</p>
<p>(Laboratory of Artificial Neural Networks, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China)</p>
<p>Abstract: In speaker-independent speech recognition, the disadvantage of the most widespread technology (HMMs, or hidden Markov models) is not only the need for many more training samples but also the long training time. This paper describes the use of biomimetic pattern recognition (BPR) to recognize Mandarin continuous speech in a speaker-independent manner. A speech database was developed for the study. The vocabulary of the database consists of the names of 15 Chinese dishes, the length</p>
<p>Key words: Biomimetic pattern recognition, Speech recognition, Hidden Markov models (HMMs), Dynamic time warping (DTW).</p>
<p>I. Introduction</p>
<p>The main goal of automatic speech recognition (ASR) is to produce a system that accurately recognizes normal human speech from any speaker. A recognition system may be classified as speaker-dependent or speaker-independent. A speaker-dependent system must be trained individually on the speech of the person who will operate it in order to achieve a high recognition rate. For applications in public facilities, on the other hand, the system must be capable of</p>
<p>II. Introduction to Biomimetic Pattern Recognition and Multi-Weights Neuron Networks</p>
<p>1. Biomimetic pattern recognition</p>
<p>Traditional pattern recognition aims at the optimal classification of different classes of samples in the feature space. BPR, by contrast, seeks the optimal coverage of the samples of the same class. It rests on the Principle of Homology-Continuity: if two samples belong to the same class, the difference between them must change gradually, so a gradually changing sequence must exist between the two samples. In BPR theory, the construction of the sample subspace of</p>
<p>2. Multi-weights neuron and multi-weights neuron networks</p>
<p>A multi-weights neuron can be described as follows:</p>
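<p>The displayed equation is absent at this point. A form consistent with the quantities the text goes on to list (m weight vectors, an input vector X, a computation function, a threshold, and an activation function f) might read as follows. The symbol names Y, Φ, W_i and θ are assumptions made for illustration and are not taken from the source; the second line is likewise only a guess at the hypersurface that the next paragraph refers to.</p>

```latex
% Hedged reconstruction: Y, \Phi, W_i and \theta are assumed names.
% Neuron output: activation f applied to the computation function minus the threshold.
Y = f\bigl[\,\Phi(W_1, W_2, \ldots, W_m, X) - \theta\,\bigr]

% Assumed form of the (n-1)-dimensional hypersurface in the n-dimensional feature space:
\bigl\{\, X \in \mathbb{R}^{n} : \Phi(W_1, W_2, \ldots, W_m, X) = \theta \,\bigr\}
```

<p>Under this reading, a closed hypersurface encloses a finite region of the feature space, and a sample would be accepted by the neuron when it falls inside that region.</p>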
<p>In this description, the neuron has m weight vectors, X is the input vector, and the remaining elements are the neuron's computation function, the threshold, and the activation function f.</p>
<p>According to dimension theory, in the n-dimensional feature space the neuron's function constructs an (n-1)-dimensional hypersurface that is determined by the weights. This hypersurface divides the n-dimensional space into two parts. If it is a closed hypersurface, it encloses a finite subspace.</p>
<p>According to the principle of BPR, the subspace of a given class of samples is determined from the samples of that class alone. If we can find a set of multi-weights neurons (a multi-weights neuron network) that covers all the training samples, the subspace of the network represents the sample subspace. When an unknown sample falls inside the subspace, it can be judged to belong to the same class as the training samples. Moreover, if a new class of samples is added, it is not necessary to retrain any one</p>
<p>III. System Description</p>
<p>The speech recognition system is divided into two main blocks. The first is the signal pre-processing and speech feature extraction block. The other is the multi-weights neuron network, which performs the task of BPR.</p>
<p>1. Speech feature extraction</p>
<p>Mel-frequency cepstral coefficients (MFCC) are used as speech features. They are calculated as follows: A/D conversion; endpoint detection using short-time energy and zero-crossing rate (ZCR); pre-emphasis and Hamming windowing; fast Fourier transform; DCT. The number of features extracted for each frame is 16, and 32 frames are chosen for every utterance. A 512-dimensional Mel-cepstral feature vector (16 × 32 numerical values) represents the pronunciation of every word.</p>
<p>2. Multi-weights neuron networks architecture</p>
<p>As a new general-purpose theoretical model of pattern recognition, BPR is realized here by multi-weights neuron networks. To train a given class of samples, a multi-weights neuron subnetwork is established. The subnetwork consists of one input layer, one multi-weights neuron hidden layer, and one output layer. Such a subnetwork can be considered as a mapping from the input vector to the output, built from the outputs of the multi-weights neurons. There are m hidden multi-weights neurons, indexed i = 1, 2, …, m.</p>
<p>IV. Training for MWN Networks</p>
<p>1. Basics of MWN network training</p>
<p>Training one multi-weights neuron subnetwork requires calculating the weights of the multi-weights neuron layer. The multi-weights neuron and the training algorithm used are those of Ref. [4]. In this algorithm, if the number of training samples of each class is N, a corresponding number of neurons can be used. In this paper, N = 30. The neuron's computation function takes multiple vectors as input and produces one scalar output.</p>
<p>2. Optimization method</p>
<p>According to the comments in IV.1, if there are many training samples, the number of neurons will be very large, which reduces the recognition speed. When several classes of samples are learned, the class membership of the training samples is known. We use this information in a supervised training algorithm to reduce the network scale.</p>
<p>When training class A, we treated the remaining training samples of the other 14 classes as class B. So there are 30 training samples in set A and 420 training samples in set B. First, 3 samples are selected from A to form a neuron. Quantities are then defined for each sample of A (i = 1, 2, …, 30) and for each sample of B (j = 1, 2, …, 420), and a threshold value is specified. Samples of A that satisfy the resulting condition are removed from set A, which gives a new set. We continue until the number of samples left in the set reaches the stopping count; the training then ends, and the subnetwork of class A has a hidden layer made up of the neurons constructed along the way.</p>
<p>V. Experiment Results</p>
<p>A speech database consisting of the names of 15 Chinese dishes was developed for the study. Each name is 4 Chinese characters long; that is, each speech sample is a continuous string of 4 characters, such as "yu xiang rou si" and "gong bao ji ding". The database was organized into two sets: a training set and a test set. The speech signal is sampled at 16 kHz with 16-bit resolution.</p>
<p>Table 1. Experimental results for different values of the threshold parameter</p>
<p>The training set consists of 450 utterances used to train the multi-weights neuron networks. They come from 10 speakers (5 male and 5 female) from different Chinese provinces. Each speaker uttered each name 3 times. The test set has a total of 539 utterances from another 4 speakers, who uttered the 15 names arbitrarily.</p>
<p>The tests made to evaluate the recognition system were carried out with the threshold parameter ranging from 0.5 to 0.95 in steps of 0.05. The experimental results for the different values are shown in Table 1.</p>
<p>The networks achieved full recognition of the training set at every value. The experiments also showed that the optimized network achieved nearly the same recognition rate as the basic algorithm, while using far fewer MWNs than the basic algorithm.</p>
<p>Table 2. Experimental results of the BPR basic algorithm</p>
<p>Experiments were also carried out to evaluate continuous density hidden Markov models (CDHMM), dynamic time warping (DTW), and biomimetic pattern recognition (BPR) for speech recognition, emphasizing the performance of each method with decreasing amounts of training samples as well as the training time required. The CDHMM system was implemented with 5 states per word. The Viterbi algorithm and Baum-Welch re-estimation are used for training and recognition. The reference templates for the DTW system are the training samples themselves. Both the CDHMM and DTW techniques are implemented using the programs in Ref. [11]. Table 2 compares the experimental results of the BPR basic algorithm, dynamic time warping (DTW), and hidden Markov models (HMMs).</p>
<p>The HMM system was based on continuous density hidden Markov models (CDHMMs) and was implemented with 5 states per name.</p>
<p>VI. Conclusions and Acknowledgments</p>
<p>In this paper, a Mandarin continuous speech recognition system based on BPR is established. In addition, a training sample selection method is used to reduce the network scale. As a new general-purpose theoretical model of pattern recognition, BPR can also be used in speech recognition, and the experimental results show that it achieved higher performance than HMMs and DTW.</p>
<p>References</p>
<p>[1] Wang Shoujue, "Biomimetic (topological) pattern recognition: a new model of pattern recognition theory and its application", Acta Electronica Sinica (in Chinese), Vol. 30, No. 10, pp. 1417-1420, 2002.</p>
<p>[2] Wang Shoujue, Chen Xu, "Biomimetic (topological) pattern recognition: a new model of pattern recognition theory and its application", Neural Networks, 2003. Proceedings of the International Joint Conference on Neural Networks, Vol. 3, pp. 2258-2262, July 20-24, 2003.</p>
<p>[3] Wang Shoujue, Zhao Xingtao, "Biomimetic pattern recognition theory and its applications", Chinese Journal of Electronics, Vol. 13, No. 3, pp. 373-377, 2004.</p>
<p>[4] Xu Jian, Li Weijun et al., "Architecture research and hardware implementation on simplified neural computing system for face identification", Neural Networks, 2003. Proceedings of the International Joint Conference on Neural Networks, Vol. 2, pp. 948-952, July 20-24, 2003.</p>
<p>[5] Wang Zhihai, Mo Huayi et al., "A method of biomimetic pattern recognition for face recognition", Neural Networks, 2003. Proceedings of the International Joint Conference on Neural Networks, Vol. 3, pp. 2216-2221, July 20-24, 2003.</p>
<p>[6] Wang Shoujue, Wang Liyan et al., "A General Purpose Neuron Processor with Digital-Analog Processing", Chinese Journal of Electronics, Vol. 3, No. 4, pp. 73-75, 1994.</p>
<p>[7] Wang Shoujue, Li Zhaozhou et al., "Discussion on the basic mathematical models of neurons in general purpose neuro-computer", Acta Electronica Sinica (in Chinese), Vol. 29, No. 5, pp. 577-580, 2001.</p>
<p>[8] Wang Shoujue, Wang Bainan, "Analysis and theory of high-dimension space geometry of artificial neural networks", Acta Electronica Sinica (in Chinese), Vol. 30, No. 1, pp. 1-4, 2001.</p>
<p>[9] Wang Shoujue, Xu Jian et al., "Multi-camera human-face personal identification system based on the biomimetic pattern recognition", Acta Electronica Sinica (in Chinese), Vol. 31, No. 1, pp. 1-3, 2003.</p>
<p>[10] Ryszard Engelking, Dimension Theory, PWN-Polish Scientific Publishers, Warszawa, 1978.</p>
<p>[11] Qiang He, Ying He, Matlab Programming, Tsinghua University Press, 2002.</p>
<p><b>Chinese translation:</b></p>
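<p>Chinese Journal of Electronics, Vol. 15, No. 3, July 2006</p>
<p>A Speaker-Independent Continuous Speech Recognition System Based on Biomimetic Pattern Recognition</p>
<p><b>Wang Shoujue, Qin Hong</b></p>
<p>(Laboratory of Artificial Neural Networks, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China)</p>
<p>Abstract: In speaker-independent speech recognition, hidden Markov models (HMMs) are the most widely used technology, but they have two drawbacks: they need many more training samples, and training takes a long time. This paper describes the application of biomimetic pattern recognition (BPR) to small-vocabulary, speaker-independent recognition of Mandarin continuous speech. A speech database was built specifically for this study. Its vocabulary consists of the names of 15 Chinese dishes, each 4 Chinese characters long. Neural networks (NNs) based on the multi-weights neuron (MWN) model are used to train on and recognize the speech. We determined the number of MWNs at which the NN-based BPR achieves its best performance. The BPR-based system can recognize speech in real time; for speakers who come from different Chinese provinces but speak the same Chinese, the recognition rate reaches 98.14% for the first choice and 99.81% for the first two choices. Experiments were also carried out to evaluate the CDHMM, DTW, and BPR algorithms for speech recognition. The results show that BPR outperforms CDHMM and DTW, especially when the number of samples is limited.</p>
<p>Key words: Biomimetic pattern recognition, Speech recognition, Hidden Markov models (HMMs), Dynamic time warping (DTW)</p>
<p><b>1. Introduction</b></p>
<p>The main goal of automatic speech recognition is to build a system that accurately recognizes normal speech from any speaker. Recognition systems can be divided into speaker-dependent and speaker-independent systems. To achieve a high recognition rate, a speaker-dependent system must be trained individually on the person who will operate it by voice. For use in public facilities, on the other hand, the system must be able to recognize the voices of many people who differ in gender, age, accent, and so on. In the basic domain of public facilities, speaker-independent recognition has many more applications. In speaker-independent speech recognition, hidden Markov models (HMMs) are the most widely used technology, but they have two drawbacks: they need many more training samples, and training takes a long time. Since Wang Shoujue first proposed biomimetic pattern recognition (BPR), it has been applied to object recognition, face identification, face recognition, and other tasks, and has achieved better performance. With some modifications, this modeling technique can also be applied easily to speech recognition. In this paper, we present a real-time Mandarin speech recognition system based on BPR. BPR outperforms HMMs, especially when the number of samples is limited. This is a small-vocabulary, speaker-independent continuous speech recognition system. The whole system runs in a PC environment under Windows 98/2000/XP, using a high-precision double-weight synapse neural</p>
<p>2. Brief Introduction to Biomimetic Pattern Recognition (BPR) and Multi-Weights Neuron Networks (MWNN)</p>
<p>(1). Biomimetic pattern recognition (BPR)</p>
<p>Traditional pattern recognition aims at the optimal classification of samples of different classes in the feature space. BPR instead seeks an accurate coverage of the samples of each class. Its basis is the Principle of Homology-Continuity: for any two samples of the same class, the difference between their features must change gradually, so between the two samples there must exist infinitely many sample points whose features change gradually. In BPR theory, the construction of the sample subspace of each class depends only on that class itself. More specifically, constructing the sample subspace of a given class requires analyzing the relationship between the training samples of that class and the method used to cover objects with complex geometric shapes in a multidimensional space.</p>
<p>(2). Multi-weights neuron networks (MWNN)</p>
<p>A multi-weights neuron can be described by an expression in which there are m weight vectors, X is the input vector, and the remaining elements are the neuron's computation function, the threshold, and the activation function f.</p>
<p>According to dimension theory, in the n-dimensional feature space the neuron's function constructs an (n-1)-dimensional hypersurface determined by the weights. This hypersurface divides the n-dimensional space into two parts. If it is a closed hypersurface, it encloses a finite subspace.</p>
<p>According to the principle of BPR, the subspace of a given class of samples is built from that class itself. If we can find a set of multi-weights neurons (a multi-weights neuron network) that covers all the training samples, the subspace of the network represents the sample subspace. When an unknown sample falls inside the subspace, we can judge whether it belongs to the same class as the training samples. Furthermore, when a new class of samples is added, none of the classes that have already been trained needs to be retrained. The training of one class of samples is independent of the training of the others.</p>
<p><b>3. System Description</b></p>
<p>The speech recognition system can be divided into two modules. The first is the signal pre-processing and speech feature extraction module; the other is the multi-weights neuron network, which performs the BPR task.</p>
<p>(1). Speech feature extraction</p>
<p>Mel-frequency cepstral coefficients (MFCC) are used as speech features. They are calculated as follows:</p>
<p>A/D conversion; endpoint detection using short-time energy and zero-crossing rate; pre-emphasis and Hamming windowing; fast Fourier transform; DCT. 16 features are extracted for each frame, and 32 frames are chosen for every utterance. A 512-dimensional Mel-cepstral feature vector (16 × 32 numerical values) represents the pronunciation of each word.</p>
<p>(2). Multi-weights neuron network architecture</p>
<p>As a new general-purpose theoretical model of pattern recognition, BPR is realized here by multi-weights neuron networks.</p>
<p>To train a given class of samples, a multi-weights neuron subnetwork must be established. The subnetwork consists of 1 input layer, 1 multi-weights neuron hidden layer, and 1 output layer. Such a subnetwork can be described as a mapping from the input vector to the output, built from the outputs of the multi-weights neurons. There are m hidden multi-weights neurons, indexed i = 1, 2, …, m.</p>
<p>4. Training the Multi-Weights Neuron Networks</p>
<p>(1). Basics of multi-weights neuron network training</p>
<p>Training one multi-weights neuron subnetwork requires calculating the weights of the multi-weights neuron layer. The multi-weights neuron and the training algorithm used are described in Ref. [4]. In this algorithm, if the number of training samples of each class is N, a corresponding number of neurons can be used. In this paper, N = 30. The neuron's computation function takes multiple vectors as input and produces a scalar output.</p>
<p><b>(2). Optimization method</b></p>
<p>As stated in (1) above, if there are many training samples, the number of neurons becomes so large that the recognition speed drops. When several classes of samples are learned, the class membership of the training samples is known. We use this information in a supervised training algorithm to reduce the scale of the network.</p>
<p>When training class A, we treat the remaining samples of the other 14 classes as class B. Set A thus contains 30 training samples and set B contains 420 training samples. First, 3 samples are selected from A to obtain a neuron. Quantities are then defined for each sample of A (i = 1, 2, …, 30) and for each sample of B (j = 1, 2, …, 420), and a threshold value is assigned. Samples of A that satisfy the resulting condition are removed from set A, which gives a new set. The procedure continues until the number of samples left in the set reaches the stopping count; the training process then ends, and the subnetwork of class A has a hidden layer made up of the neurons constructed along the way.</p>
<p><b>5. Experiment Results</b></p>
<p>A speech database containing the names of 15 Chinese dishes was built specifically for this study. Each name is 4 Chinese characters long; that is, each speech sample is a continuous string of 4 characters, such as "鱼香肉丝" (yu xiang rou si) and "宫保鸡丁" (gong bao ji ding). The database is divided into two sets: a training set and a test set. The speech signal is sampled at 16 kHz with 16-bit resolution.</p>
<p>Table 1. Experimental results for different values of the threshold parameter</p>
<p>The training set consists of 450 utterances used to train the multi-weights neuron networks. These 450 utterances come from 10 speakers (5 male and 5 female) from different Chinese provinces. Each speaker repeated each name 3 times. The test set contains 539 utterances in total, from 4 other speakers who uttered the 15 names arbitrarily.</p>
<p>The recognition system was evaluated with the threshold parameter ranging from 0.5 to 0.95 in steps of 0.05. The experimental results for the different values are shown in Table 1. The network achieves full recognition of the training set at every value. The results also show that at a value of 0.5 the recognition rate is almost the same as that of the basic algorithm, while the number of multi-weights neurons used in the network is far smaller than in the basic algorithm.</p>
<p>Table 2. Experimental results of the BPR basic algorithm</p>
<p>We also evaluated continuous density hidden Markov models (CDHMM), dynamic time warping (DTW), and biomimetic pattern recognition (BPR) for speech recognition, focusing on the performance of each method as the number of training samples decreases and on the training time required. The CDHMM system was implemented with 5 states per word. The Viterbi algorithm and Baum-Welch re-estimation are used for training and recognition. The reference templates of the DTW system are the training samples themselves. Both the CDHMM and DTW techniques are implemented with the programs in Ref. [11]. Table 2 compares the experimental results of the BPR basic algorithm, DTW, and HMMs. The HMM system is based on continuous density hidden Markov models (CDHMMs) and is implemented with 5 states per name.</p>
<p><b>6. Conclusions and Acknowledgments</b></p>
<p>In this paper, we established a Mandarin continuous speech recognition system based on biomimetic pattern recognition (BPR). In addition, we used a training sample selection method to reduce the scale of the network. As a pattern</p>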