SUTRS record

MATERIAL TYPE: analytic level (component part), textual material, printed, 1.01 - original scientific article
COUNTRY OF PUBLICATION: Slovenia
YEAR OF PUBLICATION: 2002
LANGUAGE OF TEXT/ORIGINAL: English
SCRIPT: Latin
AUTHOR: Kotnik, Bojan - author
RESPONSIBILITY: Kačič, Zdravko - author // Horvat, Bogomir - author
TITLE: Predstavitev učinkovitega postopka za robustno avtomatsko razpoznavanje govora
IN PUBLICATION: Elektrotehniški vestnik. - ISSN 0013-5852. - No. 1, Vol. 69 (2002), pp. 69-74.
ABSTRACT: In this paper we present an effective procedure for robust speech recognition in noisy environments, which is a compromise between low computational complexity and effective reduction of noise with different characteristics at different signal-to-noise ratios. In the first step we introduce a new weighting function that reduces the noise level in the time domain. This is followed by an improved spectral subtraction procedure, which performs additional noise reduction in the frequency domain. The improved spectral subtraction is based on the concept of minimum statistics, which removes the need for an explicit detector of speech presence in the input signal that classical, basic spectral subtraction otherwise requires. The principle used does not introduce so-called "musical noise" into the resulting signal, which is otherwise an unwanted side effect of classical spectral subtraction. The final step is feature extraction from the speech signal. The effectiveness of the described noise removal procedures is confirmed by the presented automatic speech recognition results with the Slovenian, German and Spanish fixed telephone SpeechDat II databases. // Many automatic speech recognition systems that operate in a laboratory environment achieve high recognition rates. As speech recognition moves from the laboratory to the field, however, recognition scores drop significantly. Robust speech recognition refers to the problem of designing an automatic speech recogniser that works well in a wide range of unexpected or adverse environments. In this paper, we present an effective two-stage noise reduction procedure (see Section 1, Figure 1), which uses time and spectral domain processing and achieves a trade-off between effective noise reduction and low computational load for real-time operation. At the first stage, a novel weighting function is used to reduce the effect of additive noise on speech in the time domain.
This function, described in Section 2, is a compound of a short-time zero crossing value and a short-time power of the speech signal (see Equations 1-6). At the second stage, a spectral subtraction method based on minimum statistics is used. Two smoothed power spectra of the noisy speech signal are determined according to Equation 9. The power spectrum of the pure noise is estimated from the minima of the smoothed power spectrum of the noisy speech signal within a moving interval of fixed width (see Equation 10). Finally, the power spectrum of the uncorrupted speech is estimated (Equation 11). The described spectral subtraction based on minimum statistics also has the advantage that no explicit detection of non-speech segments is needed and no "musical noise" is added. The last step of the proposed algorithm is a Mel cepstrum feature extraction procedure. The feature vector consists of 12 mel cepstrum coefficients and the energy parameter. The Slovenian fixed telephone database (FDB) SpeechDat II, the German SpeechDat II FDB and the Spanish SpeechDat II FDB were used to evaluate the efficiency of the proposed noise-robust recognition method. The recognition system RefRec (Reference Recogniser) served as the training and recognition platform. RefRec is a set of scripts developed by the COST 249 SpeechDat task force. The scripts are built around the HTK toolkit. The recognition task was connected digits recognition. Three types of experiments with the SpeechDat databases were made (see Figures 3-5). The baseline recognition experiment was performed on the original utterances (no speech enhancement method was applied). These results serve as a reference. Then a spectral subtraction algorithm was used, and finally the novel weighting function was applied together with spectral subtraction. The comparisons of recognition results with the SpeechDat databases are presented in Figures 3-5 as well as in Table 1.
When the proposed method is applied, better recognition results can be achieved even with lower computational complexity. The best achieved word error rates (WER) are 3.26% for the Slovenian FDB, 1.02% for the German SpeechDat and 0.53% for the Spanish fixed speech database. Furthermore, the proposed method is appropriate for real-time processing and also for distributed speech recognition (DSR) systems.
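The second stage described in the abstract (smoothing the noisy power spectrum, taking its minimum over a moving interval as the noise estimate, then subtracting) can be illustrated with a minimal sketch. This is not the authors' implementation: Equations 9-11 are not reproduced in this record, so the single first-order smoother, the function names, and the parameters `alpha`, `window`, `oversub` and `spectral_floor` are all assumptions chosen for illustration.

```python
# Illustrative sketch of spectral subtraction with a minimum-statistics
# noise estimate. All names and parameter values are assumptions; the
# paper's Equations 9-11 are not available in this catalogue record.

def smooth_power(frames, alpha=0.9):
    """Recursively smooth per-frame power spectra (first-order smoother)."""
    smoothed = []
    prev = None
    for frame in frames:
        if prev is None:
            cur = list(frame)  # first frame has no history
        else:
            cur = [alpha * p + (1.0 - alpha) * x for p, x in zip(prev, frame)]
        smoothed.append(cur)
        prev = cur
    return smoothed


def noise_floor(smoothed, window=5):
    """Per-bin minimum of the smoothed spectrum over the last `window` frames."""
    floor = []
    for t in range(len(smoothed)):
        recent = smoothed[max(0, t - window + 1):t + 1]
        floor.append([min(column) for column in zip(*recent)])
    return floor


def subtract(frames, floor, oversub=1.0, spectral_floor=0.01):
    """Subtract the noise estimate, never going below a small fraction of the input."""
    cleaned = []
    for frame, noise in zip(frames, floor):
        cleaned.append(
            [max(x - oversub * n, spectral_floor * x) for x, n in zip(frame, noise)]
        )
    return cleaned


if __name__ == "__main__":
    # Toy data: two frequency bins, a louder (speech-like) third frame in bin 0.
    frames = [[4.0, 1.0], [4.0, 1.0], [10.0, 1.0], [4.0, 1.0]]
    floor = noise_floor(smooth_power(frames, alpha=0.5), window=3)
    print(subtract(frames, floor))
```

Because the noise estimate tracks the minimum of the smoothed spectrum, frames containing speech raise the estimate only slowly, which is why no separate speech/non-speech detector appears in the sketch.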
NOTES: Summary ; Abstract // Bibliography: p. 74
OTHER TITLES: An efficient algorithm for automatic robust speech recognition
SUBJECT HEADINGS: speech // recognition // processing
UDC: 007

Implementation, ownership and rights: NUK 2010