
A TWO-STAGE ALGORITHM FOR ENHANCEMENT OF REVERBERANT SPEECH(2)

Date: 2025-07-06   Source: unknown

Room reverberation causes two perceptual distortions on clean speech: coloration and long-term reverberation. These two effects correspond to two physical variables: signal-to-reverberant energy ratio (SRR) and reverberation time, respectively. Based on thi

determined by signal-to-reverberant energy ratio (SRR), which is the ratio between the energy traveling directly from a source to a listener and the energy of all acoustic reflections reaching the listener, and in turn, it is determined by talker-to-microphone distance. Shorter talker-to-microphone distance results in higher SRR and less spectral deviation, hence, less coloration.

Consequently, we propose a two-stage model to deal with two types of degradation, coloration and long-term reverberation, in a reverberant environment. In the first stage, our model estimates an inverse filter that reduces coloration effects by increasing SRR. The second stage employs spectral subtraction to minimize the influence of long-term reverberation.

3. INVERSE FILTERING

In the first stage of our algorithm, we derive an inverse filter to reduce reverberation effects. This stage is adapted from a multi-microphone inverse filtering algorithm proposed by Gillespie et al. [8]. An FIR inverse filter of the room impulse response is estimated by maximizing the kurtosis of the linear prediction (LP) residual of speech, using a block frequency-domain adaptive filter. Inverse-filtered speech is then obtained by convolving the inverse filter with the reverberant speech.
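The quantity being maximized can be illustrated with a small sketch. It only computes the objective, the kurtosis of an LP residual, and does not implement the block frequency-domain adaptive filter itself. The function names, the LP order of 10, the autocorrelation method, and the use of excess kurtosis (fourth moment over squared variance, minus 3) are assumptions made for illustration, not details taken from the text.

```python
import numpy as np

def lp_residual(x, order=10):
    """LP residual via the autocorrelation method (order is an assumed value)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], lightly regularized for stability.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # Prediction from the previous `order` samples; residual is the error.
    pred = np.zeros(n)
    for k in range(1, order + 1):
        pred[k:] += a[k - 1] * x[:n - k]
    return x - pred

def kurtosis(e):
    """Excess kurtosis: 0 for Gaussian data, larger for peakier data."""
    e = np.asarray(e, dtype=float)
    e = e - e.mean()
    return np.mean(e ** 4) / np.mean(e ** 2) ** 2 - 3.0
```

An adaptive inverse filter of the kind described would be updated so that `kurtosis(lp_residual(filtered_speech))` increases, on the premise that the LP residual of clean speech is peakier than that of reverberant speech.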

A typical result from the first stage of our algorithm is shown in Fig. 1. Fig. 1(a) illustrates a room impulse response function (T60 = 0.3 s) generated by the image model of Allen and Berkley [1]. The equalized impulse response, which is the room impulse response in Fig. 1(a) convolved with the obtained inverse filter, is shown in Fig. 1(b). As can be seen, the equalized impulse response is far more impulse-like than the room impulse response. In fact, the SRR of the room impulse response is -9.8 dB, compared with 2.4 dB for the equalized impulse response.
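As a rough illustration of how an SRR value can be read off an impulse response, the sketch below splits the response into a direct part and everything else. Treating the largest peak as the direct path and using a window of 1 ms on each side of it are assumptions of this sketch. The text does not say how the direct component was isolated, so the function is not expected to reproduce the reported values exactly.

```python
import numpy as np

def srr_db(h, fs, direct_ms=1.0):
    """Signal-to-reverberant energy ratio of an impulse response, in dB.

    Assumption: the direct path is the largest-magnitude tap, and the
    direct energy is collected within +/- direct_ms of that tap.
    """
    h = np.asarray(h, dtype=float)
    peak = int(np.argmax(np.abs(h)))
    half = int(round(direct_ms * 1e-3 * fs))
    lo, hi = max(0, peak - half), min(len(h), peak + half + 1)
    direct = np.sum(h[lo:hi] ** 2)
    reflections = np.sum(h ** 2) - direct  # all remaining energy
    return 10.0 * np.log10(direct / reflections)
```

Applying the same function to a room impulse response and to its equalized counterpart gives a before-and-after comparison of the kind quoted above.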

However, the above inverse filtering method does not improve on the tail part of reverberation. Fig. 1(c) and (d) show the energy decay curves of the room impulse response and the equalized impulse response, respectively. As can be seen, except for the first 50 ms, the energy decay patterns are almost identical, and thus the estimated reverberation times are almost the same, around 0.3 s. While the coloration distortion is reduced due to the increase of SRR, the degradation due to reverberation tails is not alleviated. In other words, the effect of inverse filtering is similar to that of moving the sound source closer to the receiver. In the next section, we introduce the second stage of our algorithm to reduce the effects of long-term reverberation.
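The energy decay curves and reverberation times discussed here can be sketched with Schroeder backward integration. The function names and the `start` parameter are illustrative. Reading the reverberation time as the first crossing of -60 dB follows the figure caption, while the handling of the onset index is an assumption.

```python
import numpy as np

def schroeder_edc_db(h):
    """Energy decay curve in dB via Schroeder backward integration."""
    h = np.asarray(h, dtype=float)
    # Energy remaining from each sample to the end of the response.
    energy = np.cumsum(h[::-1] ** 2)[::-1]
    energy = np.maximum(energy, np.finfo(float).tiny)  # avoid log10(0)
    return 10.0 * np.log10(energy / energy[0])

def t60_crossing(edc_db, fs, start=0):
    """Seconds from sample `start` until the curve first reaches -60 dB.

    Returns None if the curve never decays that far.
    """
    below = np.where(edc_db[start:] <= -60.0)[0]
    return below[0] / fs if below.size else None
```

Computing both curves, one from the room impulse response and one from the equalized response, and comparing them beyond the first 50 ms is one way to check the observation that inverse filtering leaves the reverberation tail largely unchanged.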

4. SPECTRAL SUBTRACTION

Late reflections in a room impulse response function smear the speech spectrum and degrade speech intelligibility and quality. Likewise, an equalized impulse response can be decomposed into two parts: early and late impulses. Resembling the effects of the late reflections in a room impulse response, the late impulses have deleterious effects on the quality of inverse-filtered speech; by estimating the effects of the late impulses and subtracting them, we can expect to enhance the speech quality.

In a previous version of this algorithm, Wu and Wang [15] proposed a one-stage method to enhance reverberant speech by estimating and subtracting the effects of late reflections.

[Figure 1: panels (a)-(d); axis label: Time (ms)]

Fig. 1. (a) A room impulse response function generated by the image model in an office-size room. (b) The equalized impulse response derived from the reverberant speech generated by the room impulse response in (a), as the result of the first stage of our algorithm. (c) Energy decay curve computed from the room impulse response function in (a). (d) Energy decay curve computed from the equalized impulse response in (b). Each curve is calculated using the Schroeder integration method. The horizontal dotted line represents the -60 dB energy decay level. The left dashed lines indicate the starting times of the impulse responses, and the right dashed lines the times at which the decay curves cross -60 dB.

The smearing effects of late impulses lead to the smoothing of the signal spectrum in the time domain. Therefore, we assume that the power spectrum of late-impulse components is a smoothed and shifted version of the power spectrum of the inverse-filtered speech z(t):

$$|S_l(k; i)|^2 = \gamma\, w(i - \rho) * |S_z(k; i)|^2, \qquad (1)$$

where $|S_z(k; i)|^2$ and $|S_l(k; i)|^2$ are, respectively, the short-term power spectra of the inverse-filtered speech and the late-impulse components. Indexes k and i refer to frequency bin and time frame, respectively. The symbol $*$ denotes convolution in the time domain and w(i) is a smoothing function. The short-term speech spectrum is obtained by using Hamming windows of length 16 ms with 8 ms overlap for short-term Fourier analysis.
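Eq. (1) and the framing just described can be sketched as follows. The first function computes the short-term power spectrum with 16 ms Hamming windows and an 8 ms hop. The second applies the smoothing and shift along the frame index separately in each frequency bin. The shape of w(i) and the values of the scaling factor and shift are not given in this part of the text, so `w`, `gamma`, and `rho` are plain inputs here, and treating w(i) as zero for i < 0 is an assumption of this sketch.

```python
import numpy as np

def power_spectrogram(z, fs, win_ms=16.0, hop_ms=8.0):
    """Short-term power spectrum |S_z(k; i)|^2, shape (bins, frames)."""
    z = np.asarray(z, dtype=float)
    n_win = int(round(win_ms * 1e-3 * fs))
    n_hop = int(round(hop_ms * 1e-3 * fs))
    window = np.hamming(n_win)
    n_frames = 1 + (len(z) - n_win) // n_hop
    frames = np.stack([z[i * n_hop:i * n_hop + n_win] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T ** 2

def late_impulse_power(Sz2, w, gamma, rho):
    """Estimate |S_l(k; i)|^2 = gamma * w(i - rho) * |S_z(k; i)|^2.

    The convolution runs over the frame index i, one bin k at a time.
    Assumption: w holds w(0), w(1), ... and w(i) = 0 for i < 0.
    """
    Sz2 = np.asarray(Sz2, dtype=float)
    n_bins, n_frames = Sz2.shape
    # Delay the smoothing function by rho frames.
    kernel = np.concatenate([np.zeros(rho), np.asarray(w, dtype=float)])
    Sl2 = np.zeros((n_bins, n_frames))
    for k in range(n_bins):
        Sl2[k] = gamma * np.convolve(Sz2[k], kernel)[:n_frames]
    return Sl2
```

With this layout, energy in frame i of the inverse-filtered speech contributes to the late-impulse estimate only from frame i + rho onward, which matches the description of a smoothed and shifted spectrum.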
