Accession Number : ADA623029


Title : Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment


Descriptive Note : Final technical rept. Apr 2011-Apr 2015


Corporate Author : TEXAS UNIV AT DALLAS RICHARDSON


Personal Author(s) : Hansen, John H


Full Text : http://www.dtic.mil/get-tr-doc/pdf?AD=ADA623029


Report Date : Oct 2015


Pagination or Media Count : 134


Abstract : This study has focused on five complementary research tasks in the domain of audio, speech, language, and speaker recognition and processing. In the area of speaker recognition/identification (SID), advancements have been realized to address acoustic mismatch due to speaker overlap, language mismatch, channel/microphone/additive noise, speaker style (spoken vs. singing), speaker state (physical task stress), distant speech, and environment (room reverberation). In language ID (LID), advancements have been shown for improved out-of-set language rejection, as well as integrated spectral and prosody-based LID solutions. For co-channel and diarization, new algorithms based on gammatone subband frequency modulation were developed. In diarization, robust speech activity detection based on a combination (Combo-SAD) feature stream was developed. New keyword spotting technology using phonological features, as well as audio stream assessment for peak clipping and speaker height estimation, were also developed. All algorithms were evaluated on various speech corpora from AFRL, CRSS-UTDallas, and publicly available sources.


Descriptors : *SPEECH RECOGNITION , ALGORITHMS , EXTRACTION , IDENTIFICATION , LANGUAGE , LEARNING MACHINES , MICROPHONES , NOISE(SOUND) , SPEECH ANALYSIS , STRESS(PHYSIOLOGY)


Subject Categories : Voice Communications


Distribution Statement : APPROVED FOR PUBLIC RELEASE