Conventional single- and multi-channel speech enhancement methods aim at improving
the signal-to-noise ratio (SNR) of signals captured through distant microphones; they
do not specifically target improvements in automatic speech recognition (ASR) performance.
We investigate a nonlinear multiple regression to extract robust features for ASR. The idea
is to approximate the log spectra of a close-talking microphone by effectively combining the
log spectra of distant microphones. The devised system turns out to be a generalized log spectral
subtraction framework for robust speech recognition. We demonstrate the effectiveness of the
proposed approach through extensive evaluations on single- and multi-channel isolated
word recognition experiments conducted in 15 real car-driving environments.
Keywords: microphone array, in-car speech recognition, neural network, K-means, beamforming