Trevor Burton Ph.D. Thesis Audio Files

The following recorded microphone and resulting error signals are given here to demonstrate the perceived effectiveness of the echo cancellation structures from my thesis entitled "Efficient Subband Structures for Acoustic Echo Cancellation in Nonstationary and Nonlinear Environments".

Audio Files From Chapter 4: Subband Structure for Acoustic Echo Cancellation in Nonstationary Environments

The results presented below are based on measured microphone and loudspeaker data from a conference phone operating in hands-free mode with a far-end speech excitation signal under quiet local talker conditions. During the recording, echo path fluctuations were introduced by continual hand waving in front of the phone set at a rate of approximately 1 Hz.

The subband affine projection (SB-AP) structure and the proposed subband gradient proportionate affine projection (SB-GPAP) structure are based on oversampled filter banks with 8 subbands and a downsampling factor of 4. The subband adaptive filters were set to a length of 500 taps in all subbands for both structures, with step sizes of 1.0. The remaining settings in all subbands were as follows: ε = 0, γ = 1, β = 0.5, η = 0.9999, and ξ = 0.00001 for the SB-GPAP structure, along with P = 1 and δ = 0.00001 for both structures.
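Assuming P denotes the projection order, an affine projection update with P = 1 reduces to the familiar NLMS update, so the SB-AP setting above can be illustrated with a single-band NLMS loop. The following is a minimal sketch only, not the thesis implementation: it is real-valued and fullband (no filter bank), and the function name `nlms`, the toy four-tap echo path, and the white-noise reference signal are hypothetical choices made for illustration. The step size of 1.0 and δ = 0.00001 mirror the settings above, with δ assumed to act as the regularization term in the normalization.

```python
import random


def nlms(x, d, num_taps, mu=1.0, delta=1e-5):
    """Adapt an FIR filter so that it predicts d from x; return (weights, errors)."""
    w = [0.0] * num_taps      # adaptive filter coefficients
    buf = [0.0] * num_taps    # most recent reference samples, newest first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]                        # shift in the new reference sample
        y = sum(wi * bi for wi, bi in zip(w, buf))   # echo estimate
        e = dn - y                                   # error (echo-cancelled) signal
        norm = sum(b * b for b in buf) + delta       # regularized input energy
        g = mu * e / norm
        w = [wi + g * bi for wi, bi in zip(w, buf)]  # normalized gradient step
        errors.append(e)
    return w, errors


# Toy experiment (hypothetical data): noise-free echo through a short FIR path.
random.seed(0)
echo_path = [0.5, -0.3, 0.2, 0.1]
x = [random.gauss(0.0, 1.0) for _ in range(2000)]
d = [
    sum(h * x[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(x))
]
w, e = nlms(x, d, num_taps=len(echo_path))
```

In this noise-free toy case the coefficients in `w` approach `echo_path` and the error decays toward zero. In a subband structure, a loop of this kind would run independently on the decimated signal of each subband, with 500 taps per subband rather than the 4 used here.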

The proposed SB-GPAP structure provides improved AEC performance compared to the SB-AP structure in both cases. With the GP-AP algorithm applied to only the initial 50 subband filter taps, to exploit the time domain nature of echo path changes, and the AP algorithm applied to the remainder of the filter taps, the SB-GPAP structure achieves enhanced AEC performance compared to the SB-AP structure while requiring only a minor increase in computational complexity.

Audio Files From Chapter 5: Subband Volterra Structure for Acoustic Echo Cancellation

The results presented below are based on measured microphone and loudspeaker data from a smartphone operating in hands-free mode with a high volume far-end speech excitation signal under quiet local talker conditions.

The subband NLMS (SB-NLMS) structure and the proposed subband second-order NLMS Volterra filter (SB-NLMSVF) structure are based on oversampled filter banks with 8 subbands and a downsampling factor of 4. The linear subband adaptive filters were set to a length of 500 taps in all subbands for both structures, with step sizes of 0.5. The quadratic subband adaptive filters of the SB-NLMSVF structure were set to a length of 120 in all subbands, with step sizes of 0.5. The remaining setting was δ = 0.00001 for both structures in all subbands.
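To make the idea of a second-order NLMS Volterra filter concrete, the sketch below augments the linear regressor with the unique products x[n-i]·x[n-j] and adapts both kernels from the same error. It is an illustration under stated assumptions rather than the structure used in the thesis: it is real-valued and fullband, it normalizes both kernels by the energy of the stacked regressor (one of several possible normalizations), and the names `volterra_regressor` and `nlms_volterra`, the tiny kernel sizes, and the toy system are all hypothetical. The step sizes of 0.5 and δ = 0.00001 follow the settings above.

```python
import random


def volterra_regressor(buf, quad_memory):
    """Stack the linear samples with the unique products buf[i] * buf[j], i <= j."""
    quad = [buf[i] * buf[j] for i in range(quad_memory) for j in range(i, quad_memory)]
    return buf + quad


def nlms_volterra(x, d, lin_taps, quad_memory, mu_lin=0.5, mu_quad=0.5, delta=1e-5):
    """Second-order Volterra NLMS; requires quad_memory <= lin_taps."""
    n_quad = quad_memory * (quad_memory + 1) // 2
    w = [0.0] * (lin_taps + n_quad)                # linear kernel, then quadratic kernel
    mu = [mu_lin] * lin_taps + [mu_quad] * n_quad  # per-coefficient step sizes
    buf = [0.0] * lin_taps                         # reference samples, newest first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        u = volterra_regressor(buf, quad_memory)
        e = dn - sum(wi * ui for wi, ui in zip(w, u))
        norm = sum(ui * ui for ui in u) + delta    # joint normalization (assumption)
        w = [wi + m * e * ui / norm for wi, m, ui in zip(w, mu, u)]
        errors.append(e)
    return w, errors


# Toy system (hypothetical): short linear path plus a weak quadratic distortion.
random.seed(1)
h_lin = [0.6, -0.2, 0.1, 0.05]
h_quad = [0.1, -0.05, 0.02]   # weights for products (0,0), (0,1), (1,1)
x = [random.gauss(0.0, 1.0) for _ in range(5000)]
d, hist = [], [0.0] * len(h_lin)
for xn in x:
    hist = [xn] + hist[:-1]
    u = volterra_regressor(hist, 2)
    d.append(sum(h * ui for h, ui in zip(h_lin + h_quad, u)))

w, e = nlms_volterra(x, d, lin_taps=4, quad_memory=2)
```

Because the number of quadratic coefficients grows much faster with memory than the number of linear taps, restricting the quadratic kernel to a short memory, or to only a few subbands, keeps the added cost small, which is the trade-off this chapter's structure is built around.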

Error signal spectrogram and corresponding error signal for the SB-NLMSVF AEC structure with second-order Volterra filtering in only the second and third subbands.

The proposed SB-NLMSVF structure provides improved AEC performance compared to the SB-NLMS structure in both cases. With quadratic Volterra filtering applied in only the 1-3 kHz subbands to exploit the frequency domain nature of the loudspeaker nonlinearity, the SB-NLMSVF structure achieves enhanced AEC performance compared to the SB-NLMS structure while requiring only a slight increase in computational complexity.

Audio Files From Chapter 6: Subband Volterra Structure for Acoustic Echo Cancellation in Nonstationary Environments

The results presented below are based on measured microphone and loudspeaker data from a conference phone operating in hands-free mode with a far-end speech excitation signal under quiet local talker conditions. During the recording, echo path fluctuations were introduced by continual hand waving in front of the phone set at a rate of approximately 1 Hz, along with changing the volume between medium and high levels at the same rate.

The subband NLMS (SB-NLMS) structure and the proposed subband second-order gradient proportionate NLMS Volterra filter (SB-GPNLMSVF) structure are based on oversampled filter banks with 8 subbands and a downsampling factor of 4. The linear subband adaptive filters were set to a length of 500 taps in all subbands for both structures, with step sizes of 0.5. The remaining settings for the linear and quadratic adaptive filters in all subbands were as follows: ε = 0, γ = 1, β = 0.5, η = 0.9999, and ξ = 0.00001 for the SB-GPNLMSVF structure, along with δ = 0.00001 for both structures.

Error signal spectrogram and corresponding error signal for the SB-GPNLMSVF AEC structure with GP-NLMS based linear filters in subbands 2-4 and a GP-NLMS based quadratic filter in only the first subband. The NLMS algorithm was used by the linear filter in the first subband.

The proposed SB-GPNLMSVF structure provides improved AEC performance compared to the SB-NLMS structure in both cases. With a GP-NLMS based quadratic adaptive Volterra filter in only the 0-1 kHz subband to exploit the frequency domain nature of the loudspeaker nonlinearity, and GP-NLMS based linear adaptive filters in only the 1-4 kHz subbands to exploit the frequency domain nature of linear echo path changes, the SB-GPNLMSVF structure attains much improved AEC performance compared to the SB-NLMS structure while requiring only a modest increase in computational complexity.

Experimentally Measured Hands-Free System Data

The following are the recorded reference and microphone signals from Chapters 4, 5, and 6 for speech excitation signals.

Chapter 6
It should be noted that only results for the smartphone appear in the thesis document.