Most modern speech compression (coding) algorithms, such as
those used in cell phones, deliver narrowband (about 3200 Hz), near-toll-quality
telephone speech. This limits the naturalness and intelligibility of the
speech signal. Although wideband coding technology is available for most
networks, the cost of renewing the entire infrastructure to support it is
prohibitive. Methods that increase the bandwidth of the signal while using
existing infrastructure are therefore very desirable.
Researchers at Arizona State University have proposed a
novel bandwidth extension method that significantly enhances the quality and
intelligibility of speech while operating with existing infrastructure. The
method uses new psychoacoustic concepts to determine and “fill in” the
perceptually relevant high-band content. It is optimization-based: a perceptual
model allocates bits only to specific frames in the high-bandwidth region, and
the remaining high-band content is predicted. This combination of a prediction
model with selectively encoded information makes the method unique and efficient.
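The frame-selective idea can be illustrated with a minimal sketch. It is not the researchers' algorithm: the per-frame error in dB stands in for the psychoacoustic model, and the names `allocate_side_info`, `threshold_db`, and `bits_per_frame`, along with their default values, are hypothetical choices made only for illustration.

```python
def allocate_side_info(actual_db, predicted_db, threshold_db=3.0, bits_per_frame=8):
    """Return the number of side-information bits assigned to each frame.

    actual_db    -- high-band energy per frame measured at the encoder (dB)
    predicted_db -- high-band energy per frame predicted from the narrowband signal (dB)
    threshold_db -- hypothetical stand-in for a perceptual criterion
    bits_per_frame -- hypothetical bit budget for a frame that needs correction
    """
    plan = []
    for actual, predicted in zip(actual_db, predicted_db):
        error = abs(actual - predicted)
        # Spend bits only where the prediction alone is judged inadequate;
        # elsewhere the decoder relies on the predicted high band.
        plan.append(bits_per_frame if error > threshold_db else 0)
    return plan


def average_bits(plan):
    """Average number of side-information bits per frame."""
    return sum(plan) / len(plan)


# Four illustrative frames (made-up values, not experimental data).
actual = [-30.0, -12.0, -28.0, -10.0]
predicted = [-29.0, -20.0, -27.5, -11.0]

plan = allocate_side_info(actual, predicted)
print(plan)                # [0, 8, 0, 0]
print(average_bits(plan))  # 2.0
```

In this toy example only the second frame is predicted poorly, so it is the only one that receives bits, and the average side-information rate stays well below what encoding every frame would cost.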
Experimental results show that the system operates at
a lower average bit rate than comparable methods while
maintaining a high-quality, highly intelligible signal. The method can
essentially be used to “retrofit” existing narrowband algorithms that work with
existing infrastructure. Additionally, the method does not compromise
the quality of the audio signal.
Potential Applications
- Cellular phones
- Voice-over-IP and Internet telephony
- Teleconferencing
- Hearing aids
- Entertainment applications such as mp3 players
- Defense communications
Benefits and Advantages
- Significant reduction in cost as a result of
implementation and use with existing infrastructure
- Improves the naturalness and intelligibility of speech in
voice communications
- Does not affect the quality of the audio
signal