The G.718 encoder accepts wideband signals sampled at 16 kHz, or narrowband signals sampled at either 16 or 8 kHz. Similarly, the decoder output can be 16 kHz wideband, or 16 or 8 kHz narrowband. The encoder detects input signals that are sampled at 16 kHz but whose bandwidth is limited to narrowband. The codec output can cover a bandwidth of 50 Hz to 4 kHz at 8 and 12 kbit/s, and 50 Hz to 7 kHz from 8 to 32 kbit/s.
The high-quality codec core represents a significant advance in quality over previously available codecs. At 8 kbit/s, its wideband mode provides clean speech quality equivalent to G.722.2 at 12.65 kbit/s, while its narrowband mode provides clean speech quality equivalent to G.729 Annex E at 11.8 kbit/s.
The codec operates on 20 ms frames and has a maximum algorithmic delay of 42.875 ms for wideband input and wideband output signals. The maximum algorithmic delay for narrowband input and narrowband output signals is 43.875 ms. The codec may also be employed in a low-delay mode when the encoder and decoder maximum bit rates are set to 12 kbit/s. In this case the maximum algorithmic delay is reduced by 10 ms.
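The frame and delay figures above can be turned into simple arithmetic. The following is a minimal sketch that uses only numbers quoted in this article; the function and constant names are illustrative assumptions and are not part of any G.718 reference software.

```python
# Frame and delay arithmetic based only on figures quoted in the text.
# All names here are hypothetical helpers, not a G.718 API.

FRAME_MS = 20  # frame length in milliseconds
MAX_DELAY_MS = {"wideband": 42.875, "narrowband": 43.875}
LOW_DELAY_REDUCTION_MS = 10.0  # applies when the maximum bit rate is 12 kbit/s


def samples_per_frame(sample_rate_hz: int) -> int:
    """Number of input samples in one 20 ms frame."""
    return sample_rate_hz * FRAME_MS // 1000


def bits_per_frame(bit_rate_bps: int) -> int:
    """Number of bits available for one 20 ms frame at a given bit rate."""
    return bit_rate_bps * FRAME_MS // 1000


def max_algorithmic_delay_ms(band: str, low_delay: bool = False) -> float:
    """Maximum algorithmic delay for matching input and output bandwidths."""
    delay = MAX_DELAY_MS[band]
    return delay - LOW_DELAY_REDUCTION_MS if low_delay else delay


if __name__ == "__main__":
    print(samples_per_frame(16000))                       # 320
    print(samples_per_frame(8000))                        # 160
    print(bits_per_frame(8000), bits_per_frame(32000))    # 160 640
    print(max_algorithmic_delay_ms("wideband", True))     # 32.875
    print(max_algorithmic_delay_ms("narrowband", True))   # 33.875
```

Bit rates are passed in bit/s as integers so that a rate such as 12.65 kbit/s (12650 bit/s) gives an exact bit count of 253 per frame without floating-point rounding.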
The codec also incorporates an alternate coding mode, with a minimum bit rate of 12.65 kbit/s, which is bitstream interoperable with the ITU-T Recommendation G.722.2, 3GPP AMR-WB and 3GPP2 VMR-WB mobile wideband speech coding standards. This option replaces Layer 1 and Layer 2; Layers 3-5 are similar to those of the default option, except that Layer 3 uses fewer bits to compensate for the extra bits of the 12.65 kbit/s core. The decoder can also decode all other G.722.2 operating modes. G.718 further includes discontinuous transmission (DTX) and comfort noise generation (CNG) algorithms that enable bandwidth savings during inactive periods. An integrated noise reduction algorithm can be used, provided that the communication session is limited to 12 kbit/s.
The underlying algorithm is based on a two-stage coding structure. The lower two layers use Code-Excited Linear Prediction (CELP) coding of the 50-6400 Hz band, where the core layer takes advantage of signal classification to select an optimized coding mode for each frame. The higher layers encode the weighted error signal from the lower layers using overlap-add modified discrete cosine transform (MDCT) coding. Several technologies are used to encode the MDCT coefficients to maximize performance for both speech and music.
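To illustrate the overlap-add MDCT principle used by the higher layers, the following is a minimal textbook sketch, not the G.718 implementation. The block size, the sine window, and all function names are assumptions made for illustration; the actual transform length, window shape, and coefficient quantization in G.718 are not described here. The sketch shows that 50%-overlapped, windowed blocks of 2N samples yield only N coefficients each, and that overlap-adding the inverse transforms cancels the time-domain aliasing.

```python
import math


def sine_window(n_half):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    size = 2 * n_half
    return [math.sin(math.pi * (n + 0.5) / size) for n in range(size)]


def mdct(block):
    """Direct O(N^2) MDCT: 2N input samples -> N coefficients."""
    n_half = len(block) // 2
    return [
        sum(block[n] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(2 * n_half))
        for k in range(n_half)
    ]


def imdct(coeffs):
    """Inverse MDCT: N coefficients -> 2N time-aliased samples.
    The 2/N scaling suits a window applied at both analysis and synthesis."""
    n_half = len(coeffs)
    return [
        (2.0 / n_half) * sum(
            coeffs[k] * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for k in range(n_half))
        for n in range(2 * n_half)
    ]


def mdct_roundtrip(signal, n_half):
    """Analyse and resynthesise a signal with 50%-overlapped MDCT blocks."""
    window = sine_window(n_half)
    # Zero-pad one hop on each side so every real sample lies in two blocks.
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    if len(padded) % n_half:
        padded += [0.0] * (n_half - len(padded) % n_half)
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_half + 1, n_half):
        block = [padded[start + n] * window[n] for n in range(2 * n_half)]
        recon = imdct(mdct(block))  # a real codec would quantize the coefficients here
        for n in range(2 * n_half):
            out[start + n] += recon[n] * window[n]  # windowed overlap-add
    return out[n_half:n_half + len(signal)]


if __name__ == "__main__":
    x = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(40)]
    y = mdct_roundtrip(x, n_half=8)
    print(max(abs(a - b) for a, b in zip(x, y)) < 1e-9)
```

Without quantization the round trip is lossless up to floating-point error. In an actual codec the coefficients are quantized between the forward and inverse transforms, and the smooth overlap between blocks helps to avoid audible discontinuities at block boundaries.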
G.718 was developed in ITU-T Study Group 16 as part of an open consortium of nine organizations: Motorola, Nokia, Ericsson, Texas Instruments, VoiceAge Corporation, Panasonic, Huawei, France Telecom, and Qualcomm.