Chapter 3 Perceptual Noise Substitution in MPEG-4 Advanced Audio Coding

3.2 The Implementation of PNS in MPEG-4 AAC

In the MPEG-4 standard [33], the PNS tool implements perceptual noise substitution coding within an individual channel stream (ICS). Certain sets of spectral coefficients are derived from random vectors rather than from Huffman-coded symbols and an inverse quantization process. This is done selectively on a scalefactor band and group basis when perceptual noise substitution is flagged as active. Figure 8 shows the block diagram of the MPEG-4 general audio non-scalable encoder.

Figure 8: Block diagram of general audio non-scalable encoder [33]

3.2.1 PNS Decoding Process

Symbol Definitions:

hcod_sf[]: Huffman codeword from the Huffman code table used for coding of scalefactors (see [12]).

dpcm_noise_nrg[][]: Differentially encoded noise energy.

noise_nrg[g][sfb]: Noise energy for group g and scalefactor band sfb.

spec[]: Array containing the channel spectrum of the respective channel.

ms_used[g][sfb]: One-bit flag per scalefactor band indicating that M/S coding is being used in group g and scalefactor band sfb.

The use of the perceptual noise substitution tool is signaled by the pseudo Huffman codebook NOISE_HCB (13). Furthermore, if the same scalefactor band and group is coded by perceptual noise substitution in both channels of a channel pair, the correlation of the noise signal can be controlled by means of the ms_used field. While the default noise generation process works independently for each channel (separate generation of random vectors), the same random vector is used for both channels if the ms_used flag is set for a particular scalefactor band and group. In this case, no M/S stereo coding is carried out, because M/S stereo coding and noise substitution coding are mutually exclusive.

The energy information for perceptual noise substitution decoding is represented by a “noise energy” value indicating the overall power of the substituted spectral coefficients in units of 1.5 dB. If noise substitution coding is active for a particular group and scalefactor band, a noise energy value is transmitted instead of the scalefactor of the respective channel. Noise energies are coded just like scalefactors, i.e. by Huffman coding of differential values:

• The start value of the DPCM decoding is given by global_gain.

• Differential decoding is done separately between scalefactors, intensity stereo positions and noise energies. In other words, the noise energy decoder ignores interposed scalefactors and intensity stereo position values.

• The same codebook is used for coding of noise energies as for scalefactors.
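As a quick check of the 1.5 dB unit, the following minimal sketch converts a noise energy index into the amplitude factor 2^(0.25·noise_nrg) that the decoding pseudo code later applies, and expresses one index step in dB. The helper names noise_gain and gain_db are illustrative and do not come from the standard.

```python
import math

def noise_gain(noise_nrg: int) -> float:
    """Amplitude factor for a noise energy index (hypothetical helper)."""
    return 2.0 ** (0.25 * noise_nrg)

def gain_db(noise_nrg: int) -> float:
    """The same factor expressed in dB (20*log10 of an amplitude ratio)."""
    return 20.0 * math.log10(noise_gain(noise_nrg))

# One index step changes the level by about 1.5 dB;
# four steps double the amplitude (about 6 dB).
step_db = gain_db(1) - gain_db(0)
print(round(step_db, 3))  # 1.505
```

Because the step is a quarter of an amplitude doubling, the index behaves exactly like an AAC scalefactor, which is why the same Huffman codebook can be reused.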

One pseudo function is defined for use in perceptual noise substitution decoding:

function is_noise(group, sfb) {
    1 for window groups / scalefactor bands with codebook sfb_cb[group][sfb] == NOISE_HCB
    0 otherwise
}

The constant NOISE_OFFSET is used to adapt the range of average noise energy values to the usual range of scalefactors and has a value of 90.

The function gen_rand_vector(addr, size) generates a vector of length <size> with signed random values of average energy MEAN_NRG per random value. A suitable random number generator can be realized using one multiplication/accumulation per random value.
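One possible realization of such a generator is a linear congruential generator, which needs one multiplication and one addition per value. The sketch below is only an illustration: the multiplier and increment are a commonly used parameter set chosen here as an assumption, the explicit state argument replaces the addr parameter of the text, and MEAN_NRG is set to the theoretical mean square of a uniformly distributed signed 32-bit value, which is likewise an assumption about the generator.

```python
MASK32 = 0xFFFFFFFF
# Assumed: mean square of a value uniform over the signed 32-bit range, (2^31)^2 / 3.
MEAN_NRG = (2.0 ** 62) / 3.0

def gen_rand_vector(size, state):
    """Return (vector, new_state) with `size` signed 32-bit pseudo-random values.

    Linear congruential generator: one multiply and one add per value.
    The constants are an assumption, not values taken from the standard.
    """
    vec = []
    for _ in range(size):
        state = (1664525 * state + 1013904223) & MASK32
        # Reinterpret the unsigned 32-bit state as a signed value.
        vec.append(state - (1 << 32) if state & 0x80000000 else state)
    return vec, state

vec, _ = gen_rand_vector(20000, state=1)
measured = sum(v * v for v in vec) / len(vec)
print(measured / MEAN_NRG)  # near 1 if the values are roughly uniform
```

Returning the new state lets the caller keep one generator state per channel, which matches the default of independent noise in each channel.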

The noise substitution decoding process for one channel is defined by the following pseudo code:

nrg = global_gain - NOISE_OFFSET - 256;
for (g=0; g<num_window_groups; g++) {
    /* Decode noise energies for this group */
    for (sfb=0; sfb<max_sfb; sfb++)
        if (is_noise(g,sfb))
            noise_nrg[g][sfb] = nrg += dpcm_noise_nrg[g][sfb];

    /* Do perceptual noise substitution decoding */
    for (b=0; b<window_group_length[g]; b++) {
        for (sfb=0; sfb<max_sfb; sfb++) {
            if (is_noise(g,sfb)) {
                offs = swb_offset[sfb];
                size = swb_offset[sfb+1] - offs;
                /* Generate random vector */
                gen_rand_vector(&spec[g][b][sfb][0], size);
                scale = 1/(size * sqrt(MEAN_NRG));
                scale *= 2.0^(0.25*noise_nrg[g][sfb]);
                /* Scale random vector to desired target energy */
                for (i=0; i<size; i++)
                    spec[g][b][sfb][i] *= scale;
            }
        }
    }
}
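The pseudo code can be exercised with a small runnable sketch. The version below follows the same two steps (energy decoding, then substitution) but makes several simplifying assumptions that are not part of the standard: each window is a flat list of coefficients indexed through swb_offset, the random values come from Python's random module and are uniform on [-1, 1], and MEAN_NRG is the mean square of that distribution. The function name pns_decode is hypothetical.

```python
import math
import random

NOISE_HCB = 13
NOISE_OFFSET = 90
MEAN_NRG = 1.0 / 3.0  # assumed: mean square of a value uniform on [-1, 1]

def pns_decode(spec, sfb_cb, dpcm_noise_nrg, swb_offset, global_gain, rng):
    """Fill noise-coded bands of spec[g][b] (flat coefficient lists) in place.

    Returns the decoded noise_nrg table; bands without PNS keep None.
    """
    nrg = global_gain - NOISE_OFFSET - 256
    noise_nrg = [[None] * len(row) for row in sfb_cb]
    for g, group in enumerate(spec):
        # Decode the differentially coded noise energies of this group.
        for sfb, cb in enumerate(sfb_cb[g]):
            if cb == NOISE_HCB:
                nrg += dpcm_noise_nrg[g][sfb]
                noise_nrg[g][sfb] = nrg
        # Substitute scaled random values in every window of the group.
        for window in group:
            for sfb, cb in enumerate(sfb_cb[g]):
                if cb != NOISE_HCB:
                    continue
                offs = swb_offset[sfb]
                size = swb_offset[sfb + 1] - offs
                scale = 2.0 ** (0.25 * noise_nrg[g][sfb]) / (size * math.sqrt(MEAN_NRG))
                for i in range(offs, offs + size):
                    window[i] = rng.uniform(-1.0, 1.0) * scale
    return noise_nrg

# One group, one window, two bands of four coefficients; only band 1 uses PNS.
spec = [[[1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]]]
table = pns_decode(spec, [[1, NOISE_HCB]], [[0, 2]], [0, 4, 8],
                   global_gain=350, rng=random.Random(0))
print(table)  # [[None, 6]]
```

With global_gain = 350 the running value starts at 350 - 90 - 256 = 4, and the single differential value of 2 yields a noise energy of 6 for the second band, while the first band is left untouched.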

3.2.2 Integration with Intra Channel Prediction Tools

For scalefactor bands coded using PNS, the corresponding predictors are switched to “off”, thus overriding the status specified by the prediction_used mask. In addition, for scalefactor bands coded by perceptual noise substitution, the predictors belonging to the corresponding spectral coefficients are reset [12]. The update of these predictors is done by feeding a value of zero as the “last quantized value” $x_{rec}(n-1)$. In Long Term Prediction (LTP), the scalefactor bands coded using PNS are not predicted.
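The override of the prediction_used mask and the zero fed to the predictor update can be illustrated with two small helpers. Both function names are hypothetical, and the sketch only models the per-band decisions, not the predictor itself.

```python
NOISE_HCB = 13

def effective_prediction_used(prediction_used, sfb_cb):
    """Per-band predictor state after the PNS override (hypothetical helper).

    prediction_used: list of booleans taken from the bitstream mask.
    sfb_cb: Huffman codebook number of each scalefactor band.
    """
    return [used and cb != NOISE_HCB for used, cb in zip(prediction_used, sfb_cb)]

def predictor_input(quantized_value, cb):
    """Value fed to the predictor update as the last quantized value."""
    return 0.0 if cb == NOISE_HCB else quantized_value

print(effective_prediction_used([True, True, False], [1, NOISE_HCB, NOISE_HCB]))
# prints [True, False, False]
```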

3.2.3 Integration with other AAC Tools

The following interactions between the perceptual noise substitution tool and other AAC tools take place:

• During Huffman decoding of the quantized spectral coefficients, the Huffman codebook table NOISE_HCB is treated exactly like the zero codebook ZERO_HCB, i.e. no Huffman codewords are read for the corresponding scalefactor band and group.

• If the same scalefactor band and group is coded by perceptual noise substitution in both channels of a channel pair, no M/S stereo decoding is carried out for this scalefactor band and group.

• The pseudo noise components generated by the perceptual noise substitution tool are injected into the output spectrum prior to the temporal noise shaping (TNS) process step.
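The channel-pair case can be sketched as follows: when both channels code a band with PNS and ms_used is set, the flag selects a shared random vector instead of triggering M/S decoding. This is a minimal illustration under assumptions; pns_noise_pair is a hypothetical name, the vectors are left unscaled, and Python's random module stands in for the decoder's generator.

```python
import random

NOISE_HCB = 13

def pns_noise_pair(cb_left, cb_right, ms_used, size, rng_left, rng_right):
    """Return (left, right) unscaled noise vectors for one band of a channel pair.

    A channel whose band is not noise-coded gets None. When both channels are
    noise-coded and ms_used is set, the right channel reuses the left vector,
    giving fully correlated noise; no M/S decoding is applied in that case.
    """
    left = right = None
    if cb_left == NOISE_HCB:
        left = [rng_left.uniform(-1.0, 1.0) for _ in range(size)]
    if cb_right == NOISE_HCB:
        if left is not None and ms_used:
            right = list(left)  # same random vector in both channels
        else:
            right = [rng_right.uniform(-1.0, 1.0) for _ in range(size)]
    return left, right

corr_l, corr_r = pns_noise_pair(13, 13, True, 4, random.Random(1), random.Random(2))
ind_l, ind_r = pns_noise_pair(13, 13, False, 4, random.Random(1), random.Random(2))
print(corr_l == corr_r, ind_l == ind_r)  # True False
```

Each channel would then scale its vector with its own noise energy, so the two channels can share the noise waveform while still differing in level.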

3.2.4 Integration into a Scalable AAC-based Coder

The following rules apply for usage of perceptual noise substitution tool in a scalable AAC-based coder:

• If a particular scalefactor band and group is coded by perceptual noise substitution, its contribution to the spectral components of the reconstructed output signal for the update of the intra channel predictor is omitted.

• If a particular scalefactor band and group is coded by perceptual noise substitution, its contribution to the spectral components of the output signal is omitted if spectral coefficients are transmitted for this scalefactor band and group in any of the higher layers by means of a non-zero codebook number.

• If a particular scalefactor band and group is coded by perceptual noise substitution in both channels of a channel pair, the higher layers may still use the M/S stereo flag ms_used to signal the use of M/S stereo decoding.
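The second rule amounts to a simple per-band decision, sketched below. The function name keep_pns_noise is hypothetical, and the sketch reads "non-zero codebook number" literally, so any higher-layer codebook other than ZERO_HCB suppresses the noise; that reading is an assumption.

```python
ZERO_HCB = 0
NOISE_HCB = 13

def keep_pns_noise(cb, higher_layer_cbs):
    """Decide whether a band's PNS noise stays in the output signal.

    cb: codebook number of the band in the layer that uses PNS.
    higher_layer_cbs: codebook numbers of the same band and group in all
    higher layers (empty if there are none).
    """
    if cb != NOISE_HCB:
        return False
    # Assumed literal reading: any non-zero codebook in a higher layer wins.
    return all(h == ZERO_HCB for h in higher_layer_cbs)

print(keep_pns_noise(NOISE_HCB, [ZERO_HCB, ZERO_HCB]))  # True
print(keep_pns_noise(NOISE_HCB, [ZERO_HCB, 5]))         # False
```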

3.2.5 Suggested Encoding Procedure for PNS

In MPEG-4 AAC, the encoding procedure for perceptual noise substitution is similar to the coding procedure for intensity stereo and is performed as follows:

• For each scalefactor band containing spectral coefficients above a lower border frequency (e.g. 4 kHz), a noise detection procedure is carried out. The scalefactor band is classified as noise-like if the corresponding signal is neither tonal nor contains strong changes in energy over time. The tonality of the signal can be estimated by using the tonality values calculated in the psychoacoustic model. Similarly, changes in signal energy can be evaluated using the FFT energies calculated in the psychoacoustic model.

• From the detection procedure, a map, noise_flag[sfb], is constructed such that noise-like scalefactor bands are flagged with a non-zero value.

• For each flagged scalefactor band, the energy of the corresponding spectral coefficients is calculated and mapped to a logarithmic representation with a resolution of 1.5 dB. An offset (NOISE_OFFSET = 90) is added to the logarithmic noise energy values.

• For each flagged scalefactor band, the corresponding spectral coefficients are set to zero before quantization of the coefficients is carried out as usual.

• During the noiseless coding procedure, the pseudo codebook NOISE_HCB is set for all flagged scalefactor bands. Apart from this, the regular section / noiseless coding procedure is carried out on the quantized coefficient data.

• The logarithmic noise energy values are coded analogously to the regular scalefactors, i.e. with a differential encoding scheme starting with the global_gain value. They are transmitted in place of the scalefactors belonging to the flagged scalefactor bands.
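The detection and energy mapping steps above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure of the standard: the tonality and energy-change thresholds are hypothetical tuning values, the inputs are assumed to be supplied by the psychoacoustic model, and the band energy is taken as the sum of squared coefficients. Since 1.5 dB corresponds to half an energy doubling, the logarithmic index is computed as 2·log2(energy).

```python
import math

NOISE_OFFSET = 90

def classify_noise_bands(band_freq_hz, tonality, energy_change_db,
                         min_freq_hz=4000.0, max_tonality=0.3, max_change_db=3.0):
    """Build noise_flag[sfb]; the two thresholds are hypothetical tuning values."""
    return [int(f >= min_freq_hz and t < max_tonality and abs(c) < max_change_db)
            for f, t, c in zip(band_freq_hz, tonality, energy_change_db)]

def noise_energy_index(coeffs):
    """Band energy on a 1.5 dB grid plus NOISE_OFFSET (value before DPCM coding).

    Assumes the band has non-zero energy.
    """
    energy = sum(c * c for c in coeffs)
    # 1.5 dB is half an energy doubling, so the index is 2*log2(energy).
    return int(round(2.0 * math.log2(energy))) + NOISE_OFFSET

flags = classify_noise_bands([2000.0, 6000.0, 9000.0], [0.1, 0.8, 0.1], [0.5, 0.5, 1.0])
print(flags)                                     # [0, 0, 1]
print(noise_energy_index([4.0, 4.0, 4.0, 4.0]))  # 102
```

In the example the first band lies below the 4 kHz border and the second is too tonal, so only the third band is flagged. A band energy of 64 gives 2·log2(64) = 12, and adding the offset of 90 gives 102.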
