Chapter 4 Enhanced BFOS Bit Allocation Algorithm
4.3 Generalized BFOS Bit Allocation Algorithm for AAC
The generalized BFOS algorithm is an efficient bit allocation algorithm for subband coding.
For analysis and comparison, this section proposes an approach to integrating the generalized BFOS bit allocation algorithm into AAC, based on the concepts described in [10] and [11]. The bit allocation procedure of the generalized BFOS scheme for AAC is similar to that of the EBFOS scheme (see Fig. 4.1). Each step of the generalized BFOS scheme for AAC is elaborated below.
1. Initialization. As in the initialization step of Section 4.2.1, we set the reference NMR to 1 (0 dB) for all bands. We then determine the sref,i value and calculate the reference total coding bits for each band, Bref,i, based on the adopted reference NMR value, NMRref,i = 1, ∀i.
2. Local Maximum NGPB/BPNL analysis. Unlike in the EBFOS scheme, the local maximum NGPB and BPNL of the ith SFB in the BFOS scheme are determined by (4.10) and (4.11), respectively.
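NGPB_i = \max_{s_{new,i}} \left\{ \frac{NMR_{ref,i} - NMR_{new,i}}{B_{new,i} - B_{ref,i}} \right\}   (4.10)
BPNL_i = \max_{s_{new,i}} \left\{ \frac{B_{ref,i} - B_{new,i}}{NMR_{new,i} - NMR_{ref,i}} \right\}   (4.11)
Here the maximum is taken over the ni candidate values of snew,i examined for the ith SFB. Bnew,i and NMRnew,i are the new values of the total coding bits and the distortion for the ith SFB, respectively, when the SF value of the ith SFB is changed from sref,i to snew,i. The local optimal SF value of the ith SFB, sopt,i, is the SF value associated with the local maximum NGPB or BPNL.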
3. Global Maximum NGPB/BPNL analysis. As in step 3 of Section 4.2.1, we first find NGPBG (or BPNLG) for a frame by (4.7) (or (4.8)) and determine sfbG. We then change the SF value of the sfbG-th SFB only, setting it to the local optimal SF value of that SFB.
4. Update NMRref,i (as well as sref,i) and Bref,i of the sfbG-th SFB. Go to step 2 if the bit budget constraint is not met.
In the generalized BFOS bit allocation scheme, we also adopt the trellis-based optimization algorithm for the HCB decision. Unlike in the EBFOS scheme, however, we perform the local maximum NGPB/BPNL analysis only for the sfbG-th SFB.
As described in [10], the generalized BFOS bit allocation scheme can be performed with or without the convexity assumption. With the convexity assumption, ni in (4.10) (or (4.11)) is equal to 1. Without the convexity assumption, ni is approximately 14 on average, according to the statistics of the coded data.
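The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the simulations: each band is represented by a hypothetical table `rd[s] = (bits, nmr)` indexed by SF value, a larger index is assumed to mean coarser quantization, only the BPNL direction (bit count above the budget) is handled, and the HCB decision is ignored. The names `local_max_bpnl` and `bfos_allocate` are introduced here for illustration only.

```python
from typing import List, Optional, Tuple

# Hypothetical rate-distortion table for one band (an assumption of this sketch):
# band[s] = (bits, nmr) for SF index s; a larger s means coarser quantization.
Band = List[Tuple[int, float]]


def local_max_bpnl(band: Band, s_ref: int, n: int) -> Optional[Tuple[float, int]]:
    """Local maximum BPNL over the next n coarser SF candidates.

    Returns (bpnl, s_opt), or None if no candidate trades bits for distortion.
    """
    b_ref, nmr_ref = band[s_ref]
    best = None
    last = min(s_ref + n, len(band) - 1)
    for s_new in range(s_ref + 1, last + 1):
        b_new, nmr_new = band[s_new]
        if b_new >= b_ref or nmr_new <= nmr_ref:
            continue  # not a bits-for-distortion trade; skipped in this sketch
        bpnl = (b_ref - b_new) / (nmr_new - nmr_ref)
        if best is None or bpnl > best[0]:
            best = (bpnl, s_new)
    return best


def bfos_allocate(bands: List[Band], s_init: List[int], budget: int, n: int):
    """Greedy allocation: repeatedly move the band with the global maximum BPNL."""
    s = list(s_init)
    total = sum(band[s[i]][0] for i, band in enumerate(bands))
    # Step 2: local analysis for every band (done once for all bands).
    local = [local_max_bpnl(band, s[i], n) for i, band in enumerate(bands)]
    while total > budget:
        candidates = [(loc[0], i) for i, loc in enumerate(local) if loc is not None]
        if not candidates:
            break  # no band can save more bits
        # Step 3: global maximum BPNL selects the band to change.
        _, g = max(candidates)
        total -= bands[g][s[g]][0]
        s[g] = local[g][1]
        total += bands[g][s[g]][0]
        # Step 4: only the changed band needs a new local analysis.
        local[g] = local_max_bpnl(bands[g], s[g], n)
    return s, total


# Toy data (invented for illustration; not measured AAC data).
bands = [
    [(10, 1.0), (8, 1.5), (4, 2.0), (3, 4.0)],
    [(10, 1.0), (9, 1.2), (7, 3.0), (2, 3.5)],
]
with_convexity = bfos_allocate(bands, [0, 0], budget=14, n=1)
without_convexity = bfos_allocate(bands, [0, 0], budget=14, n=3)
```

On this toy data, `n=1` walks one SF step at a time, whereas `n=3` can jump directly to a candidate with a larger bits-per-NMR-loss ratio, which is the practical difference between running the scheme with and without the convexity assumption.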
4.4 Simulation Results
In this section, we evaluate the computational complexity and the coded audio quality in our experiments. Four types of bit allocation algorithms are simulated and compared as described below using the MPEG-4 AAC Verification Model (VM) as the test platform.
(1) The TLS algorithm in MPEG-4 AAC VM (VM-TLS).
(2) The BFOS algorithm for AAC with convexity assumption, BFOS-C, and without convexity assumption, BFOS-NC, which are described in Section 4.3.
(3) The trellis-based algorithm aiming at minimizing average NMR, JTB-ANMR, and aiming at minimizing maximum NMR, JTB-MNMR.
(4) The EBFOS scheme and its fast version, which are described in Section 4.2.
To focus only on the bit allocation performance, none of the optional tools in AAC, such as TNS and M/S stereo coding, is used in our simulations. Ten two-channel audio sequences with a sampling rate of 44.1 kHz are tested. Two of them are extracted from MPEG SQAM [6], and the rest are from EBU [24].
4.4.1 Complexity Analysis
The complexity analysis of the aforementioned bit allocation algorithms is summarized in Table 4.2. The "Computation" column gives the average number of NGPB (or BPNL) calculations per frame. These values are derived from statistics collected in the simulations on the audio sequences. For ease of comparison, the BFOS-NC scheme is chosen as the reference (ratio = 1), and all the other schemes are rated relative to it.
Table 4.2: Complexity analysis of the EBFOS scheme and the generalized BFOS scheme
Scheme        Computation   Ratio
BFOS-C                119    0.27
BFOS-NC               444    1
Fast EBFOS           1145    2.58
EBFOS               11848   26.68
The experimental data indicate that the computation of the fast EBFOS scheme is approximately 2.6 times that of the BFOS-NC scheme. Moreover, the fast EBFOS scheme is approximately 10 times faster than the EBFOS scheme.
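The ratios quoted above follow directly from the "Computation" column of Table 4.2. The short check below recomputes them; the dictionary and function names are introduced here for illustration only.

```python
# Average number of NGPB/BPNL calculations per frame, taken from Table 4.2.
computations = {
    "BFOS-C": 119,
    "BFOS-NC": 444,
    "Fast EBFOS": 1145,
    "EBFOS": 11848,
}


def complexity_ratios(counts, reference="BFOS-NC"):
    """Rate every scheme relative to the reference scheme (ratio = 1)."""
    ref = counts[reference]
    return {name: round(count / ref, 2) for name, count in counts.items()}


ratios = complexity_ratios(computations)

# Speed-up of the fast EBFOS scheme over the full EBFOS scheme.
speedup = computations["EBFOS"] / computations["Fast EBFOS"]
```

The speed-up evaluates to roughly 10.3, which is the basis for the "approximately 10 times faster" statement.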
4.4.2 Objective Quality
The rate-distortion curves of the aforementioned bit allocation schemes are shown in Fig. 4.2 and Fig. 4.3. Two common objective quality measures, average NMR (ANMR) and maximum NMR (MNMR), are adopted in the objective performance comparison.
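As a rough illustration of the two measures, the sketch below computes them from a list of per-band NMR values. It assumes that the NMR values are given as linear ratios (so that NMR = 1 corresponds to 0 dB, as in Section 4.3) and that averaging is done in the dB domain; the exact averaging convention is an assumption of this sketch, and the function names are hypothetical.

```python
import math


def nmr_to_db(nmr_linear: float) -> float:
    """Convert a linear noise-to-mask ratio to dB (NMR = 1 maps to 0 dB)."""
    return 10.0 * math.log10(nmr_linear)


def anmr_mnmr_db(nmr_per_band):
    """Return (ANMR, MNMR) in dB for a list of per-band linear NMR values.

    Assumption: ANMR is the mean of the per-band dB values and MNMR is
    their maximum.
    """
    values_db = [nmr_to_db(x) for x in nmr_per_band]
    return sum(values_db) / len(values_db), max(values_db)


# Toy values for illustration only (not measured data).
anmr, mnmr = anmr_mnmr_db([0.1, 1.0, 10.0])
```

With these toy values the band at NMR = 10 dominates MNMR, while ANMR averages it against the cleaner bands, which is why the two measures can rank schemes differently.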
The research in [11] shows that the BFOS-C scheme is a near-optimal bit allocation scheme for MPEG-1 Layer I/Layer II audio coding, but the simulation results show that the BFOS-C scheme becomes less efficient for AAC. The performance of the BFOS-NC scheme is much better than that of the BFOS-C scheme, which means that the convexity assumption is not suitable for AAC. Nevertheless, both the ANMR and MNMR performances of the BFOS-NC scheme are approximately 1 dB worse than those of the JTB-ANMR scheme.
The performance of the EBFOS scheme is clearly much better than that of VM-TLS and better than that of the BFOS-NC scheme. In the ANMR plot (Fig. 4.2), the performance of the EBFOS scheme is slightly worse than that of JTB-ANMR, but the two are very close. It is somewhat better than that of the JTB-MNMR scheme, since the latter is not optimized for the ANMR criterion. In the MNMR plot (Fig. 4.3), the EBFOS scheme is somewhat worse than JTB-MNMR but slightly better than the JTB-ANMR scheme. As stated earlier, the EBFOS scheme aims at reducing the overall NMR, which largely amounts to minimizing ANMR. As for the fast version, adopting the fast algorithm for EBFOS causes almost no loss of performance (less than 0.06 dB).
4.4.3 Subjective Quality
Informal listening tests on the aforementioned schemes show that it is hard to tell the difference between the JTB-ANMR and EBFOS schemes. In addition, a "simulated" subjective measure, the Objective Difference Grade (ODG), is used in the audio quality evaluation.
The ODG results of the aforementioned bit allocation schemes are shown in Fig. 4.4, in which the reference signal is the original audio sequence. Interestingly, JTB-ANMR is the best algorithm as judged by ODG. According to the collected test data (Fig. 4.4), the EBFOS scheme is better than the BFOS-NC and BFOS-C schemes. Moreover, the difference between the EBFOS and JTB-ANMR schemes is rather small.
Fig. 4.2: ANMR rate-distortion comparison for various bit allocation schemes
Fig. 4.3: MNMR rate-distortion comparison for various bit allocation schemes
Fig. 4.4: ODG performance of various bit allocation schemes