

5 CONCLUSIONS AND FUTURE WORK

A comprehensive study has been conducted, via simulations and experiments, to explore various audio processing approaches for automotive virtual surround audio systems. Ten processing methods have been presented. Two methods based on up/downmixing algorithms, the UDWD method and the DWD method, are intended to improve the spaciousness and to balance the front and rear reproduction. These two methods are practical approaches in terms of computational complexity and audio performance. A reverberation-based upmixing algorithm is used to extend two-channel inputs to four-channel signals, and a standard downmixing algorithm is employed to convert 5.1-channel inputs to two channels.

The eight inverse filtering-based approaches are divided into two groups: the HRTF-based model and the point-receiver-based model. Four HRTF-based inverse filtering methods are exploited to correct the car responses and then render a spatial listening environment.
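The standard downmixing step mentioned above can be sketched as follows. This is a minimal illustration that assumes the commonly used ITU-R BS.775-style coefficients, in which the center and surround channels are attenuated by 3 dB before being added to the left and right channels. The function name `downmix_51_to_stereo` and its parameters are hypothetical, the LFE channel is omitted, and the weighting and delay stage of the DWD method is not included.

```python
import math

# Assumed coefficient: 3 dB attenuation (1/sqrt(2)) for the center and
# surround channels, as in ITU-R BS.775-style downmixing.
ATT_3DB = 1.0 / math.sqrt(2.0)


def downmix_51_to_stereo(fl, fr, c, rl, rr,
                         w_center=ATT_3DB, w_surround=ATT_3DB):
    """Downmix per-channel sample lists to two channels (LFE omitted).

    fl, fr, c, rl, rr are equal-length sequences of samples for the
    front-left, front-right, center, rear-left and rear-right channels.
    """
    left = [l + w_center * ce + w_surround * s
            for l, ce, s in zip(fl, c, rl)]
    right = [r + w_center * ce + w_surround * s
             for r, ce, s in zip(fr, c, rr)]
    return left, right


if __name__ == "__main__":
    # One sample per channel, for illustration only.
    left, right = downmix_51_to_stereo([1.0], [0.0], [1.0], [0.0], [1.0])
    print(left, right)
```

The weights are exposed as parameters because the values actually used in the study are not stated in this section.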

Four point-receiver-based inverse filtering methods are intended to compensate for the acoustical plants. From the findings above, a simple design strategy based on a hybrid approach can be formulated according to the number of passengers, as presented in Table V. The following conclusions can be drawn from the listening tests and the localization test.

First, for two-channel inputs, the UDWD method outperformed the upmixingHIF1 and upmixingPIF1 methods at the FL position. At the RR seat, however, the upmixingPIF1 method performed better than the others.

Second, for a single listener and 5.1-channel inputs, the HIF1 method received the highest grades in most attributes at the FL position, notwithstanding its poor performance in the localization test. In addition, the HIF1 and PIF1 methods both received high grades in many attributes at the rear-right seat. Thus, referring to the result of the localization test, the PIF1 method would be the best choice for that seat.

Third, for the two-listener mode, the HIF2a method received high grades in most attributes. Nevertheless, the DWD method is chosen as the multi-listener strategy, because there is no significant difference between the DWD method and the HIF2a method in these attributes, and the grade of the DWD method is significantly higher than that of the HIF2a method in the localization test. A similar conclusion can be drawn for the four-listener mode. Although the grades of the PIF4 method are slightly higher than those of the DWD method in most attributes, its poor performance in the localization test and its high computational complexity make the PIF4 method a less practical approach for producing spatial sound in the automobile. It can be concluded that inverse filtering did not perform as well in the multi-listener mode as it did in the single-passenger mode. The number of inverse filters increases drastically with the number of passengers, rendering this scheme impractical in automotive applications.

Fourth, the upmixingPIF1 method and the PIF1 method obtained low grades at both the FL and RR seats. These two methods are basically the same, except for the upmixing procedure needed for the different number of input channels. The reason might be that the PIF method produces an excessively narrow frontal sound image, which indicates that the spatial quality can be improved by incorporating a reverberator into the system.
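One common way to formulate the inverse filtering discussed above is a regularized least-squares model-matching problem, solved independently at each frequency. The expression below is a sketch under that assumption and is not necessarily the exact design used in this study. It assumes that H denotes the plant matrix (control points by loudspeakers), C the inverse filter matrix, and M the matching model, so that the filter-plant product HC approximates M. The regularization parameter β ≥ 0 is an assumed symbol.

```latex
% Assumed notation: H = plant matrix, C = inverse filter matrix,
% M = matching model, beta = regularization parameter (beta >= 0).
\min_{\mathbf{C}} \;
  \lVert \mathbf{M} - \mathbf{H}\mathbf{C} \rVert_F^{2}
  + \beta \lVert \mathbf{C} \rVert_F^{2}
\quad \Longrightarrow \quad
\mathbf{C} = \left( \mathbf{H}^{H}\mathbf{H} + \beta \mathbf{I} \right)^{-1}
             \mathbf{H}^{H}\mathbf{M}
```

In this formulation, each additional listener adds control points, and therefore rows, to H and M, so the matrices to be measured and inverted grow with the number of passengers.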

A number of topics are planned for future research. One is to increase the number of rendering loudspeakers in order to devise strategies for luxury cars. Another is the integration of the present surround system with other audio techniques such as equalizers, superbass systems, dynamic range control, Karaoke machines, and acoustical echo and noise control.


TABLE I. The descriptions of ten automotive virtual surround processing methods.

| Method | Input content | Num. of listeners | Design strategy |
| --- | --- | --- | --- |
| Up/downmixing | 2-channel | 1 or more | Up/downmixing + Weighting & delay |
| Downmixing | 5.1-channel | 1 or more | Downmixing + Weighting & delay |
| upmixingHIF1 | 2-channel | 1 | Upmixing + HRTF-based inverse filtering |
| HIF1 | 5.1-channel | 1 | HRTF-based inverse filtering |
| HIF2 | 5.1-channel | 2 | HRTF-based inverse filtering |
| HIF2a | 5.1-channel | 2 | HRTF-based inverse filtering |
| upmixingPIF1 | 2-channel | 1 | Upmixing + Point-receiver-based inverse filtering |
| PIF1 | 5.1-channel | 1 | Point-receiver-based inverse filtering |
| PIF2a | 5.1-channel | 2 | Point-receiver-based inverse filtering |
| PIF4 | 5.1-channel | 4 | Point-receiver-based inverse filtering |

TABLE II. The descriptions of four experiments.

| Experiment | I | II | III | IV |
| --- | --- | --- | --- | --- |
| Input content | 2-channel | 5.1-channel | 5.1-channel | 5.1-channel |
| Passenger no. | 1 | 1 | 2 | 4 |

Anchor: Summation of all lowpass filtered inputs → All outputs

TABLE III. The definitions of the subjective attributes.

| Attribute | Description |
| --- | --- |
| Preference | Overall preference in considering timbre-related and space-related attributes |
| Fullness | Dominance of low-frequency sound |
| Brightness | Dominance of high-frequency sound |
| Artifacts | Any extraneous disturbances to the signal |
| Localization | Determination by a subject of the apparent direction of a sound source |
| Frontal image | The clarity of the frontal image or the phantom center |
| Proximity | The sound is dominated by the loudspeaker closest to the subject |
| Envelopment | Perceived quality of listening within a reverberant environment |

TABLE IV. The description of five levels of grade for the localization test.

| Grade | Description |
| --- | --- |
| 5 | The perceived angle is the same as the presented angle |
| 4 | 30° difference between the perceived angle and the presented angle |
| 3 | Front-back reversal of the perceived angle identical to the presented angle |
| 2 | 30° difference between front-back reversal of the perceived angle and the presented angle |
| 1 | Otherwise |
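The grading rule of Table IV can be sketched as a small function. This is an illustrative reading only: it assumes that angles are given in degrees on the 30° marker grid, with 0° straight ahead, and that a front-back reversal mirrors an angle θ to 180° − θ. The function names are hypothetical.

```python
def wrap(angle):
    """Wrap an angle in degrees to the interval (-180, 180]."""
    a = (angle + 180) % 360 - 180
    return 180 if a == -180 else a


def angular_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs(wrap(a - b))


def front_back_reversal(angle):
    """Mirror an angle about the left-right axis (assumed convention)."""
    return wrap(180 - angle)


def localization_grade(presented, perceived):
    """Return the grade (5 to 1) for one trial, following Table IV."""
    direct = angular_diff(perceived, presented)
    if direct == 0:
        return 5
    if direct == 30:
        return 4
    mirrored = angular_diff(front_back_reversal(perceived), presented)
    if mirrored == 0:
        return 3
    if mirrored == 30:
        return 2
    return 1


if __name__ == "__main__":
    # A source presented at 30 degrees but heard at 150 degrees is a
    # pure front-back reversal.
    print(localization_grade(30, 150))
```

The checks run from the highest grade down, so a response that satisfies more than one row of the table receives the higher grade.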

TABLE V. Summary of the strategies for various listening modes.

| Passenger no. | Input channel | Strategy |
| --- | --- | --- |
| 1 (FL) | 2 | Up/downmixing method |
| 1 (RR) | 2 | upmixingPIF1 method |
| 1 (FL) | 4 | HIF1 method |
| 1 (RR) | 4 | PIF1 method |
| 2 or more | 4 | Downmixing method |
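The selection rule summarized in Table V can be expressed as a simple lookup. The sketch below is hypothetical: the function name, argument names, and the decision to raise an error for combinations that the table does not list are illustrative assumptions, and the input channel values are copied from the table as given.

```python
# Rows of Table V for a single passenger, keyed by (seat, input channel).
SINGLE_PASSENGER = {
    ("FL", 2): "Up/downmixing method",
    ("RR", 2): "upmixingPIF1 method",
    ("FL", 4): "HIF1 method",
    ("RR", 4): "PIF1 method",
}


def select_strategy(passengers, input_channel, seat=None):
    """Return the Table V strategy for the given listening mode."""
    if passengers < 1:
        raise ValueError("at least one passenger is required")
    if passengers >= 2:
        # Table V lists only one multi-passenger row.
        if input_channel == 4:
            return "Downmixing method"
        raise ValueError("combination not listed in Table V")
    try:
        return SINGLE_PASSENGER[(seat, input_channel)]
    except KeyError:
        raise ValueError("combination not listed in Table V") from None


if __name__ == "__main__":
    print(select_strategy(1, 2, seat="FL"))
    print(select_strategy(3, 4))
```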

Fig. 1. The block diagram of the standard downmixing algorithms. (Diagram labels: C, FL, FR, RL, RR, L, R; weights w1, w2, w3.)

Fig. 2. The block diagram of the reverberation-based upmixing algorithms. (a) The structure of the reverberator. (b) Block diagram of the upmixing algorithm.

Fig. 3. The block diagram of the UDWD method. (Diagram labels: L, R; Upmixing Algorithms; FL', FR', RL', RR'; weights w1, w2; delays z^-D; FL, FR, RL, RR.)

Fig. 4. The block diagram of the DWD method. (Diagram labels: C, FL, FR, RL, RR; Downmixing Algorithms; L', R'; weight w1; delays z^-D.)

Fig. 5. The block diagram of the multichannel model matching problem. L: number of control points, M: number of loudspeakers, and N: number of program input.

Fig. 6. The geometry of HRTF model. (Diagram labels: H11, H12, H13, H14, H21, H22, H23, H24.)

Fig. 7. The geometry of point receiver model. The left plot shows the model for the single-listener case, and the right plot indicates the loudspeakers and the seats. (Diagram labels: Loudspeakers 1 to 4; H1, H2, H3, H4; seat numbers 1 to 4.)

Fig. 8. The geometry of the matching model for point receiver model in four-listener sitting mode. (Diagram labels: legend Virtual source, Receiver; numbered points 1 to 4; dimensions 2 m, 0.85 m, 0.7 m.)

Fig. 9. The block diagram of the upmixingHIF1 method.

Fig. 10. The block diagram of the HIF1 method, the HIF2 method and the HIF2a Method.

Fig. 11. The block diagram of the upmixingPIF1 method.

Fig. 12. The block diagram of the PIF1 method and the PIF2a method.

Fig. 13. The block diagram of the PIF4 method.

Fig. 14. The photos of the experimental arrangement. (a) External view. (b) Internal view. (Photo labels: DVD LCD player; Front-right loudspeaker.)

Fig. 15. The frequency response of the HRTF-based acoustical plant at the front-left seat. (a) the front-side loudspeakers (b) the rear-side loudspeakers. The dotted lines represent the measured responses and the solid lines represent the smoothed responses.

Fig. 16. The frequency responses of the HRTF-based inverse filters for front-left seat. (a) For the front sound image. (b) For the rear sound image.


Fig. 17. The frequency responses for the virtual sound image rendering. The solid lines represent the matching model responses M and the dotted lines represent the multichannel filter-plant product HC. (a) For the front sound image (b) For the rear sound image


Fig. 18. The frequency responses of the HRTF-based inverse filters for front-left and rear-right seats. (a) For the front sound image. (b) For the rear sound image


Fig. 19. The frequency responses for the virtual sound image rendering. The solid lines represent the matching model responses M and the dotted lines represent the multichannel filter-plant product HC. (a) For the front sound image (b) For the rear sound image

Fig. 20. The frequency responses of the point receiver-based acoustical plant at the front-left seat. The dotted lines represent the measured responses and the solid lines represent the smoothed responses.

Fig. 21. The frequency responses of the point receiver-based inverse filters for the front-left seat.

Fig. 22. The frequency responses for the virtual sound image rendering. The solid lines represent the matching model responses M and the dotted lines represent the multichannel filter-plant product HC.

Fig. 23. The frequency responses of the point receiver-based acoustical plant for four listener mode. The dotted lines represent the measured responses and the solid lines represent the smoothed responses.

Fig. 24. The frequency responses of the point-receiver-based inverse filters for four-listener mode.

Fig. 25. The frequency responses for the virtual sound image rendering. The solid lines represent the matching model responses M and the dotted lines represent the multichannel filter-plant product HC.

Fig. 26. The means and spreads (with 95% confidence intervals) of the grades for Exp. I. (a) The first four attributes for FL seat (b) The last four attributes for FL seat (c) The first four attributes for RR seat (d) The last four attributes for RR seat.

Fig. 27. The means and spreads (with 95% confidence intervals) of the grades for Exp. II. (a) The first four attributes for FL seat (b) The last four attributes for FL seat (c) The first four attributes for RR seat (d) The last four attributes for RR seat. (Axis labels: DWD, HIF1, PIF1, An., H.R.; panel titles: Position: FL, Position: RR.)

Fig. 28. The means and spreads (with 95% confidence intervals) of the grades for Exp III. (a) The first four attributes (b) The last four attributes. (Axis labels: DWD, HIF2, HIF2a, PIF2a, An., H.R.)

Fig. 29. The means and spreads (with 95% confidence intervals) of the grades for Exp IV. (a) The first four attributes (b) The last four attributes. (Axis labels: Downmixing, PIF4, An., H.R.)

Fig. 30. The arrangement for the localization test. The markers are positioned on the boundary of the car at eye level with a resolution of 30°.

(Angle labels: -150°, -120°, -90°, -60°, -30°, 0°, 30°, 60°, 90°, 120°, 150°, 180°.)

Fig. 31. The results of the localization test. (a) Unprocessed case for FL seat. (b) Unprocessed case for RR seat. (c) The downmixing method for FL seat. (d) The downmixing method for RR seat. (e) The HIF1 method for FL seat. (f) The HIF1 method for RR seat. (g) The PIF1 method for FL seat. (h) The PIF1 method for RR seat. (i) The PIF4 method for FL seat. (j) The PIF4 method for RR seat. (k) The HIF2a method for FL seat. (l) The HIF2a method for RR seat.

Fig. 32. The means and spreads (with 95% confidence intervals) of the grades for Exp. IV. (a) The first four attributes (b) The last four attributes. (Axis labels: DWD, HIF1, PIF1, Unprocessed, PIF4, HIF2a; grade axis from 1.6 to 4.2; legend: Seat FL, Seat RR.)
