Goals and Outline of the Thesis - A Study of Virtual Analog Synthesizer Sound Sources: A Triangle Waveform Implementation for Sound Synthesis

Chapter 1 Introduction

1.3 Goals and Outline of the Thesis

When subtractive synthesis is used as a sound design tool, instruments such as the organ, glockenspiel, human whistling and more can easily be synthesized using only a triangle wave as the sound source [16]. Currently known algorithms start from a sawtooth wave or an impulse train and therefore take more steps to obtain a triangle wave. This thesis introduces an algorithm for virtual analog (V.A) synthesis that directly generates an alias-suppressed triangle wave, so that the musical tones and instruments mentioned above can be implemented more efficiently, with less computation and memory.

This thesis is structured as follows. In Chapter 2, the concept of subtractive synthesis and the classic waveforms used in it are introduced, and the aliasing problem in digitally generated classic waveforms is discussed. In Chapter 3, previous V.A algorithms for generating classic waveforms are reviewed. In Chapter 4, a method for implementing a synthesizer’s triangle wave is proposed and evaluated with a perceptual model of hearing.

Chapter 5 contains the conclusions and future work.

Chapter 2 Subtractive Synthesis and Virtual Analog

2.1 Subtractive Synthesis

Subtractive synthesis is a sound design method. As the name suggests, subtractive synthesis filters out, that is, subtracts, the unwanted frequencies generated by a source signal. The source signal is generally a harmonically rich waveform such as a sawtooth, square or triangle wave, or noise [7]. The classic waveforms used in early analog synthesizers are discussed in Section 2.3.

2.2 Subtractive Synthesizer

Early analog synthesizers, from around the 1960s, use subtractive synthesis as their sound generation principle. Such a synthesizer includes a collection of sound-producing and sound-modifying modules, for example oscillators, filters, low frequency oscillators (LFO), amplifiers and more [7]. In its most basic form, shown in Fig 2-1, sound synthesis is a very simple process: the source produces a constant raw waveform, the filter changes its harmonic structure, and the amplitude envelope shapes the sound’s volume.

Fig 2-1 Basic form of subtractive synthesis

2.2.1 Sound Processing Modules

Subtractive synthesis uses one or more oscillators as the sound source. Each oscillator can have a different pitch and waveform. The waveforms are typically simple geometric shapes. The pitch of an oscillator depends on the played note but can optionally be controlled by an envelope generator or LFO.

The filter in a subtractive synthesizer is often a lowpass filter with resonance. In most situations, the filter’s cutoff frequency or resonance amount is time-varying, controlled by an envelope generator that alters the timbre of the sound over time to produce a wide range of synthetic and imitative timbres. The amplifier takes the processed signal and simply raises or lowers its volume. It can be controlled by modulation modules to shape the overall sound automatically.

2.2.2 Modulation Modules

Modulation modules such as the low frequency oscillator (LFO) or envelope generator do not directly modify the signal but instead control the parameters of the sound processing modules.

When the sound processing modules are modulated by a large number of control signals, complex sounds can be produced.

[Fig 2-1 diagram labels: Source signal → Filter → Envelope → Sound]

An LFO provides a signal with a frequency usually below 20 Hz, which creates a rhythmic pulse or sweep.

Audio effects such as vibrato and tremolo are produced when an LFO modulates the frequency and the amplitude, respectively, in the sound processing modules. An envelope generator (E.G) provides envelope modulation to shape the volume or harmonic content of the produced sound over time. The four-stage E.G with attack, decay, sustain and release (ADSR) is commonly used.
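As a rough illustration of the four-stage envelope just described, the following Python sketch builds a linear ADSR amplitude envelope. The function name `adsr_envelope`, its parameters and the linear segment shapes are assumptions made for this example; real synthesizers often use exponential segments.

```python
def adsr_envelope(attack, decay, sustain_level, release, note_length, fs=44100):
    """Linear ADSR envelope as a list of gains in [0, 1].

    attack, decay, release and note_length are in seconds. The key is held
    for note_length seconds and then released. Assumes that
    note_length >= attack + decay (a simplification for this sketch).
    """
    n_attack = round(attack * fs)
    n_decay = round(decay * fs)
    n_release = round(release * fs)
    n_sustain = round(note_length * fs) - n_attack - n_decay
    if n_sustain < 0:
        raise ValueError("note_length must cover attack + decay")

    env = []
    # Attack: ramp from 0 up towards full level.
    env += [i / n_attack for i in range(n_attack)]
    # Decay: ramp from full level down towards the sustain level.
    env += [1.0 - (1.0 - sustain_level) * i / n_decay for i in range(n_decay)]
    # Sustain: hold while the key is down.
    env += [sustain_level] * n_sustain
    # Release: ramp from the sustain level down to 0.
    env += [sustain_level * (1.0 - (i + 1) / n_release) for i in range(n_release)]
    return env
```

Multiplying each oscillator sample by the corresponding envelope value shapes the volume of the note; the same envelope could instead drive a filter cutoff.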

2.3 Classic Waveform

As mentioned in Section 2.1, the source signal relies heavily on spectrally rich waveforms.

Any periodic waveform can be used in source-filter synthesis to produce a spectrum with a different harmonic structure. The shape of the source signal waveform determines the amplitude relations of the harmonics.

Simple geometric shapes such as the sawtooth wave, rectangular pulse wave and triangle wave are typically used [8], since they are easy to produce electronically. An early analog VCO module with the four classic waveforms is shown in Figure 2-2. The shapes of the classic waveforms are plotted in Fig 2-3.

Fig 2-2 Analog VCO module with classic waveforms (Picture from [14])

Fig 2-3 Classic waveforms. Sawtooth, triangle, square and sine wave.

2.3.1 Sawtooth Wave

Sawtooth waves contain both even and odd integer harmonics, as shown in Fig 2-4. The sawtooth is the most common waveform used in subtractive synthesis. The waveform contains one discontinuity per cycle. The harmonic amplitudes of the sawtooth waveform fall at about 6 dB per octave, where one octave is defined as a doubling of frequency.

This slow fall-off rate causes heavy aliasing when the waveform is directly synthesized in the digital domain.

Pulse and square waves can be derived by summing two shifted sawtooth waves.

Therefore most algorithms in V.A synthesis concentrate on producing sawtooth waveforms.

2.3.2 Square and Triangle Wave

The spectrum of the square wave falls at about 6 dB per octave, the same as the sawtooth, but contains only odd-integer harmonics. A square wave can be derived by combining two shifted sawtooth waves, as in Eq.1, where f is the fundamental frequency in Hz and t is time.

$$\mathrm{Square}(t) = \mathrm{Saw}(t) - \mathrm{Saw}\left(t - \frac{1}{2f}\right) \qquad (1)$$
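Eq.1 can be checked numerically with a few lines of Python. The sketch below assumes an ideal bipolar sawtooth in the range [-1, 1); the function names `saw` and `square_from_saws` are illustrative only.

```python
def saw(t, f):
    """Ideal bipolar sawtooth in [-1, 1) with fundamental frequency f (Hz)."""
    return 2.0 * ((f * t) % 1.0) - 1.0


def square_from_saws(t, f):
    """Eq.1: a sawtooth minus a copy of itself delayed by half a period."""
    return saw(t, f) - saw(t - 1.0 / (2.0 * f), f)
```

With this sawtooth convention the result is -1 during the first half of each period and +1 during the second half.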

The triangle wave also contains only odd harmonics. Its spectral tilt rolls off much faster than that of the square and sawtooth waves, at about 12 dB per octave.

2.4 Digital Implementation of Classic Waveforms

Classic geometric waveforms with sharp edges and discontinuities cause aliasing problems when implemented in the digital domain. Because the period of the waveform in samples, $P = f_s/f$, is not normally an integer, the computer has to round the time of each discontinuity to the nearest available sampling instant, which causes aliasing in the time domain. On the other hand, waveforms such as the sawtooth, triangle and square wave have theoretically infinite bandwidth. Harmonic content above the Nyquist limit $f_s/2$ is reflected down into the audible frequency range [9] [10]. This aliasing noise is heard as roughness and unpleasant inharmonicity, particularly at high fundamental frequencies.
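To make the fold-down concrete, the short Python sketch below computes the period in samples and the frequency at which a harmonic above the Nyquist limit reappears. The function names and the default sampling rate of 44.1 kHz are choices made for this example.

```python
def period_in_samples(f, fs=44100.0):
    """Period P = fs / f of a waveform with fundamental f, in samples."""
    return fs / f


def aliased_frequency(freq, fs=44100.0):
    """Frequency (Hz) at which a component at `freq` appears after sampling."""
    folded = freq % fs        # the sampled spectrum repeats every fs
    if folded > fs / 2.0:     # the upper half mirrors around the Nyquist limit
        folded = fs - folded
    return folded
```

For a 1760 Hz sawtooth at 44.1 kHz the period is about 25.06 samples, and the 13th harmonic at 22880 Hz folds down to 21220 Hz, which is not a multiple of 1760 Hz and is therefore heard as inharmonic.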

Generating geometric waveforms trivially can be seen as sampling the waveforms without any form of bandlimiting. Unless the spectrum falls off very fast, this results in heavy aliasing. A sawtooth wave can be produced trivially using a simple modulo counter [17], as described in Eq.2, where $T_s$ is the sampling interval, $f_0$ is the fundamental frequency and $n$ is the sample number. In Figure 2-5, $2(n f_0 T_s \bmod 1)$ is plotted on the left, and on the right is $s(n)$, from which 1 has been subtracted to center it at zero.

$$s(n) = 2\,(n f_0 T_s \bmod 1) - 1 \qquad (2)$$

Fig 2-5 Modulo counter generating trivial sawtooth.

Figure 2-6 shows the heavy aliasing at high frequencies when a sawtooth wave is generated with the modulo counter, compared to an ideal sawtooth wave. The spectra have been computed from a 1 second signal segment with a 44100-point FFT.

Fig 2-6 Aliasing of the trivial sawtooth generated using Eq.2.

The aliased components in Figure 2-6 can severely corrupt the sound quality. The use of classic waveforms in digital subtractive synthesis therefore requires an efficient waveform synthesis algorithm that reduces the aliased components below an audible level.
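The trivial sawtooth of Eq.2 takes only a few lines of Python. This is a minimal sketch; the function name `trivial_sawtooth` and the default sampling rate are assumptions for the example, and no bandlimiting is applied, so the output aliases exactly as described above.

```python
def trivial_sawtooth(f0, num_samples, fs=44100.0):
    """Bipolar modulo counter of Eq.2: s(n) = 2 (n f0 Ts mod 1) - 1."""
    ts = 1.0 / fs  # sampling interval Ts
    return [2.0 * ((n * f0 * ts) % 1.0) - 1.0 for n in range(num_samples)]
```

Each sample advances the phase by f0/fs, and the modulo operation wraps it back once per period, which is where the discontinuity is rounded to the sampling grid.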

There are basically three approaches to dealing with the aliasing issue, categorized by Välimäki et al. as follows [17] [18] [19]:

1. Strictly bandlimited methods: only the harmonics up to half of the sampling frequency are generated.

2. Quasi-bandlimited methods: these methods do not eliminate aliasing completely but suppress it just enough to make it less disturbing. Some aliasing is allowed, mainly at high frequencies.

3. Alias-suppressing methods: aliasing is allowed in the whole frequency band, but it is sufficiently suppressed at low and middle frequencies using spectral tilt modifications. The alias-suppressed triangle wave method proposed in Chapter 4 is in this category.

Next, a selection of different algorithms is explained and discussed.

Chapter 3 Algorithms Review

Since the method of generating a triangle wave proposed later belongs to the alias-suppressing category, the algorithm review focuses on other alias-suppressing methods, including Lane’s method and the differentiated polynomial wave. The bandlimited impulse train in the quasi-bandlimited framework and the discrete summation formula among the strictly bandlimited methods are also explained.

3.1 Lane’s method

Lane et al. introduced algorithms that filter a full-wave rectified sine wave to obtain classic periodic waveforms [20]. This nonlinear wave-shaping technique is the basis for one of the early V.A oscillator models. A sawtooth wave can be generated by the algorithm shown in the block diagram of Figure 3-1, where f denotes the fundamental frequency in Hz.

Fig 3-1 Block diagram of Lane’s method

[Fig 3-1 diagram labels: sine s(n) at frequency f/2 → Abs → |s(n)| → Lowpass → x(n) → Highpass (cutoff frequency adjusted with f) → y(n)]

The strategy is first to generate a waveform whose harmonic partials fall off rapidly. A full-wave rectified sine is used; the Fourier series of this nonlinearly wave-shaped sine is presented in Eq.3 [20]. The harmonic partials of the full-wave rectified sine fall off faster than those of a sawtooth wave but still extend to infinity, so aliasing fold-back still exists, but with less energy. A dc offset appears during full-wave rectification, see Figure 3-2 (b).

(3)
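For reference, the textbook Fourier series of a unit-amplitude full-wave rectified sine can be written as below. This is the standard result rather than a transcription of Eq.3 in [20], whose notation may differ; the sine is assumed here to run at f/2 so that the rectified wave has fundamental frequency f.

```latex
\left|\sin(\pi f t)\right| = \frac{2}{\pi} - \frac{4}{\pi}\sum_{k=1}^{\infty}\frac{\cos(2\pi k f t)}{4k^{2}-1}
```

The constant term 2/π is the dc offset, and the weights 1/(4k²-1) fall off at roughly 12 dB per octave for large k, faster than the 1/k weights of a sawtooth.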

A Butterworth lowpass filter is applied to the wave-shaped signal (an 8th-order filter with a cutoff frequency of 12 kHz is used in [20] and in Figure 3-2). The aliasing components and the harmonics in the vicinity of the Nyquist frequency are thereby eliminated.

Finally, an adaptive first-order infinite impulse response (IIR) highpass filter is added, which corrects the spectral tilt to yield an approximate sawtooth spectrum and also removes the dc offset. The design of the highpass filter is described in [20]. The filter coefficients have to be recalculated when the sawtooth frequency changes. It is suggested to store the coefficients in a table, at the cost of more memory consumption.

Fig 3-2 Sawtooth wave and spectrum of Lane’s method

In Figure 3-2 the sawtooth waveform is computed with a sampling rate of 44100 Hz. The spectra have been computed from a 1 second signal segment with a 44100-point FFT using a Blackman window.

3.2 Differentiated Parabolic Wave

Differentiated parabolic wave (DPW) synthesis is another classic waveform algorithm in the alias-suppressing framework, based on differentiating a piecewise parabolic waveform. This method was proposed by Välimäki around 2006 [17] [18].

An extended version of DPW with higher orders, which provides improved alias suppression and is called the differentiated polynomial wave, is discussed in a later section.

Figure 3-3 shows the flow chart of DPW [17].

Fig 3-3 Flow chart of the differentiated parabolic wave

Instead of starting with a sinusoidal signal as in Lane’s method, a trivial sawtooth waveform is generated first using the bipolar modulo counter described in Section 2.4.

By squaring the trivial sawtooth waveform, the piecewise parabolic waveform is obtained.

Fig 3-4 Parabolic waveform obtained by squaring the modulo counter.

(a) The waveform, (b) spectrum of an 880 Hz parabolic wave; the line indicates the ideal sawtooth spectral slope of about -6 dB/octave.

The parabolic waveform has a spectrum that decays at about -12 dB/octave, steeper than the sawtooth, which decays at about -6 dB/octave. Consequently, the components above the Nyquist frequency that fold back as aliasing have less energy. The parabolic waveform is then differentiated with a first-order difference filter with transfer function $1 - z^{-1}$. This filter modifies the harmonic partials to approximate a sawtooth wave with a fall-off rate of -6 dB/octave. After scaling using Eq.4 [17], a normalized sawtooth wave is obtained, see Figure 3-5. It shows that the alias noise at low frequencies is suppressed, but the harmonics have less energy at high frequencies compared with the trivial sawtooth. The spectrum in Figure 3-5 is plotted using a 44100-point FFT with a Blackman window for a 1 second long DPW sawtooth wave.

$$c = \frac{f_s}{4 f \left[ 1 - \frac{f}{f_s} \right]} \qquad (4)$$

Fig 3-5 (a) DPW sawtooth wave, 880 Hz, (b) trivial sawtooth (with 44.1 kHz sampling rate).
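The three DPW steps (square the modulo counter, differentiate, scale) can be sketched in Python as follows. The function name `dpw_sawtooth` is illustrative, and setting the first output sample to zero is an arbitrary start-up choice made for this example.

```python
def dpw_sawtooth(f, num_samples, fs=44100.0):
    """Second-order DPW sawtooth: square, differentiate, scale by c (Eq.4)."""
    c = fs / (4.0 * f * (1.0 - f / fs))  # scaling factor from Eq.4
    out = []
    prev_sq = None
    for n in range(num_samples):
        s = 2.0 * ((n * f / fs) % 1.0) - 1.0  # bipolar modulo counter
        sq = s * s                            # piecewise parabolic waveform
        # First-order difference 1 - z^-1. No previous sample exists at
        # n = 0, so the first output is set to 0.
        diff = 0.0 if prev_sq is None else sq - prev_sq
        out.append(c * diff)
        prev_sq = sq
    return out
```

With this scaling the output stays within about ±1 regardless of the fundamental frequency, which is the purpose of the factor c.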

3.3 Differentiated Polynomial Wave

The differentiated polynomial waveforms extend the previous DPW method to higher polynomial orders, which are differentiated one or more times to provide improved alias suppression. Higher-order polynomial functions can be derived by analytically integrating a lower-order polynomial function [21]. The process of higher-order DPW is illustrated in Figure 3-6.

[Fig 3-6 diagram labels: modulo counter (frequency f) → order-N polynomial function (waveshaping) → N-1 differentiators → scaling by c]

3.4 Bandlimited Impulse Train (BLIT)

Classic waveforms can be generated by applying an appropriate integration to a bandlimited impulse train, as described by Stilson and Smith [22]. A bandlimited impulse train (BLIT) can be generated by lowpass filtering the continuous-time impulse train. In other words, the impulses are replaced with the impulse response of a lowpass filter. The lowpass filter is taken to be an ideal lowpass filter with cutoff frequency fc, whose impulse response is given by Eq.4.

(4)

The discrete-time bandlimited impulse train is obtained by sampling the bandlimited continuous-time impulse train [22], as shown in Eq.5.

Unfortunately, in Eq.11 the value of the BLIT at each time instant requires the summation of infinitely long sinc functions, which is impossible in practice on a computer. In the following sections, different methods are introduced to overcome this problem.

3.4.1 Discrete Summation Formula

In BLIT … can be closely approximated.

(6)

Stilson and Smith [22] proposed a similar technique to obtain the bandlimited impulse train, shown in Eq.7, where $P = f_s/f$ is the period in samples and M is the number of harmonics, taken as the largest odd integer not exceeding the period P.

(7)

Both methods described above provide alias-free harmonics but require two sine evaluations and a division per sample, which are time-consuming. It encounters a
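The cost mentioned above (two sine evaluations and a division per sample) is visible in the commonly quoted closed form of the BLIT, y(n) = (M/P) Sinc_M(Mn/P) with Sinc_M(x) = sin(πx) / (M sin(πx/M)). The Python sketch below assumes that this is the form intended by Eq.7, which is not reproduced in this text; the function name `blit_dsf` is hypothetical.

```python
import math


def blit_dsf(n, period):
    """One sample of a closed-form BLIT with period P = `period` samples.

    M is the largest odd integer not exceeding P (assumes P >= 1).
    Computes sin(pi M n / P) / (P sin(pi n / P)).
    """
    m = int(math.floor(period))
    if m % 2 == 0:
        m -= 1  # force M to be odd
    denom = math.sin(math.pi * n / period)
    if abs(denom) < 1e-12:
        # Numerator and denominator both vanish here (n = 0, P, 2P, ...);
        # the limit of the ratio is M / P.
        return m / period
    return math.sin(math.pi * m * n / period) / (period * denom)
```

The explicit branch for a vanishing denominator is one simple way to handle the 0/0 case that the closed form runs into at the impulse positions.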

3.4.2 Sum of Windowed Sincs (BLIT-SWS)

Another approach to generating a BLIT is to pre-calculate a windowed sinc function and sum the values of the windowed sinc functions (BLIT-SWS) [22].

This concept is well known as wavetable synthesis [12][13]. Blackman and Kaiser windows are typically used. The alias-suppression quality of BLIT-SWS is determined by the window type, the number of zero-crossings and the number of samples per zero-crossing. Wavetable synthesis consumes much memory and requires interpolation when the fundamental frequency changes.

Chapter 4 Implementing an Alias-Suppressed Triangle Wave

In this chapter, a method for generating a triangle wave that can be used in subtractive sound synthesis is proposed. It is compared with DPW, another alias-suppressing method reviewed in Chapter 3. Finally, the algorithm is examined to show that fundamental frequencies within the practically used range are perceptually alias-free when the psychoacoustic phenomena of masking and the hearing threshold are taken into account.

4.1 Nonlinear Waveshaping

Audio effects such as overdrive, valve simulation and distortion, used in guitar and recording applications, fall into the category of nonlinear processing [26]. They create additional harmonic or inharmonic frequency components that are not present in the input signal.

The idea for generating an alias-suppressed triangle wave is based on this concept. A sine wave is used as the original input, as in Lane’s method [20]. A nonlinear waveshaper distorts the sine wave so that only odd harmonics are produced and the harmonic content can

4.2 Hyperbolic Tangent Waveshaper

It has been suggested that an ‘S’-shaped, symmetrical curve function used to simulate pentode valves can generate additional odd harmonics when a pure tone is fed in [27].

The hyperbolic tangent function has this characteristic, and it has been discussed in [28] [29].

Figure 4-1 shows the output of the hyperbolic tangent function for input values between -1 and 1.

Fig 4-1 Hyperbolic tangent function.

Fig 4-2 (a) Sine wave shaped by tanh(x), (b) its spectrum.

In Figure 4-2, a 440 Hz sine wave shaped by the hyperbolic tangent waveshaper is plotted; the dashed line indicates the original sine wave. A 44.1 kHz sampling rate and a 44100-point FFT with a Blackman window are used. Figure 4-2(b) shows that odd harmonics appear during waveshaping, but with a steep spectral roll-off, and the signal is effectively bandlimited above 10 kHz. In this situation, a triangle wave is hard to generate by modifying harmonic content that does not even exist at high frequencies. Nonetheless, Lazzarini and Timoney [29] modified the hyperbolic tangent waveshaper and found that a low-aliasing square wave can be generated efficiently, although that is not our goal.

4.3 Waveshaper used in Triangle Wave Algorithm

Odd harmonics can be generated using a common distortion waveshaper described in Eq.8 [34].

This waveshaper is widely used in the simulation of guitar distortion, with the distortion factor ‘a’ ranging from -1 to +1. In Figure 4-3, the output of the waveshaping function is plotted for a=0.1 and a=0.9. The function is symmetric with an ‘S’ curve, as [27] suggests.

Fig 4-3 Plot of Eq.8 with a=0.1 and a=0.9

Now consider a sine wave distorted by this waveshaper, shown in Figure 4-4. The output approaches a square wave as the distortion factor a increases. When a is close to 1, a square wave is obtained, but with a large amount of aliasing.

On the other hand, the spectral tilt falls off quickly when a is close to zero, so the components above the Nyquist frequency that fold back into the audio range are suppressed. The plots in Figure 4-4 are computed from a 1 second signal segment, of which the first two cycles are plotted. The spectra are computed from the signal with a 44100-point FFT using a Blackman window.

Fig 4-4 Sine wave distorted by Eq.8.
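The body of Eq.8 is not reproduced in this text. As a stand-in, the Python sketch below uses a waveshaper that is commonly quoted for guitar-style distortion and has the properties described above: an odd-symmetric ‘S’ curve, a distortion amount a between -1 and 1, and hard clipping as a approaches 1. Whether it matches Eq.8 in [34] exactly is an assumption, as is the function name `distortion_waveshaper`.

```python
def distortion_waveshaper(x, a):
    """Assumed form: f(x) = (1 + k) x / (1 + k |x|), with k = 2a / (1 - a).

    x is the input sample in [-1, 1]; a is the distortion amount, -1 < a < 1.
    """
    if not -1.0 < a < 1.0:
        raise ValueError("a must lie strictly between -1 and 1")
    k = 2.0 * a / (1.0 - a)
    return (1.0 + k) * x / (1.0 + k * abs(x))
```

For a = 0.1 the curve is only slightly bent (an input of 0.5 maps to 0.55), while for a = 0.9 the same input already maps to 0.95, close to the clipped level of a square wave.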

4.4 Triangle wave algorithm

In the previous section, an appropriate waveshaper for generating odd harmonics was discussed.

that gets close to -12 dB/octave; thus an alias-suppressed triangle wave is generated.

The block diagram of the triangle wave algorithm is shown in Figure 4-5.

Fig 4-5 Block diagram of waveshape triangle.

The waveshaper is the one given in Eq.8, with the distortion factor a set to 0.1.

The amplitude of the output signal after differentiation varies with the fundamental frequency.

At high frequencies the maximum difference between two adjacent samples increases because the waveform rises faster. Thus, scaling is needed. A scaling factor ‘g’, derived from a 10th-order polynomial fit over the frequency range 8 Hz to 12544 Hz (MIDI numbers 0 to 127, [30]), is given in Eq.9.

$$g = f_0 \times 0.0001741 - 0.0000006 \qquad (9)$$

[Fig 4-5 diagram labels: f(x[n]) → differentiator (subtract the signal delayed by 1 sample) → scaling g → triangle wave]
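The chain in Figure 4-5 (sine, waveshaper, differentiator, scaling) can be sketched in Python as follows. Two points are assumptions rather than statements from the text: the waveshaper form stands in for Eq.8, and the differentiated signal is divided by g from Eq.9 so that the output peaks near 1. The function name `waveshape_triangle` is hypothetical, and the coefficients of Eq.9 are taken to apply at a 44.1 kHz sampling rate.

```python
import math


def waveshape_triangle(f0, num_samples, fs=44100.0, a=0.1):
    """Waveshaped sine, first-order difference, then normalization by g."""
    k = 2.0 * a / (1.0 - a)            # assumed stand-in for Eq.8
    g = f0 * 0.0001741 - 0.0000006     # scaling factor from Eq.9
    out = []
    prev = 0.0                         # shaped value of sin(0)
    for n in range(1, num_samples + 1):
        x = math.sin(2.0 * math.pi * f0 * n / fs)
        shaped = (1.0 + k) * x / (1.0 + k * abs(x))
        out.append((shaped - prev) / g)  # differentiate, then normalize
        prev = shaped
    return out
```

Under these assumptions a 440 Hz tone comes out with a peak level just below 1, which suggests that g approximates the largest sample-to-sample difference of the shaped sine.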

4.5 Comparison

This section evaluates the audio quality of signals produced using the proposed waveshape triangle and the DPW method, both in the alias-suppressing category. Methods for generating a triangle wave with DPW are discussed in [21], as mentioned in Chapter 3. The DPW triangle wave polynomial functions for orders N=1 to 3 are gathered in Table 2, where s(n) is the modulo counter and $P = f_s/f$ is the period in samples.

Table 2

DPW triangle wave order N=1 to 3, [21].

Polynomial order N Polynomial function Triangle wave

Input signal Amplitude Scaling factor c

The results of the waveshape triangle and DPW of orders N=1 to 3 are compared in Figure 4-6, computed over 1 second with a fundamental frequency of 1760 Hz and a sampling rate of 44.1 kHz. The spectra are plotted using a 44100-point FFT with a Blackman window.

Fig 4-6 Spectrum of waveshape triangle and DPW triangle wave order N=1 to 3.

Figure 4-6 shows that the alias-suppressing ability of the waveshape triangle is similar to that of the 2nd-order DPW triangle. Since an N-th order DPW requires N-1 differentiators and its polynomial becomes more complicated to compute at higher orders, the waveshape triangle is computationally more efficient. The block diagrams of the waveshape triangle and the DPW triangle wave are shown in Figure 4-7.

Fig 4-7 Block diagrams of waveshape triangle compared to DPW triangle [21].

[Fig 4-7 diagram labels: modulo counter, Abs, adders, multipliers, constant 0.5, 2 differentiators]

4.6 Perceptual Evaluation

In this section, we evaluate the audio quality of signals produced using the proposed waveshape triangle method. The sound quality of alias-suppressed signals has been assessed using various methods. Välimäki and Huovilainen applied the noise-to-mask ratio (NMR), which is commonly used for evaluating the quality of perceptual audio codecs [19]. Perceptual Evaluation of Audio Quality (PEAQ) is used by Timoney et al. [31].

The human auditory system can render the aliasing inaudible in certain conditions [32]. The main psychoacoustic phenomenon involved is masking. That is, if an aliased component is located near a harmonic peak and its level is below a certain level, it is masked by the sensory system so that the aliasing is not perceived.

Another aspect is the hearing threshold in quiet. The hearing threshold level dramatically increases above 15 kHz [32].

The perceptual evaluation used here is performed by comparing the threshold of hearing and masking curve of oscillators with their aliasing levels [21] [25]. Sound quality of the V.A oscillators can be evaluated by identifying the maximum fundamental frequency up to which the aliasing is not audible.

First, a hearing threshold function (Eq.10) [32] is used to find which spectral components are inaudible, where f is the frequency in Hz and the level is expressed as absolute sound pressure level (SPL)
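The body of Eq.10 is not reproduced in this text. A widely used closed-form approximation of the threshold in quiet is Terhardt's formula, sketched below in Python; treating it as equivalent to Eq.10 is an assumption, and the function name `hearing_threshold_db_spl` is hypothetical.

```python
import math


def hearing_threshold_db_spl(f_hz):
    """Approximate threshold in quiet (dB SPL) at frequency f_hz (Terhardt)."""
    khz = f_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)
```

The approximation gives about 3.4 dB SPL at 1 kHz, dips below 0 dB SPL near 3.3 kHz where hearing is most sensitive, and rises steeply towards 16 kHz. An aliased component whose level lies below this curve at its frequency can be treated as inaudible.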

typically greater for a tonal masker [33]. It is assumed that the signals are played at 96 dB SPL (sound pressure level) to accommodate the harmonic peaks in the spreading function. The reference was set to 96 dB SPL for a sinusoid alternating between 1 and -1.

It is found that the highest perceptually alias-free note is MIDI number 103 (G7), 3136 Hz.
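The frequencies quoted for MIDI notes in this chapter (8 Hz to 12544 Hz for notes 0 to 127, and 3136 Hz for note 103) follow the standard equal-temperament mapping. A minimal Python sketch, assuming the usual tuning of A4 = 440 Hz at MIDI note 69:

```python
def midi_to_hz(note, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note number (A4 = note 69)."""
    return a4_hz * 2.0 ** ((note - 69) / 12.0)
```

Note 103 lies 34 semitones above A4, which gives approximately 3136 Hz.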
