Heart’s Hearing - 音樂節奏特性與心率變異性之關聯性研究及其硬體實現

2 Background

2.2 Heart’s Hearing

It is a generally accepted concept that our emotion and physical state change when listening to different music. For example, Rock music makes someone feel vigorous and increases the heart rate and Jazz music makes someone feel lethargic and slows his breath down. Is there any physiological pathway which links the perception of music with the responses of ANS? Research has revealed that the heart rate can be controlled by external stimuli [4]. However, following several years of research, it was observed that, the heart communicates with the brain in ways that significantly affect how we perceive and react to the world. Neurophysiologists discovered a neural pathway and mechanism whereby input from the heart to the brain could inhibit or facilitate the brain’s electrical activity [20]. From these researches the connection and communication between brain and heart are established.

On the other hand, it is long known that changes in emotions are accompanied by predictable changes in physiological state such as heart rate, blood pressure, respiration and digestion. When someone is aroused, his sympathetic division of the autonomic nervous system energizes him for fight or flight. When someone is in quiet times, the parasympathetic

division cools him down. Based on theses researches, it can be sure that there must be some connection between the music perception and the physical responses.

Chapter 3 Approach

3.1 HRV Signal Processing Flow

The overall HRV signal processing flow is shown in Fig. 3.1. The processing methods of each block are detailed in the following subsection.

3.1.1 Acquirement of the ECG signals

The ECG signal is captured by a 3-channel portable device (MSI E3-80, FDA 510(k) K071085) at 500Hz sampling rate from the chest surface of body shown in Fig. 3.2(a) (b) [21].

Only the channel1 (L1) data were taken to be analyzed. In the previous HRV studies, the

Fig. 3.1: The block diagram of the overall signal processing flow of HRV analysis

Fig. 3.2(a): MSI E3-80 Fig. 3.2(b): Electrodes placement

duration of recording was dictated by the nature of each investigation. And it is recommended that the recording of approximately 1min is needed to assess the HF components of HRV while approximately 2min are needed to address the LF component [3]. Therefore, 2min duration is taken as the shortest unit to be compared in this study.

3.1.2 QRS detection

As the introduction in 2.1.3, the QRS complex is the most notable waveform within the electrocardiography (ECG). Since it reflects the electrical activity within the heart during the ventricular contraction, the time of its occurrence as well as its shape provides much information about the current state of the heart. Due to its characteristic shape it serves as the basis for the automated determination of the heart rate. Therefore, QRS detection provides the fundamentals for almost all automated ECG analysis algorithms.

Within the last decade many new approaches to QRS detection have been proposed; for example, algorithms from the field of artificial neural networks, genetic algorithms, wavelet transforms, filter banks as well as heuristic methods mostly based on nonlinear transforms.

The detailed review and comparison of these methods were presented [22]. The detection algorithm described by Hamilton and Tompkins is adopted in this work because of its high reliability and low computational load [23-25].

The ECG waveform contains, in addition to the QRS complex, P and T waves, 60-Hz noise from power line interference, EMG from muscles, motion artifact from the electrode

Fig. 3.3: Relative power spectra of QRS complex, P and T waves, muscle noise and motion artifacts based on an average of 150 beats

and skin interface, and possibly other interference from electrosurgery equipment in the operating room. Many clinical instruments such as a cardiotachometer and an arrhythmia monitor require accurate real-time QRS detection. It is necessary to extract the signal of interest, the QRS complex, from the other noise sources such as the P and T waves. Fig. 3.3 summarizes the relative power spectra of the ECG, QRS complexes, P and T waves, motion artifact, and muscle noise based on the previous research [25].

The signal processing flow of QRS detection and the corresponding results are shown in Fig. 3.4 and Fig. 3.5. There are two main stages in the QRS detection flow. One is the preprocessing stage which is composed of various filters for removing noise and acquiring the QRS complex information. The other stage, peak detection, makes use of the information acquired by the preprocessing stage and some criteria to detect the QRS complex peaks. In the beginning of the preprocessing stage, the band-pass filter is used to reduce the influence of muscle noise, 60 Hz interference, baseline wander, and T-wave interference. The desirable pass-band to maximize the QRS energy is approximately 5-15 Hz [25].

Fig. 3.4: QRS peak detection flow

Fig. 3.5: Results of each step of Fig. 3.4

The band-pass filter is composed of cascaded low-pass and high-pass filters. Their difference equations are listed as (3.1). The performance details of the low-pass filter and high-pass filter are shown in Fig. 3.6 and Fig. 3.7. The amplitude response of the band-pass filter which is composed of the cascade of the low-pass and high-pass filters is shown in Fig. 3.8. The center frequency of the pass-band is at 10 Hz. The amplitude response of this filter is designed to approximate the spectrum of the average QRS complex as illustrated in Figure 12.1. Thus this filter optimally passes the frequencies characteristic of a QRS complex while attenuating lower and higher frequency signals.

( ) ( ) ( ) ( ) ( ) (

Fig. 3.6(a): Amplitude response of the low-pass filter

Fig. 3.6(b): Phase response of the low-pass filter

Fig. 3.7(a): Amplitude response of the high-pass filter

Fig. 3.7(b): Phase response of the high-pass filter

Fig. 3.8: Amplitude response of band-pass filter composed of low-pass and high-pass filters

After the signal has been filtered, it is then differentiated to provide information about the slope of the QRS complex. This derivative is implemented with the difference equation (3.2). The performance characteristics of this derivative implementation are shown as Fig. 3.9.

The amplitude response approximates a true derivative up to about 20 Hz. This is the important frequency range since all higher frequencies are significantly attenuated by the band-pass filter.

After differentiation, the signal is squared point by point. The equation of this operation is shown as (3.3). This makes all data points positive and dose nonlinear amplification of the output of the derivative emphasizing the higher frequencies.

( ) ( )

^{1 8 2}

( ) ( ) (

)

(

⁴

)

y nT x nT x nT T x nT T

Deriva

x n tiv

T T

= ⎡⎣ + − − − − − ⎤⎦ (3.2)

( ) ( )

Squaring Functio

y nT x nT

= ⎡⎣ ⎤⎦ (3.3)

Fig. 3.9(a): Amplitude response of the derivative

Fig. 3.9(b): Phase response of the derivative

The slope of the R wave alone is not a guaranteed way to detect a QRS event. Many abnormal QRS complexes that have large amplitudes and long durations (not very steep slopes) might not be detected using information about slope of the R wave only. Thus, we need to extract more information from the signal to detect a QRS event. Moving window integration extracts features in addition to the slope of the R wave. It is implemented with the following difference equation (3.4). For a sample rate of 500 sps, the integration window chosen for implementation in the thesis is 64 samples wide (which correspond to 128 ms).

After the preprocessing stage, the peak detection stage detects peaks in the signals after moving window average. The corresponding relation between ECG raw data and the signals after moving window average is shown in Fig. 3.10. The detection algorithm stores the maximal levels encountered in the signal since the last peak detection like the red dots in Fig.

3.10. A new peak is defined only after a level is encountered that is less than half the height of the maximal level. Detection occurs halfway down the back side of the peak. This approach eliminates multiple detections from ripple around the wave peak. The peak detection algorithm does not establish that a valid peak has occurred until the middle of the falling slope when the level drops below half the distance from the maximal value to the base point.

Because the time between the middle of the rising slope and the middle of the falling slope is equal to the duration of the averaging window, ideally the R peak point representing the peak of the R wave is located with fixed delay of one window’s width. Each time a peak is detected it is classified as either a QRS complex or noise, or it is saved for later classification. This work uses the peak height and peak location to classify peaks. An outline of the basic detection rules in the peak detection stage are listed as follows

1. Ignore all peaks that precede or follow larger peaks by less than 200ms.

2. If the peak is larger than the detection threshold call it a QRS complex, otherwise call it noise.

( ) ( ) ( ) ( )

( ) 1 ( 1) ( 2) ...

y nT N x nT N T x nT N T x nT

where N is the number of samples in the width of the integration window Moving Window Integral

= ⎡⎣ − − + − − + + ⎤⎦

−

(3.4)

Fig. 3.10: The corresponding relation between ECG raw data and the signals after moving window average

3.1.3 Evaluating the QRS detection algorithm

Many algorithm of HRV analysis, such as heart rate calculation, PAV detection, and PVC detection, require a very accurate QRS recognition capability. Several standard ECG database are available for the evaluation of software QRS detection algorithms. Tests on these well-annotated and validated databases provide reproducible and comparable results.

Furthermore, these databases contain many selected signals representative for the large variety observed but clinically important. The MIT-BIH Arrhythmia Database is the most frequently used database. It contains 48 half-hour recordings of annotated ECG with sampling rate of 360Hz and 11-bit resolution over a 10mV range. Twenty-five recordings with less common arrhythmias were selected from over 4000 24-hour ambulatory ECG recordings, and the rest was chosen randomly. While some records contain clear R-peaks and few artifacts (e.g., records 100-107), for some records the detection of QRS complexes is very difficult due to abnormal shapes, noise, and artifacts (e.g., records 108 and 207).

The MIT-BIH Arrhythmia Database is acquired from the PhysioNet which offers free access via the web to large collections of recorded physiologic signals and related open-source software [26]. There are forty-eight recordings in this database. Each recording include annotations that indicate the times of occurrence and types of each individual heart

beat ("beat-by-beat annotations"). The standard set of annotation codes includes both beat annotations and non-beat annotations. Most PhysioBank databases use these codes as described as Table A.1 in Appendix A. According to [22], essentially three parameters should be used to evaluate the QRS detection algorithm. They are formulated as (3.5) where TP denotes the number of true positive detection, FN denotes the number of false negatives, and FP denotes the number of false positives. Therefore, TP represents the QRS detector successfully detects the beats which are coded by beat annotations, FN represents the QRS detector misses the beats which are coded by beat annotations and FP means the QRS detector detects the beats which are coded by non-beat annotations or non-existed actually.

In this study, all the forty-eight recordings in the MIT-BIH Arrhythmia Database are used to evaluate the QRS detector algorithm. Each recording records half-hour annotated ECG, but just first ten minutes data are used to evaluate the QRS detector performance for simplicity.

The evaluation result of each recording is listed in Table A.2 of Appendix A and the performance measures are listed in Table 3.1.

3.1.4 Abnormal Beats Rejection and Compensation

The heart beat is triggered mainly by the sinoatrial (SA) node controlled by the sympathetic and parasympathetic neural systems. In addition to the SA node, other latent

( ) Sensitivity TP

TP FN

Positive predictivity TP

TP FP

Detected QRS time Actual QRS time Average Time error ms

Table 3.1: The performance of simplified algorithm adopted in this work

Sensitivity Positive Predictivity Average Time Error(ms)

95.65% 99.36% 5.33

pacemakers exist throughout the heart. Normally, regular conduction of the electrical impulse from the SA node and the refractory period of the cells reject any other electrical source except those coming from the SA node. However, some of the additional pacemakers may, in certain cases, interpose additional electrical impulses that generate ectopic beats. Besides, QRS complex misdetections can generate a similar effect to that of ectopic beats in HRV analysis [27]. The detector errors can be false positive (FP) when a false beat is detected due to noise or a high amplitude T wave or false negative (FN) when a real beat is missed due to a low amplitude QRS or noise masking. The abnormal beats make the time associated with HRV exhibit a sharp peak and make the power spectral density estimation in the frequency domain analysis strongly unstable shown in Fig. 3.10(a).

In this study, the criterion based on the variation of the instantaneous heart rate is used as the abnormal beats detector [27]. The normal heart beat shows a band limited variation of the instantaneous heart rate. So, it is possible to impose a threshold TH on the derivative of the instantaneous heart rate to screen out the abnormal beats. The criterion is formulated as (3.6).

The threshold TH is set to 0.2 empirically in this study. When the criterion in (3.6) is not met for some peak time instant , it means that some position t_k t_k₋₁, t_k, or t_k₊₁ are abnormal. The six conditions which judge whether the anomalies were caused by QRS complex misdetections or not is checked all over the recorded data: by removing , removing , inserting an intermediate beat between

tk satisfied when removing, it implies a FP at the removal position; if the criterion is satisfied on insertion, this implies a FN and if satisfied when moving it typically implies an ectopic beat.

It can be found that almost all of the abnormal beats are produced by QRS complex

( )( )( )

misdetection from the collected data in the experiment.

There is a data filtering mechanism existing in this study for data accuracy and stability.

There are two criteria for each HRV analysis section in one ECG recording. First, the number of detected abnormal beats in each HRV analysis section must be lower than 5. Second, it must be confirmed that there is not any abnormal beat remaining after the abnormal beat processing in each HRV analysis section. If any one criterion is not satisfied in any analysis section of one ECG recording, the ECG recording will be looked as the unstable data and be abandoned. A simple method for abnormal beats processing is utilized and the detailed algorithm is formulated in Appendix B.

It can be seen in Fig. 3.10(b) that the more stable and accurate spectrum analysis can be obtained after the removing and compensating of these abnormal beats.

Fig. 3.11(a) Abnor al beats detection m Fig. 3.11(b) Abnorm l beats removement a and compensation

Fig. 3.12: The RR interval time series after 4Hz cubic spline interpolation.

3.1.5 Interpolation and De-trending

3.1.5.1 Interpolation

The RR interval time series is an irregularly time-sampled signal. This is not an issue in time domain analysis, but in the frequency domain analysis it has to be taken into account. If the spectrum estimate is calculated from this irregularly time-sampled signal, implicitly assuming it to be evenly sampled, additional harmonic components are generated in the spectrum. Therefore, the RR interval signal is usually interpolated before the spectral analysis to recover an evenly sampled signal from the irregularly sampled event series. The RR interval time series after interpolation is shown in Fig. 3.11. The 4Hz cubic spline interpolation is used in this study [28].

The fundamental idea behind cubic spline interpolation is based on the engineer’s tool used to draw smooth curves through a number of points as shown in Fig. 3.12. The mathematical spline is similar in principle. The points, in this case, are numerical data. The weights are the coefficients on the cubic polynomials used to interpolate the data. These

Fig. 3.13: The fundamental idea behind cubic spline interpolation

coefficients ’bend’ the line so that it passes through each of the data points without any erratic behavior or breaks in continuity. The essential idea is to fit a piecewise function of the form shown as (3.6). And is a third degree polynomial function defined by (3.7) for

1, 2, , 1 i= … n−

In this work, “natural splines” which include the stipulation that the second derivative be equal to zero at end point is adopted. By the four properties of cubic splines listed in (3.8), the weights can be determined by the matrix equation (3.9) and (3.10). The iterative method to solving M is shown as (3.11) for the future hardware implementation. _i

( )

Fig. 3.14: The RR interval time series after de-trending.

3.1.5.2 Detrending

Heart rate variability (HRV) is widely used quantitative marker of autonomic nervous system activity. Various time and frequency domain methods have been applied to HRV analysis. A traditional spectral method, power spectral density (PSD) estimation, provides information about power distribution as a function of frequency. Spectral estimation inherently assumes that the signal is at least weakly stationary. However, real HRV is usually non-stationary. Non-stationarities like slow linear or more complex trends in the HRV signal can cause distortion to time and frequency domain analysis. Origins for non-stationarities in HRV are discussed [15]. The method tries to remove the slow non-stationary trends from the HRV signal before analysis is called de-trending. The detrending is usually based on first-order or higher order polynomial models. In this thesis, an advanced detrending procedure based on smoothness priors approach is adopted [29]. The main advantage of the method is its simplicity. The frequency response of the method is adjusted with a single parameter. This smoothing parameter should be selected in such a way that the spectral components of interest are not significantly affected by the detrending. The RR interval time series after de-trending is shown in Fig. 3.13. The detailed processing flow of detrending is explained as follows:

The RR interval time series is denoted as (3.12). The detrended nearly stationary RR series can be calculated as (3.13) where the second-order difference matrix D₂∈ ⁽^N^{− ×}³^{) (}^N⁻¹⁾

(

^, ^,..., ^T ^N ¹

is shown as (3.14). The frequency response of the detrending method is detailed as follows.

Equation (3.13) can be written as ˆz_stat =L , where ^L^{= − +}^I

(

^I ^λ²^{D D}²^T ²

)

⁻¹ corresponds to a time-varying finite-impulse response high-pass filter. The frequency response of for each discrete time point, obtained as a Fourier transform of its rows, is presented in Fig. 3.14. The filtering effect is attenuated for the first and last elements of and, thus, the distortion of end points of data is avoided. The effect of the smoothing parameter

λ on the frequency response of the filter is presented in Fig. 3.15. The cutoff frequency of the filter decreases when λ is increased. Besides, the λ parameter the frequency response naturally depends on the sampling rate of signal z . Because each RR series is first interpolated to obtain a regularly sampled series with sampling rate of 4Hz, the smoothing parameter λ is set to 300, which equals a cutoff frequency of 0.043 Hz.

Fig. 3.15: Time-varying frequency response of ^{L N}

(

^{− =}^{1 50}^and^λ⁼¹⁰

)

. Only the first half of the frequency response is presented, since the other half is identical

Fig. 3.16 Frequency responses, obtained from the middle row of L, for λ = 1, 2, 4, 10, 20, 50, and 300. The corresponding cutoff frequencies are 0.189, 0.132, 0.093, 0.059, 0.041,

0.025, and 0.011 times the sampling frequency

3.1.6 Measures of Heart Rate Variability

There are many measures and analyzing methods of heart rate variability have been proposed such as time domain analysis, frequency domain analysis, linguistic analysis, etc.

在文檔中音樂節奏特性與心率變異性之關聯性研究及其硬體實現 (頁 25-0)