3 Method
3.3 Model Setting and Data Sampling
The study area is parameterized with 33×33×33 nodes, with a horizontal spacing of about 36.26 km and a vertical spacing of 31.25 km. The model space is centered on the Rio Grande Rift. Laterally it encloses all the stations used, and vertically it extends from the surface to a depth of 1000 km (Fig. 3-2).
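As a quick consistency check on the grid described above, the following minimal sketch lists the node coordinates along one horizontal and the vertical axis. The function name and the choice of a local Cartesian origin at 0 km are assumptions made only for illustration; the node counts and spacings are taken from the text.

```python
# Illustrative sketch only: names and the 0 km origin are assumptions,
# not the parameterization code used in the study.

def node_coordinates(n_nodes, spacing_km):
    """Return n_nodes equally spaced coordinates (km), starting at 0."""
    return [i * spacing_km for i in range(n_nodes)]

N_NODES = 33     # nodes per dimension (from the text)
DX_KM = 36.26    # approximate horizontal spacing (from the text)
DZ_KM = 31.25    # vertical spacing (from the text)

depths = node_coordinates(N_NODES, DZ_KM)    # 0 km ... deepest node
offsets = node_coordinates(N_NODES, DX_KM)   # lateral offsets from one edge

total_nodes = N_NODES ** 3                   # number of model parameters

print("deepest node (km):", depths[-1])
print("lateral extent (km):", round(offsets[-1], 2))
print("total nodes:", total_nodes)
```

With 33 nodes there are 32 intervals per axis, so the vertical spacing of 31.25 km reproduces the stated 1000 km depth extent, and the horizontal spacing implies a lateral extent of roughly 1160 km.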
To assess the spatial distribution of the data sampling from the perspectives of both ray theory and finite-frequency theory, the diagonal elements of the product of the Gram matrix A with its transpose, diag(A^T A), are calculated as a measure of sampling density. Each element of diag(A^T A) is the sum of the squared path lengths of the 1-D rays, or of the squared values of the 3-D volumetric kernels, that contribute to a given node. Velocity variations at nodes with densely crossing rays or large-amplitude sensitivity kernels therefore have the potential to be better resolved.
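The sampling-density measure described above can be sketched as follows. This is a minimal illustration, assuming each row of A is stored sparsely as a mapping from node index to sensitivity value; the function name, the storage format, and the toy numbers are hypothetical and not taken from the study.

```python
import math

# Illustrative sketch only: the sparse-row format and all values are assumptions.

def diag_ata(rows, n_nodes):
    """Diagonal of A^T A: for each node (column), sum the squared entries
    over all measurements (rows) that touch that node."""
    diag = [0.0] * n_nodes
    for row in rows:
        for node, value in row.items():
            diag[node] += value * value
    return diag

# Toy matrix with 3 measurements and 4 nodes (made-up numbers).
rows = [
    {0: 3.0, 1: 4.0},   # measurement 1 samples nodes 0 and 1
    {1: 2.0, 2: 1.0},   # measurement 2 samples nodes 1 and 2
    {1: 1.0},           # measurement 3 samples node 1 only
]

density = diag_ata(rows, 4)

# Logarithmic scale for plotting; unsampled nodes have no finite logarithm.
log_density = [math.log10(v) if v > 0 else None for v in density]

print(density)
```

Node 1 is touched by all three measurements and therefore has the largest value, while node 3 is never sampled and stays at zero.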
Fig. 3-3 presents three vertical cross sections and six constant-depth maps of the diag(A^T A) values on a logarithmic scale. The area of highest sensitivity lies beneath the La Ristra array, along which the stations were densely deployed and operated for the longest period. In the ray-theoretical sensitivity plots, the ray-path trajectories are clearly visible, especially for those recorded by the La Ristra array. The very narrow sensitivity, confined primarily to the ray paths, results in the heavily fragmented images seen in Fig. 3-3(a) and (c). Apart from the La Ristra array, the USArray, with its evenly spaced stations, provides fairly uniform sensitivity above 400 km depth. Below 400 km, only the intense ray bundles recorded by the La Ristra array remain.
In contrast, the sampling of finite-frequency kernels is more homogeneous because of their banana-doughnut shapes and broad cross-path widths. When only the high-frequency travel-time data are included, the sampling distribution is broadly similar to that given by ray theory, as shown in Fig. 3-3(b). Once the sensitivity from the low-frequency data is added, however, the differences become immediately apparent in Fig. 3-3(d): regions that no rays pass through can still be “felt” by the off-path sensitivity of finite-frequency waves passing nearby.
There is no standard criterion for comparing models based on different data rules. Moreover, for a family of models obtained with different regularization parameters, there is always a tradeoff between data fit and model complexity. Therefore, a conventional tradeoff analysis between data fit and model norm or variance is conducted to choose the optimal model. The goodness of fit of a model to the observed data is evaluated by the variance reduction, VR, calculated from
\[ \mathrm{VR} = \left( 1 - \frac{\sum_i (d_i - \hat{d}_i)^2}{\sum_i d_i^2} \right) \times 100\% \]
where d_i is the ith travel-time shift and \hat{d}_i is the corresponding prediction from the chosen model. Figs. 3-4 and 3-5 show the tradeoff relations for the resolved P and S models, respectively. The tradeoff curves between variance reduction and model variance for the different forward theories and parameterizations used in the inversion are shown on the left, while those between variance reduction and model norm are shown on the right. Three parameterization methods are compared in panels (a) and (b) of these two figures: damped least squares (simple), multi-scale, and a hybrid of horizontal multi-scale and vertical convolutional quelling. Regardless of the data rule or forward theory used, the multi-scale parameterization always achieves the highest variance reduction, whereas the simple damping models give the worst data fit at the same model variance.
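The variance reduction used in these tradeoff curves can be illustrated with a short sketch. It assumes the commonly used definition VR = 1 − Σ(d_i − d̂_i)² / Σ d_i², expressed here as a fraction rather than a percentage; the function name and the toy travel-time shifts are hypothetical.

```python
# Illustrative sketch only: assumes VR = 1 - sum((d - d_hat)^2) / sum(d^2).
# The data values below are made up for demonstration.

def variance_reduction(observed, predicted):
    """Fraction of the data variance explained by the model predictions."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    residual = sum((d - p) ** 2 for d, p in zip(observed, predicted))
    total = sum(d * d for d in observed)
    if total == 0:
        raise ValueError("observed data have zero variance")
    return 1.0 - residual / total

observed = [1.0, -2.0, 0.5, 1.5]     # toy travel-time shifts (s)
predicted = [0.5, -1.5, 0.5, 1.0]    # toy model predictions (s)

vr = variance_reduction(observed, predicted)
print(f"variance reduction: {100 * vr:.1f}%")
```

A value of 1 (100%) means the predictions reproduce the data exactly, and a value of 0 means the model explains none of the observed variance.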
When the model norm is used instead for the tradeoff analysis, finite-frequency tomography yields a larger model norm than ray-based tomography at the same variance reduction. Figs. 3-4(c) and 3-5(c) compare the tradeoff curves of the ray-derived and finite-frequency kernel-derived models obtained with multi-scale parameterization. In general, the ray-based models have higher variance reductions and hence better data fits, while the finite-frequency models yield larger amplitudes of velocity heterogeneity.
The preferred model usually lies near the point of maximum curvature on the tradeoff curve. With the corresponding damping value, one obtains a reasonably good variance reduction without radically increasing the model norm or variance. In the following, I choose models with an equally good fit to the data, i.e., the same variance reduction, for further comparison of the resulting velocity structures. The final chosen models have a variance reduction of ~70% for the P-wave model and ~60% for the S-wave model. Because the low quality of the long-period P data would reduce the data fit substantially, only high-frequency travel-time data are used to obtain the P-wave model.
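One way to locate the point of maximum curvature on a sampled tradeoff curve is sketched below. This is not necessarily the procedure used in this study: it assumes the curve is sampled at a handful of damping values, normalizes both axes to [0, 1], and uses the discrete (Menger) curvature of three consecutive points. All names and numbers are hypothetical.

```python
import math

# Illustrative sketch only: the curvature estimator and toy values are assumptions.

def menger_curvature(p, q, r):
    """Curvature of the circle through three 2-D points (0 if collinear)."""
    twice_area = abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]))
    sides = math.dist(p, q) * math.dist(q, r) * math.dist(p, r)
    return 0.0 if sides == 0 else 2.0 * twice_area / sides

def normalise(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must not all be equal")
    return [(v - lo) / (hi - lo) for v in values]

def knee_index(model_norm, var_reduction):
    """Index of the interior sample with the largest discrete curvature."""
    if len(model_norm) != len(var_reduction) or len(model_norm) < 3:
        raise ValueError("need at least three matching samples")
    pts = list(zip(normalise(model_norm), normalise(var_reduction)))
    curv = [menger_curvature(pts[i - 1], pts[i], pts[i + 1])
            for i in range(1, len(pts) - 1)]
    return 1 + max(range(len(curv)), key=curv.__getitem__)

# Toy tradeoff curve ordered from strong to weak damping (made-up values).
model_norm = [0.0, 1.0, 2.0, 3.0, 4.0]
var_reduction = [0.0, 0.6, 0.7, 0.75, 0.8]

knee = knee_index(model_norm, var_reduction)
print("knee at sample", knee)
```

Normalizing the axes matters because curvature depends on the relative scaling of model norm and variance reduction; with different units or ranges the selected point can shift, so the result is best treated as a guide rather than a unique optimum.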
Fig. 3-2 Topographic map of the study area, with the grid discretization used for model parameterization denoted by gray lines. The region of interest covers the northern and middle parts of the RGR, the Colorado Plateau, and the Great Plains to the east. The grid spacing is about 36 km in the horizontal directions and 31 km in the vertical direction. Inverted white triangles indicate the station distribution. Black lines outline the Colorado Plateau and the Rio Grande Rift. The Jemez lineament, a volcanic belt as young as 4 Ma, is filled in red.
(a) High-frequency P+PcP+PKPdf ray sensitivity
(b) High-frequency P+PcP+PKPdf kernel sensitivity
Fig. 3-3. Comparison between (a) ray-theoretical sensitivity and (b) finite-frequency kernel sensitivity from high-frequency P, PcP, and PKPdf travel-time data. The sensitivity from the 3-D kernels is more evenly distributed and shows less pronounced ray-path trajectories from nearby earthquakes to dense stations than the sensitivity of the 1-D rays.
(c) Low-frequency S+ScS ray sensitivity
(d) Low-frequency S+ScS kernel sensitivity
Fig. 3-3 (Continued) Comparison between (c) ray-theoretical sensitivity and (d) finite-frequency kernel sensitivity from low-frequency S and ScS travel-time data. Because the frequency content of the S-wave data is lower, the 3-D kernel sensitivity is more homogeneous owing to the fatter banana-doughnut geometry. The 1-D ray sensitivity remains fragmented because of its infinitesimally thin, frequency-independent width.
(e) P+PcP+PKPdf kernel sensitivity
(f) S+ScS kernel sensitivity
Fig. 3-3 (Continued) Finite-frequency sensitivity for P+PcP+PKPdf and S+ScS data including all the data measured at both high- and low-frequency bands.
Fig. 3-4. Tradeoff curves of variance reduction versus model variance (left) and variance reduction versus model norm (right) for the combined P, PcP, and PKPdf (P3) dataset. The three regularization methods are compared for (a) the ray-based models and (b) the finite-frequency models. The high-frequency ray- and kernel-based models are compared in (c) under the same multi-scale parameterization. Large dots in (c) indicate the damping values chosen for the optimal models.
Fig. 3-5. Tradeoff curves of variance reduction versus model variance (left) and variance reduction versus model norm (right) for the combined S and ScS dataset. See Fig. 3-4 for a detailed description of the figure arrangement.