Strong Scaling - Parallel Performance - Parallel Computing of QDS method

Chapter 3 Complex Gas Flow Simulations using the QDS-N 2 Method

3.5 Parallel Computing of QDS method

3.5.3 Parallel Performance

3.5.3.2 Strong Scaling

3.5.3.2 Strong Scaling

Strong scaling, which fixes computational domain size but increases the number of processors, is also used to measure the parallel performance of a given application.

Strong scaling is used to determine a reasonable length of time to process a relatively large application using many processors without excessive overhead cycles. Strong scaling efficiency is give as:

N t^N 100%

⎛ ⎞⎜ ⎟

⎝ ⎠ . (66)

where t is the time required to complete a job unit. The subscript indicates the number of processors. tN is the amount of time required to complete N job units with N processors.

In this section, in Tables 3-5 to 3-7, we present three strong scaling performance results for different numbers of cells, representing small, medium, and large-scale domains. These tables show the relationships among computation time, number of processors, and number of cells per processor. The largest number of processors we consider is 256. As shown in Figure 3-31, for a 2D shock-bubble problem using 500,000 cells and 2,000 time steps, there was a net increase in speed of approximately 78% when compared with ideal performance (red line). As illustrated in Figures 3-32 and 3-33, for 2 million and 12.5 million cells, the speed up was 68.5% and 65.5%,

respectively. On the other hand, if we compare results for the three cases when the number of cells per processor is equal, the results are approximately equal. For example, for 125,000 cells per processor, computation times are 2761.14 s, 2890.74 s, and 2806.29 s. This indicates that, regardless of the size of the computational domain, we can obtain a rational answer using the same number of cells for each processor.

3.6 Brief Summary

The major findings of the study of the QDS algorithm for the N² and 2N methods can be summarized as follows:

1. The one-dimensional QDS method extended to a higher order method is useful for simulating the shock acoustic wave problem and can be used to compute higher-order problems.

2. Using the N² method in a two-dimensional domain obtains results comparable to the benchmark. The accuracy achieved using the N² method is considerably better than that achieved using the 2N method.

3. The N² extension of the QDS method requires approximately three times the computation time compared to the original 2N method for the same number of cells.

4. For both the 2N and the N² methods, when solving the Euler equation, accuracy effectively increases with increasing grid resolution. However, it is essentially the same when N ≥ 3.

5. The advection of vertical distribution test case results show that both methods are essentially equally accurate; however, the proposed the N² solver is approximately 25 times faster than the 2N solver.

6. The QDS method is suitable for three-dimensional computation and can be applied to three-dimensional simulation problems.

7. Parallel performance studies, including strong and weak scaling, show that the parallel efficiency for a large-scale problem (0.5, 2, and 12.5 million cells) can reach up to 75%, 68.5%, and 65.5%, respectively, using 256 processors at the NCHC APLS cluster.

Chapter 4 Analysis of 2NMethod as Compared to QDS-N

Method

4.1 Introduction

In the previous section, we identify the difference between 2N and N² method by comparing with several numerical methods as benchmark. Since both one-dimensional 2N and N² method are theoretical the same. The major difference of both methods is mathematically the flux reconstruction travelled to the diagonal cell in two-dimension.

In this chapter, we use an analytical aspect to discuss the difference and to observe which flow condition is suitable for N² method or 2N method for simulation.

4.2 Derivation of Analytical Fluxes of QDS Methods (2N vs. N²) 4.2.1 Mass Flux

According to chapter 2, the different part of two methods is flux reconstruction in two-dimension. Here we present the fluxes value which contributed to correct cell after a time step ∆t when the source cell travelled to the diagonal destination cell in Figure 1.

The net of mass fluxes M2N obtained from 2^nd order 2N method shows blow.

2 2 energy) for 2^nd order in x and y-direction . Those are described blow,

( )

On the other hand, the analytical mass flux equation MN2 obtained from 2^nd order N²method is

As we can see, the difference of 2N and N² method for net mass flux form source

2 2 N

MN −M

(77)

The difference of 2N method can be identified form Eq.(78).

2 two-dimensional reconstruction of mass flux.The momentum flux for the 2N method in x-component and y-x-component can be described as follows:

2 2

To simplify the equation of momentum fluxes, those equations are described with the net of mass flux reconstruction. The net momentum fluxes in two components are described as follows:

2 _ 2

_ x N

Px N −P , Py N_ 2 −Py_ 2N (83) The difference of 2N method can be identified from Eq. (83):

2 2

Follow the previous section, the energy fluxes of the 2N method are divided with Eq. (68) ~ Eq. (71), shown in below:

where Ω is the number of simulated degrees of freedom (i.e. in this chapter, Ω = 2). The energy fluxes in N² method is described by following equation:

2 2

Therefore, the net energy flux is combined with Eq. (85) and Eq. (86) that can be formulated as

2 2N

EN −E ⁽⁸⁷⁾

According to Eq. (87), the difference of two methods can be shown that

4.3.1 Diagrams of Relative Difference Distribution

In order to estimate the difference between the two methods, the unknown elements in the equation have to be defined. The test cases that we simulate contain a domain of 1000 × 1000 cells; the cell size in the x and y directions is 1.0e^-3 (∆x = ∆y = 1.0e^-3), and the time step ∆t is equal to 1.0e^-4. The density, velocity, and temperature are changed from 0.1 to 5.0, which are observed for the difference in the fluxes. Therefore, the CFL number is calculated to be 0.01–0.5 on the basis of the changing value of the density, velocity, and temperature. The gradients for all the conserve properties are assumed to be 1.0e^-5 and 1.0e^-6 for two cases. Therefore, on the basis of our assumptions, we calculate the difference in density (∆ρ) to be 1.0e^-2 for the first case by using the mass gradient shown in Eq. (67). Furthermore, the difference in velocity (∆v, ∆u) and temperature (∆T) are obtained in the same manner. For another case, the difference in the density, velocity, and temperature are 1.0e^-3.

The purpose of this simulation is to estimate the difference in the fluxes between the 2N and the N² methods in terms of the mass flux, momentum flux, and energy flux

on the basis of the variations of the density, velocity, and energy observed when the source cell travels to the diagonal destination cell. We compare two cases in which the differences in the density, velocity, and temperature are 1.0e^-2 and 1.0e^-3, respectively.

Figure 4-1a shows the mass value of the difference between the 2N and the N² method in the density, velocity, and temperature range from 0.1 to 5.0. Most of the significant difference in the mass flux is observed in the low-density range, and in particular, the largest difference is observed when the temperature is equal to 0.1 and the velocity is increased to 2.5. Another range in which we can clearly see the difference is the low-density and low-velocity range in which the temperature is either low or high. The tendency of case 2 for the largest difference is shown in Figure 4-1b.

The other results of the comparison of the difference between the two methods with respect to the momentum and the energy are shown in Figures 4-2 to 4-4.

As we can see from the figures, the highest value of the difference is obtained in the low-density range. Figure 4-5 shows two cases when the difference value is in the low-density range. We compare velocity with the value of mass flux for three temperatures that each of them has the lowest point which is 1.2, 2.1, and 2.6. Even case 2 has the same lowest point as case 1.

Therefore, when the simulation domain is in the low-density, low-temperature, and high-velocity range, the accuracy obtained using the N² method is considerably better than that obtained using the 2N method.

4.3.2 Example with Large Difference

In the previous section, we discussed that most of the different fluxes between 2N and N² are around two parts: the first part is at the low density, low speed, and the second part is at the low density, low temperature, and high speed as shown in Figures

4-1 to 4-4. In order to validate the discussion, section 4.3.1 is provided; here, we discuss four test cases, namely shock-bubble interaction, Euler-four-shock, and Mach 3 over a forward step and a backward step.

4.3.2.1 Shock-Bubble Interaction

The first test case is demonstrated in the same manner as the preceding case discussed in Section 3.3.1. The comparisons shown in the schlieren image show that the accuracy obtained by using the N² method is considerably better than that obtained by using the 2N method. Although both methods can identify the right position for the shock wave, the result obtained using the 2N method does not descript specifically at a complex place, especially in the bubble. In this section, we discuss in more detail, the conditions in the fluid that lead to a more significant difference between the two methods. Figure 4-6 shows the four contours of the N² method, namely density, temperature, and two dimensions of velocity. The result obtained is the same as that shown in Figure 3-6. According to Figure 3-6, most differences are almost around the x

= 0.6~07 and y = 0.1~0.15 region. In order to analyze the result, the properties around the bubble that attacked after the shock wave are considered to be low density, high temperature, and high velocity. In order to observe higher difference of two methods, we decrease the temperature of the bubble from 10 to 2 initially. The schlieren result shows that the N² method descripts more details than the 2N method at a low density and a low temperature as shown in Figure 4-7. The density and the temperature contours of this result are illustrated in Figures 4-8 and 4-9. The temperature contour shows a more considerable difference between the two methods than the density contour. As expected, the difference between the two methods is more distinct in the complex

region. Therefore, the complex region provides more specific results in the low-density environment in the case of the N² method.

4.3.2.2. Euler-Four-Shock Simulation

The analytical fluxes that we discussed in the previous section are focused on the diagonal direction. The first case gave a hint about the difference between the two methods in a complex fluid. Therefore, the second test case combines the diagonal direction and the complex flow introduced in Salichs [2006]. The benchmark is used for computing the numerical solution employing the piecewise hyperbolic method-Marquina’s ﬂux formula (PHM-MFF) and the power PHM-MFF schemes. The test problem is initially divided into four quadrants sharing a common corner at 0.75 and 0.75 in the domain [0, 1] × [0, 1], as illustrated in Figure 3-8. These quadrants initially have the relationship shown in Eq. (57).

The result that we obtained using the N² solver on a computational grid of 1000 × 1000 cells at the time of 0.4 is similar to that obtained using the total variation diminishing-monotone upstream centered schemes for conservation laws (TVD-MUSCL) [Čada et al., 2009] (see Figure 3-10a). The Courant–Friedrichs–Lewy (CFL) factor is set as 0.5. All three approximations obtain the basic structure of the solution where the four shock waves interact. Three results show that the position of the shock wave is the same as that shown in Figure 3-10. However, the area from (x, y) = (0.2, 0.2) to (0.4, 0.4) is not for the 2N method that has less variation. The differences between the N² and the 2N methods are observed in the contact area. In the same way, in order to analyze the differences, we observe four properties of the contour results of the N² method in Figure 4-10. The results indicate that the density and the temperature in the contact area are lower than those in any other area. According to Figure 4-10, it is

reasonable to explicate why the accuracy of the N² method is better than that of the 2N method.

4.3.2.3 Mach 3 Flow over a Forward Step

This case is designed for a high-speed velocity model through a forward step and involves complex oblique shock reflections. The inlet and the outlet boundary conditions in the domain are both supersonic flows. The geometry of the test case is illustrated in Figure 3-16, and the result of the flow is shown in Figure 3-17. The purpose of this test case is to demonstrate that the two methods obtain approximately the same result. Since a normal shock wave moves along the x direction, the flux reconstruction calculated in the diagonal direction is not considered to be significant.

Based on the discussion presented in Section 4.3.1, we know that most of the differences in the flux can be observed in the diagonal direction. Moreover, the oblique shocks usually occur with an increase in the temperature, pressure, and density. This is contrary to the definition of the difference between both the QDS methods carried out under low-density, low-temperature, and high-velocity conditions. Figures 11 and 4-12 show that the density and the temperature contours use the same initial flow and boundary conditions as those discussed in Section 3.3.4. The results revealed a considerably high density and temperature; few results of low density and temperature were observed behind a corner. The most obvious discrepancy between the 2N method, the N² method, and Keats’s results is the location of the shock reflection on the upper wall. The point located at x = 2.4 is the contact point between an incident shock and a reflected shock obtained using the N² method; it is at the same position as that in Keats’s result. Consequently, the results reveal two not-so-apparent differences: first,

the flow properties as low velocity, high density, and high temperature; and second, the flow around the shock wave and the oblique shock.

4.3.2.4 Shock Wave Diffraction over a 90-degree Sharp Corner

Different from the preceding geometric configuration, the oblique shock does not occur in this test case instead of secondary physical phenomena such as a second shock, contact surface, and vortex. The last case discussed in this chapter involves the use of the same geometrical and boundary conditions as those considered in the case discussed in Section 3.3.5. The initial conditions of the flow for Ms are listed in Table 3-3.

According to the explanation of the structure of the perturbed region by Skews [1967], the location of the vortex is well defined for Ms < 1.5. Therefore, in order to observe more secondary physical phenomena in this test case, we discuss the case of shock wave diffraction by using the initial flow velocity of Ms = 2.4. The experimental result obtained by Schardun [Dyke, 1997] is selected as the benchmark. The flow structure around a perturbed region is outlined in Appendix A. Because the comparison has an experimental result as a benchmark, the resolutions of the simulation are calculated using a considerably fine cell: 1000 × 1000 uniform grids. The CFL number is set to 0.2.

The simulation time is calculated using the same principle as that discussed in Section 3.3.4 and is t = 0.775.

Figure 3-13 shows three schlieren results using the second-order 2N method, N² method, and the experimental result separately. These results can be used for gauging the ability of these methods to detect the expansion region by juxtaposing them with the three results shown in Figure 3-18. The shape of the primary shock wave shown in Figure 3-13b and c matches the experimental result. The secondary shock waves obtained using the two methods are accurately located at the correct position behind the

wall and between the slipstream and the contact surface corresponding to the experimental result. The accuracy of the second shock wave in both the results is clear.

However, the phenomena in the vortex and the contact surface of both the results are contrary to the second shock. It is obvious that the vortex obtained using the N² method is presented in considerable detail than that obtained using the 2N method. The contact wave is considerably diffused as compared to that in the case of the N² method and the experimental result. The density, temperature, and velocity contours in this case are shown in Figures 4-14 to 4-16. These results show that the vortex belongs to the region with a low density, low temperature, and high velocity. The contact surface in the results is shown in the region of low density, low temperature, and low velocity. This is reasonable for supporting the theory in Section 4.3.1 that a more significant difference between the two methods shows the same trend as the large discrepancy is in the region of low density, low temperature, and low velocity.

4.4 Brief Summary

The major findings of the study of the difference analysis of the QDS-2N method presented in this chapter can be summarized as follows:

1. There are two regions of flow properties where a large discrepancy of conservative fluxes occurs between the QDS-2N and the QDS-N² methods. The first is in the region of low density (down to 1.0), low temperature (down to 2.5), and high velocity (up to 2.0). The second is in the region of low density and low velocity (down to 1.2).

2. The conservative fluxes of the QDS-2N method that move along the diagonal direction exhibit a considerably large difference as compared to the QDS-N²

method. In contrast, the conservative fluxes that move along the horizontal or the vertical direction exhibit a significantly smaller discrepancy.

3. The normal shock and oblique shock waves in the resolution obtained by using the two methods are approximately located in the same region, and most of the shock waves, as predicted by the QDS-2N method, move in the diagonal direction.

4. Because the properties of an after-expansion wave are low density, low temperature, and high velocity, we have found a large discrepancy between the QDS-2N method and the QDS-N² method.

Chapter 5 Conclusion and Recommendations of Future Work

5.1 Summary

In this thesis, a true-direction multi-dimensional higher-order extension of the QDS method, referred to as the QDS-N² method, for solving the inviscid Euler equation is investigated numerically and theoretically. The major findings of this thesis are summarized in the following two sections in turn.

5.1.1 Numerical Investigation of QDS-N² Method

1. The results of the one-dimensional shock and acoustic wave interaction problem demonstrate an improvement for higher orders of accuracy (up to third-order) of the QDS-N² method.

2. The QDS-N² method improves the solution in the flows unaligned with the computational grid as compared to the QDS-2Nmethod.

3. The QDS-N² method significantly reduced the amount of numerical dissipation within the solution as compared to the QDS-2Nmethod.

4. Despite the additional computational expense associated with the QDS-N² method for the same computational grid, for any given degree of accuracy, the proposed solver was found to be several times (up to 25 times in the case of the advection of vortical disturbances) faster than the original QDS-2N method.

5. Of particular interest is the test case of the advection of vortical disturbances, where the QDS-N² method improves the radial symmetry of the result

approaching the analytical solution, while the QDS-2N method failed to converge to the analytical solution even when a very fine grid is used.

6. The results are essentially the same when N≥ 3 because the integration of the Gauss function with a polynomial (degree ≤ 2) using the Gauss-Hermite integration technique becomes exact.

7. Parallel performance studies, including strong and weak scaling, show that the parallel efficiency of shock bubble interaction for a large-scale problem (0.5, 2, and 12.5 million cells) can reach up to 75%, 68.5%, and 65.5% respectively using 256 processors at the APLS cluster of National Center for High-Performance Computing, Taiwan.

8. Parallel performance of weak scaling shows that the average efficiency of shock

在文檔中應用真實流向概念重建通量法則於靜直接模擬法速解尤拉方程式的研究 (頁 66-0)