Figure 5.1: The experiment data in experiment–B (map of Taiwan and the South China Sea, 21°51’ N to 25°21’ N; ◇ marks the experiment data)

Table 5.2: Experimental results of t_E in B–I and B–II

Setting: 256-u-E1-O2, AvgDev(m): 0.5087

(−9.48899sin(φ)λ+1.92778e⁵)e³−(1.36694φ²−4.6575λ)e⁴

φ−2.54796eλ−1.49749e² −sin(φ)(φ−2.54796eλ−1.49749e^4.86773λ³^−8.13652e⁴ ²)+ 2

Setting: 256-s-E2-O2, AvgDev(m): 0.4013

TN[−2.65186sin(φ)e²−8.68171e³−1.68387e⁻¹⁷T_N³]−e⁸(2.52098φ²−3.56625e⁴)

φ−4.91899e⁵−TNe⁻²[4.00273−4.26238e⁻²sin(φ)] − 11.25

Setting: 512-s-E1-O2, AvgDev(m): 0.4028

λe³(4.87771λ−4.95387sin(φ))+e⁴(5.28864φ²−7.46274e⁴)

φ+0.0191201λ²−8.69869 − 1.6

Setting: 512-s-E2-O2, AvgDev(m): 0.6345

λ[−7.26026e³sin(φ)+5.22277e³λ]

φ+2.02448e⁻²λ² + ^φ[5.68214e_φ+2.02448e⁴^φ+5.16929e−2λ²²^sin(φ)]− _φ+2.02448e^8.01763e−2⁸ λ² + 0.2

Setting: 1024-u-E2-O2, AvgDev(m): 1.1053

TNe²(−5.24391e⁻³sin(φ)φ+2.68252)−[8.40946φ+1.51947sin(φ)]e⁸+5.93986e⁴φ³

φ²+7.86776e⁻⁴TN+6.01565e⁻³ + 1.2

{+, −, ×, ÷, sin, cos, tan}. For convenience, we use the notation [256|512|1024]-[u|s]-[E1|E2]-[O1|O2] to express the number of data points (256, 512, or 1024) extracted from each sampling block, whether they are sorted (s) or unsorted (u) by λ or φ, the experiment (E1 or E2), and the operator set used (O1 or O2). The best transformation equations obtained in this experiment are presented in Table 5.2 and Table 5.3. From the results, we infer that the operators in O2 provide better regression functions than the others. This may be because of the non-linear relationships between ellipsoidal coordinates and Cartesian coordinates.

Non-linear operators can describe the transformation functions better than basic, usually linear, operators alone. We also find that the trained equation for T_E performs better than the one for T_N. This may be caused by the biased distribution of the training data selected from I_E and I_N.
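The setting notation above can be read mechanically. The following is a minimal sketch, assuming labels are written with plain hyphens (for example `512-s-E2-O1`); the names `Setting` and `parse_setting` are hypothetical and are not part of the regression engine described in this work.

```python
import re
from dataclasses import dataclass

# Pattern for labels of the form [256|512|1024]-[u|s]-[E1|E2]-[O1|O2].
SETTING_RE = re.compile(r"^(256|512|1024)-(u|s)-(E1|E2)-(O1|O2)$")


@dataclass(frozen=True)
class Setting:
    n_points: int        # number of data points taken from each sampling block
    sorted_input: bool   # True for "s" (sorted by lambda or phi), False for "u"
    experiment: str      # "E1" or "E2"
    operator_set: str    # "O1" or "O2"


def parse_setting(label: str) -> Setting:
    """Split a setting label into its four components."""
    match = SETTING_RE.match(label.strip())
    if match is None:
        raise ValueError(f"not a valid setting label: {label!r}")
    n_points, order, experiment, operator_set = match.groups()
    return Setting(int(n_points), order == "s", experiment, operator_set)


if __name__ == "__main__":
    print(parse_setting("256-u-E1-O2"))
```

Keeping the four components in a small immutable record makes it easy to group results by operator set or by sorted versus unsorted input when comparing entries of Table 5.2 and Table 5.3.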

5.2.3 Experiment–A–II

We know that T_N and T_E are cross-referenced, i.e., T_N=f_N(~g, T_E) and T_E=f_E(~g, T_N). In this experiment, we include T_N and T_E together with (φ, λ, h) as input data. Operators

Table 5.3: Experimental results of t_N in B–I and B–II

Setting: 256-u-E1-O2, AvgDev(m): 2.5056

1.10767e⁵λ−2.3349e³

1+3.17788e⁻³sin(φ) + 7.25

Setting: 256-u-E2-O1, AvgDev(m): 2.4911

(1.10919λ²+1.02362)e⁵+TEe⁻⁷(7.72509TE−8.55344e⁴λ−1.87762e⁶)

λ+8.14987e⁻² − 1.5

Setting: 512-s-E1-O2, AvgDev(m): 2.5177

λ²(1.10868e²λ−5.95677)+5.08182e²

λ²(1.0e⁻³+3.17088e⁻⁷sin(φ))+[6.30707e⁻²λ−8.2436]e⁻⁵ + 8

Setting: 512-s-E2-O1, AvgDev(m): 2.5061

TE[+2.27391e⁻¹⁴T_E³+(1.10708e²λ−3.52056)e³]−e⁷(7.3863λ−110.574)

TE−e¹(1.42λ+6.81256) +_T2 ^1.71916e¹³

E−TEe¹(1.42λ+6.81256) − 2.5

Setting: 1024-u-E1-O1, AvgDev(m): 2.5883

1.75968e⁴λ−1.03562e³

1−4.63439e⁻³ + _φ2(1−4.63439e^4.54473e⁸^λ⁻³) − 0.35

Setting: 1024-u-E2-O1, AvgDev(m): 2.5132

TEλ²(1.10428T_E²−1.73612e⁷)e⁵−TEλ(1.68998TE+4.06722e²)e¹²+4.52439e²⁰

T_E²[λ(TE−2.11095e²)−1.52949e⁷]−1.22274e⁵ − 3.5

Figure 5.2: The average and maximum deviations between t_E and T_E (Avg. Dev. (m), axis from 0.0 to 3.0, for settings 256-u-E1-O2, 512-u-E1-O2, 1024-u-E2-O2, 256-s-E2-O2, 512-s-E1-O2, and 1024-s-E1-O2; series: Avg. Dev. of all blocks, Max. Dev. of all blocks)

are the ones in O1 and O2. To avoid biased selection of training data, we repeat this experiment 10 times with the same parameter settings. In this repeated training process, two modes of generating the initial population of trees are used: one is purely random; the other is based on the best tree obtained previously. The best transformation equations obtained in this experiment are also listed in Table 5.2 and Table 5.3. The average and maximum deviations of these two experiments are presented in Figure 5.2 and Figure 5.3. The introduction of the cross-reference between T_N and T_E improves the accuracy of training t_E. The average performance of t_N obtained in this experiment is better than that in experiment E1, but the opposite holds for t_E.

This phenomenon may be explained by the fact that t_E does not need to refer to T_N, but t_N does need to refer to T_E.

Interestingly, initial populations seeded with previously trained trees give better results than randomly generated initial populations in both convergence and accuracy. According to the results in Table 5.2 and Table 5.3, the order in which data are fed into the regression engine may be a significant factor: sorted data significantly improve the accuracy of T_E, but not that of T_N. We further study the distribution of data in the sampling blocks by examining their density, i.e., the number of sampled data points vs. the maximum number of samplable points. Figure 5.4 and Figure 5.5 present the accuracy of the regression functions when data are selected with different sampling densities. A better distribution in the selection of training data may be important in this method.
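The sampling density and the sorted or unsorted selection discussed above can be illustrated with a short sketch. It assumes that a sampling block is a list of (φ, λ, h) tuples and that the maximum number of samplable points per block is known; the function names and the example numbers are hypothetical and do not come from the experiments.

```python
import random


def sampling_density(n_sampled: int, n_max: int) -> float:
    """Sampling density in percent: sampled points vs. samplable points."""
    if n_max <= 0:
        raise ValueError("n_max must be positive")
    return 100.0 * n_sampled / n_max


def select_training_points(block, n_points, sort_key=None, seed=0):
    """Draw n_points from a block at random, optionally sorting the result.

    sort_key=None corresponds to the unsorted ("u") mode; passing a key
    function (for example, one that returns lambda or phi) corresponds
    to the sorted ("s") mode.
    """
    rng = random.Random(seed)
    chosen = rng.sample(block, n_points)
    if sort_key is not None:
        chosen.sort(key=sort_key)
    return chosen


if __name__ == "__main__":
    # Hypothetical block of 2000 synthetic (phi, lambda, h) points.
    block = [(120.0 + i * 0.001, 22.0 + i * 0.0005, 10.0) for i in range(2000)]
    sample = select_training_points(block, 256, sort_key=lambda p: p[1])
    print(len(sample), sampling_density(len(sample), len(block)))
```

Fixing the seed keeps repeated runs comparable, which matters when the same parameter settings are trained several times.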

Figure 5.3: The average and maximum deviations between t_N and T_N (Avg. Dev. (m), axis from 0.0 to 3.0, for settings 256-u-E2-O2, 512-u-E2-O2, 1024-u-E1-O2, 256-s-E2-O1, 512-s-E2-O1, and 1024-s-E2-O2; series: Avg. Dev. of all blocks, Max. Dev. of all blocks)

Figure 5.4: The deviations of the regression functions with sampled data sorted (Avg. Dev. (m), axis from 0.0 to 2.8, against Sampling Density (%) at 1.61, 2.91, 4.50, 6.17, 7.63, 9.02, 11.34, 12.29, and 14.08; series: 256-s-E2-O2, 512-s-E1-O2, 1024-s-E1-O2, 256-s-E2-O1, 512-s-E2-O1, 1024-s-E2-O2)

Figure 5.5: The deviations of the regression functions with sampled data unsorted (against Sampling Density (%) at 1.61, 2.91, 4.50, 6.17, 7.63, 9.02, 11.34, 12.29, and 14.08)

5.3 Experiment–B

5.3.1 Data collection and experiment design

In experiment–B, we add some standard reference points in the sampling area. The reference points are separated into two parts. The first, sampling area 1, covers HSINCHU, CHANGHUA, CHIAYI, and TAINAN. The second, sampling area 2, covers TAINAN, KAOHSIUNG, and PINGTUNG. They are shown in Figure 5.6 and Figure 5.7.

We also use the generated data in sampling area 3, which lies in Easting (162000-260000) and Northing (2428000-2740000), and the collected data in sampling area 4, which lies in Easting (140000-226000) and Northing (2489000-2576000). They are shown in Figure 5.8 and Figure 5.9. For convenience, we use the notation (D_T, V_T)::(D_S, V_S) to represent that, in the training phase of an experiment, D_T is the training dataset (user-collected data (UserData) or standard reference points (StdRef)) whose expected values are calculated by means of V_T (from standard formulas (StdFom) or reference points (StdRef)). Similarly, D_S is the dataset used in the testing phase, whose values are calculated by the formulas obtained in the training phase and compared with the values calculated by means of V_S. Three experiments are conducted. The experimental results are presented in Table 5.4–Table 5.6 and Figure 5.10.
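As an illustration of the (D_T, V_T)::(D_S, V_S) notation, the sketch below splits such a label into its training and testing phases. The class `Phase` and the function `parse_design` are hypothetical helper names introduced only for this example; the labels in the usage lines are the ones given in the notes of Table 5.4 to Table 5.6.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    dataset: str  # D: e.g. "70%UserData" or "100%StdRef"
    values: str   # V: e.g. "StdFom" or "StdRef"


def _parse_phase(text: str) -> Phase:
    inner = text.strip()
    if not (inner.startswith("(") and inner.endswith(")")):
        raise ValueError(f"phase must be parenthesised: {text!r}")
    # Exactly one comma separates the dataset from the value source.
    dataset, values = (part.strip() for part in inner[1:-1].split(","))
    return Phase(dataset, values)


def parse_design(label: str) -> tuple[Phase, Phase]:
    """Return (training phase, testing phase) for a '(D_T, V_T)::(D_S, V_S)' label."""
    train_text, test_text = label.split("::")
    return _parse_phase(train_text), _parse_phase(test_text)


if __name__ == "__main__":
    for label in (
        "(70%UserData, StdFom)::(30%UserData, StdFom)",
        "(100%StdRef, StdRef)::(100%StdRef, StdFom)",
        "(70%UserData+70%StdRef, StdFom+StdRef)::(30%UserData+30%StdRef, StdFom+StdRef)",
    ):
        train, test = parse_design(label)
        print(train, test)
```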

Several interesting phenomena are observed and explained as follows.

Figure 5.6: Reference points in sampling area 1 (map of Taiwan, 120°03’ E to 122°03’ E and 21°51’ N to 25°21’ N, with the Pacific Ocean, Taiwan Strait, and South China Sea labeled; ◇ marks the experiment data)

Figure 5.7: Reference points in sampling area 2 (map of Taiwan, 120°03’ E to 122°03’ E and 21°51’ N to 25°21’ N, with the Pacific Ocean, Taiwan Strait, and South China Sea labeled; ◇ marks the experiment data)

Figure 5.8: Generating data in sampling area 3 (map of Taiwan, 120°03’ E to 122°03’ E and 21°51’ N to 25°21’ N, with the Pacific Ocean, Taiwan Strait, and South China Sea labeled; ◇ marks the experiment data)

Figure 5.9: Collecting data in sampling area 4 (map of Taiwan, 120°03’ E to 122°03’ E and 21°51’ N to 25°21’ N, with the Pacific Ocean, Taiwan Strait, and South China Sea labeled; ◇ marks the experiment data)

Table 5.4: Experimental results of B–I

Regions            Target Coord. Sys.   TLs   #Points   StdErr     Dev(m)
HSINCHU,           TM2E (TWD97)         3     91        1.28e⁻⁴    0.803
CHANGHUA,          TM2N (TWD97)         3               7.11e⁻⁵    1.148
CHIAYI, TAINAN     TM2E (TWD67)         4               1.04e⁻⁵    0.954
(30746.27 km²)     TM2N (TWD67)         4               1.37e⁻⁵    1.294
TAINAN,            TM2E (TWD97)         3     489       1.44e⁻⁵    0.576
KAOHSIUNG,         TM2N (TWD97)         3               6.08e⁻⁶    0.437
PINGTUNG           TM2E (TWD67)         4               8.74e⁻⁶    0.397
(7464.96 km²)      TM2N (TWD67)         4               8.69e⁻⁶    0.707

Note: TLs: the number of coordinate transformations from WGS84(φ₈₄, λ₈₄, h₈₄) using level-wise methods.

Experiment–II: (100%StdRef, StdRef)::(100%StdRef, StdFom)

Table 5.5: Experimental results of B–II

Regions               Target Coord. Sys.   TLs   #Points   StdErr     Dev(m)
E(162000-260000)      TM2E (TWD97)         3     10000     9.34e⁻⁵    3.932
N(2428000-2740000)    TM2N (TWD97)         3               9.12e⁻⁵    3.577
Generating data       TM2E (TWD67)         4               3.62e⁻⁵    0.239
(30109.78 km²)        TM2N (TWD67)         4               5.51e⁻⁶    0.158
E(140000-226000)      TM2E (TWD97)         3     28527     1.61e⁻⁵    0.406
N(2489000-2576000)    TM2N (TWD97)         3               9.48e⁻⁶    0.235
Collecting data       TM2E (TWD67)         4               4.01e⁻⁶    0.499
(7386.41 km²)         TM2N (TWD67)         4               5.94e⁻⁶    0.843

Note: TLs: the number of coordinate transformations from WGS84(φ₈₄, λ₈₄, h₈₄) using level-wise methods.

Experiment–I: (70%UserData, StdFom)::(30%UserData, StdFom)

Table 5.6: Experimental results of B–III

Regions             Target Coord. Sys.   TLs   #Points   StdErr     Dev(m)
Reference points    TM2E (TWD97)         3     10091     3.02e⁻⁵    8.831
+                   TM2N (TWD97)         3               7.89e⁻⁶    3.723
Generation data     TM2E (TWD67)         4               1.56e⁻⁵    4.037
                    TM2N (TWD67)         4               9.62e⁻⁶    2.375
Reference points    TM2E (TWD97)         3     29016     6.83e⁻⁵    0.794
+                   TM2N (TWD97)         3               7.72e⁻⁶    0.296
Collecting data     TM2E (TWD67)         4               1.41e⁻⁶    0.631
                    TM2N (TWD67)         4               5.59e⁻⁶    0.312

Note: TLs: the number of coordinate transformations from WGS84(φ₈₄, λ₈₄, h₈₄) using level-wise methods.

Experiment–III: (70%UserData+70%StdRef, StdFom+StdRef)::(30%UserData+30%StdRef, StdFom+StdRef)

Figure 5.10: Advanced analysis on the result of Experiment–C using different numbers of sampling points (axis: percentage of points in that Dev. (%))

distributed may be important factors in deriving transformation formulas. Too few sampling points yield less accurate results. Most tests produce satisfactory results, except Experiment–C in the first region. Figure 5.10 plots the percentage of points among the 91 standard reference points in the first region against the associated deviations in testing the regression formulas. Most deviations concentrate at 3 m when the number of sampling points is large enough.

• Deviations differ from area to area. This may be due to an insufficient number of sampling data on critical landforms, which play a critical role in geodetic surveying.

Another possible source of inaccuracy is the insufficiency of reference points in the testing regions. For example, in experiment–B there are 91 reference points in the first region (30746.27 km²) but 489 points in the second region (7464.96 km²).

• It is intuitive to assume that stronger relationships exist between the source and target coordinates if the coordinates are calculated using existing formulas. Regression on such coordinates may be easier and perform better.

• Regression on datasets consisting of only standard reference points may not produce a good result. This may be because such points were surveyed at different times and recorded with different precisions. Additionally, the use of standard reference points in the regression process sometimes improves the overall accuracy. This phenomenon may be explained by the stronger constraints that the standard reference points give to the regression engine.
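The difference in reference point coverage between the two regions can be made concrete with a small calculation. The sketch below uses only the point counts and areas quoted above (91 points over 30746.27 km² and 489 points over 7464.96 km²); the function name and the choice of "points per 1000 km²" as the unit are assumptions made for illustration.

```python
def points_per_1000_km2(n_points: int, area_km2: float) -> float:
    """Reference point density, expressed as points per 1000 square kilometres."""
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    return 1000.0 * n_points / area_km2


# Point counts and areas as quoted for the two regions in experiment-B.
REGIONS = {
    "first region": (91, 30746.27),
    "second region": (489, 7464.96),
}

if __name__ == "__main__":
    densities = {
        name: points_per_1000_km2(n_points, area)
        for name, (n_points, area) in REGIONS.items()
    }
    for name, density in densities.items():
        print(f"{name}: {density:.2f} points per 1000 km^2")
    ratio = densities["second region"] / densities["first region"]
    print(f"density ratio (second / first): {ratio:.1f}")
```

The first region has roughly 3 reference points per 1000 km², while the second has roughly 65, so the second region is covered more than twenty times as densely.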

5.4 Experiment–C

5.4.1 Data collection and experiment design

We use the standard reference points as the experiment data. There are 2,748 such standard reference points distributed over the island of Taiwan (35,915 km²). They are categorized as satellite control points, gravity reference points, base level points, etc., and are classified into three levels, class-1 to class-3, with each level further partitioned into three grades, grade-1 to grade-3, for different purposes.

5.4.2 Experiment–C–I

The first experiment tests whether the proposed regression engine works and gives satisfactory results. All 2,748 standard reference points are used: 70%, 80%, or 90% of them are randomly selected as training data and the rest are reserved for testing. The experimental results are presented in Table 5.7. They show that the regression engine works satisfactorily: it converges, meets the requirement of minimizing errors, and produces regression functions. The value of Dev in TM2N is about 0.5 m, while that in TM2E is about 1.5 m. The accuracy of regression on TM2N is much better than that on TM2E. This may be due to the geographic shape of Taiwan, whose length in the north-south direction is about 2.5 times that in the east-west direction. Even when the error biases, w_E and w_N, are adjusted, the same results remain. This phenomenon may be explained by the fact that the standard reference points are not uniformly distributed over the sampling area.
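The random selection of training and testing data described above can be sketched as follows. This is only an illustration under stated assumptions: the points are represented by integer identifiers, the split is done by shuffling with a fixed seed, and the training size is rounded to the nearest integer; the original experiments may have drawn their samples differently.

```python
import random


def split_train_test(points, train_fraction, seed=0):
    """Randomly split points into (training, testing) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


if __name__ == "__main__":
    # Stand-in identifiers for the 2,748 standard reference points.
    reference_points = list(range(2748))
    for fraction in (0.7, 0.8, 0.9):
        train, test = split_train_test(reference_points, fraction)
        print(f"{fraction:.0%}: {len(train)} training, {len(test)} testing")
```

Because every point ends up in exactly one of the two lists, the testing data are never seen during training.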

5.4.3 Experiment–C–II

In order to make the sampling data more uniformly distributed and to reduce Dev in both TM2E and TM2N, the sampling data are partitioned. There are two partitioning strategies. First, the standard reference points are partitioned into 5 regions according to their administrative areas, and the above experiment is repeated. In each region, 80% of the data

Table 5.7: Experimental results–using all standard reference points for regression

Data         TM2E                  TM2N
Sampling*    StdErr     Dev(m)     StdErr     Dev(m)
70%::30%     9.34e⁻⁵    2.152      1.28e⁻⁴    0.517
80%::20%     9.34e⁻⁵    0.971      1.28e⁻⁴    0.628
90%::10%     9.34e⁻⁵    1.509      1.28e⁻⁴    0.456
Average      9.34e⁻⁵    1.544      1.28e⁻⁴    0.534

*: |Data for training| :: |Data for testing|

are randomly selected for training and 20% for testing. The partitioning is visualized in Figure 5.11. The experimental results and functions are presented in Table 5.8 and Table 5.9, respectively. The second partitioning strategy is according to the height above the reference surface, as given in the WGS84 ⟨φ₈₄, λ₈₄, h₈₄⟩ data packages. Also, in each height level, 80% of the data are randomly selected for training and 20% for testing. The partitioning is visualized in Figure 5.12. The experimental results are presented in Table 5.10.

From the experimental results, regression using partitioned datasets achieves better accuracy than regression on un-partitioned datasets. This is a reasonable result, since each dataset has condensed sampling data so that the regression engine can converge faster. Interestingly, both results are better than those in Table 5.7, and the results in Table 5.8 slightly outperform those in Table 5.10. This may be because the spread of data partitioned by height is wider than that of data partitioned by location.
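The two partitioning strategies can be sketched with one helper that groups the points by a key and then applies the 80%/20% split inside each group. Everything specific in this sketch is an assumption made for illustration: the points are dictionaries with `region` and `h` fields, the region names are placeholders, and the height boundaries in `height_level` are hypothetical, since the actual regions and height levels are defined in the tables and figures referenced above.

```python
import bisect
import random
from collections import defaultdict


def partition_and_split(points, key, train_fraction=0.8, seed=0):
    """Group points by key(point), then split each group into (train, test)."""
    groups = defaultdict(list)
    for point in points:
        groups[key(point)].append(point)

    rng = random.Random(seed)
    result = {}
    for label in sorted(groups):
        members = list(groups[label])
        rng.shuffle(members)
        n_train = round(train_fraction * len(members))
        result[label] = (members[:n_train], members[n_train:])
    return result


def height_level(point, edges=(100.0, 500.0, 1000.0)):
    """Index of the height band containing point['h'] (edges are hypothetical)."""
    return bisect.bisect_right(edges, point["h"])


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic stand-ins for reference points; not real survey data.
    points = [
        {"region": rng.choice(["A", "B", "C", "D", "E"]), "h": rng.uniform(0.0, 2000.0)}
        for _ in range(500)
    ]
    by_region = partition_and_split(points, key=lambda p: p["region"])
    by_height = partition_and_split(points, key=height_level)
    for label, (train, test) in by_region.items():
        print("region", label, len(train), len(test))
    for label, (train, test) in by_height.items():
        print("height level", label, len(train), len(test))
```

Using the same helper for both strategies keeps the comparison between location-based and height-based partitioning limited to the choice of key.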

5.4.4 Experiment–C–III

Finally, the effectiveness of the proposed method is compared with level-wise transformation. For this purpose, 8 sets of synthetic data in WGS84 (⟨φ₈₄, λ₈₄⟩) are extracted with parameters δ_e = 0.41′, δ_n = 0.23′, and k = 0 ∼ 511 as follows.

• E1 = {⟨120°38.43′, 21°51.00′ + k · δ_e⟩}

• E2 = {⟨120°45.43′, 21°51.00′ + k · δ_e⟩}

• E3 = {⟨120°53.43′, 21°51.00′ + k · δ_e⟩}


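From the document: A Genetic Algorithm-Based Symbolic Regression Engine and Its Application to Coordinate Transformation in the Global Positioning System (pp. 67-83)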