Results of trajectory-based baseball tracking

Chapter 4. Sports Information Retrieval in Baseball Video

4.5 Experimental Results

4.5.1 Results of trajectory-based baseball tracking

Trajectory-based ball tracking relies on several parameters. Td is the frame-difference threshold used in moving object segmentation. Since the intensity of the baseball should be much higher than that of the background or other objects in the frame, we set Td adaptively by Eq. (4-9), which eliminates much of the noise while still retaining the ball.

Td = Average_intensity_of_the_frame × 50%    (4-9)
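
Eq. (4-9) can be illustrated with a minimal sketch. It assumes grayscale frames stored as NumPy arrays, takes the average intensity from the current frame, and keeps pixels whose absolute frame difference is strictly greater than Td; the function name `moving_object_mask` and the toy frames are hypothetical and not taken from the original implementation.

```python
import numpy as np

def moving_object_mask(prev_frame, curr_frame, ratio=0.5):
    """Return (mask, td): pixels whose frame difference exceeds Td.

    Td follows Eq. (4-9): average intensity of the frame x 50%.
    Using the current frame for the average is an assumption.
    """
    prev = prev_frame.astype(np.float64)
    curr = curr_frame.astype(np.float64)
    td = ratio * curr.mean()
    diff = np.abs(curr - prev)  # frame difference
    return diff > td, td

# Toy 4x4 frames: a dark background, one bright "ball" pixel, one noisy pixel.
prev = np.full((4, 4), 40, dtype=np.uint8)
curr = prev.copy()
curr[1, 2] = 250  # bright ball pixel appears
curr[3, 0] = 50   # small intensity fluctuation (noise)

mask, td = moving_object_mask(prev, curr)
print(td, int(mask.sum()))
```

In this toy case the bright pixel survives the threshold while the small fluctuation is suppressed, which is the behavior the adaptive threshold is meant to provide.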

For the size filter, statistics show that up to 95% of the baseball sizes (in pixels) in frames of resolution 352 × 240 fall within the range [8, 50], so [Rmin, Rmax] is set to [8, 50]. The parameter Ra is the threshold of the shape filter. Ideally, the aspect ratio of the baseball equals 1, but the ball may appear deformed from frame to frame because of its high speed, so the shape constraint is loosened to tolerate deformation. Since an object with an aspect ratio greater than 3 is far from ball-shaped, Ra is set to 3. Since an object whose degree of compactness Dc is less than one half cannot be claimed to be “compact”, the threshold of the compactness filter Tc is set to 50%. Furthermore, although the ball trajectory over frames is not exactly a parabolic curve, a trajectory with a large prediction error cannot be the ball trajectory. Thus, for reasonable error tolerance, the threshold of prediction error Te is set to 2 (in pixels).
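
The size, shape, and compactness filters can be sketched as follows. This is an illustration under stated assumptions rather than the original implementation: size is taken as the object's pixel count, the aspect ratio as the longer bounding-box side over the shorter one, and the compactness Dc as the object area divided by its bounding-box area (Dc is defined elsewhere in the chapter, so this definition is a stand-in). The `Blob` type and `is_ball_candidate` are hypothetical names.

```python
from dataclasses import dataclass

R_MIN, R_MAX = 8, 50  # size filter range [Rmin, Rmax], in pixels
R_A = 3.0             # shape filter: maximum aspect ratio Ra
T_C = 0.5             # compactness filter threshold Tc (50%)

@dataclass
class Blob:
    width: int   # bounding-box width in pixels (assumed > 0)
    height: int  # bounding-box height in pixels (assumed > 0)
    area: int    # number of object pixels

def is_ball_candidate(b: Blob) -> bool:
    """Apply the size, shape, and compactness filters in turn."""
    if not (R_MIN <= b.area <= R_MAX):
        return False  # too small or too large to be the ball
    aspect = max(b.width, b.height) / min(b.width, b.height)
    if aspect > R_A:
        return False  # too elongated, even allowing for deformation
    compactness = b.area / (b.width * b.height)  # assumed definition of Dc
    return compactness >= T_C

blobs = [Blob(5, 5, 20), Blob(12, 3, 30), Blob(10, 10, 30), Blob(2, 2, 4)]
kept = [b for b in blobs if is_ball_candidate(b)]
print(kept)
```

Only the first blob passes: the second fails the shape filter, the third fails the compactness filter, and the fourth fails the size filter.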

The ball position in each video frame is manually labeled as ground truth. A ground truth ball is called “detected” if it matches a ball candidate. A ground truth ball falling on the obtained trajectory is called “tracked”, since its position can be predicted from the trajectory by the motion characteristics even when it does not match a ball candidate. The experimental results of ball detection and tracking are listed in Table 4-5, where #clip is the number of pitch shots, #frm is the total number of frames in all the pitch shots, and #bf is the number of frames containing the ball. The column “#detected (%)” gives the number of balls detected and the detection rate (#detected / #bf), “#tracked (%)” gives the number of balls tracked and the tracking rate (#tracked / #bf), and “#false (%)” gives the number of false alarms and the false alarm rate (#false / #frm).

Some balls are missed because the ball may not be detected when it passes over a left-handed batter dressed in a white uniform. Fortunately, the positions of missed balls can be recovered by applying ball position prediction. An example of ball detection is shown in Fig. 4-25(a), where the ball is missed in two frames as it passes over the white uniform. The result of ball tracking is presented in Fig. 4-25(b), where the missed ball positions are recovered from the positions predicted by the obtained trajectory.
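
Recovering missed positions by prediction can be sketched as follows. The sketch assumes the trajectory model described later in this section (x roughly linear and y roughly parabolic over time), uses the frame index as the time variable, and fits the model by least squares with `numpy.polyfit`; the data are synthetic and the helper names are hypothetical.

```python
import numpy as np

def fit_trajectory(frames, xs, ys):
    """Least-squares fit: x linear in the frame index, y quadratic."""
    px = np.polyfit(frames, xs, 1)
    py = np.polyfit(frames, ys, 2)
    return px, py

def predict(px, py, frame):
    """Predicted (x, y) ball position at a given frame index."""
    return float(np.polyval(px, frame)), float(np.polyval(py, frame))

# Synthetic pitch: x = 100 + 6t, y = 50 + 2t + 0.5t^2.
# The ball is detected in frames 0-2 and 5-7 but missed in frames 3 and 4.
detected = [0, 1, 2, 5, 6, 7]
xs = [100 + 6 * t for t in detected]
ys = [50 + 2 * t + 0.5 * t * t for t in detected]

px, py = fit_trajectory(detected, xs, ys)
recovered = {t: predict(px, py, t) for t in (3, 4)}
print(recovered)
```

Because the fitted function is defined for every frame index, the two frames without a matching candidate still receive a position on the trajectory.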

Although some tracking errors remain, the proposed scheme raises the overall accuracy of ball tracking to over 96%. Ball tracking with visual enrichment is demonstrated for some example pitch shots in Fig. 4-26. The results indicate that the proposed framework performs well on baseball clips from different channels, regardless of whether the pitcher or batter is left- or right-handed.

Table 4-5. Performance of baseball detection and tracking

Baseball  #clip  #frm  #bf   #detected (%)  #tracked (%)   #false (%)
MLB       30     1380  424   387 (91.27%)   409 (96.46%)   11 (0.80%)
JPB       32     2089  466   435 (93.35%)   453 (97.21%)   12 (0.57%)
CPBL      24     942   352   326 (92.61%)   338 (96.02%)   7 (0.74%)
Total     86     4411  1242  1148 (92.43%)  1200 (96.62%)  30 (0.68%)

* Detection rate (%) = #detected / #bf, tracking rate (%) = #tracked / #bf, and false alarm rate (%) = #false / #frm.
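
The rate definitions in the footnote can be checked directly against the counts in Table 4-5. The short script below copies the counts from the table and recomputes the per-source and total rates; the variable and function names are illustrative only.

```python
# (#clip, #frm, #bf, #detected, #tracked, #false), copied from Table 4-5.
ROWS = {
    "MLB":  (30, 1380, 424, 387, 409, 11),
    "JPB":  (32, 2089, 466, 435, 453, 12),
    "CPBL": (24,  942, 352, 326, 338,  7),
}

def rates(frm, bf, detected, tracked, false_alarms):
    """Return (detection, tracking, false alarm) rates in percent."""
    return (100.0 * detected / bf,
            100.0 * tracked / bf,
            100.0 * false_alarms / frm)

# Column-wise sums give the "Total" row.
totals = tuple(sum(col) for col in zip(*ROWS.values()))
total_rates = rates(*totals[1:])

for name, row in ROWS.items():
    det, trk, fa = rates(*row[1:])
    print(f"{name:5s} detection {det:.2f}%  tracking {trk:.2f}%  false alarm {fa:.2f}%")
print("Total", totals, tuple(round(r, 2) for r in total_rates))
```

Note that the detection and tracking rates are normalized by the number of frames containing the ball (#bf), whereas the false alarm rate is normalized by the total number of frames (#frm).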

Fig. 4-25. Illustration of ball detection and ball tracking in baseball video. (a) Ball detection: two ball positions are missed when passing over the white uniform. (b) Ball tracking: positions of missed balls can be recovered.

Fig. 4-26. Examples of ball tracking and visual enrichment for various baseball clips. (a) MLB pitch shot. (b) JPB pitch shot. (c) CPBL pitch shot.

The experiments were run on an IBM ThinkPad X60 notebook computer (CPU: Intel Core Duo T2400 1.83 GHz, RAM: 1 GB). For a pitch shot of 2 seconds, the required processing time is about 8 to 10 seconds. In baseball games, the interval between two successive pitches is usually longer than 10 seconds. That is, the proposed framework can compute the ball trajectory of a pitch shot and superimpose it on the video before the next pitch, which is near real-time. Enriching live broadcast baseball video for entertainment effects therefore becomes feasible.

A head-to-head comparison with other algorithms is difficult because the actual setups and implementations differ. For a reasonable comparison, we divide the process into two stages, potential trajectory exploration and ball trajectory identification, and discuss each in turn.

A. Potential Trajectory Exploration

The Kalman filter and the particle filter are widely used in moving object tracking. However, the particle filter is usually applied to tracking large objects with salient edge or color characteristics, such as cars and people [55]. Although the particle filter can also be used in ball tracking, it is suited to large balls, such as a basketball [55], for which a distinctive target model can be built. Since most of the ball tracking algorithms in the literature [8,11,52] are Kalman filter-based, our comparison focuses on the Kalman filter. We compare the performance of the Kalman filter-based algorithm (KF) and the proposed parabola-based algorithm (PB). The performance metrics are the number of potential trajectories produced and the number of ball candidates linked on the potential trajectories.

For each pitch sequence, the fewer ball candidates are linked on the potential trajectories, the fewer updates of the prediction function or Kalman filter are needed. Likewise, the fewer potential trajectories there are, the less computation trajectory identification requires.

Using the same 86 testing sequences as in Table 4-5, the comparison is presented in Table 4-6. The notations #Seq, #PT and Avg. #PT denote the number of testing pitch sequences, the total number of potential trajectories produced in the pitch sequences, and the average number of potential trajectories produced per pitch sequence, respectively. #Cand and Avg. #Cand denote the total number of ball candidates linked over all the potential trajectories and the average number of ball candidates linked per pitch sequence. It can be observed that the KF algorithm produces more potential trajectories with more ball candidates linked, because it may link neighboring non-ball objects in consecutive frames and form many potential trajectories that are not parabolic and need to be eliminated. In contrast, the proposed PB algorithm extracts only the potential trajectories that simultaneously form (near) straight lines in the X-direction and (near) parabolic curves in the Y-direction. Therefore, the proposed parabola-based algorithm is more efficient in potential trajectory exploration, since fewer linked ball candidates cause fewer updates of the prediction function, and it saves more time in trajectory identification, since fewer potential trajectories need to be validated.
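
The idea of linking only candidates that are consistent with a line in X and a parabola in Y can be sketched as below. This is a simplified illustration, not the PB algorithm itself: it assumes the prediction error is the Euclidean distance (in pixels) between the predicted and observed positions, refits the model by least squares at every step, and uses the threshold Te = 2 from Section 4.5.1. The names `prediction_error` and `try_link` are hypothetical.

```python
import math
import numpy as np

T_E = 2.0  # prediction error threshold Te, in pixels

def prediction_error(track, cand):
    """Distance between a candidate and the position predicted from the track.

    track: list of (frame, x, y); cand: (frame, x, y).
    The model is x linear and y quadratic in the frame index.
    """
    if len(track) < 3:
        raise ValueError("need at least 3 linked candidates to fit a parabola")
    t = [p[0] for p in track]
    px = np.polyfit(t, [p[1] for p in track], 1)
    py = np.polyfit(t, [p[2] for p in track], 2)
    tc, xc, yc = cand
    return math.hypot(float(np.polyval(px, tc)) - xc,
                      float(np.polyval(py, tc)) - yc)

def try_link(track, cand):
    """Append the candidate to the track only if it fits within Te."""
    if prediction_error(track, cand) <= T_E:
        track.append(cand)
        return True
    return False

# Synthetic track following x = 10 + 6t, y = 100 - 4t + t^2.
track = [(0, 10, 100), (1, 16, 97), (2, 22, 96), (3, 28, 97)]
noise_linked = try_link(track, (4, 40.0, 90.0))   # nearby non-ball object
ball_linked = try_link(track, (4, 34.5, 100.8))   # close to the prediction (34, 100)
print(noise_linked, ball_linked, len(track))
```

A neighboring non-ball object is rejected at the linking step, so it never spawns or extends a potential trajectory that would later have to be eliminated.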

Table 4-6. Comparison between the Kalman filter-based algorithm and the proposed physics-based algorithm in baseball video

B. Ball Trajectory Identification

Extracting the true ball trajectory from many potential trajectories requires an identification mechanism. Chu et al. [52] simulate all possible pitch trajectories over varying initial velocities, release angles, and spin rates to derive physical limits for trajectory identification, which is time-consuming. To transform 2D trajectories into 3D trajectories for validation, they compute the ratio of “the vertical movement distance of pitches in the real world” (1 meter, assumed by the authors) to “the average vertical movement distance of pitches in the video frames of their dataset”, and then estimate the depth of each ball candidate proportionally. However, the positions at which pitchers release the ball and catchers catch it vary. The variation in vertical movement across pitches is therefore likely to be large, and a pitch whose vertical movement is far from the average, e.g. an underhand pitch, may not be identified reliably.

In our proposed scheme, for each potential trajectory we maintain the best-fitting function of the trajectory, the ball candidates linked on it, and their associated coordinates and categories (isolated or contacted). The properties used to prune false trajectories and extract the true ball trajectory, including trajectory length, prediction error, the ratio of isolated candidates to all candidates on the trajectory, and the length of consecutive isolated candidates, can then be computed quickly. Therefore, the ball trajectory can be identified efficiently and reliably.
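
The properties listed above are cheap to derive once the linked candidates are stored with each trajectory. The sketch below assumes that trajectory length means the number of linked candidates, that the prediction error is summarized by its mean, and that "the length of consecutive isolated candidates" means the longest run of isolated candidates; the `Candidate` type and `trajectory_properties` are hypothetical names, and the thresholds applied to these properties are not shown.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    frame: int
    x: float
    y: float
    isolated: bool  # True: isolated, False: contacted

def trajectory_properties(cands, errors):
    """Summarize one potential trajectory for pruning.

    cands:  ball candidates linked on the trajectory, in frame order.
    errors: per-candidate prediction errors in pixels.
    """
    n = len(cands)
    longest = run = 0
    for c in cands:
        run = run + 1 if c.isolated else 0  # reset on a contacted candidate
        longest = max(longest, run)
    return {
        "length": n,
        "mean_prediction_error": sum(errors) / len(errors) if errors else 0.0,
        "isolated_ratio": sum(c.isolated for c in cands) / n if n else 0.0,
        "longest_isolated_run": longest,
    }

flags = [True, True, False, True, True, True, False]
cands = [Candidate(i, 10.0 + 6 * i, 100.0, f) for i, f in enumerate(flags)]
props = trajectory_properties(cands, errors=[0.5, 1.0, 1.5])
print(props)
```

Each property needs a single pass over the linked candidates, which is consistent with the claim that they can be computed quickly.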

Source: 運動影片內容分析、理解與註釋之研究 (A Study on Content Analysis, Understanding, and Annotation of Sports Video), pp. 119-124