

Chapter 2. Physics-based Ball Tracking and 3D Trajectory Reconstruction with


2.5.1 Ball candidate detection

The detection of ball candidates, i.e., basketball-colored moving objects, requires extracting the pixels that are 1) moving and 2) of basketball color. For moving-pixel detection, frame difference is a computationally inexpensive and effective method: we extract the pixels with a significant luminance difference between consecutive frames as moving pixels. Color is another important feature for extracting ball pixels. However, the color of the basketball in frames may vary with viewing angle and lighting conditions. To obtain the color distribution of the basketball in video frames, 30 ball images were segmented manually from different basketball videos to produce color histograms in the RGB, YCbCr, and HSI color spaces, as shown in Fig. 2-10. After statistical analysis, the Hue value in HSI space showed the best discriminability and was selected as the color feature, with the ball color range set to [Ha, Hb]. We compute the average Hue value of each 4x4 block in a frame and discard the moving pixels in blocks whose average Hue value is not within the ball color range [Ha, Hb]. To remove noise and fill gaps, morphological operations are performed on the remaining moving pixels, called ball pixels. An example of ball pixel detection is shown in Fig. 2-11: Fig. 2-11(a) is the original frame, Fig. 2-11(b) shows the moving pixels detected by frame difference, and Fig. 2-11(c) presents the extracted ball pixels after morphological operations.
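The two-stage pixel extraction described above (frame difference for motion, block-averaged Hue for color) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the luminance proxy, the difference threshold, and the sample hue range standing in for [Ha, Hb] are all assumed values, and the morphological cleanup step is omitted.

```python
import numpy as np

def rgb_to_hue(rgb):
    """Convert an RGB image (floats in [0, 1]) to Hue in degrees [0, 360)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    mx = np.max(rgb, axis=-1)
    mn = np.min(rgb, axis=-1)
    delta = np.where(mx - mn == 0, 1e-9, mx - mn)  # avoid division by zero
    hue = np.where(mx == r, (g - b) / delta % 6,
          np.where(mx == g, (b - r) / delta + 2,
                            (r - g) / delta + 4))
    return hue * 60.0

def ball_pixels(prev_frame, cur_frame, diff_thresh=0.1,
                hue_range=(10.0, 40.0), block=4):
    """Boolean mask of candidate ball pixels.

    1) Moving pixels: significant luminance difference between frames.
    2) Ball color: keep only pixels in blocks whose average Hue lies
       within the assumed ball color range (standing in for [Ha, Hb]).
    """
    # Channel mean as a rough stand-in for luminance.
    moving = np.abs(cur_frame.mean(axis=-1) - prev_frame.mean(axis=-1)) > diff_thresh

    hue = rgb_to_hue(cur_frame)
    keep = np.zeros_like(moving)
    ha, hb = hue_range
    h, w = hue.shape
    for y in range(0, h, block):
        for x in range(0, w, block):
            if ha <= hue[y:y+block, x:x+block].mean() <= hb:
                keep[y:y+block, x:x+block] = True
    return moving & keep
```

In practice the surviving mask would then be cleaned with morphological opening/closing before region growing, as the text describes.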

Fig. 2-10. Color histograms of 30 manually segmented basketball images

Fig. 2-11. Illustration of ball pixel detection: (a) source frame; (b) moving pixels; (c) extracted ball pixels.

With the extracted ball pixels, objects are formed in each frame by region growing. To prune non-ball objects, we design two sieves based on visual properties:

1) Shape sieve: The ball in a frame may have a shape that deviates from a circle, but the deformation is rarely so dramatic that the aspect ratio falls outside the range [1/Ra, Ra]. We set Ra = 3, since an object with aspect ratio > 3 or < 1/3 is far from ball-shaped and should be eliminated.

2) Size sieve: The in-frame ball diameter Dfrm can be proportionally estimated from the distance between court line intersections by the pinhole camera imaging principle, as in Eq. (2-9):

Dfrm / Dreal = Lfrm / Lreal , i.e., Dfrm = Dreal · (Lfrm / Lreal) (2-9)

where Dreal is the diameter of a real basketball (≈ 24 cm), and Lfrm and Lreal are the in-frame length and the real-world length of a corresponding line segment, respectively. To compute the ratio (Lfrm / Lreal), we select the two points closest to the frame center from the six court line intersections and calculate their in-frame distance Lfrm. Since the real-court distance Lreal between these two points is specified in the basketball rules, the ratio (Lfrm / Lreal) can be computed. The planar ball size in the frame can thus be estimated as π · (Dfrm/2)². The size sieve filters out objects whose sizes are not within the range [π · (Dfrm/2)² − ∆ , π · (Dfrm/2)² + ∆], where ∆ is a tolerance margin for processing faults.
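The two sieves can be sketched as follows, under stated assumptions: the sample Lfrm/Lreal values and the tolerance ∆ are illustrative, and Lfrm is taken in pixels while Lreal and Dreal are in centimeters, so Dfrm comes out in pixels.

```python
from math import pi

def shape_sieve(width, height, ra=3.0):
    """Keep objects whose bounding-box aspect ratio lies in [1/Ra, Ra]."""
    ratio = width / height
    return (1.0 / ra) <= ratio <= ra

def estimated_ball_area(l_frm, l_real, d_real=24.0):
    """Eq. (2-9): D_frm = D_real * (L_frm / L_real); area = pi * (D_frm/2)^2.

    l_frm: in-frame segment length (pixels); l_real: real length (cm).
    """
    d_frm = d_real * (l_frm / l_real)
    return pi * (d_frm / 2.0) ** 2

def size_sieve(area, l_frm, l_real, delta):
    """Keep objects whose area is within +-delta of the estimated ball area."""
    ref = estimated_ball_area(l_frm, l_real)
    return ref - delta <= area <= ref + delta
```

An object must pass both sieves to remain a ball candidate; the sieves are cheap per-object tests, which is why they run before any trajectory-level analysis.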

Detecting and tracking the ball would be difficult in the presence of camera motion. There are two major problems we may confront. The first is that more moving pixels are detected due to the camera motion, and therefore more ball candidates may exist. However, our analysis focuses on shooting trajectories in court shots. To capture a large portion of the court, the camera is usually located at a distance from it, so the camera motion in court shots is not violent except for a rapid transition from one half-court to the other, as can be noted in Fig. 2-12, where the left image shows the detected ball candidates marked with yellow circles, and the right image presents the camera motion using a motion history image [28] generated from 45 consecutive frames. When a shooting event occurs, one of the backboards should be captured in the frames. Since no backboard appears on screen during a camera transition, our system need not perform ball candidate detection then; that is, the performance of ball candidate detection is not affected by the camera moving from one half-court to the other. The second problem is that the ball might have little motion or stay still on screen when the camera attempts to follow it, although this is rare in practice. We observe in experiments that the ball is hardly ever at exactly the same position in consecutive frames even when the camera follows it. Although the mild in-frame motion of the ball still causes some misses in moving-pixel detection, the pixels of the true ball are correctly detected in most frames, and a missed ball candidate can be recovered by interpolating the ball positions in the previous and subsequent frames.
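The interpolation-based recovery mentioned above can be sketched as follows; the representation (a per-frame list of (x, y) positions with None for missed detections) is an assumption for illustration, and the interpolation is linear between the nearest detections on either side of a gap.

```python
def recover_missed(positions):
    """Fill missed detections (None) by linear interpolation between the
    nearest detected positions before and after each gap.

    positions: list of (x, y) tuples or None, one entry per frame.
    Gaps at the start or end of the sequence are left unfilled.
    """
    out = list(positions)
    n = len(out)
    i = 0
    while i < n:
        if out[i] is None:
            j = i
            while j < n and out[j] is None:  # find end of gap [i, j)
                j += 1
            if 0 < i and j < n:
                (x0, y0), (x1, y1) = out[i - 1], out[j]
                span = j - (i - 1)
                for k in range(i, j):
                    t = (k - (i - 1)) / span
                    out[k] = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            i = j
        else:
            i += 1
    return out
```

Linear interpolation suffices for short gaps; for longer gaps along a shooting trajectory, a physics-based (parabolic) fit of the kind developed later in this chapter would be the natural refinement.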

(a) Fewer ball candidates are produced when the camera motion is small.

(b) More ball candidates are produced when the camera motion is large.

Fig. 2-12. Left: detected ball candidates, marked as yellow circles. Right: motion history image to present the camera motion.
