
A Baseball Exploration System using Spatial Pattern Recognition

Hua-Tsung Chen, Ming-Ho Hsiao, Hsuan-Shen Chen, Wen-Jiin Tsai, Suh-Yin Lee

Department of Computer Science, National Chiao-Tung University, Hsinchu, Taiwan {huatsung, mhhsiao, xschen, wjtsai, sylee}@csie.nctu.edu.tw

Abstract— Despite considerable research effort in baseball video processing, little work has been done on analyzing the detailed process and ball movement of the batting content. This paper proposes a novel system to automatically summarize the progress of each batting in baseball videos. Utilizing the strictly-defined specifications of the baseball field, the system recognizes the spatial patterns in each frame and identifies which region of the baseball field is currently in focus. Finally, an annotation string which abstracts the batting content is generated. With the annotation strings, the system is able to make descriptions and provide exploration for baseball videos, so that users can quickly gain further insight into the game. The experiments on broadcast baseball videos of MLB and JPB show promising results.

I. INTRODUCTION

The rapidly increasing amount of digital video motivates researchers to pursue various aspects of video analysis. In recent years, sports video analysis has been attracting considerable attention due to its entertainment functionalities and potential commercial benefits. Possible applications of sports video analysis have been found in almost all sports, among which baseball is a quite popular one.

Baseball videos are characterized by a strictly-defined structure containing a series of plays and each play starts with a pitch. Therefore, pitch analysis has been addressed to derive the correlation between the ball trajectory and the rotation by tracking the translation and rotation of a pitched ball [1], to extract the ball trajectory based on physical characteristics [2], and even to reconstruct the 3D trajectory of the pitched ball with multiple cameras [3].

Due to broadcast requirements, there has been an essential demand for highlight extraction, which aims at abstracting a long game into a compact summary so that the audience can browse the game quickly. Cheng and Hsu fuse visual motion information with audio features to extract baseball highlights using HMM [4]. Chang et al. utilize the transition of image features in each frame for automatic video indexing with HMM [5]. Mochizuki et al. provide a baseball indexing method based on patternizing baseball scenes using a set of rectangles with image features and a motion vector [6].

Even though the previous works report good results on highlight extraction and event indexing, they give no account of the detailed batting process and ball movement within a shot, such as: “The ball batted into the left infield is picked up by an infielder and then thrown to the first baseman”. In this paper, we aim at exploring field shots, the shots that follow the batted ball in the field, to identify the regions the ball has passed through. In addition to providing a detailed description of each play, a baseball exploration system is also developed, so users can efficiently retrieve the batting clips desired. With the proposed framework, highlight extraction and event indexing in baseball videos will be more powerful and practical, since comprehensive, detailed and explicit information about the game can be presented to users.

II. PROPOSED FRAMEWORK

In this paper, a novel framework is proposed for automatic annotation of the batting content in baseball videos. We first identify the play region, the currently focused region of the baseball field, and then annotation strings can be generated by analyzing the transition of the identified play regions.

As illustrated in Fig. 1, for a given baseball video, the field shots, in which the camera follows the batted ball in the field, are first segmented. Then, we extract the visual features in a field shot to analyze the distribution of dominant colors and white pixels. With baseball domain knowledge, the spatial patterns of field lines and field objects are recognized to classify play region types, such as infield left, outfield right, audience, etc. Finally, from each field shot, an annotation string which describes the transition of play regions is generated to abstract the content of the batting for baseball exploration.

In the following, we in turn describe the major components of the proposed system: visual feature extraction, spatial pattern recognition and play region type classification.

Since shot classification and indexing in sports video has been researched well in the literature [5-8], the process of field shot segmentation is not addressed here.

978-1-4244-1684-4/08/$25.00 ©2008 IEEE 3522


Fig. 1. Framework of the proposed baseball exploration system.

III. VISUAL FEATURE EXTRACTION

As depicted in Fig. 2, the baseball field is characterized by a well-defined layout of specific colors. Moreover, important lines and the bases are in white color to provide visual assistance for players, umpires and audience. Therefore, color is an effective visual feature in baseball video analysis.

We propose to define dominant colors using a histogram. Furthermore, the spatial distribution of dominant colors and white pixels is exploited to detect field objects and lines.

The soil color and grass color are the dominant colors in the baseball field. However, the appearance of the grass and soil colors would vary with the field condition and capturing device. We have observed that within one game, the hue value in the HSI (Hue-Saturation-Intensity) color space is relatively stable despite lighting variations. Hence, the hue value is adequate to define the dominant colors. In addition, the intensity value is applicable for white pixel extraction.

In a field shot, the first frame mainly contains the baseball field, while the later frames, which might zoom in on a player or move to the audience, contain a smaller proportion of the field. Therefore, it is reasonable to define the dominant colors at the first frame of a field shot.

Fig. 3 demonstrates the spatial distribution of dominant colors and white pixels. The first field frame and its hue histogram are shown in Fig. 3(a) and (b), respectively. In the hue histogram, dominant colors can be defined as the peak of small hue value representing the soil color and the peak of large hue value representing the grass color. The regions segmented by dominant colors are depicted in Fig. 3(c), where grass regions are shown in green, soil regions in brown and others in black. The white pixels extracted are presented in Fig. 3(d).
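The dominant-color and white-pixel extraction described above can be sketched as follows. This is only an illustration under stated assumptions: the hue from Python's `colorsys` HSV conversion stands in for the HSI hue, the bin count and all thresholds are made-up values, and the function names are hypothetical rather than taken from the paper.

```python
import colorsys


def hue_histogram(pixels, bins=36, min_saturation=0.15):
    """Histogram of hue over (R, G, B) pixels with channels in 0..255."""
    hist = [0] * bins
    for r, g, b in pixels:
        h, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if s < min_saturation:
            continue  # near-grey pixels have no reliable hue (assumed cutoff)
        hist[min(int(h * bins), bins - 1)] += 1
    return hist


def dominant_hue_bins(hist):
    """Two strongest local peaks, returned as (small hue = soil, large hue = grass).

    Hue wrap-around is ignored for simplicity.
    """
    padded = [0] + list(hist) + [0]
    peaks = [
        i for i in range(len(hist))
        if padded[i + 1] > padded[i] and padded[i + 1] >= padded[i + 2]
    ]
    top_two = sorted(peaks, key=lambda i: hist[i], reverse=True)[:2]
    return tuple(sorted(top_two))


def is_white(pixel, min_intensity=200, max_spread=30):
    """White pixel test: high intensity (mean of R, G, B) and little color."""
    intensity = sum(pixel) / 3
    return intensity >= min_intensity and max(pixel) - min(pixel) <= max_spread


# First frame of a field shot, reduced to a toy list of pixels.
first_frame = [(150, 100, 60)] * 40 + [(40, 140, 50)] * 60 + [(250, 250, 250)] * 5
soil_bin, grass_bin = dominant_hue_bins(hue_histogram(first_frame))
white_count = sum(is_white(p) for p in first_frame)
```

Computing the peaks once on the first frame and reusing them for later frames follows the reasoning in the text that the first frame contains the largest share of the field.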

Fig. 2. Prototypical baseball field: (a) Full view of a real baseball field. (b) Illustration of field objects and lines.

Fig. 3. Spatial distribution of dominant colors and white pixels: (a) First field frame. (b) Hue histogram. (c) Segmented regions: grass regions shown in green, soil regions in brown and others in black. (d) Extracted white pixels.

IV. SPATIAL PATTERN RECOGNITION

In this section, we focus on the analysis of field shots and attempt to recognize the spatial patterns of field lines and field objects: left line (LL), right line (RL), pitcher’s mound (PM), home base (HB), first base (1B), second base (2B), third base (3B) and auditorium (AT), as depicted in Fig. 2(b).

Since the baseball field has a strictly-defined layout, the field lines and objects can be recognized based on the distribution of dominant colors and white pixels. In Fig. 4, the top row gives the original frames and the bottom row illustrates the following detection of the field lines and objects.

1) Left line (LL) and right line (RL): A growing algorithm, which produces a vector representation of the line segments [9], is applied to the extracted white pixels. The field lines (left line and right line) are then obtained by joining together the line segments which are close and collinear, as the oblique lines in Fig. 4(a), (b) and (c).

2) Pitcher’s mound (PM): An elliptic soil region surrounded by a grass region would be recognized as pitcher’s mound, as the red rectangle in Fig. 4(a) and (c).

3) Home base (HB): Home base can be located at the intersection of left line and right line, as shown in Fig. 4(a), if both field lines are detected.

4) First base (1B) and third base (3B): The white square located on the right line, if detected, in a soil region would be identified as first base, as depicted in Fig. 4(a). Similarly, the white square on the left line, if detected, in a soil region would be identified as third base, as depicted in Fig. 4(b).

5) Second base (2B): In a soil region, a white square on neither field line would be recognized as second base, as the white square in Fig. 4(a) and (c).

6) Auditorium (AT): The top area which contains high texture and no dominant colors is considered as the auditorium, as the black area above the white horizontal line in Fig. 4(c).
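As one concrete instance of these detection steps, locating home base from the two field lines (item 3) reduces to a line intersection. The sketch below assumes each detected field line is represented by two image points; that representation and the function names are hypothetical, not the paper's.

```python
def line_intersection(p1, p2, q1, q2, eps=1e-9):
    """Intersection of the infinite lines through p1-p2 and q1-q2.

    Returns None when the lines are (nearly) parallel.
    """
    (x1, y1), (x2, y2) = p1, p2
    (x3, y3), (x4, y4) = q1, q2
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denom) < eps:
        return None
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    x = (a * (x3 - x4) - (x1 - x2) * b) / denom
    y = (a * (y3 - y4) - (y1 - y2) * b) / denom
    return (x, y)


def locate_home_base(left_line, right_line):
    """Home base estimate, available only if both field lines were detected."""
    if left_line is None or right_line is None:
        return None
    return line_intersection(*left_line, *right_line)


home = locate_home_base(((0, 0), (2, 2)), ((0, 4), (4, 0)))
```

The intersection may fall outside the visible frame, so a caller would still need to check the result against the frame bounds before treating HB as present.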


Fig. 4. Detection of field lines and field objects, panels (a), (b) and (c).

Fig. 5. Twelve typical play region types.

V. PLAY REGION TYPE CLASSIFICATION

In order to comprehend the detailed content of ball movement and region transition, we have to recognize the play region, the currently focused region in the baseball field, of each field frame. With baseball domain knowledge, we utilize the detected field objects and lines to classify each field frame into one of the twelve typical play region types: IL (infield left), IC (infield center), IR (infield right), B1 (first base), B2 (second base), B3 (third base), OL (outfield left), OC (outfield center), OR (outfield right), PS (player in soil), PG (player in grass) and AD (audience), as shown in Fig. 5.

Note that B1, B2 and B3 here represent play region types while 1B, 2B and 3B in Sec. IV represent field objects.

The rules of play region type classification are listed in Table 1, where Wf is the frame width. The function P(A) returns the percentage of the frame occupied by the area A, X(Obj) returns the x-coordinate of the center of the field object Obj, and E(Obj) returns whether the field object (or line) Obj exists. Each field frame is classified into one of the twelve play region types by applying the rules on the spatial patterns. Take IL (infield left) as an example. A field frame would be identified as IL if any one of the following conditions holds:

1. The percentage of AT area in a frame is no more than 10%, PM exists and the x-coordinate of the PM center is greater than two-thirds of the frame width Wf (PM is located in the right one-third of the frame).

2. The percentage of AT area in a frame is no more than 10%, PM does not exist, LL exists and 3B does not exist.

3. The percentage of AT area in a frame is no more than 10%, PM does not exist, LL exists, 3B exists and the percentage of soil area is no more than 30%.

Table 1. Rules of play region type classification.

IL: {P(AT) ≤ 10%, E(PM), X(PM) > Wf × 2/3} ||
    {P(AT) ≤ 10%, ~E(PM), E(LL), ~E(3B)} ||
    {P(AT) ≤ 10%, ~E(PM), E(LL), E(3B), P(soil) ≤ 30%}

IC: {P(AT) ≤ 10%, E(PM), Wf/3 < X(PM) ≤ Wf × 2/3} ||
    {P(AT) ≤ 10%, ~E(PM), ~E(RL), ~E(LL), E(2B), P(soil) ≤ 30%}

IR: {P(AT) ≤ 10%, E(PM), X(PM) ≤ Wf/3} ||
    {P(AT) ≤ 10%, ~E(PM), E(RL), ~E(1B)} ||
    {P(AT) ≤ 10%, ~E(PM), E(RL), E(1B), P(soil) ≤ 30%}

B1: {P(AT) ≤ 10%, ~E(PM), E(RL), E(1B), P(soil) > 30%}

B2: {P(AT) ≤ 10%, ~E(PM), ~E(RL), ~E(LL), E(2B), P(soil) > 30%}

B3: {P(AT) ≤ 10%, ~E(PM), E(LL), E(3B), P(soil) > 30%}

OL: {10% < P(AT) ≤ 80%, E(PM), X(PM) > Wf × 2/3} ||
    {10% < P(AT) ≤ 80%, ~E(PM), E(2B), X(2B) > Wf × 2/3} ||
    {10% < P(AT) ≤ 80%, ~E(PM), ~E(2B), E(LL), ~E(RL)}

OC: {10% < P(AT) ≤ 80%, E(PM), Wf/3 < X(PM) ≤ Wf × 2/3} ||
    {10% < P(AT) ≤ 80%, ~E(PM), E(2B), Wf/3 < X(2B) ≤ Wf × 2/3}

OR: {10% < P(AT) ≤ 80%, E(PM), X(PM) ≤ Wf/3} ||
    {10% < P(AT) ≤ 80%, ~E(PM), E(2B), X(2B) ≤ Wf/3} ||
    {10% < P(AT) ≤ 80%, ~E(PM), ~E(2B), E(RL), ~E(LL)}

AD: {P(AT) > 80%}

PS: {P(AT) ≤ 10%, ~E(PM), ~E(2B), ~E(RL), ~E(LL), P(soil) > 30%}

PG: {10% < P(AT) ≤ 80%, ~E(PM), ~E(2B), ~E(RL), ~E(LL)}

Unknown: others

Fig. 6. Scheme of play region type classification within a field shot.
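The rules of Table 1 can be transcribed almost literally into code. The sketch below is one possible transcription under stated assumptions: the `Frame` container and helper names are hypothetical, percentages are given on a 0 to 100 scale, and rules are tried in table order with the first match winning, since the paper does not say how frames matching several rules are resolved.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet


@dataclass
class Frame:
    """Spatial patterns recognized in one field frame (hypothetical container)."""
    width: float                                             # Wf
    p_at: float                                              # P(AT), percent
    p_soil: float                                            # P(soil), percent
    centers: Dict[str, float] = field(default_factory=dict)  # X(Obj) per object
    lines: FrozenSet[str] = frozenset()                      # subset of {"LL", "RL"}

    def E(self, name: str) -> bool:
        return name in self.centers or name in self.lines

    def X(self, name: str) -> float:
        return self.centers[name]


def infield(f):
    return f.p_at <= 10

def outfield(f):
    return 10 < f.p_at <= 80

def at_right(f, obj):
    return f.E(obj) and f.X(obj) > f.width * 2 / 3

def at_center(f, obj):
    return f.E(obj) and f.width / 3 < f.X(obj) <= f.width * 2 / 3

def at_left(f, obj):
    return f.E(obj) and f.X(obj) <= f.width / 3

def no(f, *names):
    return not any(f.E(n) for n in names)


# (label, predicate) pairs in the order of Table 1.
RULES = [
    ("IL", lambda f: infield(f) and (at_right(f, "PM") or (
        no(f, "PM") and f.E("LL") and (no(f, "3B") or f.p_soil <= 30)))),
    ("IC", lambda f: infield(f) and (at_center(f, "PM") or (
        no(f, "PM", "RL", "LL") and f.E("2B") and f.p_soil <= 30))),
    ("IR", lambda f: infield(f) and (at_left(f, "PM") or (
        no(f, "PM") and f.E("RL") and (no(f, "1B") or f.p_soil <= 30)))),
    ("B1", lambda f: infield(f) and no(f, "PM") and f.E("RL") and f.E("1B")
        and f.p_soil > 30),
    ("B2", lambda f: infield(f) and no(f, "PM", "RL", "LL") and f.E("2B")
        and f.p_soil > 30),
    ("B3", lambda f: infield(f) and no(f, "PM") and f.E("LL") and f.E("3B")
        and f.p_soil > 30),
    ("OL", lambda f: outfield(f) and (at_right(f, "PM")
        or (no(f, "PM") and at_right(f, "2B"))
        or (no(f, "PM", "2B", "RL") and f.E("LL")))),
    ("OC", lambda f: outfield(f) and (at_center(f, "PM")
        or (no(f, "PM") and at_center(f, "2B")))),
    ("OR", lambda f: outfield(f) and (at_left(f, "PM")
        or (no(f, "PM") and at_left(f, "2B"))
        or (no(f, "PM", "2B", "LL") and f.E("RL")))),
    ("AD", lambda f: f.p_at > 80),
    ("PS", lambda f: infield(f) and no(f, "PM", "2B", "RL", "LL") and f.p_soil > 30),
    ("PG", lambda f: outfield(f) and no(f, "PM", "2B", "RL", "LL")),
]


def classify(frame):
    """Return the first play region type whose rule matches, else "Unknown"."""
    for label, rule in RULES:
        if rule(frame):
            return label
    return "Unknown"
```

For example, `classify(Frame(352, 5, 20, {"PM": 300.0}))` falls under the first IL condition, because the mound center lies in the right third of a 352-pixel-wide frame.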

The scheme of play region type classification within a field shot is illustrated in Fig. 6. The spatial patterns are first recognized by the distribution of dominant colors and white pixels in field frames. According to the rules on the spatial patterns, each field frame is then classified into one of the twelve typical play region types. To filter out instantaneous misclassifications of play region types within a field shot, a fixed length temporal window and majority voting are applied. Thus, an annotation string which describes the transition of play regions contained in a field shot can be obtained.

The content of the sample field shot in Fig. 6 says that the ball is first batted into the left infield. Then, the shortstop picks up the ball and throws it to the first baseman. The batting process can be appropriately abstracted by the output annotation string: IL (infield left) → PS (player in soil) → IR (infield right) → B1 (first base).
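The temporal filtering and string generation can be sketched as below. The paper specifies only a fixed-length window with majority voting; the centered sliding window, the window length of 5 frames, and the function names here are assumptions made for illustration.

```python
from collections import Counter
from itertools import groupby


def smooth_labels(labels, window=5):
    """Replace each frame label by the majority label in a window around it."""
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        votes = Counter(labels[max(0, i - half): i + half + 1])
        smoothed.append(votes.most_common(1)[0][0])
    return smoothed


def annotation_string(labels, window=5):
    """Collapse runs of identical smoothed labels into a transition string."""
    smoothed = smooth_labels(labels, window)
    return " → ".join(label for label, _ in groupby(smoothed))


# Per-frame labels with one spurious B1 inside the IL run.
frame_labels = (["IL"] * 6 + ["B1"] + ["IL"] * 3
                + ["PS"] * 8 + ["IR"] * 7 + ["B1"] * 6)
result = annotation_string(frame_labels)
```

The isolated `B1` frame is outvoted by its `IL` neighbors, so the toy input yields the same transition sequence as the sample shot described above.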


Fig. 7. User interface of the proposed baseball exploration system.

VI. BASEBALL EXPLORATION SYSTEM

The user interface of the proposed baseball exploration system is shown in Fig. 7. The video is displayed in area A and the visual presentation of the video analysis is provided in B. Area C gives the information about the recognized spatial patterns. Furthermore, users are allowed to designate play region types in D for exploration. The video clips containing the user-designated play region types are retrieved and listed in E with their respective annotation strings.
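Retrieval by designated play region types can be pictured as a filter over the stored annotation strings. The sketch below assumes that a clip matches when its annotation contains every designated type, in any order; the paper does not state the exact matching rule, and the clip identifiers and function name are hypothetical.

```python
def retrieve_clips(annotations, designated):
    """Return (clip_id, annotation) pairs containing all designated types."""
    wanted = set(designated)
    return [
        (clip_id, labels)
        for clip_id, labels in annotations.items()
        if wanted <= set(labels)
    ]


annotations = {
    "clip_01": ["IL", "PS", "IR", "B1"],
    "clip_02": ["OC", "PG"],
    "clip_03": ["IR", "B1"],
}
hits = retrieve_clips(annotations, ["IR", "B1"])
```

A stricter variant could require the designated types to appear in a given order, which would let a user ask for a specific sequence of regions rather than a set.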

VII. EXPERIMENTS

For evaluation, the proposed baseball exploration system has been tested on 119 field shots of Major League Baseball and Japan Professional Baseball videos (352x240, MPEG-1). The ground truths of the play region types contained in each field shot are identified manually. Table 2 presents the experimental results. The second column “total” represents the total number of field shots containing the play region type designated in the first column. Note that a field shot might comprise more than one play region type. The “correct” and “false” represent the number of correct detections and false alarms. Both the precision and recall are about 90% except for the precision of PS (player in soil) and the recall of B2 (second base region). The false alarms of PS might result from no field object detected in the infield, and the missed detection of play region type B2 might result from the missed detection of field object 2B. These could be improved by enhancing field object detection and refining the rules of play region type classification. Overall, we achieve good performance.
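The paper does not spell out the formulas, but the figures in Table 2 are consistent with precision = correct / (correct + false) and recall = correct / total. The short check below recomputes both from the reported counts; the dictionary layout and function names are illustrative only.

```python
# type: (total, correct, false, reported precision %, reported recall %)
TABLE_2 = {
    "IL": (34, 33, 2, 94.3, 97.1),
    "IC": (31, 30, 2, 93.8, 96.8),
    "IR": (51, 49, 1, 98.0, 96.1),
    "B1": (48, 47, 2, 95.9, 97.9),
    "B2": (12, 10, 1, 90.9, 83.3),
    "B3": (9, 8, 0, 100.0, 88.9),
    "OL": (18, 18, 2, 90.0, 100.0),
    "OC": (17, 15, 2, 88.2, 88.2),
    "OR": (25, 25, 1, 96.2, 100.0),
    "AD": (18, 18, 2, 90.0, 100.0),
    "PS": (38, 34, 7, 82.9, 89.5),
    "PG": (54, 52, 7, 88.1, 96.3),
}


def precision(correct, false):
    """Share of detections that were correct, in percent."""
    return 100 * correct / (correct + false)


def recall(correct, total):
    """Share of ground-truth shots that were detected, in percent."""
    return 100 * correct / total


def max_deviation(table):
    """Largest gap between recomputed and reported percentages."""
    gaps = []
    for total, correct, false, rep_p, rep_r in table.values():
        gaps.append(abs(precision(correct, false) - rep_p))
        gaps.append(abs(recall(correct, total) - rep_r))
    return max(gaps)
```

For PS, for instance, 34 correct detections against 7 false alarms give 34/41, or about 82.9% precision, and 34 of 38 ground-truth shots give about 89.5% recall.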

Table 2. Performance of the baseball exploration system.

play region type   total   correct   false   precision (%)   recall (%)
IL                 34      33        2       94.3            97.1
IC                 31      30        2       93.8            96.8
IR                 51      49        1       98.0            96.1
B1                 48      47        2       95.9            97.9
B2                 12      10        1       90.9            83.3
B3                 9       8         0       100.0           88.9
OL                 18      18        2       90.0            100.0
OC                 17      15        2       88.2            88.2
OR                 25      25        1       96.2            100.0
AD                 18      18        2       90.0            100.0
PS                 38      34        7       82.9            89.5
PG                 54      52        7       88.1            96.3

VIII. CONCLUSIONS

In this paper, we propose a baseball exploration system which is able to automatically abstract the content of a batting process in baseball videos. First, the spatial patterns of the field objects and lines in each field frame are recognized based on the distribution of dominant colors and white pixels. With baseball domain knowledge, each field frame is classified into one of the twelve typical play region types utilizing the rules on the spatial patterns. A fixed length temporal window and majority voting are applied to filter out instantaneous misclassifications of play region types in a field shot. Finally, an annotation string is generated to describe each batting. With the annotation strings, the baseball exploration system allows users to comprehend the game quickly and to retrieve the desired batting clips efficiently.

REFERENCES

[1] H. Shum and T. Komura, “Tracking the Translational and Rotational Movement of the Ball Using High-Speed Camera Movies,” Proc. of the IEEE Int. Conf. on Image Processing (ICIP 2005), vol. 3, pp. 1084-1087, 2005.

[2] H. T. Chen, H. S. Chen, M. H. Hsiao, Y. W. Chen and S. Y. Lee, “A Trajectory-Based Ball Tracking Framework with Enrichment for Broadcast Baseball Videos,” Proc. of the Int. Computer Symposium (ICS 2006), pp. 1145-1150, 2006.

[3] A. Gueziec, “Tracking pitches for broadcast television,” Computer, vol. 35, pp. 38-43, 2002.

[4] C. C. Cheng and C. T. Hsu, “Fusion of Audio and Motion Information on HMM-Based Highlight Extraction for Baseball Games,” IEEE Trans. on Multimedia, vol. 8, pp. 585-599, 2006.

[5] P. Chang, M. Han and Y. Gong, “Extract highlights from baseball game video with hidden Markov models,” Proc. of the IEEE Int. Conf. on Image Processing (ICIP 2002), vol. 1, pp. 609-612, 2002.

[6] T. Mochizuki, M. Tadenuma and N. Yagi, “Baseball Video Indexing Using Patternization of Scenes and Hidden Markov Model,” Proc. of the IEEE Int. Conf. on Image Processing (ICIP 2005), vol. 3, pp. 1212-1215, 2005.

[7] W. Hua, M. Han and Y. Gong, “Baseball scene classification using multimedia features,” Proc. of the IEEE Int. Conf. on Multimedia and Expo (ICME 2002), vol. 1, pp. 821-824, 2002.

[8] L. Y. Duan, M. Xu, Q. Tian, C. S. Xu and J. S. Jin, “A Unified Framework for Semantic Shot Classification in Sports Video,” IEEE Trans. on Multimedia, vol. 7, pp. 1066-1083, 2005.

[9] R. C. Nelson, “Finding Line Segments by Stick Growing,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 16, pp. 519-523, 1994.
