視訊資訊系統結構化視訊計算與查詢方法


Academic year: 2021


Structured Video Computing and Query Methods in a Video Information System

Project No.: NSC 87-2213-E009-093

Project period: August 1, 1997 to July 31, 1998




1. Abstract

Keywords: Video Indexing and Searching, Video Server, Video Information System, Video Feature Extraction

In the near future, video information will be represented digitally, stored on devices such as hard disks or CD-ROMs, and accessed directly by computer. Processing and computing on dynamic digital video will therefore be important. One significant task is the automatic detection of video structure and the construction of structured video. Structured video helps in understanding the scenario information in a video and in recognizing the objects in a scene and their relationships with other objects.

In this proposal, we present several important methods for video object extraction and query processing in a video information system. We first describe an efficient method for computing the positions of one or more moving objects in a video sequence, using computer vision routines for both target object tracking and feature value extraction. We then propose a multiple subsequence matching mechanism for comparing object motion between two video sequences, and we define the mapping function, the penalty table, and the overall distance between two video sequences. Prominent feature points serve as segmentation positions that partition each sequence into several subsequence groups. After an alignment process, we find the mapping of the best match and obtain an evaluation report on the two video sequences for temporal content analysis.
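The partitioning idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the definitions of the mapping function, the penalty table, or the overall distance, so everything below is an assumption made for illustration: prominent feature points are taken to be strict local extrema, groups are paired in order, each pair is compared by mean absolute difference after resampling, and a flat `unmatched_penalty` (a hypothetical parameter) is charged for every group left without a partner.

```python
from typing import List, Sequence


def prominent_points(values: Sequence[float]) -> List[int]:
    """Indices of strict local maxima or minima (assumed 'prominent feature points')."""
    points = []
    for i in range(1, len(values) - 1):
        left, mid, right = values[i - 1], values[i], values[i + 1]
        if (mid > left and mid > right) or (mid < left and mid < right):
            points.append(i)
    return points


def split_at(values: Sequence[float], cuts: List[int]) -> List[List[float]]:
    """Partition a sequence into subsequence groups; each cut index ends a group."""
    groups, start = [], 0
    for cut in cuts:
        groups.append(list(values[start:cut + 1]))
        start = cut + 1
    if start < len(values):
        groups.append(list(values[start:]))
    return groups


def resample(group: Sequence[float], n: int) -> List[float]:
    """Nearest-index resampling of a group to n samples."""
    if n == 1:
        return [group[0]]
    last = len(group) - 1
    return [group[round(k * last / (n - 1))] for k in range(n)]


def group_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two groups after resampling to equal length."""
    n = max(len(a), len(b))
    ra, rb = resample(a, n), resample(b, n)
    return sum(abs(x - y) for x, y in zip(ra, rb)) / n


def overall_distance(target, sample, unmatched_penalty: float = 1.0) -> float:
    """Sum of paired group distances plus a flat penalty per unmatched group."""
    tg = split_at(target, prominent_points(target))
    sg = split_at(sample, prominent_points(sample))
    paired = sum(group_distance(a, b) for a, b in zip(tg, sg))
    return paired + unmatched_penalty * abs(len(tg) - len(sg))
```

Pairing groups strictly in order is the simplest possible mapping; an alignment step that searches over pairings would replace the `zip` call.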

2. Motivation and Objectives

This part of the report addresses automatic image feature extraction for video data, and in particular the segmentation, recognition, and tracking of objects in video.

3. Research Methods and Results

We adopt a region-based approach built on color. From the color histogram of an image, similar color clustering yields a prominent color table. Area flooding then segments the image into color regions, and the regions together with their adjacency are recorded in an attribute graph, as shown in Figure 1(a) and (b).

Figure 1. Frame segmentation and frame attribute graph: (a) a frame segmented into regions A to E; (b) the attribute graph of the frame.
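Region flooding over a frame whose pixels have already been mapped to color IDs can be sketched as a breadth-first flood fill. This is a minimal illustration, not the report's implementation: the function name `label_regions`, the 4-connectivity rule, and the list-of-rows input format are all assumptions.

```python
from collections import deque


def label_regions(color_ids):
    """Assign one integer label per 4-connected region of equal color ID.

    color_ids is a list of rows (lists or strings); returns (labels, region_count).
    """
    rows, cols = len(color_ids), len(color_ids[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # pixel already belongs to a flooded region
            color = color_ids[r][c]
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and labels[ny][nx] == -1 and color_ids[ny][nx] == color:
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label
```

For example, `label_regions(["AAB", "ABB", "CCB"])` separates the frame into three regions, one per color ID.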

Colors are represented in the HSV (Hue, Saturation, Value) color model. The color distribution is clustered into similar colors before segmentation, as shown in Table 1.
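A coarse HSV histogram from which the most frequent colors are picked can be sketched as follows. The bin counts, the function names `hsv_bin` and `prominent_colors`, and the use of uniform bins are assumptions for illustration; the report's own clustering derives value ranges per color, as listed in Table 1. The conversion uses the standard-library `colorsys.rgb_to_hsv`, which works on components in the range 0 to 1.

```python
import colorsys
from collections import Counter


def hsv_bin(rgb, hue_bins=12, sat_bins=4, val_bins=4):
    """Map an 8-bit RGB pixel to a coarse (hue, saturation, value) bin."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return (
        min(int(h * hue_bins), hue_bins - 1),
        min(int(s * sat_bins), sat_bins - 1),
        min(int(v * val_bins), val_bins - 1),
    )


def prominent_colors(pixels, top_k=3):
    """Accumulate a histogram over HSV bins and return the top_k most frequent bins."""
    histogram = Counter(hsv_bin(pixel) for pixel in pixels)
    return [bin_id for bin_id, _ in histogram.most_common(top_k)]
```

Merging neighboring bins with similar counts would be the next step toward a similar color clustering; it is left out here to keep the sketch short.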

Figure 2 shows the consecutive frames of a video over time. Because each frame is described by an attribute graph, comparing frames becomes a graph matching problem.

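Table 1. Similar color clustering

| Color ID | Color | Hue | Saturation | Value |
|---|---|---|---|---|
| A | Light Blue (sky) | 160~192 | 26~49 | 63~80 |
| B | Dark Green (tree) | 65~79 | 6~13 | 8~30 |
| C | Green (mountain) | 66~86 | 8~17 | 18~38 |
| D | Cyan (ocean) | 159~1176 | 21~65 | 28~63 |
| E | Gray (sand) | 336~43 | 2~20 | 57~84 |

Figure 2. Video frames along the time axis t, from frames t_i and t_{i+1} through t_{i+r-1}.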

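For object segmentation by region flooding, we use an integrated labeling algorithm. While regions are being flooded, it also records each region's features, such as its color, and its adjacency to other regions, as shown in Table 2.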

Table 2. Integrated labeling algorithm

| Region ID | Region Color | Degree | Adjacent Region Set |
|---|---|---|---|
| A | Light Blue | 2 | B, D |
| B | Dark Green | 3 | A, C, D |
| C | Green | 2 | B, D |
| D | Cyan | 3 | A, B, C |
| E | Gray | 1 | D |
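The degree and adjacent-region-set columns of such a table can be derived from a labeled frame in a single pass. The sketch below is an illustration under assumptions: the function name `attribute_graph`, the 4-neighbor adjacency rule, and the small example frame are hypothetical and do not reproduce the frame behind Table 2.

```python
def attribute_graph(labels):
    """Build {region: {"degree": int, "adjacent": sorted list}} from a label map.

    labels is a list of rows (lists or strings) of region IDs. Two regions are
    adjacent when any of their pixels are horizontal or vertical neighbors.
    """
    rows, cols = len(labels), len(labels[0])
    adjacency = {}
    for r in range(rows):
        for c in range(cols):
            here = labels[r][c]
            adjacency.setdefault(here, set())
            # Checking only the lower and right neighbors visits every pixel pair once.
            for nr, nc in ((r + 1, c), (r, c + 1)):
                if nr < rows and nc < cols:
                    other = labels[nr][nc]
                    if other != here:
                        adjacency[here].add(other)
                        adjacency.setdefault(other, set()).add(here)
    return {
        region: {"degree": len(neighbors), "adjacent": sorted(neighbors)}
        for region, neighbors in adjacency.items()
    }
```

Because both endpoints are updated together, the resulting adjacency is symmetric by construction.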

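Objects are tracked across consecutive video frames, as shown in Figure 3. Tracking combines the color region segmentation method with a similarity measurement that locates the similar object in the following frames.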

Figure 3. Object tracking in video, panels (a) to (c).

Our object tracking method is shown in Figure 4.

Figure 4. Object tracking flowchart. Box labels: Target Searching and Matching; Feature Value and Motion Vector Construction; Target Tracking; Target Area Feedback; Noise Remove; Target Object Selection; Target Area Finding; Frame Difference and Thresholding; Region Flooding; Topological and Statistical Information Constructing; Histogram Accumulating; Color Ranking; Color Table Selecting; Similar Color Clustering; Color Table Establishing; Region Labeling and Segmentation.
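The "Frame Difference and Thresholding" and "Target Area Finding" boxes of the flowchart can be illustrated with a minimal sketch on grayscale frames. The function names, the fixed `threshold` parameter, and the bounding-box output are assumptions made for illustration; the report does not specify how the target area is represented.

```python
def changed_mask(prev_frame, curr_frame, threshold):
    """Mark pixels whose absolute intensity change exceeds the threshold."""
    return [
        [abs(curr - prev) > threshold for prev, curr in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the marked pixels, or None if none are marked."""
    coords = [
        (r, c)
        for r, row in enumerate(mask)
        for c, marked in enumerate(row)
        if marked
    ]
    if not coords:
        return None
    row_ids = [r for r, _ in coords]
    col_ids = [c for _, c in coords]
    return (min(row_ids), min(col_ids), max(row_ids), max(col_ids))
```

The threshold acts as a crude noise filter: small intensity changes, such as sensor noise, stay below it and do not enlarge the target area.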

Matching the feature sequences of two videos, as shown in Figure 5, is similar to pattern matching on strings, for which the well-known KMP algorithm and linear search are standard techniques.

For the matching situation shown in Figure 6, the KMP algorithm cannot be applied. We therefore propose the OCS (Optimal Common String of Subsequences) method to solve it.
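For reference, the KMP algorithm of Knuth, Morris and Pratt can be sketched as below. It finds exact occurrences of a pattern in a sequence, which is why it fits strings or quantized symbols but does not tolerate the small value differences found between real video feature sequences. The function names are illustrative choices.

```python
def failure_table(pattern):
    """table[i] is the length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_search(text, pattern):
    """Return the start index of every exact occurrence of pattern in text."""
    if not pattern:
        return []
    table = failure_table(pattern)
    matches, k = [], 0
    for i, symbol in enumerate(text):
        while k > 0 and symbol != pattern[k]:
            k = table[k - 1]  # fall back without re-reading earlier text
        if symbol == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = table[k - 1]  # allow overlapping occurrences
    return matches
```

Both functions accept any indexable sequence, so a list of quantized feature values works as well as a string.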

Figure 5. Matching of two video sequences: Target Sequence A and Sample Sequence B, each plotted as feature value against frame.

Figure 6. Sequence matching at an alignment point: a target sequence indexed 1, 2, ..., m and a sample sequence with positions p, q, and n.

To improve the OCS method, each sequence is partitioned into several subsequence groups, as shown in Figure 7.

Figure 7. Partitioning of sequences into groups: alignment points 1 to k divide the target sequence (1 to m) into subsequence groups T1, T2, ... and the sample sequence (1 to n) into subsequence groups S1, S2, ...

Following a divide and conquer strategy, corresponding subsequence groups of the two sequences are matched separately.

The user interface of the system is shown in Figure 8.

Figure 8. System user interface. Areas: Thumbnail Area; Menu Bar Area; Image Processing Area; Playback Area; Single Frame Data Area; Video Parameter Area; Status Bar Area; Curve Data Area; Annotation Area.

4. Conclusions and Discussion


5. References

[1] S. Adali, K. S. Candan, S. H. Chen, K. Erol and V. S. Subrahmanian, “The Advanced Video Information System: Data Structures and Query Processing,” Multimedia Systems, Vol. 4, No. 3, pp. 172-186, 1996.

[2] C. W. Chang, K. F. Lin and S. Y. Lee, "The Characteristics of Digital Video and Considerations of Designing Video Databases," Fourth Intl. Conf. on Information and Knowledge Management CIKM'95, pp. 370-377, Baltimore, Maryland, USA, 1995.

[3] C. W. Chang and S. Y. Lee, “Indexing and Approximate Matching for Content-based Time-series Data in Video Database,” First Intl. Conference on Visual Information Systems Visual’96, pp. 567-576, Melbourne, Australia, 1996.

[4] C. Faloutsos, M. Ranganathan and Y. Manolopoulos, “Fast Subsequence Matching in Time-Series Databases,” ACM SIGMOD, pp. 419-429, Minneapolis, MN., USA, 1994.

[5] M. Flickner et al., “Query by Image and Video Content: The QBIC System,” IEEE Computer, pp. 23-32, Sep. 1995.

[6] D. E. Knuth, J. H. Morris and V. R. Pratt, "Fast Pattern Matching in Strings," SIAM J. Comput., Vol. 6, pp. 323-350, Jun. 1977.

[7] A. D. Narasimhalu, “Special Section on Content-based Retrieval,” Multimedia Systems, Vol. 3, No. 1, pp. 1-2, Feb. 1995.

[8] A. D. Narasimhalu, “Multimedia Databases,” Multimedia Systems, Vol. 4, No. 5, pp. 226-249, 1996.

[9] S. W. Smoliar and H. Zhang, "Content-Based Video Indexing and Retrieval," IEEE Multimedia, pp. 62-72, Summer 1994.

[10] S. Stevens and T. Little, “Introduction to the Special Issue on Video Information Retrieval,” ACM Trans. on Information Systems, Vol. 13, No. 4, pp. 371-372, 1995.

[11] Y. Tonomura, A. Akutsu, Y. Taniguchi and G. Suzuki, "Structured Video Computing," IEEE Multimedia, Vol. 1, pp. 34-43, Fall 1994.
