Recently, three-dimensional (3D) images have become more popular in amusement parks. Capturing 3D images in the real world is difficult, because a 3D camera is heavy to carry around and its convergence, sharpness, and zoom are hard to adjust. As a result, even though 3D-related technologies have been actively developed and many 3D displays are used in professional fields, the 3D market has not opened up yet. One reason is that not enough 3D image software has been provided. We therefore focus our research on converting a single 2D image into stereo images, in the hope that viewers can experience a more realistic impression on a 3D display.
Some authors have proposed a real-time method to convert ordinary 2D images into 3D on television [1], [2]. They use the “Computed Image Depth” (CID) method [1] to estimate depth from a single image and/or the “Modified Time Difference” (MTD) method [2] to convert the images, under different context conditions, depending on the motion content. Other authors use multiple camera sources to establish “Structure from Motion” (SFM), defining a system with multiple viewpoints [3], [4]. There are also methods for recovering shape from shading (SFS) by using an ICA-based reflectance model [5]. Other authors have proposed depth estimation from image structure [6]: a source of information for absolute depth estimation that is based on the structure of the whole scene and does not rely on specific objects.
Other authors have proposed a technique that generates stereo pair images from a single image source and its related depth map [7]. However, some assumptions must be made, and manual processing is needed to extract the depth map.
“Geometry recovering” (GR) [8] tries to identify basic context structure, such as polygons, vanishing points, and so on, in order to find the best 3D spider mesh for moving the camera and the viewpoint around inside the picture. This technique requires a large off-line effort to recover and reconstruct the missing data.
The real goal is to reconstruct the binocular view of a scene from a monocularly sampled source. To do this, we first calculate the parallax value [9]-[11] of each object in the image. Then we build the left-eye and right-eye images, which give the end user a 3D perspective for entertainment. Our 3D experimental equipment is a DTI 201XLS monitor, which uses barriers so that the user's left and right eyes see two different images.
We propose a novel automatic 2D-to-3D image conversion technique. It converts a 2D image into left-eye and right-eye images and displays them on a 3D stereo monitor.
The conversion technique can be divided into two parts. The first part, the encoder, encodes the 2D image into a depth map that carries the depth information. The second part, the decoder, decodes the depth map into stereo images that simulate the views of the left and right eyes.
The encoder can be divided into two steps. The first step is image segmentation based on the properties of 3D stereo images. Following these properties, we use SSR (Single Scale Retinex) to reduce the effect of light reflection, and we use hue, saturation, and intensity as features. The traditional clustering method FCM (Fuzzy C-means clustering) is then applied. We set a threshold on the size of a segmented region: if a region is smaller than the threshold, it is merged into its neighboring regions, and this is repeated until no segmented region is smaller than the threshold. The second step is depth extraction. According to the depth cues and the depth rules between objects, we estimate whether each object has a fixed depth or a gradually varying depth, which gives a more realistic effect.
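The size-threshold merging described above might be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation used in this thesis: the function name `merge_small_regions`, the 4-connected neighborhood, the choice to absorb a small region into the neighbor that shares the longest border with it, and the premise that each label forms one connected region are all hypothetical choices made for the example.

```python
from collections import Counter


def merge_small_regions(labels, min_size):
    """Merge every region smaller than min_size into an adjacent region.

    labels: 2D list of integer region labels, one per pixel.
    Assumptions (not from the source): each label is one connected region,
    neighbors are 4-connected, and a small region is absorbed by the
    neighbor that shares the longest border with it.
    """
    labels = [row[:] for row in labels]  # work on a copy
    h, w = len(labels), len(labels[0])
    while True:
        sizes = Counter(v for row in labels for v in row)
        small = [lab for lab, n in sizes.items() if n < min_size]
        if not small or len(sizes) == 1:
            return labels
        # Handle the smallest region first (ties broken by label value).
        target = min(small, key=lambda lab: (sizes[lab], lab))
        # Count how many border edges the target shares with each neighbor.
        border = Counter()
        for y in range(h):
            for x in range(w):
                if labels[y][x] != target:
                    continue
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] != target:
                        border[labels[ny][nx]] += 1
        # Longest shared border wins; ties go to the smaller label.
        winner = max(border.items(), key=lambda kv: (kv[1], -kv[0]))[0]
        labels = [[winner if v == target else v for v in row] for row in labels]


segmented = [
    [0, 0, 0, 1],
    [0, 2, 0, 1],
    [0, 0, 0, 1],
]
merged = merge_small_regions(segmented, min_size=2)
```

Each pass removes one label, so the loop always terminates. Other merge criteria, such as the neighbor closest in hue, saturation, and intensity, would fit the same loop.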
The second part, the decoder, can also be divided into two steps. The first step converts the depth map into the left-eye and right-eye images. We propose two methods for this step. The first is the linear shift algorithm, a simplified method that effectively increases the conversion speed. The second is the binocular vision shift algorithm, which is based on the properties of human vision. After shifting, some points cannot be calculated, so the second step is the “interpolating holes method”. After encoding and decoding, we can automatically generate the stereo images used on a 3D display from a single 2D image.
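The shift-then-fill idea of the decoder might be sketched per scanline as follows. This is a hedged illustration rather than the algorithm developed in this thesis: the names `shift_row`, `fill_holes`, and `stereo_pair`, the mapping `shift = round(max_shift * depth / 255)`, the convention that a larger depth value means a nearer object, the rule that nearer pixels win collisions, and the choice of linear interpolation for holes are all assumptions made for the example.

```python
def shift_row(pixels, depth, max_shift, direction):
    """Shift one scanline horizontally in proportion to depth.

    pixels: list of gray values; depth: list of values in 0..255.
    Assumptions (not from the source): larger depth value = nearer object,
    shift = round(max_shift * depth / 255), nearer pixels win collisions.
    direction is +1 for one eye and -1 for the other.
    """
    w = len(pixels)
    out = [None] * w        # None marks a hole left by the shift
    out_depth = [-1] * w    # depth of the pixel currently stored at each x
    for x in range(w):
        s = round(max_shift * depth[x] / 255)
        nx = x + direction * s
        if 0 <= nx < w and depth[x] > out_depth[nx]:
            out[nx] = pixels[x]
            out_depth[nx] = depth[x]
    return out


def fill_holes(row):
    """Fill holes by linear interpolation between the nearest valid pixels."""
    filled = row[:]
    valid = [i for i, v in enumerate(row) if v is not None]
    if not valid:
        return filled
    for x in range(len(row)):
        if row[x] is not None:
            continue
        left = max((i for i in valid if i < x), default=None)
        right = min((i for i in valid if i > x), default=None)
        if left is None:            # hole at the left edge: copy nearest pixel
            filled[x] = row[right]
        elif right is None:         # hole at the right edge: copy nearest pixel
            filled[x] = row[left]
        else:
            t = (x - left) / (right - left)
            filled[x] = row[left] + t * (row[right] - row[left])
    return filled


def stereo_pair(image, depth_map, max_shift=4):
    """Return (left, right) views; the sign assigned to each eye is an assumption."""
    left = [fill_holes(shift_row(p, d, max_shift, +1)) for p, d in zip(image, depth_map)]
    right = [fill_holes(shift_row(p, d, max_shift, -1)) for p, d in zip(image, depth_map)]
    return left, right


image = [[10, 20, 30, 40, 50, 60]]
depth_map = [[0, 0, 255, 255, 0, 0]]
left_view, right_view = stereo_pair(image, depth_map, max_shift=1)
```

In this toy scanline, the two near pixels move by one position in opposite directions for the two views, and the single uncovered position in each view is filled from its neighbors.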
We also implement SANYO’s CID (Computed Image Depth) algorithm, which converts a single 2D image into 3D images; we introduce that algorithm in Section 2.3. In addition, we establish an index for measuring the accuracy of depth extraction. We compare the results of our proposed method with those of the CID algorithm using both human perception and quantitative data.
The rest of the thesis is organized as follows. Chapter 2 describes the background knowledge and related work. We introduce depth perception and the fundamentals of showing a 2D image on a 3D display, because before designing the 2D/3D software system we must understand why people perceive depth and how a 2D image is presented on a 3D display. With this understanding, we can design a 2D/3D software system that is based on human vision and applied to a 3D display.
In the latter part of Chapter 2, we describe our 2D/3D system and the 2D/3D conversion adaptive algorithm proposed by SANYO. Chapter 3 describes the encoder of our 2D/3D conversion adaptive algorithm, which estimates a depth map from a 2D image. Chapter 4 describes the decoder of our 2D/3D conversion adaptive algorithm, which constructs a binocular 3D image based on the depth map. The experimental results of our methods and of the CID algorithm are discussed in Chapter 5. Finally, conclusions and future work are summarized in Chapter 6.