Total energy and optimization


$$\cdots \sum_{v'_{i,j} \in V'} \left\| (v'_{i,j} - v'_{i,j-1}) - (v'_{i,j+1} - v'_{i,j}) \right\|^2. \tag{5.9}$$

Note that the orientations of the edges do not need to resemble their original orientations because the image is allowed to undergo a rotation if needed.

5.4 Scale energy

Ideally, the size of the input image should be maintained by the optimization system.

Otherwise, the optimization could find solutions that avoid the QR code error simply by shrinking the image to a very small size. We therefore define the following scale energy:

$$E_{sc} = \sum_{(v_p, v_q) \in E} \left\| (v'_p - v'_q)^2 - (v_p - v_q)^2 \right\|^2. \tag{5.10}$$
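As a rough illustration of Eq. (5.10), the following Python sketch evaluates the scale energy on a small mesh. It assumes that the squared difference of two vertices denotes the squared edge length, which is only one reading of the notation. The function name `scale_energy` and the vertex and edge layout are hypothetical and are not taken from the original implementation.

```python
def scale_energy(original, warped, edges):
    """Sum over mesh edges of (squared warped length - squared original length)^2.

    Assumption: (v_p - v_q)^2 in Eq. (5.10) is read as the squared edge length.
    `original` and `warped` are lists of (x, y) vertices; `edges` holds index pairs.
    """
    total = 0.0
    for p, q in edges:
        len2_orig = ((original[p][0] - original[q][0]) ** 2
                     + (original[p][1] - original[q][1]) ** 2)
        len2_warp = ((warped[p][0] - warped[q][0]) ** 2
                     + (warped[p][1] - warped[q][1]) ** 2)
        total += (len2_warp - len2_orig) ** 2
    return total


# Hypothetical toy mesh: the four edges of a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
square_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

shrunk = [(0.5 * x, 0.5 * y) for x, y in square]   # uniform scaling by 1/2
rotated = [(-y, x) for x, y in square]             # rotation by 90 degrees
```

Under this reading, shrinking the square is penalized while a pure rotation leaves the energy at zero, which agrees with the stated purpose of the term: keep the image size, but do not forbid rotation.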

5.5 Total energy and optimization

Before the optimization, we locally translate the input image to find a position with lower QR code error energy. Because the rotation angle and the size of the input image are specified by the user, we do not apply rotation or scaling beforehand.

CHAPTER 5. CODE-AWARE IMAGE WARPING 25

The total energy for the code-aware warping is a weighted sum of the above energy terms:

$$E = w_{ep}E_{ep} + w_{em}E_{em} + w_{ec}E_{ec} + w_d E_d + w_s E_s + w_{sc}E_{sc}. \tag{5.11}$$

It is impossible to satisfy all constraints simultaneously because they conflict with one another. Our method spreads the conflicts smoothly according to these weights and obtains a compromise solution by global optimization. Different relative weights result in different levels of distortion and are determined by visual observation. For the shape distortion weight w_d, different types of images can tolerate different levels of distortion. For example, an arbitrary deformable image can be deformed naturally to reduce the code error, whereas a company logo requires its shape to be preserved. We therefore let users adjust the shape distortion weight w_d for different images. Empirically, we use the following set of weights: w_ep = 0.1, w_em = 1, w_ec = 10, w_d ∈ [1, 10], w_s = 0.01, and w_sc = 0.000001, where w_d ∈ [1, 10] means that users may choose any value in this range.
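The weighted sum in Eq. (5.11) with the empirical weights above can be sketched in a few lines of Python. The dictionary keys mirror the subscripts of the energy terms; the function name `total_energy` and the way the individual term values are passed in are assumptions made for illustration only.

```python
# Empirical weights quoted in the text; w_d defaults to the low end of [1, 10].
WEIGHTS = {"ep": 0.1, "em": 1.0, "ec": 10.0, "d": 1.0, "s": 0.01, "sc": 1e-6}


def total_energy(terms, w_d=None, weights=WEIGHTS):
    """Weighted sum of the energy terms, E = sum_k w_k * E_k.

    `terms` maps each subscript ("ep", "em", ...) to an already computed
    energy value (hypothetical inputs). `w_d` optionally overrides the
    user-adjustable shape distortion weight and must lie in [1, 10].
    """
    if w_d is not None:
        if not 1.0 <= w_d <= 10.0:
            raise ValueError("w_d must lie in [1, 10]")
        weights = {**weights, "d": w_d}
    return sum(weights[name] * terms[name] for name in weights)


# Hypothetical term values, all equal to 1, to show the relative influence.
unit_terms = {name: 1.0 for name in WEIGHTS}
```

With all terms equal, the codeword-level error dominates the sum and the scale energy contributes almost nothing, which reflects the ordering of the weights rather than any measured result.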

The total energy function is nonlinear and has no analytical solution because the QR code error energy term depends on the image texture. Therefore, we optimize the warping using an iterative steepest descent algorithm with a coarse-to-fine strategy, downsampling the input image and the QR code over several levels (3 levels in our implementation). The optimization begins by solving the lowest-resolution version of the warping problem. To this end, a coarser mesh is constructed (because of the smaller image resolution) to optimize the total energy.

After the warping, we render the deformed image using barycentric coordinates, and the image is then upsampled to the next level. At the finer resolution level, a finer mesh is constructed, and the warping optimization problem is solved again. In our implementation, each grid cell is around 35×35 pixels and is further divided into two triangles. The process iterates until the original resolution is reached. Because evaluating the QR code error energy requires texture mapping at each iteration, the optimization is not very efficient, although it could be accelerated on a GPU. With the current implementation, it usually takes 5 to 8 minutes to obtain an optimized warping result.
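The rendering step relies on barycentric coordinates inside each mesh triangle. The Python sketch below shows one common way to do this: compute the barycentric weights of a pixel with respect to a warped triangle and reuse them to locate the matching position in the undeformed triangle. The function names and the inverse-mapping direction are assumptions; the text does not specify how the original renderer is organized.

```python
def barycentric(p, a, b, c):
    """Barycentric weights (wa, wb, wc) of point p in triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb


def map_point(p, warped_tri, source_tri):
    """Map p from the warped triangle to the same barycentric position
    in the source triangle (where the texture would be sampled)."""
    wa, wb, wc = barycentric(p, *warped_tri)
    (ax, ay), (bx, by), (cx, cy) = source_tri
    return (wa * ax + wb * bx + wc * cx, wa * ay + wb * by + wc * cy)


# Hypothetical triangles: the source triangle is twice as large as the warped one.
warped_tri = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
source_tri = ((0.0, 0.0), (2.0, 0.0), (0.0, 2.0))
```

A pixel lies inside the triangle when all three weights are non-negative, so the same routine can also serve as the inside test while rasterizing each of the two triangles of a grid cell.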

Chapter 6 Module Stylization

The goal of module stylization is to reshape the modules according to an input example shape so that the modules have an appearance similar to that shape. In addition, the stylized modules should resemble the original square shapes in order to reduce the QR code error. Since a shape can be represented as a binary image, we assume that the input is a binary (exemplar) image. Our method synthesizes stylized QR code modules that resemble the exemplar.

Figure 6.1: Assignment of boundary local patches. The exemplar boundary patches (right) are assigned to the QR code boundary patches (left).

To synthesize the stylized modules, one option is to apply texture synthesis methods to the binary images [9, 25]. However, these methods would unavoidably cause QR code errors because they have no control over the shape boundaries. Instead, we develop a boundary synthesis method that synthesizes stylized boundaries of the modules. Our method is efficient because the search space is limited to boundary pixels, and the similarity metric counts only the contributions of boundary pixels rather than all pixels in a local window. Moreover, we balance the resemblance of the boundaries between the original regular shape and the stylized result, and can therefore reduce the QR code error in the synthesis process.

6.1 Similarity distance

The boundary synthesis method begins by representing the input exemplar and the QR code modules using their boundary pixels, illustrated by the black outlines in Figure 6.1. Let B_m and B_e denote the sets of boundary pixels of the original QR code modules and the input exemplar, respectively. Our goal is to generate a set of boundary pixels B′_m for the QR code modules so that the new boundaries capture the local shape properties of the exemplar boundary B_e. Inspired by texture optimization [20], our method iteratively minimizes the local appearance differences between the QR code module boundaries and the exemplar boundaries.


Let p_m ∈ B_m and p_e ∈ B_e be two boundary pixels of the original QR code modules and the input exemplar, respectively. We first trace the boundaries from the two pixels within a local window (41×41 pixels in our implementation) to form two neighboring patches N(p_m) and N(p_e), illustrated by the red lines in Figure 6.1. The similarity D(N(p_m), N(p_e)) between the two boundary patches is then defined by Eq. (6.1) over the pixels p ∈ N(p_m) and q ∈ N(p_e), where d(p, p′) denotes the geodesic distance between two pixels when the patches N(p_m) and N(p_e) are aligned. Figure 6.2 illustrates an example of the similarity metric computation, in which we sum the average distance from each pixel of N(p_e) to its corresponding pixel in N(p_m) and vice versa. The correspondence between the pixels of N(p_m) and N(p_e) is determined by first parameterizing the boundaries while aligning their central pixels p_m and p_e, and then matching the boundary pixels proportionally.
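The description suggests that the metric is the average distance from the pixels of N(p_m) to their corresponding pixels in N(p_e), plus the average distance in the opposite direction. The Python sketch below follows that reading under several loudly stated assumptions: the central pixel is taken to be the middle element of each ordered patch, correspondence is found by matching normalized positions along the patch, and plain Euclidean distance stands in for the geodesic distance d. None of the function names come from the original work.

```python
import math


def _centered(patch):
    """Translate an ordered patch so that its middle pixel sits at the origin.
    Assumption: the central pixel is the middle element of the list."""
    cx, cy = patch[len(patch) // 2]
    return [(x - cx, y - cy) for x, y in patch]


def _average_distance(src, dst):
    """Average distance from each pixel of src to its proportional match in dst."""
    n, m = len(src), len(dst)
    total = 0.0
    for i, (x, y) in enumerate(src):
        # Proportional matching: same relative position along the boundary.
        j = round(i * (m - 1) / (n - 1)) if n > 1 else m // 2
        # Assumption: Euclidean distance is used in place of the geodesic one.
        total += math.hypot(x - dst[j][0], y - dst[j][1])
    return total / n


def patch_distance(patch_m, patch_e):
    """Symmetric dissimilarity: average distance m -> e plus average e -> m."""
    a, b = _centered(patch_m), _centered(patch_e)
    return _average_distance(a, b) + _average_distance(b, a)


# Hypothetical three-pixel patches: a flat boundary and a small bump.
flat = [(0, 0), (1, 0), (2, 0)]
bump = [(0, 0), (1, 1), (2, 0)]
```

Averaging in both directions keeps the measure symmetric and insensitive to patch length, so a long exemplar patch is not penalized merely for containing more pixels.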

6.2 Boundary construction

We adopt an optimization method that iteratively modifies the boundaries of the QR code modules so that their local appearance is similar to that of the input exemplar and their global shape is similar to that of the original QR code modules. In each iteration, we alternate between two steps:

1. matching each boundary patch of the QR code modules to a boundary patch of the input exemplar.

2. modifying the boundaries of the QR code modules based on the matching results.

Figure 6.2: An illustration of the boundary patch similarity metric. The similarity between two boundary patches N(p_m) and N(p_e) is computed as the sum of the average per-pixel distances from each patch to the other.

6.2.1 Boundary patch matching

As mentioned before, we want each boundary patch of the QR code modules to be similar to one of the input exemplar's patches. Therefore, for each boundary patch of the QR code modules, we search for the most similar boundary patch in the input exemplar according to Eq. (6.1). Solving this problem is not expensive because we only need to search along the boundary, so a brute-force search over all boundary pixels can be used. Note that if the input exemplar is semantically directional (e.g., the fire exemplar in Figure 6.3 (a)), the search space can be further limited so that the upper side of a QR code module finds its most similar boundary patch on the upper side of the fire exemplar. Figure 6.3 compares the results with and without this label control.

Figure 6.3: A comparison of the effect of using labels. (a) uses the label control; (b) does not. As a result, each side of the contour in (a) carries the features of the corresponding part of the exemplar, whereas (b) does not show this effect.
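The brute-force search, including the optional restriction to patches from the same side of a directional exemplar, can be sketched as follows. The function `best_match`, the label strings, and the pluggable `distance` argument are all hypothetical; any patch dissimilarity in the spirit of Eq. (6.1) could be passed in.

```python
def best_match(query, candidates, distance, query_label=None, labels=None):
    """Return (index, distance) of the candidate patch closest to `query`.

    If both `query_label` and `labels` are given, only candidates carrying
    the same side label (e.g. "top") are considered.
    """
    best_index, best_distance = None, float("inf")
    for index, candidate in enumerate(candidates):
        if (query_label is not None and labels is not None
                and labels[index] != query_label):
            continue  # directional exemplar: skip patches from other sides
        current = distance(query, candidate)
        if current < best_distance:
            best_index, best_distance = index, current
    return best_index, best_distance


# Toy stand-in: scalars play the role of patches, |a - b| the role of D.
toy_patches = [0, 5, 9]
toy_labels = ["top", "bottom", "top"]


def toy_distance(a, b):
    return abs(a - b)
```

The cost is linear in the number of exemplar boundary pixels per query, which is why an exhaustive scan remains affordable here, unlike a search over all pixels of a two-dimensional window.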

6.2.2 Boundary modification

After finding the most similar boundary patches, our goal is to modify the QR code module boundaries to increase the boundary similarity between the QR code modules and the input exemplar while maintaining the global features of the module boundaries. To achieve this, we concatenate the boundary pixels in each matched boundary patch to construct a first-order representation of the modified boundaries. Figure 6.4 illustrates the process. Specifically, the boundaries of the QR code modules are first separated into horizontal and vertical directions. For a horizontal boundary, we align the x-coordinates of the matched boundary patches and record, for every pixel, the vector Δp = [Δx_p, Δy_p]^T to the next boundary pixel. In regions where the x-coordinates of two boundary patches overlap, we find a cut point at which their point vectors across the cut are the most similar. Vertical boundaries are handled in the same way by aligning the y-coordinates of the matched boundary patches. This gives us the vector representation of the modified boundary in the form of an array of 2-dimensional vectors.
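The vector representation and the choice of a cut point can be illustrated with a short Python sketch. It assumes that the two matched patches have already been aligned so that their step-vector lists cover the same overlapping range, and it picks the cut where the two step vectors differ least. This is only one plausible reading of "most similar across the cut"; the function names are hypothetical.

```python
def step_vectors(pixels):
    """Vectors from each boundary pixel to the next one along the boundary."""
    return [(x1 - x0, y1 - y0)
            for (x0, y0), (x1, y1) in zip(pixels, pixels[1:])]


def best_cut(steps_a, steps_b):
    """Index in the (non-empty) overlap where the two step sequences agree best."""
    costs = [(ax - bx) ** 2 + (ay - by) ** 2
             for (ax, ay), (bx, by) in zip(steps_a, steps_b)]
    return min(range(len(costs)), key=costs.__getitem__)


def concatenate(steps_a, steps_b):
    """Follow patch A up to the cut point, then continue with patch B."""
    cut = best_cut(steps_a, steps_b)
    return steps_a[:cut] + steps_b[cut:]


# Hypothetical overlapping patches on a horizontal boundary.
patch_a = step_vectors([(0, 0), (1, 0), (2, 1), (3, 1)])
patch_b = [(1, 1), (1, 1), (1, -1)]
```

Working with step vectors rather than absolute positions means the patches can be stitched without first agreeing on where the boundary lies; the positions are recovered later by the boundary reconstruction.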

With the vector representation of the boundary in hand, we want to reconstruct an ideal boundary that satisfies the previously stated goals. This is achieved by minimizing the following energy function, which combines a local feature energy term (E_l) and a global shape energy term (E_g):

$$E = E_l + \lambda E_g. \tag{6.2}$$

Ideally, the vector representation of the reconstructed boundary B′ should resemble the one obtained by concatenating the matched boundary patches. Therefore, the local feature energy term (E_l) is defined as:

$$E_l = \sum_{p_i} \left\| \Delta p'_i - \Delta p_i \right\|^2, \tag{6.3}$$

where p_i ranges over the boundary pixels.

The goal of the global shape energy term (E_g) is to prevent the modified boundary from changing too much, so we uniformly sample a set of points along the contours (usually at intervals of the module size¹) and fix their positions:

$$E_g = \sum_{c_j} \left\| c'_j - c_j \right\|^2, \tag{6.4}$$

where c_j and c′_j denote the positions of a sample pixel in the original and in the modified boundary, respectively.

¹ We set the width of a module to 32 pixels and the window size to 41 pixels in our implementation.

Figure 6.4: An illustration of concatenating matched patches on a horizontal boundary to form the vector representation of the modified boundary. The panels show the original horizontal boundary pixels, the matched boundary patches with their vectors ∆p, and the constructed vector representation of the boundary with the cut point.

The energy function is quadratic and can thus be minimized by solving a linear system. In Eq. (6.2), λ controls the contribution of the global shape energy term E_g to the energy function. A larger λ better preserves the global shape of the QR code modules, as shown in Figure 6.5. We use λ = 1 to generate all results. The boundary patch matching and boundary modification steps are performed iteratively until convergence. After the location of each pixel has been optimized, neighboring pixels may no longer be connected to each other. Hence, we close the boundary and fill the interior of the optimized contours with color. In addition, we apply anti-aliasing to smooth the rendered stylized modules.
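To make the linear system concrete, the Python sketch below minimizes Eq. (6.2) for an open boundary segment. It assumes that the step vector of pixel i is p_{i+1} − p_i and that the sampled points of Eq. (6.4) are given as a mapping from pixel index to target position. Under these assumptions, the x and y coordinates decouple and each leads to a tridiagonal system, solved here with the Thomas algorithm. All names are hypothetical, and the original implementation may assemble and solve the system differently.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (lower[0], upper[-1] unused)."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def reconstruct_coordinate(steps, anchors, lam=1.0):
    """Minimize sum_i (x[i+1] - x[i] - steps[i])^2 + lam * sum_j (x[j] - anchors[j])^2.

    Needs at least one anchor and lam > 0; otherwise the system is singular.
    """
    n = len(steps) + 1
    diag, rhs = [0.0] * n, [0.0] * n
    for i, s in enumerate(steps):       # local feature term, Eq. (6.3)
        diag[i] += 1.0
        diag[i + 1] += 1.0
        rhs[i] -= s
        rhs[i + 1] += s
    for j, target in anchors.items():   # global shape term, Eq. (6.4)
        diag[j] += lam
        rhs[j] += lam * target
    off = [-1.0] * n
    return solve_tridiagonal(off, diag, off, rhs)


def reconstruct_boundary(step_vectors, anchor_points, lam=1.0):
    """Solve the x and y systems independently and zip them into pixels."""
    xs = reconstruct_coordinate([s[0] for s in step_vectors],
                                {j: c[0] for j, c in anchor_points.items()}, lam)
    ys = reconstruct_coordinate([s[1] for s in step_vectors],
                                {j: c[1] for j, c in anchor_points.items()}, lam)
    return list(zip(xs, ys))
```

When the step vectors and the anchors are compatible, the reconstruction reproduces them exactly; when they conflict, λ decides how far the boundary is pulled toward the anchored positions, which matches the behaviour described for larger values of λ.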

Figure 6.5: A comparison of synthesized results using different values of λ (λ = 0.001, 0.01, 0.1, and 1). A larger value of λ restricts the contour pixels more strongly to the shape of the original contour.

Chapter 7 Results

We applied our framework to embellish a wide variety of QR codes, as shown in Figure 7.2, Figure 7.3, and Figure 7.4. With the input images and exemplars on the left, we can effectively decorate the QR codes while minimizing the error as much as possible.

To reduce the QR code error caused by the embedded images, darker objects should fit into black modules and lighter objects into white modules. As a result, the head and body of the Merlion in Figure 7.4 (b) and the bodies of the running people in Figure 7.2 (a) are slightly squashed to fit the shape of the black modules, and the body of the snake in Figure 7.4 (a) is stretched to fit into the black modules. On the other hand, the glass in Figure 7.3 (a) is compressed into the white modules. In some cases, uniform scaling alone also reduces the error substantially; Figure 7.3 (b) is an example. The slight distortion is hard to notice, yet it effectively increases readability while preserving the artistic quality. Figure 7.2 (a) and (b) also show that our algorithm handles directional input exemplars well, that is, cases in which different sides of the modules should have different features.

Figure 7.1: Demonstration of imposing a huge distortion weight on a special region that should not be deformed too much. The blue meshes in the right figure denote the region with the huge distortion weight. The image is obtained from [3].

Figure 7.2: Some results generated with our framework. Each of the panels (a) and (b) shows the exemplar, the input image, and the result. Both cases use directional exemplars. The upper input image is obtained from [2].

Figure 7.3: Some results generated with our framework. Each of the panels (a) and (b) shows the exemplar, the input image, and the result. These two cases show that scaling may strongly reduce the QR code error. The input images are obtained from [6] and [1].

Figure 7.4: Some results generated with our framework. Each of the panels (a) and (b) shows the exemplar, the input image, and the result. The upper image is deformed globally, and the bottom image is deformed locally. The input images are obtained from [7] and [3].

Figure 7.1 demonstrates that for some regions which should not be distorted, users are allowed to impose a huge weight of distortion error on the corresponding meshes. Then, our optimization would preserve the shape of the specified regions as much as possible while minimizing the error caused by other regions.

To validate the effectiveness of the codeword-level and module-level errors, we made the comparison shown in Figure 7.7. The embedded panda is warped with respect to all three error levels (Figure 7.7 (a)) and with respect to the pixel-level error only (Figure 7.7 (b)). Although Figure 7.7 (b) has a lower pixel-level error (2.6% for (a) versus 2.3% for (b)), Figure 7.7 (a) has fewer incorrect codewords and modules (2 incorrect codewords and 7 incorrect modules for (a) versus 3 incorrect codewords and 8 incorrect modules for (b)).

Figure 7.5: Comparison with a previous method [23]. (a) A result of the previous method, which shifts and rotates the inserted image. (b) A result of our method, with no global transformation and only slight distortion.

7.1 Comparison with Previous Methods

We compare our results with the previous method proposed by Ono et al. [23]. With global transformations, the embedded image may undergo a large translation or rotation that the user does not intend (Figure 7.5 (a)). Our method instead allows the input image to undergo a slight deformation (Figure 7.5 (b)). The methods of [16] and [35] force each module to have a single color, so they must adopt a higher QR code version to approximate the original resolution of the inserted image. Our method does not suffer from reduced resolution or noise on the inserted image, but it may offer a smaller editable region than those methods. Figure 7.6 shows the comparison.

Figure 7.6: Comparison with a previous method [35]. (a) A result of the previous method, which adopts a non-systematic encoding method to form the resulting QR code. (b) A result of our method, which does not reduce the input image resolution but slightly deforms the input image.

7.2 Performance

We implemented our method on a desktop PC equipped with an Intel i7 3.5 GHz CPU and 16 GB of RAM. The QR code module stylization is efficient: its average execution time is only 24 ms. The computational complexity of the code-aware non-uniform warping depends on the number of vertices and on the sizes of the input image and the QR code image. Our input images are usually about 2.5 megapixels, and the QR code images are usually about 6.4 megapixels. Overall, the warping takes about 5 to 8 minutes to converge because the optimization must perform texture mapping at every iteration.

Figure 7.7: Comparison of optimizing with and without the three error levels on the same input image [4]. The images in the left column show the incorrect pixels, modules, and codewords from top to bottom, respectively. (a) The result of optimizing with all three error levels. (b) The result of optimizing with the pixel-level error only.


7.3 Discussion and Limitation

To evaluate the readability of the embellished QR codes, we report the ideal number of incorrect codewords, modules, and pixels in our system. Additionally, we tested the readability with nine different kinds of QR code readers (e.g., i-nigma, Scan, and ScanMyDoc), as shown in Table 7.2. The readability is calculated as the number of readers that successfully recognize the code divided by the total number of readers used (12 in our experiment). According to Table 7.2, our optimization reduces the codeword error by 10% on average. Different QR code readers may adopt different methods to analyze QR codes and rectify the perspective effect, so some readers have a lower successful detection rate. For these reasons, we cannot guarantee that the embellished QR codes are always readable. For example, in the third row of Table 7.2, although the QR code error is below the theoretical value that the M level can tolerate, a few QR code readers still cannot decode the embellished result. We further examine the incorrect codeword percentage by randomly choosing weights in the range proposed in Section 5, thirty times for ten different images. The statistics are plotted in Figure 7.8: the average percentage of incorrect codewords is 19% before the optimization and is reduced to 9% after the optimization. Because the incorrect codeword rate is over 7% while the maximum incorrect codeword rate is under 16%, the M level of error correction appears sufficient to recover the errors after our optimization, whereas unrecognizable QR codes may result if users choose the L level. Moreover, if users impose a huge distortion weight globally, or embed an image so large that it covers almost the entire QR code symbol, there may be no good way for us to diminish the error.
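The two quantities used in this discussion, the reader-based readability ratio and the comparison of a codeword error rate against the error correction levels, can be sketched in Python as follows. The capacities are the nominal values usually quoted for QR codes (about 7%, 15%, 25%, and 30% for levels L, M, Q, and H); the function names are hypothetical, and passing this check does not guarantee that a particular reader decodes the code, as noted above.

```python
# Nominal recoverable codeword fractions per error correction level.
EC_CAPACITY = {"L": 0.07, "M": 0.15, "Q": 0.25, "H": 0.30}


def readability(successful_readers, total_readers):
    """Fraction of tested readers that decoded the code."""
    if total_readers <= 0:
        raise ValueError("total_readers must be positive")
    return successful_readers / total_readers


def tolerant_levels(codeword_error_rate):
    """Error correction levels whose nominal capacity covers the error rate."""
    return [level for level, capacity in EC_CAPACITY.items()
            if codeword_error_rate <= capacity]
```

Applied to the averages reported above, an incorrect codeword rate of 19% is nominally covered only by levels Q and H, while 9% is also covered by level M, which is in line with the observation that level M suffices after the optimization but level L does not.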

To further verify the robustness of the embellished QR codes, we tested the readability from different viewing directions. Table 7.1 shows the statistics obtained by testing 6 embellished QR codes with a 100% detection rate and their original counterparts. According to Table 7.1, the more the viewing angle deviates from the center, the more likely an embellished QR code is to be unrecognizable. Regular QR codes also suffer from deviating viewing angles. Fortunately, users usually scan a QR code at a small viewing angle to ensure accurate recognition. Our embellished QR codes achieve readability similar to that of regular QR codes when the viewing angle is small.

Our QR code stylization method effectively preserves the QR code correctness. Table 7.3 shows the error statistics when stylizing the modules with different exemplars and values of λ (described in Eq. (6.2)). The percentage of incorrect codewords is almost always less than the error correction capability of the lowest error correction level (7%).

Although our framework can semi-automatically assist in embedding an image into a QR code, the input image should not cover the finder patterns and alignment patterns in the initial placement stage, since these patterns are crucial to the decoding process and should always remain visible. In addition, our method finds a small region of the code where an image can be embedded without introducing too much error. Our method therefore shares a similar limitation: the larger the embedded image, the more likely the embellished QR code is to be unrecognizable.


Figure 7.8: We randomly tested 30 different weight combinations and plotted the percentage of incorrect codewords before and after the optimization. The left and right red bars show the ranges of the percentage of incorrect codewords before and after the optimization, respectively. The red dotted lines denote the error percentage that each level can tolerate, and the black straight lines that cross the red bars denote the average
