
Joint template matching algorithm for associated multi-object detection.

1. Introduction

Object detection has attracted considerable interest in image analysis, recognition and tracking. The most common approach to object detection is template matching (TM) [1][2]. TM algorithms fall into two categories: those based on features [3][4][5][6][7] and those based on normalized cross correlation (NCC) [8][9][10][11][12]. The former require object features that are invariant to position, scale and rotation; however, extracting such features is difficult in many situations. The latter compute the similarity between a template and an unknown object from their cross correlation, and are widely used because of their robustness to brightness variations [13]. However, the NCC algorithm has clear drawbacks when detecting multiple objects. First, only one object can be detected per image, because NCC yields a single maximum, so the miss-detection rate is high. Second, false detections may occur when the object images are affected by occlusion, distortion, lighting variation or similar-looking objects [14][15].

If the relative positions among different objects are known and the set of possible position distributions in the image is enumerable, we call this associated multi-object detection. To reduce the miss-detection and false-alarm rates for associated multi-object detection, we propose a joint template matching (JTM) algorithm. First, a joint template (JT) is constructed from the relative positions among the objects. Then, a matching criterion based on NCC is constructed for multi-object matching. Finally, the proposed algorithm is applied to watermark detection in bills, with good results.

The rest of this paper is organized as follows. Section 2 describes the proposed algorithm. Section 3 presents results for watermark detection, together with a comparison between the proposed algorithm and the traditional NCC algorithm. Finally, Section 4 draws conclusions.

2. The Proposed Algorithm

2.1 Construction of JT

For associated multi-object detection, we can list every position distribution in which the objects may appear in the image, based on the relative positions among the objects. This set of position distributions is called the JT; it represents all position distributions in which the unknown objects may appear in the image.

There are two basic assumptions for constructing a JT:

(1) The set of position distributions of the multi-object is enumerable;

(2) The relative position between any two objects is known or measurable.

Let $P$ be the number of members in the set of position distributions; $P$ is known by condition (1). Then $JT = \{JT_k \mid 1 \le k \le P\}$, where $JT_k = \{T_k^l \mid 1 \le l \le Q\}$, $T_k^l$ is the $l$-th template in the $k$-th member, and $Q$ is the total number of templates in the $k$-th member. The relationship among the templates is determined by condition (2). Let $(x_k^1, y_k^1)$ be the coordinate of $T_k^1$ in the image; then the coordinate of $T_k^l$ is $(x_k^1 + dx_k^l,\ y_k^1 + dy_k^l)$. By condition (2), $dx_k^l$ and $dy_k^l$ are known or measurable.
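As a concrete illustration, the JT can be held in a small data structure. The following Python sketch is ours (the paper gives no code), and the type names are hypothetical:

```python
from dataclasses import dataclass
from typing import List
import numpy as np

@dataclass
class OffsetTemplate:
    """One template T_k^l together with its offset (dx_k^l, dy_k^l)
    relative to the member's anchor template T_k^1."""
    patch: np.ndarray  # the M x N template image
    dx: int            # horizontal offset, known by assumption (2)
    dy: int            # vertical offset, known by assumption (2)

# A member JT_k is a list of Q offset templates (the anchor has offset (0, 0)),
# and the joint template JT is the list of all P members.
Member = List[OffsetTemplate]
JT = List[Member]
```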

2.2 Joint Template Matching

The basic idea of JTM is to find the multi-object configuration that best matches the JT. We perform JTM by computing the NCC between templates and candidate objects. The steps are as follows:

Step 1: Calculate the NCC between the image and the $k$-th member of the JT.

Let $(m, n)$ be the coordinate of $T_k^1$ in the image; then the NCC between the image and the $k$-th member of the JT is

$$R_k(m,n) = \frac{1}{Q}\sum_{l=1}^{Q} R_k^{\max}(m,n,l) \qquad (1)$$

where

$$R_k^{\max}(m,n,l) = \max_{\substack{-D_x \le x \le D_x\\ -D_y \le y \le D_y}} \frac{\displaystyle\sum_{j=1}^{N}\sum_{i=1}^{M} f(m + dx_k^l + x + i,\ n + dy_k^l + y + j)\, T_k^l(i,j)}{\sqrt{\displaystyle\sum_{j=1}^{N}\sum_{i=1}^{M} f(m + dx_k^l + x + i,\ n + dy_k^l + y + j)^2 \sum_{j=1}^{N}\sum_{i=1}^{M} T_k^l(i,j)^2}} \qquad (2)$$

Here $f(i, j)$ is the image under examination and $T_k^l(i,j)$ is the $l$-th template of the $k$-th member, of size $M \times N$, located at pixel coordinate $(m + dx_k^l + x,\ n + dy_k^l + y)$. Because image distortion and calculation error always exist, there is some deviation between the relative positions encoded in the multi-template and those of the actual objects. To find the best matching position of a single template $T_k^l$, we therefore compute the maximum NCC over a neighborhood region (NR) of the template: $R_k^{\max}(m, n, l)$ is the maximum NCC as the point $(x, y)$ moves pixel by pixel over the NR, of size $(2D_x + 1) \times (2D_y + 1)$.
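A direct, unoptimized sketch of this computation, assuming grayscale images stored as numpy arrays indexed [row, col]; the function names and the bounds handling are ours:

```python
import numpy as np

def ncc(win: np.ndarray, tmpl: np.ndarray) -> float:
    """Non-centered normalized cross correlation of a window and a
    template, i.e. the quotient inside Eq. (2)."""
    den = np.sqrt(float((win.astype(float) ** 2).sum()) *
                  float((tmpl.astype(float) ** 2).sum()))
    return float((win.astype(float) * tmpl).sum()) / den if den > 0 else 0.0

def r_max(f: np.ndarray, tmpl: np.ndarray, top: int, left: int,
          Dx: int = 7, Dy: int = 5) -> float:
    """R_k^max(m, n, l): the best NCC as the template slides over the
    (2*Dx+1) x (2*Dy+1) neighborhood NR around its nominal top-left
    corner (top, left); out-of-image shifts are simply skipped."""
    h, w = tmpl.shape
    best = 0.0
    for dy in range(-Dy, Dy + 1):
        for dx in range(-Dx, Dx + 1):
            r0, c0 = top + dy, left + dx
            if 0 <= r0 and 0 <= c0 and r0 + h <= f.shape[0] and c0 + w <= f.shape[1]:
                best = max(best, ncc(f[r0:r0 + h, c0:c0 + w], tmpl))
    return best
```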

Step 2: Find the best matching position of the $k$-th member in the image.

Let $S$ be the region over which $T_k^1$ can move in the image. As $(m, n)$ moves over $S$, we find the position where $R_k(m, n)$ is maximal; this is the best matching position of the $k$-th member in the image. The maximum of $R_k(m, n)$ is denoted $R_k$:

$$R_k = \max_{(m,n) \in S} R_k(m,n) \qquad (3)$$

Step 3: Find the best matching position of the JT.

After finding the best matching position of every member of the JT, we take the member whose $R_k$ is maximal; its position is the best matching position of the JT:

$$R = \max_{1 \le k \le P} R_k \qquad (4)$$

Suppose $R$ attains its maximum at member $JT_k$, with $(m, n)$ at $(m_k, n_k)$. We can then detect all the unknown objects corresponding to the templates in $JT_k$; their positions in the image are obtained by adding $(m_k, n_k)$ to the offsets of the corresponding templates in $JT_k$.
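Putting the three steps together, a brute-force search can be sketched as follows, under the same assumptions and reusing r_max from the sketch above:

```python
def jtm_detect(f, jt, S, Dx=7, Dy=5):
    """Steps 1-3: score every member JT_k at every anchor (m, n) in S
    (Eq. (1)), keep each member's best anchor (Eq. (3)), and return the
    best member overall (Eq. (4)). jt[k] is a list of OffsetTemplate."""
    best_R, best_k, best_mn = -1.0, None, None
    for k, member in enumerate(jt):
        for (m, n) in S:
            # Eq. (1): mean of the per-template neighborhood maxima;
            # m pairs with the horizontal offsets dx, n with the vertical dy
            Rk = sum(r_max(f, t.patch, n + t.dy, m + t.dx, Dx, Dy)
                     for t in member) / len(member)
            if Rk > best_R:
                best_R, best_k, best_mn = Rk, k, (m, n)
    # object l of the winning member sits at (m + dx_k^l, n + dy_k^l)
    return best_R, best_k, best_mn
```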

Overall, the basic principle of JTM is to match the objects jointly according to their relative positions, which reduces the miss-detection and false-alarm rates relative to single-template matching. The JTM algorithm can be used for associated multi-object detection tasks such as watermark detection in bills and component detection on circuit boards.

3. Watermark Detection in Bills Based on JTM

3.1 Construction of JT for Watermarks

During bill production, a single watermark board is used to print the watermarks, so the number of watermark types is fixed and the relative positions among the watermarks are constant. A bill image is shown in Fig. 1, and the eight watermark types are shown in Fig. 2. Since the relative positions among watermarks in the image are known and the set of watermark position distributions across bills is enumerable, we can construct a JT for the watermarks as described in Section 2.1.

Let $W_0$ be the watermark at the top-left corner of the image; $W_0$ then determines the position distribution of the watermarks in a bill. As $W_0$ varies from $W_1$ to $W_8$, there are 8 different position distributions, so $P = 8$. Ignoring the incomplete watermarks on the image boundaries, each bill contains 18 watermarks, so $Q = 18$. The JT for the watermarks is therefore $JT = \{JT_k \mid 1 \le k \le 8\}$ with $JT_k = \{T_k^l \mid 1 \le l \le 18\}$.

The components of $JT_k$ are shown in Fig. 3, where $T_k^1$ is a watermark of type $W_1$ (here $W_0 = W_1$) with coordinate $(x_k^1, y_k^1)$, and $dx$ and $dy$ are the horizontal and vertical distances between neighboring watermarks, respectively. The coordinate of $T_k^l$ is then $(x_k^1 + l_1 \cdot dx,\ y_k^1 + (l - 6\,l_1) \cdot dy)$, where $l_1 = \lfloor l/6 \rfloor$ and $\lfloor \cdot \rfloor$ rounds down to the nearest integer.
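For instance, the template coordinates of one member follow directly from this formula; the sketch below takes the anchor coordinate and the spacings as arguments:

```python
import math

def watermark_coord(x1: int, y1: int, l: int, dx: int, dy: int):
    """Coordinate of T_k^l from the anchor (x1, y1) = coordinate of
    T_k^1, using the layout formula above with l1 = floor(l / 6)."""
    l1 = math.floor(l / 6)
    return (x1 + l1 * dx, y1 + (l - 6 * l1) * dy)

# Example with the spacings used in Section 3.5: dx = 17, dy = 20.
coords = [watermark_coord(0, 0, l, 17, 20) for l in range(1, 19)]
```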

[Figure 1 omitted: a normalized bill image]

[Figure 2 omitted: the eight types of watermarks]

[Figure 3 omitted: components of $JT_k$]

[Figure 4 omitted: the binary watermark templates]

3.2 Image Preprocessing

Two tasks must be accomplished in the image preprocessing stage:

(1) Image cropping and resizing, to obtain the normalized bill image shown in Fig. 1;

(2) A two-level wavelet transform, to obtain the low-frequency image.

The wavelet transform contributes in two ways: (1) it reduces the computation cost of subsequent processing by shrinking the image while retaining most of its information; and (2) it improves robustness by reducing the influence of lighting variation and noise. In this paper, the Mallat algorithm is used to compute the fast wavelet transform.
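With PyWavelets, this preprocessing step can be sketched as follows; the paper does not name the wavelet family, so the Haar wavelet here is our assumption:

```python
import pywt  # PyWavelets

def low_frequency(img, level: int = 2, wavelet: str = "haar"):
    """Two-level fast wavelet transform (Mallat algorithm); keeping only
    the approximation sub-band shrinks each side by 2 per level, e.g.
    444 x 256 -> 111 x 64 after two levels."""
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    return coeffs[0]  # the low-frequency (approximation) image
```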

3.3 Image Binarization

The degree to which watermarks are occluded by lines, text and frames varies from bill to bill, so it is difficult to obtain good results with a TM algorithm based on gray-level features; shape features matter more here. In this paper we therefore run the JTM algorithm on binary images, which avoids the influence of these factors by discarding the gray-level distribution.

Because the distinction between watermarks and background is faint, we binarize the image with a local-threshold segmentation method. Two functions, $F_1$ and $F_2$, are constructed. $F_1$ describes a distance-weighted Laplacian-style gradient: the weight is the inverse of the distance, so pixels closer to the center contribute more. $F_2$ describes the difference between a pixel's gray value and the average over its neighborhood (SR) of size $(2w + 1) \times (2h + 1)$ pixels. $F_1$ and $F_2$ are defined as:

$$F_1(i,j) = r \sum_{\substack{y=-h\\ x^2+y^2 \ne 0}}^{h} \sum_{x=-w}^{w} \frac{f(i,j) - f(i+x,\, j+y)}{\sqrt{x^2 + y^2}} \qquad (5)$$

$$F_2(i,j) = r \left[ f(i,j) - \frac{1}{(2w+1)(2h+1)} \sum_{y=-h}^{h} \sum_{x=-w}^{w} f(i+x,\, j+y) \right] \qquad (6)$$

where $r$ encodes whether the watermark is black or white. If the watermark is black, $r = 1$ and the binary image is denoted Img_B; otherwise $r = -1$ and the binary image is denoted Img_W. Let $g(i, j)$ be the value of pixel $(i, j)$ in the binary image; then:

$$g(i,j) = \begin{cases} 1, & F_1(i,j) > \theta_1 \ \text{and}\ F_2(i,j) > \theta_2\\ 0, & \text{otherwise} \end{cases} \qquad (7)$$

The binary image contains considerable noise that can interfere with object detection, so mathematical morphology is applied to remove small noise and merge neighboring pixels.
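A possible implementation of this binarization stage is sketched below; the thresholds t1 and t2 (and their combination by AND) are our placeholders, since the paper's formula (7) was not fully reproducible in the source:

```python
import numpy as np
from scipy import ndimage

def binarize(f: np.ndarray, r: int, w: int = 2, h: int = 2,
             t1: float = 0.0, t2: float = 0.0) -> np.ndarray:
    """Local-threshold binarization with the statistics F1 (Eq. (5)) and
    F2 (Eq. (6)). r = 1 targets black watermarks (Img_B), r = -1 white
    ones (Img_W); w = h = 2 gives the 5 x 5 neighborhood SR of Section
    3.5. Edge handling via np.roll (wrap-around) is a simplification."""
    ff = f.astype(float)
    F1 = np.zeros_like(ff)
    mean_acc = np.zeros_like(ff)
    for y in range(-h, h + 1):
        for x in range(-w, w + 1):
            shifted = np.roll(np.roll(ff, -y, axis=0), -x, axis=1)
            if x or y:  # Eq. (5): inverse-distance weighted differences
                F1 += (ff - shifted) / np.hypot(x, y)
            mean_acc += shifted
    F1 *= r
    F2 = r * (ff - mean_acc / ((2 * w + 1) * (2 * h + 1)))  # Eq. (6)
    g = ((F1 > t1) & (F2 > t2)).astype(np.uint8)
    # Morphology: opening removes small noise, closing merges nearby pixels
    g = ndimage.binary_opening(g)
    g = ndimage.binary_closing(g)
    return g.astype(np.uint8)
```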

3.4 Joint Template Matching

The basic unit of JTM is computing the NCC between a single template of the JT and an unknown object in the source image. Because some watermarks are black and others white, we must establish, for each value of $l$, the correspondence between the JT entry, the binary template and the binary image to be matched.

The binary templates corresponding to Fig. 2 are shown in Fig. 4, each of size $M \times N$ pixels. From Fig. 2, Fig. 3 and Fig. 4, the template-image index table of the $k$-th member of the JT is shown in Table 1; those of the other 7 members are constructed in the same way. $R_k^{\max}(m, n, l)$ is then obtained by substituting $T_k^l(i,j)$ and the corresponding $g(i, j)$ into formula (2).
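In code, the index table for one member reduces to a lookup; the entries below are transcribed from Table 1:

```python
# l -> (binary template from Fig. 4, binary image it is matched against)
INDEX_TABLE_K = {
     1: ("Tp1", "Img_W"),  2: ("Tp3", "Img_W"),  3: ("Tp2", "Img_B"),
     4: ("Tp4", "Img_B"),  5: ("Tp1", "Img_W"),  6: ("Tp3", "Img_W"),
     7: ("Tp3", "Img_W"),  8: ("Tp2", "Img_B"),  9: ("Tp4", "Img_B"),
    10: ("Tp1", "Img_W"), 11: ("Tp3", "Img_W"), 12: ("Tp2", "Img_B"),
    13: ("Tp1", "Img_B"), 14: ("Tp3", "Img_B"), 15: ("Tp2", "Img_W"),
    16: ("Tp4", "Img_W"), 17: ("Tp1", "Img_B"), 18: ("Tp3", "Img_B"),
}
```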

From formulas (1), (3) and (4), we can compute $R$ once $R_k^{\max}(m, n, l)$ is available. The region $S$ over which $T_k^1$ can move in the image is determined by the watermark layout of the bill:

$$S = \{(m, n) \mid 0 \le m \le dx,\ 0 \le n \le dy\} \qquad (8)$$

After obtaining $R$ with formula (4), the member $JT_k$ achieving $R$ is taken as the matching member of the JT. The types and positions of the unknown objects in the image are then determined by the templates in $JT_k$, as described in Section 2.2.

Since both the source image and the template image are binary, with background and object pixels denoted by 0 and 1 respectively, we can rewrite $R_k^{\max}(m, n, l)$ as formula (9), which reduces computational cost by avoiding a large number of multiplications:

$$R_k^{\max}(m,n,l) = \max_{\substack{-D_x \le x \le D_x\\ -D_y \le y \le D_y}} \frac{N_{f\_T}}{\sqrt{N_{f\_f} \cdot N_{T\_T}}} \qquad (9)$$

where

$$N_{f\_T} = \sum_{j=1}^{N}\sum_{i=1}^{M} g(m + dx_k^l + x + i,\ n + dy_k^l + y + j)\ \&\ T_k^l(i,j) \qquad (10)$$

$$N_{f\_f} = \sum_{j=1}^{N}\sum_{i=1}^{M} g(m + dx_k^l + x + i,\ n + dy_k^l + y + j)^2 = \sum_{j=1}^{N}\sum_{i=1}^{M} g(m + dx_k^l + x + i,\ n + dy_k^l + y + j) \qquad (11)$$

$$N_{T\_T} = \sum_{j=1}^{N}\sum_{i=1}^{M} T_k^l(i,j)^2 = \sum_{j=1}^{N}\sum_{i=1}^{M} T_k^l(i,j) \qquad (12)$$

Here the symbol "&" denotes the logical AND operator.
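For binary images the evaluation thus reduces to population counts, as in the following sketch:

```python
import numpy as np

def binary_ncc(win: np.ndarray, tmpl: np.ndarray) -> float:
    """NCC specialized to 0/1 arrays, per Eqs. (9)-(12): the products
    become logical ANDs, and squared sums become counts of ones."""
    n_ft = int(np.count_nonzero(win & tmpl))  # Eq. (10)
    n_ff = int(np.count_nonzero(win))         # Eq. (11): sum g^2 = sum g
    n_tt = int(np.count_nonzero(tmpl))        # Eq. (12): sum T^2 = sum T
    den = np.sqrt(n_ff * n_tt)
    return n_ft / den if den > 0 else 0.0
```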

3.5 Experiment and Analysis

Experimental environment: We randomly selected 500 bills and captured 1,500 images using 3 devices, then sorted the images into 3 groups by device. The sizes of the original image, the normalized image and the image after wavelet transform are $720 \times 576$, $444 \times 256$ and $111 \times 64$ (defined as $W \times H$), respectively. The watermark template size is $15 \times 11$ (defined as $M \times N$). The distance parameters $(dx, dy)$ between neighboring watermarks are set to $(17, 20)$.

Experimental method: First, the binary watermark templates are selected as shown in Fig. 4; second, binary images are obtained as described in Section 3.3; finally, the NCC [1] and JTM algorithms are used to detect all 18 watermark objects in each bill. The SR size for image binarization is set to $5 \times 5$, and the NR size for JTM is set to $15 \times 11$ ($D_x = 7$, $D_y = 5$). The results are shown in Table 2.

Experiment analysis:

(1) Object detection performance: Table 2 shows that the miss-detection rate of the NCC algorithm is very high, because it detects only one object at a time, at the maximum of the NCC. For the JTM algorithm, all position distributions in which the objects can appear are listed in the JT, so the miss-detection rate is zero. Moreover, because different watermark types resemble one another and watermarks are occluded by lines, text and frames, the position with the maximum NCC may not be the correct one, so the false-alarm rate of the NCC algorithm is high. The JTM algorithm, by contrast, uses the associated information among the watermarks to match all objects jointly and selects the best distribution from all candidates in the JT, which greatly reduces the false-alarm rate. In addition, the NCC algorithm is sensitive to interference such as varying occlusion across watermark types, varying gray-level distributions caused by the position and intensity of the lighting, and differences in camera position and structure across devices; the JTM algorithm dilutes the influence of such interference on any single object by matching all objects at once, and is therefore more robust than the NCC algorithm.

(2) Algorithm efficiency: Most of the computation in the JTM algorithm goes into evaluating the NCC. Let $t_0$ be the cost of one NCC evaluation. For watermark detection, the cost of the NCC algorithm is $t_1 = (W - M) \times (H - N) \times t_0$; matching one template in JTM costs $t_2 = (2D_x + 1) \times (2D_y + 1) \times t_0$; and matching all templates in JTM costs $t_3 = P \times Q \times dx \times dy \times t_2$. In our experimental environment, $t_1 = 5088\,t_0$, $t_2 = 165\,t_0$ and $t_3 = 8078400\,t_0$. Overall, the computation cost of the JTM algorithm is about 1588 times that of the NCC algorithm; but since JTM detects 18 objects, the average cost per detected object is $448800\,t_0$, about 88 times that of the NCC algorithm. Because the members of the JT are processed independently, parallel processing can be used to improve the efficiency of the JTM algorithm.

(3) Memory cost: Compared with the NCC algorithm, the JTM algorithm needs extra memory to store the multi-object templates and the template-image index table. In our experimental environment this amounts to $4 \times M \times N + P \times Q \times 2 = 948$ bytes.
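The figures in (2) and (3) can be reproduced with a few lines of arithmetic:

```python
W, H, M, N = 111, 64, 15, 11          # image and template sizes
P, Q, dx, dy, Dx, Dy = 8, 18, 17, 20, 7, 5

t1 = (W - M) * (H - N)                # NCC full search: 5088 (x t0)
t2 = (2 * Dx + 1) * (2 * Dy + 1)      # one JTM template over NR: 165
t3 = P * Q * dx * dy * t2             # full JTM search: 8078400
print(t3 / t1, (t3 / Q) / t1)         # ~1588x overall, ~88x per object
print(4 * M * N + P * Q * 2)          # extra memory: 948 bytes
```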

4. Conclusions

This paper proposes a novel JTM algorithm that reduces the miss-detection and false-alarm rates for associated multi-object detection. The algorithm is applicable to associated multi-object detection tasks such as watermark detection in bills and component detection on circuit boards. In practice, the NCC matching criterion can be replaced by feature matching, depending on the specific requirements. The main drawback of the algorithm is its computational cost; future work will focus on improving its efficiency.

DOI: 10.3837/tiis.2012.01.022

References

[1] W. K. Pratt, "Digital Image Processing," John Wiley & Sons, 2001.

[2] B. Perret, S. Lefevre and C. Collet, "A robust hit-or-miss transform for template matching applied to very noisy astronomical images," Pattern Recognition, vol. 42, no. 11, pp. 2470-2480, 2009.

[3] Y. H. Lin and C. H. Chen, "Template matching using the parametric template vector with translation, rotation and scale invariance," Pattern Recognition, vol. 41, no. 7, pp. 2413-2421, 2008.

[4] F. Ullah and S. Kaneko, "Using orientation codes for rotation-invariant template matching," Pattern Recognition, vol. 37, no. 8, pp. 201-209, 2004.

[5] H. Y. Kim, "Rotation-discriminating template matching based on Fourier coefficients of radial projections with robustness to scaling and partial occlusion," Pattern Recognition, vol. 43, no. 3, pp. 859-872, 2010.

[6] F. Essannouni and D. Aboutajdine, "Fast frequency template matching using higher order statistics," IEEE Transactions on Image Processing, vol. 19, no. 3, pp. 826-830, 2010.

[7] S. Mattoccia, F. Tombari and L. Di Stefano, "Efficient template matching for multi-channel images," Pattern Recognition Letters, vol. 32, no. 5, pp. 694-700, 2011.

[8] L. Di Stefano, S. Mattoccia and F. Tombari, "ZNCC-based template matching using bounded partial correlation," Pattern Recognition Letters, vol. 26, no. 14, pp. 2129-2134, 2005.

[9] S. Mattoccia, F. Tombari and L. Di Stefano, "Fast full-search equivalent template matching by enhanced bounded correlation," IEEE Transactions on Image Processing, vol. 17, no. 4, pp. 528-538, 2008.

[10] H. Li, H. B. Duan and X. Y. Zhang, "A novel image template matching based on particle filtering optimization," Pattern Recognition Letters, vol. 31, no. 13, pp. 1825-1832, 2010.

[11] S. D. Wei and S. H. Lai, "Fast template matching based on normalized cross correlation with adaptive multilevel winner update," IEEE Transactions on Image Processing, vol. 17, no. 11, pp. 2227-2235, 2008.

[12] H. B. Duan, C. F. Xu, S. Q. Liu et al., "Template matching using chaotic imperialist competitive algorithm," Pattern Recognition Letters, vol. 31, no. 13, pp. 1868-1875, 2010.

[13] J. C. Yoo and T. H. Han, "Logic operation-based template matching algorithm for one-dimensional signals," IET Signal Processing, vol. 5, no. 2, pp. 261-269, 2011.

[14] C. H. Wu, D. Z. Wang, A. Ip et al., "A particle swarm optimization approach for components placement inspection on printed circuit boards," Journal of Intelligent Manufacturing, vol. 20, no. 5, pp. 535-549, 2009.

[15] A. J. Crispin and V. Rankov, "Automated inspection of PCB components using a genetic algorithm template-matching approach," International Journal of Advanced Manufacturing Technology, vol. 35, pp. 293-300, 2007.

Jianbin Xie (1), Tong Liu (1), Zhangyong Chen (2) and Zhaowen Zhuang (1)

(1) Department of Electronic Science and Engineering, National University of Defense Technology Changsha, China

(2) ZHONGCHAO Enterprise Co., Ltd., Beijing, China

[e-mail: jbxie@126.com, liutong1129@126.com]

* Corresponding author: Tong Liu

Received September 6, 2011; revised November 9, 2011 and December 15, 2011; published January 31, 2012

Jianbin Xie received his B.S., M.S. and Ph.D. degrees from the National University of Defense Technology in 1993, 1995 and 1999, respectively. He is currently a research fellow at the National University of Defense Technology. His main research interests include biometric identification, pattern recognition and machine learning.

Tong Liu received his B.S. and M.S. degrees from the National University of Defense Technology in 2004 and 2006. He is currently working toward a Ph.D. degree at the National University of Defense Technology. His current research interests include biometric identification, pattern recognition and machine learning.

Zhangyong Chen received his B.S. and M.S. degrees from the National University of Defense Technology in 1993 and 1995. His current research interests include biometric identification, pattern recognition and machine learning.

Zhaowen Zhuang received his B.S. and M.S. degrees from the National University of Defense Technology in 1981 and 1984, and his Ph.D. degree from the Beijing Institute of Technology in 1989. He is currently a professor at the National University of Defense Technology. His current research interests include target recognition and satellite navigation.
Table 1. The template-image index table

l               1       2       3       4       5       6
T_k^l(i,j)      Tp_1    Tp_3    Tp_2    Tp_4    Tp_1    Tp_3
g(i,j)          Img_W   Img_W   Img_B   Img_B   Img_W   Img_W

l               7       8       9       10      11      12
T_k^l(i,j)      Tp_3    Tp_2    Tp_4    Tp_1    Tp_3    Tp_2
g(i,j)          Img_W   Img_B   Img_B   Img_W   Img_W   Img_B

l               13      14      15      16      17      18
T_k^l(i,j)      Tp_1    Tp_3    Tp_2    Tp_4    Tp_1    Tp_3
g(i,j)          Img_B   Img_B   Img_W   Img_W   Img_B   Img_B

Table 2. The results of watermarks detection

Algorithm: NCC
Watermark type              W_1     W_2     W_3     W_4     W_5     W_6     W_7     W_8
False-alarm rate, Group 1   13.2%   13.8%   7.2%    6.8%    12.8%   14.2%   7.4%    7.8%
False-alarm rate, Group 2   12.8%   13.8%   6.4%    6.8%    13.4%   14.2%   7.2%    7.6%
False-alarm rate, Group 3   13.2%   13.2%   6.0%    7.0%    13.2%   14.2%   7.2%    7.8%
Miss-detection rate         94.4% (all types)

Algorithm: JTM (W_1~W_8 detected jointly)
False-alarm rate            Group 1: 0.006%   Group 2: 0.004%   Group 3: 0.006%
Miss-detection rate         0.0%