
A 3-Dimensional Object Recognition Method Using Relationships between Feature Points, and Invariance of Local Hue Histogram.

1 INTRODUCTION

1.1 Background of This Study

In recent years, the birthrate has been declining and the population has been aging, and there is concern about a shortage of labor for tasks such as household chores and nursing care at home. Accordingly, the application of robot technology to the living field is expected. In this field, robots that support people's lives are collectively referred to as Life Support Robots [1][2][3]. Such robots are required to perform various tasks to support humans. In particular, the object recognition task is important when people ask robots to transport or rearrange objects. We consider that recognition in a domestic environment requires the following six properties.

1. Robustness against occlusion

2. Fast recognition

3. Pose estimation with high accuracy

4. Coping with erroneous correspondences

5. Recognizing objects in a noisy environment

6. Recognizing objects which have same shape but have different texture

Firstly, the robots need recognition that is robust against occlusion, because occlusion frequently occurs between objects in a domestic environment. Secondly, the robots need to recognize a target object quickly in order to complete the required tasks quickly. Thirdly, the robots need to estimate the pose of a target object with high accuracy in order to manipulate it. Fourthly, the robots need to cope with erroneous correspondences in order to accurately distinguish objects that share the same local features but are not the same; for example, a cube and a rectangular parallelepiped both have similar feature points at their vertices, yet their aspect ratios are completely different. Fifthly, a target object observed with cameras and sensors is likely to contain noise, so the robots need recognition that is robust against noise. Finally, the robots need to accurately recognize objects that have the same shape but different textures.

As a conventional object recognition method using 3-dimensional information, there is model-based recognition such as the previous research by Kudo et al. [4]. The previous research uses the SHOT (Signature of Histograms of OrienTations) descriptor as the feature descriptor [5]. The SHOT descriptor is a 352-dimensional histogram that describes the relationships between a reference point and its surrounding points. This high-dimensional feature description enables highly accurate pose estimation and highly accurate object recognition in noisy environments. Furthermore, the previous research uses the points matched by the SHOT descriptor as feature points. It then generates a list of the distances and angles between these feature points and matches the lists. Thereby, the previous research can cope with erroneous correspondences.

However, because the previous research uses only the shape information of an object, it erroneously recognizes objects that have the same shape but different textures, as shown in Figure 1.

Table 1 shows the properties of the previous research. As mentioned above, the previous research does not satisfy all of the properties.

To satisfy all of the properties for recognition, it is necessary to use not only the shape information of an object but also its texture information.

As a general object recognition method using texture information, Template Matching is widely known [6]. Template Matching determines whether a pattern similar to the template exists in an image region by comparing pixels, as shown in Figure 2.
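The following minimal sketch illustrates this pixel-wise comparison using OpenCV template matching; the file names and the similarity threshold are placeholders, not values from this paper.

```python
import cv2

# Minimal template-matching sketch: slide the template over the scene image
# and score every position by normalized cross-correlation.
scene = cv2.imread("scene.png")        # hypothetical scene image
template = cv2.imread("template.png")  # hypothetical template image

response = cv2.matchTemplate(scene, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(response)

if max_val > 0.8:  # assumed similarity threshold
    print("Template found at", max_loc, "with score", max_val)
```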

Because it compares pixels directly, however, Template Matching may misrecognize objects when changes such as scale change or rotation change are applied to a target object. Since the recognition environment is unspecified when recognizing an object in a domestic environment, such changes are applied to a target object with high probability. Therefore, we think that a recognition method that is robust against these changes is indispensable in this research.

As an object recognition method that is robust against such changes, SIFT (Scale Invariant Feature Transform) is widely known [7]. SIFT uses the points that have extreme values in DoG (Difference of Gaussians) images as feature points. Furthermore, the SIFT descriptor is expressed by a gradient histogram based on the orientation of the feature point, as shown in Figure 3.

Thereby, SIFT can correctly recognize objects even when scale change, rotation change or illumination change occurs. However, the SIFT descriptor is easily affected by perspective projection, which adds distortion to an image. Furthermore, objects with few textures have hardly any local luminance gradient, so SIFT has difficulty describing their features. On the other hand, there is Color Indexing as a method that is robust against perspective projection [8]. Color Indexing uses a 3-dimensional color histogram based on the RGB values in an image as the feature descriptor. Figure 4 shows an example of a 3-dimensional color histogram. As shown in Figure 4, the size of each square in the 3-dimensional color histogram expresses the frequency of the corresponding color.

The values of the 3-dimensional color histogram are hardly affected by scale change, rotation change or perspective projection. Therefore, Color Indexing can correctly recognize objects even when these changes occur. However, the RGB color system used by Color Indexing is easily affected by lighting.
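As an illustration of the idea behind Color Indexing, the sketch below builds a coarse 3-dimensional RGB histogram and compares two histograms with histogram intersection; the bin count is an assumption and the code is not the implementation used in [8].

```python
import numpy as np

def rgb_histogram(image, bins=8):
    # 3-dimensional color histogram over the R, G and B axes.
    pixels = image.reshape(-1, 3).astype(np.float64)
    hist, _ = np.histogramdd(pixels, bins=(bins, bins, bins),
                             range=((0, 256), (0, 256), (0, 256)))
    return hist

def histogram_intersection(h_model, h_image):
    # Swain & Ballard's intersection measure, normalized by the model histogram.
    return np.minimum(h_model, h_image).sum() / h_model.sum()
```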

Table 2 shows the properties of these methods. As mentioned above, neither method satisfies all of the properties.

Therefore, to compensate for the defects of SIFT and Color Indexing, we developed the previous research using texture information for object recognition [9]. The previous research using texture information focuses on the invariance of the positions of the unevenness of the hue histogram. Figure 5 shows the invariance of the positions of the unevenness of the hue histogram under each change.

As shown in Figure 5, the positions of the unevenness of the hue histogram do not change even when scale change, rotation change, illumination change or perspective projection occurs. Furthermore, even when occlusion occurs, new unevenness may appear in other places, but the positions of the original unevenness do not change. For these reasons, the previous research using texture information uses the positions of the unevenness of the hue histogram as the feature descriptor. In addition, the previous research using texture information divides an image into plural regions and generates a hue histogram for each divided region. Thereby, even if objects have similar overall hue values, the previous research using texture information can distinguish them accurately. For these reasons, we adopt the previous research using texture information as the object recognition method using texture information.

1.2 Purpose of This Study

To satisfy the six properties for recognition shown in Table 1, we propose a 3-dimensional object recognition method using the relationships between feature points and the invariance of local hue histograms. Our approach is as follows. Firstly, the proposed method extracts correspondence points by matching lists that consist of the distances and angles between feature points. Secondly, it estimates the pose of a target object with high accuracy and performs registration between objects. Thirdly, it projects the objects after registration onto a 2-dimensional plane. Fourthly, it divides the 2-dimensional planes into plural regions and generates a hue histogram for each divided region. Finally, it extracts the positions of the unevenness from the generated hue histograms as invariant features and performs matching based on the extracted invariant features. Thereby, the proposed method can accurately recognize objects that have the same shape but different textures.

2 PROPOSED METHOD

2.1 Flow of the Proposed Method

In this section, we give an overview of the proposed method based on its processing flow. Figure 6 shows the flow of the proposed method.

As shown in Figure 6, the proposed method consists of a candidate region extraction process and a recognition process. We describe each process in the following sections.

2.2 Input Point Cloud Data

Firstly, the proposed method takes a teaching data and a scene data as input, as shown in Figure 7. Here, the teaching data contains shape information and texture information for the entire circumference of the object.

2.3 Object Region Extraction

To extract useful information from a huge amount of 3-dimensional data, the proposed method segments object regions in the scene data. In this paper, we assume that the target object is on a table or a floor in a domestic environment, as shown in Figure 7 (b). Therefore, the proposed method firstly detects the plane region by applying a plane detection method using RANSAC [10] and excludes it. Secondly, the proposed method applies a clustering method to separate each object, as shown in Figure 8.
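A minimal sketch of this candidate-region extraction step is shown below. It assumes Open3D as the point cloud library (the paper does not name one), and the file name, RANSAC threshold and clustering parameters are assumed values.

```python
import numpy as np
import open3d as o3d

pcd = o3d.io.read_point_cloud("scene.pcd")  # hypothetical scene file

# Plane detection with RANSAC: find the table/floor plane and remove it.
_, plane_inliers = pcd.segment_plane(distance_threshold=0.01,
                                     ransac_n=3, num_iterations=1000)
objects = pcd.select_by_index(plane_inliers, invert=True)

# Cluster the remaining points so that each cluster becomes one object region.
labels = np.array(objects.cluster_dbscan(eps=0.02, min_points=50))
clusters = [objects.select_by_index(np.where(labels == k)[0])
            for k in range(labels.max() + 1)]
```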

2.4 SHOT Descriptor Description

To extract feature points, the proposed method uses the SHOT (Signature of Histograms of OrienTations) descriptor as the feature descriptor. The SHOT descriptor is a 352-dimensional histogram that describes the relationships between a reference point and its surrounding points. Therefore, the surface features of a 3-dimensional model can be described with uniqueness and repeatability by using the SHOT descriptor. In this section, we explain how the SHOT descriptor is computed, following Figure 9.

As shown in Figure 9, the SHOT descriptor is defined by histograms of the normal directions of the surrounding point group. Firstly, the reference point is surrounded by a sphere; this sphere defines the range of points used for the description. The sphere is divided into 32 volumes: 2 divisions along the z axis, 2 radial divisions (inner and outer), and 8 azimuthal divisions in the xy plane. Finally, in each volume, the inner product of the normal $n_i$ of each point in the volume and the normal $r$ of the reference point is calculated. If the normals are normalized, the inner product equals $\cos\theta$ between $r$ and $n_i$. Since $\cos\theta$ takes values from 0 to 1 for $-90^\circ \le \theta \le 90^\circ$, it is divided into bins and converted into a histogram.
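The sketch below shows only the per-volume binning step described above, i.e. computing $\cos\theta$ between the reference normal and the neighbor normals and converting it into a histogram; the full SHOT descriptor repeats this for all 32 volumes and concatenates the histograms into 352 dimensions. The bin count and the value range are implementation assumptions.

```python
import numpy as np

def cosine_histogram(reference_normal, neighbor_normals, n_bins=11):
    # Inner products between the (normalized) reference normal and each
    # neighbor normal give cos(theta); these values are binned into a histogram.
    r = reference_normal / np.linalg.norm(reference_normal)
    n = neighbor_normals / np.linalg.norm(neighbor_normals, axis=1, keepdims=True)
    cos_theta = n @ r
    hist, _ = np.histogram(cos_theta, bins=n_bins, range=(-1.0, 1.0))
    return hist
```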

2.5 Feature Points Extraction

The SHOT descriptor is represented by a high-dimensional vector, and a KNN search is used to match these descriptors. If the distance to the first nearest neighbor obtained by the KNN search is sufficiently smaller than the distance to the second nearest neighbor, the match is considered distinctive enough for discrimination and is saved. Conversely, if there is little difference between the distances to the first and second nearest neighbors, the match is considered unstable and is excluded. Furthermore, the most reliable matches are selected from the saved point group. In this research, the matched points are registered as feature points.
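A minimal sketch of this ratio-test matching is shown below; the descriptor arrays are placeholders and the ratio threshold is an assumed value.

```python
import numpy as np
from scipy.spatial import cKDTree

def match_descriptors(desc_teach, desc_scene, ratio=0.8):
    # For each teaching descriptor, find its two nearest scene descriptors and
    # keep the match only when the first is clearly closer than the second.
    tree = cKDTree(desc_scene)
    dists, idx = tree.query(desc_teach, k=2)
    keep = dists[:, 0] < ratio * dists[:, 1]
    return np.flatnonzero(keep), idx[keep, 0]  # teaching indices, scene indices
```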

2.6 List Generating

In the list generating process, the proposed method generates a list of the distances and angles between the extracted feature points as their relationships. To generate this list, the proposed method firstly sorts the extracted feature points in descending order of the dispersion of their SHOT descriptors, as shown in Figure 10.

The dispersion is calculated by the following equation.

$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2 \qquad (1)$$

where $\sigma^2$ is the dispersion, $n$ is the number of data, $x_i$ is each datum, and $\bar{x}$ is the average value.

Secondly, the proposed method extracts as many combinations of three points as possible, based on the order of the sorted feature points, and describes their relationships in a list as shown in Figure 11, Table 3 and Table 4.
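The sketch below illustrates this list generation step, under the assumption that the two stored distances of each triple are measured from corresponding point 1 and that the angle is the one between those two vectors (consistent with the matching rule in Section 2.7).

```python
import numpy as np
from itertools import combinations

def generate_list(feature_points):
    # feature_points: (N, 3) array, already sorted by descending SHOT dispersion.
    entries = []
    for i, j, k in combinations(range(len(feature_points)), 3):
        p1, p2, p3 = feature_points[i], feature_points[j], feature_points[k]
        v12, v13 = p2 - p1, p3 - p1
        d12, d13 = np.linalg.norm(v12), np.linalg.norm(v13)
        cos_a = np.clip(np.dot(v12, v13) / (d12 * d13), -1.0, 1.0)
        angle = np.degrees(np.arccos(cos_a))
        entries.append((i, j, k, d12, d13, angle))  # one row of Table 3 / Table 4
    return entries
```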

2.7 List Matching

In the list matching process, the proposed method matches the list of the teaching data against the list of each cluster data. As shown in Figure 11, Table 3 and Table 4, each list entry contains the distances between the feature points and an angle as its elements. The proposed method then matches list number 1 of the teaching data against all the lists of each cluster data. For each candidate, it sums the difference of the distances between corresponding points 1 and 2, the difference of the distances between corresponding points 1 and 3, and the difference of the angle between those vectors; if this sum is the minimum and is less than a threshold, the list having it is registered as the corresponding list. At this time, the feature points in each element of these lists are associated. Thereby, the proposed method can eliminate mismatched points that occur when matching with the SHOT descriptor.
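A minimal sketch of this comparison is shown below, assuming list entries in the format produced by generate_list() above; the threshold value is an assumption.

```python
def match_lists(entry_teach, entries_cluster, threshold=5.0):
    # Sum of the two distance differences and the angle difference; the cluster
    # entry with the minimum sum is accepted if the sum is below the threshold.
    best, best_cost = None, float("inf")
    for entry in entries_cluster:
        cost = (abs(entry_teach[3] - entry[3])
                + abs(entry_teach[4] - entry[4])
                + abs(entry_teach[5] - entry[5]))
        if cost < best_cost:
            best, best_cost = entry, cost
    return best if best_cost < threshold else None
```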

2.8 Rigid Registration

To estimate the pose of the target object in the scene data, the proposed method applies rigid registration to the teaching data, as shown in Figure 12. Firstly, the proposed method fits the teaching data to each cluster data in the scene by calculating the optimum rotation matrix $R$ and translation vector $t$ from the associated feature points. Secondly, the proposed method calculates a corresponding rate $M$ between the fitted teaching data and each cluster data by using

$$\mathrm{Score} = \sum_{i=1}^{N}
\begin{cases}
1 & \text{if } \min_{j}\,\mathrm{dist}_{ij} < th_c \\
0 & \text{otherwise}
\end{cases},
\qquad \mathrm{dist}_{ij} = \left\| p_i - q_j \right\| \qquad (2)$$

$$M = \frac{\mathrm{Score}}{L} \times 100 \qquad (3)$$

where $N$ is the number of points of the teaching data, $L$ is the number of points of each cluster data, $p_i$ is a point of the fitted teaching data, and $q_j$ is a point of the cluster data in the scene. The proposed method counts, as the score of equation (2), the number of points $p_i$ that lie within the threshold $th_c$ (1 mm) of some $q_j$. It then calculates the corresponding rate $M$ from the score by equation (3). Finally, the proposed method selects the objects in the scene data whose corresponding rate is higher than a threshold value as candidate regions, as shown in Figure 13.
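A minimal sketch of the corresponding rate $M$ of equations (2) and (3) is shown below; it assumes the point clouds are given as (N, 3) arrays and that the coordinates are in meters, so that $th_c$ = 1 mm becomes 0.001.

```python
import numpy as np
from scipy.spatial import cKDTree

def corresponding_rate(fitted_teaching, cluster, th_c=0.001):
    tree = cKDTree(cluster)
    dists, _ = tree.query(fitted_teaching, k=1)   # distance to the nearest q_j for each p_i
    score = np.count_nonzero(dists < th_c)        # equation (2)
    return score / len(cluster) * 100.0           # equation (3): M = Score / L * 100
```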

2.9 2-Dimensional Projection

The proposed method projects the fitted teaching data and each candidate region onto a 2-dimensional plane to extract texture information. Problems related to rotation change and scale change do not need to be considered, since the positions of the teaching data and each candidate region are already aligned by the rigid registration process. Therefore, we use parallel projection as the 2-dimensional projection method. In parallel projection, a 3-dimensional point cloud of an object is projected onto a 2-dimensional plane by using

$$x' = x, \quad y' = y, \quad z' = \mathrm{const} \qquad (4)$$

as shown in Figure 14, where $(x, y, z)$ is a point before conversion and $(x', y', z')$ is the point after conversion.

Figure 15 shows the result of projecting the teaching data and one of the candidate regions onto the 2-dimensional plane.

2.10 Object Region Division

To generate the local hue histograms, the proposed method divides the planes of the teaching data and each candidate region into plural regions. Here, we set the number of divisions to four as an example.

2.11 Hue Histogram Generation

To generate the hue histograms, the proposed method extracts the hue from each divided region of the teaching data and each candidate region. Then, the proposed method generates a local hue histogram for each region. The hue value of the generated histogram ranges from 0 to 359. Figure 16 shows the divided regions and the local hue histograms, where the vertical axis is the frequency and the horizontal axis is the hue value.

Here, we focus on item 2 of Figure 16. Figure 17 (a) shows an enlarged hue histogram of item 2 of Figure 16. In Figure 17 (a), there are small irregularities at hue values 100 and 102, so the feature descriptor becomes unstable if the positions of the peaks and troughs of the hue histogram are extracted in this state. Therefore, to eliminate these small irregularities, the proposed method smooths the hue histogram with a Gaussian filter. Figure 17 (b) shows the smoothed hue histogram of Figure 17 (a). In Figure 17 (b), the small irregularities of the hue histogram are removed, while the characteristic positions of the peaks and troughs remain.
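The sketch below computes the local hue histogram of one divided region and smooths it with a Gaussian filter; the smoothing width sigma is an assumed parameter.

```python
import numpy as np
from matplotlib.colors import rgb_to_hsv
from scipy.ndimage import gaussian_filter1d

def local_hue_histogram(rgb_region, sigma=2.0):
    # rgb_region: (H, W, 3) array with values in [0, 255].
    hsv = rgb_to_hsv(rgb_region.astype(np.float64) / 255.0)
    hue = (hsv[..., 0] * 360.0).astype(int) % 360          # hue values 0..359
    hist = np.bincount(hue.ravel(), minlength=360).astype(np.float64)
    # mode="wrap" respects the circular nature of the hue axis.
    return gaussian_filter1d(hist, sigma=sigma, mode="wrap")
```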

2.12 Characteristic Position Extraction

In the characteristic position extracting process, the proposed method registers the positions of the peaks and troughs of the hue histograms as the feature descriptor. The proposed method extracts the peak positions from the smoothed hue histograms of the teaching data and each candidate region by using

$$\left(H_{x-1} < H_x\right) \land \left(H_x > H_{x+1}\right), \qquad (5)$$

and the trough positions from the smoothed hue histograms of the teaching data and each candidate region by using

$$\left(H_{x-1} > H_x\right) \land \left(H_x < H_{x+1}\right), \qquad (6)$$

and registers them as the feature descriptor, where $H_x$ is the hue value of interest, $H_{x-1}$ is the hue value one before $H_x$, and $H_{x+1}$ is the hue value one after $H_x$. The extracted peak and trough positions of the teaching data are then expressed as

$$\left\{hp_{1(a)}^{(\alpha)}, hp_{2(a)}^{(\alpha)}, \cdots\right\} \in P_{(a)}^{(\alpha)} \qquad (7)$$

$$\left\{ht_{1(a)}^{(\alpha)}, ht_{2(a)}^{(\alpha)}, \cdots\right\} \in T_{(a)}^{(\alpha)} \qquad (8)$$

where $P_{(a)}^{(\alpha)}$ is the set of peak positions of the smoothed hue histogram in divided region $\alpha$ of teaching data $a$, and $hp_{1(a)}^{(\alpha)}, hp_{2(a)}^{(\alpha)}, \cdots$ are the individual peak positions. Likewise, $T_{(a)}^{(\alpha)}$ is the set of trough positions of the smoothed hue histogram in divided region $\alpha$ of teaching data $a$, and $ht_{1(a)}^{(\alpha)}, ht_{2(a)}^{(\alpha)}, \cdots$ are the individual trough positions. In addition, the extracted peak and trough positions of a candidate region are expressed as

$$\left\{hp_1^{(\beta)}, hp_2^{(\beta)}, \cdots\right\} \in hP^{(\beta)} \qquad (9)$$

$$\left\{ht_1^{(\beta)}, ht_2^{(\beta)}, \cdots\right\} \in hT^{(\beta)} \qquad (10)$$

where $hP^{(\beta)}$ is the set of peak positions of the smoothed hue histogram in divided region $\beta$ and $hp_1^{(\beta)}, hp_2^{(\beta)}, \cdots$ are the individual peak positions, and $hT^{(\beta)}$ is the set of trough positions of the smoothed hue histogram in divided region $\beta$ and $ht_1^{(\beta)}, ht_2^{(\beta)}, \cdots$ are the individual trough positions.
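A direct implementation of equations (5) and (6) on a smoothed hue histogram could look as follows.

```python
import numpy as np

def peaks_and_troughs(hist):
    # Compare each bin with its two neighbors, as in equations (5) and (6).
    h_prev, h, h_next = hist[:-2], hist[1:-1], hist[2:]
    x = np.arange(1, len(hist) - 1)
    peaks = x[(h_prev < h) & (h > h_next)]     # H_{x-1} < H_x and H_x > H_{x+1}
    troughs = x[(h_prev > h) & (h < h_next)]   # H_{x-1} > H_x and H_x < H_{x+1}
    return peaks, troughs
```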

2.13 Feature Matching

To recognize the teaching data among the candidate regions, the proposed method matches the peak and trough positions of the local hue histograms between the teaching data and each candidate region. Firstly, the proposed method calculates the difference values between the feature descriptor of the teaching data and the feature descriptor of each candidate region by using

$$Dp_i = \min_{j}\left| hp_{i(a)}^{(\alpha)} - hp_j^{(\beta)} \right|, \quad i = 1, \cdots, f \qquad (11)\text{--}(12)$$

$$Dt_i = \min_{j}\left| ht_{i(a)}^{(\alpha)} - ht_j^{(\beta)} \right|, \quad i = 1, \cdots, k \qquad (13)\text{--}(14)$$

$$DP = \sum_{i=1}^{f} Dp_i, \qquad (15)$$

$$DT = \sum_{i=1}^{k} Dt_i, \qquad (16)$$

$$D = DP + DT, \qquad (17)$$

where $f$ is the number of peaks and $k$ is the number of troughs of the smoothed hue histogram in divided region $\alpha$ of teaching data $a$, $n$ is the number of peaks and $l$ is the number of troughs of the smoothed hue histogram in divided region $\beta$ of a candidate region, and $m$ is the number of teaching data. $Dp_1, Dp_2, \cdots$ are the smallest difference values between the peak positions $hp_{i(a)}^{(\alpha)}$ of the teaching data and the peak positions in $hP^{(\beta)}$, and $Dt_1, Dt_2, \cdots$ are the smallest difference values between the trough positions $ht_{i(a)}^{(\alpha)}$ of the teaching data and the trough positions in $hT^{(\beta)}$. $DP$ is the total difference value of the peak positions, $DT$ is the total difference value of the trough positions, and $D$ is the total difference value. As an example, Figure 18 shows the matching based on the positions of the peaks and troughs of the hue histogram.

As shown in Figure 18, the proposed method compares the peak and trough positions of the smoothed hue histogram of the teaching data with those of the candidate region and calculates the difference values. For each peak and trough of the teaching data, the peak or trough of the candidate region with the smallest difference value is registered as the nearest peak or trough. Finally, the proposed method recognizes the object with the smallest total difference value $D$ in the scene data as the target object.
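A minimal sketch of this matching for one divided region is shown below; it pairs every teaching peak and trough with its nearest candidate counterpart and accumulates the absolute position differences, which corresponds to $DP$, $DT$ and $D$ of equations (15) to (17).

```python
import numpy as np

def difference_value(peaks_teach, troughs_teach, peaks_cand, troughs_cand):
    # All inputs are 1-D numpy arrays of hue positions (0..359).
    # DP: sum of the smallest peak-position differences.
    dp = sum(np.abs(peaks_cand - p).min() for p in peaks_teach) if len(peaks_cand) else 0.0
    # DT: sum of the smallest trough-position differences.
    dt = sum(np.abs(troughs_cand - t).min() for t in troughs_teach) if len(troughs_cand) else 0.0
    return dp + dt  # D = DP + DT
```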

3 EXPERIMENT

In this section, to evaluate the effectiveness of the proposed method, we carry out a quantitative comparison against the previous research with respect to the six properties mentioned in Section 1:

1. Robustness against occlusion

2. Fast recognition

3. Pose estimation with high accuracy

4. Coping with erroneous correspondences

5. Recognizing objects in a noisy environment

6. Recognizing objects which have same shape but have different texture

For properties 1 to 5, the effectiveness of the previous research was shown in the paper on the previous research [4]. The proposed method adopts the previous research in order to extract candidate regions. Therefore, in this paper we show the effectiveness of the proposed method only for property 6.

3.1 Quantitative Comparison Experiment Relating to Effectiveness

3.1.1 Experimental Overview

In this experiment, we compared the proposed method with the previous research quantitatively to evaluate the following property:

6. Recognizing objects which have same shape but have different texture

We selected 15 objects frequently used in a domestic environment from the TUW Object Instance Recognition Dataset [11] as verification objects. Figure 19 and Figure 20 show the verification objects.

Objects (d) and (j) have the same shape as objects (e) and (k), respectively, but different textures. These verification objects have shape information and texture information for the entire circumference of an object. Thereby, even if rotation is applied to a recognition target in the scene data, pose estimation with high accuracy is expected to be possible. Therefore, in this experiment, in addition to the evaluation of property 6, we evaluated the corresponding rate of pose estimation at each rotation angle. To generate the rotated scenes, we rotated each 3-dimensional object data in steps of 10 degrees, up to 90 degrees, around each axis (X, Y and Z), as shown in Figure 21.

To evaluate effectiveness of each method, we calculate the recognition rate by using

$$A = \frac{c}{z} \times 100 \; [\%] \qquad (18)$$

where $A$ is the recognition rate, $c$ is the number of objects each method correctly recognized, and $z$ is the number of verification objects. The recognition rate is obtained as follows: for each verification object, each method extracts the most similar object from the scene data, which contains the 2.5-dimensional data of that verification object; when the extracted object equals the teaching data, it is counted as a correct recognition $c$. The detailed settings for recognizing a target object with each method are as follows. In the previous research, to evaluate the pose estimation accuracy of a target object, we use the corresponding rate $M$ described in the rigid registration process (Section 2.8). The corresponding rate $M$ is calculated by equations (2) and (3) with $th_c$ = 1 mm. When the corresponding rate $M$ is the highest and is 70 percent or more, the object having that rate is recognized as the target object. In the proposed method, to evaluate the pose estimation accuracy of a target object, we use the corresponding rate $M$ as in the previous research; when the corresponding rate $M$ is the highest and is 70 percent or more, the object having it is recognized as a candidate object. Furthermore, to accurately recognize the target object among the candidate objects, we use the difference value $D$ described in the feature matching process (Section 2.13). The difference value $D$ is calculated by accumulating the difference values ($Dp_i$, $Dt_j$) over each divided region; we set the number of divisions to four in this experiment. The object with the lowest difference value $D$ is recognized as the target object. The reported processing time was obtained using an Intel(R) Core(TM) i7 2.20 GHz with 8.00 GB of main memory.
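A minimal sketch of this decision rule is shown below, assuming each candidate is a tuple of (object name, corresponding rate M, difference value D) computed as in Sections 2.8 and 2.13.

```python
def recognize(candidates, th_m=70.0):
    # Keep only candidates whose corresponding rate M is 70 percent or more,
    # then pick the one with the smallest total difference value D.
    survivors = [c for c in candidates if c[1] >= th_m]
    if not survivors:
        return None
    return min(survivors, key=lambda c: c[2])[0]
```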

3.1.2 Experimental Results

Figure 22 shows the recognition rate of the previous research and the proposed method for each rotation angle around x axis as experimental result.

As shown in Figure 22, the proposed method obtained a high recognition rate of 100 percent at every rotation angle, whereas the previous research achieved a slightly lower recognition rate. The same can be said for Figure 23 and Figure 24.

As shown in Table 5, the difference between the processing time of the proposed method and that of the previous research was only about 30 ms to 40 ms, which is not large.

3.1.3 Discussion

In this section, we discuss the results presented in Section 3.1.2. As shown in Figure 22, Figure 23 and Figure 24, the recognition rate of the previous research remains at 73 percent for all rotation angles around each axis. We therefore examined for which objects the recognition rate of the previous research decreased. In this experiment, the recognition rate of the previous research decreased for objects (d), (e), (j) and (k). As an example, the corresponding rates for each scene data when object (j) is used as the teaching data are shown in Table 6.

As shown in Table 6, the corresponding rates of objects (j) and (k) were equal. This is because the previous research uses only the shape information of an object, so it cannot accurately distinguish objects that have the same shape and different textures, such as objects (j) and (k). On the other hand, since the proposed method uses not only the shape information of an object but also its texture information, it performed robust recognition for all verification objects, as shown in Figure 22, Figure 23 and Figure 24. From these results, the effectiveness of the proposed method for property 6 was shown.

4 CONCLUSION

In this paper, we proposed a 3-dimensional object recognition method using the relationships between feature points and the invariance of local hue histograms, with the purpose of improving recognition technology for life support robots. The proposed method is effective for the following six properties necessary for recognition in a domestic environment.

1. Robustness against occlusion

2. Fast recognition

3. Pose estimation with high accuracy

4. Coping with erroneous correspondences

5. Recognizing objects in a noisy environment

6. Recognizing objects which have same shape but have different texture

From the results of the quantitative comparison experiment, we conclude that the proposed method can accurately distinguish objects that have the same shape but different textures. In addition, since the difference between the processing time of the proposed method and that of the previous research is only about 30 ms to 40 ms, we consider that the proposed method can perform recognition with high speed and high accuracy. However, since the SHOT descriptor used by the proposed method has difficulty describing features of objects that contain many planar surfaces, such as boxes, pose estimation for such objects may not be performed well. Therefore, when describing the features of an object, we will describe not only the shape feature but also the texture feature, and estimate the pose by using both. Thereby, we will improve the pose estimation accuracy of the proposed method in future work.

REFERENCES

[1.] S. Sugano, T. Sugaiwa, and H. Iwata, "Vision System for Life Support Human-Symbiotic-Robot," The Robotics Society of Japan, 27(6), pp. 596-599, 2009.

[2.] T. Odashima, M. Onishi, K. Tahara, T. Mukai, S. Hirano, Z. W. Luo, and S. Hosoe, "Development and Evaluation of a Human-interactive Robot Platform "RI-MAN"," The Robotics Society of Japan, 25(4), pp. 554-565, 2007.

[3.] Y. Jia, H. Wang, P. Sturmer, and N. Xi, "Human/robot interaction for human support system by using a mobile manipulator," Robotics and Biomimetics (ROBIO), pp. 190-195, 2010.

[4.] H. Kudo, K. Ikeshiro, and H. Imamura, "A 3-Dimensional Object Recognition Method Using SHOT and Relationship of Distances and Angles in Feature Points," International Journal of New Computer Architectures and their Applications (IJNCAA), 7(4), pp. 149-155, 2017.

[5.] F. Tombari, S. Salti, and L. D. Stefano, "Unique signatures of histograms for local surface description," European conference on computer vision (ECCV), pp. 356-369, 2010.

[6.] INTELLIGENT SENSING LABORATORY, http://isl.sist.chukyo-u.ac.jp/Archives/tm.html

[7.] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60(2), pp. 91-110, 2004.

[8.] M. J. Swain, and D. H. Ballard, "Color Indexing," International Journal of Computer Vision, 7(1), pp. 11-32, 1991.

[9.] T. Kanda, K. Ikeshiro, and H. Imamura, "An Object Detection Method Using Invariant Feature Based on Local Hue Histogram in Divided Areas of an Object," International Journal of New Computer Architectures and their Applications (IJNCAA), 7(4), pp. 112-122, 2017.

[10.] M. A. Fischler and R. C. Bolles, "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography," Comm. of the ACM, 24(6), pp. 381-395, 1981.

[11.] AUTOMATION & CONTROL INSTITUTE (ACIN), https://repo.acin.tuwien.ac.at/tmp/permanent/dataset_index.php

Tomohiro Kanda, Kazuo Ikeshiro and Hiroki Imamura

Department of Information Systems Science, Graduate School of Engineering, Soka University

1-236, Tangi-machi, Hachiouji-shi, Tokyo, Japan 192-8577

el6m5206@soka-u.jp, ikeshiro@soka.ac.jp, imamura@soka.ac.jp
Table 1. Properties of the previous research and the proposed method.

                     Robustness   Fast          Pose estimation      Coping with        Recognizing objects in   Recognizing objects which
                     against      recognition   with high accuracy   erroneous          a noisy environment      have same shape but have
                     occlusion                                       correspondences                             different texture

Previous research    ○            ○             ○                    ○                  ○                        ×
Proposed method      ○            ○             ○                    ○                  ○                        ○

Table 2. Properties of conventional methods and the previous research using textures.

                     Rotation   Scale    Illumination   Distortion by             Occlusion   An object with
                     change     change   change         perspective projection                few textures

SIFT                 ○          ○        ○              ×                         ○           ×
Color Indexing       ○          ○        ×              ○                         ○           ○
Previous research    ○          ○        ○              ○                         ○           ○

Table 3. The list of the teaching data.

number   Corresponding   Corresponding   Corresponding   Distance between   Distance between   angle
         point 1         point 2         point 3         points 1 and 2     points 1 and 3

1        a               b               c               l_ab               l_ac               θ_1
2        a               b               d               l_ab               l_ad               θ_2
3        a               b               e               l_ab               l_ae               θ_3
⋮        ⋮               ⋮               ⋮               ⋮                  ⋮                  ⋮
336      h               g               f               l_hg               l_hf               θ_336

Table 4. The list of each cluster data.

number   Corresponding   Corresponding   Corresponding   Distance between   Distance between   angle
         point 1         point 2         point 3         points 1 and 2     points 1 and 3

1        a'              b'              c'              l_a'b'             l_a'c'             θ'_1
2        a'              b'              d'              l_a'b'             l_a'd'             θ'_2
3        a'              b'              e'              l_a'b'             l_a'e'             θ'_3
⋮        ⋮               ⋮               ⋮               ⋮                  ⋮                  ⋮
336      h'              g'              f'              l_h'g'             l_h'f'             θ'_336

Table 5. The average processing time for each verification object.

Object (number of points)               Previous research [ms]   Proposed method [ms]

Air freshener (410834 points)           122.82                   157.09
All (48268 points)                      52.19                    87.17
Burti (38444 points)                    7.46                     42.44
Bottle (Green) (19346 points)           134.28                   167.56
Bottle (Blue) (19346 points)            134.28                   167.84
Downy (39996 points)                    24.10                    57.55
Pack (13679 points)                     5.32                     38.46
Water boiler (66109 points)             123.51                   159.25
Cup                                     7.40                     38.01
Arm & Hammer (Yellow) (52381 points)    83.80                    118.00
Arm & Hammer (Blue) (52381 points)      83.80                    118.55
Telephone (785093 points)               248.80                   281.25
Coca cola (15026 points)                4.78                     38.48
Doll (628812 points)                    219.90                   256.88
Toy car (16641 points)                  6.29                     38.26

Table 6. The corresponding rate for each scene data at 0 degrees (teaching data: Arm & Hammer (Yellow)).

Scene data               Corresponding rate [%]

Air freshener            0.0
All                      0.0
Burti                    0.0
Bottle (Green)           0.0
Bottle (Blue)            0.0
Downy                    0.0
Pack                     0.0
Water boiler             0.0
Cup                      0.0
Arm & Hammer (Yellow)    74.86
Arm & Hammer (Blue)      74.86
Telephone                0.0
Coca cola                0.0
Doll                     0.0
Toy car                  0.0
