
Multiple camera collaboration strategies for dynamic object association.

This research is supported by the Ubiquitous Computing and Network (UCN) Project, Knowledge and Economy Frontier R&D Program of the Ministry of Knowledge Economy (MKE), the Korean government, as a result of UCN's subproject 10C2-T3-10M.

1. Introduction

Recently, surveillance systems based on multiple cameras have received much attention because they can cover larger areas with possibly overlapping camera views. The redundant views from multiple cameras can improve object detection and tracking by minimizing the effect of false or failed detections and occlusion on the system [1][2]. One of the key requirements in a multiple camera based system is to maintain a consistent view of objects across different cameras while keeping the redundant information consistent and robust. Especially when multiple cameras flexibly change their views in a large scale surveillance system, it is critical to minimize falsely associated objects while ensuring a high association rate. Once inconsistent information from a false association enters the system, it may propagate over time and is difficult to correct. Therefore, minimizing false associations through multiple camera collaboration has become an important issue in large scale surveillance systems.

There are numerous association approaches that use feature matching or the geometry of multiple cameras to find the correspondence of objects among multiple cameras [3][4][5][6][7][8][9][10][11][12][13][14][15][16]. Feature based approaches usually suffer from the unavailability of distinctive features for all objects [3][4][5][6][7][8] and, moreover, are sensitive to detection performance. On the other hand, geometry based approaches require an accurate calibration process to construct the relationship among cameras [9][10][11]. Some methods combine both approaches to find the correspondence of objects [12][13]. However, feature based approaches have a higher chance of generating false associations due to their sensitivity to detection performance. In geometry based approaches, the boundaries of camera views on the ground plane are used as stationary homographic lines in other cameras to associate targets when they cross the corresponding boundaries of camera views [14][15]. The boundary information of camera views is either predetermined in advance or determined by observing the motion of objects. However, the association process is limited by stationary objects that do not cross the boundaries of camera views. Moreover, the determination process for the boundary information makes it difficult for the system to promptly support flexible camera movements.

In order to dynamically establish the association of objects, Kyong et al. [16] present an association method in which homographic lines are locally generated on targets in each camera and projected to the other cameras. Since a ground plane is not necessary as a common reference plane, the cameras do not all need to see the ground plane. Homographic lines are generated when a sufficient degree of separation between them is satisfied. The required minimum separation between each pair of cameras is predetermined by incorporating the effect of target height uncertainty and frame synchronization errors, because the reference plane may not coincide with the actual height of targets. The method can be extended to support multiple cameras through pair-wise collaboration for object association, combining the association information from each pair of collaborating cameras. While pair-wise collaboration is effective for objects with enough separation, the association is not well established for objects without enough separation, and it may generate false associations. Therefore, an effective camera collaboration is necessary to reduce inconsistent and uncertain information.

In this paper, we extend the locally initiated homographic line based association method to two different multiple camera collaboration strategies that reduce false associations. Collaboration matrices are defined whose elements are the required minimum separation presented in [16]. The first strategy compares the collaboration matrices with the minimum separation of objects for each pair of cameras and selects the best pair out of many cameras satisfying the required minimum separation. After targets are associated in the selected cameras, the association information is propagated to the unselected cameras by transforming the global information constructed from the associated targets. The selection based strategy collaborates efficiently on object association with the best pair of cameras while reducing false associations. However, it requires a long operation time to reach a high association rate when a large number of targets is detected, because the required separation is often unsatisfied. In order to shorten the operation time for a high association rate, the second strategy initiates the collaboration for all the pairing cases of cameras regardless of the separation. When each pair of cameras collaborates on object association, a homographic line is generated on each target and projected to the other collaborating camera. The other camera generates homographic lines on only the targets crossed by the projected homographic lines, and these are re-projected to the first camera. This association process is iterated for all the unassociated targets in all the pairing cases of cameras. The proposed methods are evaluated with real video sequences and compared with the basic pair-wise collaboration to demonstrate effective and efficient association.

The remainder of this paper has four sections. In Section 2, we present an overview of the homographic line based association method and describe the association problem in terms of false associations and computational costs. Section 3 investigates two collaboration strategies for object association that minimize inconsistency in the system and improve the efficiency of using homographic lines. In Section 4, we verify the proposed methods with real video sequences. Finally, our contribution is summarized in Section 5.

2. Background and Problem Description

2.1 Background

[FIGURE 1 OMITTED]

The locally generated homographic line based association method is used to collaboratively associate targets among multiple cameras. Fig. 1 illustrates the homographic line based association method where two cameras are used to associate objects [16]. A homographic line [[L.sup.k].sub.i] is generated on [[T.sup.k].sub.i], the detected target of object i, in each camera [C.sup.k]. Each of these homographic lines is transformed to a global plane and projected to the other camera. [[GL.sup.k].sub.i] denotes a transformed homographic line on the global plane and [[SL.sup.k].sub.i] denotes the homographic line projected from [[GL.sup.k].sub.i] onto the other camera [C.sup.l]. The association between targets is established if a projected homographic line intersects a corresponding target distinctively. Table 1 lists the crossed targets for Fig. 1. For example, [[T.sup.1].sub.1] is crossed by the projected homographic line generated from [[T.sup.2].sub.1], and [[T.sup.2].sub.1] is crossed by the projected homographic line generated from [[T.sup.1].sub.1]. Thus, targets {[[T.sup.1].sub.1], [[T.sup.2].sub.1]} and {[[T.sup.1].sub.2], [[T.sup.2].sub.2]} are found as corresponding targets, respectively.
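To make the association test concrete, the following minimal sketch represents each homographic line by two image points, maps them through per-camera homographies onto a shared reference plane and into the other camera, and reports which tolerance circles the projected line crosses. The two-endpoint line representation and all names are our illustrative assumptions, not the implementation of [16].

import numpy as np

def to_plane(H, pt):
    # Map an image point through a 3x3 homography (homogeneous divide).
    v = H @ np.array([pt[0], pt[1], 1.0])
    return v[:2] / v[2]

def project_line(H_src, H_dst_inv, p_top, p_bot):
    # Project the line generated on a target in the source camera into
    # the destination camera via the shared reference plane.
    a = to_plane(H_dst_inv, to_plane(H_src, p_top))
    b = to_plane(H_dst_inv, to_plane(H_src, p_bot))
    return a, b

def crossed_targets(line, centroids, s_min):
    # Indices of targets whose tolerance circle (radius s_min[i]) is
    # crossed by the projected line.
    a, b = line
    d = b - a
    hits = []
    for i, c in enumerate(centroids):
        r = np.asarray(c, dtype=float) - a
        dist = abs(d[0] * r[1] - d[1] * r[0]) / np.linalg.norm(d)
        if dist <= s_min[i]:
            hits.append(i)
    return hits

An association is kept only when the line generated on a target crosses exactly one tolerance circle in the other camera and the reverse projection crosses the original target, mirroring the mutual entries of Table 1.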

2.2 Problem Description and Approach

[FIGURE 2 OMITTED]

Successful object association depends on the separation between homographic lines in the other cameras. When sufficient separation between homographic lines is not guaranteed, a homographic line can cross multiple targets, and the intersections with multiple targets create ambiguity in determining the correspondence of objects. Moreover, detection uncertainty as well as the lack of a common reference may create uncertainty in deciding the intersections. In order to guarantee correct intersections of homographic lines with corresponding targets, a tolerance circle, [S.sub.min], is defined for each target by considering the effect of target height uncertainty, frame synchronization errors and detection uncertainty. In Fig. 2, a white circle around each target denotes its tolerance circle, a red rectangle an associated target and a white rectangle an unassociated target. Because tolerance circles increase the separation required for object association, the association performance is affected by the tolerance circles of targets. Due to insufficient separation in the first frame, homographic lines are not generated in camera [C.sup.1] and targets remain unassociated with their corresponding targets in the other cameras.

When many cameras (i.e., more than two) are involved in object association, the locally generated homographic line based association method can be extended to support them through pair-wise collaboration. However, pairs of collaborating cameras may contradict each other on the object association due to insufficient separation. This may generate false associations and create inconsistent information of uncertain association in the system. Such errors are also propagated in time and continuously degrade the association performance. Even if only one object is unassociated at one instant, it affects the overall performance of object association over time. For example, in Fig. 2, targets are unassociated during 9 frames and they create a false association between targets [[T.sup.1].sub.1] and [[T.sup.2].sub.2] in the second frame. The system keeps the inconsistent information of the falsely associated targets until they leave the surveillance region.

Moreover, when multiple cameras collaborate on object association without sufficient separation between targets, homographic lines are unnecessarily generated on targets and projected to the other cameras without establishing any association. Association failures due to insufficient separation waste the computational cost of transforming homographic lines. In a distributed camera network, minimizing the data to be transferred is also very important to efficiently exchange data between the camera systems without data loss or latency; it also increases the frame rate achieved by the system and improves the performance of the algorithms. In general, each target requires the transformation of a local homographic line to a global homographic line or of a global homographic line to a local homographic line. The number of intersection tests is proportional to the product of the number of targets and the number of transformed homographic lines. It is assumed that pair-wise association is utilized when more than two cameras are used. Then, the computational costs of using homographic lines are represented by the number of transformations and intersection tests, [C.sub.T] and [C.sub.C], respectively:

$C_T = 2I \times 2(K - 1)$, $C_C = I^2 \times 2(K - 1)$,  (1)

where I denotes the number of detected targets in each of the K cameras. Fig. 3 shows the computational costs of using homographic lines according to the number of targets and the number of cameras. The number of transformations increases linearly with the number of targets, while the number of intersection tests increases quadratically with the number of targets; the amount of data exchanged for target information on the network grows accordingly. As the performance of object association improves, the number of targets to be associated at any one time decreases, which eventually decreases the amount of exchanged data in the long term. Thus, a proper camera collaboration strategy is necessary to efficiently associate targets among different cameras while minimizing inconsistent and uncertain information.
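As a back-of-the-envelope check of (1), a small sketch with illustrative values makes the growth rates visible:

# Cost model of (1): transformations grow linearly in the number of
# targets I, intersection tests quadratically, for K cameras paired
# pair-wise.
def costs(num_targets, num_cameras):
    c_t = 2 * num_targets * 2 * (num_cameras - 1)
    c_c = num_targets ** 2 * 2 * (num_cameras - 1)
    return c_t, c_c

for i in (2, 4, 8, 16):
    print(i, costs(i, 3))  # doubling I quadruples the intersection tests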

[FIGURE 3 OMITTED]

We consider two effective collaboration strategies to reduce false associations as well as to improve the efficiency of using homographic lines. Collaboration matrices are defined to indicate the feasibility of successful association between any two cameras. The elements in the collaboration matrices represent the required minimum separation obtained by incorporating the effect of target height uncertainty, frame synchronization errors and detection uncertainty. The first strategy uses the collaboration matrices to select the best pair out of many cameras based on the degree of separation between homographic lines. After targets in the selected cameras are associated, the association information is propagated to the unselected cameras by transforming the global information constructed from the associated targets.

However, when a large number of objects is detected, the selected cameras cannot cover all the targets, and the required minimum separation is rarely satisfied due to overlapped and occluded targets. The second strategy therefore initiates the collaboration process for all the pairing cases of cameras regardless of the separation. In each pair of cameras, a homographic line is generated on each target and projected to the other collaborating camera. The other collaborating camera then generates homographic lines on only the targets intersected by the projected homographic lines, and these are re-projected to the first camera. Since the system tests the association for one target at a time, it minimizes the association ambiguity that would be caused by homographic lines generated from all the targets at once. This association process is iterated for all the unassociated targets in all the pairing cases of cameras.

3. Multiple Camera Collaboration

3.1 Collaboration Matrices and Characterization

[FIGURE 4 OMITTED]

Two collaborating cameras participate in an association process by generating a homographic line on each target in each camera. The homographic line is transformed to a global reference plane, such as a ground plane, and the transformed homographic line is projected to the other collaborating camera. While the system knows which target generated a homographic line in the local camera, the corresponding target in the other collaborating camera is determined by the intersection with the projected homographic line. The association of corresponding targets is established when a projected homographic line intersects a single corresponding target in each camera. The association is not established if a projected homographic line intersects multiple targets. Thus, the separation of projected homographic lines is a key parameter determining successful association, and the size of a target determines the required separation of projected homographic lines. For example, homographic lines are generated on each target in camera [C.sup.1] in Fig. 4 and the distance between the projected homographic lines in camera [C.sup.2] is denoted by d. In order for the projected homographic lines [[SL.sup.1].sub.1] and [[SL.sup.1].sub.2] to be effective in camera [C.sup.2], their separation should be greater than twice the size of the targets to be associated. We denote the required minimum separation of homographic lines as a threshold, represented by [d.sub.th]. Since the threshold indicates the effectiveness of projected homographic lines, the determination of an appropriate threshold is critical in homographic line based association.

The critical issue in determining the threshold is that the separation of projected homographic lines depends on the location and orientation of the cameras. Also, if cameras flexibly tilt and pan, the threshold should incorporate the effect of tilting and panning on the separation. When all the possible variations of a camera configuration are considered, the threshold needs to be determined for each configuration. Since it is not trivial to construct all the thresholds for all camera configurations, we consider only the worst effect among them on the separation. The threshold is determined for each pair of cameras by measuring the maximum size of targets, and the threshold matrix containing the threshold for each pair of cameras is represented by

$D_{th} = \begin{bmatrix} d_{th}^{1,1} & d_{th}^{1,2} & \cdots & d_{th}^{1,K} \\ d_{th}^{2,1} & d_{th}^{2,2} & \cdots & d_{th}^{2,K} \\ \vdots & \vdots & \ddots & \vdots \\ d_{th}^{K,1} & d_{th}^{K,2} & \cdots & d_{th}^{K,K} \end{bmatrix}$,

where its size is $K \times K$ and $d_{th}^{k,l}$ denotes the threshold between cameras [C.sup.k] and [C.sup.l]. A negative value on the diagonal indicates that the threshold is always satisfied within a local camera, because the system knows which targets generate homographic lines.

[FIGURE 5 OMITTED]

There are two additional factors that influence the threshold by increasing the effective size of a target. One is the unknown heights of objects and the other is frame synchronization error between cameras in a real situation. The dotted circle around each target in Fig. 5 represents the tolerance circle incorporating these effects. Due to these additional factors, the required separation increases and d may become smaller than [d.sub.th], as in the figure. The size of the tolerance circle is determined by measuring the number of pixels by which a point can deviate from its corresponding target because of these factors.

For the unknown heights of objects, the size of a tolerance circle is defined by

$S_{\min,i}^{k} = \max(r_{b,i}^{k}, \sigma_{h,i}^{k})$,  (2)

where $r_{b,i}^{k}$ denotes the original size of a target and $\sigma_{h,i}^{k}$ denotes the possible number of pixels deviated from the centroid of a target by a mismatched target height in camera [C.sup.k]. $\sigma_{h,i}^{k}$ depends on the camera configurations (locations, tilting angles and panning angles) generating the homographic lines and is represented by

$\sigma_{h,i}^{k} = \max_{l \neq k} \sigma_{h,i}^{k,l}$,  (3)

where $\sigma_{h,i}^{k,l}$ denotes the maximum number of pixels deviated from the centroid of a target when homographic lines are generated from camera [C.sup.l] to [C.sup.k] over the possible camera configurations. We use a reference plane at the average height of targets to transform and project homographic lines to a different camera, since the actual heights of targets are unknown. When a homographic line is transformed to the reference plane and the transformed line is projected to a different camera, the projected homographic line deviates from the point of a target with the actual height. In order to measure the number of pixels needed to cover the deviation caused by height uncertainty, it is assumed that only the height range of targets is given to the system. The number of pixels is measured by comparing a homographic line projected using the average height with one projected using the maximum or minimum height, because the heights of targets are unknown.
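The measurement can be sketched with a simple ray-plane intersection; the camera and object positions below are illustrative, and only the geometric idea described above is reproduced:

import numpy as np

def plane_hit(origin, direction, z_ref):
    # Intersect a viewing ray with the horizontal plane z = z_ref.
    t = (z_ref - origin[2]) / direction[2]
    return origin + t * direction

cam = np.array([3.0, 0.0, 3.0])            # camera generating the line
foot = np.array([2.0, 2.0, 0.0])           # object foot on the ground
h_actual, h_avg = 1.7, 1.8                 # actual vs assumed height

head = foot + np.array([0.0, 0.0, h_actual])
ray = head - cam                           # viewing ray through the head
p_used = plane_hit(cam, ray, h_avg)        # reference-plane point used
offset = np.linalg.norm((p_used - head)[:2])
print(offset)  # horizontal offset that maps to sigma_h pixels elsewhere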

[FIGURE 6 OMITTED]

Each value of $\sigma_{h,i}^{k,l}$ between cameras [C.sup.k] and [C.sup.l] is determined by finding the maximum effect of the given amount of height uncertainty over object locations. Fig. 6 shows an example of how many pixels deviate from the original point between cameras [C.sup.1], placed at (x=3m, y=0m, z=3m), and [C.sup.2], placed at (x=6m, y=3m, z=3m). In order to measure the number of deviated pixels, two homographic lines are generated from camera [C.sup.2] to camera [C.sup.1] according to the object's location. One homographic line is generated with the actual height and the other with an average height differing from the actual height by 0.1m. The number of deviated pixels is maximized when an object is close to the camera onto which the homographic lines are transformed. The simulation measuring the number of pixels is repeated for different camera configurations and the maximum value among them is selected. Fig. 7 illustrates the number of pixels for the other cameras incorporating the effect of height uncertainty (0.1m ~ 0.4m), where $\sigma_{h,i}^{k,l}$ denotes the number of pixels. The values are proportional to the amount of height uncertainty.

[FIGURE 7 OMITTED]

[FIGURE 8 OMITTED]

Another factor that influences the size of the tolerance circle is frame synchronization error between cameras. It also causes deviation of a projected homographic line, because targets can be at different locations in frames captured at different times. It is assumed that the capture times of the cameras differ by at most one frame. Fig. 8 shows an example of frame synchronization errors. Solid circles and dotted circles represent the two different locations at which the objects are detected by the cameras. Object [O.sub.2] cannot be associated because [[SL.sup.1].sub.2] does not intersect [[T.sup.2].sub.2]. In order to incorporate frame synchronization errors, the radius of the tolerance circle needs to be adjusted by

$s_{\min,i}^{k} = \max(r_{b,i}^{k}, \sigma_{s,i}^{k})$,  (4)

where $\sigma_{s,i}^{k}$ denotes the possible number of pixels deviated from the centroid of a target by the synchronization offset between cameras, in camera [C.sup.k]. The synchronization effect depends on the sampling period [T.sup.F] of a camera and the direction in which an object moves. Then, $\sigma_{s,i}^{k}$ is obtained by

$\sigma_{s,i}^{k} = \max_{l \neq k} \sigma_{s,i}^{k,l}$,  (5)

where $\sigma_{s,i}^{k,l}$ denotes the maximum number of pixels deviated from the centroid of a target by the synchronization offset between cameras [C.sup.k] and [C.sup.l]. Since the effect is maximized when the optical axes of two cameras are perpendicular to each other, only pairs of perpendicularly placed cameras are considered.

[FIGURE 9 OMITTED]

Fig. 9 shows the number of pixels needed to incorporate the effect of frame synchronization errors, where $\sigma_{s,i}^{k,l}$ denotes the number of pixels. Note that the deviation of a projected homographic line is related to the relative speed of an object per frame, not its absolute speed. When the deviation of a projected homographic line is estimated by the simulation, the speed of an object is set to 2 m/sec. Since the frame period varies from 0.05 sec to 0.2 sec in the simulation, this has the same effect as a relative object speed of 0.1 m/frame to 0.4 m/frame. Since frame synchronization errors are maximized with perpendicularly placed cameras, a homographic line is generated from the delayed image of [C.sup.1] to the image of [C.sup.2], and the maximum pixel distance error between the projected homographic line and the corresponding target in camera [C.sup.2] is measured. As the sampling rate increases, the number of deviated pixels decreases.
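The metric budget behind these curves is a one-line computation; the sketch below reproduces the per-frame displacement swept in the simulation (2 m/sec is the speed stated above):

speed = 2.0                                # object speed, m/sec
for t_frame in (0.05, 0.1, 0.2):           # sampling period, sec
    # at most one frame of offset between cameras
    print(t_frame, speed * t_frame, "m/frame")  # 0.1 ... 0.4 m/frame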

When the effects of the additional factors are considered at the same time, they can compensate for each other. For example, when a homographic line is generated from a target position deviated by the detection algorithm, the homographic line transformed with the average target height can accidentally be shifted to the position corresponding to the actual target height. A similar effect can occur with synchronization offsets. However, the system cannot predict such compensation among non-ideal parameters. Thus, the size of a tolerance circle should assume the worst case:

$S_{\min,i}^{k} = \max(r_{b,i}^{k}, \sigma_{h,i}^{k} + \sigma_{s,i}^{k})$.  (6)

[FIGURE 10 OMITTED]

Fig. 10 illustrates [S.sub.min] for cameras [C.sup.1] and [C.sup.2] in terms of the number of required pixels by (6), assuming a height uncertainty of 0.1m. If the effect of detection performance is also considered, the values are expected to increase. While expanded tolerance circle radii guarantee that targets are crossed by their corresponding transformed homographic lines, they can degrade association performance, because homographic line generation can become ineffective due to insufficient target separation. A threshold indicating the effectiveness of homographic lines can be expressed in terms of the tolerance circle radii. The threshold [[d.sup.k].sub.th] in camera [C.sup.k] is defined as

$d_{th}^{k} = \min_{i,j}(s_{\min,i}^{k} + s_{\min,j}^{k})$,  (7)

where i and j denote indices of neighboring targets. The smallest sum of the radii of two neighboring tolerance circles indicates the effectiveness of homographic lines in an association process in camera [C.sup.k].
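A direct transcription of (6) and (7), with made-up pixel measurements standing in for the measured values of Figs. 7 and 9:

def tolerance_radius(r_b, sigma_h, sigma_s):
    # Worst-case tolerance radius per target, (6).
    return max(r_b, sigma_h + sigma_s)

def camera_threshold(radii):
    # Smallest sum of two tolerance radii, (7); with a handful of
    # targets, a full pairwise scan stands in for "neighboring" pairs.
    return min(radii[i] + radii[j]
               for i in range(len(radii))
               for j in range(i + 1, len(radii)))

radii = [tolerance_radius(12, 6, 4),       # pixels, illustrative
         tolerance_radius(10, 5, 4),
         tolerance_radius(14, 7, 5)]
print(camera_threshold(radii))             # d_th for this camera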

3.2 Camera Selection Based Approach

[FIGURE 11 OMITTED]

The camera selection based approach selects the best pair out of many cameras to increase the effectiveness of projected homographic lines. Since the effectiveness of projected homographic lines depends on the threshold in the other collaborating camera, the system tests the separation of homographic lines for each pair of cameras. Each camera determines the shortest distance between neighboring homographic lines, to be tested for effectiveness in the other collaborating camera. If the transformed shortest distance has enough separation in the other collaborating camera, the separations of the other homographic lines are also satisfied, since the transformation of a homographic line is a linear process.

In order to reduce the dependence of the shortest distance on the separation of targets, targets are grouped into two types, as shown in Fig. 11. Type I is a newly detected target and Type II is a locally tracked target that already has an association. The shortest distance is determined separately for each type of target. Without grouping, the shortest distance between targets in camera [C.sup.1] is the distance between targets [[T.sup.1].sub.1] and [[T.sup.1].sub.2]; with grouping, it increases to the distance between targets [[T.sup.1].sub.1] and [[T.sup.1].sub.4] for Type I targets. $G^k$ denotes the set of Type I targets and $\tilde{G}^k$ the set of Type II targets in camera [C.sup.k]. The sets for Fig. 11 are

$G^1 = \{T_1^1, T_4^1\}$, $\tilde{G}^1 = \{T_2^1, T_3^1\}$, $G^2 = \{T_1^2, T_4^2\}$, $\tilde{G}^2 = \{T_2^2, T_3^2\}$.

The association process operates on sets of the same type using the homographic line based association:

$H^{1,2}(G^1, G^2)$, $\tilde{H}^{1,2}(\tilde{G}^1, \tilde{G}^2)$,

where function $H^{m,n}$ denotes the homographic line based association for Type I targets between cameras [C.sup.m] and [C.sup.n], and $\tilde{H}^{m,n}$ the same operation for Type II targets. $H$ and $\tilde{H}$ are equivalent; only the type of the targets differs. The target grouping also decreases the number of intersection tests by projected homographic lines. When the targets are not grouped in this example, [C.sub.T] + [C.sub.C] is 16 + 32 = 48; with grouped targets, [C.sub.C] decreases to 16 and [C.sub.T] + [C.sub.C] becomes 32.
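A minimal sketch of the grouping step, assuming association flags come from local tracking (the dictionary layout is illustrative):

def group_targets(targets):
    # Type I: newly detected; Type II: already associated via tracking.
    type1 = [t for t in targets if not t["associated"]]
    type2 = [t for t in targets if t["associated"]]
    return type1, type2

targets = [{"id": 1, "associated": False}, {"id": 2, "associated": True},
           {"id": 3, "associated": True},  {"id": 4, "associated": False}]
new, tracked = group_targets(targets)
print([t["id"] for t in new], [t["id"] for t in tracked])  # [1, 4] [2, 3]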

[FIGURE 12 OMITTED]

When the shortest distance between neighboring homographic lines is determined in each camera, it is represented by a starting point and an ending point to be projected to the other cameras. It is assumed that the image has a top-left origin. The x-coordinates of the starting point and the ending point are the same as the x-coordinates of the respective targets, and the y-coordinates are set to the greater of the two targets' y-coordinates. The reason for selecting the greater y-coordinate is that a line closer to a camera has a shorter length when it is transformed, and the shorter line should be tested for separation in the other cameras. For example, the coordinates of two targets are (482, 323) and (363, 289) in camera [C.sup.1] of Fig. 12, so the shortest line between them consists of the two points (363, 323) and (482, 323). [[d.sup.k].sub.i,j] denotes the length of the shortest line between neighboring targets [[T.sup.k].sub.i] and [[T.sup.k].sub.j], and [[d.sup.k,l].sub.i,j] denotes the transformed length of [[d.sup.k].sub.i,j] in camera [C.sup.l]. The lengths of the shortest lines for each pair of cameras can conveniently be represented by a K x K matrix. In this example, the distance matrix D is obtained by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The distance values on the diagonal of this matrix are the pixel distances between the two targets in each local camera, and the others are the projected pixel distances in the other cameras. If only one object is detected by a camera, the distance is infinite. Since this matrix is determined by the coordinates of targets, it is updated every frame.
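The endpoint rule reduces to a few lines; the values below reproduce the camera [C.sup.1] example (top-left image origin, so the larger y-coordinate is nearer the camera):

def shortest_line(t_i, t_j):
    # Keep each target's x; take the larger (nearer) y for both points.
    y = max(t_i[1], t_j[1])
    return (t_i[0], y), (t_j[0], y)

p, q = shortest_line((482, 323), (363, 289))
print(p, q)                  # ((482, 323), (363, 323))
print(abs(p[0] - q[0]))      # its pixel length, one entry of matrix D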

The threshold matrix [D.sub.th] for Fig. 12 is obtained by using the simulated data from Fig. 7 and Fig. 9

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where the height error is set to 0.1m and the frame rate to 8 frames/sec. When D and [D.sub.th] are compared, the pair of [C.sup.1] and [C.sup.3] is a possible selection for cooperating on object association. If multiple choices are possible, the pair of cameras having the maximum difference between D and [D.sub.th] can be chosen.
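Comparing the two matrices and breaking ties by the largest margin can be sketched as follows (the masking scheme is our illustrative choice):

import numpy as np

def select_pair(D, D_th):
    # Keep off-diagonal pairs whose projected distance exceeds the
    # threshold; among them, pick the pair with the largest margin.
    margin = (D - D_th).astype(float)
    np.fill_diagonal(margin, -np.inf)      # diagonal entries are local
    margin[D < D_th] = -np.inf             # threshold must be satisfied
    if not np.isfinite(margin).any():
        return None                        # no qualifying pair this frame
    return np.unravel_index(np.argmax(margin), margin.shape)

select_pair returns a camera index pair (m, n) or None, matching the frames in Fig. 20 where no cameras are selected.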

Targets in unselected cameras are associated using global information constructed from the associated targets in the selected cameras. When targets are associated in the two selected cameras, the system constructs global information such as height and position. Fig. 13 illustrates an association information update by the selected cameras. When homographic lines generated with the known height from the associated targets are transformed to an unselected camera, they should intersect near the corresponding target. $g^{\hat{k}}(T_i^m, T_j^n)$ denotes the intersection point on camera $C^{\hat{k}}$ of the homographic lines from associated targets [[T.sup.m].sub.i] and [[T.sup.n].sub.j], where m, n [member of] K. If several unassociated targets exist in an unselected camera, the target having the minimum distance to the transformed point is associated with {[[T.sup.m].sub.i], [[T.sup.n].sub.j]}. Then the index $\hat{i}$ of the unassociated target to be associated with them is determined by

$\arg\min_{\hat{i}} D(g^{\hat{k}}(T_i^m, T_j^n), T_{\hat{i}}^{\hat{k}})$,  (8)

where D(a, b) returns the distance between points a and b. This may falsely associate targets when they occlude each other. Hence, if an intersection point falls within the tolerance circles of two or more targets, the association information is not updated, to prevent false association. Even with this strategy, false association is still possible while targets occlude each other; however, their association can be confirmed once they are separated enough for association. Regarding computational cost, this process requires one global-to-local transformation and a number of intersection tests equal to the number of targets in (8). The first strategy, the camera selection based approach, is summarized in Algorithm 1.
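A sketch of this propagation step; the line-intersection helper and the ambiguity rule follow the description above, with illustrative names:

import numpy as np

def intersect(a1, a2, b1, b2):
    # Intersection of two image lines, each given by a pair of points
    # (parallel lines would make A singular; ignored in this sketch).
    A = np.column_stack((a2 - a1, b1 - b2))
    t = np.linalg.solve(A, b1 - a1)
    return a1 + t[0] * (a2 - a1)

def propagate(g, centroids, s_min):
    # Associate the unassociated target nearest to g, per (8), but skip
    # the update when g falls inside two or more tolerance circles.
    dists = [np.linalg.norm(g - np.asarray(c, dtype=float))
             for c in centroids]
    inside = [i for i, d in enumerate(dists) if d <= s_min[i]]
    if len(inside) != 1:
        return None                # occlusion or ambiguity: defer update
    return inside[0]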

[FIGURE 13 OMITTED]
Algorithm 1: Camera selection based approach
Input: Detected targets at each camera [C.sup.k], threshold matrix [D.sub.th]
Output: Associated targets
repeat
  Classify targets into Type I and Type II targets by local tracking information in each camera [C.sup.k]
  Construct distance matrix D with the minimum distance between targets in each camera [C.sup.k]
  Select the best pair of cameras [C.sub.p] = {([C.sup.m], [C.sup.n]) | m, n [member of] K} based on D and [D.sub.th]
  for [C.sup.k] [member of] [C.sub.p] do
    for all [[T.sup.k].sub.i] do
      if [[T.sup.k].sub.i] is a Type I target then
        Generate a vertical homographic line and transform it to the other selected camera
      end
    end
  end
  Find associated targets in the two selected cameras with homographic lines and [s.sub.min]
  Append association information to associated objects list A
  for A [member of] A do
    for [C.sup.k] [not member of] [C.sub.p] do
      for all [[T.sup.k].sub.i] do
        if [[T.sup.k].sub.i] is a Type I target then
          Transform the two homographic lines from A onto [C.sup.k] and check association with [S.sub.min] by (8)
        end
      end
    end
  end
until System stops


3.3 Iteration Based Approach

The association performance of the camera selection based method becomes ineffective when a large number of targets is detected: no element of the distance matrix D may satisfy the threshold matrix [D.sub.th]. The distance matrix D for Fig. 14 is obtained by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The figure shows the ineffectiveness of homographic lines for associating targets when no pair of cameras satisfies the threshold matrix [D.sub.th]. The system then needs to wait until the thresholds are satisfied, which may cause an indeterminate delay in the association process.

[FIGURE 14 OMITTED]

Moreover, the camera selection based method does not always make a correct decision in selecting cameras, because the average height of targets is used to transform and project a horizontal line to the other cameras. Also, since projected homographic lines are not parallel in the other cameras, the association process is not always successful. These limitations waste projected homographic lines without establishing object associations. For example, although cameras [C.sup.2] and [C.sup.3] are selected by a satisfied threshold in Fig. 15, no targets are associated, due to insufficient separation. Another limitation of the camera selection based method is that some pairs of cameras with possible object associations are disregarded.

[FIGURE 15 OMITTED]

[FIGURE 16 OMITTED]

In order to remedy the limitations of the camera selection based method, the system initiates the collaboration process for all the pairing cases of cameras regardless of the separation. Similarly to the camera selection based approach, the targets in each camera are grouped into the two types defined in the previous section, to avoid unnecessary association processing across different types of targets. When each pair of cameras performs object association, a homographic line is generated on one target at a time and transformed to the other collaborating camera. Homographic lines can be generated in any order of targets, such as left to right in the image. After a homographic line is transformed to the other collaborating camera, all the targets crossed by the transformed homographic line generate homographic lines in turn, and the association is established for only the targets generating homographic lines. Fig. 16 illustrates an example of homographic line generation on one target for two cameras. Camera [C.sup.1] first generates a homographic line on target [[T.sup.1].sub.2] only. The line is projected to camera [C.sup.2], and camera [C.sup.2] generates a homographic line on each of the intersected targets [[T.sup.2].sub.4] and [[T.sup.2].sub.2]. Since only two of the four targets participate in the association process for target [[T.sup.1].sub.2], there is a higher chance that the projected homographic lines have sufficient separation in camera [C.sup.1]. Table 2 lists the targets crossed by homographic lines of the iteration based strategy for Fig. 16. As a result, targets [[T.sup.1].sub.2] and [[T.sup.2].sub.2] are associated. This process is repeated for each target in each pair of collaborating cameras. The second strategy, the iteration based approach, is summarized in Algorithm 2.
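The per-target handshake can be sketched as below, with project_kl, project_lk and crossed standing in for the transformation and intersection tests defined earlier (an illustrative driver, not the paper's implementation):

def iterate_pair(targets_k, targets_l, project_kl, project_lk, crossed):
    # One pass for a camera pair (C^k, C^l): generate a line on one
    # unassociated target at a time, let only the crossed targets
    # answer back, and keep the association when the re-projection
    # crosses exactly one candidate.
    pairs = []
    for i, t in enumerate(targets_k):
        hits = crossed(project_kl(t), targets_l)
        back = [j for j in hits if crossed(project_lk(targets_l[j]), [t])]
        if len(back) == 1:
            pairs.append((i, back[0]))
    return pairs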

The targets left unassociated by the camera selection based approach in Fig. 15 are associated by the iterative association approach in Fig. 17. Since three cameras are used, six iterated association cases exist ([C.sup.1] to [C.sup.2], [C.sup.1] to [C.sup.3], [C.sup.2] to [C.sup.3], and vice versa). One additional association of targets is established in Fig. 17-(b). Although the iterative association approach has the advantage of checking every association case, the computational costs of the transformations and intersection tests are much higher than those of the camera selection based association.
Algorithm 2: Iteration based approach
Input: Detected targets at each camera [C.sup.k]
Output: Associated targets
repeat
  Classify targets into Type I and Type II targets by local tracking information in each camera [C.sup.k]
  Construct all the possible pairing cases [C.sub.P] = {([C.sup.m], [C.sup.n]) | m, n [member of] K}
  for [C.sub.p] [member of] [C.sub.P] do
    for [C.sup.k] [member of] [C.sub.p] do
      for all [[T.sup.k].sub.i] do
        if [[T.sup.k].sub.i] is a Type I target then
          Generate a vertical homographic line and transform it to the other collaborating camera
          Check association with [S.sub.min] and the newly initiated homographic lines
          if a single Type I target in the collaborating camera is associated with [[T.sup.k].sub.i] then
            Append association information to associated objects list A
          end
        end
      end
    end
  end
  for A [member of] A do
    for all [C.sup.k] do
      for all [[T.sup.k].sub.i] do
        if [[T.sup.k].sub.i] is a Type I target then
          Transform the two homographic lines from A onto [C.sup.k] and check association with [S.sub.min] by (8)
        end
      end
    end
  end
until System stops


[FIGURE 17 OMITTED]

4. Simulation and Analysis

4.1 Simulation Setup

Fig. 18 illustrates a simulation setup with six objects and three cameras for analyzing the association performance and the computational costs of the transformations and intersection tests. We also compare the proposed methods with the basic pair-wise collaboration extended from [16] to demonstrate the improvement in the false association rate. Because the basic pair-wise approach initiates the association process for each pair of participating cameras, three different pairs of cameras execute the association process redundantly. In order to compare the performance of the methods, we measure three different rates, the successful association rate, the failed association rate, and the false association rate, to check for inconsistency. Camera [C.sup.1] is placed at (x = 3.65m, y = 0m, z = 2.37m) with tilting angle 82.3[degrees] and panning angle 0[degrees]; camera [C.sup.2] is placed at (x = 0m, y = 3.5m, z = 2.45m) with tilting angle 76[degrees] and panning angle 90[degrees]; and camera [C.sup.3] is placed at (x = 1.83m, y = 7.32m, z = 2.37m) with tilting angle 78[degrees] and panning angle 156[degrees]. The total number of frames is 150 and the average height of targets is 1.7m. Another important issue is how to locally track targets in each camera. When targets (i.e., faces) occlude each other by three fourths of their size, local tracking fails and the association information is also lost. Since Type II targets are already associated, only Type I targets are considered for association in the simulation.

[FIGURE 18 OMITTED]

4.2 Association Performance and Complexity Comparison

The first collaboration strategy selects a pair of cameras satisfying the threshold. The values for [D.sub.th] are the same as those used in Section 3. The shortest distance between homographic lines in each camera is determined, and its projected distance is compared with the corresponding element of [D.sub.th] at every frame. Fig. 19 shows the variation of each element of the distance matrix D with respect to the corresponding element of the threshold matrix [D.sub.th] for the camera selection based approach. If the value of D(m,n) / [D.sub.th](m,n) is greater than 1, the transformed minimum distance satisfies the threshold for the corresponding pair of cameras [C.sup.m] and [C.sup.n]. When only one unassociated target remains in a camera, the distance between targets is set to the width of the image, instead of infinity, for the simulation. Fig. 20 shows the cameras selected at each association time based on the results of Fig. 19. If no threshold is satisfied, no cameras are selected.

[FIGURE 19 OMITTED]

[FIGURE 20 OMITTED]

Fig. 21 and Table 3 show the comparison of the proposed approaches with the basic pair-wise approach. In Table 3, we consider only the case where objects are detected by multiple cameras, because collaboration for object association is not required for a single camera. The performance of the basic pair-wise association approach roughly indicates an upper bound, because the scheme considers all the pairing cases of multiple cameras through pair-wise collaboration. However, the effectiveness of the basic pair-wise association suffers from the redundant collaboration of the pair-wise association process among multiple cameras, which creates false associations in the system. In order to clearly show the improvement in performance, false associations are not counted as successful associations. The average false association rate of the method in [16] is about 6.2%, and these errors are not corrected until the targets leave the surveillance region, which affects the average successful association rate over time. In contrast, the iteration based approach reduces the average false association rate to almost zero and improves the average successful association rate. When the iteration based approach is used for multiple object association, the number of successful associations is greater than that of the camera selection based approach. This is mainly because the camera selection based approach usually selects only a single pair of cameras satisfying the threshold and disregards the remaining possible association cases, whereas the iteration based approach initiates the association process for each target, which improves the association performance. Thus, the camera selection based approach requires a longer operation time than the iteration based approach to reach a high association rate.

[FIGURE 21 OMITTED]

Fig. 22 shows the simulation result measuring the number of transformations (i.e., [C.sub.T]) and the number of intersection tests (i.e., [C.sub.C]) for the association approaches according to the number of objects. The results show that the proposed methods are more efficient than the basic pair-wise approach. Also, the number of transformations and intersection tests in the camera selection based approach is lower than that of the iteration based approach, because only the two selected cameras participate in the association process. However, in terms of association performance, the selected cameras cannot cover occluded targets as the number of targets increases. As a result, the association performance degrades compared with the iteration based approach, as shown in Fig. 23. The iteration based approach increases the association performance at the cost of an increased number of transformations and intersection tests.

[FIGURE 22 OMITTED]

[FIGURE 23 OMITTED]

4.3 Discussion

The simulation result shows that the camera selection based approach has a lower successful association rate than the others. However, this is not critical, because the result is obtained within a limited amount of time and the objects are associated once sufficient separation is satisfied. Hence, the successful association rate can improve as objects move within the surveillance region over time. A more important issue is to minimize false associations so that a consistent global view across multiple cameras is maintained in the system. Once objects are falsely associated, the errors propagate in time and may continuously generate inconsistent information through tracking. Especially when objects are more densely populated than in the simulation, there is a high possibility that homographic lines are not effective for object association due to insufficient separation. The basic pair-wise collaboration may then create more false associations, which can corrupt the consistent information in the system. Thus, an effective association collaboration scheme is important for accurately and efficiently maintaining association information under insufficient separation.

5. Conclusions

We have presented two different multiple camera collaboration strategies to reduce false associations in object association. Collaboration matrices are defined with the required minimum separation for each pair of cameras and are used, in the first strategy, to select the pair of cameras having the maximum separation of homographic lines. We have shown that the first strategy reduces the number of transformations and intersection tests using homographic lines for object association. However, when a large number of objects is detected, it may require a long operation time to achieve a high association rate, due to the unsatisfied separation. In order to remedy this limitation, the second strategy initiates the collaboration process for all the pairing cases of cameras regardless of the separation. The simulation results demonstrate that the association performance is improved by the repetitive association processes, at the cost of increased numbers of transformations and intersection tests. The comparison with the basic pair-wise approach also shows that the proposed methods effectively reduce false associations.

DOI: 10.3837/tiis.2010.12.011

References

[1] W. Hu, M. Hu, X. Zhou, T. Tan, J. Lou and S. Maybank, "Principal axis-based correspondence between multiple cameras for people tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 4, pp. 663-671, Apr., 2006. Article (CrossRef Link)

[2] S. Velipasalar and W. Wolf, "Multiple object tracking and occlusion handling by information exchange between uncalibrated cameras," in Proc. of IEEE International Conf. on Image Processing, pp. 418-421, Sept. 11-15, 2005. Article (CrossRef Link)

[3] J. Orwell, P. Remagnino and G.A. Jones, "Multiple camera color tracking," in Proc. of IEEE International Workshop on Visual Surveillance, pp. 14-24, June 26, 1999.

[4] J. Krumm, S. Harris, B. Meyers, B. Brumitt, M. Hale and S. Shafer, "Multi-camera multi-person tracking for easy living," in Proc. of IEEE International Workshop on Visual Surveillance, pp. 3-10, July 1, 2000. Article (CrossRef Link)

[5] A. Mittal and L.S. Davis, "M2Tracker: A multi-view approach to segmenting and tracking people in a cluttered scene using region based stereo," in Proc. of European Conf. on Computer Vision, pp. 18-36, May 28-31, 2002. Article (CrossRef Link)

[6] J. Li, C.S. Chua and Y.K. Ho, "Color based multiple people tracking," in Proc. of IEEE International Conf. on Control, Automation, Robotics and Vision, vol. 1, pp. 309-314, Dec. 2-5, 2002. Article (CrossRef Link)

[7] Y. Caspi, D. Simakov and M. Irani, "Feature-based sequence-to-sequence matching," International Journal of Computer Vision, pp. 53-64, June, 2006. Article (CrossRef Link)

[8] Q. Cai and J. K. Aggarwal, "Tracking human motion using multiple cameras," in Proc. of International Conf. on Pattern Recognition, Vienna, Austria, vol. 3, pp. 68-72, Aug. 25-29, 1996. Article (CrossRef Link)

[9] J. Black and T. Ellis, "Multi camera image tracking," Image and Vision Computing, vol. 24, pp. 1256-1267, Nov., 2006. Article (CrossRef Link)

[10] H. Tsutsui, J. Miura and Y. Shirai, "Optical flow-based person tracking by multiple cameras," in Proc. of IEEE International Conf. on Multisensor Fusion and Integration in Intelligent Systems, pp. 91-96, Aug. 2001. Article (CrossRef Link)

[11] A. Utsumi, H. Mori, J. Ohya and M. Yachida, "Multiple human tracking using multiple cameras," in Proc. of IEEE International Conf. on Automatic Face and Gesture Recognition, pp. 498-503, Apr. 14-16, 1998. Article (CrossRef Link)

[12] J. Kang, I. Cohen and G. Medioni, "Continuous tracking within and across camera streams," in Proc. of IEEE International Conf. on Computer Vision and Pattern Recognition, vol. 1, pp. 267-272, June 18-20, 2003. Article (CrossRef Link)

[13] T. H. Chang, S. Gong and E.J. Ong, "Tracking multiple people under occlusion using multiple cameras," in Proc. of British Machine Vision Conf., pp. 566-575, Sept. 11-14, 2000. Article (CrossRef Link)

[14] S. Calderara, A. Prati, R. Vezzani and R. Cucchiara, "Consistent labeling for multi-camera object tracking," in Proc. of International Conf. on Image Analysis and Processing, pp. 1206-1214, Sept. 6-8, 2005. Article (CrossRef Link)

[15] S. Khan and M. Shah, "Consistent labeling of tracked objects in multiple cameras with overlapping fields of view," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 10, Oct., 2003. Article (CrossRef Link)

[16] Y. Kyong, S. H. Cho, S. Hong and W. D. Cho, "Local initiation method for multiple object association in surveillance environment with multiple cameras," in Proc. of IEEE International Conf. on Advanced Video and Signal based Surveillance, Sept. 1-3, 2008. Article (CrossRef Link)

Shung Han Cho (1), Yunyoung Nam (2) and Sangjin Hong (1)

(1) Mobile Systems Design Laboratory, Dept. of Electrical and Computer Engineering, Stony Brook University-SUNY, Stony Brook, NY 11794 - USA [e-mail: {shcho, snjhong}@ece.sunysb.edu]

(2) Center of Excellence for Ubiquitous System, Ajou University, Suwon, 443-749 - South Korea [e-mail: youngman@ajou.ac.kr]

* Corresponding author: Sangjin Hong

Received August 3, 2010; revised September 7, 2010; accepted October 1, 2010; published December 23, 2010

Shung Han Cho received his B.E. (Summa Cum Laude) with specialization in Telecommunications from both the department of Electronics Engineering at Ajou University, Korea and the department of Electrical and Computer Engineering at Stony Brook University - SUNY, NY in 2006. He was a recipient of the Award for Academic Excellence in Electrical Engineering from the College of Engineering and Applied Sciences at Stony Brook University. He received his M.S. in Electrical and Computer Engineering from Stony Brook University with the Award of Honor in recognition of outstanding achievement and dedication in 2008. He is currently pursuing his Ph.D. degree in the department of Electrical and Computer Engineering at Stony Brook University. He was a recipient of the International Academic Exchange Program supported by the Korea Research Foundation (KRF) in 2005. He was a member of the Sensor Consortium for Security and Medical Sensor Systems sponsored by NSF Partnerships for Innovation from 2005 to 2006. His research interests include collaborative heterogeneous signal processing, distributed digital image processing and communication, networked robot navigation and communication, and heterogeneous system modeling and evaluation.

Yunyoung Nam received his B.S., M.S. and Ph.D. degrees in computer engineering from Ajou University, Korea in 2001, 2003, and 2007, respectively. He was a research engineer in the Center of Excellence in Ubiquitous Systems from 2007 to 2009. He was a post-doctoral researcher at Stony Brook University, New York in 2009. He is currently a research professor at Ajou University in Korea. He also spent time as a visiting scholar at the Center of Excellence for Wireless & Information Technology (CEWIT), Stony Brook University - State University of New York, Stony Brook, New York. He was a recipient of the Presidential Award for Excellence in the Graduate School of Information and Communication in 2004 and 2007. He earned the Best Paper Award at KDC 2006. He is selected for inclusion in the 2011 Edition of "Who's Who in America." His research interests include multimedia databases, ubiquitous computing, image processing, pattern recognition, context-awareness, conflict resolution, wearable computing, and intelligent video surveillance.

Sangjin Hong received his B.S. and M.S. degrees in EECS from the University of California, Berkeley. He received his Ph.D. in EECS from the University of Michigan, Ann Arbor. He is currently with the department of Electrical and Computer Engineering at Stony Brook University. Before joining Stony Brook University, he worked at Ford Aerospace Corp. Computer Systems Division as a systems engineer. He also worked at Samsung Electronics in Korea as a technical consultant. His current research interests are in the areas of multimedia wireless communications and digital signal processing systems, reconfigurable VLSI systems and optimization. Prof. Hong is a Senior Member of the IEEE and a member of the EURASIP journal editorial board. He has served on numerous Technical Program Committees for IEEE conferences.
Table 1. Crossed targets by homographic lines for Fig. 1

                    Crossed targets        Crossed targets
                      in [C.sup.1]           in [C.sup.2]

[T.sub.1.sup.1]            --             {[T.sub.2.sup.1]}
[T.sub.2.sup.1]            --             {[T.sub.2.sup.2]}
[T.sub.1.sup.2]    {[T.sub.1.sup.1]}              --
[T.sub.2.sup.2]    {[T.sub.2.sup.1]}              --

Table 2. Crossed targets by homographic lines for Fig. 16

                    Crossed targets        Crossed targets
                      in [C.sup.1]           in [C.sup.2]

[T.sub.2.sup.1]            --              {[T.sub.2.sup.2]
                                           [T.sub.4.sup.2]}
[T.sub.2.sup.2]    {[T.sub.2.sup.1]}              --

[T.sub.4.sup.2]    {[T.sub.1.sup.1],              --
                    [T.sub.3.sup.1],
                    [T.sub.4.sup.1]}

Table 3. Performance comparison for Fig. 21

[O.sub.i]     Camera selection based approach

              Success      Failure       False
                (%)          (%)          (%)

[O.sub.1]       82.3         17.7         0.0
[O.sub.2]       64.9         34.3         0.8
[O.sub.3]       69.0         30.2         0.8
[O.sub.4]       39.3         60.7         0.0
[O.sub.5]       75.8         24.2         0.0
[O.sub.6]       96.1         3.9          0.0
Avg.            71.2         28.5         0.3

[O.sub.i]         Iteration based approach

              Success      Failure       False
                (%)          (%)          (%)

[O.sub.1]      100.0         0.0          0.0
[O.sub.2]       84.8         14.4         0.8
[O.sub.3]       85.2         14.8         0.0
[O.sub.4]       91.6         8.4          0.0
[O.sub.5]       98.5         0.0          1.5
[O.sub.6]      100.0         0.0          0.0
Avg.            93.4         6.2          0.4

[O.sub.i]        Basic pair-wise approach

              Success      Failure       False
                (%)          (%)          (%)

[O.sub.1]       88.3         2.9          8.8
[O.sub.2]       83.3         7.9          8.8
[O.sub.3]       89.1         7.8          3.1
[O.sub.4]       75.0         20.2         4.8
[O.sub.5]       67.0         27.2         5.8
[O.sub.6]       89.6         4.7          5.7
Avg.            82.1         11.7         6.2