A Real-Time 3D Perception and Reconstruction System Based on a 2D Laser Scanner.
Three-dimensional perception is very important to many mobile robotic systems, since the robot is increasingly required to operate in dynamic, unstructured environments and interact safely and effectively with humans or other robots [1-5]. Beside robotics domain, three-dimensional perception and reconstruction are also a key technology for many nonrobotic applications such as surveying, object scanning, 3D manufacturing, and entertainments [6-10].
Nowadays, there are many sensors that can be used to acquire 3D data, for example, scanning LiDAR, stereo vision, and structure light sensors. Among those sensors, scanning LiDAR is mostly used for acquiring 3D data due to its accuracy and long measurement range . In geographic areas, there are several commercial products already available in the market for 3D laser mapping, for example, scanning LiDAR from Faro (http://www.faro.com/) and Leica (https://lasers.leica-geosystems.com/) company. However, those sensors are usually very expensive, and the sizes of those sensors are usually very big. Besides, most of them are only suitable for "stop-and-scan" applications, which means you must put the sensor statically to acquire data and process the data offline to get 3D model of the environments. Therefore, those sensors usually are not suitable for robotics application. In robotics area, the most widely used commercial 3D range finders are provided by Velodyne (http://velodynelidar.com/) with multibeam models HDL-64E, HDL-32E, and VLP-16. These models differ mainly in the number of laser/detectors located in the rotating head. In recent years, there are also some new 3D range finders based on solid-state LiDAR technology, for example, S3 LiDAR sensor from Quanergy (http://quanergy.com/). The main advantage of these sensors is the high acquisition rate that makes them suitable for vehicles moving at high speeds. However, they have a limited vertical resolution and are more expensive than 2D laser scanners. There are also several hand-held 3D scanners that are commercially available (e.g., Leica's T-scan). Unfortunately, those sensors are primarily intended for object scanning applications. Therefore, they often require modification to the environment and have limited working volume that is not appropriate for largescale mobile applications. In recent years, RGB-D cameras , which are based on structured lighting or TOF (time of flight) technology, become very popular due to its low price and depth measurement ability. However, those sensors are limited in their sensing range, precision, and the lighting conditions in which they can operate. For example, the Asus Xtion has an effective range from 0.5 to 4 meters with precision of around 12 cm at 4 m range and does not work in bright sunlight. Stereo cameras are also widely studied for real-time 3D reconstruction . However, stereo cameras have similar precision characteristics, and their performance is dependent on lighting conditions and textural appearance of the environments.
Besides those commercial 3D scanning sensors, researchers have also developed many customized 3D sensors for their specific applications. Up to now, several customized 3D laser scanners have been developed for autonomous robots by means of a rotating laser scanner with a 360[degrees] horizontal FoV, allowing for detection of obstacles in all directions. For example, Morales et al.  designed a low-cost 3D laser rangefinder based on pitching a commercial 2D rangefinder around its optical center. They also developed a 3D laser scanner by continuously rotating a 2D laser scanner . However, in their work, they only describe the design and development of the hardware of the sensor and do not propose efficient algorithms for processing the received 3D point cloud. Therefore, those sensors are generally suitable for "stop-and-scan" applications. Zhang and Singh  proposed a continuously scanning system LOAM which could be a back-and-forth or continuous rotation configuration. The mapping results of their system are very impressive. However, they do not provide detailed description of their hardware system. Besides, the author only proposed LiDAR odometry and mapping algorithms. Though the local drift is very low, it is not suitable for large-scale mapping since accumulative error is inevitable. Bosse et al.  also proposed a sensor called Zebedee, which is constructed from a 2D range scanner coupled with a passive linkage mechanism, such as a spring. By shaking the sensor, the device's field of view could be extended outside of its standard scanning plane. However, this system needs to collect the data first and then process the data offline. Therefore, it is not suitable for real-time applications, such as autonomous navigation of mobile robots.
In this paper, we present a low-cost customized 3D laser scanner based on a rotating 2D laser scanner which is suitable for both autonomous navigation and large-scale environment reconstruction. We not only describe the details of the hardware design but also present a real-time motion estimation algorithm that can continuously estimate the trajectory of the sensor and build the 3D point cloud of the environment simultaneously. The proposed 3D sensor system is advantageous due to its mechanical simplicity, mobility, low weight, low cost, and real-time estimation. Therefore, our sensor can not only be used for autonomous navigation but also be suitable for surveying and reconstruction of areas. To validate the performance of the proposed system, we carried out several experiments to verify its accuracy, robustness, and efficiency. The experimental results show that our system can accurately estimate the trajectory of the sensor and build a quality 3D point cloud map simultaneously.
The rest of the paper is organized as follows. In Section 2, a brief discussion of related work is presented. Section 3 describes the hardware and software system of our sensor. Then, our real-time pose estimation, 3D mapping, and global map optimization algorithms are detailed in Section 4. In Section 5, we validate our system in different environments. Section 6 concludes the paper.
2. Related Work
In the recent year, several different kinds of customized 3D laser scanner system built from a 2D laser scanner have been proposed in the robotics community. Those systems mainly differ in the driving mechanism [13, 16, 17] and data processing methods [18-20]. In the following, we will have a brief review of the existing systems and the reconstruction methods they used.
2.1. Driving Mechanism. Currently, there are several ways to make 3D scanners from a 2D laser scanner. According to its driving mechanism, they can be divided into two groups, namely, passive driving mechanism and active driving mechanism.
The first type is to use passive mechanisms. For example, Bosse et al.  proposed a sensor called Zebedee, which is constructed from a 2D range scanner coupled with a passive linkage mechanism, such as a spring. By mounting the other end of the passive linkage mechanism to a moving body, disturbances resulting from accelerations and vibrations of the body propel the 2D scanner in an irregular fashion, thereby extending the device's field of view outside of its standard scanning plane. By combining the information from the 2D laser scanner and a rigid-mounted industrial-grade IMU, this system can accurately recover the trajectory and create the 3D map. However, this sensor is not very good for mobile navigation since if the robot runs on a flat floor, there would be no enough vibrations to actuate the sensor. Kaul et al.  also proposed a passively actuated rotating laser scanner for aerial 3D mapping. A key feature of the system is a novel passively driven mechanism to rotate a lightweight 2D laser scanner using the rotor downdraft from a quadcopter. The data generated from the spinning laser is input into a continuous-time simultaneous localization and mapping solution to produce an accurate 6DoF trajectory estimate and a 3D point cloud map.
The second type is to use active mechanisms. For this type, usually there is a motor actively driving the sensor around one axis. When rotating a 2D unit, two basic 3D scanning configurations are possible: pitching (x-axis rotation) and rolling (y-axis rotation) scans. For example, Morales et al.  designed a low-cost 3D laser rangefinder based on pitching a commercial 2D rangefinder around its optical center. The pitching scan is preferable for mobile robotics and area monitoring because it obtains information of the region in front of the 3D rangefinder faster than the rolling scan. This is because the rolling scan always requires a complete 180[degrees] rotation to avoid dead zones. However, compared to pitching scanner, the advantage of a rolling scanner is that its field of view can be widened to 360[degrees]. Therefore, they also developed a 3D laser scanner by continuously rotating a 2D laser scanner .
2.2. Motion Estimation and Registration Methods. For the customized 3D LiDAR sensor which is based on a 2D laser scanner, there are different ways to estimate the motion of sensor and reconstruct the environment using the perceived 3D point cloud. The existing method could be divided into two types: stop-and-scan methods and continuous methods.
Previously, most existing 3D mapping solutions either eliminate sensor motion by taking a stop-and-scan approach or attempt to correct the motion using odometric sensors, such as wheel or visual odometry. Many researchers take the approach of frequently stopping the platform in order to take a stationary scan . However, for mobile robotics system, this type of behavior is undesirable as it limits the efficiency of the task. Since there is no movement when taking a scan, the advantage of this kind of scanning is that there is no distortion in the perceived 3d point cloud. Therefore, it is easier to register the point cloud. As to the registration methods, several approaches have been proposed to estimate the motion by means of 3D scan registration [19, 22, 23]. Most of these approaches are derived from the iterative closest Point (ICP) algorithm [18, 24]. For example, generalized ICP (GICP)  unifies the ICP formulation for various error metrics such as point-to-point, point-to-plane, and plane-to-plane. Beside ICP methods, the 3DNDT  method discretizes point clouds in 3D grids and aligns Gaussian statistics within grid cells to perform scan registration.
Nowadays, people prefer to develop continuous estimation methods for a rotating 2D laser scanner. However, collecting consistent 3D laser data when the sensor is moving is often difficult. The trajectory of the sensor during the scan must be considered when constructing point clouds, otherwise the structure of the cloud will be severely distorted and potentially unrecognizable. To solve this problem, several methods have been proposed. For example, a two-step method is proposed to remove the distortion : an ICP-based velocity estimation step is followed by a distortion compensation step, using the computed velocity. A similar technique is also used to compensate for the distortion introduced by a single-axis 3D LiDAR . However, if the scanning motion is relatively slow, motion distortion can be severe. This is especially the case when a 2-axis LiDAR is used since one axis is typically much slower than the other. Therefore, many researchers try to use other sensors to provide velocity measurements to remove the distortion. For example, the LiDAR cloud can be registered by state estimation from visual odometry integrated with an IMU [5, 27].
3. System Overview
In this section, we will give a detailed description of the hardware and software system of our sensor. We first describe the hardware system, then present the architecture of the software system.
3.1. Hardware System. The 3D laser scanner is based on spinning a 2D laser scanner. The mechanical system of the 3D reconstruction system is composed of two parts, namely, the rotating head and driving body as shown in Figure 1. The head is mainly composed of a continuously rotating 2D laser scanner. To make the 3D point cloud registration easy, we align the rotating axis y along with the scanning plane [x.sub.1][y.sub.1]. Axis [y.sub.1] of the LiDAR sensor is motored by a step motor with a coupling. Since the scanner will rotate continuously, we connect its signal and power lines to a slip ring to solve revolving problem. The step motor is equipped with an encoder to record the rotating angle of the 2D scanner.
The whole 3D sensor system consists of a 2D laser scanner, a step motor driving system, and a PC for real-time registration as shown in Figure 2. The PC is a Surface Pro3 tablet from Microsoft, which has an i7 processor, 256 Gb solid hardware and 12-inch touch screen for displaying. The PC is responsible for receiving scans from the 2D laser scanner and the encoder data from the step motor, then converting the 2D scans into 3D point cloud for LiDAR odometry estimation and mapping, and finally displaying the reconstructed 3D point cloud map in real time. The 2D laser scanner used in our system is a Hokuyo UTM-30LX, which is a 2D time-of-flight laser with a 270 deg field of view, 30 m maximum measurement range, and 40 Hz scanning rate. The dimension of the UTM-30LX is 60 x 60 mm, and its mass is 210 g, which makes it ideal for low-weight requirements. The motion controller is responsible for controlling the rotating speed of the 2D laser scanner, which is composed of an ARM STM32 board, a step motor, an encoder, and a communication board. The whole system is powered by a 12 V lithium battery.
3.2. Software Systems. The software of the whole 3D reconstruction system is composed of two parts, namely, low-level motion control and high-level reconstruction. The low-level motion control module is responsible for controlling the 2D laser scanner rotating at a constant speed and reading and sending the encoder data to the PC via USB-RS232 data line. The high-level reconstruction module is responsible for receiving the scanning data and encoder data and then estimating the motion of the sensor and registering the received point cloud into a 3D map.
3.2.1. Low-Level Motion Control. The 2D scanner is controlled to rotate at a speed of 180[degrees]/s by the STM32 embedded controller. The overall precision of the LiDAR sensor strongly depends on the measurement precision of the instantaneous angle of axis y (as shown in Figure 1), henceforth denoted as [beta]. To determine that angle, a high-precision absolute rotary encoder is used. The encoder's resolution is 10 bits. In general, each scan point must be related to the very value of [beta] (continuously provided by the encoder) at the very time when this point is measured. However, the UTM-30LX does not output point after point in real time. Fortunately, the UTM-30LX triggers a synchronization impulse after it finishes a complete scanning. The embedded controller reads the encoder angle data at the exact time it receives a trigger signal. And finally, the embedded controller sends the raw angle and scans to Surface Pro3 via USB to serial converter. We assume the rotation speed of 2D laser scanner during a scan is constant, then we can calculate the exact angle when each point is measured.
3.2.2. High-Level Reconstruction Software. After receiving the 2D scan and angle data from the 2D laser scanner and the motion controller, the high-level reconstruction algorithm needs to estimate the motion of sensor based on the received data and construct the 3D map in real time. The whole architecture of the motion estimation and reconstruction software is shown in Figure 3. We first need to convert the 2D scans into 3D point cloud by computing the Cartesian coordinates of each point. Then, we need to calibrate the sensor since small errors in the attachment of the 2D device to the rotating mechanism provoke a big distortion in the point cloud. Finally, we need to estimate the motion of the sensor and register the increasingly perceived 3D point cloud into a 3D world map.
(1) Generate 3D Point Cloud. The Microsoft Surface PC receives scan data from the 2D laser scanner which is a vector composed of angle [theta] and range measurement [rho]. At the same time, the computer receives the rotating angle [phi]. The whole coordinate systems are shown in Figure 4. A point of the 2D device is given by its polar coordinates: angle [theta] and range [rho]. If we assume the reference frame oxyz of the system is defined as coincident with that of the 2D device, then, the Cartesian coordinates of the point cloud can be computed from following equation:
[mathematical expression not reproducible]. (1)
(2) Geometric Calibration. Practically, due to the installation error, small errors in the attachment of the 2D device to the rotating mechanism results in [y.sub.1] not perfectly aligned with y as shown in Figure 5. This error will result in a distortion in the point cloud computed with 1. To remove the distortion, we need to calibrate the sensor. For our customized 3D laser scanner which is a 3D scanner with a 2D laser rangefinder rotating on its optical center, the calibration method proposed in  can be used to obtain mechanical misalignments. After calibration, 2 can be employed to obtain 3D Cartesian coordinates of a point in the 3D frame instead of 1.
[mathematical expression not reproducible], (2)
where [THETA] = [[theta].sub.0] + [theta] and [[theta].sub.0] represent the zero angle of the rotation mechanism. [[theta].sub.0] is an extrinsic parameter that depends on the installation of the sensor, which will not provoke distortion.
(3) Motion Estimation and Reconstruction. After calibration, if we place the sensor at a place statically, we can get a locally undistorted point cloud. However, in our system, we want to carry the sensor and move around. The movement of the sensor will cause the distortion of point cloud; therefore, we need to continuously estimate the trajectory (laser odometry) of the sensor and remove the distortion of the point cloud. After that, we can register the undistorted point cloud into a global map (mapping). To reduce the local drift of laser odometry estimation, we build a local map and calculate the laser odometry based on local map instead of last sweep. To reduce the global drift for large-scale reconstruction, we first use ELCH  algorithm to coarsely distribute the loop-closure error. Then, we use g2o  framework to finely optimize the pose graph to get globally consistent map (pose graph optimization). The details of motion estimation and graph optimization algorithms will be described in the following section.
4. Motion Estimation and Reconstruction Algorithm
Large-scale environment reconstruction by using 3D laser scanners has been widely used [26, 30]. The challenge for our system is that our 3D reconstruction system is based on a rotating 2D laser scanner; therefore, there are big distortions as our sensor moves, which pose big challenges for correct and accurate registration. Currently, most 3D reconstruction systems estimate the motion and reconstruct the environment through offline processing. They usually collect the data using the sensor and then process those data on a powerful computer to get accurate model of the environment. In contrast to them, our algorithm runs in real time. We continuously estimate the motion of the sensor, remove the distortion of the point cloud, and register the local point cloud to get the whole map. In the following, we will detail the process of motion estimation, mapping, and global optimization algorithms.
4.1. Odometry Estimation Algorithm. First, we assume that the movement of the sensor is continuous and smooth over time, without abrupt changes. As a convention, we use right uppercase superscription to indicate the coordinate systems. We define a sweep as the sensor complete one-scan coverage. We use right subscription k to indicate the sweeps and [P.sub.k] to indicate the point cloud perceived during sweep k. The function of odometry estimation is to estimate the relative motion of the sensor during a sweeping time and use the estimated motion to remove the distortion of sweep [P.sub.k] to get undistorted point cloud. We use similar idea as described in  to estimate the odometry. The difference is that they use scan-to-sweep strategy for relative pose estimation. But in our work, in order to improve the robustness, we use scan-to-local map strategy instead of scan-to-sweep strategy. The detail of the odometry estimation is as follows.
4.1.1. Feature Extraction. First, we extract feature points from the LiDAR cloud [P.sub.k]. For each scan, it returns range values on a scan plane with 0.25[degrees] resolution. However, as the laser scanner rotates at an angular velocity of 180[degrees]/s and generate scans at 40 Hz, the resolution in the perpendicular direction to the scan planes is 180[degrees]/40 = 4.5[degrees]. We select two kinds of features which are edge points and planar points. Let [X.sup.L.sub.(k,i)] be a point in [P.sub.k] (the superscript L means in laser coordinate system), and let S be the set of consecutive points of i returned by the laser scanner in the same scan. We use the following equation to determine whether a point belonged to edge or planar points:
[mathematical expression not reproducible]. (3)
According to the c value, the points in a scan can be classified into edge points and planar points. The edge points usually have big c values while the planar points have small c values. We select a feature as planar feature if its c value is smaller than a threshold and a feature as edge feature if its c value is bigger than a threshold. In our implementation, the threshold is determined by experiments. In order to evenly distribute the feature points, we also divided a scan into four identical subregions. In each subregion, there are 2 edge points and 4 planar points in maximum.
4.1.2. Feature Matching
(1) Feature Projecting. Since we assume that the angular and linear velocities of the sensor during a sweep are constant, therefore this allows us to linearly interpolate the pose transform within a sweep for the points that are received at different times. Given a point i, i [member of] [P.sub.K+1], and its corresponding timestamp [t.sub.i] and let [T.sup.L.sub.(k+1,i)] be the pose transform between [[t.sub.k+1], [t.sub.i]], then [T.sup.L.sub.(k+1,i)] can be computed by linear interpolation of [T.sup.L.sub.k+1]:
[T.sup.L.sub.(k+1,i)] = [[t.sub.i] - [t.sub.k+1]/t - [t.sub.k+1]][T.sup.L.sub.k+1]. (4)
Assuming that we have point cloud [P.sub.k+1], then we can extract edge feature points [[epsilon].sub.k+1] and planar feature points pk+1 using 3. We transform features [[epsilon].sub.k+1] and [[mu].sub.k+1] to the starting time of sweep k +1 and denote them as [[epsilon].sub.k+1] and [[??].sub.k+1], then we can get the following equation according to 4:
[mathematical expression not reproducible], (5)
where [X.sup.L.sub.(k+1,i)] is a point i in [[epsilon].sub.k+1] or [[mu].sub.k+1] and [[??].sup.L.sub.(k+1,i)] is the corresponding point in [[??].sub.k+1,i] or [[??].sub.k+1].
(2) Feature to Local Map Matching. For relative motion estimation, a common strategy is to compute the transform between the current sweep [P.sub.k+1] to the last sweep [P.sub.k]. This is a kind of strategy similar to frame-to-frame matching strategy which is widely used in visual odometry. However, this strategy is not very robust. This is because the time duration of one sweep is very short; if there is a sharp turn around a corner, the scanning scene will change dramatically. If we just register current scan to the last sweep, the registration may fail due to insufficient overlap. Here, we use a local map strategy. Instead of matching to the last sweep, we create a local map by accumulating the last 3 sweeps and then match current scan to the created local map, which improves the robustness of the whole algorithm.
At the beginning of running our system, we assume the sensor is static; therefore, we concatenate the first 3 full sweeps as a local map [M.sup.L.sub.k], (k [greater than or equal to] 3). We transform the local map [M.sup.L.sub.k] to time stamp [t.sub.k+1] and denote it as [[bar.M].sup.L.sub.k] and store it in a 3D kd tree. For [[epsilon].sub.k+1] and [[mu].sub.k+1], we need to find the corresponding edges and planes in [[bar.M].sup.L.sub.k]. The feature matching process is described in Figure 6. Let i be a point in [[??].sub.k+1], i [member of] [[??].sub.k+1]. The edge line is represented by two points. Let j be the closest neighbor of i in [M.sub.k], and let l be the closest neighbor of i in the two consecutive scans to the scan of j as shown in Figure 6(a). Therefore, (j, l) forms the correspondence of i. To verify if both j and l are edge points, we check the smoothness of the local surface based on 3. Figure 6(b) shows the procedure of finding a planar patch as the correspondence of a planar point, where (j, l, m) forms the correspondence of point i, i [member of] [[??].sub.k+1].
(3) Error Metric Computation. After finding the correspondences, we can calculate the distance from a feature point to its correspondence. We will recover the motion by minimizing the overall distances of the features.
For an edge point [mathematical expression not reproducible], if (j, l) is its corresponding edge line [mathematical expression not reproducible], then the point to line distance can be computed as
[mathematical expression not reproducible]. (5)
For a planar point [mathematical expression not reproducible] is the corresponding planar patch [mathematical expression not reproducible], then the point to plane distance is
[mathematical expression not reproducible]. (7)
(4) Motion Estimation. Recall that 6 and 7 compute the distances between feature points and their correspondences. By combining 4, 5, 6, and 7, we can get the geometric relationship between the feature point and their correspondences:
[f.sub.[epsilon]]([X.sup.L.sub.(k+1,i)], [T.sup.L.sub.k+1]) = [d.sub.[epsilon]], i [member of] [[epsilon].sub.k+1], (8)
[f.sub.[mu]]([X.sup.L.sub.(k+1,i)], [T.sup.L.sub.k+1]) = [d.sub.[mu]], i [member of] [[mu].sub.k+1]. (9)
For each feature point in [[epsilon].sub.k+1] and [[mu].sub.k+1], we can get one of the above equation. Stack all the equations and we can get
f([T.sup.L.sub.(k+1)])= d, (10)
where each row corresponds to one feature point and d is its corresponding distance. Our object is to minimize d towards zero through nonlinear iterative optimization to estimate the transform [T.sup.L.sub.(k+1)]. Taking the first-order Taylor expansion at [T.sup.L.sub.(k+1)], we get
f [approximately equal to] f([T.sup.L.sub.(k+1)]) + A([T.sup.L.sub.(k+1)])([T.sup.L] - [T.sup.L.sub.(k+1)]), (11)
where A([T.sup.L.sub.(k+1)]) is the Jacobian matrix off at [T.sup.L.sub.(k+1)]. Therefore, we convert the nonlinear optimization problem into a linear least square problem:
[mathematical expression not reproducible]. (12)
Finally, we can get
[T.sup.L.sub.(k+1)] [left arrow] [T.sup.L.sub.(k+1)] [DELTA]T, (13)
where [mathematical expression not reproducible].
4.2. Mapping Algorithm. At the end of sweep k +1, the odometry estimates an undistorted point cloud Pk+l and simultaneously a pose transform [T.sup.L.sub.k+i] which is the sensor motion during the sweep between time[[t.sub.k+1], [t.sub.k+2]]. The mapping algorithm matches and registers the undistorted point cloud [[bar.P].sup.L.sub.k+1] to the global map in the world coordinate, as shown in Figure 7. In the figure, the blue curve denotes the global pose [G.sup.W.sub.k] of the LiDAR sensor at the end time of sweep k, and the orange curve denotes the pose estimation [G.sup.L.sub.k+1] for sweep k +l from the LiDAR odometry estimation algorithm. Combining [G.sup.W.sub.k] and [G.sup.L.sub.k+1], we can get an initial guess of transform [G.sup.W.sub.k+1]. Then, we can transform the local undistorted point cloud [[bar.P].sup.L.sub.k+1] to the global world coordinate, denoted as [[bar.Q].sub.k+1] and shown as green cloud in Figure 7. Next, the mapping algorithm matches [[bar.Q].sub.k+1] to [Q.sub.k] to get a refined pose estimation [G.sup.W.sub.k+1] and register [[bar.Q].sub.k+1] onto [Q.sub.k] to get a new map [Q.sub.k+1].
4.3. Pose Graph Optimization. In the previous section, we obtain the refined pose transformation by the motion estimation in the mapping algorithm. However, since the accumulative error is inevitable in the calculation process, we cannot construct globally consistent map for large-scale environments. Here, we first utilize ELCH  as the coarse graph optimization method to solve the global map consistency problem to a certain extent by finding the minimum accumulated error in the map and trajectory estimation. Then, we use g2o  for fine graph optimization which considers the relationship among global constraints.
4.4. Coarse Graph Optimization. The coarse graph optimization method is mainly based on geometric relationship of point cloud. First, we save the point cloud data every 3 s as a keyframe, which is stored in a vector containing keyframes and pose information. Then, the Euclidean distance between the current frame and all the historical keyframes is calculated. When the distance is less than the threshold of 1.5 m, it is considered that there potentially exists a closed loop. Since the previous frames adjacent to the current frame are also easily able to meet the above requirement, therefore these data should be excluded. Here, we use ELCH , a heuristic algorithm, to do coarse graph optimization. The approach's key idea is that it will spread the final accumulated error to every frame according to the value of uncertainty.
When loop closure is detected, we use ICP algorithm to calculate the accumulated error transformation matrix AT between the start keyframe and end keyframe in a closed loop. This [DELTA]T is then assigned to each node in the SLAM graph. Here, the SLAM graph is denoted as G = (V, E), where the vertex V represents the pose of the scanned keyframes and E is the uncertainty between the vertices. Suppose that an undirected graph G = (V, E) has been constructed, which contains two special nodes [v.sub.f] and [v.sub.l], denoting the starting and ending nodes of the closed loop in the graph, with corresponding weights of 0 and 1, respectively.
Then, according to the Dijkstra algorithm, we obtain the uncertainty value equivalent to weight [w.sub.k] The Dijkstra algorithm is used to calculate the shortest path from each node to other nodes in the set S. Since the initial set contains only closed-loop nodes [v.sub.f] and [v.sub.f], the first iteration only calculates the shortest path from the beginning to the end, and the cost of the path is d([v.sub.l], [v.sub.f]):
d([v.sub.l], [v.sub.f]) := [summation over (edge<i,j>[member of]path)] [c.sub.i,j]. (14)
[c.sub.i,j] in 14 represents the edge of the connected nodes Vf and [v.sub.f], indicating the uncertainty of the edge, so the cost of a path represents the sum of the uncertainties of all the edges in the path. Based on this path cost value, the weight of each node in the path is updated according to following equation:
[w.sub.i] = [w.sub.s] + [d([v.sub.s], [v.sub.i])/d([v.sub.s], [v.sub.e])]([w.sub.e] - [w.sub.s]). (15)
[w.sub.e] and [w.sub.s] in 15 denote the last and first nodes of the current closed loop.
While updating the weight of each node in the path, it is also necessary to detect the connectivity between the nodes. For example, when the degree of a node exceeds 2, the node is added to the set S. When a closed-loop path is computed, all edges in the path are removed from the graph G, and then the degree of the nodes [v.sub.s] and [v.sub.e] is determined. If the node degree is zero, the node is also removed from the graph. After that, the coarse graph optimization algorithm continues iteratively to calculate the remaining points of the set. The processing flow is the same as the above process until all the closed loops are processed; thus, the weight value of each node in the graph is obtained.
4.5. Fine Graph Optimization. In the previous section, we introduce the coarse graph optimization. However, it only considers the local constraints in the optimization process and does not consider the relationship among the global constraints. In this paper, a general framework for optimization (g2o)  is also used for fine graph optimization. g2o is an extensible template library written in C++ that provides an easy-to-use framework for handling SLAM-related issues. Here, for our laser three-dimensional reconstruction system, we only use the pose optimization, as shown in Figure 8.
In our 3D laser SLAM system, we send the edges after the coarse graph optimization to g2o. At the same time, we continue to create the local constraints by connecting each key frame to the first three keyframes through ICP algorithm to establish a constraint relationship. If the matching in the ICP is successful, the matching constraint is taken as a new edge in g2o. We set the number of optimization iterations to 100. We use Levenberg-Marquardt algorithm as the nonlinear optimization algorithm and Huber kernel as the edge kernel function. An illustrative example is shown in Figure 9:
5. Experiments and Analyses
The customized 3D reconstruction sensor has been tested in variant practical indoor environments. Since the used Hokuyo UTM-30LX laser scanner is an indoor laser scanner, we did not carry out experiments in outdoor environments. We carried out 3 different indoor experiments to test the performance of our system. We evaluate the quality of our system by considering both the map and trajectory accuracies. In the first experiment, we quantitatively analyze the performance of our system using a motion capture system. We compared the estimated trajectory of the sensor to the ground-truth data provided by the motion capture system to show the estimation accuracy of our system. In the second experiment, we put our mapping sensor on a ground mobile robot. On the ground mobile robot, we also have a 2D laser scanner, which performs a 2D SLAM algorithm by using wheel odometry and laser data. We compare the trajectory between 2D SLAM and 3D mapping to show the accuracy of the estimation in large indoor environment. We also compared the 2D map projected from 3D global map to the 2D map generated by the 2D SLAM to show the accuracy of the mapping. In the third experiment, we test our sensor in indoor long corridors by manually carrying the sensor. In this experiment, we show the importance of loop-closure detection and global pose optimization for reducing global drift in challenging environments.
During experiments, the system processing the LiDAR data using a Microsoft Surface Pro 3 tablet with Intel i7 4650U CPU and 8 Gb memory running Ubuntu 14.04 and ROS Indigo. We developed our algorithms using C++, PCL 1.7, Eigen. The experiment video can be found at https:// youtu.be/qXESSWWK1rQ or in the Supplementary Material (available here).
5.1. Motion Capture Experiments. Quantitative analysis can be performed on the trajectory provided a ground-truth trajectory estimate is available. We therefore evaluate the accuracy of the trajectory in a motion capture environment in which we can track the position of our sensor with high precision. An OptiTrack motion capture system consisting of 8 Prime 13W cameras is capable of tracking the positions of tags containing multiple reflective targets to millimeter precision at more than 200 Hz, which is good enough as ground truth.
In this experiment, the sensor body follows an irregular path with the sensor held at varying heights. The sensor moves at a speed about 0.5 m/s. The environment containing the OptiTrack system is a 7 x 6 m room as shown in Figure 10, with few computer workstations and furnitures. We mounted several reflective tags on top of our LiDAR sensor to provide 6DoF tracking as it is moved around in the room. The estimated trajectory and ground-truth trajectory from the OptiTrack system are plotted in Figure 10.
We quantify the accuracy of the estimated trajectories using the measure of the absolute trajectory error (ATE) proposed by . It is based on determining relative and absolute differences between estimated and ground-truth poses. Global consistency is measured by first aligning and then directly comparing absolute pose estimates (and trajectories):
ATE([F.sub.i:n]) := [([1/m][m.summation over (i+1)][[parallel]trans([F.sub.i]([DELTA]))[parallel].sup.2]).sup.1/2] (16)
with [F.sub.i]([DELTA]) := [Q.sup.-1.sub.i] [SP.sub.i], where S is the rigid body transformation mapping the estimated trajectory [P.sub.i:n] to the ground-truth trajectory [Q.sub.i:n]. In Figure 11, we show the trajectory estimates obtained from our system as well as the deviation from the ground-truth trajectory. The mean translational and rotational errors are 0.049 m and 0.536[degrees], respectively. From Figures 10 and 12, you can see that the experimental environment is very clear. Most times, there are only smooth walls and glass windows. These are very challenging for laser-based reconstruction system, since there are only few geometry features and laser sensor does not work well with glass. In the future, we will incorporate visual information to further improve the robustness and accuracy of the system in geometry feature-less environments.
5.2. Mobile Mapping Experiments. Though in motion capture room we can get accurate ground truth, the environment is simple and small. In this experiment, we put our sensor on a ground mobile robot as shown in Figure 13. The robot is driven through a big office environment with tables, chairs, and computers. This environment was chosen to test the performance of our system in big and complex indoor environments. Since it is difficult to get ground truth in big indoor environment, we compare the estimated 6DoF trajectory with the ground-truth trajectory computed from the robot's 2D laser SLAM system to show the performance of our system.
The top figures of Figure 14 depict the 2D view of the estimated trajectory of our sensor and the trajectory computed from the 2D laser SLAM. The total length is about 60 m. As you can see, our estimated trajectory can accurately align with the 2D SLAM trajectory in most times. The difficult motion for accurate recovering is fast spot turns, when portions of the sweeps might not be sufficiently constrained. For example, when the sensor is turning around a corner where there is only one smooth wall, it is difficult to recover the translation since the problem becomes a degeneration problem. The bottom figure of Figure 14 shows an oblique view of the 3D point cloud generated from an experiment and overlaid with the points from the horizontal laser. In the zoomed-in region, we can see the undistorted shapes of the environments. Besides, from top left figure of Figure 14, we can see that our 3D point cloud model aligned with the 2D occupancy grid map very accurately; therefore, our system works very well in complex indoor environments. The mean translational and rotational errors in these experiments are 0.023 m and 0.373[degrees], respectively.
5.3. Hand-Held Experiments. We also tested our sensor and algorithms by carrying our system and walking around in indoor environments. We conducted tests to measure accumulated drift of the motion estimate. Since it is difficult to get 6DoF ground-truth data in large indoor environments, we choose a 30 mx20 m rectangular hallway which contains a closed loop. This environment is very challenging for our system since the long corridor contains little geometry features that our algorithms can use. We started our sensor from one place, then walked around, and finally went back to the same place. The motion estimates generate a gap between the starting and finishing positions, which indicates the amount of drift. In practice, there is an accumulative error. The experimental results show that the growth of translational errors is about to be less than 2% of distance traveled and rotational errors less than 0.3[degrees] per meter. If we do not have loop-closure detection and global optimization, the map will become inconsistent after a big loop traveling. However, since our system has loop-closure detection and coarseto-fine graph optimization, the error can be minimized to make the map consistent as shown in Figure 15. The left figure of Figure 15 shows the map before optimization and the right figure shows the map after optimization. From the reconstructed map in the red box, it can be seen that the map shows obvious inconsistency before using the closed-loop correction algorithm. After optimization, the inconsistency is minimized. The green line is the precorrection trajectory and the red line is the closed-loop corrected trajectory. This experiment shows the importance of loop-closure detection and global optimization for long-distance movement.
5.4. Odometry Estimation Comparison. In this part, we did an experiment to show the improvement of our scan-to-local map method compared to original scan-to-sweep method. We carried out the experiment in a long-corridor environment of one building in our university. The long corridor is around 200 meters and it has two closed loops. There are some geometry features in the corridor. LOAM uses scan-to-sweep method to match the feature points which however leads to a big accumulative error and makes the map distorted. We use scan-to-local map matching method, which in a way can reduce the local accumulative error and distortion by using more information in matching corresponding points.
The experiment results are shown in Figure 16. The left column figures are the results of LOAM method, while the right figures are ours. The first row shows the map in top view, while the second row shows the map in side view. For fairness, we disabled the loop-closure function in our system and run both algorithms using the same dataset. From the results, you can see that there is a big drift of LOAM algorithm in the map as denoted by the red circle. For the same dataset, however, our method has smaller drift. We can also find that our method has much smaller accumulative error in the z-axis direction from the bottom figures in Figure 16. The experimental results demonstrate that our scan-to-local map method has a lower drift.
This paper has described a 3D laser scanner with 360[degrees] field of view. Our low-cost 3D laser rangefinder consisted of a 2D LiDAR scanner continuously rotating around its optical center which is suitable for 3D perception in robotics and reconstruction in other applications. We also described a parallel motion estimation, mapping, and global pose optimization algorithm that enable our system to output the 6DoF sensor pose and reconstruct consistent 3D map in real time. In the future, we will continue to improve the performance of our system from hardware and software aspects. For the hardware, we will make the sensor more compact and integrate with other sensors like IMU or camera. For the software, we will consider to use sensor fusion methods to further reduce the drift of the estimation algorithm.
Conflicts of Interest
The authors declare that there is no conflict of interest regarding the publication of this paper.
This work was supported by the National Natural Science Foundation of China (no. 61573091 and no. 61673341), Open Research Project of the State Key Laboratory of Industrial Control Technology, Zhejiang University, China (no.ICT170302), Fundamental Research Funds for the Central Universities (N172608005) and Doctoral Scientific Research Foundation of Liaoning Province (no.20170520244).
The supplementary material is a video clip of all the experiments. The experiment video can also be found at https:// youtu.be/qXESSWWK1rQ. (Supplementary Materials)
 A. Hornung, M. Phillips, E. Gil Jones, M. Bennewitz, M. Likhachev, and S. Chitta, "Navigation in three-dimensional cluttered environments for mobile manipulation," in 2012 IEEE International Conference on Robotics and Automation, pp. 423-429, Saint Paul, MN, USA, May 2012.
 M. Nieuwenhuisen and S. Behnke, "Hierarchical planning with 3D local multiresolution obstacle avoidance for micro aerial vehicles," in ISR/Robotik 2014; 41st International Symposium on Robotics, pp. 1-7, Munich, Germany, June 2014.
 M. Schadler, J. Stuckler, and S. Behnke, "Rough terrain 3D mapping and navigation using a continuously rotating 2D laser scanner," KI--Kunstliche Intelligenz, vol. 28, no. 2, pp. 93-99, 2014.
 D. Maier, A. Hornung, and M. Bennewitz, "Real-time navigation in 3D environments based on depth camera data," in 2012 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012), pp. 692-697, November-December 2012.
 D. Droeschel, Holz, and S. Behnke, "Omnidirectional perception for lightweight MAVs using a continuously rotating 3D laser<BR>Omnidirektionale wahrnehmung fur leichte MAVs mittels eines kontinuierlich rotierenden 3D-laserscanners," Photogrammetrie--Fernerkundung--Geoinformation, vol. 2014, no. 5, pp. 451-464, 2014.
 R. F. Salas-Moreno, R. A. Newcombe, H. Strasdat, P. H. J. Kelly, and A. J. Davison, "SLAM++: simultaneous localisation and mapping at the level of objects," in 2013 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1352-1359, Portland, OR, USA, June 2013.
 R. Zlot and M. Bosse, "Efficient large-scale three-dimensional mobile mapping for underground mines," Journal of Field Robotics, vol. 31, no. 5, pp. 758-779, 2014.
 S. Vidas, P. Moghadam, and M. Bosse, "3D thermal mapping of building interiors using an RGB-D and thermal camera," in 2013 IEEE International Conference on Robotics and Automation, pp. 2311-2318, Karlsruhe, Germany, May 2013.
 P. Henry, M. Krainin, E. Herbst, X. Ren, and D. Fox, "RGB-D mapping: using Kinect-style depth cameras for dense 3D modeling of indoor environments," The International Journal of Robotics Research, vol. 31, no. 5, pp. 647-663, 2012.
 A. Nuchter, K. Lingemann, J. Hertzberg, and H. Surmann, "6D SLAM--3D mapping outdoor environments," Journal of Field Robotics, vol. 24, no. 8-9, pp. 699-722, 2007.
 F. Pomerleau, F. Colas, and R. Siegwart, "A review of point cloud registration algorithms for mobile robotics," Foundations and Trends in Robotics, vol. 4, no. 1, pp. 1-104, 2015.
 A. Geiger, J. Ziegler, and C. Stiller, "StereoScan: dense 3d reconstruction in real-time," in 2011 IEEE Intelligent Vehicles Symposium (IV), pp. 963-968, Baden-Baden, Germany, June 2011.
 J. Morales, J. L. Martinez, A. Mandow, A. Pequeno-Boter, and A. Garcia-Cerezo, "Design and development of a fast and precise low-cost 3D laser rangefinder," in 2011 IEEE International Conference on Mechatronics, pp. 621-626, Istanbul, Turkey, April 2011.
 J. L. Martinez, J. Morales, A. J. Reina, A. Mandow, A. Pequeno-Boter, and A. Garcia-Cerezo, "Construction and calibration of a low-cost 3D laser scanner with 360[degrees] field of view for mobile robots," in 2015 IEEE International Conference on Industrial Technology (ICIT), pp. 149-154, Seville, Spain, March 2015.
 J. Zhang and S. Singh, "Low-drift and real-time lidar odometry and mapping," Autonomous Robots, vol. 41, no. 2, pp. 401-416, 2017.
 M. Bosse, R. Zlot, and P. Flick, "Zebedee: design of a springmounted 3-D range sensor with application to mobile mapping," IEEE Transactions on Robotics, vol. 28, no. 5, pp. 1104-1119, 2012.
 J. Morales, J. Martinez, A. Mandow, A. Reina, A. Pequeno-Boter, and A. Garcia-Cerezo, "Boresight calibration of construction misalignments for 3D scanners built with a 2D laser rangefinder rotating on its optical center," Sensors, vol. 14, no. 11, pp. 20025-20040, 2014.
 P. J. Besl and N. D. McKay, "A method for registration of 3-D shapes," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 2, pp. 239-256, 1992.
 A. Nuchter, K. Lingemann, J. Hertzberg, and H. Surmann, "6D SLAM with approximate data association," in ICAR '05. Proceedings., 12th International Conference on Advanced Robotics, 2005, pp. 242-249, Seattle, WA, USA, July 2005.
 T. Stoyanov, M. Magnusson, H. Andreasson, and A. J. Lilienthal, "Fast and accurate scan registration through minimization of the distance between compact 3D NDT representations," The International Journal of Robotics Research, vol. 31, no. 12, pp. 1377-1393, 2012.
 L. Kaul, R. Zlot, and M. Bosse, "Continuous-time three-dimensional mapping for micro aerial vehicles with a passively actuated rotating laser scanner," Journal of Field Robotics, vol. 33, no. 1, pp. 103-132, 2016.
 A. V. Segal, D. Haehnel, and S. Thrun, "Generalized-ICP," in Robotics: Science and Systems V, Seattle, WA, USA, June 2009.
 M. Magnusson, A. Lilienthal, and T. Duckett, "Scan registration for autonomous mining vehicles using 3D-NDT," Journal of Field Robotics, vol. 24, no. 10, pp. 803-827, 2007.
 F. Pomerleau, F. Colas, R. Siegwart, and S. Magnenat, "Comparing ICP variants on real-world data sets: open-source library and experimental protocol," Autonomous Robots, vol. 34, no. 3, pp. 133-148, 2013.
 S. Hong, H. Ko, and J. Kim, "VICP: velocity updating iterative closest point algorithm," in 2010 IEEE International Conference on Robotics and Automation, pp. 1893-1898, Anchorage, AK, USA, May 2010.
 F. Moosmann and C. Stiller, "Velodyne SLAM," in 2011 IEEE Intelligent Vehicles Symposium (IV), pp. 393-398, Baden-Baden, Germany, June 2011.
 S. Scherer, J. Rehder, S. Achar et al., "River mapping from a flying robot: state estimation, river detection, and obstacle mapping," Autonomous Robots, vol. 33, no. 1-2, pp. 189-214, 2012.
 J. Sprickerhof, A. Nuchter, K. Lingemann, and J. Hertzberg, "An explicit loop closing technique for 6D SLAM," in Proceedings of the 4th European Conference on Mobile Robots, ECMR 09, I. Petrovic and A. J. Lilien-thal, Eds., pp. 229-234, KoREMA, Mlini/Dubrovnik, Croatia, 2009.
 R. Kummerle, G. Grisetti, H. Strasdat, K. Konolige, and W. Burgard, "G2o: a general framework for graph optimization," in 2011 IEEE International Conference on Robotics and Automation, pp. 3607-3613, Shanghai, China, May 2011.
 P. Kr, F. Colas, P. Furgale, and R. Siegwart, "Long-term 3D map maintenance in dynamic environments," in 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 3712-3719, Hong Kong, China, June 2014.
 J. Sturm, W. Burgard, and D. Cremers, "Evaluating egomotion and structure-from-motion approaches using the TUM RGB-D benchmark," in IEEE/RJS International Conference on Intelligent Robot, Vilamoura, Algarve, Portugal, 2012.
Zheng Fang, (1) Shibo Zhao, (1) Shiguang Wen, (1) and Yu Zhang (2)
(1) Faculty of Robot Science and Engineering, Northeastern University, Shenyang, China
(2) State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou 310058, China
Correspondence should be addressed to Zheng Fang; email@example.com
Received 5 August 2017; Revised 7 February 2018; Accepted 28 February 2018; Published 16 May 2018
Academic Editor: Paolo Bruschi
Caption: Figure 1: Mechanical structure of the 3D reconstruction system.
Caption: Figure 2: Hardware and functional diagram of the 3D reconstruction system: on the (a) is the hardware system and on the (b) is the functional diagram.
Caption: Figure 3: Software architecture of the 3D reconstruction system.
Caption: Figure 4: (a) Point A in 2D scan plane; (b) after rotating around y-axis to get A'.
Caption: Figure 5: Misalignments between the 2D sensor frame and the 3D sensor frame.
Caption: Figure 6: (a) Finding corresponding edge point and edge line; (b) finding corresponding planar point and planar patch.
Caption: Figure 7: Mapping process.
Caption: Figure 8: g2o pose optimization: each triangle represents laser pose and the blue edge represents the constraint relationship between the pose obtained by the front-end algorithm. By using these constraints, each pose is optimized so as to satisfy the constraints.
Caption: Figure 9: g2o result: (a) the hypothetical pose map optimization; (b) the optimized pose diagram.
Caption: Figure 10: Reconstructed map: the green trajectory is the ground truth from OptiTrack system, and the red trajectory is the estimated trajectory of our SLAM system. The left figure is top view and the right figure is side view.
Caption: Figure 11: Absolute trajectory error of our system. Poses of the trajectory are projected on the XY-plane.
Caption: Figure 12: Motion capture room.
Caption: Figure 13: 3D laser scanner on a turtlebot.
Caption: Figure 14: A comparison between the trajectory estimated by the 3D reconstruction system and the trajectory estimate from the 2D laser SLAM system as ground truth in office environment.
Caption: Figure 15: Indoor loop-closure test: left is before optimization and right is after optimization.
Caption: Figure 16: Comparison of odometry estimation by using scan-to-sweep method with scan-to-local map method: left figures are the results of scan-to-sweep method, which have bigger accumulative drift. Right figures are the results of our method, which have much smaller drift.
|Printer friendly Cite/link Email Feedback|
|Title Annotation:||Research Article|
|Author:||Fang, Zheng; Zhao, Shibo; Wen, Shiguang; Zhang, Yu|
|Publication:||Journal of Sensors|
|Date:||Jan 1, 2018|
|Previous Article:||Neck Flexion Angle Estimation during Walking.|
|Next Article:||An Optical Fiber Liquid Level Sensor Based on Side Coupling Induction Technology.|