A Machine Learning and Point Cloud Processing based Approach for Object Detection and Pose Estimation: Design, Implementation, and Validation

Nylund, Simon; Bringager, Fredrik

dc.contributor.advisor	Frank Y. Li
dc.contributor.advisor	Geir Jevne
dc.contributor.advisor	Ajit Jha
dc.contributor.author	Nylund, Simon
dc.contributor.author	Bringager, Fredrik
dc.date.accessioned	2022-09-21T16:24:45Z
dc.date.available	2022-09-21T16:24:45Z
dc.date.issued	2022
dc.identifier	no.uia:inspera:106884834:23917883
dc.identifier.uri	https://hdl.handle.net/11250/3020380
dc.description.abstract	This thesis presents an approach to automating a forklift for lifting and handling pallets. More specifically, the project develops a solution for autonomous object detection and pose estimation using Machine Learning (ML), point cloud processing, and arithmetic calculations. The project is based on a real-life scenario identified together with the industrial partner Red Rock, involving a forklift operation in which the machine is supposed to identify, lift, and handle pallets autonomously. A key to achieving this automation is to localize and classify the pallet and to estimate its Six Dimensional (6D) pose, which includes its (x, y, z) position and (pitch, roll, yaw) orientation. From a position directly in front of the pallet, the pose estimation must work at a distance of around 2 meters and at angles from 0° to ±45°. A systematic solution consisting of two major phases, object detection and pose estimation, is developed to achieve the project goal. For object detection, the You Only Look Once X (YOLOX)-S ML algorithm is selected and implemented. The algorithm is pre-trained on the COCO dataset and then fine-tuned by transfer learning on the Logistics Objects in Context (LOCO) dataset so that it can detect pallets in an industrial environment. To improve detection inference, the algorithm is optimized with the Intel OpenVINO toolkit, reducing inference latency by a factor of more than 2.5 on a Central Processing Unit (CPU). The output of the YOLOX-S algorithm is a bounding box around the pallet, and a custom struct links object detection and pose estimation together. The pose estimation algorithm converts the Two Dimensional (2D) bounding box data into Three Dimensional (3D) vectors, which are used to keep only the relevant points in the point cloud; all irrelevant points from the environment are filtered out.
A series of arithmetic calculations is applied to the filtered point cloud, including Random Sample Consensus (RANSAC) and vector operations; RANSAC is used to find the largest vertical plane of the identified pallet. Based on the object detection output and the pose estimation calculations, a 3D vector and a 3D point are found, which together give the pallet's pose. Several tests and experiments have been performed to evaluate and validate the developed solution. The tests are based on a purpose-built ground truth setup consisting of an AprilTag marker, which provides a robust and precise ground truth measurement. Results from the standstill experiment show that the algorithm estimates the position to within 0.3 and 7.5 millimeters on the x and y axes, and to within 1.6 and 28.6 millimeters on the z-axis. The pitch estimate stays within 3.65° and 5.21°, and the yaw estimate within 0.86° and 2.64°. The standstill results cover the best and worst cases, at 0° and 45°, respectively.
dc.language
dc.publisher	University of Agder
dc.title	A Machine Learning and Point Cloud Processing based Approach for Object Detection and Pose Estimation: Design, Implementation, and Validation
dc.type	Master thesis
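
The pose-estimation phase summarized in the abstract (keep only the points that fall inside the detected bounding box, fit the dominant plane with RANSAC, read the orientation off the plane normal) can be illustrated with a minimal NumPy sketch. This is not the thesis implementation. The function names, the pinhole intrinsics (fx, fy, cx, cy), the camera frame (x right, y down, z forward), the inlier threshold, and the sign convention for yaw are all assumptions made for illustration.

```python
import numpy as np


def crop_points_to_box(points, box, fx, fy, cx, cy):
    """Keep 3D points whose pinhole projection falls inside a 2D box.

    points: (N, 3) array in an assumed camera frame (x right, y down, z forward).
    box: (u_min, v_min, u_max, v_max) in pixels, e.g. from an object detector.
    """
    pts = points[points[:, 2] > 0]             # drop points behind the camera
    u = fx * pts[:, 0] / pts[:, 2] + cx        # pinhole projection (assumed model)
    v = fy * pts[:, 1] / pts[:, 2] + cy
    u_min, v_min, u_max, v_max = box
    inside = (u >= u_min) & (u <= u_max) & (v >= v_min) & (v <= v_max)
    return pts[inside]


def ransac_plane(points, threshold=0.01, iterations=200, seed=0):
    """Fit a plane normal . p + d = 0 with a basic RANSAC loop.

    Returns the unit normal, the offset d, and a boolean inlier mask.
    """
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    best_model = None
    for _ in range(iterations):
        sample = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-12:                       # collinear sample, no plane
            continue
        normal = normal / norm
        d = -np.dot(normal, sample[0])
        mask = np.abs(points @ normal + d) < threshold
        if mask.sum() > best_mask.sum():       # keep the plane with most inliers
            best_mask, best_model = mask, (normal, d)
    if best_model is None:
        raise ValueError("no plane found")
    return best_model[0], best_model[1], best_mask


def yaw_from_normal(normal):
    """Yaw in degrees about the camera y-axis for a roughly vertical plane."""
    n = normal if normal[2] < 0 else -normal   # make the normal face the camera
    return float(np.degrees(np.arctan2(n[0], -n[2])))
```

In this sketch, the mean of the inlier points could serve as the 3D point and the plane normal as the 3D vector; how the thesis actually derives the two is not stated in this record.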

Associated file(s)

Name: no.uia:inspera:106884834:23917 ...
Size: 27.51 MB
Format: PDF

This item appears in the following collection(s)

Master's theses in Information and Communication Technology [508]
MM500, IKT590, IKT591