Article
Title: "Annotation-free Generation of Training Data Using Mixed Domains for Segmentation of 3D LiDAR Point Clouds"
Authors: Konrad Cop, Bartosz Sułek, Tomasz Trzciński
Pages: 347-371
DOI: 10.2478/fcds-2025-0013
Abstract:

Semantic segmentation is important for robots navigating with 3D LiDARs, but the generation of training datasets requires tedious manual effort. In this paper, we introduce a set of strategies to efficiently generate large datasets by combining real and synthetic data samples. More specifically, the method populates recorded empty scenes with navigation-relevant obstacles generated synthetically, thus combining two domains: real life and synthetic. Our approach requires no manual annotation, no detailed knowledge about actual data feature distribution, and no real-life data of objects of interest. We validate the proposed method in the underground parking scenario and compare it with available open-source datasets. The experiments show superiority to the off-the-shelf datasets containing similar data characteristics but also highlight the difficulty of achieving the level of manually annotated datasets. We also show that combining generated and annotated data improves the performance visibly, especially for cases with rare occurrences of objects of interest. Our solution is suitable for direct application in robotic systems,
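
The mixed-domain idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label ids, the function names synthetic_box and mix_scene, and the use of a uniformly sampled box as a stand-in for a simulated obstacle are all assumptions made for illustration. The sketch only shows why no manual annotation is needed: points from the recorded empty scene are background by construction, and inserted points carry the class of the object that produced them. Occlusion handling and sensor modelling are deliberately left out.

```python
import random

# Hypothetical label ids, chosen for this sketch only.
BACKGROUND = 0
OBSTACLE = 1


def synthetic_box(center, size, n_points, rng):
    """Sample points uniformly inside an axis-aligned box.

    A stand-in for a synthetically generated obstacle; a real pipeline
    would presumably use a simulated LiDAR scan of an object model.
    """
    cx, cy, cz = center
    sx, sy, sz = size
    return [
        (
            cx + rng.uniform(-sx / 2, sx / 2),
            cy + rng.uniform(-sy / 2, sy / 2),
            cz + rng.uniform(-sz / 2, sz / 2),
        )
        for _ in range(n_points)
    ]


def mix_scene(empty_scene, obstacle_points):
    """Merge a recorded empty scene with synthetic obstacle points.

    Labels follow from the origin of each point, so no manual
    annotation step is involved.
    """
    points = list(empty_scene) + list(obstacle_points)
    labels = [BACKGROUND] * len(empty_scene) + [OBSTACLE] * len(obstacle_points)
    return points, labels


rng = random.Random(0)
# Toy "recorded" empty scene: a 5 x 5 grid of floor points.
empty_scene = [(float(x), float(y), 0.0) for x in range(5) for y in range(5)]
obstacle = synthetic_box((2.0, 2.0, 0.5), (1.0, 1.0, 1.0), 10, rng)
points, labels = mix_scene(empty_scene, obstacle)
```

Here points holds the combined cloud and labels the per-point classes, ready to be written out as one training sample under the assumptions above.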

Open access to full text at De Gruyter Online