The online 3D bin packing problem (3D-BPP) arises widely in the logistics industry, and solving it well is of great practical significance for the industry's intelligent transformation. Heuristic algorithms rely heavily on manual experience, which makes it difficult to formulate better packing rules. In recent years, many researchers have solved 3D-BPP with deep reinforcement learning (DRL) algorithms. However, these approaches ignore many skills used in manual packing; one of the most important is that workers set an item aside if it cannot be packed properly. Inspired by this skill, we propose a DRL algorithm with a buffer zone. First, we define the wasted space and the buffer zone. Then, we integrate them into the DRL framework. We also compare bin utilization under different wasted-space thresholds and different buffer zone sizes. Experimental results show that our algorithm outperforms existing heuristic and DRL algorithms.