The increasing adoption of industrial robot arms in advanced manufacturing has heightened the need for flexible trajectory planning methods that go beyond traditional offline programming (OLP) tools, which are often expensive, proprietary, and limiting. This study introduces an OLP-free pipeline designed to generate robot trajectory data and optimize paths for six-degree-of-freedom (6-DOF) robot arms using discrete reinforcement learning. Initially, five-axis NC code derived from CAD/CAM data is transformed into tool center point (TCP) trajectories through coordinate transformations. An analytical inverse kinematics solver then produces multiple joint solutions for each TCP pose, creating a discrete action space from which the learning agent can select feasible joint configurations along the trajectory. A reward function that considers variations in joint velocity and acceleration, as well as pose error, facilitates the simultaneous optimization of motion smoothness and tracking accuracy. The optimized trajectories are validated using an open-source physics simulator, showing enhanced motion stability, accuracy, and collision safety compared to conventional OLP-based paths. This proposed framework provides a flexible and cost-effective alternative to commercial OLP tools and lays a scalable foundation for future applications in automated and collaborative manufacturing systems.
This paper presents a search methodology for the optimal operational path of robots using a genetic algorithm. The work scheduled to be performed using a robot was characterized. Collision avoidance between the robot including the working tool and the target object was considered. In this study, we followed the general steps of data mining. We compared the time taken by the robot moving along the path created by our proposed methodology with the time taken for the robot along the path created by real humans. The results show that the path generated by this study was more efficient than that of humans.