System Architecture - Robot Conveyor Tracking Based on Visual Servo Control and 3D Grasping of Multiple Heterogeneous Objects

3.1 Generalized Robot Fetching Architecture

The generalized system architecture and the required submodules for a robot fetching system are shown in Fig. 3.1. The environment is sensed by 3D sensors that describe the robot's workspace in terms of color, geometry, or both. The sensed environment information is then transmitted to the object recognition module, which recognizes the target objects in the workspace and outputs their pose and type. Since the type of each object in the scene is known, its model can be retrieved from the database for grasp planning. The resulting grasp contains the pose of the end-effector and the configuration of the end-effector. The former goes through a motion planner, which generates a collision-free trajectory for the robot arm to execute, while the latter is sent to the end-effector, which could be a gripper, to actually hold the object.

Fig. 3.1 Generalized robot fetching architecture
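The data flow described above can be sketched in a few lines of Python. This is only an illustration of how the modules connect, not the implementation used in this work: the names `DetectedObject`, `Grasp`, and `plan_fetch`, the six-element pose tuple, and the callables passed in for recognition, grasp planning, and motion planning are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

# Assumed pose layout: x, y, z, roll, pitch, yaw.
Pose = Tuple[float, float, float, float, float, float]


@dataclass(frozen=True)
class DetectedObject:
    """Output of the object recognition module: type and pose."""
    object_type: str
    pose: Pose


@dataclass(frozen=True)
class Grasp:
    """Output of grasp planning: end-effector pose plus its configuration."""
    end_effector_pose: Pose
    gripper_config: float  # assumed to be a jaw opening; purely illustrative


def plan_fetch(
    scene: Any,
    recognize: Callable[[Any], List[DetectedObject]],
    model_db: Dict[str, Any],
    plan_grasp: Callable[[Any, Pose], Grasp],
    plan_motion: Callable[[Pose], List[Pose]],
) -> List[Tuple[List[Pose], float]]:
    """Run sensing output through recognition, grasp planning, motion planning."""
    commands = []
    for obj in recognize(scene):
        model = model_db[obj.object_type]       # retrieve the model by type
        grasp = plan_grasp(model, obj.pose)     # online grasp planning
        trajectory = plan_motion(grasp.end_effector_pose)  # arm trajectory
        commands.append((trajectory, grasp.gripper_config))  # arm + gripper
    return commands
```

Each returned pair holds the trajectory for the arm and the configuration for the end-effector, mirroring the two outputs of the grasp in the text.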

This architecture, though it handles the general situation, suffers from high computational complexity in online grasp planning and motion planning. In an industrial scenario, efficiency, which is closely related to cost and yield rate, is one of the major concerns. We require the robot not only to complete its tasks correctly but also to complete them quickly. As a result, in this research we rely on the assumptions made in subsection 2.3 and replace the online grasp planner and the motion planner modules with an operation database and a trajectory interpolator, respectively.

3.2 Specialized Robot Fetching Architecture

The main objective of this research is to build a sensor-integrated system that can grab moving objects with pose modification on assembly lines. The system architecture is shown in Fig. 3.2.

Why is it called specialized? Because the model functions only in a specific situation.

Fig. 3.2 Flow chart of specialized robot fetching system architecture.

doi:10.6342/NTU201703382

The specialized method is designed for tracking objects. The poses of the robot end-effector are determined from 3D sensor data and fall into two cases. In the first case, we collect a static pose for each object from the visual recognition results, which the Kinect sensor can provide as in the generalized robot fetching architecture. According to the assumption explained in subsection 2.3.1, the workspace does not change while the robot performs its tasks. As a result, we only need to define trajectory via-points along which the robot neither hits itself nor collides with the fixed environment.

Since the environment is fixed, this trajectory remains valid and collision-free, so it can be stored in a database for later use. The object type and pose are therefore sent to the trajectory planning module, which calculates the static grasping pose and commands the robot.
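As a minimal sketch of the idea of stored via-points followed by interpolation, the Python below looks up pre-defined via-points by object type, appends the grasp position, and interpolates linearly between consecutive points. The database contents, the names `OPERATION_DB`, `interpolate`, and `trajectory_for`, and the choice of linear interpolation are assumptions made for illustration; the actual interpolator may use a different scheme.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]

# Hypothetical operation database: object type -> collision-free via-points.
OPERATION_DB: Dict[str, List[Point]] = {
    "block": [(0.0, 0.0, 0.3), (0.2, 0.0, 0.3)],
}


def interpolate(via_points: Sequence[Sequence[float]],
                steps_per_segment: int) -> List[Point]:
    """Linearly interpolate between consecutive via-points."""
    if steps_per_segment < 1:
        raise ValueError("steps_per_segment must be at least 1")
    if not via_points:
        return []
    path: List[Point] = [tuple(via_points[0])]
    for start, end in zip(via_points, via_points[1:]):
        for k in range(1, steps_per_segment + 1):
            t = k / steps_per_segment
            # (1 - t) * s + t * e reproduces the end point exactly at t = 1.
            path.append(tuple((1.0 - t) * s + t * e
                              for s, e in zip(start, end)))
    return path


def trajectory_for(object_type: str, grasp_position: Sequence[float],
                   steps_per_segment: int = 10) -> List[Point]:
    """Stored via-points for this object type, ending at the grasp position."""
    via_points = list(OPERATION_DB[object_type]) + [tuple(grasp_position)]
    return interpolate(via_points, steps_per_segment)
```

Because the via-points are fixed in advance, the only online work is the lookup and the interpolation, which is what makes this cheaper than running a motion planner for every grasp.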

The task becomes harder once the objects are moving on the conveyor. As before, the environment is sensed by 3D sensors, which now describe the conditions on the conveyor. There are two ways to receive information from the sensors: eye-to-hand and eye-in-hand. In this research, the eye-to-hand sensor is a Microsoft Kinect, and the eye-in-hand sensor is a webcam mounted on the robot tooltip. The moving pose derived from vision is a point in the camera coordinate frame, which is transformed into a point in the Cartesian-space coordinate frame. After the tracking block finishes its computation, the instantaneous state of the object, including its position and orientation, is transmitted to the trajectory planning module. The robot manipulator then moves to the designated position and gets ready to catch the object. In general, the perception system keeps updating the state of the objects to the control center after the end-effector poses have been decided. The robot arm is therefore able to grasp moving objects successfully.
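The conversion from a camera-frame point to a Cartesian-space point can be illustrated with a homogeneous transform. The sketch below assumes a known 4x4 transform from the camera frame to the robot base frame (obtained, for example, from an extrinsic calibration); the function names and the example matrix are hypothetical. The constant-velocity prediction in `predict_position` is an added assumption for illustration and is not a statement of the tracking method used in this work.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def camera_to_base(T_base_camera: Sequence[Sequence[float]],
                   p_camera: Sequence[float]) -> Vec3:
    """Map a point from the camera frame to the robot base frame.

    T_base_camera is a 4x4 homogeneous transform (rotation and translation),
    assumed to come from an extrinsic calibration.
    """
    x, y, z = p_camera
    homogeneous = (x, y, z, 1.0)
    return tuple(
        sum(T_base_camera[r][c] * homogeneous[c] for c in range(4))
        for r in range(3)
    )


def predict_position(p_base: Sequence[float], velocity: Sequence[float],
                     lead_time: float) -> Vec3:
    """Assumed constant-velocity prediction of where the object will be."""
    return tuple(p + v * lead_time for p, v in zip(p_base, velocity))


# Example: a camera mounted 1 m above the base origin, looking straight down
# (rotation of 180 degrees about the x axis), shifted 0.5 m along x.
T_EXAMPLE = [
    [1.0, 0.0, 0.0, 0.5],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]
p_base = camera_to_base(T_EXAMPLE, (0.1, 0.2, 0.8))
p_catch = predict_position(p_base, (0.05, 0.0, 0.0), lead_time=2.0)
```

Here `p_catch` stands for the designated position the manipulator would move to ahead of the object, under the stated constant-velocity assumption.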

The eye-in-hand method uses the same architecture. The difference is that the end-effector changes its position to follow the target position in the webcam view. This following behavior is what we refer to as tracking.
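One simple way to picture this following behavior is a proportional correction on the image error: the further the target is from the image center, the larger the commanded end-effector motion. The sketch below is a simplified illustration under several assumptions that are not taken from this work: the image axes are aligned with the end-effector x and y axes, a fixed metres-per-pixel scale applies, and the names `servo_step`, `gain`, and `metres_per_pixel` are hypothetical.

```python
from typing import Sequence, Tuple


def servo_step(target_px: Sequence[float], image_center_px: Sequence[float],
               gain: float, metres_per_pixel: float) -> Tuple[float, float]:
    """Return an (x, y) end-effector correction from the pixel error.

    Assumes image axes are aligned with the end-effector x/y axes and that
    a constant metres-per-pixel scale is valid at the working distance.
    """
    error_x = target_px[0] - image_center_px[0]
    error_y = target_px[1] - image_center_px[1]
    return (gain * metres_per_pixel * error_x,
            gain * metres_per_pixel * error_y)


def simulate_following(target_px: Sequence[float],
                       image_center_px: Sequence[float],
                       gain: float, iterations: int) -> Tuple[float, float]:
    """Toy simulation: each step removes a fraction `gain` of the pixel error."""
    tx, ty = target_px
    cx, cy = image_center_px
    for _ in range(iterations):
        # Moving the camera toward the target shifts the target toward center.
        tx -= gain * (tx - cx)
        ty -= gain * (ty - cy)
    return (tx, ty)
```

With a gain between 0 and 1, the pixel error in the toy simulation shrinks geometrically, which is the sense in which the end-effector follows the target in the webcam view.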

The details of our system are described in Chapter 6.

3.3 Modified Robot Fetching Architecture

The main goal of this thesis is to grasp an object before it falls to the ground. Holding the target may appear to be a successful result. However, on closer inspection, the actual grasping pose may differ slightly from the desired command. If there is an offset or an angular deviation, the task has not yet been completed. To check whether the task is successful, the webcam records images while the gripper is holding the object. The conceptual structure is shown in Fig. 3.3. To detect any variation in the grasping pose, the webcam first stores many pictures of correct poses in a database.

Fig. 3.3 System structure for modified robot fetching.

After these poses have been gathered as a reference standard, the mission of grasping a moving target starts. In this step, the grasping pose is compared with the desired command. The error model is described in detail in subsection 6.3.3.
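The comparison between the grasping pose and the desired command can be pictured as computing an offset and a deviation angle and checking them against tolerances. The sketch below is only an illustration and does not reproduce the error model of subsection 6.3.3: it assumes that a planar pose (x, y, theta) has already been extracted from the webcam image, and the function names and tolerance values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def wrap_angle(angle: float) -> float:
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def grasp_error(measured: Sequence[float],
                reference: Sequence[float]) -> Tuple[float, float]:
    """Return (offset, deviation angle) between two planar poses (x, y, theta)."""
    offset = math.hypot(measured[0] - reference[0],
                        measured[1] - reference[1])
    deviation = abs(wrap_angle(measured[2] - reference[2]))
    return offset, deviation


def grasp_ok(measured: Sequence[float], reference: Sequence[float],
             max_offset: float = 0.005,
             max_angle: float = math.radians(3.0)) -> bool:
    """Accept the grasp only if both errors are within assumed tolerances."""
    offset, deviation = grasp_error(measured, reference)
    return offset <= max_offset and deviation <= max_angle
```

Wrapping the angle difference matters when the two orientations lie on either side of the plus or minus pi boundary, where a naive subtraction would report a deviation of almost a full turn.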
