Sketch-To-Skill

SKETCH-TO-SKILL: Bootstrapping Robot Learning with human-drawn Trajectory Sketches

Anonymous
ICLR 2025


Abstract

Training robotic manipulation policies traditionally requires numerous demonstrations and/or environmental rollouts. While recent Imitation Learning (IL) and Reinforcement Learning (RL) methods have reduced the number of required demonstrations, they still rely on expert knowledge to collect high-quality data, limiting scalability and accessibility. We propose SKETCH-TO-SKILL, a novel framework that leverages human-drawn 2D sketch trajectories to bootstrap and guide RL for robotic manipulation. Our approach goes beyond previous sketch-based methods, which focused primarily on imitation learning or policy conditioning and were limited to specific trained tasks. SKETCH-TO-SKILL employs a Sketch-to-3D Trajectory Generator that translates 2D sketches into 3D trajectories, which are then used to autonomously collect initial demonstrations. We utilize these sketch-generated demonstrations in two ways: to pre-train an initial policy through behavior cloning and to refine this policy through RL with guided exploration. Experimental results demonstrate that, using only sketch inputs, SKETCH-TO-SKILL achieves ~96% of the performance of a baseline model that leverages teleoperated demonstration data, while exceeding the performance of a pure reinforcement learning policy by ~170%. This makes robotic manipulation learning more accessible and potentially broadens its applications across various domains.

Overview



Fig: Learning a new skill with the Sketch-to-Skill framework. Step 1: Capture the task scenario from two views and collect human-drawn sketches. Step 2: Convert the 2D sketches to 3D trajectories using a pretrained generator. Step 3: Execute the generated trajectories to collect experience data. Step 4: Learn the manipulation policy with reinforcement learning, bootstrapped by behavior cloning and guided by the experience data.
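The page does not detail the generator's architecture, but as a purely geometric illustration of how two 2D sketch views constrain a 3D path (Step 2), the sketch below lifts paired waypoints from two calibrated cameras via classic linear (DLT) triangulation. All function and variable names here are ours for illustration, not the paper's:

```python
import numpy as np

def triangulate_sketch(pts_a, pts_b, P_a, P_b):
    """Lift paired 2D sketch waypoints from two calibrated views into a
    3D trajectory via linear (DLT) triangulation.
    pts_a, pts_b: sequences of (u, v) pixel waypoints sampled along each sketch.
    P_a, P_b: 3x4 camera projection matrices for the two views."""
    traj = []
    for (ua, va), (ub, vb) in zip(pts_a, pts_b):
        # Each view contributes two rows of the homogeneous system A @ X = 0.
        A = np.stack([
            ua * P_a[2] - P_a[0],
            va * P_a[2] - P_a[1],
            ub * P_b[2] - P_b[0],
            vb * P_b[2] - P_b[1],
        ])
        # The least-squares solution is the right singular vector of A
        # associated with the smallest singular value.
        _, _, Vt = np.linalg.svd(A)
        X = Vt[-1]
        traj.append(X[:3] / X[3])  # dehomogenize
    return np.asarray(traj)
```

A learned generator can additionally smooth noisy hand-drawn strokes and resolve correspondence between the two sketches, which plain triangulation does not handle.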

Policy Learning



Fig: Overview of SKETCH-TO-SKILL, which integrates sketch-generated demonstrations with reinforcement learning. Sketch-generated experiences train an IL policy, which bootstraps the RL process. A discriminator guides exploration by rewarding similarity to the sketch-generated trajectories. The final action, which combines the IL and RL policy outputs, further strengthens the exploration guidance.
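The two mechanisms in the figure — discriminator-based reward shaping and blending of IL and RL actions — can be sketched as follows. The blending weights `alpha` and `w` and the exact schedules are our assumptions; the page does not specify them:

```python
import numpy as np

def shaped_reward(disc_logit, env_reward, w=0.5):
    """Hypothetical reward shaping: the discriminator scores how closely a
    state-action pair resembles the sketch-generated trajectories, and its
    sigmoid output is added to the environment reward with weight w."""
    guidance = 1.0 / (1.0 + np.exp(-disc_logit))
    return env_reward + w * guidance

def combined_action(a_il, a_rl, alpha):
    """Blend IL and RL policy outputs; alpha would typically be annealed
    from 1 (pure IL) toward 0 (pure RL) as training progresses."""
    return alpha * np.asarray(a_il) + (1.0 - alpha) * np.asarray(a_rl)
```

Under this scheme the IL policy dominates early exploration, while the discriminator keeps later RL rollouts close to the sketched trajectories.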

Hardware Setup



Fig: Complete setup for the ButtonPress task in a real-world experiment. The configuration includes a UR3e robot arm equipped with a Robot Hand gripper and a wrist-mounted RealSense D435i camera.

Camera Setup


Fig: Environment cameras (corner and corner2) used for collecting human-drawn sketches and generating demonstrations

Steps vs. Train Score Plot


Fig: Evaluation success rate of the BC policy for the ButtonPress task, trained on sketch-generated demonstrations
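The BC policy above is trained by supervised regression on the sketch-generated demonstrations. A minimal sketch of that idea, using a linear policy and plain gradient descent for self-containedness (the paper's actual policy network and optimizer are not specified on this page):

```python
import numpy as np

def behavior_clone(W, demos, epochs=200, lr=0.1):
    """Minimal behavior-cloning sketch: fit a linear policy a = obs @ W to
    demonstration data by gradient descent on the MSE loss.
    demos: iterable of (obs, action) array batches; all names illustrative."""
    for _ in range(epochs):
        for obs, act in demos:
            pred = obs @ W
            # Gradient of the MSE loss with respect to W (up to a constant).
            grad = obs.T @ (pred - act) / len(obs)
            W = W - lr * grad
    return W
```

In the full framework this cloned policy is only a starting point: it bootstraps the RL stage, which then refines it beyond the demonstrations.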

Videos


BC Policy Evaluation (5x)

The video (4x speed) shows five successful task completions by the RL policy, with the button position randomized for each trial.

BibTeX


@inproceedings{
        anonymous2025sketchtoskill,
        title={SKETCH-TO-SKILL: Bootstrapping Robot Learning with human-drawn Trajectory Sketches},
        author={Anonymous},
        booktitle={International Conference on Learning Representations (ICLR)},
        year={2025}
}