Once is enough: Helping robots learn quickly in new environments - https://www.eurekalert.org/news-releases/1011122

USC researchers have developed a new algorithm that dramatically reduces how much data is needed to train robots by allowing anyone to interact with them through language or video. Using only one video or textual demonstration of a task, RoboCLIP performed two to three times better than other imitation learning (IL) methods. RoboCLIP was inspired by advances in generative AI and video-language models (VLMs), which are pretrained on large amounts of video and textual demonstrations.

“The key innovation here is using the VLM to critically ‘observe’ simulations of the virtual robot babbling around while trying to perform the task, until at some point it starts getting it right – at that point, the VLM will recognize that progress and reward the virtual robot to keep trying in this direction,” said co-author Laurent Itti, a computer science professor.

“The VLM can recognize that the virtual robot is getting closer to success when the textual description produced by the VLM observing the robot motions becomes closer to what the user wants,” Itti added. “This new kind of closed-loop interaction is very exciting to me and will likely have many more future applications in other domains.”

The paper, titled “RoboCLIP: One Demonstration is Enough to Learn Robot Policies,” is being presented at the 37th Conference on Neural Information Processing Systems (NeurIPS).
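To make the mechanism concrete, here is a minimal sketch (not the authors' implementation) of how a similarity score from a pretrained video-language model could serve as a reward for the virtual robot. The encoder passed in as `encode_video` and the precomputed demonstration embedding are hypothetical placeholders for whatever VLM is used; the point is only that rollouts whose embedding lands closer to the demonstration's embedding earn higher reward.

```python
# Sketch of similarity-as-reward, assuming a pretrained VLM that maps both
# robot videos and user demonstrations (video or text) into a shared
# embedding space. encode_video and demo_embedding are hypothetical stand-ins.
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))


def episode_reward(rollout_frames, demo_embedding, encode_video) -> float:
    """Score one simulated rollout against the single demonstration.

    rollout_frames: video of the virtual robot attempting the task.
    demo_embedding: embedding of the user's one video or text demonstration.
    encode_video:   hypothetical VLM video encoder mapping frames -> vector.

    The similarity is used as an end-of-episode reward: the closer the VLM's
    reading of the robot's behavior is to the demonstration, the higher the
    reward, nudging the policy to keep trying in that direction.
    """
    rollout_embedding = encode_video(rollout_frames)
    return cosine_similarity(rollout_embedding, demo_embedding)
```

In a reinforcement-learning loop, this reward would be computed once per simulated episode and fed to an off-the-shelf policy-optimization algorithm, which is how a single demonstration can shape behavior without any hand-designed reward function.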


This is a companion discussion for the article “Once is enough: Helping robots learn quickly in new environments - https://www.eurekalert.org/news-releases/1011122” submitted on the community's news feed.
Reply to this topic to share your thoughts on this article.