Robots that write their own Python code

Robots can write Python code for physical movements when given instructions by a human.

Google has announced a new approach that uses large language models (LLMs) to let robots write their own code from human instructions.

The latest research builds on Google’s PaLM-SayCan model, in which robots understand open-ended human prompts and respond appropriately and safely in physical space. It also draws on OpenAI’s GPT-3 LLM and the automatic code-completion features built on it, such as GitHub’s Copilot.

Google’s researchers asked, “What if, when given instructions from people, robots could autonomously write their own code to interact with the world?” Google said, “The latest generation of language models, such as PaLM, are capable of complex reasoning and have also been trained on millions of lines of code. Given natural language instructions, current language models are highly proficient at writing not only generic code but, as we’ve discovered, code that can control robot actions as well.”

Google Research calls its new approach “Code as Policies” (CaP), arguing that code-writing LLMs can be repurposed to generate robot policy code that responds to natural language commands.

Google researchers note in a new paper, Code as Policies: Language Model Programs for Embodied Control:

“When provided as input several example language commands (formatted as comments) followed by corresponding policy code (via few-shot prompting), LLMs can take in new commands and autonomously re-compose API calls to generate new policy code respectively.”
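To make that prompting format concrete, here is a minimal sketch of what such a few-shot prompt could look like. The helper names (get_obj_pos, put_first_on_second, and so on) are illustrative stand-ins for a robot’s perception and pick-and-place APIs, not the exact interfaces from the paper:

```python
# A hedged sketch of the few-shot prompt format described in the quote:
# example commands appear as comments, each followed by policy code.
# The helper names are illustrative stand-ins, not the paper's exact API.
FEW_SHOT_PROMPT = '''
# put the red block on the blue bowl.
put_first_on_second("red block", "blue bowl")

# move the green block 10 cm to the left.
pos = get_obj_pos("green block")
put_first_on_second("green block", (pos[0] - 0.10, pos[1]))
'''

# A new command is appended as a comment, and the LLM completes the code.
NEW_COMMAND = "# stack the blocks in the empty bowl."
llm_input = FEW_SHOT_PROMPT + "\n" + NEW_COMMAND + "\n"

# The model might then generate something like:
#   bowl = get_empty_bowl()
#   for block in get_block_names():
#       put_first_on_second(block, bowl)
```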

In the given example, the user might say “stack the blocks in the empty bowl” or “place the blocks in a horizontal line near the top of the square’s 2D boundary”. Google’s language model then generates Python code that precisely instructs the robot to follow the command. The generated code relies on standard Python constructs, but in this case also leverages libraries such as Shapely for spatial-geometric reasoning.
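As an illustration of that Shapely usage, the following sketch (assumed, not Google’s actual generated code) computes evenly spaced positions along the top edge of a square workspace; put_block_at and the coordinates are made up for the example:

```python
# A minimal sketch of how generated code might use Shapely to place
# blocks along the top edge of a square's 2D boundary.
from shapely.geometry import box, LineString

def put_block_at(name, xy):
    # Hypothetical stand-in for the robot's pick-and-place API.
    print(f"placing {name} at ({xy[0]:.2f}, {xy[1]:.2f})")

# Hypothetical 0.4 m square workspace in table coordinates.
square = box(0.0, 0.0, 0.4, 0.4)
minx, miny, maxx, maxy = square.bounds

# The top edge of the square's boundary.
top_edge = LineString([(minx, maxy), (maxx, maxy)])

# Spread three blocks evenly along that edge.
for name, frac in zip(["red block", "green block", "blue block"],
                      [0.25, 0.5, 0.75]):
    point = top_edge.interpolate(frac, normalized=True)
    put_block_at(name, (point.x, point.y))
```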

The improvement Google claims is that language models repurposed this way may be better suited to the task than policies trained directly to map natural language instructions to robot actions.

Google Research notes:

“CaP extends our prior work, PaLM-SayCan, by enabling language models to complete even more complex robotic tasks with the full expression of general-purpose Python code. With CaP, we propose using language models to directly write robot code through few-shot prompting.” 

In addition to generalizing to new instructions, Google says the model can adjust precise values, such as speed, based on vague descriptions such as “faster” or “to the left”. CaP also supports instructions in languages other than English, as well as emojis.
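A short sketch of how such grounding could look in generated code; the state variables and the 1.5x scaling factor are assumptions for illustration, not values from the paper:

```python
# A hedged sketch of grounding vague modifiers in concrete numbers.
robot_speed = 0.10                        # current speed, m/s (made up)
block_pos = {"red block": (0.20, 0.20)}   # made-up table coordinates

# "move faster" -> scale the current speed up (factor is an assumption).
robot_speed = robot_speed * 1.5

# "move the red block a bit to the left" -> small negative x offset.
x, y = block_pos["red block"]
block_pos["red block"] = (x - 0.05, y)

print(robot_speed, block_pos["red block"])
```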

According to Google, the model can write code that tells the robot to push blocks of different colors around a 2D square, but because it lacks a 3D frame of reference, it cannot convert more complex instructions such as “build a house with the blocks”.

Also, while CaP gives robots additional flexibility, “the synthesized program can (unless manually checked at each runtime) lead to unintended behavior on the physical hardware,” which poses potential risks.
