LLMs tried to run a robot in the real world – it didn't go well

Researchers at Andon Labs recently evaluated how well large language models can act as decision-makers in robotic systems. Their study, called Butter-Bench, tested whether modern LLMs ...
