Mobile ALOHA: Prepare to have your mind blown


Alan Timm

Jan 4, 2024, 10:24:31 PM
to RSSC-List
You wanna know what the next step in deep learning robotics is going to look like?  Here's a peek.  Sometimes it's a little unclear in the videos what is teleoperated and what is autonomous, but wow!


Mobile ALOHA
Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation

Abstract
Imitation learning from human demonstrations has shown impressive performance in robotics. However, most results focus on table-top manipulation, lacking the mobility and dexterity necessary for generally useful tasks. In this work, we develop a system for imitating mobile manipulation tasks that are bimanual and require whole-body control. We first present Mobile ALOHA, a low-cost and whole-body teleoperation system for data collection. It augments the ALOHA system with a mobile base, and a whole-body teleoperation interface. Using data collected with Mobile ALOHA, we then perform supervised behavior cloning and find that co-training with existing static ALOHA datasets boosts performance on mobile manipulation tasks. With 50 demonstrations for each task, co-training can increase success rates by up to 90%, allowing Mobile ALOHA to autonomously complete complex mobile manipulation tasks such as sauteing and serving a piece of shrimp, opening a two-door wall cabinet to store heavy cooking pots, calling and entering an elevator, and lightly rinsing a used pan using a kitchen faucet.
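The co-training idea from the abstract — mixing the new mobile-manipulation demos with the existing static ALOHA data in each training batch — can be sketched roughly like this. The 50/50 mixing ratio, dataset sizes, and toy (source, index) tuples standing in for (observation, action) pairs are all assumptions for illustration, not the paper's actual settings:

```python
import random

random.seed(0)

# Illustrative stand-ins for demonstration datasets.
# The paper uses ~50 demos per mobile task plus existing static ALOHA data.
mobile_data = [("mobile_obs", i) for i in range(50)]
static_data = [("static_obs", i) for i in range(800)]

def cotraining_batch(batch_size=16, mobile_fraction=0.5):
    """Sample one training batch drawn from both demonstration sources.

    mobile_fraction=0.5 is an assumed mixing ratio for illustration;
    the behavior-cloning loss is then computed on the combined batch.
    """
    n_mobile = int(batch_size * mobile_fraction)
    batch = random.choices(mobile_data, k=n_mobile)
    batch += random.choices(static_data, k=batch_size - n_mobile)
    random.shuffle(batch)  # interleave the two sources within the batch
    return batch

batch = cotraining_batch()
```

The point of the mix is that the large static dataset regularizes the policy while the small mobile dataset teaches the new whole-body behavior.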

Alan Downing

Jan 22, 2024, 10:18:41 PM
to Alan Timm, RSSC-List, hbrob...@googlegroups.com
Hi Alan T.,

FYI, you can buy your own Aloha hardware from Trossen:

Alternatively, you can use the pre-trained Aloha model (called Octo) and fine-tune it so that you can use it to control your own robot arm.  The repository for the Octo model is:

I currently have my old WidowX robot arm being controlled by the DeepMind RT-X model:

Unfortunately, without fine-tuning on a dataset for my robot arm configuration, the RT-X models with my robot arm don't really succeed at completing the language instructions (like "push the white cup next to the black fork").  My next step is to gather a small dataset to fine-tune the Octo model (which is also trained on a subset of the RT-X dataset) and see if the arm has a much higher success rate at performing the language instructions.
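For a rough sense of what fine-tuning on a small demo set optimizes, here's a toy behavior-cloning sketch with a linear policy on synthetic numpy data. The 8-D state, 6-DoF action, learning rate, and step count are arbitrary assumptions; real Octo fine-tuning trains a transformer on camera images through the Octo codebase, not anything like this:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in demos: (state, action) pairs from teleoperating the arm.
states = rng.normal(size=(64, 8))     # assumed 8-D proprioceptive state
true_w = rng.normal(size=(8, 6))      # unknown "expert" mapping
actions = states @ true_w             # assumed 6-DoF action targets

# Behavior cloning = supervised regression from states to demonstrated actions.
w = np.zeros((8, 6))                  # pre-trained weights would go here instead
lr = 0.05
for _ in range(500):
    pred = states @ w
    grad = states.T @ (pred - actions) / len(states)  # MSE gradient
    w -= lr * grad

mse = float(np.mean((states @ w - actions) ** 2))
```

The intuition carries over: even a few dozen demos can pull a policy toward your particular arm's configuration, which is exactly what the RT-X models are missing without fine-tuning.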

Thanks,
Alan D.




Alan Timm

Jan 23, 2024, 11:03:22 AM
to RSSC-List
Interesting...

Aside from the obvious success of that approach, there was something else that I didn't notice at first.

This was accomplished using bog-standard serial servos.  No BLDCs. 

Yeah, these are Dynamixels, which are pretty expensive, but there are cheap Chinese knockoffs that work just the same.

But...

You get:
  • a 12-bit position encoder (0–4095, ≈0.088° resolution),
  • an inferred force through a high-gear-ratio gearbox, via current measurement,
  • a few settings for stiffness, max speed, etc.
That's it.  Maybe we shouldn't be so quick to count out servobots after all...
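For what those encoder counts work out to in practice, here's a quick sketch of the counts-to-degrees conversion. It assumes a 12-bit encoder over a full revolution (4096 counts, as on Dynamixel X-series actuators); older 10-bit models instead use 1024 counts over a 300° range:

```python
# Assumption: 12-bit, full-revolution encoder (Dynamixel X-series style).
COUNTS_PER_REV = 4096
DEG_PER_COUNT = 360.0 / COUNTS_PER_REV  # ~0.088 degrees per count

def counts_to_degrees(counts: int) -> float:
    """Convert a raw encoder reading (0..4095) to an angle in degrees."""
    return counts * DEG_PER_COUNT

def degrees_to_counts(deg: float) -> int:
    """Convert a goal angle in degrees to the nearest encoder count."""
    return round(deg / DEG_PER_COUNT) % COUNTS_PER_REV
```

So a 90° goal position lands on count 1024 — fine-grained enough that position control, not encoder resolution, is the limiting factor on these servos.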

Gmail

Jan 23, 2024, 12:53:50 PM
to RSSC-List
I don’t see the torque listed. Either way, it’s too rich for my blood!



Thomas

-
Want to learn more about ROBOTS?