Hi All,
I added a new feature to Big Orange: point to a place on the floor and say, “go there,” and Orange will go to that spot. I’m using the Google MediaPipe BlazePose NN human pose detection and landmark models, running on the Luxonis OAK-D stereo camera via the DepthAI software. Orange first adjusts the camera to get the whole body in view. Then, from the shoulder and wrist pose landmarks, it forms a ray, finds its intersection with the floor plane, converts that point to world space, and sends the 2D coordinates to the robot’s navigation stack as the goal.
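In rough pseudocode, the ray/floor-plane step looks something like this (a minimal sketch with simplified names and frames, not the actual big-orange code; it assumes the shoulder and wrist landmarks have already been transformed into a frame where the floor is the plane z = 0):

import numpy as np

def pointing_goal(shoulder_xyz, wrist_xyz, floor_z=0.0):
    """Intersect the shoulder->wrist ray with the floor plane and
    return the (x, y) point to hand to navigation as a 2D goal."""
    shoulder = np.asarray(shoulder_xyz, dtype=float)
    wrist = np.asarray(wrist_xyz, dtype=float)
    direction = wrist - shoulder            # ray along the pointing arm

    if direction[2] >= -1e-6:               # arm not pointing down toward the floor
        return None

    t = (floor_z - shoulder[2]) / direction[2]   # ray parameter at the floor plane
    hit = shoulder + t * direction
    return hit[0], hit[1]                        # 2D goal on the floor

# Example: shoulder at 1.4 m height, wrist lower and forward of it.
print(pointing_goal((0.0, 0.2, 1.4), (0.3, 0.3, 1.1)))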
Main Video:
And a little humor that an error case made possible:
Orange Go to Goal Presentation from HBRC September meeting:
https://youtu.be/q5bkFvdEoqI?t=4976
Up-to-date code for the high-level system is on GitHub here:
https://github.com/jimdinunzio/big-orange
Other Links:
DepthAI Blazepose
https://github.com/geaxgx/depthai_blazepose
Google Mediapipe Pose using BlazePose
https://google.github.io/mediapipe/solutions/pose
https://google.github.io/mediapipe/solutions/pose.html#models
BlazePose Paper:
https://arxiv.org/abs/2006.10204
Jim
On Oct 2, 2022, at 12:21 PM, MJ Chan <iglo...@gmail.com> wrote:
Nice. Is the NN running locally? Or from the cloud?
--
Best regards,
MJ
The full feature, end to end as I have implemented it, still needs some tuning for a reliable live demo. Currently the biggest impact on reliability is automatically tilting the camera to get most of the person, including the head, in view so the NN can detect the skeleton well; I will revisit that. Reliability of the NN pose model itself is reasonable outside of edge-case or occlusion-heavy poses. In short, if the NN can see your head and arm, it works pretty well.
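As a minimal sketch, the auto-tilt amounts to something like this (hypothetical names and gain, not the actual code): nudge the tilt until the highest detected landmark sits below a top margin of the frame.

def tilt_adjustment(top_landmark_y, frame_height, margin_frac=0.1, gain_deg=10.0):
    """Return a tilt correction in degrees (positive = tilt the camera up).

    top_landmark_y: pixel row of the highest visible landmark (0 = top of frame).
    """
    margin = margin_frac * frame_height
    if top_landmark_y < margin:
        # Head is clipped at or too close to the top edge: tilt up,
        # proportional to how far it has crept into the margin.
        return gain_deg * (margin - top_landmark_y) / margin
    return 0.0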
Jim
Yes, I agree about reliability. It takes a lot of time and is generally not that fun. But that’s why this is a hobby and not a job (for me); I can choose to spend more time on it or not. Getting to live-demo capability is usually still a long way from having a reasonably reliable feature. I find that if I take a break after reaching demo capability and do other things, I can come back to a feature’s reliability with fresh energy and make it better.
Regarding chair legs: Orange uses LiDAR to map the room, so chair legs show up as occupied areas on the map, and path planning during navigation avoids occupied areas.
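As a rough sketch (grid layout and threshold are simplified assumptions, not the actual map format), a pointed-to goal can be sanity-checked against that occupancy map before it is sent to navigation:

def goal_is_clear(grid, resolution_m, origin_xy, goal_xy, occupied_threshold=50):
    """grid: 2D list of occupancy values (0 = free, 100 = occupied).
    Returns True if the goal cell is inside the map and not occupied."""
    col = int((goal_xy[0] - origin_xy[0]) / resolution_m)
    row = int((goal_xy[1] - origin_xy[1]) / resolution_m)
    if row < 0 or row >= len(grid) or col < 0 or col >= len(grid[0]):
        return False                      # goal falls outside the mapped area
    return grid[row][col] < occupied_threshold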
Jim
Yes, that seems likely to be the best approach, outside of specialty applications where multi-height lidar becomes a good backup / verification for safety. I don't find single-plane lidar interesting at all, and expensive sensors disqualify approaches for many applications to start with. On the other hand, the lidar in the newest iPhones is good enough to be interesting.
Try this app: https://apps.apple.com/us/app/3d-scanner-app/id1419913995
And multi-spectral imaging (visible plus IR, UV) could be interesting for resolving visual ambiguities without visible spotlights.
We need high-quality, high-performance, well-trained modules that are very cheap, detect the ground plane, walls, and an occupancy map with at least basic type recognition, and take hardly any power. That is totally possible with technology and algorithms (ML and otherwise) we already have, assembled, focused, and optimized into a reusable subsystem. Luckily, we are going to get it because of the mobile phone race, self-driving vehicle tech, AR/VR, etc.
Stephen
Stephen D. Williams
Founder: VolksDroid, Blue Scholar Foundation