Re: [HBRobotics] npu tool chains


Marco Walther

unread,
Feb 8, 2025, 12:59:09 AM2/8/25
to hbrob...@googlegroups.com, A J
On 2/7/25 18:37, A J wrote:
> Hi Team,
>
> There are some new chips coming out in 2025 that have potential. But I
> wonder how this will affect bot programming if each vendor has their own
> tool set. As chips generate high double-digit TOPS and beyond, a notebook
> will offer a lot of compute.
>
> Will they all have unique tool chains, or will they have libraries that
> work with ROS or Python in a friendly manner?
Probably all of the above, at least for the more popular architectures.

Do you know how many compilers and runtimes can 'hide behind' the
Arduino IDE? I have at least four architectures installed, each
supporting many sub-variants. And I don't claim to have all of them ;-)

Usually, you just select the board you're playing with, and the
infrastructure takes care of finding the right pieces.

-- Marco
>
>
> https://www.intel.com/content/www/us/en/products/sku/241747/intel-core-ultra-9-processor-285h-24m-cache-up-to-5-40-ghz/specifications.html
>
> --
> You received this message because you are subscribed to the Google
> Groups "HomeBrew Robotics Club" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to hbrobotics+...@googlegroups.com.
> To view this discussion visit https://groups.google.com/d/msgid/hbrobotics/4fd59c73-af1d-4332-918c-092bf6955269n%40googlegroups.com

Chris Albertson

unread,
Feb 8, 2025, 1:00:59 PM2/8/25
to hbrob...@googlegroups.com, A J
With an NPU, the toolchain is, in general, this: "(1) Start with an Nvidia GPU; then, after the model is trained and tested, (2) compress it to run on the NPU."

The part that is different for each NPU is the last step, where you port the model down to the target hardware. For a hobbyist or amateur, I wonder if this second step is even needed. You could always connect the robot over WiFi to a larger PC that has the Nvidia hardware. OK, if you want the robot to be self-contained, then you need to move the model. Almost always, the company that makes the NPU will provide the software you need. This compression step is kind of like running a compiler: you set it up, it does the work, and you don't have to know how it works inside.
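To picture the "compress the model" step: below is a minimal toy sketch of symmetric 8-bit post-training quantization, one of the core techniques such tools apply. This is illustrative only; function names are my own, and real vendor toolchains (TensorRT, OpenVINO, and the like) do much more than this.

```python
# Toy sketch of model compression: symmetric int8 post-training
# quantization of a weight tensor. The core idea is mapping float32
# weights onto a small integer range plus one scale factor.

def quantize_int8(weights):
    """Map a list of float weights to int8 values plus a scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.42, -1.37, 0.05, 0.98, -0.61]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Each restored weight is within one quantization step of the original.
assert all(abs(a - b) <= scale for a, b in zip(weights, restored))
```

The payoff is that each weight now fits in one byte instead of four, and integer math is exactly what NPU hardware is built to do fast.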

So I see these multiple kinds of NPU as not conceptually different from having multiple brands of CPU. I use different compilers for ESP, ARM, or Intel, but the process is mechanical: I write the same code and let the compilers handle translating it to the different CPUs. I think NPUs are kind of the same. The analogy is close enough.

But one never "programs a model on an NPU". They are run-only devices.

Without exception, this is what “everyone” does.  

Back to the hobby level. Training any AI model is hard because you need labeled data, which is very time-consuming to collect. Most of us will go to some place like Google's "model zoo", browse around, and just select a pre-trained model. There are lots of online "zoos".

Here are a few places to look



There are more, but I think this makes the point. You don't train from scratch; you download from one of the open-source "zoos" and then move the weights to the NPU using the NPU's tool chain.

I am studying how the new class of robots (cars and some humanoids) work. Basically, they have multiple models of different kinds working at once. Some of these are small enough that you don't need an NPU.

To view this discussion visit https://groups.google.com/d/msgid/hbrobotics/eeafb949-2b8b-4912-93cf-f72dffd71632%40gmail.com.

A J

unread,
Feb 9, 2025, 4:17:33 PM2/9/25
to HomeBrew Robotics Club
Thanks Chris & Marco for the great feedback,

The CPU has gotten much more complex with the integrated NPU and GPU.

In the past, I have been able to train the example Torch or TensorFlow code with just CPU cores.

The new Intel chip seems to calculate its TOPS figure across the NPU and GPU. Imagine that the Python code can select which device to run on.
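That device-selection idea can be sketched as a simple preference-ordered fallback. The backend names and the `available` set here are hypothetical placeholders, not any real vendor API; actual frameworks expose their own probes (e.g. `torch.cuda.is_available()` in PyTorch).

```python
# Sketch of preference-ordered device selection, in the spirit of
# frameworks that let Python code pick NPU, GPU, or CPU at runtime.
# `available` stands in for whatever probe the real framework provides;
# it is a hypothetical placeholder here.

def pick_device(preferred, available):
    """Return the first preferred backend that is actually present.

    preferred: backends in the order we would like to use them.
    available: set of backends this machine reports.
    """
    for device in preferred:
        if device in available:
            return device
    return "cpu"  # every machine can at least run on CPU cores

# Example: ask for the NPU first, fall back to GPU, then CPU.
print(pick_device(["npu", "gpu"], {"gpu", "cpu"}))  # -> gpu
```

The same pattern covers the laptop-with-NPU case and the plain-CPU case without changing the model code, which is the friendliness the original question was asking about.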

As for the neural network accelerator, some vendors might not expose the internal calls for matrix operations or convolution.

With so many compute modules in one chip, bot ML could be run on the CPU cores, the NPU, or the GPU.

