NPU design


A J

Feb 19, 2025, 10:43:24 PM
to HomeBrew Robotics Club
Hi Team,

This Intel Technology video has some good discussions about their NPU design: https://youtu.be/_Ig0tBYgr3Y?feature=shared

The second half of the video gets into the details of how it processes an LLM.

Could a single chip meet all the needs of a bot beyond 2 nm?


Chris Albertson

Feb 20, 2025, 11:24:21 PM
to hbrob...@googlegroups.com

On Feb 19, 2025, at 7:43 PM, A J <aj48...@gmail.com> wrote:

> Could a single chip meet all the needs of a Bot beyond 2nm ?

Single chip? I think it depends on how the chip is connected to I/O and the sensors. How many video frames per second can you push into the RAM that is connected to the chip? Some of the “hats” that go with the Pi 5 are connected via I2C, which is limited to 1 megabit per second. You are not going to push video down I2C no matter how powerful the chip.

For video, you need either the PC’s PCI bus, which is very fast, or Apple’s unified RAM, where speed is a non-issue.
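A quick sanity check on those numbers (my own back-of-envelope, in Python; the 1 Mbit/s I2C figure is from above, while the ~500 MB/s usable PCIe Gen 2 x1 figure and the 1080p30, 24-bit, uncompressed video parameters are my assumptions):

```python
# Back-of-envelope: can raw video fit through I2C vs. a single PCIe lane?
# Assumptions (not from the thread): 1080p at 30 fps, 24-bit RGB,
# uncompressed frames; ~500 MB/s usable on PCIe Gen 2 x1.

def video_bandwidth_mbit(width, height, bytes_per_pixel, fps):
    """Raw video bandwidth in Mbit/s."""
    return width * height * bytes_per_pixel * fps * 8 / 1e6

video = video_bandwidth_mbit(1920, 1080, 3, 30)  # ~1493 Mbit/s
i2c_mbit = 1.0                # I2C limit cited above
pcie_gen2_x1_mbit = 500 * 8   # ~500 MB/s usable -> 4000 Mbit/s

print(f"raw 1080p30: {video:.0f} Mbit/s")
print(f"I2C is ~{video / i2c_mbit:.0f}x too slow")
print(f"PCIe Gen 2 x1 has ~{pcie_gen2_x1_mbit / video:.1f}x headroom")
```

Even one uncompressed camera overwhelms I2C by three orders of magnitude, while a single PCIe Gen 2 lane still has headroom.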

Is there already a fast-enough chip to run an entire robot? A single Nvidia H100 can do 4,000 TFLOPS (8-bit).


While Tesla owns over 100,000 of these H100 chips, they don’t use anything nearly so powerful in their cars. The car has about eight cameras and can self-drive in city traffic.





--
You received this message because you are subscribed to the Google Groups "HomeBrew Robotics Club" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hbrobotics+...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/hbrobotics/08f96b2a-cb28-4b4d-9427-7b44e6259b61n%40googlegroups.com.

Marco Walther

Feb 21, 2025, 3:49:25 AM
to hbrob...@googlegroups.com, Chris Albertson
On 2/20/25 20:24, Chris Albertson wrote:
>
> Some of the “hats” that go with the Pi5 are connected via I2C that is
> limited to 1 megabit per second. You are not going to push video down
> I2C no matter how powerful the chip.

The AI 'hats' all use the 1-lane PCIe of the Pi 5! It's still not as fast
as the PC connections, but much faster than I2C ;-)

-- Marco

>
>> https://youtu.be/_Ig0tBYgr3Y?feature=shared

Chris Albertson

Feb 21, 2025, 10:48:32 AM
to Marco Walther, hbrob...@googlegroups.com

On Feb 21, 2025, at 12:49 AM, Marco Walther <marc...@gmail.com> wrote:

The AI 'hats' all use the 1-lane PCIe of the Pi 5! It's still not as fast as the PC connections, but much faster than I2C ;-)

OK, I’ve not used the new ones.    What speeds can you get through the interface and how much RAM is on the device? 

Marco Walther

Feb 21, 2025, 11:42:32 AM
to Chris Albertson, hbrob...@googlegroups.com
Google AI says:

The Raspberry Pi 5's PCIe interface has a default speed of 5
gigatransfers/sec (GT/s) per lane, i.e. PCIe Gen 2.0. The external
connector can be configured to run at PCIe Gen 3.0 speeds (8 GT/s).
Explanation

* The Raspberry Pi 5 has five active PCIe lanes.
* The internal lanes are always set to Gen 2 speed.
* The external connector can be configured to run at Gen 3 speeds.
* The Raspberry Pi 5 is not certified for Gen 3.0 speeds, so connections
to PCIe devices at these speeds may be unstable. (MW: that might have
changed in the meantime; the AI hats use Gen 3.)
* The Pi 5 NVMe PIP (PCIe Peripheral Board) is a PCIe adapter board
designed for NVMe solid-state drives.
* The NVMe Base for Raspberry Pi 5 is an add-on board that connects to
the Pi's PCIe interface. It offers read speeds of up to 800 MB/s and
write speeds of 450 MB/s.

----------------------------------------------------------------
I could not find any info on on-hat/on-chip memory :-(
The 26 TOPS hat version uses the Hailo-8 chip:
https://hailo.ai/products/ai-accelerators/hailo-8-ai-accelerator
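For reference, the GT/s numbers in that summary convert to usable per-lane bandwidth once you account for PCIe line coding (8b/10b for Gen 2, 128b/130b for Gen 3). A minimal sketch of the arithmetic:

```python
# Convert a PCIe transfer rate (GT/s) into theoretical usable MB/s per
# lane, accounting for line-code overhead: Gen 2 uses 8b/10b encoding,
# Gen 3 uses 128b/130b. Real devices see less after protocol overhead.

def pcie_lane_mb_s(gt_per_s, payload_bits, encoded_bits):
    """Theoretical usable bandwidth per lane in MB/s."""
    return gt_per_s * 1e9 * (payload_bits / encoded_bits) / 8 / 1e6

print(f"Gen 2 x1: {pcie_lane_mb_s(5, 8, 10):.0f} MB/s")    # 500 MB/s
print(f"Gen 3 x1: {pcie_lane_mb_s(8, 128, 130):.0f} MB/s") # ~985 MB/s
```

That is consistent with the ~800 MB/s NVMe read figure above sitting under the Gen 3 x1 ceiling.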

-- Marco

Message has been deleted

Chris Albertson

Feb 22, 2025, 4:07:43 PM
to hbrob...@googlegroups.com


On Feb 21, 2025, at 10:58 AM, A J <aj48...@gmail.com> wrote:

The RPi 5 seems fast; I just clocked a USB 3 Linux SSD at 450 MB/s. I have seen some
blogs saying USB 3.2 reaches 1000 MB/s, and the new USB4 is faster still.

Apple seems to be the first with integrated RAM, but Intel and AMD also have new chips
with RAM and GPU in one chip.

Apple’s trick was not putting the RAM and GPU on the same chip. What they have is “unified” RAM: there is just one block of RAM, and it is accessible by all the CPU and GPU cores. This way no data has to move, so PCI bus speed does not matter because there is no bus.

Putting the RAM in the GPU chip is not the same thing at all. You still have to move the data, and even a faster bus is slower than no bus.


But, back to the question of whether there will be a one-chip solution. I think so, because even today the problem is not a lack of computing power. I have yet to hear anyone say, “My robot would be doing human-level jobs if only the computer were faster.” Compute speed is not what is holding us back.

None of the companies buying robots is complaining about the lack of computing power in the robot.






Intel is having some challenges with the PCIe 6 chipset for consumer motherboards.
But when this comes out for laptops and RPis it will be fast.

Some people have attached discrete GPUs to the RPi.
