Is a de facto standard memory map helpful or harmful?

Alex Bradbury

Jul 13, 2016, 8:48:55 AM
to sw-...@groups.riscv.org, lowri...@lists.lowrisc.org
One issue that's come up while chatting to people here at the RISC-V
workshop in Boston is avoiding needless arbitrary differences between
platforms. Having different memory maps is an example of this (e.g.
having the PLIC instantiated with a different base offset). However,
there also seems to be the view that some RISC-V implementers will be
unable or unwilling to conform to a shared standard memory layout.
Given this is the case, do we risk actually increasing fragmentation
in the community by exposing memory map details to OS porters and
low-level system programmers as they may come to rely on them? Are
there any other advantages in retaining full flexibility for hardware
implementers to modify their memory map at will?

Should it be RISC-V best practice that any OS porting work doesn't
rely on a fixed memory map, and instead finds it at boot time through
a device tree or device tree-like description? In cases where this
doesn't make sense, an appropriate C header could be generated from that
description, but the principle that the map isn't hardcoded in the OS
codebase remains. I'm of course thinking beyond Linux here - it would
seem a shame if it ended up that the seL4, FreeRTOS, ... ports did not
work out of the box with any RISC-V implementation.
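
As a rough illustration of the principle above (not code from any actual port), a driver can ask a boot-time platform description for a device's base address rather than compiling one in. The `device_entry` table, the `find_device_base` helper and all addresses below are hypothetical stand-ins for whatever a device tree or similar description would provide:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical entry in a platform description handed over at boot.
 * In a real system this would be parsed from a device tree or a
 * similar format; here it is just an in-memory table. */
struct device_entry {
    const char *name;   /* e.g. "plic" */
    uint64_t    base;   /* physical base address on this platform */
    uint64_t    size;   /* size of the register window in bytes */
};

/* Look up a device by name. Returns 0 and fills *base on success,
 * -1 if the platform does not describe such a device. */
static int find_device_base(const struct device_entry *desc, size_t n,
                            const char *name, uint64_t *base)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(desc[i].name, name) == 0) {
            *base = desc[i].base;
            return 0;
        }
    }
    return -1;
}

/* Two made-up platforms that place the same devices differently. */
static const struct device_entry platform_a[] = {
    { "plic", 0x40000000u, 0x4000000u },
    { "uart", 0x48000000u, 0x1000u },
};
static const struct device_entry platform_b[] = {
    { "uart", 0x10000000u, 0x1000u },
    { "plic", 0x0c000000u, 0x4000000u },
};
```

The same lookup works unchanged on both tables, which is the point: only the description differs between platforms, not the OS code.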

Best,

Alex

Krste Asanovic

Jul 13, 2016, 8:59:32 AM
to Alex Bradbury, sw-...@groups.riscv.org, lowri...@lists.lowrisc.org
Config string is supposed to provide this information. We have code to parse and package this for Linux. OS ports should all use this and the SBI to avoid binding to absolute physical addresses.

Krste (on iPhone, forgive terseness)
> --
> You received this message because you are subscribed to the Google Groups "RISC-V SW Dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sw-dev+un...@groups.riscv.org.
> To post to this group, send email to sw-...@groups.riscv.org.
> Visit this group at https://groups.google.com/a/groups.riscv.org/group/sw-dev/.
> To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/sw-dev/CA%2BwH295tdX7Qxkb8OJ7DD%2BKFBfvi47sm9c1U0Nozowfgh%2B6g4Q%40mail.gmail.com.

Alex Bradbury

Jul 13, 2016, 9:16:27 AM
to Krste Asanovic, sw-...@groups.riscv.org, lowri...@lists.lowrisc.org
On 13 July 2016 at 13:59, Krste Asanovic <kr...@berkeley.edu> wrote:
> Config string is supposed to provide this information. We have code to parse and package this for Linux. OS ports should all use this and the SBI to avoid binding to absolute physical addresses.

Great, I suspected we are thinking along similar lines. What actually
triggered this email is I saw the SiFive "U5 Coreplex Series Manual"
explicitly details the memory map. I've thought previously about doing
this with our current lowRISC implementation (and indeed working with
other groups to try to agree on this), but then wondered whether in a
world of device tree and similar systems such as the proposed RISC-V
configuration string this is necessary or beneficial. Perhaps the
SiFive document could contain a note indicating that best practice for
software developers is to not rely on hardcoded values indicated in
this map?

Best,

Alex

Arnd Bergmann

Jul 13, 2016, 9:43:31 AM
to sw-...@groups.riscv.org, Alex Bradbury, lowri...@lists.lowrisc.org
The main problem for operating systems is to have the kernel run at
a nonconstant physical address. On ARM Linux, we work around this by
patching every call to phys_to_virt() and virt_to_phys() at early
boot time as soon as we have detected the start of RAM.

This works fine for the most part, but there are a couple of downsides:

- It's hard to debug that early code since a lot of features (including
the system console) may not be around.

- It prevents running a kernel from readonly memory such as a NOR
flash or a virtual machine sharing its kernel image with other VMs.

- On machines without an MMU, the kernel actually ends up running
from an unknown physical address, so it either has to be fully
relocatable (which Linux currently doesn't do), or you have to
link the kernel to the actual RAM address, meaning you cannot have
the same kernel run across different machines that don't start from
the same address.

On 64-bit ARM, there is another problem arising from how some people
decided to make their RAM addressing extremely sparse, meaning we
cannot use a linear mapping of physical to virtual addresses with
three-level page tables (the kernel can use either three-level or
four-level tables otherwise).

Arnd

Alex Bradbury

Jul 13, 2016, 10:04:03 AM
to Arnd Bergmann, sw-...@groups.riscv.org, lowri...@lists.lowrisc.org
On 13 July 2016 at 14:38, Arnd Bergmann <ar...@arndb.de> wrote:
> The main problem for operating systems is to have the kernel run at
> a nonconstant physical address. On ARM Linux, we work around this by
> patching every call to phys_to_virt() and virt_to_phys() at early
> boot time as soon as we have detected the start of RAM.
>
> This works fine for the most part, but there are a couple of downsides:
>
> (snip)

Thanks for sharing your experience here Arnd - I'm glad someone with
considerable kernel experience has chimed in. Would having a CSR
(RISC-V control register) that contains the base physical address be
an effective solution to avoid the need for function patching?

Alex

Arnd Bergmann

Jul 13, 2016, 11:00:38 AM
to Alex Bradbury, sw-...@groups.riscv.org, lowri...@lists.lowrisc.org
The patching is done purely for performance reasons, the implementation
of virt_to_phys() is expected to be a single instruction adding an
immediate value on ARM rather than loading a global 32-bit variable and
adding that.

This means using a CSR only helps if that makes it substantially faster
than reading from cached memory.
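
To make the trade-off concrete, here is a minimal user-space sketch (not the ARM Linux implementation) of the two ways a linear-map translation can obtain its offset: a build-time constant that the compiler can fold into the add, and a variable filled in once RAM has been located. `PAGE_OFFSET`, `PHYS_OFFSET_CONST`, `phys_offset` and their values are illustrative assumptions:

```c
#include <stdint.h>

/* Illustrative values only: where the kernel's linear map starts in
 * virtual space, and where RAM starts on one particular machine. */
#define PAGE_OFFSET        UINT64_C(0xffffffc000000000)
#define PHYS_OFFSET_CONST  UINT64_C(0x0000000080000000)

/* Variant 1: RAM start fixed at build time. The difference is a
 * compile-time constant, so the translation is a single add. */
static inline uint64_t virt_to_phys_const(uint64_t va)
{
    return va - PAGE_OFFSET + PHYS_OFFSET_CONST;
}

/* Variant 2: RAM start discovered at boot. The offset lives in a
 * variable, so each translation costs a load plus the add. Runtime
 * patching rewrites code like variant 2 into something like
 * variant 1 once the value is known. */
static uint64_t phys_offset;

static void set_phys_offset(uint64_t ram_start)
{
    phys_offset = ram_start;
}

static inline uint64_t virt_to_phys_var(uint64_t va)
{
    return va - PAGE_OFFSET + phys_offset;
}

static inline uint64_t phys_to_virt_var(uint64_t pa)
{
    return pa - phys_offset + PAGE_OFFSET;
}
```

A CSR holding the RAM base would replace the load in variant 2 with a CSR read, which is why it only helps if that read is noticeably cheaper than a cached memory access.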

I'm actually not convinced that there is much need for the patching
to start with, the most common operations that need the translation
(page table updates, DMA setup, task switching, ...) are not in the
extreme hot path, and it may have been done just to ensure that there
could not be a performance regression at the point when we started
allowing a single kernel binary to run on variable RAM addresses.

Then again, all other architectures seem to use a constant value,
so maybe the runtime overhead is bigger than I think.

Arnd

Michael Clark

Jul 13, 2016, 11:56:17 AM
to Arnd Bergmann, Alex Bradbury, sw-...@groups.riscv.org, lowri...@lists.lowrisc.org
Yes a standard memory map is harmful on reflection.

This could be solved with in-kernel JIT perhaps? I'm curious. I will dig deeper and may take a look at the ARM virt_to_phys and phys_to_virt patching (we don't like patching and self-modifying code). RISC-V doesn't have large immediates, as the maximum is 20 bits. Someone wise must have decided this.

I'm looking at binary translation. Absolute references make things much harder for static translation (as we can't easily distinguish between constants and addresses). I'm investigating how much RISC-V can be statically translated and am having some initial success with a tiny hello world and the tiny newlib crt.

A target that can be statically translated is attractive for many reasons.

If we had some constraints in the compiler such that LUI was used for constant building and AUIPC for address building it would make the ISA a much better virtual target (however I think I may have seen cases where LUI is used for both addresses and constants). The ELF structure helps although Linux does lots of interesting things with custom sections.
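
As a sketch of what such a convention would buy a static translator: LUI and AUIPC have distinct major opcodes, so telling them apart is a single mask and compare. The opcode values and U-type field layout come from the base ISA. The helper names are made up for this example, and treating "LUI means constant, AUIPC means address" as reliable is exactly the assumption that compilers do not currently guarantee:

```c
#include <stdint.h>

/* Major opcodes (bits 6:0) from the RISC-V base ISA. */
#define OPCODE_LUI    0x37u
#define OPCODE_AUIPC  0x17u

enum upper_kind { UPPER_NONE, UPPER_LUI, UPPER_AUIPC };

/* Classify a 32-bit instruction word by its major opcode. */
static enum upper_kind classify_upper(uint32_t insn)
{
    switch (insn & 0x7fu) {
    case OPCODE_LUI:   return UPPER_LUI;
    case OPCODE_AUIPC: return UPPER_AUIPC;
    default:           return UPPER_NONE;
    }
}

/* Both are U-type: bits 31:12 hold a 20-bit immediate that ends up in
 * the upper 20 bits of the result; bits 11:7 name the destination. */
static uint32_t upper_imm(uint32_t insn) { return insn & 0xfffff000u; }
static unsigned upper_rd(uint32_t insn)  { return (insn >> 7) & 0x1fu; }
```

For instance, `lui a0, 0x12345` encodes as 0x12345537 and `auipc a0, 0x1` as 0x00001517, so the two differ only in the opcode field.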

My feedback is that it would be nice to be able to statically differentiate an integral from an address in the machine code. Of course there is a bootstrapping problem where one has to convert an initial set of constants into addresses.

If the ISA is a nice IR then we could JIT the kernel (and not do any patching). A -pie kernel. -pie should be de facto perhaps.

I'm sure someone wise has already thought this all through and I am picking this up via inference.

Sent from my iPhone

Samuel Falvo II

Jul 13, 2016, 12:38:04 PM
to Michael Clark, sw-...@groups.riscv.org, Alex Bradbury, Arnd Bergmann, lowri...@lists.lowrisc.org

For JITing purposes, I'd look to use WebAssembly instead of the ISA as-is.  It's designed just for this purpose.

Karsten Merker

Jul 13, 2016, 2:26:54 PM
to Alex Bradbury, Krste Asanovic, sw-...@groups.riscv.org, lowri...@lists.lowrisc.org
On Wed, Jul 13, 2016 at 02:16:24PM +0100, Alex Bradbury wrote:

> On 13 July 2016 at 13:59, Krste Asanovic <kr...@berkeley.edu> wrote:

> > Config string is supposed to provide this information. We have code to
> > parse and package this for Linux. OS ports should all use this and the
> > SBI to avoid binding to absolute physical addresses.
>
> Great, I suspected we are thinking along similar lines. What actually
> triggered this email is I saw the SiFive "U5 Coreplex Series Manual"
> explicitly details the memory map. I've thought previously about doing
> this with our current lowRISC implementation (and indeed working with
> other groups to try to agree on this), but then wondered whether in a
> world of device tree and similar systems such as the proposed RISC-V
> configuration string this is necessary or beneficial.

Hello,

is there some documentation available regarding this "RISC-V
configuration string"? Alex's comment sounds like it is intended
as an alternative to device-tree, in which case I wonder a bit
whether adding yet-another-hardware-description-format instead of
using device-tree makes sense, or rather what the advantages of a
new hardware description format would be compared to device-tree.

Most popular non-UEFI+ACPI-using Linux platforms either have
converged towards using device-tree or are in the process of
doing so; even UEFI supports using device-tree instead of ACPI
tables. U-boot has moved to device-tree for its device model and
in the BSD world to my knowledge at least FreeBSD has device-tree
support (I don't know about the other BSD variants).

Regards,
Karsten
--
Pursuant to Section 28(4) of the German Federal Data Protection Act, I
object to the use and disclosure of my personal data for purposes of
advertising or market and opinion research.

Arnd Bergmann

Jul 13, 2016, 3:50:38 PM
to sw-...@groups.riscv.org, Alex Bradbury, Krste Asanovic, lowri...@lists.lowrisc.org
In my earlier reply I was commenting purely on the issue of finding
the location of RAM, not MMIO devices, and these are two very different
things:

For finding RAM, almost all platforms rely on hardcoding the start of
RAM and the fact that all RAM is contiguous (with perhaps a small
hole for 32-bit PCI MMIO), and ARM is an exception to that which
requires a bit of a hack as I explained.

For MMIO areas such as PCI memory space, any operating system
should be prepared to use arbitrary addresses. In the best case you
can simply query the hardware through an architected method (special
CPU instructions or special purpose registers), and when that is
not available, a flattened DT is the normal way to pass that
information.

In previous information that I have seen about RISC-V, the idea
was always to completely avoid MMIO and have discoverable hardware,
which would nicely sidestep the problem entirely, but the information
in the U5 Coreplex Series Manual gives up on that and just uses
MMIO, presumably because otherwise you cannot easily reuse existing
IP blocks. This means at least the Linux port will have to use
a software device tree to describe what is actually there.

Unfortunately, the memory map described on page 12 of that document
makes the same mistake as some ARM64 chips and makes the RAM
location extremely sparse, with the first 14GB starting close to
zero, but all RAM beyond that starting at 0x1_0000_0000_0000
(256TB), which I guess requires using an extra level of page tables
for the kind of linear mapping that Linux has. It's probably too
late to change that, but I'd suggest that future implementations
do it differently.
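
The arithmetic behind the page-table concern can be sketched as follows. This is a simplified model that assumes 4 KiB pages and 9 address bits per table level, and ignores both the user/kernel split of the virtual address space and any offset applied to the linear map. The function names are invented for the example:

```c
#include <stdint.h>

#define PAGE_SHIFT      12  /* 4 KiB pages */
#define BITS_PER_LEVEL   9  /* 512 eight-byte entries per 4 KiB table */

/* Address bits covered by a page table with the given number of levels. */
static unsigned bits_for_levels(unsigned levels)
{
    return PAGE_SHIFT + BITS_PER_LEVEL * levels;
}

/* Smallest number of levels whose reach includes the byte at
 * 'highest_addr' when addresses are mapped linearly from zero. */
static unsigned levels_needed(uint64_t highest_addr)
{
    unsigned levels = 1;
    while (bits_for_levels(levels) < 64 &&
           (highest_addr >> bits_for_levels(levels)) != 0)
        levels++;
    return levels;
}
```

Under these assumptions three levels reach 2^39 bytes (512 GiB) and four reach 2^48 bytes (256 TiB). What matters is the highest address used, not the amount of RAM populated, so a sparse layout can force an extra level even on a machine with only a few gigabytes installed.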

Arnd

Michael Clark

Jul 13, 2016, 7:32:15 PM
to Samuel Falvo II, sw-...@groups.riscv.org, Alex Bradbury, Arnd Bergmann, lowri...@lists.lowrisc.org
WebAssembly is not designed for server side OS ABI JIT. Intel Pin for RISC-V is closer to what I'm looking for. Easy JIT on the client is not my end goal. I want to exploit the bit complexity and may swizzle new ISAs on the fly. Currently working on a BMI2 PEXT/PDEP codec for RISC-V. I need 112 bits of ABI entropy for my application.

Sent from my iPhone

Samuel Falvo II

Jul 13, 2016, 7:38:11 PM
to Michael Clark, sw-...@groups.riscv.org, Alex Bradbury, Arnd Bergmann, lowri...@lists.lowrisc.org
On Wed, Jul 13, 2016 at 4:31 PM, Michael Clark <michae...@mac.com> wrote:
> WebAssembly is not designed for server side OS ABI JIT. Intel Pin for RISC-V

WebAssembly is an intermediate format with no particular application
domain at all. (And yes, it is intended for server-side applications
as well, considering the popularity of Node.js). It's designed to be
generic and scalable from web UI to embedded applications. Though it
*targets* web UI/UX as its first application area, it's not restricted
to just that. Its abstraction level is on par with raw assembly
language. Indeed, the only real abstraction it does offer is that you
don't need to name CPU registers explicitly, but that's about it.

At some point in the Kestrel's future, I fully intend on providing
WebAssembly support at the OS level.

--
Samuel A. Falvo II

kr...@berkeley.edu

Jul 14, 2016, 8:44:23 AM
to Karsten Merker, Alex Bradbury, Krste Asanovic, sw-...@groups.riscv.org, lowri...@lists.lowrisc.org

>>>>> On Wed, 13 Jul 2016 20:21:22 +0200, Karsten Merker <mer...@debian.org> said:
| Hello,

| is there some documentation available regarding this "RISC-V
| configuration string"? Alex's comment sounds like it is intended
| as an alternative to device-tree, in which case I wonder a bit
| whether adding yet-another-hardware-description-format instead of
| using device-tree makes sense, or rather what the advantages of a
| new hardware description format would be compared to device-tree.

It's documented in a chapter in the 1.9 draft. It is certainly open
for discussion. We discuss the reasons for not using device tree in
commentary in that chapter. In short, we wanted to avoid a binary
encoding, particularly one that is a poor fit for standard RISC-V Unix
machines (little endian and > 32 bits).

We couldn't see any advantage to adding another layer of cruft on top
of a poorly thought out standard.

Our config string is a simple plain printable UTF-8 string, and we've
already provided a library for managing this in the Linux port. We
will relicense this code under BSD.

Krste

kr...@berkeley.edu

Jul 14, 2016, 8:52:31 AM
to Arnd Bergmann, sw-...@groups.riscv.org, Alex Bradbury, Krste Asanovic, lowri...@lists.lowrisc.org

Hi Arnd,

>>>>> On Wed, 13 Jul 2016 21:50:35 +0200, Arnd Bergmann <ar...@arndb.de> said:
| In previous information that I have seen about RISC-V, the idea
| was always to completely avoid MMIO and have discoverable hardware,
| which would nicely sidestep the problem entirely, but the information
| in the U5 Coreplex Series Manual gives up on that and just uses
| MMIO, presumably because otherwise you cannot easily reuse existing
| IP blocks. This means at least the Linux port will have to use
| a software device tree to describe what is actually there.

The fixed memory map is there for hardware implementers. Software
should always use the config string information to find where things
are on a platform.

| Unfortunately, the memory map described on page 12 of that document
| makes the same mistake as some ARM64 chips and makes the RAM
| location extremely sparse, with the first 14GB starting close to
| zero, but all RAM beyond that starting at 0x1_0000_0000_0000
| (256TB), which I guess requires using an extra level of page tables
| for the kind of linear mapping that Linux has. It's probably too
| late to change that, but I'd suggest that future implementations
| do it differently.

Platforms only need populate a single linear RAM address region. The
memory map describes the options where this could be put.

The reason we, and many others, adopt this style of layout is to
reduce hardware costs for smaller implementations which do not need to
provide all physical address bits and to support various DIMM sizes
without adding complexity/delay to memory accesses and coherence
systems.

Krste



kr...@berkeley.edu

Jul 14, 2016, 8:54:24 AM
to Samuel Falvo II, Michael Clark, sw-...@groups.riscv.org, Alex Bradbury, Arnd Bergmann, lowri...@lists.lowrisc.org

I haven't had time to digest whole thread, but RISC-V was designed to
provide PIC at no cost, and kernel always runs in translated space.

Not sure the complexity being described here is needed.

Krste

Arnd Bergmann

Jul 14, 2016, 8:55:41 AM
to kr...@berkeley.edu, sw-...@groups.riscv.org, Alex Bradbury, lowri...@lists.lowrisc.org
On Thursday, July 14, 2016 5:52:28 AM CEST kr...@berkeley.edu wrote:

> | Unfortunately, the memory map described on page 12 of that document
> | makes the same mistake as some ARM64 chips and makes the RAM
> | location extremely sparse, with the first 14GB starting close to
> | zero, but all RAM beyond that starting at 0x1_0000_0000_0000
> | (256TB), which I guess requires using an extra level of page tables
> | for the kind of linear mapping that Linux has. It's probably too
> | late to change that, but I'd suggest that future implementations
> | do it differently.
>
> Platforms only need populate a single linear RAM address region. The
> memory map describes the options where this could be put.
>
> The reason we, and many others, adopt this style of layout is to
> reduce hardware costs for smaller implementations which do not need to
> provide all physical address bits and to support various DIMM sizes
> without adding complexity/delay to memory accesses and coherence
> systems.

Ok, that makes sense, and only leaves the problem of relocating
the kernel at boot time to the correct physical address.

Arnd

Arnd Bergmann

Jul 14, 2016, 9:01:34 AM
to sw-...@groups.riscv.org, kr...@berkeley.edu, Karsten Merker, Alex Bradbury, lowri...@lists.lowrisc.org
I think such code is unlikely to get merged into the Linux kernel
though; there will be significant resistance from subsystem maintainers
to adding yet another boot loader interface.

However, you could have your own interface for the first-stage bootloader,
and then load something like the pxa-impedance-matcher[1] that converts
the data into normal DTB format before starting the kernel.

Arnd

[1] https://github.com/zonque/pxa-impedance-matcher

Alex Bradbury

Jul 19, 2016, 12:30:06 PM
to Arnd Bergmann, sw-...@groups.riscv.org, Krste Asanovic, Karsten Merker, lowri...@lists.lowrisc.org, wes...@sifive.com
I've been playing with the config string and Linux today, and I'm
really starting to think that if device-tree is seen as unsuitable as
the standard, then at least the config string should always be
translated to devicetree at boot. I've CCed Wesley who has written the
initial Linux config-string patches and perhaps may comment on what
the longer term vision is.

As things are, I'm finding the configuration string rather awkward
when attempting to work with devices with existing drivers. All
existing Linux devices can take extra configuration from either a
struct passed to them (platdata), or read from device tree. The
config-string code
(https://github.com/riscv/riscv-linux/commit/a08621d4851e681a6b3ba65fa1ffb3b8814eb422)
will recognise resources such as irqs and addresses but of course many
drivers need more parameters than that. Modifying any driver that
might be used as part of a RISC-V platform so it can use
config_string_u64 and config_string_str as well as of_property_read_*
and the platdata approach seems like a non-starter.

Does anyone have thoughts on this issue or plans to share regarding
config-string+device-tree?

Thanks,

Alex

Wesley Terpstra

Jul 19, 2016, 1:43:28 PM
to sw-...@groups.riscv.org
I am resending this message after subscribing to the list...

On Tue, Jul 19, 2016 at 9:30 AM, Alex Bradbury <a...@asbradbury.org> wrote:
> On 14 July 2016 at 14:01, Arnd Bergmann <ar...@arndb.de> wrote:
>> On Thursday, July 14, 2016 5:44:20 AM CEST kr...@berkeley.edu wrote:
>>> Our config string is a simple plain printable UTF-8 string, and we've
>>> already provided a library for managing this in the Linux port. We
>>> will relicense this code under BSD.
>>
>> I think such code is unlikely to get merged into the Linux kernel
>> though, there will be significant resistance for subsystem maintainers
>> in adding yet another boot loader interface.

Why? How a system boots and how the hardware self reports is not
something that is within the scope of the linux kernel. The linux
kernel must work on what the hardware and BIOS-equivalent provide. If
the maintainers rejected boot loaders because they were baroque, linux
would not support x86.

I think that whatever decision we make wrt. how the system describes
its contained hardware should not consider whether a hypothetical
kernel developer will reject a merge of the architecture, because this
is not within the scope of kernel software. That said, if it makes
porting many drivers to RISC-V a pain, then we might want to take that
pain into consideration.

>> However, you could have your own interface for the first-stage bootloader,
>> and then load something like the pxa-impedance-matcher[1] that converts
>> the data into normal DTB format before starting the kernel.
>
> I've been playing with the config string and Linux today, and I'm
> really starting to think that if device-tree is seen as unsuitable as
> the standard, then at least the config string should always be
> translated to devicetree at boot.

I am definitely opposed to a translation. You would then have
information in two incompatible formats. In my experience, there is
always a mismatch in the information when you go down this route. If
we do device tree, we should go all the way and use it natively, OR we
should stick with the config string. I think a half measure smells
like a legacy compatibility layer... and yet we are starting with a
clean slate.

> As things are, I'm finding the configuration string rather awkward
> when attempting to work with devices with existing drivers. All
> existing Linux devices can take extra configuration from either a
> struct passed to them (platdata), or read from device tree.

Even if you said that all existing linux PLATFORM device drivers take
their configuration from plat data or device tree, the above would
still be inaccurate. Firstly, the vast majority of linux device
drivers are not for platform devices. They are for devices connected
to some other bus, like PCI or USB. Those work unmodified on RISC-V.
Secondly, many platform devices do not use device tree, but instead
simply use the 'resource's that are probed for them. We are already
compatible with this interface. Finally, many architectures (including
x86) do not use device tree, but another mechanism, so RISC-V is not
unusual in this regard.

That said, it is definitely true that there are many existing
_platform_ drivers that use device tree.

> The config-string code
> will recognise resources such as irqs and addresses but of course many
> drivers need more parameters than that.

Yes. See, for example, gpio-riscv.c which executes:
> rv->chip.ngpio = config_string_u64(pdev, "ngpio");
... which does not seem so cumbersome.

> Modifying any driver that
> might be used as part of a RISC-V platform so it can use
> config_string_u64 and config_string_str as well as of_property_read_*
> and the platdata approach seems like a non-starter.

Well, that depends.

How many existing devices are we planning on integrating directly into
the SoC? Again, any device that connects via another bus standard will
work unmodified. I would hope that we aim to include open source
hardware in the SoC, which would need a new driver anyway.

Platform device drivers that correspond to well established IP already
do exactly what you are advocating against. See
drivers/pci/host/pcie-{qcom,hisi}.c, for example. Those are thin C
wrappers that add probing via device tree for a specific platform to
the general-purpose DesignWare driver. Notice that they had to write
distinct drivers for two different platforms, despite the use of
device tree. We could very easily add another C file that did the
probing via the config string.

In summary: I am not necessarily advocating for the config string, but
I don't find the config string approach to be much of a show stopper.
The porting effort is minimal per device. This only leaves open the
question of how many devices would we need to port. The only impacted
devices are those with existing (but not already widely ported)
drivers that we would want to integrate directly into the SoC.

Alex Bradbury

Jul 19, 2016, 2:14:45 PM
to Wesley Terpstra, Arnd Bergmann, sw-...@groups.riscv.org, Krste Asanovic, Karsten Merker, lowri...@lists.lowrisc.org
On 19 July 2016 at 18:30, Wesley Terpstra <wes...@sifive.com> wrote:
> I think that whatever decision we make wrt. how the system describes
> its contained hardware should not consider whether a hypothetical
> kernel developer will reject a merge of the architecture, because this
> is not within the scope of kernel software. That said, if it makes
> porting many drivers to RISC-V a pain, then we might want to take that
> pain into consideration.

I disagree: with RISC-V we have the opportunity for much greater
collaboration between hardware and software developers. These
decisions should be made in consultation with software developers
rather than thrown over the wall (publishing drafts of the privileged
spec before they are adopted by the RISC-V Foundation has of course
been useful for this).

> That said, it is definitely true that there are many existing
> _platform_ drivers that use device tree.

You're right, I was overly broad - thanks for the clarification.

>> The config-string code
>> will recognise resources such as irqs and addresses but of course many
>> drivers need more parameters than that.
>
> Yes. See, for example, gpio-riscv.c which executes:
>> rv->chip.ngpio = config_string_u64(pdev, "ngpio");
> ... which does not seem so cumbersome.

Yes I saw that. It's not cumbersome to add it for any given device,
but it does seem a shame if we're now going to end up with three
configuration codepaths for every platform device that might want to
be used on RISC-V as well as other devicetree supporting platforms.
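
To picture the "three configuration codepaths" concern, here is a deliberately simplified, self-contained sketch. The `fake_device` and `gpio_platdata` types and the `probe_ngpio` function are hypothetical and do not match real kernel structures. In a real driver the second and third branches would be calls in the style of of_property_read_* and config_string_u64:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-ins for the three sources a
 * platform driver might have to consult. None of these types match
 * the real kernel structures. */
struct gpio_platdata { unsigned ngpio; };

struct fake_device {
    const struct gpio_platdata *platdata; /* board file, may be NULL */
    const uint32_t *dt_ngpio;             /* device tree property, may be NULL */
    const uint64_t *cfg_ngpio;            /* config string value, may be NULL */
};

/* Returns 0 and stores the GPIO count, or -1 if no source provides it.
 * Each branch corresponds to one configuration codepath. */
static int probe_ngpio(const struct fake_device *dev, unsigned *ngpio)
{
    if (dev->platdata) {        /* path 1: platdata struct */
        *ngpio = dev->platdata->ngpio;
        return 0;
    }
    if (dev->dt_ngpio) {        /* path 2: device tree style lookup */
        *ngpio = *dev->dt_ngpio;
        return 0;
    }
    if (dev->cfg_ngpio) {       /* path 3: config string style lookup */
        *ngpio = (unsigned)*dev->cfg_ngpio;
        return 0;
    }
    return -1;
}
```

Each branch is small on its own; the cost is that every shared platform driver would carry and maintain all three.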

>> Modifying any driver that
>> might be used as part of a RISC-V platform so it can use
>> config_string_u64 and config_string_str as well as of_property_read_*
>> and the platdata approach seems like a non-starter.
>
> Well, that depends.
>
> How many existing devices are we planning on integrating directly into
> the SoC? Again, any device that connects via another bus standard will
> work unmodified. I would hope that we aim to include open source
> hardware in the SoC, which would need a new driver anyway.

Sometimes you want to instantiate existing IP, e.g. Xilinx SPI. Some
open source IP is already supported in the kernel (e.g. some devices
from OpenCores). Additionally, it seems unnecessary to assume that
newly-created open source IP will be RISC-V only. People may choose to
instantiate it with other cores (ARM, MIPS, proprietary, OpenRISC,
OpenSPARC, ...). Although at lowRISC we aim for open-source
peripherals, many other RISC-V implementers are likely to use existing
in-house IP or commercially licensed IP.

> Platform device drivers that correspond to well established IP already
> do exactly what you are advocating against. See
> drivers/pci/host/pcie-{qcom,hisi}.c, for example. Those are thin C
> wrappers that add probing via device tree for a specific platform to
> the general-purpose DesignWare driver. Notice that they had to write
> distinct drivers for two different platforms, despite the use of
> device tree. We could very easily add another C file that did the
> probing via the config string.

The fact that devicetree doesn't solve all driver-related issues seems
somewhat orthogonal to config-string vs devicetree. If I were to
submit a patch to add config_string_str calls to a driver that already
supports of_property_read_* I think it would be hard to convince
myself I'm actually making the kernel better.

> In summary: I am not necessarily advocating for the config string, but
> I don't find the config string approach to be much of a show stopper.
> The porting effort is minimal per device. This only leaves open the
> question of how many devices would we need to port. The only impacted
> devices are those with existing (but not already widely ported)
> drivers that we would want to integrate directly into the SoC.

Devicetree can also be used for configuring on-board devices, as is
done on boards like the BeagleBone Black and Raspberry Pi through
device-tree overlays. For example, you may load a certain device tree
overlay when an audio 'hat' or 'cape' is connected to enable I2S on the
desired pins. For use cases like this, it doesn't seem worth totally
stripping the system of DT; instead you'd end up using either a mixture
of config-string and device-tree, or device-tree throughout (e.g. by
having the bootloader convert the config-string).
I'm far from an expert here though, so do correct me if you think I'm
missing the point.

Alex

Karsten Merker

Jul 19, 2016, 4:10:11 PM
to Alex Bradbury, Wesley Terpstra, Arnd Bergmann, Krste Asanovic, sw-...@groups.riscv.org, Karsten Merker, lowri...@lists.lowrisc.org
On Tue, Jul 19, 2016 at 07:14:40PM +0100, Alex Bradbury wrote:
> On 19 July 2016 at 18:30, Wesley Terpstra <wes...@sifive.com> wrote:

> > I think that whatever decision we make wrt. how the system describes
> > its contained hardware should not consider whether a hypothetical
> > kernel developer will reject a merge of the architecture, because this
> > is not within the scope of kernel software. That said, if it makes
> > porting many drivers to RISC-V a pain, then we might want to take that
> > pain into consideration.
>
> I disagree, with RISC-V we have the opportunity for much greater
> collaboration between hardware and software developers. These
> decisions should be made in consultation with software developers
> rather than thrown over the wall (publishing drafts of the privileged
> spec before they are adopted by the RISC-V Foundation has of course
> been useful for this).

Ack.

> >> The config-string code
> >> will recognise resources such as irqs and addresses but of course many
> >> drivers need more parameters than that.
> >
> > Yes. See, for example, gpio-riscv.c which executes:
> >> rv->chip.ngpio = config_string_u64(pdev, "ngpio");
> > ... which does not seem so cumbersome.
>
> Yes I saw that. It's not cumbersome to add it for any given device,
> but it does seem a shame if we're now going to end up with three
> configuration codepaths for every platform device that might want to
> be used on RISC-V as well as other devicetree supporting platforms.

Devicetree is already supported out of the box in a large part of
the existing platform drivers, so with using devicetree there
wouldn't be any need to port specific drivers at all in many
cases - they could be used as is. As Wesley has pointed out,
there _are_ legacy drivers onto which device-tree support has
been grafted on in a not-so-ideal way, but that isn't the default
case.

> >> Modifying any driver that
> >> might be used as part of a RISC-V platform so it can use
> >> config_string_u64 and config_string_str as well as of_property_read_*
> >> and the platdata approach seems like a non-starter.
> >
> > Well, that depends.
> >
> > How many existing devices are we planning on integrating directly into
> > the SoC? Again, any device that connects via another bus standard will
> > work unmodified. I would hope that we aim to include open source
> > hardware in the SoC, which would need a new driver anyway.
>
> Sometimes you want to instantiate existing IP, e.g. Xilinx SPI. Some
> open source IP is already supported in the kernel (e.g. some devices
> from open cores). Additionally, it seems unnecessary to assume that
> newly-created open source IP will be RISC-V only. People may choose to
> instantiate it with other cores (ARM, MIPS, proprietary, OpenRISC,
> OpenSPARC, ...). Although at lowRISC we aim for open-source
> peripherals, many other RISC-V implementers are likely to use existing
> in-house IP or commercially licensed IP.

Another important field for platform driver support is
peripherals on non-discoverable busses such as I2C and SPI, for
which we have a plethora of existing drivers with device-tree
support. Those can be instantiated with just a few lines in the
dts and don't need any driver porting work on device-tree-using
systems. This encompasses things such as power management ICs,
SPI flash memory, ADCs, DACs, lots of sensors (magnetometer,
accelerometer, orientation, temperature, pressure, etc.), tons of
different touchscreens, SPI-driven graphical LCDs for embedded
use, various forms of I/O controllers, etc.

> > In summary: I am not necessarily advocating for the config string, but
> > I don't find the config string approach to be much of a show stopper.
> > The porting effort is minimal per device. This only leaves open the
> > question of how many devices would we need to port. The only impacted
> > devices are those with existing (but not already widely ported)
> > drivers that we would want to integrate directly into the SoC.

This is not really true, see the paragraph above.

> Devicetree can also be used for configuring on-board devices as used
> with devices like Beaglebone Black and Raspberry Pi through
> device-tree overlays. e.g. you may load a certain device tree overlay
> when an audio 'hat' or 'cape' is connected to enable I2S on the
> desired pins. For use cases like this, it doesn't seem it would be
> worth totally stripping the system of DT and instead you'd end up
> using either a mixture of config-string and device-tree or device-tree
> throughout (by e.g. having the bootloader convert the config-string).
> I'm far from an expert here though, so do correct me if you think I'm
> missing the point.

I fully agree.

As a practical example why I consider mixing different hardware
description systems a bad idea: I have already been exposed way
more than I would like to a mixup between two hardware description
systems on the Allwinner sunxi platform. Allwinner has implemented
their own proprietary hardware description system called "FEX" in
their vendor bootloader and in their vendor kernels. They have
patched the drivers they consider important to support their FEX
system and ignore everything else. This results in an absolute
mess if you want to use some device that they didn't consider
important and therefore haven't added FEX support for, as the
standard device drivers expect devicetree configuration, but e.g.
their pinctrl driver expects everything to be handled by FEX. In
contrast to the vendor code, the sunxi platform support in mainline
is purely devicetree-based and there wouldn't have been even the
slightest chance of the Allwinner FEX system being included in the
mainline kernel and u-boot.

Wesley Terpstra

Jul 19, 2016, 4:55:27 PM7/19/16
to Karsten Merker, Alex Bradbury, Arnd Bergmann, Krste Asanovic, sw-...@groups.riscv.org, lowri...@lists.lowrisc.org
On Tue, Jul 19, 2016 at 1:10 PM, Karsten Merker <mer...@debian.org> wrote:
> On Tue, Jul 19, 2016 at 07:14:40PM +0100, Alex Bradbury wrote:
>> On 19 July 2016 at 18:30, Wesley Terpstra <wes...@sifive.com> wrote:
>
>> > I think that whatever decision we make wrt. how the system describes
>> > its contained hardware should not consider whether a hypothetical
>> > kernel developer will reject a merge of the architecture, because this
>> > is not within the scope of kernel software. That said, if it makes
>> > porting many drivers to RISC-V a pain, then we might want to take that
>> > pain into consideration.
>>
>> I disagree, with RISC-V we have the opportunity for much greater
>> collaboration between hardware and software developers. These
>> decisions should be made in consultation with software developers
>> rather than thrown over the wall (publishing drafts of the privileged
>> spec before they are adopted by the RISC-V Foundation has of course
>> been useful for this).

I am not saying that we should not collaborate with software
developers. Certainly we want to reach a solution that is good for
both the hardware and software implementations. Rather, I don't think
we should be making a decision based on the fear of not getting
whatever support is necessary into the kernel. If the software is
needed for the platform, it will get merged.

>> Yes I saw that. It's not cumbersome to add it for any given device,
>> but it does seem a shame if we're now going to end up with three
>> configuration codepaths for every platform device that might want to
>> be used on RISC-V as well as other devicetree supporting platforms.

Right. I just question how many such devices there will be.

> Devicetree is already supported out of the box in a large part of
> the existing platform drivers, so with using devicetree there
> wouldn't be any need to port specific drivers at all in many
> cases - they could be used as is.

I agree that this is a desirable feature.

When I first saw the config string, my response was also: why did we
not use an existing standard. All I want is that we don't end up with
two orthogonal description languages in a new platform.

>> Sometimes you want to instantiate existing IP, e.g. Xilinx SPI. Some
>> open source IP is already supported in the kernel (e.g. some devices
>> from open cores). Additionally, it seems unnecessary to assume that
>> newly-created open source IP will be RISC-V only.

All fair points.

> Another important field for platform driver support is
> peripherals on non-discoverable busses such as I2C and SPI, for
> which we have a plethora of existing drivers with device-tree
> support.

I agree that you need some way to compensate for deficient busses.

>> Devicetree can also be used for configuring on-board devices as used
>> with devices like Beaglebone Black and Raspberry Pi through
>> device-tree overlays. e.g. you may load a certain device tree overlay
>> when an audio 'hat' or 'cape' is connected to enable I2S on the
>> desired pins.

If you attach an external bus described by DT, I agree you will want
to keep using that format.

I also agree mixing standards is bad. I just want to avoid that there
are two different description languages used at different layers of
the system.

Ultimately, though, you need to convince the actors who advocate for
the config string. They argue that DT does not include the fields
needed for RISC-V and would need to be extended, anyway.

I wonder if one could address most of the concerns about the config
string by implementing DT API-compatible methods for the config
string.

David Chisnall

Jul 20, 2016, 5:16:59 AM7/20/16
to Wesley Terpstra, Karsten Merker, Alex Bradbury, Arnd Bergmann, Krste Asanovic, sw-...@groups.riscv.org, lowri...@lists.lowrisc.org
On 19 Jul 2016, at 21:55, Wesley Terpstra <wes...@sifive.com> wrote:
>
> Ultimately, though, you need to convince the actors who advocate for
> the config string. They argue that DT does not include the fields
> needed for RISC-V and would need to be extended, anyway.

What fields are these? FDT is just a serialisation format; aside from a small amount in the header, it’s entirely up to the producer and consumer what the semantics of the various fields are.

David

Arnd Bergmann

Jul 20, 2016, 5:32:15 AM7/20/16
to Wesley Terpstra, Karsten Merker, Alex Bradbury, Krste Asanovic, sw-...@groups.riscv.org, lowri...@lists.lowrisc.org
On Tuesday, July 19, 2016 1:55:25 PM CEST Wesley Terpstra wrote:
> On Tue, Jul 19, 2016 at 1:10 PM, Karsten Merker <mer...@debian.org> wrote:
> > On Tue, Jul 19, 2016 at 07:14:40PM +0100, Alex Bradbury wrote:
> >> On 19 July 2016 at 18:30, Wesley Terpstra <wes...@sifive.com> wrote:
> >
> >> > I think that whatever decision we make wrt. how the system describes
> >> > its contained hardware should not consider whether a hypothetical
> >> > kernel developer will reject a merge of the architecture, because this
> >> > is not within the scope of kernel software. That said, if it makes
> >> > porting many drivers to RISC-V a pain, then we might want to take that
> >> > pain into consideration.
> >>
> >> I disagree, with RISC-V we have the opportunity for much greater
> >> collaboration between hardware and software developers. These
> >> decisions should be made in consultation with software developers
> >> rather than thrown over the wall (publishing drafts of the privileged
> >> spec before they are adopted by the RISC-V Foundation has of course
> >> been useful for this).
>
> I am not saying that we should not collaborate with software
> developers. Certainly we want to reach a solution that is good for
> both the hardware and software implementations. Rather, I don't think
> we should be making a decision based on the fear of not getting
> whatever support is necessary into the kernel. If the software is
> needed for the platform, it will get merged.

Note that you should always be able to describe a full system
with a flattened device tree, even if it started out using some
other format (atags, fex, bios, ...), since the DT is basically
a superset of information that you can have elsewhere.

> >> Yes I saw that. It's not cumbersome to add it for any given device,
> >> but it does seem a shame if we're now going to end up with three
> >> configuration codepaths for every platform device that might want to
> >> be used on RISC-V as well as other devicetree supporting platforms.
>
> Right. I just question how many such devices there will be.

There is already an ongoing unification with the "device properties"
interface that allows you to query generic properties from a
'struct device' in Linux, regardless of whether it was instantiated
from architecture code (on legacy machines), using devicetree
(on most modern embedded systems) or from ACPI (on embedded x86),
since all of them have a way to attach numbers or strings to named
keys.

The same thing can of course be used for querying config strings
that are embedded in a RISC-V SoC, but it quickly becomes awkward
when you need to have cross-references to other nodes, as those
cannot easily be generalized.
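
As a rough illustration of the unification described above, here is a toy Python model, not kernel code. The `Device` class, `property_read_u32` method and `probe_gpio` function are hypothetical names made up for this sketch; in Linux the corresponding driver-side calls are the `device_property_read_*()` helpers. The sketch only covers the easy case of scalar values attached to named keys, and says nothing about cross-references.

```python
from typing import Callable, Optional, Union

Value = Union[int, str]


class Device:
    """Toy device whose properties come from one firmware description backend."""

    def __init__(self, name: str, lookup: Callable[[str], Optional[Value]]):
        self.name = name
        self._lookup = lookup  # backend-specific: key -> value, or None if absent

    def property_read_u32(self, key: str) -> int:
        """Return an integer property, whichever backend supplied it."""
        value = self._lookup(key)
        if not isinstance(value, int):
            raise KeyError(f"{self.name}: no integer property {key!r}")
        return value


def probe_gpio(dev: Device) -> int:
    """Driver-side code: identical for every backend."""
    return dev.property_read_u32("ngpio")


# Two backends modelled as plain dictionaries (illustrative values only).
from_devicetree = Device("gpio0", {"ngpio": 16}.get)
from_config_string = Device("gpio1", {"ngpio": 32}.get)

print(probe_gpio(from_devicetree), probe_gpio(from_config_string))
```

The driver function never learns where the value came from, which is the property that would let a config-string backend sit alongside the existing ones for simple keys.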

> > Another important field for platform driver support is
> > peripherals on non-discoverable busses such as I2C and SPI, for
> > which we have a plethora of existing drivers with device-tree
> > support.
>
> I agree that you need some way to compensate for deficient busses.

The device tree is used for several things beyond that today, each
of which needs a solution:

- passing configuration from the bootloader, including amount of
installed memory, root console settings and the kernel command
line. These are often handled in some architecture specific
way and I assume there is something in place already

- describing on-chip components: this is annoying because the
SoC should already know what it is and have a way to convey
that information to an OS without the need of describing it
in software. x86 doesn't need this because the hardware is
known in advance: everything is either at a fixed address
or it shows up logically as a discoverable PCI device.
Today's SoCs almost all get that wrong, and that's why we
describe them in DT.
If the config strings can be embedded in each on-chip device
in a way that they are always accessible without the need
for a boot loader interface in the way that PCI is discoverable,
that means we don't need DT for this and can just have a
single DT node for the SoC itself, and make that present itself
as a combined irqchip/gpiochip/pwm/led/clk/reset/mmio/...
provider.

- describing off-chip non-discoverable buses, there really
isn't much of an alternative to DT, as these can be rather
complex, each device can both be a consumer and a provider
of resources (e.g. gpios or regulators controlled over i2c),
and we don't want to describe individual machines in source
code any more. embedded x86 machines with ACPI can get around
some of these problems by sticking data that is compatible
with the DT format into ACPI tables.

- adding missing information to discoverable buses: this
happens surprisingly often: sdio is discoverable in theory,
but many wifi adapters need to read their MAC address and
connect to a clock controller, a reset controller, a regulator
or an external interrupt. USB ethernet is similar, when people
are cheap and leave out the eprom that stores device information.
Even on PCI we have the need to pass additional data like
tuning for wireless network adapters, or you can have a
SDIO-on-USB-on-PCI adapter and get all of the above.

> >> Devicetree can also be used for configuring on-board devices as used
> >> with devices like Beaglebone Black and Raspberry Pi through
> >> device-tree overlays. e.g. you may load a certain device tree overlay
> >> when an audio 'hat' or 'cape' is connected to enable I2S on the
> >> desired pins.
>
> If you attach an external bus described by DT, I agree you will want
> to keep using that format.
>
> I also agree mixing standards is bad. I just want to avoid that there
> are two different description languages used at different layers of
> the system.
>
> Ultimately, though, you need to convince the actors who advocate for
> the config string. They argue that DT does not include the fields
> needed for RISC-V and would need to be extended, anyway.
>
> I wonder if one could address most of the concerns about the config
> string by implementing DT API-compatible methods for the config
> string.

My reading of the config string interface is that it assumes that a
type of information can always be expressed as a string or a single
integer number, while the DT format for most subsystems defines the
format as using a reference to a node providing the resource, followed
by an arbitrary number of integers in a format that is defined by
the provider.

It's probably possible to map one into the other, but it's not nice.
In a system-wide view, having global numbers or names as identifiers
for resources does not scale, because you can have additional
providers for things like irqs, gpios, dma, leds, ... on external buses,
and an OS like Linux already contains infrastructure to handle
those but requires them to be registered as separate devices that
each define their own local addressing.

Having a single number as an identifier works most of the time, but
there are plenty of examples where it doesn't:

- an irqchip needing to pass local hwirq number plus flags (polarity,
type of interrupt) that the consumer driver doesn't know.
- dmaengine requiring request number plus dma master identifier
- gpio passing some settings (bank-nr, polarity, drive strength, ...)
- pwm period and flags

If you only have a scalar identifier, then you need a global
lookup table somewhere reachable by the driver. Using a string
obviously works for this too, but requires a parser in each driver.
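
To make the "reference to a provider, followed by provider-defined integers" format concrete, here is a small Python sketch. It is only a model of the idea: the function name `parse_specifiers`, the provider table and the cell values are all invented for illustration, and a real devicetree parser reads the cell counts from the provider nodes themselves.

```python
from typing import Dict, List, Tuple


def parse_specifiers(
    cells: List[int], providers: Dict[int, Tuple[str, int]]
) -> List[Tuple[str, List[int]]]:
    """Split a flat list of integers into (provider name, argument cells) pairs.

    `providers` maps a reference number to (provider name, argument count),
    so each provider decides how many integers follow a reference to it.
    """
    result = []
    i = 0
    while i < len(cells):
        name, nargs = providers[cells[i]]
        args = cells[i + 1 : i + 1 + nargs]
        if len(args) != nargs:
            raise ValueError(f"truncated specifier for {name}")
        result.append((name, args))
        i += 1 + nargs
    return result


# Hypothetical providers: one takes a bare number, the other number plus flags.
providers = {1: ("main-irqchip", 1), 2: ("gpio-irqchip", 2)}
# One consumer with two interrupts routed to two different controllers.
cells = [1, 7, 2, 3, 8]

print(parse_specifiers(cells, providers))
```

The consumer driver never interprets the argument cells; it hands them back to the named provider, which is why no global lookup table is needed.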

Arnd

Wesley Terpstra

Jul 22, 2016, 5:14:59 PM7/22/16
to Arnd Bergmann, Karsten Merker, Alex Bradbury, Krste Asanovic, sw-...@groups.riscv.org, lowri...@lists.lowrisc.org, Andrew Waterman
Since no one else responded, I end up again defending the config
string. Here we go...

On Wed, Jul 20, 2016 at 2:32 AM, Arnd Bergmann <ar...@arndb.de> wrote:
> My reading of the config string interface is that it assumes that a
> type of information can always be expressed as a string or a single
> integer number

That's how the convenience methods I wrote work, but actually, the
config string allows an arbitrary number of integers or strings
following a key. I was going to add an additional 'nth' parameter to
those methods.
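
For readers who have not seen the format, here is a toy Python sketch of what an 'nth' parameter could look like. Everything in it is an assumption made for illustration: the syntax is a simplified guess (`key value ...;` entries and `key { ... };` blocks), the names `parse` and `config_u64` are hypothetical, and there is no error handling for malformed input. It is not the kernel helper discussed above.

```python
import re
from typing import Dict, List, Union

Node = Dict[str, Union[List[str], "Node"]]


def parse(text: str) -> Node:
    """Parse the assumed syntax into nested dictionaries of value lists."""
    tokens = re.findall(r"[{};]|[^\s{};]+", text)
    pos = 0

    def block() -> Node:
        nonlocal pos
        node: Node = {}
        while pos < len(tokens) and tokens[pos] != "}":
            key = tokens[pos]
            pos += 1
            if tokens[pos] == "{":
                pos += 1          # enter the nested block
                node[key] = block()
                pos += 1          # skip the closing "}"
            else:
                values = []
                while tokens[pos] != ";":
                    values.append(tokens[pos])
                    pos += 1
                node[key] = values
            pos += 1              # skip the terminating ";"
        return node

    return block()


def config_u64(root: Node, path: str, nth: int = 0) -> int:
    """Return the nth integer stored under a '/'-separated path."""
    *parents, key = path.split("/")
    node = root
    for name in parents:
        node = node[name]
    return int(node[key][nth], 0)  # base 0 accepts decimal and 0x-prefixed hex


tree = parse("gpio { ngpio 16; range 0x1000 0x100; };")
print(config_u64(tree, "gpio/ngpio"), config_u64(tree, "gpio/range", nth=1))
```

With `nth` defaulting to zero, callers that only expect a single value keep working unchanged.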

> In a system-wide view, having global numbers or names as identifiers
> for resources does not scale, because you can have additional
> providers for things like irqs, gpios, dma, leds, ... on external buses,

The config string is 'XML-like' in that you have recursive
descriptions, so only the complete path need be globally unique.

> - an irqchip needing to pass local hwirq number plus flags (polarity,
> type of interrupt) that the consumer driver doesn't know.
> - dmaengine requiring request number plus dma master identifier
> - gpio passing some settings (bank-nr, polarity, drive strength, ...)
> - pwm period and flags

I don't see a problem with most of these. You can have hierarchical
information at any given path in the config string, and drivers attach
at the point in the hierarchy where they assume responsibility.

> Note that you should always be able to describe a full system
> with a flattened device tree, even if it started out using some
> other format (atags, fex, bios, ...), since the DT is basically
> a superset of information that you can have elsewhere.

Right. Except when you translate between formats things might end up
either missing or in an unexpected location.

> There is already an ongoing unification with the "device properties"
> interface that allows you to query generic properties from a
> 'struct device' in Linux, regardless of whether it was instantiated
> from architecture code (on legacy machines), using devicetree
> (on most modern embedded systems) or from ACPI (on embedded x86),
> since all of them have a way to attach numbers or strings to named
> keys.

If we stick with the config string, I will definitely explore adding
the config-string meta-data to this interface.

> The same thing can of course be used for querying config strings
> that are embedded in a RISC-V SoC, but it quickly becomes awkward
> when you need to have cross-references to other nodes, as those
> cannot easily be generalised.

This is one of my main concerns. It's true that device tree has a lot
of cruft, but it's also a widely used spec, so some of what looks
overly complex actually solves a problem you didn't know existed.
Case in point: cross references. This is something the config-string
does not have, although I am sure its advocates would say: "Nothing
prevents you from including a string that refers to another path in
the hierarchy". However, unless that cross-reference format has been
specified somewhere, you won't get the sort of reuse that makes a
general-purpose parser and convenience functions available.

> The device tree is used for several things beyond that today, each
> of which needs a solution:
>
> - passing configuration from the bootloader, including amount of
> installed memory, root console settings and the kernel command
> line. These are often handled in some architecture specific
> way and I assume there is something in place already

Currently you get this information from a combination of SBI
(essentially the BIOS) calls and the config string. We were planning
to reduce the number of SBI calls to the bare minimum needed to hand
off the config string, and then put all the information you need in
there.

> - describing on-chip components: this is annoying because the
> SoC should already know what it is and have a way to convey
> that information to an OS without the need of describing it
> in software. x86 doesn't need this because the hardware is
> known in advance: everything is either at a fixed address
> or it shows up logically as a discoverable PCI device.
> Today's SoCs almost all get that wrong, and that's why we
> describe them in DT.

Right. This is the problem the config string advocates are trying to solve.

> If the config strings can be embedded in each on-chip device
> in a way that they are always accessible without the need
> for a boot loader interface in the way that PCI is discoverable,
> that means we don't need DT for this and can just have a
> single DT node for the SoC itself, and make that present itself
> as a combined irqchip/gpiochip/pwm/led/clk/reset/mmio/...
> provider.

Currently, the idea was to bake the config string for the SoC into the
boot ROM for that SoC. That means one ROM for the whole SoC, not one
ROM per device.

> - describing off-chip non-discoverable buses, there really
> isn't much of an alternative to DT, as these can be rather
> complex, each device can both be a consumer and a provider
> of resources (e.g. gpios or regulators controlled over i2c),
> and we don't want to describe individual machines in source
> code any more. embedded x86 machines with ACPI can get around
> some of these problems by sticking data that is compatible
> with the DT format into ACPI tables.

This is a definite gap in the current scheme. Probably the best answer
with the current system is that the boot loader for that board would
inject additional data describing the i2c components on the board into
the config string. However, DT already has a scheme for overlaying
data like this, and IMO it seems rather foolish to reinvent the wheel
with the config string.

> - adding missing information to discoverable buses: this
> happens surprisingly often: sdio is discoverable in theory,
> but many wifi adapters need to read their MAC address and
> connect to a clock controller, a reset controller, a regulator
> or an external interrupt. USB ethernet is similar, when people
> are cheap and leave out the eprom that stores device information.
> Even on PCI we have the need to pass additional data like
> tuning for wireless network adapters, or you can have a
> SDIO-on-USB-on-PCI adapter and get all of the above.

This was not even on our radar. Thank you for pointing this scenario out.

Arnd Bergmann

Jul 28, 2016, 10:39:02 AM7/28/16
to Wesley Terpstra, Karsten Merker, Alex Bradbury, Krste Asanovic, sw-...@groups.riscv.org, lowri...@lists.lowrisc.org, Andrew Waterman
On Friday, July 22, 2016 2:14:57 PM CEST Wesley Terpstra wrote:
> Since no one else responded, I end up again defending the config
> string. Here we go...
>
> On Wed, Jul 20, 2016 at 2:32 AM, Arnd Bergmann <ar...@arndb.de> wrote:
> > My reading of the config string interface is that it assumes that a
> > type of information can always be expressed as a string or a single
> > integer number
>
> That's how the convenience methods I wrote work, but actually, the
> config string allows an arbitrary number of integers or strings
> following a key. I was going to add an additional 'nth' parameter to
> those methods.

Ok.

> > In a system-wide view, having global numbers or names as identifiers
> > for resources does not scale, because you can have additional
> > providers for things like irqs, gpios, dma, leds, ... on external buses,
>
> The config string is 'XML-like' in that you have recursive
> descriptions, so only the complete path need be globally unique.

What I mean is that you may need to represent a device that has
multiple interrupts that are handled by different controllers,
so each one needs to be described by

- name of the irq as seen from the slave ("rx data", "tx data",
"error")
- path to the controller
- identifier as interpreted by that irqchip
- additional data needed by the irqchip

In devicetree we went through several additions here until we
could describe all hardware. Open Firmware started out assuming
there would be only one irqchip in the system, possibly with
just a single number for the irq. Then the "#interrupt-cells"
property was added to describe complex irq-chips, then
"interrupt-parent" was added to allow each device to have its
interrupts directed to a particular irqchip, and "interrupt-map"
was added to allow devices to have interrupts on multiple buses.
This was all part of IEEE 1275, and after we introduced the flattened
format for Linux, we extended it further with "interrupt-names"
to allow compatibility with Linux drivers that refer to each
interrupt line by a string (e.g. "rxdata") instead of picking
the first or second interrupt from an array, and finally we added
"interrupts-extended" because the syntax of the old "interrupt-map"
property was considered just too impractical.

For things that came later (dma, clocks, iommu), we tried to avoid
those mistakes and made something that always works like
"interrupts-extended", but that still requires the separate
"-names" property.

I'd suggest that if you require a reference to another node,
you follow a model similar to what we ended up with in DT, with
a set of "local name", "provider", "identifier" and "extra
attributes", to learn from the mistakes we made in the past.
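
A minimal sketch of that four-part reference, written as a Python data structure. The class name `ResourceRef`, the helper `find_ref` and all paths and numbers are hypothetical; the point is only to show the shape of the record, not how any existing parser stores it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ResourceRef:
    local_name: str              # name as seen from the consumer, e.g. "rx data"
    provider: str                # path to the node providing the resource
    identifier: int              # number interpreted by that provider only
    extra: Tuple[int, ...] = ()  # further provider-defined attributes


def find_ref(refs: List[ResourceRef], local_name: str) -> ResourceRef:
    """Look a reference up by the name the consumer driver uses."""
    for ref in refs:
        if ref.local_name == local_name:
            return ref
    raise KeyError(local_name)


# One device whose interrupts are handled by two different controllers.
uart_irqs = [
    ResourceRef("rx data", "/soc/irqchip", 5, (1,)),
    ResourceRef("tx data", "/soc/irqchip", 6, (1,)),
    ResourceRef("error", "/board/gpio-expander", 2, (0, 4)),
]

err = find_ref(uart_irqs, "error")
print(err.provider, err.identifier, err.extra)
```

Because the identifier and extra attributes are meaningful only to the provider, two controllers can reuse the same numbers without clashing.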

> > - an irqchip needing to pass local hwirq number plus flags (polarity,
> > type of interrupt) that the consumer driver doesn't know.
> > - dmaengine requiring request number plus dma master identifier
> > - gpio passing some settings (bank-nr, polarity, drive strength, ...)
> > - pwm period and flags
>
> I don't see a problem with most of these. You can have hierarchical
> information at any given path in the config string, and drivers attach
> at the point in the hierarchy where they assume responsibility.

Ok.

> > Note that you should always be able to describe a full system
> > with a flattened device tree, even if it started out using some
> > other format (atags, fex, bios, ...), since the DT is basically
> > a superset of information that you can have elsewhere.
>
> Right. Except when you translate between formats things might end up
> either missing or in an unexpected location.

I meant a hand-written dts file translated into a dtb here, not
an automatic conversion.

> > There is already an ongoing unification with the "device properties"
> > interface that allows you to query generic properties from a
> > 'struct device' in Linux, regardless of whether it was instantiated
> > from architecture code (on legacy machines), using devicetree
> > (on most modern embedded systems) or from ACPI (on embedded x86),
> > since all of them have a way to attach numbers or strings to named
> > keys.
>
> If we stick with the config string, I will definitely explore adding
> the config-string meta-data to this interface.

ok.

> > The same thing can of course be used for querying config strings
> > that are embedded in a RISC-V SoC, but it quickly becomes awkward
> > when you need to have cross-references to other nodes, as those
> > cannot easily be generalised.
>
> This is one of my main concerns. It's true that device tree has a lot
> of cruft, but it's also a widely used spec, so some of what looks
> overly complex actually solves a problem you didn't know existed.
> Case in point: cross references. This is something the config-string
> does not have, although I am sure its advocates would say: "Nothing
> prevents you from including a string that refers to another path in
> the hierarchy". However, unless that cross-reference format has been
> specified somewhere, you won't get the sort of reuse that makes a
> general-purpose parser and convenience functions available.

Right. In DT, the cross-references are what makes things work, since
originally the assumption was that a system could be represented
as a tree. This was correct 32 years ago, but no longer suffices
to describe any modern machine. A lot of nodes are slaves to multiple
buses (i2c, mmio, ...), take references from nodes elsewhere in
the tree (gpio, irq, dmarq, ...) and are masters on other buses
that may be further up in the hierarchy.

> > The device tree is used for several things beyond that today, each
> > of which needs a solution:
> >
> > - passing configuration from the bootloader, including amount of
> > installed memory, root console settings and the kernel command
> > line. These are often handled in some architecture specific
> > way and I assume there is something in place already
>
> Currently you get this information from a combination of SBI
> (essentially the BIOS) calls and the config string. We were planning
> to reduce the number of SBI calls to the bare minimum needed to hand
> off the config string, and then put all the information you need in
> there.

Hmm, that sounds dangerous, as it repeats what has been criticized
a lot for DT, that it mixes up runtime configuration with
hardware description too much. If you continue down that path,
you may end up with a jumble of data similar to what is in DT,
just incompatible.

I think ideally a new hardware description format would be done
in a way that it only describes what is within the SoC, and
then you generate that data along with the HDL in a way that makes
it impossible to diverge from what is actually present in
the hardware.

The way Xilinx handles it for their FPGAs (if I understand
it correctly) is that they generate a .dtsi file along with
the hardware model and have all the data in that file synchronized
with the hardware that is used, and a board design would include
that .dtsi file in its .dts file to get a full system description.
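
The general idea of generating the description from the same source as the hardware can be sketched in a few lines of Python. This is not how the Xilinx tools work internally, which I have not checked; the function `emit_dtsi`, the peripheral table and the "vendor,uart" and "vendor,gpio" compatible strings are invented, and a real .dtsi would also need properties such as "#address-cells" that are left out here.

```python
from typing import Dict, List


def emit_dtsi(soc_name: str, peripherals: List[Dict]) -> str:
    """Render a peripheral table as devicetree source text."""
    lines = [f"/* generated for {soc_name}, do not edit */", "/ {", "\tsoc {"]
    for p in peripherals:
        lines.append(f"\t\t{p['name']}@{p['base']:x} {{")
        lines.append(f"\t\t\tcompatible = \"{p['compatible']}\";")
        lines.append(f"\t\t\treg = <0x{p['base']:x} 0x{p['size']:x}>;")
        lines.append("\t\t};")
    lines += ["\t};", "};"]
    return "\n".join(lines) + "\n"


# The same table would also drive address decoding in the hardware generator,
# so the description cannot drift away from what was actually built.
peripherals = [
    {"name": "uart", "base": 0x10000000, "size": 0x100, "compatible": "vendor,uart"},
    {"name": "gpio", "base": 0x10001000, "size": 0x1000, "compatible": "vendor,gpio"},
]

print(emit_dtsi("demo-soc", peripherals))
```

A board-level .dts would then include the generated file and add only the off-chip devices.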

> > If the config strings can be embedded in each on-chip device
> > in a way that they are always accessible without the need
> > for a boot loader interface in the way that PCI is discoverable,
> > that means we don't need DT for this and can just have a
> > single DT node for the SoC itself, and make that present itself
> > as a combined irqchip/gpiochip/pwm/led/clk/reset/mmio/...
> > provider.
>
> Currently, the idea was to bake the config string for the SoC into the
> boot ROM for that SoC. That means one ROM for the whole SoC, not one
> ROM per device.

Ok, that should work just as well, and makes it easier to reuse
existing IP blocks that don't follow a particular scheme of
where to store the data.

> > - describing off-chip non-discoverable buses, there really
> > isn't much of an alternative to DT, as these can be rather
> > complex, each device can both be a consumer and a provider
> > of resources (e.g. gpios or regulators controlled over i2c),
> > and we don't want to describe individual machines in source
> > code any more. Embedded x86 machines with ACPI can get around
> > some of these problems by sticking data that is compatible
> > with the DT format into ACPI tables.
>
> This is a definite gap in the current scheme. Probably the best answer
> with the current system is that the boot loader for that board would
> inject additional data describing the i2c components on the board into
> the config string. However, DT already has a scheme for overlaying
> data like this, and IMO it seems rather foolish to reinvent the wheel
> with the config string.

Right, and it's related to what I think is the biggest question:

How much do you actually want to describe in the config strings?

I can immediately see the use for describing on-chip components,
and that would be manageable: you just need a couple of
subsystem-specific handlers for describing clocks, regulators,
irqchips, dma, mmio etc. Inside of the chip you can probably make some
simplifications by assuming there is only one type of each of
those, or by enforcing some structure within the subsystem that
is not valid across other architectures.

Doing system-level configuration requires more stuff, but it's
mostly architecture specific.

Describing off-chip devices however would need another huge set
of subsystem specific changes in pinmux, gpio, led, pwm, chipselect,
i2c, spi, pci, usb, ...

Once you try to do all of the above, you will have the same
amount of complexity that we have in DT, and have annoyed all
subsystem maintainers that don't care the least about the
architecture but now have to worry about yet another incompatible
representation. Doing the initial DT support for each subsystem
met significant resistance from subsystem maintainers because
of this, for embedded ACPI it got worse, and adding a third way
to describe the same things will be even harder, no matter whether
that method is actually nicer or not.

Arnd