Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[PATCH 2/3] OF: Introduce DT overlay support.

183 views
Skip to first unread message

Pantelis Antoniou

unread,
Nov 5, 2013, 1:50:01 PM11/5/13
to
Introduce DT overlay support.
Using this functionality it is possible to dynamically overlay a part of
the kernel's tree with another tree that's been dynamically loaded.
It is also possible to remove node and properties.

Note that the I2C client devices are 'special', as in they're not platform
devices. They need to be registered with an I2C specific method.

In general I2C is just no good platform device citizen and needs
special casing.

Signed-off-by: Pantelis Antoniou <pa...@antoniou-consulting.com>
---
Documentation/devicetree/overlay-notes.txt | 179 ++++++
drivers/of/Kconfig | 10 +
drivers/of/Makefile | 1 +
drivers/of/overlay.c | 869 +++++++++++++++++++++++++++++
include/linux/of.h | 113 ++++
5 files changed, 1172 insertions(+)
create mode 100644 Documentation/devicetree/overlay-notes.txt
create mode 100644 drivers/of/overlay.c

diff --git a/Documentation/devicetree/overlay-notes.txt b/Documentation/devicetree/overlay-notes.txt
new file mode 100644
index 0000000..5289cbb
--- /dev/null
+++ b/Documentation/devicetree/overlay-notes.txt
@@ -0,0 +1,179 @@
+Device Tree Overlay Notes
+-------------------------
+
+This document describes the implementation of the in-kernel
+device tree overlay functionality residing in drivers/of/overlay.c and is a
+companion document to Documentation/devicetree/dt-object-internal.txt[1] &
+Documentation/devicetree/dynamic-resolution-notes.txt[2]
+
+How overlays work
+-----------------
+
+A Device Tree's overlay purpose is to modify the kernel's live tree, and
+have the modification affecting the state of the the kernel in a way that
+is reflecting the changes.
+Since the kernel mainly deals with devices, any new device node that result
+in an active device should have it created while if the device node is either
+disabled or removed all together, the affected device should be deregistered.
+
+Lets take an example where we have a foo board with the following base tree
+which is taken from [1].
+
+---- foo.dts -----------------------------------------------------------------
+ /* FOO platform */
+ / {
+ compatible = "corp,foo";
+
+ /* shared resources */
+ res: res {
+ };
+
+ /* On chip peripherals */
+ ocp: ocp {
+ /* peripherals that are always instantiated */
+ peripheral1 { ... };
+ }
+ };
+---- foo.dts -----------------------------------------------------------------
+
+The overlay bar.dts, when loaded (and resolved as described in [2]) should
+
+---- bar.dts -----------------------------------------------------------------
+/plugin/; /* allow undefined label references and record them */
+/ {
+ .... /* various properties for loader use; i.e. part id etc. */
+ fragment@0 {
+ target = <&ocp>;
+ __overlay__ {
+ /* bar peripheral */
+ bar {
+ compatible = "corp,bar";
+ ... /* various properties and child nodes */
+ }
+ };
+ };
+};
+---- bar.dts -----------------------------------------------------------------
+
+result in foo+bar.dts
+
+---- foo+bar.dts -------------------------------------------------------------
+ /* FOO platform + bar peripheral */
+ / {
+ compatible = "corp,foo";
+
+ /* shared resources */
+ res: res {
+ };
+
+ /* On chip peripherals */
+ ocp: ocp {
+ /* peripherals that are always instantiated */
+ peripheral1 { ... };
+
+ /* bar peripheral */
+ bar {
+ compatible = "corp,bar";
+ ... /* various properties and child nodes */
+ }
+ }
+ };
+---- foo+bar.dts -------------------------------------------------------------
+
+As a result of the the overlay, a new device node (bar) has been created
+so a bar platform device will be registered and if a matching device driver
+is loaded the device will be created as expected.
+
+Overlay in-kernel API
+---------------------
+
+The steps typically required to get an overlay to work are as follows:
+
+1. Use of_build_overlay_info() to create an array of initialized and
+ready to use of_overlay_info structures.
+2. Call of_overlay() to apply the overlays declared in the array.
+3. If the overlay needs to be removed, call of_overlay_revert().
+4. Finally release the memory taken by the overlay info array by
+of_free_overlay_info().
+
+/**
+ * of_build_overlay_info - Build an overlay info array
+ * @tree: Device node containing all the overlays
+ * @cntp: Pointer to where the overlay info count will be help
+ * @ovinfop: Pointer to the pointer of an overlay info structure.
+ *
+ * Helper function that given a tree containing overlay information,
+ * allocates and builds an overlay info array containing it, ready
+ * for use using of_overlay.
+ *
+ * Returns 0 on success with the @cntp @ovinfop pointers valid,
+ * while on error a negative error value is returned.
+ */
+int of_build_overlay_info(struct device_node *tree,
+ int *cntp, struct of_overlay_info **ovinfop);
+
+/**
+ * of_free_overlay_info - Free an overlay info array
+ * @count: Number of of_overlay_info's
+ * @ovinfo_tab: Array of overlay_info's to free
+ *
+ * Releases the memory of a previously allocate ovinfo array
+ * by of_build_overlay_info.
+ * Returns 0, or an error if the arguments are bogus.
+ */
+int of_free_overlay_info(int count, struct of_overlay_info *ovinfo_tab);
+
+/**
+ * of_overlay - Apply @count overlays pointed at by @ovinfo_tab
+ * @count: Number of of_overlay_info's
+ * @ovinfo_tab: Array of overlay_info's to apply
+ *
+ * Applies the overlays given, while handling all error conditions
+ * appropriately. Either the operation succeeds, or if it fails the
+ * live tree is reverted to the state before the attempt.
+ * Returns 0, or an error if the overlay attempt failed.
+ */
+int of_overlay(int count, struct of_overlay_info *ovinfo_tab);
+
+/**
+ * of_overlay_revert - Revert a previously applied overlay
+ * @count: Number of of_overlay_info's
+ * @ovinfo_tab: Array of overlay_info's to apply
+ *
+ * Revert a previous overlay. The state of the live tree
+ * is reverted to the one before the overlay.
+ * Returns 0, or an error if the overlay table is not given.
+ */
+int of_overlay_revert(int count, struct of_overlay_info *ovinfo_tab);
+
+Overlay DTS Format
+------------------
+
+The DTS of an overlay should have the following format:
+
+{
+ /* ignored properties by the overlay */
+
+ fragment@0 { /* first child node */
+ target=<phandle>; /* target of the overlay */
+ __overlay__ {
+ property-a; /* add property-a to the target */
+ -property-b; /* remove property-b from target */
+ node-a { /* add to an existing, or create a node-a */
+ ...
+ };
+ -node-b { /* remove an existing node-b */
+ ...
+ };
+ };
+ }
+ fragment@1 { /* second child node */
+ ...
+ };
+ /* more fragments follow */
+}
+
+It should be noted that the DT overlay format described is the one expected
+by the of_build_overlay_info() function, which is a helper function. There
+is nothing stopping someone coming up with his own DTS format and that will
+end up filling in the fields of the of_overlay_info array.
diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig
index 2a00ae5..c2d5596 100644
--- a/drivers/of/Kconfig
+++ b/drivers/of/Kconfig
@@ -83,4 +83,14 @@ config OF_RESOLVE
Enable OF dynamic resolution support. This allows you to
load Device Tree object fragments are run time.

+config OF_OVERLAY
+ bool "OF overlay support"
+ depends on OF
+ select OF_DYNAMIC
+ select OF_DEVICE
+ select OF_RESOLVE
+ help
+ OpenFirmware overlay support. Allows you to modify on runtime the
+ live tree using overlays.
+
endmenu # OF
diff --git a/drivers/of/Makefile b/drivers/of/Makefile
index 93da457..ca466e4 100644
--- a/drivers/of/Makefile
+++ b/drivers/of/Makefile
@@ -10,3 +10,4 @@ obj-$(CONFIG_OF_PCI) += of_pci.o
obj-$(CONFIG_OF_PCI_IRQ) += of_pci_irq.o
obj-$(CONFIG_OF_MTD) += of_mtd.o
obj-$(CONFIG_OF_RESOLVE) += resolver.o
+obj-$(CONFIG_OF_OVERLAY) += overlay.o
diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
new file mode 100644
index 0000000..5bed1f4
--- /dev/null
+++ b/drivers/of/overlay.c
@@ -0,0 +1,869 @@
+/*
+ * Functions for working with device tree overlays
+ *
+ * Copyright (C) 2012 Pantelis Antoniou <pa...@antoniou-consulting.com>
+ * Copyright (C) 2012 Texas Instruments Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ */
+#undef DEBUG
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/i2c.h>
+#include <linux/string.h>
+#include <linux/ctype.h>
+#include <linux/errno.h>
+#include <linux/string.h>
+#include <linux/slab.h>
+#include <linux/err.h>
+
+/**
+ * Apply a single overlay node recursively.
+ *
+ * Property or node names that start with '-' signal that
+ * the property/node is to be removed.
+ *
+ * All the property notifiers are appropriately called.
+ * Note that the in case of an error the target node is left
+ * in a inconsistent state. Error recovery should be performed
+ * by recording the modification using the of notifiers.
+ */
+static int of_overlay_apply_one(struct device_node *target,
+ const struct device_node *overlay)
+{
+ const char *pname, *cname;
+ struct device_node *child, *tchild;
+ struct property *prop, *propn, *tprop;
+ int remove;
+ char *full_name;
+ const char *suffix;
+ int ret;
+
+ /* sanity checks */
+ if (target == NULL || overlay == NULL)
+ return -EINVAL;
+
+ for_each_property_of_node(overlay, prop) {
+
+ /* don't touch, 'name' */
+ if (of_prop_cmp(prop->name, "name") == 0)
+ continue;
+
+ /* default is add */
+ remove = 0;
+ pname = prop->name;
+ if (*pname == '-') { /* skip, - notes removal */
+ pname++;
+ remove = 1;
+ propn = NULL;
+ } else {
+ propn = __of_copy_property(prop,
+ GFP_KERNEL);
+ if (propn == NULL)
+ return -ENOMEM;
+ }
+
+ tprop = of_find_property(target, pname, NULL);
+
+ /* found? */
+ if (tprop != NULL) {
+ if (propn != NULL)
+ ret = of_update_property(target, propn);
+ else
+ ret = of_remove_property(target, tprop);
+ } else {
+ if (propn != NULL)
+ ret = of_add_property(target, propn);
+ else
+ ret = 0;
+ }
+ if (ret != 0)
+ return ret;
+ }
+
+ __for_each_child_of_node(overlay, child) {
+
+ /* default is add */
+ remove = 0;
+ cname = child->name;
+ if (*cname == '-') { /* skip, - notes removal */
+ cname++;
+ remove = 1;
+ }
+
+ /* special case for nodes with a suffix */
+ suffix = strrchr(child->full_name, '@');
+ if (suffix != NULL) {
+ cname = kbasename(child->full_name);
+ WARN_ON(cname == NULL); /* sanity check */
+ if (cname == NULL)
+ continue;
+ if (*cname == '-')
+ cname++;
+ }
+
+ tchild = of_get_child_by_name(target, cname);
+ if (tchild != NULL) {
+
+ if (!remove) {
+
+ /* apply overlay recursively */
+ ret = of_overlay_apply_one(tchild, child);
+ of_node_put(tchild);
+
+ if (ret != 0)
+ return ret;
+
+ } else {
+
+ ret = of_detach_node(tchild);
+ of_node_put(tchild);
+ }
+
+ } else {
+
+ if (!remove) {
+ full_name = kasprintf(GFP_KERNEL, "%s/%s",
+ target->full_name, cname);
+ if (full_name == NULL)
+ return -ENOMEM;
+
+ /* create empty tree as a target */
+ tchild = __of_create_empty_node(cname,
+ child->type, full_name,
+ child->phandle, GFP_KERNEL);
+
+ /* free either way */
+ kfree(full_name);
+
+ if (tchild == NULL)
+ return -ENOMEM;
+
+ /* point to parent */
+ tchild->parent = target;
+
+ ret = of_attach_node(tchild);
+ if (ret != 0)
+ return ret;
+
+ /* apply the overlay */
+ ret = of_overlay_apply_one(tchild, child);
+ if (ret != 0) {
+ __of_free_tree(tchild);
+ return ret;
+ }
+ }
+ }
+ }
+
+ return 0;
+}
+
+/**
+ * Lookup an overlay device entry
+ */
+struct of_overlay_device_entry *of_overlay_device_entry_lookup(
+ struct of_overlay_info *ovinfo, struct device_node *node)
+{
+ struct of_overlay_device_entry *de;
+
+ /* no need for locks, we'de under the ovinfo->lock */
+ list_for_each_entry(de, &ovinfo->de_list, node) {
+ if (de->np == node)
+ return de;
+ }
+ return NULL;
+}
+
+/**
+ * Add an overlay log entry
+ */
+static int of_overlay_log_entry_entry_add(struct of_overlay_info *ovinfo,
+ unsigned long action, struct device_node *dn,
+ struct property *prop)
+{
+ struct of_overlay_log_entry *le;
+
+ /* check */
+ if (ovinfo == NULL || dn == NULL)
+ return -EINVAL;
+
+ le = kzalloc(sizeof(*le), GFP_KERNEL);
+ if (le == NULL) {
+ pr_err("%s: Failed to allocate\n", __func__);
+ return -ENOMEM;
+ }
+
+ /* get a reference to the node */
+ le->action = action;
+ le->np = of_node_get(dn);
+ le->prop = prop;
+
+ if (action == OF_RECONFIG_UPDATE_PROPERTY && prop)
+ le->old_prop = of_find_property(dn, prop->name, NULL);
+
+ list_add_tail(&le->node, &ovinfo->le_list);
+
+ return 0;
+}
+
+/**
+ * Add an overlay device entry
+ */
+static void of_overlay_device_entry_entry_add(struct of_overlay_info *ovinfo,
+ struct device_node *node,
+ struct platform_device *pdev, struct i2c_client *client,
+ int prevstate, int state)
+{
+ struct of_overlay_device_entry *de;
+ int fresh;
+
+ /* check */
+ if (ovinfo == NULL)
+ return;
+
+ fresh = 0;
+ de = of_overlay_device_entry_lookup(ovinfo, node);
+ if (de == NULL) {
+ de = kzalloc(sizeof(*de), GFP_KERNEL);
+ if (de == NULL) {
+ pr_err("%s: Failed to allocate\n", __func__);
+ return;
+ }
+ fresh = 1;
+ de->prevstate = -1;
+ }
+
+ if (de->np == NULL)
+ de->np = of_node_get(node);
+ if (de->pdev == NULL && pdev)
+ de->pdev = of_dev_get(pdev);
+ if (de->client == NULL && client)
+ de->client = i2c_use_client(client);
+ if (fresh)
+ de->prevstate = prevstate;
+ de->state = state;
+
+ if (fresh)
+ list_add_tail(&de->node, &ovinfo->de_list);
+}
+
+/**
+ * Overlay OF notifier
+ *
+ * Called every time there's a property/node modification
+ * Every modification causes a log entry addition, while
+ * any modification that causes a node's state to change
+ * from/to disabled to/from enabled causes a device entry
+ * addition.
+ */
+static int of_overlay_notify(struct notifier_block *nb,
+ unsigned long action, void *arg)
+{
+ struct of_overlay_info *ovinfo;
+ struct device_node *node;
+ struct property *prop, *sprop, *cprop;
+ struct of_prop_reconfig *pr;
+ struct platform_device *pdev;
+ struct i2c_client *client;
+ struct device_node *tnode;
+ int depth;
+ int prevstate, state;
+ int err = 0;
+
+ ovinfo = container_of(nb, struct of_overlay_info, notifier);
+
+ /* prep vars */
+ switch (action) {
+ case OF_RECONFIG_ATTACH_NODE:
+ case OF_RECONFIG_DETACH_NODE:
+ node = arg;
+ if (node == NULL)
+ return notifier_from_errno(-EINVAL);
+ prop = NULL;
+ break;
+ case OF_RECONFIG_ADD_PROPERTY:
+ case OF_RECONFIG_REMOVE_PROPERTY:
+ case OF_RECONFIG_UPDATE_PROPERTY:
+ pr = arg;
+ if (pr == NULL)
+ return notifier_from_errno(-EINVAL);
+ node = pr->dn;
+ if (node == NULL)
+ return notifier_from_errno(-EINVAL);
+ prop = pr->prop;
+ if (prop == NULL)
+ return notifier_from_errno(-EINVAL);
+ break;
+ default:
+ return notifier_from_errno(0);
+ }
+
+ /* add to the log */
+ err = of_overlay_log_entry_entry_add(ovinfo, action, node, prop);
+ if (err != 0)
+ return notifier_from_errno(err);
+
+ /* come up with the device entry (if any) */
+ pdev = NULL;
+ client = NULL;
+ state = 0;
+ prevstate = 0;
+
+ /* determine the state the node will end up */
+ switch (action) {
+ case OF_RECONFIG_ATTACH_NODE:
+ /* we demand that a compatible node is present */
+ state = of_find_property(node, "compatible", NULL) &&
+ of_device_is_available(node);
+ break;
+ case OF_RECONFIG_DETACH_NODE:
+ prevstate = of_find_property(node, "compatible", NULL) &&
+ of_device_is_available(node);
+ state = 0;
+ pdev = of_find_device_by_node(node);
+ client = of_find_i2c_device_by_node(node);
+ break;
+ case OF_RECONFIG_ADD_PROPERTY:
+ case OF_RECONFIG_REMOVE_PROPERTY:
+ case OF_RECONFIG_UPDATE_PROPERTY:
+ /* either one cause a change in state */
+ if (strcmp(prop->name, "status") != 0 &&
+ strcmp(prop->name, "compatible") != 0)
+ return notifier_from_errno(0);
+
+ if (strcmp(prop->name, "status") == 0) {
+ /* status */
+ cprop = of_find_property(node, "compatible", NULL);
+ sprop = action != OF_RECONFIG_REMOVE_PROPERTY ?
+ prop : NULL;
+ } else {
+ /* compatible */
+ sprop = of_find_property(node, "status", NULL);
+ cprop = action != OF_RECONFIG_REMOVE_PROPERTY ?
+ prop : NULL;
+ }
+
+ prevstate = of_find_property(node, "compatible", NULL) &&
+ of_device_is_available(node);
+ state = cprop && cprop->length > 0 &&
+ (!sprop || (sprop->length > 0 &&
+ (strcmp(sprop->value, "okay") == 0 ||
+ strcmp(sprop->value, "ok") == 0)));
+ break;
+
+ default:
+ return notifier_from_errno(0);
+ }
+
+ /* find depth */
+ depth = 1;
+ tnode = node;
+ while (tnode != NULL && tnode != ovinfo->target) {
+ tnode = tnode->parent;
+ depth++;
+ }
+
+ /* respect overlay's maximum depth */
+ if (ovinfo->device_depth != 0 && depth > ovinfo->device_depth) {
+ pr_debug("OF: skipping device creation for node=%s depth=%d\n",
+ node->name, depth);
+ goto out;
+ }
+
+ if (state == 0) {
+ pdev = of_find_device_by_node(node);
+ client = of_find_i2c_device_by_node(node);
+ }
+
+ of_overlay_device_entry_entry_add(ovinfo, node, pdev, client,
+ prevstate, state);
+out:
+
+ return notifier_from_errno(err);
+}
+
+/**
+ * Prepare for the overlay, for now it just registers the
+ * notifier.
+ */
+static int of_overlay_prep_one(struct of_overlay_info *ovinfo)
+{
+ int err;
+
+ err = of_reconfig_notifier_register(&ovinfo->notifier);
+ if (err != 0) {
+ pr_err("%s: failed to register notifier for '%s'\n",
+ __func__, ovinfo->target->full_name);
+ return err;
+ }
+ return 0;
+}
+
+static int of_overlay_device_entry_change(struct of_overlay_info *ovinfo,
+ struct of_overlay_device_entry *de, int revert)
+{
+ struct i2c_adapter *adap = NULL;
+ struct i2c_client *client;
+ struct platform_device *pdev, *parent_pdev = NULL;
+ int state;
+
+ state = !!de->state ^ !!revert;
+
+ if (de->np && de->np->parent) {
+ pr_debug("%s: parent is %s\n",
+ __func__, de->np->parent->full_name);
+ adap = of_find_i2c_adapter_by_node(de->np->parent);
+ if (adap == NULL)
+ parent_pdev = of_find_device_by_node(de->np->parent);
+ }
+
+ if (state) {
+
+ /* try to see if it's an I2C client node */
+ if (adap == NULL) {
+
+ pr_debug("%s: creating new platform device "
+ "new_node='%s' %p\n",
+ __func__, de->np->full_name, de->np);
+
+ pdev = of_platform_device_create(de->np, NULL,
+ parent_pdev ? &parent_pdev->dev : NULL);
+ if (pdev == NULL) {
+ pr_warn("%s: Failed to create platform device "
+ "for '%s'\n",
+ __func__, de->np->full_name);
+ } else
+ de->pdev = pdev;
+
+ } else {
+
+ client = of_find_i2c_device_by_node(de->np);
+ if (client != NULL) {
+ /* bus already created the device; do nothing */
+ put_device(&client->dev);
+ } else {
+ pr_debug("%s: creating new i2c_client device "
+ "new_node='%s' %p\n",
+ __func__, de->np->full_name, de->np);
+
+ client = of_i2c_register_device(adap, de->np);
+
+ if (client == NULL) {
+ pr_warn("%s: Failed to create i2c client device "
+ "for '%s'\n",
+ __func__, de->np->full_name);
+ } else
+ de->client = client;
+ }
+ }
+
+ } else {
+
+ if (de->pdev) {
+ pr_debug("%s: removing pdev %s\n", __func__,
+ dev_name(&de->pdev->dev));
+ platform_device_unregister(de->pdev);
+ de->pdev = NULL;
+ }
+
+ if (de->client) {
+ pr_debug("%s: removing i2c client %s\n", __func__,
+ dev_name(&de->client->dev));
+ i2c_unregister_device(de->client);
+ de->client = NULL;
+ }
+ }
+
+ if (adap)
+ put_device(&adap->dev);
+
+ if (parent_pdev)
+ of_dev_put(parent_pdev);
+
+ return 0;
+}
+
+/**
+ * Revert one overlay
+ * Either due to an error, or due to normal overlay removal.
+ * Using the log entries, we revert any change to the live tree.
+ * In the same manner, using the device entries we enable/disable
+ * the platform devices appropriately.
+ */
+static void of_overlay_revert_one(struct of_overlay_info *ovinfo)
+{
+ struct of_overlay_device_entry *de, *den;
+ struct of_overlay_log_entry *le, *len;
+ struct property *prop, **propp;
+ int ret;
+ unsigned long flags;
+
+ if (!ovinfo || !ovinfo->target || !ovinfo->overlay)
+ return;
+
+ pr_debug("%s: Reverting overlay on '%s'\n", __func__,
+ ovinfo->target->full_name);
+
+ /* overlay applied correctly, now create/destroy pdevs */
+ list_for_each_entry_safe_reverse(de, den, &ovinfo->de_list, node) {
+ of_overlay_device_entry_change(ovinfo, de, 1);
+ of_node_put(de->np);
+ list_del(&de->node);
+ kfree(de);
+ }
+
+ list_for_each_entry_safe_reverse(le, len, &ovinfo->le_list, node) {
+
+ ret = 0;
+ switch (le->action) {
+ case OF_RECONFIG_ATTACH_NODE:
+ pr_debug("Reverting ATTACH_NODE %s\n",
+ le->np->full_name);
+ ret = of_detach_node(le->np);
+ break;
+
+ case OF_RECONFIG_DETACH_NODE:
+ pr_debug("Reverting DETACH_NODE %s\n",
+ le->np->full_name);
+ ret = of_attach_node(le->np);
+ break;
+
+ case OF_RECONFIG_ADD_PROPERTY:
+ pr_debug("Reverting ADD_PROPERTY %s %s\n",
+ le->np->full_name, le->prop->name);
+ ret = of_remove_property(le->np, le->prop);
+ break;
+
+ case OF_RECONFIG_REMOVE_PROPERTY:
+ case OF_RECONFIG_UPDATE_PROPERTY:
+
+ pr_debug("Reverting %s_PROPERTY %s %s\n",
+ le->action == OF_RECONFIG_REMOVE_PROPERTY ?
+ "REMOVE" : "UPDATE",
+ le->np->full_name, le->prop->name);
+
+ /* property is possibly on deadprops (avoid alloc) */
+ raw_spin_lock_irqsave(&devtree_lock, flags);
+ prop = le->action == OF_RECONFIG_REMOVE_PROPERTY ?
+ le->prop : le->old_prop;
+ propp = &le->np->deadprops;
+ while (*propp != NULL) {
+ if (*propp == prop)
+ break;
+ propp = &(*propp)->next;
+ }
+ if (*propp != NULL) {
+ /* remove it from deadprops */
+ (*propp)->next = prop->next;
+ raw_spin_unlock_irqrestore(&devtree_lock,
+ flags);
+ } else {
+ raw_spin_unlock_irqrestore(&devtree_lock,
+ flags);
+ /* not found, just make a copy */
+ prop = __of_copy_property(prop, GFP_KERNEL);
+ if (prop == NULL) {
+ pr_err("%s: Failed to copy property\n",
+ __func__);
+ break;
+ }
+ }
+
+ if (le->action == OF_RECONFIG_REMOVE_PROPERTY)
+ ret = of_add_property(le->np, prop);
+ else
+ ret = of_update_property(le->np, prop);
+ break;
+
+ default:
+ /* nothing */
+ break;
+ }
+
+ if (ret != 0)
+ pr_err("%s: revert on node %s Failed!\n",
+ __func__, le->np->full_name);
+
+ of_node_put(le->np);
+
+ list_del(&le->node);
+
+ kfree(le);
+ }
+}
+
+/**
+ * Perform the post overlay work.
+ *
+ * We unregister the notifier, and in the case on an error we
+ * revert the overlay.
+ * If the overlay applied correctly, we iterare over the device entries
+ * and create/destroy the platform devices appropriately.
+ */
+static int of_overlay_post_one(struct of_overlay_info *ovinfo, int err)
+{
+ struct of_overlay_device_entry *de, *den;
+
+ of_reconfig_notifier_unregister(&ovinfo->notifier);
+
+ if (err != 0) {
+ /* revert this (possible partially applied) overlay */
+ of_overlay_revert_one(ovinfo);
+ return 0;
+ }
+
+ /* overlay applied correctly, now create/destroy pdevs */
+ list_for_each_entry_safe(de, den, &ovinfo->de_list, node) {
+
+ /* no state change? just remove this entry */
+ if (de->prevstate == de->state) {
+ of_node_put(de->np);
+ list_del(&de->node);
+ kfree(de);
+ } else {
+ of_overlay_device_entry_change(ovinfo, de, 0);
+ }
+ }
+
+ return 0;
+}
+
+/**
+ * of_overlay - Apply @count overlays pointed at by @ovinfo_tab
+ * @count: Number of of_overlay_info's
+ * @ovinfo_tab: Array of overlay_info's to apply
+ *
+ * Applies the overlays given, while handling all error conditions
+ * appropriately. Either the operation succeeds, or if it fails the
+ * live tree is reverted to the state before the attempt.
+ * Returns 0, or an error if the overlay attempt failed.
+ */
+int of_overlay(int count, struct of_overlay_info *ovinfo_tab)
+{
+ struct of_overlay_info *ovinfo;
+ int i, err;
+
+ if (!ovinfo_tab)
+ return -EINVAL;
+
+ /* first we apply the overlays atomically */
+ for (i = 0; i < count; i++) {
+
+ ovinfo = &ovinfo_tab[i];
+
+ mutex_lock(&ovinfo->lock);
+
+ err = of_overlay_prep_one(ovinfo);
+ if (err == 0)
+ err = of_overlay_apply_one(ovinfo->target,
+ ovinfo->overlay);
+ of_overlay_post_one(ovinfo, err);
+
+ mutex_unlock(&ovinfo->lock);
+
+ if (err != 0) {
+ pr_err("%s: overlay failed '%s'\n",
+ __func__, ovinfo->target->full_name);
+ goto err_fail;
+ }
+ }
+
+ return 0;
+
+err_fail:
+ while (--i >= 0) {
+ ovinfo = &ovinfo_tab[i];
+
+ mutex_lock(&ovinfo->lock);
+ of_overlay_revert_one(ovinfo);
+ mutex_unlock(&ovinfo->lock);
+ }
+
+ return err;
+}
+
+/**
+ * of_overlay_revert - Revert a previously applied overlay
+ * @count: Number of of_overlay_info's
+ * @ovinfo_tab: Array of overlay_info's to apply
+ *
+ * Revert a previous overlay. The state of the live tree
+ * is reverted to the one before the overlay.
+ * Returns 0, or an error if the overlay table is not given.
+ */
+int of_overlay_revert(int count, struct of_overlay_info *ovinfo_tab)
+{
+ struct of_overlay_info *ovinfo;
+ int i;
+
+ if (!ovinfo_tab)
+ return -EINVAL;
+
+ /* revert the overlays in reverse */
+ for (i = count - 1; i >= 0; i--) {
+
+ ovinfo = &ovinfo_tab[i];
+
+ mutex_lock(&ovinfo->lock);
+ of_overlay_revert_one(ovinfo);
+ mutex_unlock(&ovinfo->lock);
+
+ }
+
+ return 0;
+}
+
+/**
+ * of_init_overlay_info - Initialize a single of_overlay_info structure
+ * @ovinfo: Pointer to the overlay info structure to initialize
+ *
+ * Initialize a single overlay info structure.
+ */
+void of_init_overlay_info(struct of_overlay_info *ovinfo)
+{
+ memset(ovinfo, 0, sizeof(ovinfo));
+ mutex_init(&ovinfo->lock);
+ INIT_LIST_HEAD(&ovinfo->de_list);
+ INIT_LIST_HEAD(&ovinfo->le_list);
+
+ ovinfo->notifier.notifier_call = of_overlay_notify;
+}
+
+/**
+ * of_fill_overlay_info - Fill an overlay info structure
+ * @info_node: Device node containing the overlay
+ * @ovinfo: Pointer to the overlay info structure to fill
+ *
+ * Fills an overlay info structure with the overlay information
+ * from a device node. This device node must have a target property
+ * which contains a phandle of the overlay target node, and an
+ * __overlay__ child node which has the overlay contents.
+ * Both ovinfo->target & ovinfo->overlay have their references taken.
+ *
+ * Returns 0 on success, or a negative error value.
+ */
+int of_fill_overlay_info(struct device_node *info_node,
+ struct of_overlay_info *ovinfo)
+{
+ u32 val;
+ int ret;
+
+ if (!info_node || !ovinfo)
+ return -EINVAL;
+
+ ret = of_property_read_u32(info_node, "target", &val);
+ if (ret != 0)
+ goto err_fail;
+
+ ovinfo->target = of_find_node_by_phandle(val);
+ if (ovinfo->target == NULL)
+ goto err_fail;
+
+ ovinfo->overlay = of_get_child_by_name(info_node, "__overlay__");
+ if (ovinfo->overlay == NULL)
+ goto err_fail;
+
+ ret = of_property_read_u32(info_node, "depth", &val);
+ if (ret == 0)
+ ovinfo->device_depth = val;
+ else
+ ovinfo->device_depth = 0;
+
+
+ return 0;
+
+err_fail:
+ of_node_put(ovinfo->target);
+ of_node_put(ovinfo->overlay);
+
+ memset(ovinfo, 0, sizeof(*ovinfo));
+ return -EINVAL;
+}
+
+/**
+ * of_build_overlay_info - Build an overlay info array
+ * @tree: Device node containing all the overlays
+ * @cntp: Pointer to where the overlay info count will be help
+ * @ovinfop: Pointer to the pointer of an overlay info structure.
+ *
+ * Helper function that given a tree containing overlay information,
+ * allocates and builds an overlay info array containing it, ready
+ * for use using of_overlay.
+ *
+ * Returns 0 on success with the @cntp @ovinfop pointers valid,
+ * while on error a negative error value is returned.
+ */
+int of_build_overlay_info(struct device_node *tree,
+ int *cntp, struct of_overlay_info **ovinfop)
+{
+ struct device_node *node;
+ struct of_overlay_info *ovinfo;
+ int cnt, err;
+
+ if (tree == NULL || cntp == NULL || ovinfop == NULL)
+ return -EINVAL;
+
+ /* worst case; every child is a node */
+ cnt = 0;
+ for_each_child_of_node(tree, node)
+ cnt++;
+
+ ovinfo = kzalloc(cnt * sizeof(*ovinfo), GFP_KERNEL);
+ if (ovinfo == NULL)
+ return -ENOMEM;
+
+ cnt = 0;
+ for_each_child_of_node(tree, node) {
+
+ of_init_overlay_info(&ovinfo[cnt]);
+ err = of_fill_overlay_info(node, &ovinfo[cnt]);
+ if (err == 0)
+ cnt++;
+ }
+
+ /* if nothing filled, return error */
+ if (cnt == 0) {
+ kfree(ovinfo);
+ return -ENODEV;
+ }
+
+ *cntp = cnt;
+ *ovinfop = ovinfo;
+
+ return 0;
+}
+
+/**
+ * of_free_overlay_info - Free an overlay info array
+ * @count: Number of of_overlay_info's
+ * @ovinfo_tab: Array of overlay_info's to free
+ *
+ * Releases the memory of a previously allocate ovinfo array
+ * by of_build_overlay_info.
+ * Returns 0, or an error if the arguments are bogus.
+ */
+int of_free_overlay_info(int count, struct of_overlay_info *ovinfo_tab)
+{
+ struct of_overlay_info *ovinfo;
+ int i;
+
+ if (!ovinfo_tab || count < 0)
+ return -EINVAL;
+
+ /* do it in reverse */
+ for (i = count - 1; i >= 0; i--) {
+ ovinfo = &ovinfo_tab[i];
+
+ of_node_put(ovinfo->target);
+ of_node_put(ovinfo->overlay);
+ }
+ kfree(ovinfo_tab);
+
+ return 0;
+}
+
diff --git a/include/linux/of.h b/include/linux/of.h
index 22d42e5..d9c8130 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -23,6 +23,7 @@
#include <linux/spinlock.h>
#include <linux/topology.h>
#include <linux/notifier.h>
+#include <linux/i2c.h>

#include <asm/byteorder.h>
#include <asm/errno.h>
@@ -738,4 +739,116 @@ static inline int of_resolve(struct device_node *resolve)

#endif

+/**
+ * Overlay support
+ */
+
+/**
+ * struct of_overlay_log_entry - Holds a DT log entry
+ * @node: list_head for the log list
+ * @action: notifier action
+ * @np: pointer to the device node affected
+ * @prop: pointer to the property affected
+ * @old_prop: hold a pointer to the original property
+ *
+ * Every modification of the device tree during application of the
+ * overlay is held in a list of of_overlay_log_entry structures.
+ * That way we can recover from a partial application, or we can
+ * revert the overlay properly.
+ */
+struct of_overlay_log_entry {
+ struct list_head node;
+ unsigned long action;
+ struct device_node *np;
+ struct property *prop;
+ struct property *old_prop;
+};
+
+/**
+ * struct of_overlay_device_entry - Holds an overlay device entry
+ * @node: list_head for the device list
+ * @np: device node pointer to the device node affected
+ * @pdev: pointer to the platform device affected
+ * @client: pointer to the I2C client device affected
+ * @state: new device state
+ * @prevstate: previous device state
+ *
+ * When the overlay results in a device node's state to change this
+ * fact is recorded in a list of device entries. After the overlay
+ * is applied we can create/destroy the platform devices according
+ * to the new state of the live tree.
+ */
+struct of_overlay_device_entry {
+ struct list_head node;
+ struct device_node *np;
+ struct platform_device *pdev;
+ struct i2c_client *client;
+ int prevstate;
+ int state;
+};
+
+/**
+ * struct of_overlay_info - Holds a single overlay info
+ * @target: target of the overlay operation
+ * @overlay: pointer to the overlay contents node
+ * @lock: Lock to hold when accessing the lists
+ * @le_list: List of the overlay logs
+ * @de_list: List of the overlay records
+ * @notifier: of reconfiguration notifier
+ *
+ * Holds a single overlay state, including all the overlay logs &
+ * records.
+ */
+struct of_overlay_info {
+ struct device_node *target;
+ struct device_node *overlay;
+ struct mutex lock;
+ struct list_head le_list;
+ struct list_head de_list;
+ struct notifier_block notifier;
+ int device_depth;
+};
+
+#ifdef CONFIG_OF_OVERLAY
+
+int of_overlay(int count, struct of_overlay_info *ovinfo_tab);
+int of_overlay_revert(int count, struct of_overlay_info *ovinfo_tab);
+
+int of_fill_overlay_info(struct device_node *info_node,
+ struct of_overlay_info *ovinfo);
+int of_build_overlay_info(struct device_node *tree,
+ int *cntp, struct of_overlay_info **ovinfop);
+int of_free_overlay_info(int cnt, struct of_overlay_info *ovinfo);
+
+#else
+
+static inline int of_overlay(int count, struct of_overlay_info *ovinfo_tab)
+{
+ return -ENOTSUPP;
+}
+
+static inline int of_overlay_revert(int count, struct of_overlay_info *ovinfo_tab)
+{
+ return -ENOTSUPP;
+}
+
+static inline int of_fill_overlay_info(struct device_node *info_node,
+ struct of_overlay_info *ovinfo)
+{
+ return -ENOTSUPP;
+}
+
+static inline int of_build_overlay_info(struct device_node *tree,
+ int *cntp, struct of_overlay_info **ovinfop)
+{
+ return -ENOTSUPP;
+}
+
+int of_free_overlay_info(int cnt, struct of_overlay_info *ovinfo)
+{
+ return -ENOTSUPP;
+}
+
+#endif
+
#endif /* _LINUX_OF_H */
--
1.7.12

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Pantelis Antoniou

unread,
Nov 5, 2013, 1:50:02 PM11/5/13
to
Introduce support for dynamic device tree resolution.
Using it, it is possible to prepare a device tree that's
been loaded on runtime to be modified and inserted at the kernel
live tree.

Signed-off-by: Pantelis Antoniou <pa...@antoniou-consulting.com>
---
.../devicetree/dynamic-resolution-notes.txt | 25 ++
drivers/of/Kconfig | 9 +
drivers/of/Makefile | 1 +
drivers/of/resolver.c | 395 +++++++++++++++++++++
include/linux/of.h | 17 +
5 files changed, 447 insertions(+)
create mode 100644 Documentation/devicetree/dynamic-resolution-notes.txt
create mode 100644 drivers/of/resolver.c

diff --git a/Documentation/devicetree/dynamic-resolution-notes.txt b/Documentation/devicetree/dynamic-resolution-notes.txt
new file mode 100644
index 0000000..0b396c4
--- /dev/null
+++ b/Documentation/devicetree/dynamic-resolution-notes.txt
@@ -0,0 +1,25 @@
+Device Tree Dynamic Resolver Notes
+----------------------------------
+
+This document describes the implementation of the in-kernel
+Device Tree resolver, residing in drivers/of/resolver.c and is a
+companion document to Documentation/devicetree/dt-object-internal.txt[1]
+
+How the resolver works
+----------------------
+
+The resolver is given as an input an arbitrary tree compiled with the
+proper dtc option and having a /plugin/ tag. This generates the
+appropriate __fixups__ & __local_fixups__ nodes as described in [1].
+
+In sequence the resolver works by the following steps:
+
+1. Get the maximum device tree phandle value from the live tree + 1.
+2. Adjust all the local phandles of the tree to resolve by that amount.
+3. Using the __local__fixups__ node information adjust all local references
+ by the same amount.
+4. For each property in the __fixups__ node locate the node it references
+ in the live tree. This is the label used to tag the node.
+5. Retrieve the phandle of the target of the fixup.
+5. For each fixup in the property locate the node:property:offset location
+ and replace it with the phandle value.
diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig
index 78cc760..2a00ae5 100644
--- a/drivers/of/Kconfig
+++ b/drivers/of/Kconfig
@@ -74,4 +74,13 @@ config OF_MTD
depends on MTD
def_bool y

+config OF_RESOLVE
+ bool "OF Dynamic resolution support"
+ depends on OF
+ select OF_DYNAMIC
+ select OF_DEVICE
+ help
+ Enable OF dynamic resolution support. This allows you to
+ load Device Tree object fragments are run time.
+
endmenu # OF
diff --git a/drivers/of/Makefile b/drivers/of/Makefile
index 9bc6d8c..93da457 100644
--- a/drivers/of/Makefile
+++ b/drivers/of/Makefile
@@ -9,3 +9,4 @@ obj-$(CONFIG_OF_MDIO) += of_mdio.o
obj-$(CONFIG_OF_PCI) += of_pci.o
obj-$(CONFIG_OF_PCI_IRQ) += of_pci_irq.o
obj-$(CONFIG_OF_MTD) += of_mtd.o
+obj-$(CONFIG_OF_RESOLVE) += resolver.o
diff --git a/drivers/of/resolver.c b/drivers/of/resolver.c
new file mode 100644
index 0000000..da8cd34
--- /dev/null
+++ b/drivers/of/resolver.c
@@ -0,0 +1,395 @@
+/*
+ * Functions for dealing with DT resolution
+ *
+ * Copyright (C) 2012 Pantelis Antoniou <pa...@antoniou-consulting.com>
+ * Copyright (C) 2012 Texas Instruments Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/of_fdt.h>
+#include <linux/string.h>
+#include <linux/ctype.h>
+#include <linux/errno.h>
+#include <linux/string.h>
+#include <linux/slab.h>
+
+/**
+ * Find a subtree's maximum phandle value.
+ */
+static phandle __of_get_tree_max_phandle(struct device_node *node,
+ phandle max_phandle)
+{
+ struct device_node *child;
+
+ if (node->phandle != 0 && node->phandle != OF_PHANDLE_ILLEGAL &&
+ node->phandle > max_phandle)
+ max_phandle = node->phandle;
+
+ __for_each_child_of_node(node, child)
+ max_phandle = __of_get_tree_max_phandle(child, max_phandle);
+
+ return max_phandle;
+}
+
+/**
+ * Find live tree's maximum phandle value.
+ */
+static phandle of_get_tree_max_phandle(void)
+{
+ struct device_node *node;
+ phandle phandle;
+ unsigned long flags;
+
+ /* get root node */
+ node = of_find_node_by_path("/");
+ if (node == NULL)
+ return OF_PHANDLE_ILLEGAL;
+
+ /* now search recursively */
+ raw_spin_lock_irqsave(&devtree_lock, flags);
+ phandle = __of_get_tree_max_phandle(node, 0);
+ raw_spin_unlock_irqrestore(&devtree_lock, flags);
+
+ of_node_put(node);
+
+ return phandle;
+}
+
+/**
+ * Adjust a subtree's phandle values by a given delta.
+ * Makes sure not to just adjust the device node's phandle value,
+ * but modify the phandle properties values as well.
+ */
+static void __of_adjust_tree_phandles(struct device_node *node,
+ int phandle_delta)
+{
+ struct device_node *child;
+ struct property *prop;
+ phandle phandle;
+
+ /* first adjust the node's phandle direct value */
+ if (node->phandle != 0 && node->phandle != OF_PHANDLE_ILLEGAL)
+ node->phandle += phandle_delta;
+
+ /* now adjust phandle & linux,phandle values */
+ for_each_property_of_node(node, prop) {
+
+ /* only look for these two */
+ if (of_prop_cmp(prop->name, "phandle") != 0 &&
+ of_prop_cmp(prop->name, "linux,phandle") != 0)
+ continue;
+
+ /* must be big enough */
+ if (prop->length < 4)
+ continue;
+
+ /* read phandle value */
+ phandle = be32_to_cpu(*(uint32_t *)prop->value);
+ if (phandle == OF_PHANDLE_ILLEGAL) /* unresolved */
+ continue;
+
+ /* adjust */
+ *(uint32_t *)prop->value = cpu_to_be32(node->phandle);
+ }
+
+ /* now do the children recursively */
+ __for_each_child_of_node(node, child)
+ __of_adjust_tree_phandles(child, phandle_delta);
+}
+
+/**
+ * Adjust the local phandle references by the given phandle delta.
+ * Assumes the existances of a __local_fixups__ node at the root
+ * of the tree. Does not take any devtree locks so make sure you
+ * call this on a tree which is at the detached state.
+ */
+static int __of_adjust_tree_phandle_references(struct device_node *node,
+ int phandle_delta)
+{
+ phandle phandle;
+ struct device_node *refnode, *child;
+ struct property *rprop, *sprop;
+ char *propval, *propcur, *propend, *nodestr, *propstr, *s;
+ int offset, propcurlen;
+ int err;
+
+ /* locate the symbols & fixups nodes on resolve */
+ __for_each_child_of_node(node, child)
+ if (of_node_cmp(child->name, "__local_fixups__") == 0)
+ break;
+
+ /* no local fixups */
+ if (child == NULL)
+ return 0;
+
+ /* find the local fixups property */
+ for_each_property_of_node(child, rprop) {
+
+ /* skip properties added automatically */
+ if (of_prop_cmp(rprop->name, "name") == 0)
+ continue;
+
+ /* make a copy */
+ propval = kmalloc(rprop->length, GFP_KERNEL);
+ if (propval == NULL) {
+ pr_err("%s: Could not copy value of '%s'\n",
+ __func__, rprop->name);
+ return -ENOMEM;
+ }
+ memcpy(propval, rprop->value, rprop->length);
+
+ propend = propval + rprop->length;
+ for (propcur = propval; propcur < propend;
+ propcur += propcurlen + 1) {
+
+ propcurlen = strlen(propcur);
+
+ nodestr = propcur;
+ s = strchr(propcur, ':');
+ if (s == NULL) {
+ pr_err("%s: Illegal symbol entry '%s' (1)\n",
+ __func__, propcur);
+ err = -EINVAL;
+ goto err_fail;
+ }
+ *s++ = '\0';
+
+ propstr = s;
+ s = strchr(s, ':');
+ if (s == NULL) {
+ pr_err("%s: Illegal symbol entry '%s' (2)\n",
+ __func__, (char *)rprop->value);
+ err = -EINVAL;
+ goto err_fail;
+ }
+
+ *s++ = '\0';
+ offset = simple_strtoul(s, NULL, 10);
+
+ /* look into the resolve node for the full path */
+ refnode = __of_find_node_by_full_name(node, nodestr);
+ if (refnode == NULL) {
+ pr_warn("%s: Could not find refnode '%s'\n",
+ __func__, (char *)rprop->value);
+ continue;
+ }
+
+ /* now find the property */
+ for_each_property_of_node(refnode, sprop) {
+ if (of_prop_cmp(sprop->name, propstr) == 0)
+ break;
+ }
+
+ if (sprop == NULL) {
+ pr_err("%s: Could not find property '%s'\n",
+ __func__, (char *)rprop->value);
+ err = -ENOENT;
+ goto err_fail;
+ }
+
+ phandle = be32_to_cpu(*(uint32_t *)
+ (sprop->value + offset));
+ *(uint32_t *)(sprop->value + offset) =
+ cpu_to_be32(phandle + phandle_delta);
+ }
+
+ kfree(propval);
+ }
+
+ return 0;
+
+err_fail:
+ kfree(propval);
+ return err;
+}
+
+/**
+ * of_resolve - Resolve the given node against the live tree.
+ *
+ * @resolve: Node to resolve
+ *
+ * Perform dynamic Device Tree resolution against the live tree
+ * to the given node to resolve. This depends on the live tree
+ * having a __symbols__ node, and the resolve node the __fixups__ &
+ * __local_fixups__ nodes (if needed).
+ * The result of the operation is a resolve node that it's contents
+ * are fit to be inserted or operate upon the live tree.
+ * Returns 0 on success or a negative error value on error.
+ */
+int of_resolve(struct device_node *resolve)
+{
+ struct device_node *child, *refnode;
+ struct device_node *root_sym, *resolve_sym, *resolve_fix;
+ struct property *rprop, *sprop;
+ const char *refpath;
+ char *propval, *propcur, *propend, *nodestr, *propstr, *s;
+ int offset, propcurlen;
+ phandle phandle, phandle_delta;
+ int err;
+
+ /* the resolve node must exist, and be detached */
+ if (resolve == NULL ||
+ !of_node_check_flag(resolve, OF_DETACHED)) {
+ return -EINVAL;
+ }
+
+ /* first we need to adjust the phandles */
+ phandle_delta = of_get_tree_max_phandle() + 1;
+ __of_adjust_tree_phandles(resolve, phandle_delta);
+ err = __of_adjust_tree_phandle_references(resolve, phandle_delta);
+ if (err != 0)
+ return err;
+
+ root_sym = NULL;
+ resolve_sym = NULL;
+ resolve_fix = NULL;
+
+ /* this may fail (if no fixups are required) */
+ root_sym = of_find_node_by_path("/__symbols__");
+
+ /* locate the symbols & fixups nodes on resolve */
+ __for_each_child_of_node(resolve, child) {
+
+ if (resolve_sym == NULL &&
+ of_node_cmp(child->name, "__symbols__") == 0)
+ resolve_sym = child;
+
+ if (resolve_fix == NULL &&
+ of_node_cmp(child->name, "__fixups__") == 0)
+ resolve_fix = child;
+
+ /* both found, don't bother anymore */
+ if (resolve_sym != NULL && resolve_fix != NULL)
+ break;
+ }
+
+ /* we do allow for the case where no fixups are needed */
+ if (resolve_fix == NULL)
+ goto merge_sym;
+
+ /* we need to fixup, but no root symbols... */
+ if (root_sym == NULL)
+ return -EINVAL;
+
+ for_each_property_of_node(resolve_fix, rprop) {
+
+ /* skip properties added automatically */
+ if (of_prop_cmp(rprop->name, "name") == 0)
+ continue;
+
+ err = of_property_read_string(root_sym,
+ rprop->name, &refpath);
+ if (err != 0) {
+ pr_err("%s: Could not find symbol '%s'\n",
+ __func__, rprop->name);
+ goto err_fail;
+ }
+
+ refnode = of_find_node_by_path(refpath);
+ if (refnode == NULL) {
+ pr_err("%s: Could not find node by path '%s'\n",
+ __func__, refpath);
+ err = -ENOENT;
+ goto err_fail;
+ }
+
+ phandle = refnode->phandle;
+ of_node_put(refnode);
+
+ pr_debug("%s: %s phandle is 0x%08x\n",
+ __func__, rprop->name, phandle);
+
+ /* make a copy */
+ propval = kmalloc(rprop->length, GFP_KERNEL);
+ if (propval == NULL) {
+ pr_err("%s: Could not copy value of '%s'\n",
+ __func__, rprop->name);
+ err = -ENOMEM;
+ goto err_fail;
+ }
+
+ memcpy(propval, rprop->value, rprop->length);
+
+ propend = propval + rprop->length;
+ for (propcur = propval; propcur < propend;
+ propcur += propcurlen + 1) {
+ propcurlen = strlen(propcur);
+
+ nodestr = propcur;
+ s = strchr(propcur, ':');
+ if (s == NULL) {
+ pr_err("%s: Illegal symbol "
+ "entry '%s' (1)\n",
+ __func__, (char *)rprop->value);
+ kfree(propval);
+ err = -EINVAL;
+ goto err_fail;
+ }
+ *s++ = '\0';
+
+ propstr = s;
+ s = strchr(s, ':');
+ if (s == NULL) {
+ pr_err("%s: Illegal symbol "
+ "entry '%s' (2)\n",
+ __func__, (char *)rprop->value);
+ kfree(propval);
+ err = -EINVAL;
+ goto err_fail;
+ }
+
+ *s++ = '\0';
+ offset = simple_strtoul(s, NULL, 10);
+
+ /* look into the resolve node for the full path */
+ refnode = __of_find_node_by_full_name(resolve,
+ nodestr);
+ if (refnode == NULL) {
+ pr_err("%s: Could not find refnode '%s'\n",
+ __func__, (char *)rprop->value);
+ kfree(propval);
+ err = -ENOENT;
+ goto err_fail;
+ }
+
+ /* now find the property */
+ for_each_property_of_node(refnode, sprop) {
+ if (of_prop_cmp(sprop->name, propstr) == 0)
+ break;
+ }
+
+ if (sprop == NULL) {
+ pr_err("%s: Could not find property '%s'\n",
+ __func__, (char *)rprop->value);
+ kfree(propval);
+ err = -ENOENT;
+ goto err_fail;
+ }
+
+ *(uint32_t *)(sprop->value + offset) =
+ cpu_to_be32(phandle);
+ }
+
+ kfree(propval);
+ }
+
+merge_sym:
+
+ of_node_put(root_sym);
+
+ return 0;
+
+err_fail:
+
+ if (root_sym != NULL)
+ of_node_put(root_sym);
+
+ return err;
+}
diff --git a/include/linux/of.h b/include/linux/of.h
index 9d69bd2..22d42e5 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -721,4 +721,21 @@ static inline int of_multi_prop_cmp(const struct property *prop, const char *val

#endif /* !CONFIG_OF */

+
+/* illegal phandle value (set when unresolved) */
+#define OF_PHANDLE_ILLEGAL 0xdeadbeef
+
+#ifdef CONFIG_OF_RESOLVE
+
+int of_resolve(struct device_node *resolve);
+
+#else
+
+static inline int of_resolve(struct device_node *resolve)

Pantelis Antoniou

unread,
Nov 5, 2013, 1:50:02 PM11/5/13
to
The following patchset introduces Device Tree overlays, a method
of dynamically altering the kernel's live Device Tree, along with
a generic interface to use it in a board agnostic manner.

It is against mainline as of today, Nov 5 2013:
be408cd3e1fef73e9408b196a79b9934697fe3b1
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

It relies on the following previously submitted patches/patchsets:

The DTC compiler patch required to get symbol resolution working

"dtc: Dynamic symbols & fixup support (v2)"

The I2C device registration patch

"of: i2c: Export single device registration method"

And finally a patchset exporting various OF fixes

"OF: Fixes in preperation of DT overlays"

Note that although this patchset allows DT overlay removal
on my preferred platform (omap), platform device registration
fails; another patchset that deals with this issue has been posted:

"omap_device removal fixups"

I should also note that the /proc interface is an anachronism, but
it is the place where /proc/device-tree is also located, so it makes
sense to group /proc/device-tree & /proc/device-tree-overlay* together.
I am open to suggestions about where the generic interface should reside.
A suggestion has been made about configfs but I'd like to get this out
as a basis of discussion.

This low level interface should be used as a starting point for platforms
with discoverable hardware on the daughtercard level since they should use the
facilities provided to implement their own policy, dealing with things like
conflicts, high level resource allocation and so on.

Changes since V1:

* Removal of any bits related to a specific board (beaglebone).
* Introduced a platform agnostic interface using /proc/device-tree-overlay
* Various bug fixes related to i2c device handling have been squashed in.

Pantelis Antoniou (3):
OF: Introduce Device Tree resolve support.
OF: Introduce DT overlay support.
DT: proc: Add runtime overlay interface in /proc

.../devicetree/dynamic-resolution-notes.txt | 25 +
Documentation/devicetree/overlay-notes.txt | 179 +++++
drivers/of/Kconfig | 19 +
drivers/of/Makefile | 2 +
drivers/of/overlay.c | 869 +++++++++++++++++++++
drivers/of/resolver.c | 395 ++++++++++
fs/proc/proc_devtree.c | 278 +++++++
include/linux/of.h | 130 +++
8 files changed, 1897 insertions(+)
create mode 100644 Documentation/devicetree/dynamic-resolution-notes.txt
create mode 100644 Documentation/devicetree/overlay-notes.txt
create mode 100644 drivers/of/overlay.c
create mode 100644 drivers/of/resolver.c

Pantelis Antoniou

unread,
Nov 5, 2013, 1:50:02 PM11/5/13
to
Add a runtime interface to /proc to enable generic device tree overlay
usage.

Two new /proc files are added:

/proc/device-tree-overlay & /proc/device-tree-overlay-status

/proc/device-tree-overlay accepts a stream of a device tree objects and
applies it to the running kernel's device tree.

$ cat ~/BB-UART2-00A0.dtbo >device-tree-overlay
overlay_proc_release: Applied #2 overlay segments @0

/proc/device-tree-overlay-status displays the the overlays added using
the /proc interface

$ cat device-tree-overlay-status
0: 861 bytes BB-UART2:00A0

The format of the status line is
<ID>: <SIZE> bytes <part-number>:<version>

<ID> is the id of the overlay
<SIZE> is the size of the overlay in bytes
<part-number>, <version> are (optional) root level properties of the DTBO

You can remove an overlay by echoing the <ID> number of the overlay
precedded with a '-'

So
$ echo "-0" >device-tree-overlay-status

Removes the overlay.

Note that this seldom works on most platforms since platform_device
removal is something that almost never works without extra patches.

Signed-off-by: Pantelis Antoniou <pa...@antoniou-consulting.com>
---
fs/proc/proc_devtree.c | 278 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 278 insertions(+)

diff --git a/fs/proc/proc_devtree.c b/fs/proc/proc_devtree.c
index 106a835..48f7925 100644
--- a/fs/proc/proc_devtree.c
+++ b/fs/proc/proc_devtree.c
@@ -16,6 +16,9 @@
#include <linux/slab.h>
#include <asm/prom.h>
#include <asm/uaccess.h>
+#include <linux/of_fdt.h>
+#include <linux/idr.h>
+
#include "internal.h"

static inline void set_node_proc_entry(struct device_node *np,
@@ -28,6 +31,275 @@ static inline void set_node_proc_entry(struct device_node *np,

static struct proc_dir_entry *proc_device_tree;

+#if defined(CONFIG_OF_OVERLAY)
+static struct proc_dir_entry *proc_device_tree_overlay;
+static struct proc_dir_entry *proc_device_tree_overlay_status;
+static DEFINE_MUTEX(overlay_lock);
+static DEFINE_IDR(overlay_idr);
+
+struct proc_overlay_data {
+ void *buf;
+ size_t alloc;
+ size_t size;
+
+ int id;
+ struct device_node *overlay;
+ int ovinfo_cnt;
+ struct of_overlay_info *ovinfo;
+ unsigned int failed : 1;
+ unsigned int applied : 1;
+ unsigned int removing : 1;
+};
+
+static int overlay_proc_open(struct inode *inode, struct file *file)
+{
+ struct proc_overlay_data *od;
+
+ od = kzalloc(sizeof(*od), GFP_KERNEL);
+ if (od == NULL)
+ return -ENOMEM;
+
+ od->id = -1;
+
+ /* save it */
+ file->private_data = od;
+
+ return 0;
+}
+
+static ssize_t overlay_proc_write(struct file *file, const char __user *buf, size_t size, loff_t *ppos)
+{
+ struct proc_overlay_data *od = file->private_data;
+ void *new_buf;
+
+ /* need to alloc? */
+ if (od->size + size > od->alloc) {
+
+ /* start at 256K at first */
+ if (od->alloc == 0)
+ od->alloc = SZ_256K / 2;
+
+ /* double buffer */
+ od->alloc <<= 1;
+ new_buf = kzalloc(od->alloc, GFP_KERNEL);
+ if (new_buf == NULL) {
+ pr_err("%s: failed to grow buffer\n", __func__);
+ od->failed = 1;
+ return -ENOMEM;
+ }
+
+ /* copy all we had previously */
+ memcpy(new_buf, od->buf, od->size);
+
+ /* free old buffer and assign new */
+ kfree(od->buf);
+ od->buf = new_buf;
+ }
+
+ if (unlikely(copy_from_user(od->buf + od->size, buf, size))) {
+ pr_err("%s: fault copying from userspace\n", __func__);
+ return -EFAULT;
+ }
+
+ od->size += size;
+ *ppos += size;
+
+ return size;
+}
+
+static int overlay_proc_release(struct inode *inode, struct file *file)
+{
+ struct proc_overlay_data *od = file->private_data;
+ int id;
+ int err = 0;
+
+ /* perfectly normal when not loading */
+ if (od == NULL)
+ return 0;
+
+ if (od->failed)
+ goto out_free;
+
+ of_fdt_unflatten_tree(od->buf, &od->overlay);
+ if (od->overlay == NULL) {
+ pr_err("%s: failed to unflatten tree\n", __func__);
+ err = -EINVAL;
+ goto out_free;
+ }
+ pr_debug("%s: unflattened OK\n", __func__);
+
+ /* mark it as detached */
+ of_node_set_flag(od->overlay, OF_DETACHED);
+
+ /* perform resolution */
+ err = of_resolve(od->overlay);
+ if (err != 0) {
+ pr_err("%s: Failed to resolve tree\n", __func__);
+ goto out_free;
+ }
+ pr_debug("%s: resolved OK\n", __func__);
+
+ /* now build an overlay info array */
+ err = of_build_overlay_info(od->overlay,
+ &od->ovinfo_cnt, &od->ovinfo);
+ if (err != 0) {
+ pr_err("%s: Failed to build overlay info\n", __func__);
+ goto out_free;
+ }
+
+ pr_debug("%s: built %d overlay segments\n", __func__,
+ od->ovinfo_cnt);
+
+ err = of_overlay(od->ovinfo_cnt, od->ovinfo);
+ if (err != 0) {
+ pr_err("%s: Failed to apply overlay\n", __func__);
+ goto out_free;
+ }
+
+ od->applied = 1;
+
+ mutex_lock(&overlay_lock);
+ idr_preload(GFP_KERNEL);
+ id = idr_alloc(&overlay_idr, od, 0, -1, GFP_KERNEL);
+ idr_preload_end();
+ mutex_unlock(&overlay_lock);
+
+ if (id < 0) {
+ err = id;
+ pr_err("%s: failed to get id for overlay\n", __func__);
+ goto out_free;
+ }
+ od->id = id;
+
+ pr_info("%s: Applied #%d overlay segments @%d\n", __func__,
+ od->ovinfo_cnt, od->id);
+
+ return 0;
+
+out_free:
+ if (od->id != -1)
+ idr_remove(&overlay_idr, od->id);
+ if (od->applied)
+ of_overlay_revert(od->ovinfo_cnt, od->ovinfo);
+ if (od->ovinfo)
+ of_free_overlay_info(od->ovinfo_cnt, od->ovinfo);
+ /* release memory */
+ kfree(od->buf);
+ kfree(od);
+
+ return 0;
+}
+
+static const struct file_operations overlay_proc_fops = {
+ .owner = THIS_MODULE,
+ .open = overlay_proc_open,
+ .write = overlay_proc_write,
+ .release = overlay_proc_release,
+};
+
+/*
+ * Supply data on a read from /proc/device-tree-overlay-status
+ */
+static int overlay_status_proc_show(struct seq_file *m, void *v)
+{
+ int err, id;
+ struct proc_overlay_data *od;
+ const char *part_number, *version;
+
+ rcu_read_lock();
+ idr_for_each_entry(&overlay_idr, od, id) {
+ seq_printf(m, "%d: %d bytes", id, od->size);
+ /* TODO Make this standardized? */
+ err = of_property_read_string(od->overlay, "part-number",
+ &part_number);
+ if (err != 0)
+ part_number = NULL;
+ err = of_property_read_string(od->overlay, "version",
+ &version);
+ if (err != 0)
+ version = NULL;
+ if (part_number) {
+ seq_printf(m, " %s", part_number);
+ if (version)
+ seq_printf(m, ":%s", version);
+ }
+ seq_printf(m, "\n");
+ }
+ rcu_read_unlock();
+
+ return 0;
+}
+
+static int overlay_status_proc_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, overlay_status_proc_show, __PDE_DATA(inode));
+}
+
+static ssize_t overlay_status_proc_write(struct file *file, const char __user *buf,
+ size_t size, loff_t *ppos)
+{
+ struct proc_overlay_data *od;
+ char buffer[PROC_NUMBUF + 1];
+ char *ptr;
+ int id, err, count;
+
+ memset(buffer, 0, sizeof(buffer));
+ count = size;
+ if (count > sizeof(buffer) - 1)
+ count = sizeof(buffer) - 1;
+
+ if (copy_from_user(buffer, buf, count)) {
+ err = -EFAULT;
+ goto out;
+ }
+
+ ptr = buffer;
+ if (*ptr != '-') { /* only removal supported */
+ err = -EINVAL;
+ goto out;
+ }
+ ptr++;
+ err = kstrtoint(strstrip(ptr), 0, &id);
+ if (err)
+ goto out;
+
+ /* find it */
+ mutex_lock(&overlay_lock);
+ od = idr_find(&overlay_idr, id);
+ if (od == NULL) {
+ mutex_unlock(&overlay_lock);
+ err = -EINVAL;
+ goto out;
+ }
+ /* remove it */
+ idr_remove(&overlay_idr, id);
+ mutex_unlock(&overlay_lock);
+
+ err = of_overlay_revert(od->ovinfo_cnt, od->ovinfo);
+ if (err != 0) {
+ pr_err("%s: of_overlay_revert failed\n", __func__);
+ goto out;
+ }
+ /* release memory */
+ of_free_overlay_info(od->ovinfo_cnt, od->ovinfo);
+ kfree(od->buf);
+ kfree(od);
+
+ pr_info("%s: Removed overlay with id %d\n", __func__, id);
+out:
+ return err < 0 ? err : count;
+}
+
+static const struct file_operations overlay_status_proc_fops = {
+ .owner = THIS_MODULE,
+ .open = overlay_status_proc_open,
+ .read = seq_read,
+ .write = overlay_status_proc_write,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+#endif
+
/*
* Supply data on a read from /proc/device-tree/node/property.
*/
@@ -239,5 +511,11 @@ void __init proc_device_tree_init(void)
return;
}
proc_device_tree_add_node(root, proc_device_tree);
+#if defined(CONFIG_OF_OVERLAY)
+ proc_device_tree_overlay = proc_create_data("device-tree-overlay",
+ S_IWUSR, NULL, &overlay_proc_fops, NULL);
+ proc_device_tree_overlay_status = proc_create_data("device-tree-overlay-status",
+ S_IRUSR| S_IWUSR, NULL, &overlay_status_proc_fops, NULL);
+#endif
of_node_put(root);

Guenter Roeck

unread,
Nov 5, 2013, 2:10:02 PM11/5/13
to
On Tue, Nov 05, 2013 at 08:41:35PM +0200, Pantelis Antoniou wrote:
> The following patchset introduces Device Tree overlays, a method
> of dynamically altering the kernel's live Device Tree, along with
> a generic interface to use it in a board agnostic manner.
>
Hi Pantelis,

great to see this going. I'll apply the patches to our tree and test.

I have a couple of patches on top of yours:
- export overlay functions
- declare of_free_overlay_info as static inline if OF is undefined.
- ignore PCI devices when inserting or removing overlays
[ Based on the idea that PCI/PCIe ports are handled by the PCI subsystem ]

I'll rebase those to the new version of your patches and send it out
after I get it all working.

Thanks,
Guenter

Dinh Nguyen

unread,
Nov 5, 2013, 7:20:01 PM11/5/13
to
Hi Pantelis,

On Tue, 2013-11-05 at 11:06 -0800, Guenter Roeck wrote:
> On Tue, Nov 05, 2013 at 08:41:35PM +0200, Pantelis Antoniou wrote:
> > The following patchset introduces Device Tree overlays, a method
> > of dynamically altering the kernel's live Device Tree, along with
> > a generic interface to use it in a board agnostic manner.
> >
> Hi Pantelis,
>
> great to see this going. I'll apply the patches to our tree and test.

+1. Alan and I will also test this with the FPGA framework.

Thanks,
Dinh
>
> I have a couple of patches on top of yours:
> - export overlay functions
> - declare of_free_overlay_info as static inline if OF is undefined.
> - ignore PCI devices when inserting or removing overlays
> [ Based on the idea that PCI/PCIe ports are handled by the PCI subsystem ]
>
> I'll rebase those to the new version of your patches and send it out
> after I get it all working.
>
> Thanks,
> Guenter
> --
> To unsubscribe from this list: send the line "unsubscribe devicetree" in

Alexander Sverdlin

unread,
Nov 6, 2013, 4:00:02 AM11/6/13
to
Hi Pantelis,

On 05/11/13 19:41, ext Pantelis Antoniou wrote:
> The following patchset introduces Device Tree overlays, a method
> of dynamically altering the kernel's live Device Tree, along with
> a generic interface to use it in a board agnostic manner.

I'm glad to see it again! Will test the whole series on our MIPS64 platforms
(currently old patchset from bbxm tree is successfully used for a while).

--
Best regards,
Alexander Sverdlin.

Pantelis Antoniou

unread,
Nov 6, 2013, 5:00:02 AM11/6/13
to
Hi Ionut,

On Nov 6, 2013, at 11:51 AM, Ionut Nicu wrote:

> Hi,
>
> On 05.11.2013 19:41, ext Pantelis Antoniou wrote:
>> Add a runtime interface to /proc to enable generic device tree overlay
>> usage.
>>
>> Two new /proc files are added:
>>
>> /proc/device-tree-overlay & /proc/device-tree-overlay-status
>>
>> /proc/device-tree-overlay accepts a stream of a device tree objects and
>> applies it to the running kernel's device tree.
>>
>> $ cat ~/BB-UART2-00A0.dtbo >device-tree-overlay
>> overlay_proc_release: Applied #2 overlay segments @0
>>
>> /proc/device-tree-overlay-status displays the the overlays added using
>> the /proc interface
>>
>> $ cat device-tree-overlay-status
>> 0: 861 bytes BB-UART2:00A0
>>
>> The format of the status line is
>> <ID>: <SIZE> bytes <part-number>:<version>
>>
>> <ID> is the id of the overlay
>> <SIZE> is the size of the overlay in bytes
>> <part-number>, <version> are (optional) root level properties of the DTBO
>>
>> You can remove an overlay by echoing the <ID> number of the overlay
>> precedded with a '-'
>>
>> So
>> $ echo "-0" >device-tree-overlay-status
>>
>
> Wouldn't it be easier if echo "-BB-UART2-00A0" > device-tree-overlay-status was
> supported also? That way one doesn't need to know the order in which the
> overlays were applied or parse the status file to get the <ID>.
>

Unfortunately no since this is a raw bytestream interface; there is no file
information. The patchset does display any root level part-number & version properties
but that's completely options.

We could standardize in a named root property to use on each overlay, and the part-number &
version are as good as any.



>> Removes the overlay.
>>
>> Note that this seldom works on most platforms since platform_device
>> removal is something that almost never works without extra patches.
>>
>> Signed-off-by: Pantelis Antoniou <pa...@antoniou-consulting.com>
>
> It would be very helpful to me if I would have a notification mechanism
> for overlay add and remove operations based on blocking_notifier_call_chain(),.
> This way other drivers can be notified when the dt changes.
>

Yes. Some people expressed interest in something similar.

> But I guess that could be added in the future with another patch.
>

Yep.

> Thanks,
> Ionut

Let's get the basic support in now, and we can fix all of that on later patches.

Regards

-- Pantelis

Ionut Nicu

unread,
Nov 6, 2013, 5:00:03 AM11/6/13
to
Hi,

On 05.11.2013 19:41, ext Pantelis Antoniou wrote:
> Add a runtime interface to /proc to enable generic device tree overlay
> usage.
>
> Two new /proc files are added:
>
> /proc/device-tree-overlay & /proc/device-tree-overlay-status
>
> /proc/device-tree-overlay accepts a stream of a device tree objects and
> applies it to the running kernel's device tree.
>
> $ cat ~/BB-UART2-00A0.dtbo >device-tree-overlay
> overlay_proc_release: Applied #2 overlay segments @0
>
> /proc/device-tree-overlay-status displays the the overlays added using
> the /proc interface
>
> $ cat device-tree-overlay-status
> 0: 861 bytes BB-UART2:00A0
>
> The format of the status line is
> <ID>: <SIZE> bytes <part-number>:<version>
>
> <ID> is the id of the overlay
> <SIZE> is the size of the overlay in bytes
> <part-number>, <version> are (optional) root level properties of the DTBO
>
> You can remove an overlay by echoing the <ID> number of the overlay
> precedded with a '-'
>
> So
> $ echo "-0" >device-tree-overlay-status
>

Wouldn't it be easier if echo "-BB-UART2-00A0" > device-tree-overlay-status was
supported also? That way one doesn't need to know the order in which the
overlays were applied or parse the status file to get the <ID>.

> Removes the overlay.
>
> Note that this seldom works on most platforms since platform_device
> removal is something that almost never works without extra patches.
>
> Signed-off-by: Pantelis Antoniou <pa...@antoniou-consulting.com>

It would be very helpful to me if I would have a notification mechanism
for overlay add and remove operations based on blocking_notifier_call_chain(),.
This way other drivers can be notified when the dt changes.

But I guess that could be added in the future with another patch.

Thanks,
Ionut

Alexander Sverdlin

unread,
Nov 6, 2013, 11:10:01 AM11/6/13
to
Hi!

On 05/11/13 19:41, ext Pantelis Antoniou wrote:
> Introduce support for dynamic device tree resolution.
> Using it, it is possible to prepare a device tree that's
> been loaded on runtime to be modified and inserted at the kernel
> live tree.
>
> Signed-off-by: Pantelis Antoniou <pa...@antoniou-consulting.com>

Reviewed-by: Alexander Sverdlin <alexander...@nsn.com>
Best regards,
Alexander Sverdlin.

Alexander Sverdlin

unread,
Nov 6, 2013, 11:10:02 AM11/6/13
to
Hi!

On 05/11/13 19:41, ext Pantelis Antoniou wrote:
> Introduce DT overlay support.
> Using this functionality it is possible to dynamically overlay a part of
> the kernel's tree with another tree that's been dynamically loaded.
> It is also possible to remove node and properties.
>
> Note that the I2C client devices are 'special', as in they're not platform
> devices. They need to be registered with an I2C specific method.
>
> In general I2C is just no good platform device citizen and needs
> special casing.
>
> Signed-off-by: Pantelis Antoniou <pa...@antoniou-consulting.com>

Reviewed-by: Alexander Sverdlin <alexander...@nsn.com>
Best regards,
Alexander Sverdlin.

Ionut Nicu

unread,
Nov 6, 2013, 11:30:02 AM11/6/13
to
Hi,

On 06.11.2013 17:00, Alexander Sverdlin wrote:
> Hi!
>
> On 05/11/13 19:41, ext Pantelis Antoniou wrote:
>> Introduce DT overlay support.
>> Using this functionality it is possible to dynamically overlay a part of
>> the kernel's tree with another tree that's been dynamically loaded.
>> It is also possible to remove node and properties.
>>
>> Note that the I2C client devices are 'special', as in they're not platform
>> devices. They need to be registered with an I2C specific method.
>>
>> In general I2C is just no good platform device citizen and needs
>> special casing.
>>
>> Signed-off-by: Pantelis Antoniou <pa...@antoniou-consulting.com>
>
> Reviewed-by: Alexander Sverdlin <alexander...@nsn.com>
>

I'm using this patch for a few months without any problems, so:

Tested-by: Ionut Nicu <ioan.n...@nsn.com>

Ionut Nicu

unread,
Nov 6, 2013, 11:30:02 AM11/6/13
to
Hi,

On 06.11.2013 16:59, Alexander Sverdlin wrote:
> Hi!
>
> On 05/11/13 19:41, ext Pantelis Antoniou wrote:
>> Introduce support for dynamic device tree resolution.
>> Using it, it is possible to prepare a device tree that's
>> been loaded on runtime to be modified and inserted at the kernel
>> live tree.
>>
>> Signed-off-by: Pantelis Antoniou <pa...@antoniou-consulting.com>
>
> Reviewed-by: Alexander Sverdlin <alexander...@nsn.com>
>

I'm using this patch for a few months without any problems, so:

Tested-by: Ionut Nicu <ioan.n...@nsn.com>


Pantelis Antoniou

unread,
Nov 6, 2013, 2:10:01 PM11/6/13
to
Hi Sebastian,

On Nov 6, 2013, at 9:01 PM, Sebastian Andrzej Siewior wrote:

> On 05.11.13, Pantelis Antoniou wrote:
>> The following patchset introduces Device Tree overlays, a method
>> of dynamically altering the kernel's live Device Tree, along with
>> a generic interface to use it in a board agnostic manner.
>
> In case this has been discussed and I missed it: Why are we doing this?
> Isn't it possible to d o the overlay thingy in u-boot and pass a
> complete device tree to the kernel?
> Are you trying to do something like hotplug-PCI where the PCI card can
> be replaced at runtime?
>

It has been discussed.

We are doing it because

a) We tried to do it in u-boot and it has been a complete disaster.
Regular users just can't handle bootloader updates.

b) It is similar to that. It was originally created for the beaglebone,
which has a concept of capes (similar to Arduino shields).
http://circuitco.com/support/index.php?title=BeagleBone_Capes
Turns out it's really useful to anyone doing reconfigurable hardware too,
so that's why FPGA people are thinking of using it.

c) There are people that want to tinker with Linux based hardware boards
but are not kernel developers. This gives them a way to do so without
having to recompile the kernel and/or reboot while tinkering.

> Sebastian

Regards

-- Pantelis

Sebastian Andrzej Siewior

unread,
Nov 6, 2013, 2:10:01 PM11/6/13
to
On 05.11.13, Pantelis Antoniou wrote:
> The following patchset introduces Device Tree overlays, a method
> of dynamically altering the kernel's live Device Tree, along with
> a generic interface to use it in a board agnostic manner.

In case this has been discussed and I missed it: Why are we doing this?
Isn't it possible to d o the overlay thingy in u-boot and pass a
complete device tree to the kernel?
Are you trying to do something like hotplug-PCI where the PCI card can
be replaced at runtime?

Sebastian

Rob Herring

unread,
Nov 6, 2013, 2:20:02 PM11/6/13
to
On Tue, Nov 5, 2013 at 12:41 PM, Pantelis Antoniou
<pa...@antoniou-consulting.com> wrote:
> Add a runtime interface to /proc to enable generic device tree overlay
> usage.
>
> Two new /proc files are added:
>
> /proc/device-tree-overlay & /proc/device-tree-overlay-status

I think we really want all this to live under sysfs. Grant did patches
to move /proc/device-tree to /sys, but it never went upstream:

v2: https://lkml.org/lkml/2013/3/21/215
v1: https://lkml.org/lkml/2013/3/20/311

> /proc/device-tree-overlay accepts a stream of a device tree objects and
> applies it to the running kernel's device tree.
>
> $ cat ~/BB-UART2-00A0.dtbo >device-tree-overlay
> overlay_proc_release: Applied #2 overlay segments @0
>
> /proc/device-tree-overlay-status displays the the overlays added using
> the /proc interface
>
> $ cat device-tree-overlay-status
> 0: 861 bytes BB-UART2:00A0

Is the size useful information?

>
> The format of the status line is
> <ID>: <SIZE> bytes <part-number>:<version>
>
> <ID> is the id of the overlay
> <SIZE> is the size of the overlay in bytes
> <part-number>, <version> are (optional) root level properties of the DTBO
>
> You can remove an overlay by echoing the <ID> number of the overlay
> precedded with a '-'
>
> So
> $ echo "-0" >device-tree-overlay-status
>
> Removes the overlay.

This interface seems racy. Could the id change on you between reading
the status and echoing to remove the overlay?

I would rather see a file created for each overlay and simply echo 0
or "remove" to remove the overlay. Or possibly it needs to be a
directory per overlay with several files for info and control. This
would be more inline with typical sysfs design.

Rob

Pantelis Antoniou

unread,
Nov 6, 2013, 2:30:02 PM11/6/13
to
Hi Rob,

On Nov 6, 2013, at 9:10 PM, Rob Herring wrote:

> On Tue, Nov 5, 2013 at 12:41 PM, Pantelis Antoniou
> <pa...@antoniou-consulting.com> wrote:
>> Add a runtime interface to /proc to enable generic device tree overlay
>> usage.
>>
>> Two new /proc files are added:
>>
>> /proc/device-tree-overlay & /proc/device-tree-overlay-status
>
> I think we really want all this to live under sysfs. Grant did patches
> to move /proc/device-tree to /sys, but it never went upstream:
>
> v2: https://lkml.org/lkml/2013/3/21/215
> v1: https://lkml.org/lkml/2013/3/20/311
>

Yes, I'm aware; the location of this control interface in /proc is
unusual, but had to go somewhere. It should be easy enough to move it to
/sys.

>> /proc/device-tree-overlay accepts a stream of a device tree objects and
>> applies it to the running kernel's device tree.
>>
>> $ cat ~/BB-UART2-00A0.dtbo >device-tree-overlay
>> overlay_proc_release: Applied #2 overlay segments @0
>>
>> /proc/device-tree-overlay-status displays the the overlays added using
>> the /proc interface
>>
>> $ cat device-tree-overlay-status
>> 0: 861 bytes BB-UART2:00A0
>
> Is the size useful information?
>

If the overlay doesn't contain part-number/version properties there is nothing
to differentiate each one loaded. No file information, it is just a byte stream
interface.

>>
>> The format of the status line is
>> <ID>: <SIZE> bytes <part-number>:<version>
>>
>> <ID> is the id of the overlay
>> <SIZE> is the size of the overlay in bytes
>> <part-number>, <version> are (optional) root level properties of the DTBO
>>
>> You can remove an overlay by echoing the <ID> number of the overlay
>> precedded with a '-'
>>
>> So
>> $ echo "-0" >device-tree-overlay-status
>>
>> Removes the overlay.
>
> This interface seems racy. Could the id change on you between reading
> the status and echoing to remove the overlay?
>
> I would rather see a file created for each overlay and simply echo 0
> or "remove" to remove the overlay. Or possibly it needs to be a
> directory per overlay with several files for info and control. This
> would be more inline with typical sysfs design.
>

It was suggested to use a configfs interface. IIRC configfs can do what you
propose.

Something like

/config/dto/add <- load by cat overlay.dtbo >/config/dto/load
/config/dto/0/remove <- unload by echo 1 >/config/dto/0/remove
/config/dto/0/${prop} <- root level properties that are ignore by the overlay
mechanism


> Rob

Regards

-- Pantelis

Guenter Roeck

unread,
Nov 6, 2013, 2:40:02 PM11/6/13
to
On Wed, Nov 06, 2013 at 08:01:52PM +0100, Sebastian Andrzej Siewior wrote:
> On 05.11.13, Pantelis Antoniou wrote:
> > The following patchset introduces Device Tree overlays, a method
> > of dynamically altering the kernel's live Device Tree, along with
> > a generic interface to use it in a board agnostic manner.
>
> In case this has been discussed and I missed it: Why are we doing this?
> Isn't it possible to d o the overlay thingy in u-boot and pass a
> complete device tree to the kernel?
> Are you trying to do something like hotplug-PCI where the PCI card can
> be replaced at runtime?
>

At least that is our use case. u-boot doesn't know which cards are going to be
inserted at runtime. Even PCIe hotplug itself is insufficient, as the PCIe
configuration differs per card, and the cards support a variety of i2c devices
as well as other card specific devices such LEDs and multi-function FPGAs.

Guenter

Matt Porter

unread,
Nov 6, 2013, 3:20:01 PM11/6/13
to
In a configfs it makes more sense to mkdir. FWIW, USB gadget configfs
is a good example of this.

mkdir /config/dto/0

which would cause the kernel to create the attribute under that
directory:

/config/dto/0/load

Which you use to load as noted above.

Only problem is that configfs doesn't support binary attributes like
sysfs. If it is a agreed that overlays are configuration then that would
be a strong argument to bring over the binary attribute feature.


> /config/dto/0/remove <- unload by echo 1 >/config/dto/0/remove

rmdir /config/dto/0

> /config/dto/0/${prop} <- root level properties that are ignore by the overlay
> mechanism

-Matt

Sebastian Andrzej Siewior

unread,
Nov 6, 2013, 3:40:02 PM11/6/13
to
On 06.11.13, Guenter Roeck wrote:
> At least that is our use case. u-boot doesn't know which cards are going to be
> inserted at runtime. Even PCIe hotplug itself is insufficient, as the PCIe
> configuration differs per card, and the cards support a variety of i2c devices
> as well as other card specific devices such LEDs and multi-function FPGAs.

So you have your FPGA behind PCIe and you use the DT to describe the
chips behind i2c? And then you update your FPGA and want update the
devices in DT without reboot?

>
> Guenter

Sebastian

Sebastian Andrzej Siewior

unread,
Nov 6, 2013, 3:40:04 PM11/6/13
to
On 06.11.13, Pantelis Antoniou wrote:
> Hi Sebastian,
Hi Pantelis,

> It has been discussed.
>
> We are doing it because
>
> a) We tried to do it in u-boot and it has been a complete disaster.
> Regular users just can't handle bootloader updates.

How so? The "additional" dtb piece was deleted by accident as part of
the u-boot upgrade? Do you maybe a link which describes the disaster?

> b) It is similar to that. It was originally created for the beaglebone,
> which has a concept of capes (similar to Arduino shields).
> http://circuitco.com/support/index.php?title=BeagleBone_Capes
> Turns out it's really useful to anyone doing reconfigurable hardware too,
> so that's why FPGA people are thinking of using it.

I am aware of this. My understanding is that those capes have hardware
information encoded in an eeprom behind i2c _and_ you can't or should
not replace capes at runtime.
Naive as I am I *assume* it should be easy to read this eeprom in u-boot
as part of the boot setup and extend the dtb before passing it to the
kernel. In case this works well, the problem here is a) ?

> c) There are people that want to tinker with Linux based hardware boards
> but are not kernel developers. This gives them a way to do so without
> having to recompile the kernel and/or reboot while tinkering.

I understand that they want to avoid reboot. But they still need to
recompile the device tree, don't they? Or does this allow to change the
HW description without knowing the syntax of .dts?

> Regards
>
> -- Pantelis

Sebastian

Sebastian Andrzej Siewior

unread,
Nov 6, 2013, 3:50:02 PM11/6/13
to
On 06.11.13, Sebastian Andrzej Siewior wrote:
> > It has been discussed.
> >
> > We are doing it because

Please don't get me wrong. I am not against this, I am just curious why
this needs to be done at runtime and bootloader time is not an option.

Dinh Nguyen

unread,
Nov 6, 2013, 3:50:02 PM11/6/13
to
Hi Pantelis,
This needs a static inline, otherwise I get a build failure if
CONFIG_OF_OVERLAY is not set.

Dinh
> + return -ENOTSUPP;
> +}
> +
> +#endif
> +
> #endif /* _LINUX_OF_H */

--

Guenter Roeck

unread,
Nov 6, 2013, 4:20:03 PM11/6/13
to
On Wed, Nov 06, 2013 at 09:38:21PM +0100, Sebastian Andrzej Siewior wrote:
> On 06.11.13, Guenter Roeck wrote:
> > At least that is our use case. u-boot doesn't know which cards are going to be
> > inserted at runtime. Even PCIe hotplug itself is insufficient, as the PCIe
> > configuration differs per card, and the cards support a variety of i2c devices
> > as well as other card specific devices such LEDs and multi-function FPGAs.
>
> So you have your FPGA behind PCIe and you use the DT to describe the
> chips behind i2c? And then you update your FPGA and want update the
> devices in DT without reboot?
>
No.

We have a variety of boards with different functionality. Some of it is PCIe,
some of it isn't. Those boards can be inserted and removed at runtime.
The connector to those boards includes PCIe interfaces, i2c interfaces,
and other functionality such as power control/status pins and (proprietary)
serdes links.

Functionality on the boards include, but is not limited to, various i2c devices
(eg power sequencers, temperature sensors, ideeproms, other eeproms, SFP/SFP+),
ASICs, FPGAs, PCIe switches, or 10/40/100GE controllers. Not each board has
the same configuration. Some of the i2c devices may be connected directly
to the i2c bus on the connector, some may be connected through an FPGA
on the board. Sometimes PCIe devices are connected directly (if there is only
one), sometimes there is a PCIe switch on the board with devices behind it.
FPGA configuration is not likely to change at runtime, though it would
technically be possible. If there is an FPGA on a board, it is typically
a multi-function FPGA providing such functionality as GPIO pins, LED control,
and i2c master controllers.

We use DT overlays to describe the hardware on those boards and, if necessary,
its configuration. For example, if there is a PCIe switch, the overlay would
describe its memory and bus number configuration.

Guenter

Pantelis Antoniou

unread,
Nov 7, 2013, 2:20:02 AM11/7/13
to
Hi Dinh,
Thanks, fixed.

Pantelis Antoniou

unread,
Nov 7, 2013, 2:30:02 AM11/7/13
to
Hi Sebastian,

On Nov 6, 2013, at 10:41 PM, Sebastian Andrzej Siewior wrote:

> On 06.11.13, Sebastian Andrzej Siewior wrote:
>>> It has been discussed.
>>>
>>> We are doing it because
>
> Please don't get me wrong. I am not against this, I am just curious why
> this needs to be done at runtime and bootloader time is not an option.
>

No harm done.

I guess it all comes down to being done at runtime being the better option.

> Sebastian

Regards

-- Pantelis

Pantelis Antoniou

unread,
Nov 7, 2013, 2:30:02 AM11/7/13
to
Hi Sebestian,

On Nov 6, 2013, at 10:31 PM, Sebastian Andrzej Siewior wrote:

> On 06.11.13, Pantelis Antoniou wrote:
>> Hi Sebastian,
> Hi Pantelis,
>
>> It has been discussed.
>>
>> We are doing it because
>>
>> a) We tried to do it in u-boot and it has been a complete disaster.
>> Regular users just can't handle bootloader updates.
>
> How so? The "additional" dtb piece was deleted by accident as part of
> the u-boot upgrade? Do you maybe a link which describes the disaster?
>

You're assuming that bootloaders are anything like u-boot or barebox.
In the field, especially on consumer products, bootloaders are custom one-off
jobs that don't do much besides handing control to the kernel as soon as possible.

Even when using u-boot, end users botch updates related to bootloader in
such a way that systems end up RMAed.

Ask Koen Kooi in the CClist about the messy details.

>> b) It is similar to that. It was originally created for the beaglebone,
>> which has a concept of capes (similar to Arduino shields).
>> http://circuitco.com/support/index.php?title=BeagleBone_Capes
>> Turns out it's really useful to anyone doing reconfigurable hardware too,
>> so that's why FPGA people are thinking of using it.
>
> I am aware of this. My understanding is that those capes have hardware
> information encoded in an eeprom behind i2c _and_ you can't or should
> not replace capes at runtime.
> Naive as I am I *assume* it should be easy to read this eeprom in u-boot
> as part of the boot setup and extend the dtb before passing it to the
> kernel. In case this works well, the problem here is a) ?
>

It is just better system design to have it all done in the kernel.
Other people in the list can chime in, but it's hard to get a feel of the
problem if you haven't dealt with it before.

>> c) There are people that want to tinker with Linux based hardware boards
>> but are not kernel developers. This gives them a way to do so without
>> having to recompile the kernel and/or reboot while tinkering.
>
> I understand that they want to avoid reboot. But they still need to
> recompile the device tree, don't they? Or does this allow to change the
> HW description without knowing the syntax of .dts?
>

They understand the syntax of the DTS (barely).
They can't hack compiling the kernel and updating it, not by a long shot.
Not everyone is a kernel hacker (neither it needs be).

Imagine people coming over from Arduino trying to hack a 4K board file to
add support for the thing they're working on.

>> Regards
>>
>> -- Pantelis
>
> Sebastian

Regards

-- Pantelis

Pantelis Antoniou

unread,
Nov 7, 2013, 2:30:02 AM11/7/13
to
Hi Guenter,

On Nov 6, 2013, at 11:17 PM, Guenter Roeck wrote:

> On Wed, Nov 06, 2013 at 09:38:21PM +0100, Sebastian Andrzej Siewior wrote:
>> On 06.11.13, Guenter Roeck wrote:
>>> At least that is our use case. u-boot doesn't know which cards are going to be
>>> inserted at runtime. Even PCIe hotplug itself is insufficient, as the PCIe
>>> configuration differs per card, and the cards support a variety of i2c devices
>>> as well as other card specific devices such LEDs and multi-function FPGAs.
>>
>> So you have your FPGA behind PCIe and you use the DT to describe the
>> chips behind i2c? And then you update your FPGA and want update the
>> devices in DT without reboot?
>>
> No.
>
> We have a variety of boards ...
[snip]

TL;DR the real world...

>
> We use DT overlays to describe the hardware on those boards and, if necessary,
> its configuration. For example, if there is a PCIe switch, the overlay would
> describe its memory and bus number configuration.
>
> Guenter

Glad it's been useful...

Regards

-- Pantelis--

Pantelis Antoniou

unread,
Nov 7, 2013, 2:50:01 AM11/7/13
to
Hi Matt,
I see. This can be made to work.

> Only problem is that configfs doesn't support binary attributes like
> sysfs. If it is a agreed that overlays are configuration then that would
> be a strong argument to bring over the binary attribute feature.
>
>

Oops...

>> /config/dto/0/remove <- unload by echo 1 >/config/dto/0/remove
>
> rmdir /config/dto/0
>
>> /config/dto/0/${prop} <- root level properties that are ignore by the overlay
>> mechanism
>

Oh well, let's see what the maintainer have to say about which way to do.
Any option of the tree presented would work fine.

> -Matt

Regards

-- Pantelis

Alexander Sverdlin

unread,
Nov 7, 2013, 4:50:02 AM11/7/13
to
Hi!

On 06/11/13 20:08, ext Pantelis Antoniou wrote:
>>> The following patchset introduces Device Tree overlays, a method
>>> of dynamically altering the kernel's live Device Tree, along with
>>> a generic interface to use it in a board agnostic manner.
>>
>> In case this has been discussed and I missed it: Why are we doing this?
>> Isn't it possible to d o the overlay thingy in u-boot and pass a
>> complete device tree to the kernel?
>> Are you trying to do something like hotplug-PCI where the PCI card can
>> be replaced at runtime?
>>
>
> It has been discussed.
>
> We are doing it because
>
> a) We tried to do it in u-boot and it has been a complete disaster.
> Regular users just can't handle bootloader updates.

There are buses we simply do not want to deal with in uboot, like SRIO, PCIe, everything-over-ethernet.
All of them are hot-pluggable. And yeah, device-tree processing in uboot is possible, but in Linux
it's way easier...

> b) It is similar to that. It was originally created for the beaglebone,
> which has a concept of capes (similar to Arduino shields).
> http://circuitco.com/support/index.php?title=BeagleBone_Capes
> Turns out it's really useful to anyone doing reconfigurable hardware too,
> so that's why FPGA people are thinking of using it.
>
> c) There are people that want to tinker with Linux based hardware boards
> but are not kernel developers. This gives them a way to do so without
> having to recompile the kernel and/or reboot while tinkering.
>
>> Sebastian
>
> Regards
>
> -- Pantelis
>
>
>

--
Best regards,
Alexander Sverdlin.

Sebastian Andrzej Siewior

unread,
Nov 7, 2013, 2:30:02 PM11/7/13
to
On 06.11.13, Guenter Roeck wrote:
|…
thanks for the explanation.

> We use DT overlays to describe the hardware on those boards and, if necessary,
> its configuration. For example, if there is a PCIe switch, the overlay would
> describe its memory and bus number configuration.

So have your "fix" configuration and a few overlays you switch at
runtime. The problem you have is that you want to switch a specific part
if your configuration at runtime. I assume you run DT on ARM. What
happens if you swtich from ARM to x86 and you "keep" your FPGA
configuration requirement? You can't use both, DT and ACPI, right? So
what happens then?

>
> Guenter

Sebastian

Pantelis Antoniou

unread,
Nov 7, 2013, 3:10:02 PM11/7/13
to
Hi Sebastian,

On Nov 7, 2013, at 9:25 PM, Sebastian Andrzej Siewior wrote:

> On 06.11.13, Guenter Roeck wrote:
> |�
> thanks for the explanation.
>
>> We use DT overlays to describe the hardware on those boards and, if necessary,
>> its configuration. For example, if there is a PCIe switch, the overlay would
>> describe its memory and bus number configuration.
>
> So have your "fix" configuration and a few overlays you switch at
> runtime. The problem you have is that you want to switch a specific part
> if your configuration at runtime. I assume you run DT on ARM. What
> happens if you swtich from ARM to x86 and you "keep" your FPGA
> configuration requirement? You can't use both, DT and ACPI, right? So
> what happens then?
>

FWIW DT has been ported to x86. And is present on arm/powerpc/mips/arc and possibly
others.

So what are we talking about again? If you care about the non-DT case, why
don't you make a patch about how you could support Guenter's use case on
the x86.

His use case is not uncommon, believe it or not, and x86 would benefit from
something this flexible.

>>
>> Guenter
>
> Sebastian

Regards

-- Pantelis--

Sebastian Andrzej Siewior

unread,
Nov 7, 2013, 3:50:01 PM11/7/13
to
On 07.11.13, Pantelis Antoniou wrote:
> Hi Sebastian,
Hi Pantelis,

> FWIW DT has been ported to x86. And is present on arm/powerpc/mips/arc and possibly
> others.

Yes, I know. I am the one that did the work for CE4100, the first one
that boots with DT on x86.

> So what are we talking about again? If you care about the non-DT case, why
> don't you make a patch about how you could support Guenter's use case on
> the x86.

I am only saying that this "hot-plug a device at a non hot-plugagle bus at
runtime" is not limited to DT but this solution is. X86 + ACPI is not
the only limitation. ARM is (forced) going to ACPI as well as far I
know. And this solution is limited to DT. This is what I am pointing
out.

> His use case is not uncommon, believe it or not, and x86 would benefit from
> something this flexible.

I *think* a more flexible solution would be something like bus_type which is
exposed via configfs. It would be attached behind a certain device/bus where
the "physical" hotplug interface is. The user would then be able to read the
configuration based on whatever information he has and could then create
devices he likes at runtime. This wouldn't depend much on the firmware that is
used but would require a little more work I think.

> Regards
>
> -- Pantelis

Sebastian
--

Guenter Roeck

unread,
Nov 7, 2013, 5:30:02 PM11/7/13
to
On Thu, Nov 07, 2013 at 08:25:58PM +0100, Sebastian Andrzej Siewior wrote:
> On 06.11.13, Guenter Roeck wrote:
> |…
> thanks for the explanation.
>
> > We use DT overlays to describe the hardware on those boards and, if necessary,
> > its configuration. For example, if there is a PCIe switch, the overlay would
> > describe its memory and bus number configuration.
>
> So have your "fix" configuration and a few overlays you switch at
> runtime. The problem you have is that you want to switch a specific part
> if your configuration at runtime. I assume you run DT on ARM. What
> happens if you swtich from ARM to x86 and you "keep" your FPGA
> configuration requirement? You can't use both, DT and ACPI, right? So
> what happens then?
>
We intend to use DT on x86 to augment ACPI data. There is a variety of reasons
why we can not use ACPI, nor do we want to as we prefer a single method for
handling OIR on all platforms. FWIW, the non-x86 platform is powerpc, not arm.

Guenter

Guenter Roeck

unread,
Nov 7, 2013, 6:00:01 PM11/7/13
to
On Thu, Nov 07, 2013 at 10:06:31PM +0200, Pantelis Antoniou wrote:
> Hi Sebastian,
>
> On Nov 7, 2013, at 9:25 PM, Sebastian Andrzej Siewior wrote:
>
> > On 06.11.13, Guenter Roeck wrote:
> > |…
> > thanks for the explanation.
> >
> >> We use DT overlays to describe the hardware on those boards and, if necessary,
> >> its configuration. For example, if there is a PCIe switch, the overlay would
> >> describe its memory and bus number configuration.
> >
> > So have your "fix" configuration and a few overlays you switch at
> > runtime. The problem you have is that you want to switch a specific part
> > if your configuration at runtime. I assume you run DT on ARM. What

powerpc

> > happens if you swtich from ARM to x86 and you "keep" your FPGA
> > configuration requirement? You can't use both, DT and ACPI, right? So
> > what happens then?
> >
>
> FWIW DT has been ported to x86. And is present on arm/powerpc/mips/arc and possibly
> others.
>
> So what are we talking about again? If you care about the non-DT case, why
> don't you make a patch about how you could support Guenter's use case on
> the x86.
>
> His use case is not uncommon, believe it or not, and x86 would benefit from
> something this flexible.
>
Together with the work Thierry has done a couple of years ago, using DT
to augment ACPI data on x86 platforms, I don't really see a major problem
with using DT overlays on x86. Sure, it will require some work, and the
resulting patches may not be accepted for upstream integration,
but the concept is already there, and we plan to make good use of it.

Guenter

Guenter Roeck

unread,
Nov 7, 2013, 6:10:02 PM11/7/13
to
On Thu, Nov 07, 2013 at 09:46:26PM +0100, Sebastian Andrzej Siewior wrote:
> On 07.11.13, Pantelis Antoniou wrote:
> > Hi Sebastian,
> Hi Pantelis,
>
> > FWIW DT has been ported to x86. And is present on arm/powerpc/mips/arc and possibly
> > others.
>
> Yes, I know. I am the one that did the work for CE4100, the first one
> that boots with DT on x86.
>
> > So what are we talking about again? If you care about the non-DT case, why
> > don't you make a patch about how you could support Guenter's use case on
> > the x86.
>
> I am only saying that this "hot-plug a device at a non hot-plugagle bus at
> runtime" is not limited to DT but this solution is. X86 + ACPI is not
> the only limitation. ARM is (forced) going to ACPI as well as far I
> know. And this solution is limited to DT. This is what I am pointing
> out.
>
I can't tell about ARM, but I am not entirely sure how ACPI support on ARM
is going to help us on powerpc.

> > His use case is not uncommon, believe it or not, and x86 would benefit from
> > something this flexible.
>
> I *think* a more flexible solution would be something like bus_type which is
> exposed via configfs. It would be attached behind a certain device/bus where
> the "physical" hotplug interface is. The user would then be able to read the
> configuration based on whatever information he has and could then create
> devices he likes at runtime. This wouldn't depend much on the firmware that is
> used but would require a little more work I think.
>
Quite frankly, I am interested at a solution that works and solves our problems.
I am not looking for something that is 100% perfect and may never be delivered.

Fortunately, the Linux kernel was willing to adopt multiple different file
systems, and still accepts new ones on a regular basis. If a new file system
is better, it will start getting used, and old file systems are being phased out
as fewer people use them. I would hope the same should be possible with DT
overlays and possible other future solutions for the same problem, and that
we won't have to wait for the perfect solution from day 1.

Guenter

delicious quinoa

unread,
Nov 7, 2013, 6:40:02 PM11/7/13
to
On Tue, Nov 5, 2013 at 12:41 PM, Pantelis Antoniou
<pa...@antoniou-consulting.com> wrote:
> +
> + pr_info("%s: Applied #%d overlay segments @%d\n", __func__,
> + od->ovinfo_cnt, od->id);
> +

This could be pr_debug so that we get normally get silence unless
something fails, like insmod.

Also please take out the '#'.

We see it working with our fpga ip driver and it looks really great.

Alan

Pantelis Antoniou

unread,
Nov 8, 2013, 2:10:02 AM11/8/13
to
Hi Sebastian,

On Nov 7, 2013, at 10:46 PM, Sebastian Andrzej Siewior wrote:

> On 07.11.13, Pantelis Antoniou wrote:
>> Hi Sebastian,
> Hi Pantelis,
>
>> FWIW DT has been ported to x86. And is present on arm/powerpc/mips/arc and possibly
>> others.
>
> Yes, I know. I am the one that did the work for CE4100, the first one
> that boots with DT on x86.
>
>> So what are we talking about again? If you care about the non-DT case, why
>> don't you make a patch about how you could support Guenter's use case on
>> the x86.
>
> I am only saying that this "hot-plug a device at a non hot-plugagle bus at
> runtime" is not limited to DT but this solution is. X86 + ACPI is not
> the only limitation. ARM is (forced) going to ACPI as well as far I
> know. And this solution is limited to DT. This is what I am pointing
> out.
>

Who is forcing ACPI on ARM? Maybe for ARM64 and server markets but interest
in ACPI for all the other markets I'd say is nil.

A DT limited solution has more reach _today_ that what ACPI _might_ do sometime.

There is a big big world outside of x86.

>> His use case is not uncommon, believe it or not, and x86 would benefit from
>> something this flexible.
>
> I *think* a more flexible solution would be something like bus_type which is
> exposed via configfs. It would be attached behind a certain device/bus where
> the "physical" hotplug interface is. The user would then be able to read the
> configuration based on whatever information he has and could then create
> devices he likes at runtime. This wouldn't depend much on the firmware that is
> used but would require a little more work I think.
>

You might've missed the posting, but the original implementation was using a
bus (a capebus) and that went over like a lead ballon.

People use this, and find the concept useful.

In a nutshell, sure, there _might_ be better ways to do it, but no-one has
actually stepped forward and did it better.

>> Regards
>>
>> -- Pantelis
>
> Sebastian

Regards

-- Pantelis

Pantelis Antoniou

unread,
Nov 8, 2013, 2:20:01 AM11/8/13
to
Hi Alan,

On Nov 8, 2013, at 1:38 AM, delicious quinoa wrote:

> On Tue, Nov 5, 2013 at 12:41 PM, Pantelis Antoniou
> <pa...@antoniou-consulting.com> wrote:
>> +
>> + pr_info("%s: Applied #%d overlay segments @%d\n", __func__,
>> + od->ovinfo_cnt, od->id);
>> +
>
> This could be pr_debug so that we get normally get silence unless
> something fails, like insmod.
>
> Also please take out the '#'.
>

OK, will do on the next series.

> We see it working with our fpga ip driver and it looks really great.
>

Glad to hear that.

> Alan

Regards

-- Pantelis

Pantelis Antoniou

unread,
Nov 8, 2013, 2:20:01 AM11/8/13
to
Hi Guenter,
+1

> Fortunately, the Linux kernel was willing to adopt multiple different file
> systems, and still accepts new ones on a regular basis. If a new file system
> is better, it will start getting used, and old file systems are being phased out
> as fewer people use them. I would hope the same should be possible with DT
> overlays and possible other future solutions for the same problem, and that
> we won't have to wait for the perfect solution from day 1.
>

Fully agreed here. I was told open source is about scratching an itch, this
is worthy of scratching.

> Guenter

Regards

-- Pantelis

Sebastian Andrzej Siewior

unread,
Nov 8, 2013, 3:50:02 AM11/8/13
to
On 07.11.13, Guenter Roeck wrote:
> > I am only saying that this "hot-plug a device at a non hot-plugagle bus at
> > runtime" is not limited to DT but this solution is. X86 + ACPI is not
> > the only limitation. ARM is (forced) going to ACPI as well as far I
> > know. And this solution is limited to DT. This is what I am pointing
> > out.
> >
> I can't tell about ARM, but I am not entirely sure how ACPI support on ARM
> is going to help us on powerpc.

I'm not saying help. Just if you extend one firmware to solve a problem
you need to do it again on an another firmware.

> > > His use case is not uncommon, believe it or not, and x86 would benefit from
> > > something this flexible.
> >
> > I *think* a more flexible solution would be something like bus_type which is
> > exposed via configfs. It would be attached behind a certain device/bus where
> > the "physical" hotplug interface is. The user would then be able to read the
> > configuration based on whatever information he has and could then create
> > devices he likes at runtime. This wouldn't depend much on the firmware that is
> > used but would require a little more work I think.
> >
> Quite frankly, I am interested at a solution that works and solves our problems.
> I am not looking for something that is 100% perfect and may never be delivered.
>
> Fortunately, the Linux kernel was willing to adopt multiple different file
> systems, and still accepts new ones on a regular basis. If a new file system
> is better, it will start getting used, and old file systems are being phased out
> as fewer people use them. I would hope the same should be possible with DT
> overlays and possible other future solutions for the same problem, and that
> we won't have to wait for the perfect solution from day 1.

don't argue about that. However you have to provide support for those
things, they don't simply phase out. You still have support for reiserfs
but I think a lot (likely not all) of the its users migrated away.

If you guys already talked about this and everyone agreed on this then I
am not standing in your way, never was. I was just curious about the
background for this.

>
> Guenter

Sebastian

Pantelis Antoniou

unread,
Nov 8, 2013, 10:10:01 AM11/8/13
to
Introduce support for dynamic device tree resolution.
Using it, it is possible to prepare a device tree that's
been loaded on runtime to be modified and inserted at the kernel
live tree.

Export of of_resolve by Guenter Roeck <gro...@juniper.net>

Signed-off-by: Pantelis Antoniou <pa...@antoniou-consulting.com>
---
.../devicetree/dynamic-resolution-notes.txt | 25 ++
drivers/of/Kconfig | 9 +
drivers/of/Makefile | 1 +
drivers/of/resolver.c | 396 +++++++++++++++++++++
include/linux/of.h | 17 +
5 files changed, 448 insertions(+)
create mode 100644 Documentation/devicetree/dynamic-resolution-notes.txt
create mode 100644 drivers/of/resolver.c

diff --git a/Documentation/devicetree/dynamic-resolution-notes.txt b/Documentation/devicetree/dynamic-resolution-notes.txt
new file mode 100644
index 0000000..0b396c4
--- /dev/null
+++ b/Documentation/devicetree/dynamic-resolution-notes.txt
@@ -0,0 +1,25 @@
+Device Tree Dynamic Resolver Notes
+----------------------------------
+
+This document describes the implementation of the in-kernel
+Device Tree resolver, residing in drivers/of/resolver.c and is a
+companion document to Documentation/devicetree/dt-object-internal.txt[1]
+
+How the resolver works
+----------------------
+
+The resolver is given as an input an arbitrary tree compiled with the
+proper dtc option and having a /plugin/ tag. This generates the
+appropriate __fixups__ & __local_fixups__ nodes as described in [1].
+
+In sequence the resolver works by the following steps:
+
+1. Get the maximum device tree phandle value from the live tree + 1.
+2. Adjust all the local phandles of the tree to resolve by that amount.
+3. Using the __local__fixups__ node information adjust all local references
+ by the same amount.
+4. For each property in the __fixups__ node locate the node it references
+ in the live tree. This is the label used to tag the node.
+5. Retrieve the phandle of the target of the fixup.
+5. For each fixup in the property locate the node:property:offset location
+ and replace it with the phandle value.
diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig
index 78cc760..2a00ae5 100644
--- a/drivers/of/Kconfig
+++ b/drivers/of/Kconfig
@@ -74,4 +74,13 @@ config OF_MTD
depends on MTD
def_bool y

+config OF_RESOLVE
+ bool "OF Dynamic resolution support"
+ depends on OF
+ select OF_DYNAMIC
+ select OF_DEVICE
+ help
+ Enable OF dynamic resolution support. This allows you to
+ load Device Tree object fragments are run time.
+
endmenu # OF
diff --git a/drivers/of/Makefile b/drivers/of/Makefile
index 9bc6d8c..93da457 100644
--- a/drivers/of/Makefile
+++ b/drivers/of/Makefile
@@ -9,3 +9,4 @@ obj-$(CONFIG_OF_MDIO) += of_mdio.o
obj-$(CONFIG_OF_PCI) += of_pci.o
obj-$(CONFIG_OF_PCI_IRQ) += of_pci_irq.o
obj-$(CONFIG_OF_MTD) += of_mtd.o
+obj-$(CONFIG_OF_RESOLVE) += resolver.o
diff --git a/drivers/of/resolver.c b/drivers/of/resolver.c
new file mode 100644
index 0000000..dfbb51a
--- /dev/null
+++ b/drivers/of/resolver.c
@@ -0,0 +1,396 @@
+/*
+ * Functions for dealing with DT resolution
+ *
+ * Copyright (C) 2012 Pantelis Antoniou <pa...@antoniou-consulting.com>
+ * Copyright (C) 2012 Texas Instruments Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/of_fdt.h>
+#include <linux/string.h>
+#include <linux/ctype.h>
+#include <linux/errno.h>
+#include <linux/string.h>
+#include <linux/slab.h>
+
+/**
+ * Find a subtree's maximum phandle value.
+ */
+static phandle __of_get_tree_max_phandle(struct device_node *node,
+ phandle max_phandle)
+{
+ struct device_node *child;
+
+ if (node->phandle != 0 && node->phandle != OF_PHANDLE_ILLEGAL &&
+ node->phandle > max_phandle)
+ max_phandle = node->phandle;
+
+ __for_each_child_of_node(node, child)
+ max_phandle = __of_get_tree_max_phandle(child, max_phandle);
+
+ return max_phandle;
+}
+
+/**
+ * Find live tree's maximum phandle value.
+ */
+static phandle of_get_tree_max_phandle(void)
+{
+ struct device_node *node;
+ phandle phandle;
+ unsigned long flags;
+
+ /* get root node */
+ node = of_find_node_by_path("/");
+ if (node == NULL)
+ return OF_PHANDLE_ILLEGAL;
+
+ /* now search recursively */
+ raw_spin_lock_irqsave(&devtree_lock, flags);
+ phandle = __of_get_tree_max_phandle(node, 0);
+ raw_spin_unlock_irqrestore(&devtree_lock, flags);
+
+ of_node_put(node);
+
+ return phandle;
+}
+
+/**
+ * Adjust a subtree's phandle values by a given delta.
+ * Makes sure not to just adjust the device node's phandle value,
+ * but modify the phandle properties values as well.
+ */
+static void __of_adjust_tree_phandles(struct device_node *node,
+ int phandle_delta)
+{
+ struct device_node *child;
+ struct property *prop;
+ phandle phandle;
+
+ /* first adjust the node's phandle direct value */
+ if (node->phandle != 0 && node->phandle != OF_PHANDLE_ILLEGAL)
+ node->phandle += phandle_delta;
+
+ /* now adjust phandle & linux,phandle values */
+ for_each_property_of_node(node, prop) {
+
+ /* only look for these two */
+ if (of_prop_cmp(prop->name, "phandle") != 0 &&
+ of_prop_cmp(prop->name, "linux,phandle") != 0)
+ continue;
+
+ /* must be big enough */
+ if (prop->length < 4)
+ continue;
+
+ /* read phandle value */
+ phandle = be32_to_cpu(*(uint32_t *)prop->value);
+ if (phandle == OF_PHANDLE_ILLEGAL) /* unresolved */
+ continue;
+
+ /* adjust */
+ *(uint32_t *)prop->value = cpu_to_be32(node->phandle);
+ }
+
+ /* now do the children recursively */
+ __for_each_child_of_node(node, child)
+ __of_adjust_tree_phandles(child, phandle_delta);
+}
+
+/**
+ * Adjust the local phandle references by the given phandle delta.
+ * Assumes the existances of a __local_fixups__ node at the root
+ * of the tree. Does not take any devtree locks so make sure you
+ * call this on a tree which is at the detached state.
+ */
+static int __of_adjust_tree_phandle_references(struct device_node *node,
+ int phandle_delta)
+{
+ phandle phandle;
+ struct device_node *refnode, *child;
+ struct property *rprop, *sprop;
+ char *propval, *propcur, *propend, *nodestr, *propstr, *s;
+ int offset, propcurlen;
+ int err;
+
+ /* locate the symbols & fixups nodes on resolve */
+ __for_each_child_of_node(node, child)
+ if (of_node_cmp(child->name, "__local_fixups__") == 0)
+ break;
+
+ /* no local fixups */
+ if (child == NULL)
+ return 0;
+
+ /* find the local fixups property */
+ for_each_property_of_node(child, rprop) {
+
+ /* skip properties added automatically */
+ if (of_prop_cmp(rprop->name, "name") == 0)
+ continue;
+
+ /* make a copy */
+ propval = kmalloc(rprop->length, GFP_KERNEL);
+ if (propval == NULL) {
+ pr_err("%s: Could not copy value of '%s'\n",
+ __func__, rprop->name);
+ return -ENOMEM;
+ }
+ memcpy(propval, rprop->value, rprop->length);
+
+ propend = propval + rprop->length;
+ for (propcur = propval; propcur < propend;
+ propcur += propcurlen + 1) {
+
+ propcurlen = strlen(propcur);
+
+ nodestr = propcur;
+ s = strchr(propcur, ':');
+ if (s == NULL) {
+ pr_err("%s: Illegal symbol entry '%s' (1)\n",
+ __func__, propcur);
+ err = -EINVAL;
+ goto err_fail;
+ }
+ *s++ = '\0';
+
+ propstr = s;
+ s = strchr(s, ':');
+ if (s == NULL) {
+ pr_err("%s: Illegal symbol entry '%s' (2)\n",
+ __func__, (char *)rprop->value);
+ err = -EINVAL;
+ goto err_fail;
+ }
+
+ *s++ = '\0';
+ offset = simple_strtoul(s, NULL, 10);
+
+ /* look into the resolve node for the full path */
+ refnode = __of_find_node_by_full_name(node, nodestr);
+ if (refnode == NULL) {
+ pr_warn("%s: Could not find refnode '%s'\n",
+ __func__, (char *)rprop->value);
+ continue;
+ }
+
+ /* now find the property */
+ for_each_property_of_node(refnode, sprop) {
+ if (of_prop_cmp(sprop->name, propstr) == 0)
+ break;
+ }
+
+ if (sprop == NULL) {
+ pr_err("%s: Could not find property '%s'\n",
+ __func__, (char *)rprop->value);
+ err = -ENOENT;
+ goto err_fail;
+ }
+
+ phandle = be32_to_cpu(*(uint32_t *)
+ (sprop->value + offset));
+ *(uint32_t *)(sprop->value + offset) =
+ cpu_to_be32(phandle + phandle_delta);
+ }
+
+ kfree(propval);
+ }
+
+ return 0;
+
+err_fail:
+ kfree(propval);
+ return err;
+}
+
+/**
+ * of_resolve - Resolve the given node against the live tree.
+ *
+ * @resolve: Node to resolve
+ *
+ * Perform dynamic Device Tree resolution against the live tree
+ * to the given node to resolve. This depends on the live tree
+ * having a __symbols__ node, and the resolve node the __fixups__ &
+ * __local_fixups__ nodes (if needed).
+ * The result of the operation is a resolve node that it's contents
+ * are fit to be inserted or operate upon the live tree.
+ * Returns 0 on success or a negative error value on error.
+ */
+int of_resolve(struct device_node *resolve)
+{
+ struct device_node *child, *refnode;
+ struct device_node *root_sym, *resolve_sym, *resolve_fix;
+ struct property *rprop, *sprop;
+ const char *refpath;
+ char *propval, *propcur, *propend, *nodestr, *propstr, *s;
+ int offset, propcurlen;
+ phandle phandle, phandle_delta;
+ int err;
+
+ /* the resolve node must exist, and be detached */
+ if (resolve == NULL ||
+ !of_node_check_flag(resolve, OF_DETACHED)) {
+ return -EINVAL;
+ }
+
+ /* first we need to adjust the phandles */
+ phandle_delta = of_get_tree_max_phandle() + 1;
+ __of_adjust_tree_phandles(resolve, phandle_delta);
+ err = __of_adjust_tree_phandle_references(resolve, phandle_delta);
+ if (err != 0)
+ return err;
+
+ root_sym = NULL;
+ resolve_sym = NULL;
+ resolve_fix = NULL;
+
+ /* this may fail (if no fixups are required) */
+ root_sym = of_find_node_by_path("/__symbols__");
+
+ /* locate the symbols & fixups nodes on resolve */
+ __for_each_child_of_node(resolve, child) {
+
+ if (resolve_sym == NULL &&
+ of_node_cmp(child->name, "__symbols__") == 0)
+ resolve_sym = child;
+
+ if (resolve_fix == NULL &&
+ of_node_cmp(child->name, "__fixups__") == 0)
+ resolve_fix = child;
+
+ /* both found, don't bother anymore */
+ if (resolve_sym != NULL && resolve_fix != NULL)
+ break;
+ }
+
+ /* we do allow for the case where no fixups are needed */
+ if (resolve_fix == NULL)
+ goto merge_sym;
+
+ /* we need to fixup, but no root symbols... */
+ if (root_sym == NULL)
+ return -EINVAL;
+
+ for_each_property_of_node(resolve_fix, rprop) {
+
+ /* skip properties added automatically */
+ if (of_prop_cmp(rprop->name, "name") == 0)
+ continue;
+
+ err = of_property_read_string(root_sym,
+ rprop->name, &refpath);
+ if (err != 0) {
+ pr_err("%s: Could not find symbol '%s'\n",
+ __func__, rprop->name);
+ goto err_fail;
+ }
+
+ refnode = of_find_node_by_path(refpath);
+ if (refnode == NULL) {
+ pr_err("%s: Could not find node by path '%s'\n",
+ __func__, refpath);
+ err = -ENOENT;
+ goto err_fail;
+ }
+
+ phandle = refnode->phandle;
+ of_node_put(refnode);
+
+ pr_debug("%s: %s phandle is 0x%08x\n",
+ __func__, rprop->name, phandle);
+
+ /* make a copy */
+ propval = kmalloc(rprop->length, GFP_KERNEL);
+ if (propval == NULL) {
+ pr_err("%s: Could not copy value of '%s'\n",
+ __func__, rprop->name);
+ err = -ENOMEM;
+ goto err_fail;
+ }
+
+ memcpy(propval, rprop->value, rprop->length);
+
+ propend = propval + rprop->length;
+ for (propcur = propval; propcur < propend;
+ propcur += propcurlen + 1) {
+ propcurlen = strlen(propcur);
+
+ nodestr = propcur;
+ s = strchr(propcur, ':');
+ if (s == NULL) {
+ pr_err("%s: Illegal symbol "
+ "entry '%s' (1)\n",
+ __func__, (char *)rprop->value);
+ kfree(propval);
+ err = -EINVAL;
+ goto err_fail;
+ }
+ *s++ = '\0';
+
+ propstr = s;
+ s = strchr(s, ':');
+ if (s == NULL) {
+ pr_err("%s: Illegal symbol "
+ "entry '%s' (2)\n",
+ __func__, (char *)rprop->value);
+ kfree(propval);
+ err = -EINVAL;
+ goto err_fail;
+ }
+
+ *s++ = '\0';
+ offset = simple_strtoul(s, NULL, 10);
+
+ /* look into the resolve node for the full path */
+ refnode = __of_find_node_by_full_name(resolve,
+ nodestr);
+ if (refnode == NULL) {
+ pr_err("%s: Could not find refnode '%s'\n",
+ __func__, (char *)rprop->value);
+ kfree(propval);
+ err = -ENOENT;
+ goto err_fail;
+ }
+
+ /* now find the property */
+ for_each_property_of_node(refnode, sprop) {
+ if (of_prop_cmp(sprop->name, propstr) == 0)
+ break;
+ }
+
+ if (sprop == NULL) {
+ pr_err("%s: Could not find property '%s'\n",
+ __func__, (char *)rprop->value);
+ kfree(propval);
+ err = -ENOENT;
+ goto err_fail;
+ }
+
+ *(uint32_t *)(sprop->value + offset) =
+ cpu_to_be32(phandle);
+ }
+
+ kfree(propval);
+ }
+
+merge_sym:
+
+ of_node_put(root_sym);
+
+ return 0;
+
+err_fail:
+
+ if (root_sym != NULL)
+ of_node_put(root_sym);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(of_resolve);
diff --git a/include/linux/of.h b/include/linux/of.h
index 9d69bd2..22d42e5 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -721,4 +721,21 @@ static inline int of_multi_prop_cmp(const struct property *prop, const char *val

#endif /* !CONFIG_OF */

+
+/* illegal phandle value (set when unresolved) */
+#define OF_PHANDLE_ILLEGAL 0xdeadbeef
+
+#ifdef CONFIG_OF_RESOLVE
+
+int of_resolve(struct device_node *resolve);
+
+#else
+
+static inline int of_resolve(struct device_node *resolve)
+{
+ return -ENOTSUPP;
+}
+
+#endif
+
#endif /* _LINUX_OF_H */
--
1.7.12

Pantelis Antoniou

unread,
Nov 8, 2013, 10:10:02 AM11/8/13
to
Add a runtime interface to /proc to enable generic device tree overlay
usage.

Two new /proc files are added:

/proc/device-tree-overlay & /proc/device-tree-overlay-status

/proc/device-tree-overlay accepts a stream of a device tree objects and
applies it to the running kernel's device tree.

$ cat ~/BB-UART2-00A0.dtbo >device-tree-overlay
overlay_proc_release: Applied #2 overlay segments @0

/proc/device-tree-overlay-status displays the the overlays added using
the /proc interface

$ cat device-tree-overlay-status
0: 861 bytes BB-UART2:00A0

The format of the status line is
<ID>: <SIZE> bytes <part-number>:<version>

<ID> is the id of the overlay
<SIZE> is the size of the overlay in bytes
<part-number>, <version> are (optional) root level properties of the DTBO

You can remove an overlay by echoing the <ID> number of the overlay
precedded with a '-'

So
$ echo "-0" >device-tree-overlay-status

Removes the overlay.

Note that this seldom works on most platforms since platform_device
removal is something that almost never works without extra patches.

Signed-off-by: Pantelis Antoniou <pa...@antoniou-consulting.com>
---
fs/proc/proc_devtree.c | 278 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 278 insertions(+)

diff --git a/fs/proc/proc_devtree.c b/fs/proc/proc_devtree.c
index 106a835..48f7925 100644
--- a/fs/proc/proc_devtree.c
+++ b/fs/proc/proc_devtree.c
@@ -16,6 +16,9 @@
#include <linux/slab.h>
#include <asm/prom.h>
#include <asm/uaccess.h>
+#include <linux/of_fdt.h>
+#include <linux/idr.h>
+
#include "internal.h"

static inline void set_node_proc_entry(struct device_node *np,
@@ -28,6 +31,275 @@ static inline void set_node_proc_entry(struct device_node *np,

static struct proc_dir_entry *proc_device_tree;

+#if defined(CONFIG_OF_OVERLAY)
+static struct proc_dir_entry *proc_device_tree_overlay;
+static struct proc_dir_entry *proc_device_tree_overlay_status;
+static DEFINE_MUTEX(overlay_lock);
+static DEFINE_IDR(overlay_idr);
+
+struct proc_overlay_data {
+ void *buf;
+ size_t alloc;
+ size_t size;
+
+ int id;
+ struct device_node *overlay;
+ int ovinfo_cnt;
+ struct of_overlay_info *ovinfo;
+ unsigned int failed : 1;
+ unsigned int applied : 1;
+ unsigned int removing : 1;
+};
+
+static int overlay_proc_open(struct inode *inode, struct file *file)
+{
+ struct proc_overlay_data *od;
+
+ od = kzalloc(sizeof(*od), GFP_KERNEL);
+ if (od == NULL)
+ return -ENOMEM;
+
+ od->id = -1;
+
+ /* save it */
+ file->private_data = od;
+
+ return 0;
+}
+
+static ssize_t overlay_proc_write(struct file *file, const char __user *buf, size_t size, loff_t *ppos)
+{
+ struct proc_overlay_data *od = file->private_data;
+ void *new_buf;
+
+ /* need to alloc? */
+ if (od->size + size > od->alloc) {
+
+ /* start at 256K at first */
+ if (od->alloc == 0)
+ od->alloc = SZ_256K / 2;
+
+ /* double buffer */
+ od->alloc <<= 1;
+ new_buf = kzalloc(od->alloc, GFP_KERNEL);
+ if (new_buf == NULL) {
+ pr_err("%s: failed to grow buffer\n", __func__);
+ od->failed = 1;
+ return -ENOMEM;
+ }
+
+ /* copy all we had previously */
+ memcpy(new_buf, od->buf, od->size);
+
+ /* free old buffer and assign new */
+ kfree(od->buf);
+ od->buf = new_buf;
+ }
+
+ if (unlikely(copy_from_user(od->buf + od->size, buf, size))) {
+ pr_err("%s: fault copying from userspace\n", __func__);
+ return -EFAULT;
+ }
+
+ od->size += size;
+ *ppos += size;
+
+ return size;
+}
+
+static int overlay_proc_release(struct inode *inode, struct file *file)
+{
+ struct proc_overlay_data *od = file->private_data;
+ int id;
+ int err = 0;
+
+ /* perfectly normal when not loading */
+ if (od == NULL)
+ return 0;
+
+ if (od->failed)
+ goto out_free;
+
+ of_fdt_unflatten_tree(od->buf, &od->overlay);
+ if (od->overlay == NULL) {
+ pr_err("%s: failed to unflatten tree\n", __func__);
+ err = -EINVAL;
+ goto out_free;
+ }
+ pr_debug("%s: unflattened OK\n", __func__);
+
+ /* mark it as detached */
+ of_node_set_flag(od->overlay, OF_DETACHED);
+
+ /* perform resolution */
+ err = of_resolve(od->overlay);
+ if (err != 0) {
+ pr_err("%s: Failed to resolve tree\n", __func__);
+ goto out_free;
+ }
+ pr_debug("%s: resolved OK\n", __func__);
+
+ /* now build an overlay info array */
+ err = of_build_overlay_info(od->overlay,
+ &od->ovinfo_cnt, &od->ovinfo);
+ if (err != 0) {
+ pr_err("%s: Failed to build overlay info\n", __func__);
+ goto out_free;
+ }
+
+ pr_debug("%s: built %d overlay segments\n", __func__,
+ od->ovinfo_cnt);
+
+ err = of_overlay(od->ovinfo_cnt, od->ovinfo);
+ if (err != 0) {
+ pr_err("%s: Failed to apply overlay\n", __func__);
+ goto out_free;
+ }
+
+ od->applied = 1;
+
+ mutex_lock(&overlay_lock);
+ idr_preload(GFP_KERNEL);
+ id = idr_alloc(&overlay_idr, od, 0, -1, GFP_KERNEL);
+ idr_preload_end();
+ mutex_unlock(&overlay_lock);
+
+ if (id < 0) {
+ err = id;
+ pr_err("%s: failed to get id for overlay\n", __func__);
+ goto out_free;
+ }
+ od->id = id;
+
+ pr_info("%s: Applied #%d overlay segments @%d\n", __func__,
+ od->ovinfo_cnt, od->id);
+
+ return 0;
+
+out_free:
+ if (od->id != -1)
+ idr_remove(&overlay_idr, od->id);
+ if (od->applied)
+ of_overlay_revert(od->ovinfo_cnt, od->ovinfo);
+ if (od->ovinfo)
+ of_free_overlay_info(od->ovinfo_cnt, od->ovinfo);
+ /* release memory */
+ kfree(od->buf);
+ kfree(od);
+
+ return 0;
+}
+
+static const struct file_operations overlay_proc_fops = {
+ .owner = THIS_MODULE,
+ .open = overlay_proc_open,
+ .write = overlay_proc_write,
+ .release = overlay_proc_release,
+};
+
+/*
+ * Supply data on a read from /proc/device-tree-overlay-status
+ */
+static int overlay_status_proc_show(struct seq_file *m, void *v)
+{
+ int err, id;
+ struct proc_overlay_data *od;
+ const char *part_number, *version;
+
+ rcu_read_lock();
+ idr_for_each_entry(&overlay_idr, od, id) {
+ seq_printf(m, "%d: %d bytes", id, od->size);
+ /* TODO Make this standardized? */
+ err = of_property_read_string(od->overlay, "part-number",
+ &part_number);
+ if (err != 0)
+ part_number = NULL;
+ err = of_property_read_string(od->overlay, "version",
+ &version);
+ if (err != 0)
+ version = NULL;
+ if (part_number) {
+ seq_printf(m, " %s", part_number);
+ if (version)
+ seq_printf(m, ":%s", version);
+ }
+ seq_printf(m, "\n");
+ }
+ rcu_read_unlock();
+
+ return 0;
+}
+
+static int overlay_status_proc_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, overlay_status_proc_show, __PDE_DATA(inode));
+}
+
+static ssize_t overlay_status_proc_write(struct file *file, const char __user *buf,
+ size_t size, loff_t *ppos)
+{
+ struct proc_overlay_data *od;
+ char buffer[PROC_NUMBUF + 1];
+ char *ptr;
+ int id, err, count;
+
+ memset(buffer, 0, sizeof(buffer));
+ count = size;
+ if (count > sizeof(buffer) - 1)
+ count = sizeof(buffer) - 1;
+
+ if (copy_from_user(buffer, buf, count)) {
+ err = -EFAULT;
+ goto out;
+ }
+
+ ptr = buffer;
+ if (*ptr != '-') { /* only removal supported */
+ err = -EINVAL;
+ goto out;
+ }
+ ptr++;
+ err = kstrtoint(strstrip(ptr), 0, &id);
+ if (err)
+ goto out;
+
+ /* find it */
+ mutex_lock(&overlay_lock);
+ od = idr_find(&overlay_idr, id);
+ if (od == NULL) {
+ mutex_unlock(&overlay_lock);
+ err = -EINVAL;
+ goto out;
+ }
+ /* remove it */
+ idr_remove(&overlay_idr, id);
+ mutex_unlock(&overlay_lock);
+
+ err = of_overlay_revert(od->ovinfo_cnt, od->ovinfo);
+ if (err != 0) {
+ pr_err("%s: of_overlay_revert failed\n", __func__);
+ goto out;
+ }
+ /* release memory */
+ of_free_overlay_info(od->ovinfo_cnt, od->ovinfo);
+ kfree(od->buf);
+ kfree(od);
+
+ pr_info("%s: Removed overlay with id %d\n", __func__, id);
+out:
+ return err < 0 ? err : count;
+}
+
+static const struct file_operations overlay_status_proc_fops = {
+ .owner = THIS_MODULE,
+ .open = overlay_status_proc_open,
+ .read = seq_read,
+ .write = overlay_status_proc_write,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+#endif
+
/*
* Supply data on a read from /proc/device-tree/node/property.
*/
@@ -239,5 +511,11 @@ void __init proc_device_tree_init(void)
return;
}
proc_device_tree_add_node(root, proc_device_tree);
+#if defined(CONFIG_OF_OVERLAY)
+ proc_device_tree_overlay = proc_create_data("device-tree-overlay",
+ S_IWUSR, NULL, &overlay_proc_fops, NULL);
+ proc_device_tree_overlay_status = proc_create_data("device-tree-overlay-status",
+ S_IRUSR| S_IWUSR, NULL, &overlay_status_proc_fops, NULL);
+#endif
of_node_put(root);

Pantelis Antoniou

unread,
Nov 8, 2013, 10:10:03 AM11/8/13
to
The following patchset introduces Device Tree overlays, a method
of dynamically altering the kernel's live Device Tree, along with
a generic interface to use it in a board agnostic manner.

It is against mainline as of today, Nov 5 2013:
be408cd3e1fef73e9408b196a79b9934697fe3b1
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

It relies on the following previously submitted patches/patchsets:

The DTC compiler patch required to get symbol resolution working

"dtc: Dynamic symbols & fixup support (v2)"

The I2C device registration patch

"of: i2c: Export single device registration method"

And finally a patchset exporting various OF fixes

"OF: Fixes in preperation of DT overlays"

Note that although this patchset allows DT overlay removal
on my preferred platform (omap), platform device registration
fails; another patchset that deals with this issue has been posted:

"omap_device removal fixups"

I should also note that the /proc interface is an anachronism, but
it is the place where /proc/device-tree is also located, so it makes
sense to group /proc/device-tree & /proc/device-tree-overlay* together.
I am open to suggestions about where the generic interface should reside.
A suggestion has been made about configfs but I'd like to get this out
as a basis of discussion.

This low level interface should be used as a starting point for platforms
with discoverable hardware on the daughtercard level since they should use the
facilities provided to implement their own policy, dealing with things like
conflicts, high level resource allocation and so on.

Changes in V3:
* Fixed of_free_overlay_info() missing static inline
* Punting when handling PCI devices
* Missing EXPORTs()
all by Guenter Roeck <gro...@juniper.net>

Changes in V2:
* Removal of any bits related to a specific board (beaglebone).
* Introduced a platform agnostic interface using /proc/device-tree-overlay
* Various bug fixes related to i2c device handling have been squashed in.

Pantelis Antoniou (3):
OF: Introduce Device Tree resolve support.
OF: Introduce DT overlay support.
DT: proc: Add runtime overlay interface in /proc

.../devicetree/dynamic-resolution-notes.txt | 25 +
Documentation/devicetree/overlay-notes.txt | 179 +++++
drivers/of/Kconfig | 19 +
drivers/of/Makefile | 2 +
drivers/of/overlay.c | 886 +++++++++++++++++++++
drivers/of/resolver.c | 396 +++++++++
fs/proc/proc_devtree.c | 278 +++++++
include/linux/of.h | 130 +++
8 files changed, 1915 insertions(+)
create mode 100644 Documentation/devicetree/dynamic-resolution-notes.txt
create mode 100644 Documentation/devicetree/overlay-notes.txt
create mode 100644 drivers/of/overlay.c
create mode 100644 drivers/of/resolver.c

Pantelis Antoniou

unread,
Nov 8, 2013, 10:10:03 AM11/8/13
to
Introduce DT overlay support.
Using this functionality it is possible to dynamically overlay a part of
the kernel's tree with another tree that's been dynamically loaded.
It is also possible to remove node and properties.

Note that the I2C client devices are 'special', as in they're not platform
devices. They need to be registered with an I2C specific method.

In general I2C is just no good platform device citizen and needs
special casing.

PCI is another special case where PCI device insertion and removal
is implemented in the PCI subsystem.

Bug fixes & PCI support by Guenter Roeck <gro...@juniper.net>

Signed-off-by: Pantelis Antoniou <pa...@antoniou-consulting.com>
---
Documentation/devicetree/overlay-notes.txt | 179 ++++++
drivers/of/Kconfig | 10 +
drivers/of/Makefile | 1 +
drivers/of/overlay.c | 886 +++++++++++++++++++++++++++++
include/linux/of.h | 113 ++++
5 files changed, 1189 insertions(+)
create mode 100644 Documentation/devicetree/overlay-notes.txt
create mode 100644 drivers/of/overlay.c

diff --git a/Documentation/devicetree/overlay-notes.txt b/Documentation/devicetree/overlay-notes.txt
new file mode 100644
index 0000000..5289cbb
--- /dev/null
+++ b/Documentation/devicetree/overlay-notes.txt
@@ -0,0 +1,179 @@
+Device Tree Overlay Notes
+-------------------------
+
+This document describes the implementation of the in-kernel
diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig
index 2a00ae5..c2d5596 100644
--- a/drivers/of/Kconfig
+++ b/drivers/of/Kconfig
@@ -83,4 +83,14 @@ config OF_RESOLVE
Enable OF dynamic resolution support. This allows you to
load Device Tree object fragments are run time.

+config OF_OVERLAY
+ bool "OF overlay support"
+ depends on OF
+ select OF_DYNAMIC
+ select OF_DEVICE
+ select OF_RESOLVE
+ help
+ OpenFirmware overlay support. Allows you to modify on runtime the
+ live tree using overlays.
+
endmenu # OF
diff --git a/drivers/of/Makefile b/drivers/of/Makefile
index 93da457..ca466e4 100644
--- a/drivers/of/Makefile
+++ b/drivers/of/Makefile
@@ -10,3 +10,4 @@ obj-$(CONFIG_OF_PCI) += of_pci.o
obj-$(CONFIG_OF_PCI_IRQ) += of_pci_irq.o
obj-$(CONFIG_OF_MTD) += of_mtd.o
obj-$(CONFIG_OF_RESOLVE) += resolver.o
+obj-$(CONFIG_OF_OVERLAY) += overlay.o
diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
new file mode 100644
index 0000000..7d14881
--- /dev/null
+++ b/drivers/of/overlay.c
@@ -0,0 +1,886 @@
+/*
+ * Functions for working with device tree overlays
+ *
+ * Copyright (C) 2012 Pantelis Antoniou <pa...@antoniou-consulting.com>
+ * Copyright (C) 2012 Texas Instruments Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ */
+#undef DEBUG
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/i2c.h>
+#include <linux/string.h>
+#include <linux/ctype.h>
+#include <linux/errno.h>
+#include <linux/string.h>
+#include <linux/slab.h>
+ if (target == NULL || overlay == NULL)
+ return -EINVAL;
+
+ for_each_property_of_node(overlay, prop) {
+
+ /* don't touch, 'name' */
+ if (of_prop_cmp(prop->name, "name") == 0)
+ continue;
+
+ /* default is add */
+ remove = 0;
+ pname = prop->name;
+ if (*pname == '-') { /* skip, - notes removal */
+ pname++;
+ remove = 1;
+ propn = NULL;
+ } else {
+ propn = __of_copy_property(prop,
+ GFP_KERNEL);
+ if (propn == NULL)
+ return -ENOMEM;
+ }
+
+ if (full_name == NULL)
+ return -ENOMEM;
+
+ /* create empty tree as a target */
+ tchild = __of_create_empty_node(cname,
+ child->type, full_name,
+ child->phandle, GFP_KERNEL);
+
+ /* free either way */
+ kfree(full_name);
+
+ if (tchild == NULL)
+ return -ENOMEM;
+
+ /* point to parent */
+ tchild->parent = target;
+
+ ret = of_attach_node(tchild);
+ if (ret != 0)
+ return ret;
+
+ /* apply the overlay */
+ ret = of_overlay_apply_one(tchild, child);
+ if (ret != 0) {
+ __of_free_tree(tchild);
+ return ret;
+ }
+ }
+ }
+ }
+
+ return 0;
+}
+
+/**
+ * Lookup an overlay device entry
+ */
+struct of_overlay_device_entry *of_overlay_device_entry_lookup(
+ struct of_overlay_info *ovinfo, struct device_node *node)
+{
+ struct of_overlay_device_entry *de;
+
+ /* no need for locks, we'de under the ovinfo->lock */
+ list_for_each_entry(de, &ovinfo->de_list, node) {
+ if (de->np == node)
+ return de;
+ }
+ return NULL;
+}
+
+/**
+ * Add an overlay log entry
+ */
+static int of_overlay_log_entry_entry_add(struct of_overlay_info *ovinfo,
+ unsigned long action, struct device_node *dn,
+ struct property *prop)
+{
+ struct of_overlay_log_entry *le;
+
+ /* check */
+ if (ovinfo == NULL || dn == NULL)
+ return -EINVAL;
+
+ le = kzalloc(sizeof(*le), GFP_KERNEL);
+ if (le == NULL) {
+ pr_err("%s: Failed to allocate\n", __func__);
+ return -ENOMEM;
+ }
+
+ /* get a reference to the node */
+ le->action = action;
+ le->np = of_node_get(dn);
+ le->prop = prop;
+
+ if (action == OF_RECONFIG_UPDATE_PROPERTY && prop)
+ le->old_prop = of_find_property(dn, prop->name, NULL);
+
+ list_add_tail(&le->node, &ovinfo->le_list);
+
+ return 0;
+}
+
+}
+
+/**
+ * Overlay OF notifier
+ *
+ * Called every time there's a property/node modification
+ * Every modification causes a log entry addition, while
+ * any modification that causes a node's state to change
+ * from/to disabled to/from enabled causes a device entry
+ * addition.
+ */
+static int of_overlay_notify(struct notifier_block *nb,
+ unsigned long action, void *arg)
+{
+ struct of_overlay_info *ovinfo;
+ struct device_node *node;
+ struct property *prop, *sprop, *cprop;
+ struct of_prop_reconfig *pr;
+ struct platform_device *pdev;
+ struct i2c_client *client;
+ struct device_node *tnode;
+ int depth;
+ int prevstate, state;
+ int err = 0;
+
+ ovinfo = container_of(nb, struct of_overlay_info, notifier);
+
+ /* prep vars */
+ switch (action) {
+ case OF_RECONFIG_ATTACH_NODE:
+ case OF_RECONFIG_DETACH_NODE:
+ node = arg;
+ if (node == NULL)
+ return notifier_from_errno(-EINVAL);
+ prop = NULL;
+ break;
+ case OF_RECONFIG_ADD_PROPERTY:
+ case OF_RECONFIG_REMOVE_PROPERTY:
+ case OF_RECONFIG_UPDATE_PROPERTY:
+ pr = arg;
+ if (pr == NULL)
+ return notifier_from_errno(-EINVAL);
+ node = pr->dn;
+ if (node == NULL)
+ return notifier_from_errno(-EINVAL);
+ prop = pr->prop;
+ if (prop == NULL)
+ return notifier_from_errno(-EINVAL);
+ break;
+ default:
+ return notifier_from_errno(0);
+ }
+
+ /* add to the log */
+ err = of_overlay_log_entry_entry_add(ovinfo, action, node, prop);
+ if (err != 0)
+ return notifier_from_errno(err);
+
+ /* come up with the device entry (if any) */
+ pdev = NULL;
+ client = NULL;
+ state = 0;
+ prevstate = 0;
+
+ goto out;
+ }
+
+ if (state == 0) {
+ pdev = of_find_device_by_node(node);
+ client = of_find_i2c_device_by_node(node);
+ }
+
+ of_overlay_device_entry_entry_add(ovinfo, node, pdev, client,
+ prevstate, state);
+out:
+
+ return notifier_from_errno(err);
+}
+
+/**
+ * Prepare for the overlay, for now it just registers the
+ * notifier.
+ */
+static int of_overlay_prep_one(struct of_overlay_info *ovinfo)
+{
+ int err;
+
+ err = of_reconfig_notifier_register(&ovinfo->notifier);
+ if (err != 0) {
+ pr_err("%s: failed to register notifier for '%s'\n",
+ __func__, ovinfo->target->full_name);
+ return err;
+ }
+ return 0;
+}
+
+static int of_overlay_device_entry_change(struct of_overlay_info *ovinfo,
+ struct of_overlay_device_entry *de, int revert)
+{
+ struct i2c_adapter *adap = NULL;
+ struct i2c_client *client;
+ struct platform_device *pdev, *parent_pdev = NULL;
+ int state;
+ struct device_node *np;
+ bool is_pci = false;
+
+ np = de->np;
+ while (np && !is_pci) {
+ is_pci = np->type && !of_node_cmp(np->type, "pci");
+ np = np->parent;
+ }
+
+ if (is_pci) {
+ pr_debug("%s: Node %s is pci node, leaving it alone\n",
+ __func__, np->full_name);
+ return 0;
+ }
+
+
+ return 0;
+}
+
+/**
+ * Revert one overlay
+ * Either due to an error, or due to normal overlay removal.
+ * Using the log entries, we revert any change to the live tree.
+ * In the same manner, using the device entries we enable/disable
+ * the platform devices appropriately.
+ */
+static void of_overlay_revert_one(struct of_overlay_info *ovinfo)
+{
+ struct of_overlay_device_entry *de, *den;
+ struct of_overlay_log_entry *le, *len;
+ struct property *prop, **propp;
+ int ret;
+ unsigned long flags;
+
+ /* property is possibly on deadprops (avoid alloc) */
+ raw_spin_lock_irqsave(&devtree_lock, flags);
+ prop = le->action == OF_RECONFIG_REMOVE_PROPERTY ?
+ le->prop : le->old_prop;
+ propp = &le->np->deadprops;
+ while (*propp != NULL) {
+ if (*propp == prop)
+ break;
+ propp = &(*propp)->next;
+ }
+ if (*propp != NULL) {
+ /* remove it from deadprops */
+ (*propp)->next = prop->next;
+ raw_spin_unlock_irqrestore(&devtree_lock,
+ flags);
+ } else {
+ raw_spin_unlock_irqrestore(&devtree_lock,
+ flags);
+ /* not found, just make a copy */
+ prop = __of_copy_property(prop, GFP_KERNEL);
+ if (prop == NULL) {
+ pr_err("%s: Failed to copy property\n",
+ __func__);
+ break;
+ }
+ }
+
+ if (le->action == OF_RECONFIG_REMOVE_PROPERTY)
+ ret = of_add_property(le->np, prop);
+ else
+ ret = of_update_property(le->np, prop);
+ break;
+
+ default:
+ /* nothing */
+ break;
+ }
+
+ if (ret != 0)
+ pr_err("%s: revert on node %s Failed!\n",
+ __func__, le->np->full_name);
+
+ of_node_put(le->np);
+
+ list_del(&le->node);
+
+ kfree(le);
+ }
+}
+
+/**
+ * Perform the post overlay work.
+ *
+ * We unregister the notifier, and in the case on an error we
+ * revert the overlay.
+ * If the overlay applied correctly, we iterare over the device entries
+ * and create/destroy the platform devices appropriately.
+ */
+static int of_overlay_post_one(struct of_overlay_info *ovinfo, int err)
+{
+ struct of_overlay_device_entry *de, *den;
+
+ of_reconfig_notifier_unregister(&ovinfo->notifier);
+
+ if (err != 0) {
+ /* revert this (possible partially applied) overlay */
+ of_overlay_revert_one(ovinfo);
+ return 0;
+ }
+
+ /* overlay applied correctly, now create/destroy pdevs */
+ list_for_each_entry_safe(de, den, &ovinfo->de_list, node) {
+
+ /* no state change? just remove this entry */
+ if (de->prevstate == de->state) {
+ of_node_put(de->np);
+ list_del(&de->node);
+ kfree(de);
+ } else {
+ of_overlay_device_entry_change(ovinfo, de, 0);
+ }
+ }
+
+ return 0;
+}
+
+/**
+ * of_overlay - Apply @count overlays pointed at by @ovinfo_tab
+ * @count: Number of of_overlay_info's
+ * @ovinfo_tab: Array of overlay_info's to apply
+ *
+ * Applies the overlays given, while handling all error conditions
+ * appropriately. Either the operation succeeds, or if it fails the
+ * live tree is reverted to the state before the attempt.
+ * Returns 0, or an error if the overlay attempt failed.
+ */
+int of_overlay(int count, struct of_overlay_info *ovinfo_tab)
+{
+ struct of_overlay_info *ovinfo;
+ int i, err;
+
+ if (!ovinfo_tab)
+ return -EINVAL;
+
+ /* first we apply the overlays atomically */
+ for (i = 0; i < count; i++) {
+
+ ovinfo = &ovinfo_tab[i];
+
+ mutex_lock(&ovinfo->lock);
+
+ err = of_overlay_prep_one(ovinfo);
+ if (err == 0)
+ err = of_overlay_apply_one(ovinfo->target,
+ ovinfo->overlay);
+ of_overlay_post_one(ovinfo, err);
+
+ mutex_unlock(&ovinfo->lock);
+
+ if (err != 0) {
+ pr_err("%s: overlay failed '%s'\n",
+ __func__, ovinfo->target->full_name);
+ goto err_fail;
+ }
+ }
+
+ return 0;
+
+err_fail:
+ while (--i >= 0) {
+ ovinfo = &ovinfo_tab[i];
+
+ mutex_lock(&ovinfo->lock);
+ of_overlay_revert_one(ovinfo);
+ mutex_unlock(&ovinfo->lock);
+ }
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(of_overlay);
+
+/**
+ * of_overlay_revert - Revert a previously applied overlay
+ * @count: Number of of_overlay_info's
+ * @ovinfo_tab: Array of overlay_info's to apply
+ *
+ * Revert a previous overlay. The state of the live tree
+ * is reverted to the one before the overlay.
+ * Returns 0, or an error if the overlay table is not given.
+ */
+int of_overlay_revert(int count, struct of_overlay_info *ovinfo_tab)
+{
+ struct of_overlay_info *ovinfo;
+ int i;
+
+ if (!ovinfo_tab)
+ return -EINVAL;
+
+ /* revert the overlays in reverse */
+ for (i = count - 1; i >= 0; i--) {
+
+ ovinfo = &ovinfo_tab[i];
+
+ mutex_lock(&ovinfo->lock);
+ of_overlay_revert_one(ovinfo);
+ mutex_unlock(&ovinfo->lock);
+
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(of_overlay_revert);
+
+/**
+ * of_init_overlay_info - Initialize a single of_overlay_info structure
+ * @ovinfo: Pointer to the overlay info structure to initialize
+ *
+ * Initialize a single overlay info structure.
+ */
+void of_init_overlay_info(struct of_overlay_info *ovinfo)
+{
+ memset(ovinfo, 0, sizeof(ovinfo));
+ mutex_init(&ovinfo->lock);
+ INIT_LIST_HEAD(&ovinfo->de_list);
+ INIT_LIST_HEAD(&ovinfo->le_list);
+
+ ovinfo->notifier.notifier_call = of_overlay_notify;
+}
+
+/**
+ * of_fill_overlay_info - Fill an overlay info structure
+ * @info_node: Device node containing the overlay
+ * @ovinfo: Pointer to the overlay info structure to fill
+ *
+ * Fills an overlay info structure with the overlay information
+ * from a device node. This device node must have a target property
+ * which contains a phandle of the overlay target node, and an
+ * __overlay__ child node which has the overlay contents.
+ * Both ovinfo->target & ovinfo->overlay have their references taken.
+ *
+ * Returns 0 on success, or a negative error value.
+ */
+int of_fill_overlay_info(struct device_node *info_node,
+ struct of_overlay_info *ovinfo)
+{
+ u32 val;
+ int ret;
+
+ if (!info_node || !ovinfo)
+ return -EINVAL;
+
+ ret = of_property_read_u32(info_node, "target", &val);
+ if (ret != 0)
+ goto err_fail;
+
+ ovinfo->target = of_find_node_by_phandle(val);
+ if (ovinfo->target == NULL)
+ goto err_fail;
+
+ ovinfo->overlay = of_get_child_by_name(info_node, "__overlay__");
+ if (ovinfo->overlay == NULL)
+ goto err_fail;
+
+ ret = of_property_read_u32(info_node, "depth", &val);
+ if (ret == 0)
+ ovinfo->device_depth = val;
+ else
+ ovinfo->device_depth = 0;
+
+
+ return 0;
+
+err_fail:
+ of_node_put(ovinfo->target);
+ of_node_put(ovinfo->overlay);
+
+ memset(ovinfo, 0, sizeof(*ovinfo));
+ return -EINVAL;
+}
+
+/**
+ * of_build_overlay_info - Build an overlay info array
+ * @tree: Device node containing all the overlays
+ * @cntp: Pointer to where the overlay info count will be help
+ * @ovinfop: Pointer to the pointer of an overlay info structure.
+ *
+ * Helper function that given a tree containing overlay information,
+ * allocates and builds an overlay info array containing it, ready
+ * for use using of_overlay.
+ *
+ * Returns 0 on success with the @cntp @ovinfop pointers valid,
+ * while on error a negative error value is returned.
+ */
+int of_build_overlay_info(struct device_node *tree,
+ int *cntp, struct of_overlay_info **ovinfop)
+{
+ struct device_node *node;
+ struct of_overlay_info *ovinfo;
+ int cnt, err;
+
+ if (tree == NULL || cntp == NULL || ovinfop == NULL)
+ return -EINVAL;
+
+ /* worst case; every child is a node */
+ cnt = 0;
+ for_each_child_of_node(tree, node)
+ cnt++;
+
+ ovinfo = kzalloc(cnt * sizeof(*ovinfo), GFP_KERNEL);
+ if (ovinfo == NULL)
+ return -ENOMEM;
+
+ cnt = 0;
+ for_each_child_of_node(tree, node) {
+
+ of_init_overlay_info(&ovinfo[cnt]);
+ err = of_fill_overlay_info(node, &ovinfo[cnt]);
+ if (err == 0)
+ cnt++;
+ }
+
+ /* if nothing filled, return error */
+ if (cnt == 0) {
+ kfree(ovinfo);
+ return -ENODEV;
+ }
+
+ *cntp = cnt;
+ *ovinfop = ovinfo;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(of_build_overlay_info);
+
+/**
+ * of_free_overlay_info - Free an overlay info array
+ * @count: Number of of_overlay_info's
+ * @ovinfo_tab: Array of overlay_info's to free
+ *
+ * Releases the memory of a previously allocate ovinfo array
+ * by of_build_overlay_info.
+ * Returns 0, or an error if the arguments are bogus.
+ */
+int of_free_overlay_info(int count, struct of_overlay_info *ovinfo_tab)
+{
+ struct of_overlay_info *ovinfo;
+ int i;
+
+ if (!ovinfo_tab || count < 0)
+ return -EINVAL;
+
+ /* do it in reverse */
+ for (i = count - 1; i >= 0; i--) {
+ ovinfo = &ovinfo_tab[i];
+
+ of_node_put(ovinfo->target);
+ of_node_put(ovinfo->overlay);
+ }
+ kfree(ovinfo_tab);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(of_free_overlay_info);
diff --git a/include/linux/of.h b/include/linux/of.h
index 22d42e5..4e19013 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
+ struct property *prop;
+ struct property *old_prop;
+};
+
+/**
+ * struct of_overlay_device_entry - Holds an overlay device entry
+ * @node: list_head for the device list
+ * @np: device node pointer to the device node affected
+ * @pdev: pointer to the platform device affected
+ * @client: pointer to the I2C client device affected
+ * @state: new device state
+ * @prevstate: previous device state
+ *
+ * When the overlay results in a device node's state to change this
+ * fact is recorded in a list of device entries. After the overlay
+ * is applied we can create/destroy the platform devices according
+ * to the new state of the live tree.
+ */
+struct of_overlay_device_entry {
+ struct list_head node;
+ struct device_node *np;
+ struct platform_device *pdev;
+ struct i2c_client *client;
+ int prevstate;
+ int state;
+};
+
+/**
+ * struct of_overlay_info - Holds a single overlay info
+ * @target: target of the overlay operation
+ * @overlay: pointer to the overlay contents node
+ * @lock: Lock to hold when accessing the lists
+ * @le_list: List of the overlay logs
+ * @de_list: List of the overlay records
+ * @notifier: of reconfiguration notifier
+ *
+ * Holds a single overlay state, including all the overlay logs &
+ * records.
+ */
+struct of_overlay_info {
+ struct device_node *target;
+ struct device_node *overlay;
+ struct mutex lock;
+ struct list_head le_list;
+ struct list_head de_list;
+ struct notifier_block notifier;
+ int device_depth;
+};
+
+#ifdef CONFIG_OF_OVERLAY
+
+int of_overlay(int count, struct of_overlay_info *ovinfo_tab);
+int of_overlay_revert(int count, struct of_overlay_info *ovinfo_tab);
+
+int of_fill_overlay_info(struct device_node *info_node,
+ struct of_overlay_info *ovinfo);
+int of_build_overlay_info(struct device_node *tree,
+ int *cntp, struct of_overlay_info **ovinfop);
+int of_free_overlay_info(int cnt, struct of_overlay_info *ovinfo);
+
+#else
+
+static inline int of_overlay(int count, struct of_overlay_info *ovinfo_tab)
+{
+ return -ENOTSUPP;
+}
+
+static inline int of_overlay_revert(int count, struct of_overlay_info *ovinfo_tab)
+{
+ return -ENOTSUPP;
+}
+
+static inline int of_fill_overlay_info(struct device_node *info_node,
+ struct of_overlay_info *ovinfo)
+{
+ return -ENOTSUPP;
+}
+
+static inline int of_build_overlay_info(struct device_node *tree,
+ int *cntp, struct of_overlay_info **ovinfop)
+{
+ return -ENOTSUPP;
+}
+
+static inline int of_free_overlay_info(int cnt, struct of_overlay_info *ovinfo)
+{
+ return -ENOTSUPP;
+}
+
+#endif
+
#endif /* _LINUX_OF_H */

Guenter Roeck

unread,
Nov 8, 2013, 2:20:03 PM11/8/13
to
On Fri, Nov 08, 2013 at 05:06:09PM +0200, Pantelis Antoniou wrote:
> Introduce DT overlay support.
> Using this functionality it is possible to dynamically overlay a part of
> the kernel's tree with another tree that's been dynamically loaded.
> It is also possible to remove node and properties.
>
> Note that the I2C client devices are 'special', as in they're not platform
> devices. They need to be registered with an I2C specific method.
>
> In general I2C is just no good platform device citizen and needs
> special casing.
>
> PCI is another special case where PCI device insertion and removal
> is implemented in the PCI subsystem.
>
> Bug fixes & PCI support by Guenter Roeck <gro...@juniper.net>
>
> Signed-off-by: Pantelis Antoniou <pa...@antoniou-consulting.com>

Hi Pantelis,

another nitpick:

.git/rebase-apply/patch:195: trailing whitespace.
end up filling in the fields of the of_overlay_info array.
warning: 1 line adds whitespace errors.

Guenter

Guenter Roeck

unread,
Nov 8, 2013, 3:10:01 PM11/8/13
to
On Fri, Nov 08, 2013 at 05:06:10PM +0200, Pantelis Antoniou wrote:
> Add a runtime interface to /proc to enable generic device tree overlay
> usage.
>

Hi Pantelis,

> ---
> fs/proc/proc_devtree.c | 278 +++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 278 insertions(+)
>
[ ... ]

> +
> + /* start at 256K at first */
> + if (od->alloc == 0)
> + od->alloc = SZ_256K / 2;
> +

I have to include <linux/sizes.h> to get this to build.

Also, even though /proc/device-tree-overlay is supposed to be write only, I get
this:

# cat /proc/device-tree-overlay
cat: [ 651.973568] overlay_proc_release: failed to unflatten tree
/proc/device-tree-overlay: Input/output error

which seems to be a bit odd. I did not spend any time to track it down, though.

Otherwise, the code seems to be working well.

Thanks,
Guenter

Ionut Nicu

unread,
Nov 11, 2013, 5:30:01 AM11/11/13
to
Hi,

On 08.11.2013 16:06, ext Pantelis Antoniou wrote:
> Add a runtime interface to /proc to enable generic device tree overlay
> usage.
>

> +
> + /* start at 256K at first */
> + if (od->alloc == 0)
> + od->alloc = SZ_256K / 2;
> +

Same problem as Guenter reported. I have to include <linux/sizes.h> or otherwise
the code will not build.

> + rcu_read_lock();
> + idr_for_each_entry(&overlay_idr, od, id) {
> + seq_printf(m, "%d: %d bytes", id, od->size);

For od->size we should use "%zu", otherwise there will be a compilation warning.
At least I get one on mips64...

Otherwise, I've tested this and it works fine for me also.

Thanks,
Ionut

Pantelis Antoniou

unread,
Nov 11, 2013, 7:40:02 AM11/11/13
to
Hi Ionut,

On Nov 11, 2013, at 11:22 AM, Ionut Nicu wrote:

> Hi,
>
> On 08.11.2013 16:06, ext Pantelis Antoniou wrote:
>> Add a runtime interface to /proc to enable generic device tree overlay
>> usage.
>>
>
>> +
>> + /* start at 256K at first */
>> + if (od->alloc == 0)
>> + od->alloc = SZ_256K / 2;
>> +
>
> Same problem as Guenter reported. I have to include <linux/sizes.h> or otherwise
> the code will not build.
>

OK, I guess on my platform it gets implicitly included.

>> + rcu_read_lock();
>> + idr_for_each_entry(&overlay_idr, od, id) {
>> + seq_printf(m, "%d: %d bytes", id, od->size);
>
> For od->size we should use "%zu", otherwise there will be a compilation warning.
> At least I get one on mips64...
>

Both of these will be included on the next revision of the patchset.

> Otherwise, I've tested this and it works fine for me also.
>
> Thanks,
> Ionut

Regards

-- Pantelis

Grant Likely

unread,
Nov 11, 2013, 9:40:02 PM11/11/13
to
On Thu, 7 Nov 2013 21:46:26 +0100, Sebastian Andrzej Siewior <seba...@breakpoint.cc> wrote:
> On 07.11.13, Pantelis Antoniou wrote:
> > Hi Sebastian,
> Hi Pantelis,
>
> > FWIW DT has been ported to x86. And is present on arm/powerpc/mips/arc and possibly
> > others.
>
> Yes, I know. I am the one that did the work for CE4100, the first one
> that boots with DT on x86.
>
> > So what are we talking about again? If you care about the non-DT case, why
> > don't you make a patch about how you could support Guenter's use case on
> > the x86.
>
> I am only saying that this "hot-plug a device at a non hot-plugagle bus at
> runtime" is not limited to DT but this solution is. X86 + ACPI is not
> the only limitation. ARM is (forced) going to ACPI as well as far I
> know. And this solution is limited to DT. This is what I am pointing
> out.

I'm going to nip this in the bud before it spreads... No.

ARM is not going wholesale to ACPI. ACPI will be implemented on ARMv8
servers. Don't expect the rest of the ARM world to go there anytime
soon, if at all.

As for being limited to DT, that is correct currently, just as SSDT is
an ACPI specific way of doing mostly the same thing. However, I'm not sure
it needs to be a limitation. As long as there is well definied
segregation between the base tree and the overlay (ie. it would be
useful to filter which nodes can be attached to, there are security
implications here too) then it would be fine for a DT overlay to pull in
additional devices on top of an ACPI base tree.

> > His use case is not uncommon, believe it or not, and x86 would benefit from
> > something this flexible.
>
> I *think* a more flexible solution would be something like bus_type which is
> exposed via configfs. It would be attached behind a certain device/bus where
> the "physical" hotplug interface is. The user would then be able to read the
> configuration based on whatever information he has and could then create
> devices he likes at runtime. This wouldn't depend much on the firmware that is
> used but would require a little more work I think.

I'm certainly fine with investigating the above (modulo security
concerns of creating arbitrary devices). However, that is only one
use-case. There still needs to be the mechanism of passing the kernel a
blob all at once and say, "Here is the layout of this new hunk of
hardware". For instance, you don't want to have to build up a set of
devices from scratch every time a new device is passed in. Xilinx for
instance has a tool which creates a FDT for an FPGA bitstream right from
the FPGA tools.

g.

Grant Likely

unread,
Nov 11, 2013, 9:40:02 PM11/11/13
to
On Fri, 8 Nov 2013 17:06:09 +0200, Pantelis Antoniou <pa...@antoniou-consulting.com> wrote:
> Introduce DT overlay support.
> Using this functionality it is possible to dynamically overlay a part of
> the kernel's tree with another tree that's been dynamically loaded.
> It is also possible to remove node and properties.
>
> Note that the I2C client devices are 'special', as in they're not platform
> devices. They need to be registered with an I2C specific method.
>
> In general I2C is just no good platform device citizen and needs
> special casing.
>
> PCI is another special case where PCI device insertion and removal
> is implemented in the PCI subsystem.

Actually. I2C isn't special in this regard; SPI, MDIO, PCI, USB are all
in the same boat. platform_device just happens to be able to make a few
assumptions. If anything, platform_device is the 'special' one. :-)

In all cases it must be the underlying subsystem that should be
responsible for creation/removal of devices described in the overlay.
Sometimes that will extend right down into the individual bus device
drivers, but it is better if that can be avoided.
Make of_overlay_info a kobject and use a release method to free it when
all users go away. Freeing directly is a dangerous thing to do.
Are the node & property removals reversable? What about property
modifications? If they are not, then it would be better to not support
removals at all for now. Otherwise we'll end up with overlays that
cannot be removed and I don't want to inadvertently put users into that
situation.
I've got big problems here. This is entirely the wrong place to create
and delete devices. Each subsystem needs to create and remove its own
device types. To begin with, only the subsystem knows which busses are
appropriate for creating child devices.

Second, it is incorrect to create a platform_device for every single
node by default. Some nodes aren't devices at all. Just looking for a
compatible property isn't sufficient either.

The solution here is for each subsystem to have it's own notifier, and
only create a child device if the changed node has a parent the
subsystem already knows about (because it's already populated device
tree devices from that parent). This is also true for platform_devices.

Finally, sometimes the tips of the tree will get populated, but not
until after some parent nodes have been delt with. In that case the
creation of devices needs to be deferred back to the driver for the
parent bus.
Can overlays interact in bad ways? If overlay 1 is installed before
overlay 2, what happens if overlay 1 is removed?

Okay, my brain is tired. It's a complicated series and I don't see any
obvious bits that can be split off and be independently useful. I would
*really* like to see some test cases for this functionality. It's
complicated enough to make me quite nervous.... in fact I would really
like to see drivers/of/selftest.c become an early user of overlays. That
would make the DT selftests executable on any platform.

g.

Grant Likely

unread,
Nov 11, 2013, 9:40:02 PM11/11/13
to
Why /proc? Did you consider using the firmware loading mechanism? While
I expressed concerned about the capebus approach, the loading of
overlays needs to remain under the control of either a driver or the
platform. By default it should not be possible to drop an arbitrary
overlay into /proc/device-tree-overlay and have things change in the
base tree. Along the same lines, I would expect for the device driver or
platform to be able to filter or limit the parts of the tree that are
allowed to be modified.

A side benefit of the firmware loader is that the kernel can obtain the
overlay on its own if needed at boot time without userspace involvement.

g.

Grant Likely

unread,
Nov 11, 2013, 9:40:02 PM11/11/13
to
dt-object-internal.txt is in the DTC patch, not the kernel tree.

> +
> +How the resolver works
> +----------------------
> +
> +The resolver is given as an input an arbitrary tree compiled with the
> +proper dtc option and having a /plugin/ tag. This generates the
> +appropriate __fixups__ & __local_fixups__ nodes as described in [1].

Missing footnote reference line for [1]?

> +
> +In sequence the resolver works by the following steps:
> +
> +1. Get the maximum device tree phandle value from the live tree + 1.

Is there a (realistic) worry about leaking phandle number space from
plugging/unplugging trees repeated addition/removal of overlays?
I don't see another user. What is the reason for the __ version of
of_get_tree_max_phandle?
Unnecessary cast if you use:
phandle = be32_to_cpup(prop->value);

> + if (phandle == OF_PHANDLE_ILLEGAL) /* unresolved */
> + continue;

Isn't this an error condition? Should never have OF_PHANDLE_ILLEGAL in
the property here.

> +
> + /* adjust */
> + *(uint32_t *)prop->value = cpu_to_be32(node->phandle);

*(__be32*)prop->value = ...
Probably need to grab the devtree lock before doing the above, and not
release it until after the trees are merged. Otherwise there is the
potential of trying to merge two trees at once and getting phandle
conflicts.

Overall the patch looks pretty nice. I'm looking forward to hearing back
on the questions above.

g.

Pantelis Antoniou

unread,
Nov 12, 2013, 3:20:01 AM11/12/13
to
Hi Grant,
Well, this can be pretty radical, but what about converting the ACPI base tree
in DT format and then take it from there.

I'm sure the state of DT encompasses most of what ACPI does ATM, modulo the
whole craziness of calling in firmware/interpreting bytecode to do stuff like
turn on a GPIO.

I might be in the minority, but it took us so many years on not having to rely
on weird firmware to get our systems to work, and ACPI seems to be heading back
into the old ways on having to rely on firmware quirks to patch firmware we
are dependent on.

DT way of describing only, may be not to the liking of everyone, but I certainly
prefer it.

>>> His use case is not uncommon, believe it or not, and x86 would benefit from
>>> something this flexible.
>>
>> I *think* a more flexible solution would be something like bus_type which is
>> exposed via configfs. It would be attached behind a certain device/bus where
>> the "physical" hotplug interface is. The user would then be able to read the
>> configuration based on whatever information he has and could then create
>> devices he likes at runtime. This wouldn't depend much on the firmware that is
>> used but would require a little more work I think.
>
> I'm certainly fine with investigating the above (modulo security
> concerns of creating arbitrary devices). However, that is only one
> use-case. There still needs to be the mechanism of passing the kernel a
> blob all at once and say, "Here is the layout of this new hunk of
> hardware". For instance, you don't want to have to build up a set of
> devices from scratch every time a new device is passed in. Xilinx for
> instance has a tool which creates a FDT for an FPGA bitstream right from
> the FPGA tools.
>
> g.

Regards

-- Pantelis

Pantelis Antoniou

unread,
Nov 12, 2013, 3:30:01 AM11/12/13
to
Yes, good catch. I will fix the reference.

BTW, what about moving/copying some of the DTC docs in the kernel doc
directory? The dtc Documentation directory is missing from the kernel tree.


>> +
>> +How the resolver works
>> +----------------------
>> +
>> +The resolver is given as an input an arbitrary tree compiled with the
>> +proper dtc option and having a /plugin/ tag. This generates the
>> +appropriate __fixups__ & __local_fixups__ nodes as described in [1].
>
> Missing footnote reference line for [1]?
>

Yes.

>> +
>> +In sequence the resolver works by the following steps:
>> +
>> +1. Get the maximum device tree phandle value from the live tree + 1.
>
> Is there a (realistic) worry about leaking phandle number space from
> plugging/unplugging trees repeated addition/removal of overlays?
>

I think not. But doing it this way has the nice property of keeping all phandle
values the same each time you do a load-unload-load sequence.
OK.

>
>> + if (phandle == OF_PHANDLE_ILLEGAL) /* unresolved */
>> + continue;
>
> Isn't this an error condition? Should never have OF_PHANDLE_ILLEGAL in
> the property here.

Hmm, I think so. I'll see if there's anything special there.

>
>> +
>> + /* adjust */
>> + *(uint32_t *)prop->value = cpu_to_be32(node->phandle);
>
> *(__be32*)prop->value = ...


It is the same for the compiler, but you're right.
No, because the device tree being passed it it guaranteed to be
in detached state, it is not part of the live device tree;
the check in the beginning of the function makes sure.

When we apply the overlay the devtree lock is taken properly.

>
> Overall the patch looks pretty nice. I'm looking forward to hearing back
> on the questions above.
>

Thank you
> g.
>

Regards

-- Pantelis

Pantelis Antoniou

unread,
Nov 12, 2013, 3:50:02 AM11/12/13
to
Hi Grant,
The firmware loading mechanism is the preferred way to handle it, and is
in fact what is used in practice by the whole cape manager mechanism.
But that's part of specific board support, and this is more like a general
way to use overlays, in any kind of DT enabled system.

I agree this is a) completely dangerous and b) /proc is a bad place to put
it. I put it here in order to get the ball rolling, about the proper place
to put device tree related interfaces.

The good thing about this interface is that it's always available, and
it is good for debugging (especially loading/unloading debugging).


> A side benefit of the firmware loader is that the kernel can obtain the
> overlay on its own if needed at boot time without userspace involvement.
>

Yes, that is correct, and it is the way we use it on the beaglebone.
The root fs device is present on a overlay, which is built into the kernel's
firmware.


> g.
>

Regards

-- Pantelis

Pantelis Antoniou

unread,
Nov 12, 2013, 4:40:01 AM11/12/13
to
Hi Grant,

On Nov 11, 2013, at 7:42 PM, Grant Likely wrote:

> On Fri, 8 Nov 2013 17:06:09 +0200, Pantelis Antoniou <pa...@antoniou-consulting.com> wrote:
>> Introduce DT overlay support.
>> Using this functionality it is possible to dynamically overlay a part of
>> the kernel's tree with another tree that's been dynamically loaded.
>> It is also possible to remove node and properties.
>>
>> Note that the I2C client devices are 'special', as in they're not platform
>> devices. They need to be registered with an I2C specific method.
>>
>> In general I2C is just no good platform device citizen and needs
>> special casing.
>>
>> PCI is another special case where PCI device insertion and removal
>> is implemented in the PCI subsystem.
>
> Actually. I2C isn't special in this regard; SPI, MDIO, PCI, USB are all
> in the same boat. platform_device just happens to be able to make a few
> assumptions. If anything, platform_device is the 'special' one. :-)
>

I'm of the opinion that 'platform_device' shouldn't exist at all btw :)
Most of it's functionality can pretty easily be subsumed by device proper
and the world would be a better place :)

> In all cases it must be the underlying subsystem that should be
> responsible for creation/removal of devices described in the overlay.
> Sometimes that will extend right down into the individual bus device
> drivers, but it is better if that can be avoided.
>

Yes, that's the right way to handle this, but we're talking about
serious modification in almost every kind of device core.

Just getting the basic concept of overlays in is hard without having
to tackle something like this.

In the future, I'd be ecstatic if we could get a per device class
filter, but it's nothing we can fix right now.

IMHO of course.

>>
>> Bug fixes & PCI support by Guenter Roeck <gro...@juniper.net>
>>

[snip]

>> +so a bar platform device will be registered and if a matching device driver
>> +is loaded the device will be created as expected.
>> +
>> +Overlay in-kernel API
>> +---------------------
>> +
>> +The steps typically required to get an overlay to work are as follows:
>> +
>> +1. Use of_build_overlay_info() to create an array of initialized and
>> +ready to use of_overlay_info structures.
>> +2. Call of_overlay() to apply the overlays declared in the array.
>> +3. If the overlay needs to be removed, call of_overlay_revert().
>> +4. Finally release the memory taken by the overlay info array by
>> +of_free_overlay_info().
>
> Make of_overlay_info a kobject and use a release method to free it when
> all users go away. Freeing directly is a dangerous thing to do.
>

It's more dangerous than even that. We also have the mess with the way
unflattening works (properties point directly to binary blob area
with no way to free per property data). In general freeing/releasing
device tree nodes is tricky on a good day.

I agree we have to move to a kobject model eventually.
Yes, everything is reversible. Doesn't mean that the drivers/driver core
can handle it. Using overlays is an excellent way to trigger bugs in
device removal as I've found out :)

>
>> + };
>> + }
>> + fragment@1 { /* second child node */
>> + ...
>> + };
>> + /* more fragments follow */
>> +}

[snip]

>> +
>> +static int of_overlay_device_entry_change(struct of_overlay_info *ovinfo,
>> + struct of_overlay_device_entry *de, int revert)
>> +{
>
> I've got big problems here. This is entirely the wrong place to create
> and delete devices. Each subsystem needs to create and remove its own
> device types. To begin with, only the subsystem knows which busses are
> appropriate for creating child devices.
>

It was the best I could under the circumstances, without having to touch
every subsystem.


> Second, it is incorrect to create a platform_device for every single
> node by default. Some nodes aren't devices at all. Just looking for a
> compatible property isn't sufficient either.
>

Yep.

> The solution here is for each subsystem to have it's own notifier, and
> only create a child device if the changed node has a parent the
> subsystem already knows about (because it's already populated device
> tree devices from that parent). This is also true for platform_devices.
>
> Finally, sometimes the tips of the tree will get populated, but not
> until after some parent nodes have been delt with. In that case the
> creation of devices needs to be deferred back to the driver for the
> parent bus.
>

Well, any solution that requires each and every subsystem to do dynamic
child insertions/removal is going to be hard sell.

What about something more gentle?

For example, I know that most of the cases fall into a couple of set
patterns for my case

a) Target is the ocp node (on chip peripheral node) and results in
the creation of platform devices under that node.

b) Target is one of the disabled devices under the ocp node
and the status property is modified, plus optional nodes are
added that result in the creation of more devices under the bus this
node controls. This is the case of enabled an I2C/SPI bus and inserting
children devices.

Perhaps the trick is having hooks about what to do in case certain
nodes are modified in a certain way, i.e. a notifier on the ocp node
would 'catch' those. The problem is a having a way to present this
information to the notifier in a way that's easy to be handled.

What do you think?
Yes, they can. It is not something easily fixed; the proper way would
be to calculate overlay intersection points and refuse to unload.

> Okay, my brain is tired. It's a complicated series and I don't see any
> obvious bits that can be split off and be independently useful. I would
> *really* like to see some test cases for this functionality. It's
> complicated enough to make me quite nervous.... in fact I would really
> like to see drivers/of/selftest.c become an early user of overlays. That
> would make the DT selftests executable on any platform.
>

OK, I see your point. My brain is tired too, and I'm on travel too for
this week and the next. I'll see what I can come up with.

> g.
>

Regards

-- Pantelis

Matt Porter

unread,
Nov 12, 2013, 10:30:02 AM11/12/13
to
As was pointed out earlier in the thread, /proc just isn't the place to
add these configuration interfaces any longer.
>
>
> > A side benefit of the firmware loader is that the kernel can obtain the
> > overlay on its own if needed at boot time without userspace involvement.
> >
>
> Yes, that is correct, and it is the way we use it on the beaglebone.
> The root fs device is present on a overlay, which is built into the kernel's
> firmware.

Let's be clear that there are two use cases here

1) kernel driven overlay loading
- root device configuration, other early devices, kernel-based h/w
detection

2) userspace driven overlay loading
- s/w defined h/w, hobbyist connecting stuff on a breadboard,
userspace arduino support

Combining a configfs api for userspace configuration of the overlay to
load with the firmware loader interface would address both of these.

As Grant noted, you want the firmware loader support to handle the
standard kernel driver initiated case which you were doing with capebus
already. But for userspace we have configuration to hand to the kernel
in the form of the name of an overlay. This configuration should be
provided via configfs.

Instantiate an overlay:

mkdir /config/dto/0

would create the following attributes:

/config/dto/0/overlay
/config/dto/0/status

Start a firmware load of an overlay:

echo "foo.dtbo" > /config/dto/0/overlay
[loads /lib/firmware/foo.dtbo]

Status of overlay:

cat /config/dto/0/status
0: 861 bytes FOO:BAR0

Remove an overlay:

rmdir /config/dto/0

-Matt

Grant Likely

unread,
Nov 12, 2013, 11:00:02 PM11/12/13
to
It will break if there are two overlays "leapfrogging" each other on
loads/unloads. It may be a very outside corner case, but it is worth
thinking about.
That doesn't protect against getting duplicate phandle bases. You need
something to protect the range of phandles that the overlay trees will
want to use. The problem with the above code is that it calculates the
phandle base that it wants, but then goes and does a bunch of stuff
without a lock which allows another overlay to try and use the same
phandle range.

g.

Pantelis Antoniou

unread,
Nov 13, 2013, 4:10:02 AM11/13/13
to
Hi Grant,
Normally that can't happen when using an overlay manager when these calls
are taken while holding it's lock.

But I see your point. I will rework it.
Yes, I see. I'll try to see how that can be reworked.

> g.
>

Regards

-- Pantelis

Grant Likely

unread,
Nov 13, 2013, 9:20:01 PM11/13/13
to
On Tue, 12 Nov 2013 10:30:37 +0100, Pantelis Antoniou <pa...@antoniou-consulting.com> wrote:
> Hi Grant,
>
> On Nov 11, 2013, at 7:42 PM, Grant Likely wrote:
>
> > On Fri, 8 Nov 2013 17:06:09 +0200, Pantelis Antoniou <pa...@antoniou-consulting.com> wrote:
> >> Introduce DT overlay support.
> >> Using this functionality it is possible to dynamically overlay a part of
> >> the kernel's tree with another tree that's been dynamically loaded.
> >> It is also possible to remove node and properties.
> >>
> >> Note that the I2C client devices are 'special', as in they're not platform
> >> devices. They need to be registered with an I2C specific method.
> >>
> >> In general I2C is just no good platform device citizen and needs
> >> special casing.
> >>
> >> PCI is another special case where PCI device insertion and removal
> >> is implemented in the PCI subsystem.
> >
> > Actually. I2C isn't special in this regard; SPI, MDIO, PCI, USB are all
> > in the same boat. platform_device just happens to be able to make a few
> > assumptions. If anything, platform_device is the 'special' one. :-)
> >
>
> I'm of the opinion that 'platform_device' shouldn't exist at all btw :)
> Most of it's functionality can pretty easily be subsumed by device proper
> and the world would be a better place :)

I'm fine for merging some/all of the platform_device fields into struct
device. There are a few things, like resources, which would probably be
useful to have common on all struct device variants. However,
platform_device is far more about matching drivers to devices. Even if
all of platform_device went into struct device, there would still need
to be the platform_bus_type as the collection point for the device
drivers.

> > In all cases it must be the underlying subsystem that should be
> > responsible for creation/removal of devices described in the overlay.
> > Sometimes that will extend right down into the individual bus device
> > drivers, but it is better if that can be avoided.
> >
>
> Yes, that's the right way to handle this, but we're talking about
> serious modification in almost every kind of device core.
>
> Just getting the basic concept of overlays in is hard without having
> to tackle something like this.
>
> In the future, I'd be ecstatic if we could get a per device class
> filter, but it's nothing we can fix right now.
>
> IMHO of course.

It's fine to limit yourself to only platform_devices and i2c for now,
but that doesn't give license to do the wrong thing with the
implementation. The subsystem modifications really shouldn't be very
invasive. It will need a per-subsystem notifier callback to deal with
the add/remove events, and it will require the subsystem to maintain a
list of parent bus nodes that it cares about and should create children
for.

Doing it with a central function like this patch does is not a good
approach at all.

>
> >>
> >> Bug fixes & PCI support by Guenter Roeck <gro...@juniper.net>
> >>
>
> [snip]
>
> >> +so a bar platform device will be registered and if a matching device driver
> >> +is loaded the device will be created as expected.
> >> +
> >> +Overlay in-kernel API
> >> +---------------------
> >> +
> >> +The steps typically required to get an overlay to work are as follows:
> >> +
> >> +1. Use of_build_overlay_info() to create an array of initialized and
> >> +ready to use of_overlay_info structures.
> >> +2. Call of_overlay() to apply the overlays declared in the array.
> >> +3. If the overlay needs to be removed, call of_overlay_revert().
> >> +4. Finally release the memory taken by the overlay info array by
> >> +of_free_overlay_info().
> >
> > Make of_overlay_info a kobject and use a release method to free it when
> > all users go away. Freeing directly is a dangerous thing to do.
> >
>
> It's more dangerous than even that. We also have the mess with the way
> unflattening works (properties point directly to binary blob area
> with no way to free per property data). In general freeing/releasing
> device tree nodes is tricky on a good day.

Repeating from my comment on the previous email, don't throw away the
blob. There is no need and it makes your job harder.
They have to do it /anyway/! How many subsystems are we talking about?
SPI, I2C, platform_device (and amba_device). Maybe MDIO. To get to a
useful subset it isn't very many targets. SPI and I2C will probably be
receptive, so no problem there. The maintainer of drivers/of/platform.c
is friendly too. :-) Haven't asked about MDIO.

>
> What about something more gentle?
>
> For example, I know that most of the cases fall into a couple of set
> patterns for my case
>
> a) Target is the ocp node (on chip peripheral node) and results in
> the creation of platform devices under that node.

No, I'm going to be hard-line on this. The code needs to be very
explicit about which nodes cause devices to be created for nodes. It
cannot be based on some arbitrary heuristics.

I'm not asking you to do something hard though. For the platform devices
I'd be fine with of_platform_populate() marking some kind of flag on
struct device_node that says children of it should be used to create
more platform_devices.

> b) Target is one of the disabled devices under the ocp node
> and the status property is modified, plus optional nodes are
> added that result in the creation of more devices under the bus this
> node controls. This is the case of enabled an I2C/SPI bus and inserting
> children devices.

Close, but again it has to match with what I've said above. The various
subsystems need to keep track of which nodes describe a subsystem bus
and only create devices on those nodes. It would be valuable to have a
simple library for keeping track of that information which is usable by
all subsystems, but it still needs to be under the control of the
subsystem.

>
> Perhaps the trick is having hooks about what to do in case certain
> nodes are modified in a certain way, i.e. a notifier on the ocp node
> would 'catch' those. The problem is a having a way to present this
> information to the notifier in a way that's easy to be handled.
>
> What do you think?

It would probably work to have a single notifier for each entire
subsystem instead of a separate notifier for each and every bus node.
Once a subsystem marks a node as interesting, it is quite simple for a
generic subsystem notifier callback to decide whether to create devices
based on it. If done right, it should work without any changes to the
SPI/I2C/etc bus drivers.
I think this is important. If it cannot be solved immediately, then the
kernel should enforce overlays always get removed in the reverse order
that they were added. There may be use-cases that don't like it, but it
is safe.

>
> > Okay, my brain is tired. It's a complicated series and I don't see any
> > obvious bits that can be split off and be independently useful. I would
> > *really* like to see some test cases for this functionality. It's
> > complicated enough to make me quite nervous.... in fact I would really
> > like to see drivers/of/selftest.c become an early user of overlays. That
> > would make the DT selftests executable on any platform.
> >
>
> OK, I see your point. My brain is tired too, and I'm on travel too for
> this week and the next. I'll see what I can come up with.

Enjoy your trip.

g.

Pantelis Antoniou

unread,
Nov 14, 2013, 5:10:01 AM11/14/13
to
Hi Grant,
We don't really need the resources structures on OF. That information is
present in OF format, which we can use to generate transient resources for
usage with the standard kernel interfaces.

BTW, last time I checked resource handling was broken on release.
There are a few patches I sent out fixing it but they were probably ignored.

>>> In all cases it must be the underlying subsystem that should be
>>> responsible for creation/removal of devices described in the overlay.
>>> Sometimes that will extend right down into the individual bus device
>>> drivers, but it is better if that can be avoided.
>>>
>>
>> Yes, that's the right way to handle this, but we're talking about
>> serious modification in almost every kind of device core.
>>
>> Just getting the basic concept of overlays in is hard without having
>> to tackle something like this.
>>
>> In the future, I'd be ecstatic if we could get a per device class
>> filter, but it's nothing we can fix right now.
>>
>> IMHO of course.
>
> It's fine to limit yourself to only platform_devices and i2c for now,
> but that doesn't give license to do the wrong thing with the
> implementation. The subsystem modifications really shouldn't be very
> invasive. It will need a per-subsystem notifier callback to deal with
> the add/remove events, and it will require the subsystem to maintain a
> list of parent bus nodes that it cares about and should create children
> for.
>
> Doing it with a central function like this patch does is not a good
> approach at all.
>

This is indeed the cleaner approach... I'll take a look at what it takes
to do it; hope the changes needed are not anything major.

>>
>>>>
>>>> Bug fixes & PCI support by Guenter Roeck <gro...@juniper.net>
>>>>
>>
>> [snip]
>>
>>>> +so a bar platform device will be registered and if a matching device driver
>>>> +is loaded the device will be created as expected.
>>>> +
>>>> +Overlay in-kernel API
>>>> +---------------------
>>>> +
>>>> +The steps typically required to get an overlay to work are as follows:
>>>> +
>>>> +1. Use of_build_overlay_info() to create an array of initialized and
>>>> +ready to use of_overlay_info structures.
>>>> +2. Call of_overlay() to apply the overlays declared in the array.
>>>> +3. If the overlay needs to be removed, call of_overlay_revert().
>>>> +4. Finally release the memory taken by the overlay info array by
>>>> +of_free_overlay_info().
>>>
>>> Make of_overlay_info a kobject and use a release method to free it when
>>> all users go away. Freeing directly is a dangerous thing to do.
>>>
>>
>> It's more dangerous than even that. We also have the mess with the way
>> unflattening works (properties point directly to binary blob area
>> with no way to free per property data). In general freeing/releasing
>> device tree nodes is tricky on a good day.
>
> Repeating from my comment on the previous email, don't throw away the
> blob. There is no need and it makes your job harder.
>

OK.
Famous last words :)
The proof is going to be in the code.
OK, that makes sense.

We are not talking about a global overlay stack though, we're talking about
an overlay stack for overlays that overlap.

>
>>
>>> Okay, my brain is tired. It's a complicated series and I don't see any
>>> obvious bits that can be split off and be independently useful. I would
>>> *really* like to see some test cases for this functionality. It's
>>> complicated enough to make me quite nervous.... in fact I would really
>>> like to see drivers/of/selftest.c become an early user of overlays. That
>>> would make the DT selftests executable on any platform.
>>>
>>
>> OK, I see your point. My brain is tired too, and I'm on travel too for
>> this week and the next. I'll see what I can come up with.
>
> Enjoy your trip.
>

Thanks, although it's a work trip :)

> g.

Regards

-- Pantelis

Grant Likely

unread,
Nov 14, 2013, 4:30:02 PM11/14/13
to
On Thu, 14 Nov 2013 11:01:35 +0100, Pantelis Antoniou <pa...@antoniou-consulting.com> wrote:
> On Nov 14, 2013, at 2:31 AM, Grant Likely wrote:
> > On Tue, 12 Nov 2013 10:30:37 +0100, Pantelis Antoniou <pa...@antoniou-consulting.com> wrote:
> >> On Nov 11, 2013, at 7:42 PM, Grant Likely wrote:
> >>> On Fri, 8 Nov 2013 17:06:09 +0200, Pantelis Antoniou <pa...@antoniou-consulting.com> wrote:
> >> I'm of the opinion that 'platform_device' shouldn't exist at all btw :)
> >> Most of it's functionality can pretty easily be subsumed by device proper
> >> and the world would be a better place :)
> >
> > I'm fine for merging some/all of the platform_device fields into struct
> > device. There are a few things, like resources, which would probably be
> > useful to have common on all struct device variants. However,
> > platform_device is far more about matching drivers to devices. Even if
> > all of platform_device went into struct device, there would still need
> > to be the platform_bus_type as the collection point for the device
> > drivers.
> >
>
> We don't really need the resources structures on OF. That information is
> present in OF format, which we can use to generate transient resources for
> usage with the standard kernel interfaces.
>
> BTW, last time I checked resource handling was broken on release.
> There are a few patches I sent out fixing it but they were probably ignored.

Please send them again. They probably got lost.

> >>> Can overlays interact in bad ways? If overlay 1 is installed before
> >>> overlay 2, what happens if overlay 1 is removed?
> >>>
> >>
> >> Yes, they can. It is not something easily fixed; the proper way would
> >> be to calculate overlay intersection points and refuse to unload.
> >
> > I think this is important. If it cannot be solved immediately, then the
> > kernel should enforce overlays always get removed in the reverse order
> > that they were added. There may be use-cases that don't like it, but it
> > is safe.
>
> OK, that makes sense.
>
> We are not talking about a global overlay stack though, we're talking about
> an overlay stack for overlays that overlap.

I'm actually talking about a global overlay stack. Otherwise you've
still got the ever-increasing-phandles problem again.

Cheers,
g.

Pantelis Antoniou

unread,
Nov 15, 2013, 3:40:02 AM11/15/13
to
Hi Grant,
A global overlay stack is easiest. Let's do that first.

There are use cases for multiple overlay stacks though. Take for instance the
case where you have multiple plugin connectors.

Each one can lead to creating a new platform device under ocp, but there is
no overlap.

Of course stackable expansion boards work just fine with a global overlay stack :)

> Cheers,
> g.

Regards

-- Pantelis

Grant Likely

unread,
Nov 15, 2013, 2:10:02 PM11/15/13
to
Agreed. baby steps...

g.

Grant Likely

unread,
Dec 10, 2013, 8:00:02 AM12/10/13
to
On Mon, 2 Dec 2013 23:44:47 +0800, Hanjun Guo <hanju...@linaro.org> wrote:
> When boot the kernel with MADT, the cpu possible and present maps should be
> prefilled for cpu topology and acpi based cpu hot-plug.
>
> The logic cpu id maps to APIC id (GIC id) is also implemented, it is needed
> for acpi processor drivers.
>
> Signed-off-by: Hanjun Guo <hanju...@linaro.org>
> ---
[...]
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index a0c2ca6..1428024 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -420,7 +420,9 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
> if (err)
> continue;
>
> +#ifndef CONFIG_ACPI
> set_cpu_present(cpu, true);
> +#endif
> max_cpus--;
> }
> }

This looks wrong. Will this break non-ACPI booting when CONFIG_ACPI is
enabled? The decision on whether or not to run code must be made at
runtime.

Grant Likely

unread,
Dec 10, 2013, 8:10:01 AM12/10/13
to
On Mon, 2 Dec 2013 23:44:49 +0800, Hanjun Guo <hanju...@linaro.org> wrote:
> Parked Address in GIC structure can be used as cpu release address
> for spin table SMP initialisation.
>
> This patch gets parked address from MADT and use it for SMP
> initialisation when DT is not available.
>
> Signed-off-by: Hanjun Guo <hanju...@linaro.org>
> ---
> arch/arm64/include/asm/acpi.h | 3 +++
> arch/arm64/kernel/smp_spin_table.c | 16 +++++++++---
> drivers/acpi/plat/arm-core.c | 50 ++++++++++++++++++++++++++++++++++++
> 3 files changed, 66 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
> index 423a32c..180de4a 100644
> --- a/arch/arm64/include/asm/acpi.h
> +++ b/arch/arm64/include/asm/acpi.h
> @@ -90,6 +90,9 @@ extern int boot_cpu_apic_id;
>
> extern void prefill_possible_map(void);
>
> +extern int acpi_get_parked_address_with_gic_id(u32 gic_id,
> + u64 *parked_address);
> +
> #else /* !CONFIG_ACPI */
> #define acpi_disabled 1 /* ACPI sometimes enabled on ARM */
> #define acpi_noirq 1 /* ACPI sometimes enabled on ARM */
> diff --git a/arch/arm64/kernel/smp_spin_table.c b/arch/arm64/kernel/smp_spin_table.c
> index 44c2280..7997873 100644
> --- a/arch/arm64/kernel/smp_spin_table.c
> +++ b/arch/arm64/kernel/smp_spin_table.c
> @@ -25,6 +25,7 @@
> #include <asm/cpu_ops.h>
> #include <asm/cputype.h>
> #include <asm/smp_plat.h>
> +#include <asm/acpi.h>
>
> extern void secondary_holding_pen(void);
> volatile unsigned long secondary_holding_pen_release = INVALID_HWID;
> @@ -47,6 +48,11 @@ static void write_pen_release(u64 val)
> __flush_dcache_area(start, size);
> }
>
> +static int get_cpu_release_addr_acpi(unsigned int cpu, u64 *parked_address)
> +{
> + return acpi_get_parked_address_with_gic_id(arm_cpu_to_apicid[cpu],
> + parked_address);
> +}
>
> static int smp_spin_table_cpu_init(struct device_node *dn, unsigned int cpu)
> {
> @@ -55,10 +61,14 @@ static int smp_spin_table_cpu_init(struct device_node *dn, unsigned int cpu)
> */
> if (of_property_read_u64(dn, "cpu-release-addr",
> &cpu_release_addr[cpu])) {
> - pr_err("CPU %d: missing or invalid cpu-release-addr property\n",
> - cpu);
>
> - return -1;
> + /* try ACPI way */
> + if (get_cpu_release_addr_acpi(cpu, &cpu_release_addr[cpu])) {
> + pr_err("CPU %d: missing or invalid cpu-release-addr property\n",
> + cpu);
> +
> + return -1;
> + }
> }
>
> return 0;
> diff --git a/drivers/acpi/plat/arm-core.c b/drivers/acpi/plat/arm-core.c
> index 8527ecc..c4c8c68 100644
> --- a/drivers/acpi/plat/arm-core.c
> +++ b/drivers/acpi/plat/arm-core.c
> @@ -262,6 +262,56 @@ static int __init acpi_parse_madt_gic_entries(void)
> return 0;
> }
>
> +/* Parked Address in ACPI GIC structure can be used as cpu release addr */
> +int acpi_get_parked_address_with_gic_id(u32 gic_id, u64 *parked_address)
> +{
> + struct acpi_table_header *table_header = NULL;
> + struct acpi_subtable_header *entry;
> + int err = 0;
> + unsigned long table_end;
> + acpi_size tbl_size;
> + struct acpi_madt_generic_interrupt *processor = NULL;
> +
> + if (!parked_address)
> + return -EINVAL;
> +
> + acpi_get_table_with_size(ACPI_SIG_MADT, 0, &table_header, &tbl_size);
> + if (!table_header) {
> + pr_warn(PREFIX "MADT table not present\n");
> + return -ENODEV;
> + }
> +
> + table_end = (unsigned long)table_header + table_header->length;
> +
> + /* Parse all entries looking for a match. */
> + entry = (struct acpi_subtable_header *)
> + ((unsigned long)table_header + sizeof(struct acpi_table_madt));
> +
> + while (((unsigned long)entry) + sizeof(struct acpi_subtable_header) <
> + table_end) {
> + if (entry->type != ACPI_MADT_TYPE_GENERIC_INTERRUPT
> + || BAD_MADT_ENTRY(entry, table_end))
> + continue;
> +
> + processor = (struct acpi_madt_generic_interrupt *)entry;
> +
> + if (processor->gic_id == gic_id) {
> + *parked_address = processor->parked_address;
> + goto out;
> + }
> +
> + entry = (struct acpi_subtable_header *)
> + ((unsigned long)entry + entry->length);

All of the casting in this table looks suspicious. If you have to resort
to casting, then the variable types are very likely wrong.

In the case immediately above, it seems that the entry size doesn't
necessarily equal the acpi_subtable_header size, in which case you
should cast the values to a void* instead of an unsigned long. That
would mean you can do this:

entry = ((void*)entry) + entry->length;

In fact, if I were writing the code, I would have two variables; the
iterator pointer as a void* and a header pointer as a struct
acpi_subtable_header*. Like so:

void *entry, *table_end;
struct acpi_subtable_header *header;

entry = ((void*)table_header) + sizeof(struct acpi_table_madt);
table_end = ((void*)table_header) + table_header->length;
while (entry + sizeof(*header)) < table_end) {
header = entry;

if (header->type != ACPI_MADT_TYPE_GENERIC_INTERRUPT ||
BAD_MADT_ENTRY(entry, table_end))
continue;
processor = entry;

if (processor->gic_id == gic_id) {
*parked_address = processor->parked_address;
goto out;
}

entry += header->length;
}

See? Much cleaner code.

Grant Likely

unread,
Dec 10, 2013, 8:10:02 AM12/10/13
to
On Mon, 2 Dec 2013 23:44:53 +0800, Hanjun Guo <hanju...@linaro.org> wrote:
> This API is similar to DT based irq_of_parse_and_map but does link
> parent/child IRQ controllers. This is tested for primary GIC PPI and GIC SPI
> interrupts and not for secondary child irq controllers.
>
> Signed-off-by: Amit Daniel Kachhap <amit....@samsung.com>
> Signed-off-by: Hanjun Guo <hanju...@linaro.org>
> ---
> drivers/acpi/plat/arm-core.c | 36 ++++++++++++++++++++++++++++++++++--
> 1 file changed, 34 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/acpi/plat/arm-core.c b/drivers/acpi/plat/arm-core.c
> index 9cc0208..17c99e1 100644
> --- a/drivers/acpi/plat/arm-core.c
> +++ b/drivers/acpi/plat/arm-core.c
> @@ -90,7 +90,7 @@ enum acpi_irq_model_id acpi_irq_model = ACPI_IRQ_MODEL_GIC;
>
> static unsigned int gsi_to_irq(unsigned int gsi)
> {
> - int irq = irq_create_mapping(NULL, gsi);
> + int irq = irq_find_mapping(NULL, gsi);

I suspect this will break FDT users that depend on the old behaviour.

g.

>
> return irq;
> }
> @@ -407,7 +407,39 @@ EXPORT_SYMBOL_GPL(acpi_gsi_to_irq);
> */
> int acpi_register_gsi(struct device *dev, u32 gsi, int trigger, int polarity)
> {
> - return -1;
> + unsigned int irq;
> + unsigned int irq_type;
> +
> + /*
> + * ACPI have no bindings to indicate SPI or PPI, so we
> + * use different mappings from DT in ACPI.
> + *
> + * For FDT
> + * PPI interrupt: in the range [0, 15];
> + * SPI interrupt: in the range [0, 987];
> + *
> + * For ACPI, using identity mapping for hwirq:
> + * PPI interrupt: in the range [16, 31];
> + * SPI interrupt: in the range [32, 1019];
> + */
> +
> + if (trigger == ACPI_EDGE_SENSITIVE &&
> + polarity == ACPI_ACTIVE_LOW)
> + irq_type = IRQ_TYPE_EDGE_FALLING;
> + else if (trigger == ACPI_EDGE_SENSITIVE &&
> + polarity == ACPI_ACTIVE_HIGH)
> + irq_type = IRQ_TYPE_EDGE_RISING;
> + else if (trigger == ACPI_LEVEL_SENSITIVE &&
> + polarity == ACPI_ACTIVE_LOW)
> + irq_type = IRQ_TYPE_LEVEL_LOW;
> + else if (trigger == ACPI_LEVEL_SENSITIVE &&
> + polarity == ACPI_ACTIVE_HIGH)
> + irq_type = IRQ_TYPE_LEVEL_HIGH;
> + else
> + irq_type = IRQ_TYPE_NONE;
> +
> + irq = irq_create_acpi_mapping(gsi, irq_type);
> + return irq;
> }
> EXPORT_SYMBOL_GPL(acpi_register_gsi);
>
> --
> 1.7.9.5
>
>
> _______________________________________________
> linaro-kernel mailing list
> linaro...@lists.linaro.org
> http://lists.linaro.org/mailman/listinfo/linaro-kernel

Hanjun Guo

unread,
Dec 10, 2013, 10:10:02 AM12/10/13
to
On 2013年12月10日 20:53, Grant Likely wrote:
> On Mon, 2 Dec 2013 23:44:47 +0800, Hanjun Guo <hanju...@linaro.org> wrote:
>> When boot the kernel with MADT, the cpu possible and present maps should be
>> prefilled for cpu topology and acpi based cpu hot-plug.
>>
>> The logic cpu id maps to APIC id (GIC id) is also implemented, it is needed
>> for acpi processor drivers.
>>
>> Signed-off-by: Hanjun Guo <hanju...@linaro.org>
>> ---
> [...]
>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>> index a0c2ca6..1428024 100644
>> --- a/arch/arm64/kernel/smp.c
>> +++ b/arch/arm64/kernel/smp.c
>> @@ -420,7 +420,9 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
>> if (err)
>> continue;
>>
>> +#ifndef CONFIG_ACPI
>> set_cpu_present(cpu, true);
>> +#endif
>> max_cpus--;
>> }
>> }
> This looks wrong. Will this break non-ACPI booting when CONFIG_ACPI is
> enabled? The decision on whether or not to run code must be made at
> runtime.

Yes, you are right. I'm reworking on this patch now.

Thanks
Hanjun

Arnd Bergmann

unread,
Dec 11, 2013, 12:30:02 AM12/11/13
to
On Tuesday 10 December 2013, Grant Likely wrote:
> > --- a/drivers/acpi/plat/arm-core.c
> > +++ b/drivers/acpi/plat/arm-core.c
> > @@ -90,7 +90,7 @@ enum acpi_irq_model_id acpi_irq_model = ACPI_IRQ_MODEL_GIC;
> >
> > static unsigned int gsi_to_irq(unsigned int gsi)
> > {
> > - int irq = irq_create_mapping(NULL, gsi);
> > + int irq = irq_find_mapping(NULL, gsi);
>
> I suspect this will break FDT users that depend on the old behaviour.

I think not, given this is only in drivers/acpi and gets added in one
of the prior patches of the same series.

Arnd

Hanjun Guo

unread,
Dec 11, 2013, 2:10:02 AM12/11/13
to
On 2013-12-10 21:03, Grant Likely wrote:
[...]
Aha, much much cleaner, thanks for the guidance, will rework my patch
and test it.

Hanjun

Grant Likely

unread,
Jan 20, 2014, 9:40:02 AM1/20/14
to
On Wed, 11 Dec 2013 06:23:04 +0100, Arnd Bergmann <ar...@arndb.de> wrote:
> On Tuesday 10 December 2013, Grant Likely wrote:
> > > --- a/drivers/acpi/plat/arm-core.c
> > > +++ b/drivers/acpi/plat/arm-core.c
> > > @@ -90,7 +90,7 @@ enum acpi_irq_model_id acpi_irq_model = ACPI_IRQ_MODEL_GIC;
> > >
> > > static unsigned int gsi_to_irq(unsigned int gsi)
> > > {
> > > - int irq = irq_create_mapping(NULL, gsi);
> > > + int irq = irq_find_mapping(NULL, gsi);
> >
> > I suspect this will break FDT users that depend on the old behaviour.
>
> I think not, given this is only in drivers/acpi and gets added in one
> of the prior patches of the same series.

Ah, okay.

g.

Sergei Shtylyov

unread,
Feb 18, 2014, 12:10:01 PM2/18/14
to
Hello.

On 02/18/2014 08:02 PM, Grant Likely wrote:

>>> On Mon, 17 Feb 2014 16:29:40 +0000, Ben Dooks <ben....@codethink.co.uk> wrote:
>>>> The of_mdiobus_register_phy() is not setting phy->irq this causing
>>>> some drivers to incorrectly assume that the PHY does not have an
>>>> IRQ associated with it or install an interrupt handler for the
>>>> PHY.

>>>> Simplify the code setting irq and set the phy->irq at the same
>>>> time so that the case if mdio->irq is not NULL is easier to read.

>>>> This fixes the issue:
>>>> net eth0: attached PHY 1 (IRQ -1) to driver Micrel KSZ8041RNLI

>>>> to the correct:
>>>> net eth0: attached PHY 1 (IRQ 416) to driver Micrel KSZ8041RNLI

>>>> Signed-off-by: Ben Dooks <ben....@codethink.co.uk>
>>>> ---
>>>> drivers/of/of_mdio.c | 12 ++++++------
>>>> 1 file changed, 6 insertions(+), 6 deletions(-)

>>>> diff --git a/drivers/of/of_mdio.c b/drivers/of/of_mdio.c
>>>> index 875b7b6..7b3e7b0 100644
>>>> --- a/drivers/of/of_mdio.c
>>>> +++ b/drivers/of/of_mdio.c
>>>> @@ -44,7 +44,7 @@ static int of_mdiobus_register_phy(struct mii_bus *mdio, struct device_node *chi
>>>> {
>>>> struct phy_device *phy;
>>>> bool is_c45;
>>>> - int rc, prev_irq;
>>>> + int rc;
>>>> u32 max_speed = 0;
>>>>
>>>> is_c45 = of_device_is_compatible(child,
>>>> @@ -55,11 +55,11 @@ static int of_mdiobus_register_phy(struct mii_bus *mdio, struct device_node *chi
>>>> return 1;
>>>>
>>>> if (mdio->irq) {
>>>> - prev_irq = mdio->irq[addr];
>>>> - mdio->irq[addr] =
>>>> - irq_of_parse_and_map(child, 0);
>>>> - if (!mdio->irq[addr])
>>>> - mdio->irq[addr] = prev_irq;

>>> I cannot for the life for me remeber why the code was structured that
>>> way. Your change is better.

>>>> + rc = irq_of_parse_and_map(child, 0);
>>>> + if (rc > 0) {
>>>> + mdio->irq[addr] = rc;
>>>> + phy->irq = rc;
>>>> + }
>>>> }

>>> The outer if is merely protecting against no irq array being allocated
>>> for the bus. Would not the following be better:

>>> rc = irq_of_parse_and_map(child, 0);
>>> if (rc > 0) {
>>> phy->irq = rc;
>>> if (mdio->irq)
>>> mdio->irq[addr] = rc;
>>> }

>> Thanks, that makes sense, although we've both failed to work
>> out if mdio->irq is set, and rc <= 0 case, so:

>> rc = irq_of_parse_and_map(child, 0);
>> if (rc > 0) {
>> phy->irq = rc;
>> if (mdio->irq)
>> mdio->irq[addr] = rc;
>> } else {
>> if (mdio->irq)
>> phy->irq = mdio->irq[addr];
>> }

> Is this actually a valid thing to do? I think the only time mdio->irq[]
> is non-zero is when it is set to PHY_POLL. Is it valid to set phy->irq
> to PHY_POLL? I didn't think it was.

It is valid, AFAIK.

> g.

>> --
>> Ben Dooks http://www.codethink.co.uk/
>> Senior Engineer Codethink - Providing Genius

WBR, Sergei

Grant Likely

unread,
Mar 15, 2014, 9:10:03 AM3/15/14
to
On Thu, 13 Mar 2014 14:51:56 -0700, Kevin Hilman <khi...@linaro.org> wrote:
> Josh Cartwright <jo...@codeaurora.org> writes:
>
> > On Thu, Mar 13, 2014 at 01:46:50PM -0700, Kevin Hilman wrote:
> >> On Fri, Feb 21, 2014 at 4:25 AM, Marek Szyprowski
> >> <m.szyp...@samsung.com> wrote:
> >> > Enable reserved memory initialization from device tree.
> >> >
> >> > Signed-off-by: Marek Szyprowski <m.szyp...@samsung.com>
> >>
> >> This patch has hit -next and several legacy (non-DT) boot failures
> >> were detected and bisected down to this patch. A quick scan looks
> >> like there needs to be some sanity checking whether a DT is even
> >> present.
> >
> > Hmm. Yes, the code unconditionally calls of_flat_dt_scan(), which will
> > gladly touch initial_boot_params, even though it may be uninitialized.
> > The below patch should allow these boards to boot...
> >
> > However, I'm wondering if there is a good reason why we don't parse the
> > /reserved-memory nodes at the right after we parse the /memory nodes as
> > part of early_init_dt_scan()...
> >
> > Thanks,
> > Josh
> >
> > --8<--
> > Subject: [PATCH] drivers: of: only scan for reserved mem when fdt present
> >
> > Reported-by: Kevin Hilman <khi...@linaro.org>
> > Signed-off-by: Josh Cartwright <jo...@codeaurora.org>
>
> This gets legacy boot working again. Thanks.
>
> Tested-by: Kevin Hilman <khi...@linaro.org>

Applied and confirmed on non-DT qemu boot. Thanks. It will be pushed out
shortly.

g.
0 new messages