Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

[RFC PATCH] ulist: generic data structure to build unique lists

2 views
Skip to first unread message

Arne Jansen

unread,
Oct 10, 2011, 5:40:02 AM10/10/11
to
ulist is a generic data structures to hold a collection of unique u64
values. The only operations it supports is adding to the list and
enumerating it.

It is possible to store an auxiliary value along with the key.
The implementation is preliminary and can probably be sped up
significantly.
The main purpose of this submission is to get some idea whether this
data structure is worth having generally available.

It is used by btrfs subvolume quota to translate recursions into iterative
loops and to gather a unique list of roots. Once I had built this data
structure, several uses for it popped up.

Signed-off-by: Arne Jansen <sens...@gmx.net>
---
include/linux/ulist.h | 46 ++++++++++++++++++++++
lib/ulist.c | 103 +++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 149 insertions(+), 0 deletions(-)

diff --git a/include/linux/ulist.h b/include/linux/ulist.h
new file mode 100644
index 0000000..ebba91a
--- /dev/null
+++ b/include/linux/ulist.h
@@ -0,0 +1,46 @@
+/*
+ * Copyright (C) 2011 STRATO AG
+ * written by Arne Jansen <sens...@gmx.net>
+ * Distributed under the GNU GPL license version 2.
+ *
+ * ulist is a generic data structures to hold a collection of unique u64
+ * values. The only operations it supports is adding to the list and
+ * enumerating it.
+ * It is possible to store an auxiliary value along with the key.
+ *
+ * The implementation is preliminary and can probably be sped up
+ * significantly. A first step would be to store the values in an rbtree
+ * as soon as ULIST_SIZE is exceeded.
+ */
+
+#ifndef __ULIST__
+#define __ULIST__
+
+#define ULIST_SIZE 16
+
+struct ulist_node {
+ u64 val;
+ unsigned long aux;
+ unsigned long next;
+};
+
+struct ulist {
+ unsigned long nnodes;
+ unsigned long gfp_mask;
+ struct ulist_node *nodes;
+ unsigned long nodes_alloced;
+ struct ulist_node int_nodes[ULIST_SIZE];
+};
+
+void ulist_init(struct ulist *ulist, unsigned long gfp_mask);
+void ulist_fini(struct ulist *ulist);
+void ulist_reinit(struct ulist *ulist);
+struct ulist *ulist_alloc(unsigned long gfp_mask);
+void ulist_free(struct ulist *ulist);
+
+/* returns < 0 on error, 0 on duplicate, 1 on insert */
+int ulist_add(struct ulist *ulist, u64 val, unsigned long aux);
+
+struct ulist_node *ulist_next(struct ulist *ulist, struct ulist_node *prev);
+
+#endif
diff --git a/lib/ulist.c b/lib/ulist.c
new file mode 100644
index 0000000..641a90f
--- /dev/null
+++ b/lib/ulist.c
@@ -0,0 +1,103 @@
+/*
+ * Copyright (C) 2011 STRATO AG
+ * written by Arne Jansen <sens...@gmx.net>
+ * Distributed under the GNU GPL license version 2.
+ *
+ * ulist is a generic data structures to hold a collection of unique u64
+ * values. The only operations it supports is adding to the list and
+ * enumerating it.
+ * It is possible to store an auxiliary value along with the key.
+ *
+ * The implementation is preliminary and can probably be sped up
+ * significantly. A first step would be to store the values in an rbtree
+ * as soon as ULIST_SIZE is exceeded.
+ */
+
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/ulist.h>
+
+void ulist_init(struct ulist *ulist, unsigned long gfp_mask)
+{
+ ulist->nnodes = 0;
+ ulist->gfp_mask = gfp_mask;
+ ulist->nodes = ulist->int_nodes;
+ ulist->nodes_alloced = ULIST_SIZE;
+}
+
+void ulist_fini(struct ulist *ulist)
+{
+ if (ulist->nodes_alloced > ULIST_SIZE)
+ kfree(ulist->nodes);
+}
+
+void ulist_reinit(struct ulist *ulist)
+{
+ ulist_fini(ulist);
+ ulist_init(ulist, ulist->gfp_mask);
+}
+
+struct ulist *ulist_alloc(unsigned long gfp_mask)
+{
+ struct ulist *ulist = kmalloc(sizeof(*ulist), gfp_mask);
+
+ if (!ulist)
+ return NULL;
+
+ ulist_init(ulist, gfp_mask);
+
+ return ulist;
+}
+
+void ulist_free(struct ulist *ulist)
+{
+ if (!ulist)
+ return;
+ ulist_fini(ulist);
+ kfree(ulist);
+}
+
+int ulist_add(struct ulist *ulist, u64 val, unsigned long aux)
+{
+ int i;
+
+ for (i = 0; i < ulist->nnodes; ++i) {
+ if (ulist->nodes[i].val == val)
+ return 0;
+ }
+
+ if (ulist->nnodes > ulist->nodes_alloced) {
+ u64 new_alloced = ulist->nodes_alloced + 128;
+ struct ulist_node *new_nodes = kmalloc(sizeof(*new_nodes) *
+ new_alloced, ulist->gfp_mask);
+
+ if (!new_nodes)
+ return -ENOMEM;
+ memcpy(new_nodes, ulist->nodes,
+ sizeof(*new_nodes) * ulist->nnodes);
+ if (ulist->nodes_alloced > ULIST_SIZE)
+ kfree(ulist->nodes);
+ ulist->nodes = new_nodes;
+ ulist->nodes_alloced = new_alloced;
+ }
+ ulist->nodes[ulist->nnodes].val = val;
+ ulist->nodes[ulist->nnodes].aux = aux;
+ ulist->nodes[ulist->nnodes].next = ulist->nnodes + 1;
+ ++ulist->nnodes;
+
+ return 1;
+}
+
+struct ulist_node *ulist_next(struct ulist *ulist, struct ulist_node *prev)
+{
+ if (ulist->nnodes == 0)
+ return NULL;
+
+ if (!prev)
+ return &ulist->nodes[0];
+
+ if (prev->next < 0 || prev->next >= ulist->nnodes)
+ return NULL;
+
+ return &ulist->nodes[prev->next];
+}
--
1.7.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Andrew Morton

unread,
Oct 11, 2011, 7:40:01 PM10/11/11
to
On Mon, 10 Oct 2011 11:22:05 +0200
Arne Jansen <sens...@gmx.net> wrote:

> ulist is a generic data structures to hold a collection of unique u64
> values. The only operations it supports is adding to the list and
> enumerating it.
>
> It is possible to store an auxiliary value along with the key.
> The implementation is preliminary and can probably be sped up
> significantly.
> The main purpose of this submission is to get some idea whether this
> data structure is worth having generally available.
>
> It is used by btrfs subvolume quota to translate recursions into iterative
> loops and to gather a unique list of roots. Once I had built this data
> structure, several uses for it popped up.
>
>

> ...


>
> +#define ULIST_SIZE 16
> +
> +struct ulist_node {
> + u64 val;
> + unsigned long aux;
> + unsigned long next;
> +};
> +
> +struct ulist {
> + unsigned long nnodes;
> + unsigned long gfp_mask;
> + struct ulist_node *nodes;
> + unsigned long nodes_alloced;
> + struct ulist_node int_nodes[ULIST_SIZE];
> +};

Please document your data structures. In particular, the relationship
between the various fields. And ulist_node.aux and .next are rather
mysterious. Would prefer not to have to reverse engineer these things
from an implementation.

> +void ulist_init(struct ulist *ulist, unsigned long gfp_mask);
> +void ulist_fini(struct ulist *ulist);
> +void ulist_reinit(struct ulist *ulist);
> +struct ulist *ulist_alloc(unsigned long gfp_mask);
> +void ulist_free(struct ulist *ulist);

These are all missing EXPORT_SYMBOL.

And they're all missing interface documentation ;)

The locking requirements should be documented. It appears to be
"caller provides it". That's a good design, but let's spell it out.

Bear in mind that some callers might wish to use rwlocks or rwsems, so
really good documentation would tell them which interface functions
need a read lock and which need a write lock. Bear in mind that such
decisions might constrain future additions to the library code.

Putting the gfp_mask into the ulist is a mistake, trust me. Been there
before. It should be passed in as an argument by the caller.

> +void ulist_fini(struct ulist *ulist)
> +{
> + if (ulist->nodes_alloced > ULIST_SIZE)
> + kfree(ulist->nodes);
> +}

What a strange function. I look forward to seeing the documentation.

We have krealloc().

Generally, I really do ask that you provide us a complete description
of what this utility is supposed to do. Once that is understood we can
then start to determine why none of the existing container helpers were
suitable, and were not capable of being modified to be suitable.

For example, from a quick squint at the code I'm wondering why
flex_array was unsuitable.

Arne Jansen

unread,
Nov 1, 2011, 3:40:01 AM11/1/11
to
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12.10.2011 01:37, Andrew Morton wrote:
>
> Generally, I really do ask that you provide us a complete description
> of what this utility is supposed to do. Once that is understood we can
> then start to determine why none of the existing container helpers were
> suitable, and were not capable of being modified to be suitable.

Thanks for your kind comments. I sent an updated version to the list
accordingly. Please let me know if the documentation is good enough
and shows the purpose clearly.

>
> For example, from a quick squint at the code I'm wondering why
> flex_array was unsuitable.
>

ulist is currently implemented as an array, but this is not the way
to go to scale to larger lists. I'd rather switch to using rbtrees
over a certain threshold and will do so as soon as we settle on
putting it in lib/ instead of burying it inside fs/btrfs.

- -Arne
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBAgAGBQJOr6GeAAoJEPa4OwE4JkZ9TOcP/ArwGeCv8sQejihSahoXPum8
3F5/QhvNpQMtl//3iQ6duuyCbSNyskGQ6xViYmIhYoH8Q9E+DOgQrSa4lSgsJxte
4ItbiobmTnCOWHqbUckkjp8toBleXC2iSjvtpt7DCfP/H1enAi3lE9FONB5A6RcP
ekGoNlp0LIFqLa0hk8qZ8JGetxuR1YWcunkrZRAze/cj4nGcegxiH9bLvEWgEShk
o0QT8RNF1BFtef4unWyxnND+IUbwfvVMU61cKANbXvKhUfG6QxDH62QfYyxQ/Unw
S2IXFZ41I/vgZIE4MKUABYFhigFkmBC3Mze7BGeF084SussazisHdA6UQsQi/TPP
+fU73XTmpFbpg4z2XWAyDkq96qIYijgkSLwelSMRBza8/A3fGnLcKZSCNH3LbjGu
aBjIi8A4VbasiBt/Hx1Vu2v9feX+S1tlmiODoLVp5bsvjD592bAfJK8QOf9RAFYK
0PMV2Fw6yplv6RFfZ32aJwtsryLN6iTOq1LXOz71KBVDaMuZXEur298WhrVCmcEN
270Kd3cBbF/JAoCODw97f/UPE9qWF1Vye+dTEjIWqb8bk8RBrzW5C6Wsxk9BRoU4
dONNMet9ykhHJIbchlYiOv3SMz04+yJJV74MmypefXj6tbxrdYm4gvebx28bC4jm
Ok/DXMFAdLm6wCicjKgI
=GHYG
-----END PGP SIGNATURE-----
0 new messages