OLE - link or embed?

Richard Heathfield

unread,

Feb 18, 2010, 12:11:07 PM2/18/10

to

When designing dynamic data structures (lists, stacks, trees, etc), one
of the many choices open to us is one that was brought up in another
thread recently - whether to link to existing data (what we might call
an "index" approach), or to create a copy of the data and store that
within the structure (a "container" approach).

Clearly each has its merits. I thought it might be useful to take the
subject out of its rather heated context elsegroup, and give non-spinny
people an opportunity to air their views on the subject, spinlessly. (If
the spin spins, let it. It's of no consequence.)

Anyway, I'll kick off with one obvious advantage of each.

Let's say you have data like this:

{
char *name;
char *addr;
char *phone;
}

and you want to be able to search by any of those three criteria. One
good way would be to create three indices (pointer lists), one for each
sort order. To have three copies of the data would obviously be silly.

On the other hand, where *is* the data? It has to be somewhere, right?

Containers are very good at storing data on our behalf. Consider:

rec_readnext(tree **t, FILE *fp)
{
rec foo = {0};
get_csv_field(foo.name, fp);
get_csv_field(foo.addr, fp);
get_csv_field(foo.phone, fp);
tree_add(t, &foo, sizeof foo, rec_constructor);
}

By the time rec_readnext returns, foo no longer exists, so we can't use
it to store our data. We have to keep the data *somewhere*, and keeping
data somewhere is what containers are good at. By using a container data
structure, we are able to encapsulate our data reading functions and use
local temporary objects to get the data into memory; the container then
creates a safe copy, and the original can safely be dispensed with.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within

Malcolm McLean

unread,

Feb 18, 2010, 12:59:08 PM2/18/10

to

On Feb 18, 7:11 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>
> and you want to be able to search by any of those three criteria. One
> good way would be to create three indices (pointer lists), one for each
> sort order. To have three copies of the data would obviously be silly.
>

It depends whether the data is under edit.
Most GUI programs, for instance consist basically of a dataset, which
could be any structure (tree, arrays of arrays of arrays, linked list,
graph, single stream of data etc and mixtures of these) together with
a viewer onto the data, and some of those viewers will have edit
ability.
Now if the controls cache the data, every time the user edits, the
edit has to be detected and the model kept in sych. That can be
difficult. In this case, it is better to have one copy of the data,
and the controls read from it directly, and write to it directly.

>
> By using a container data
> structure, we are able to encapsulate our data reading functions and use
> local temporary objects to get the data into memory; the container then
> creates a safe copy, and the original can safely be dispensed with.
>

There's a snag with this approach using C. If the elements are fixed-
size structures, then you are OK. However most data contains variable-
length fields (text strings are the obvious example). These have to be
allocated via malloc. However the container has no way of knowing
where these fields are, without an absurdly clunky interface, so the
container has to be traversed and emptied instead of beign destoyed.
This desn't apply in C++, where you can add destructors, nor in
languages with garbage collection.

Richard Heathfield

unread,

Feb 18, 2010, 1:07:28 PM2/18/10

to

Malcolm McLean wrote:
> On Feb 18, 7:11 pm, Richard Heathfield <r...@see.sig.invalid> wrote:

<snip>

>> By using a container data
>> structure, we are able to encapsulate our data reading functions and use
>> local temporary objects to get the data into memory; the container then
>> creates a safe copy, and the original can safely be dispensed with.
>>
> There's a snag with this approach using C.

I have not found this.

> If the elements are fixed-
> size structures, then you are OK. However most data contains variable-
> length fields (text strings are the obvious example). These have to be
> allocated via malloc. However the container has no way of knowing
> where these fields are, without an absurdly clunky interface, so the
> container has to be traversed and emptied instead of beign destoyed.

Designing an unabsurdly unclunky interface is an interesting challenge.

> This desn't apply in C++, where you can add destructors, nor in
> languages with garbage collection.

You can add a destructor in C, too. You just have to call it explicitly,
that's all.

Richard

unread,

Feb 18, 2010, 1:11:53 PM2/18/10

to

Richard Heathfield <r...@see.sig.invalid> writes:

> When designing dynamic data structures (lists, stacks, trees, etc), one
> of the many choices open to us is one that was brought up in another
> thread recently - whether to link to existing data (what we might call
> an "index" approach), or to create a copy of the data and store that
> within the structure (a "container" approach).
>

Please to comp.lang.programming.

None of what you mention in in the C standard....

Branimir Maksimovic

unread,

Feb 18, 2010, 1:24:51 PM2/18/10

to

Richard Heathfield wrote:
By using a container data
> structure, we are able to encapsulate our data reading functions and use
> local temporary objects to get the data into memory; the container then
> creates a safe copy, and the original can safely be dispensed with.
>

Yes, in c++, containers are of very good use. Since they are in standard
library, everybody uses them, and number of custom containers is greatly
reduced.
Also container can perform internal memory management for user.
It would be good that someone adds such feature in C stdlib, like it is
added for C++.

Greets

ImpalerCore

unread,

Feb 18, 2010, 2:17:45 PM2/18/10

to

On Feb 18, 12:59 pm, Malcolm McLean <malcolm.mcle...@btinternet.com>
wrote:

You can use "destructors" in a sense with a C list structure. I use a
generic void*/casting implementation and I have something like this.

my_list_t* restaurant = NULL;
my_list_t* l = NULL;
my_string_t* s = NULL;
int i;

char* menu[] = { "Hamburger", "Chicken Nuggets", "French Fries", "Side
Salad", "Soft Drink" };

for ( i = 0; i < (int)C_ARRAY_N( menu ); ++i )
{
s = my_string_new( menu[i] );
if ( s ) {
restaurant = my_list_insert_sorted( restaurant, s,
my_string_vcmp );
}

for ( l = restaurant; l != NULL; l = l->next ) {
printf( "%s\n", c_str( (my_string_t*)l->object ) );
}

my_list_free( restaurant, my_string_vfree );
}

The my_list_free's second argument is a function pointer, required to
fit the type 'void (*)( void* )' just like 'free'. I implement it in
the following way:

void my_string_free( my_string_t* mstr )
{
/* Free all malloced stuff */
}

void my_string_vfree( void* p )
{
my_string_t* mstr = (my_string_t*)p;
my_string_free( mstr );
}

This is how I destroy a list that owns all the my_string_t objects it
refers to. If I create a shallow copy (a view) of the list, to
destroy the copy I simply pass a no-op destructor (NULL).

my_list_free( restaurant_shallow_copy, NULL );

Of course this requires defining a destructor for each object that is
in a linked list that owns the data, but you have to do the same thing
in C++. It's just in C that the you have to do the dirty work. And
you can also manually free each item node by node as well for those
who prefer to do it that way. The interface is clunky (hopefully you
won't classify it as *absurdly* clunky) compared to C++ standards, but
for C I think it works well. I do the same thing for generic object
comparison to insert into a generic list or to sort objects.

The main snag is that for structures that consist of other objects,
you still have to define an explicit destructor for it. In other
words, the interface is not composable, compared to C++, where a class
is smart enough to call the right destructor for each of its private
members.

Branimir Maksimovic

unread,

Feb 18, 2010, 2:31:23 PM2/18/10

to

ImpalerCore wrote:
>
> The main snag is that for structures that consist of other objects,
> you still have to define an explicit destructor for it. In other
> words, the interface is not composable, compared to C++, where a class
> is smart enough to call the right destructor for each of its private
> members.

Well, microsoft COM is defined in terms that can be implemented in
C and assembler. There is vtable layout, inheritance, aggregation etc...
You can do it all in C except that last book which talks about
this is out of print ten years ago...
C++ compiler just automatically generates calls to appropriate
functions and generates vtables.
Point is that you don;t access data directly,
rather through "interface". It is ok practice...

Greets

Malcolm McLean

unread,

Feb 18, 2010, 2:32:36 PM2/18/10

to

On Feb 18, 9:17 pm, ImpalerCore <jadil...@gmail.com> wrote:
> On Feb 18, 12:59 pm, Malcolm McLean <malcolm.mcle...@btinternet.com>
>

> The main snag is that for structures that consist of other objects,
> you still have to define an explicit destructor for it. In other
> words, the interface is not composable, compared to C++, where a class
> is smart enough to call the right destructor for each of its private

> members.- Hide quoted text -
>
It's got to be

ARRAY *array(size_t datasize, int /*yes!*/ capacity, void (*destructor)
(void *obj, void *ptr), void *ptr);

(If you don't provide the ptr then you can get yourself tangled up in
all sorts of nasties, though generally it will be null.)

It works but it's cluntzy. I don't think many people would want to use
an array like that.

Malcolm McLean

unread,

Feb 18, 2010, 2:34:58 PM2/18/10

to

On Feb 18, 8:11 pm, Richard <rgrd...@gmail.com> wrote:

> Richard Heathfield <r...@see.sig.invalid> writes:
>
> Please to comp.lang.programming.
>
> None of what you mention in in the C standard....
>

The thread is about a mixture of language and data structure issues,
so is topical.

ImpalerCore

unread,

Feb 18, 2010, 3:53:24 PM2/18/10

to

On Feb 18, 2:32 pm, Malcolm McLean <malcolm.mcle...@btinternet.com>
wrote:

My version of array looks like this:

struct my_array
{
void* buffer;
size_t size;
size_t alloc_size;
};
typedef struct my_array my_array_t;

my_array_t* gmy_array_alloc( size_t sz, size_t n, my_bool zeroize );

#define my_array_new ( type, n ) (gmy_array_alloc( sizeof(type), (n),
FALSE ))
#define my_array_new0( type, n ) (gmy_array_alloc( sizeof(type), (n),
TRUE ))

Then I have a generic function, and a macro wrapper to several
functions. Here is one example.

i.e. a push_back function.

my_array_t* gmy_array_push_back( my_array_t* array, void* object,
size_t sz );

(As long as you know the size of the object, you know everything you
need to know to index and copy the object into the internal dynamic
sized array.)

Then I have a macro wrapper.

#define my_array_push_back( array, object, type )
(gmy_array_push_back( (array), (object), sizeof(type) ))

This requires that all array functions that depend on the size of the
object, like insert, indexing, etc, get passed the type through the
macro to the generic version that does the gruntwork. It makes using
the interface look like this.

---

my_array_t* numbers = NULL;
int i;

int odd[] = { 1, 3, 5, 7, 9 };
int even[] = { 2, 4, 6, 8, 10 };

for ( i = 0; i < (int)C_ARRAY_N( odd ); ++i )
{
my_array_push_back( numbers, &odd[i], int );
my_array_push_back( numbers, &even[i], int );
}

printf( "numbers:"
for ( i = 0; i < my_array_size( numbers ); ++i ) {
printf( " %d", my_array_i( numbers, i, int ) );
}
printf( "\n" );

my_array_free( numbers );

---

My array structure copies everything by value, so it works well for
things like int and other fixed size structs. A copy of my array
makes a deep copy, contrasted with a copy of my linked list is a
shallow copy.

If I included structs with dynamic memory, my inclination is to make
sure copying an array will make a deep copy of its elements. The
interface would need to sport a destructor and copy constructor (for
deep copying) function pointers needed to make it work. You can
register the function pointers once per array, and not have to worry
about them again. I haven't looked at this use case, so I haven't
attempted to design or implement it.

The main trouble I'm dealing with is how to communicate an internal
memory error from the interface to the user. Some functions I can get
by with returning NULL and having the user check the result. In other
functions, the return value isn't used to indicate failure at all, but
it still has an internal malloc or realloc, i.e.
my_string_append_printf. I used to depend on errno = ENOMEM, but that
behavior is unportable. To work around it, I'm having to result to
using a special allocator wrapper that maintains an errno like state
just for memory failures. Now I mimic errno=ENOMEM using a global
c_enomem like variable that functions very similar to errno. It's
better than ignoring it completely, which is what I did before.

As far as whether people will use it, it depends in the value is worth
more than the cost of learning it. Having an generic array may be
worth it to some people even if the interface is somewhat clumsy. Of
course people will continue to develop and use their own resizing
dynamic array, string, list, tree, hash table, etc. if it doesn't suit
them, and I'm fine with that (it's what I'm doing right now ;).

ImpalerCore

unread,

Feb 18, 2010, 4:04:17 PM2/18/10

to

My apologies. My previous example left out the allocation part. I
forgot the 'my_array_new' statement to allocate. Not paying attention
to memory allocation errors either.

---

my_array_t* numbers = NULL;
int i;

int odd[] = { 1, 3, 5, 7, 9 };
int even[] = { 2, 4, 6, 8, 10 };

numbers = my_array_new( int, 10 );

ImpalerCore

unread,

Feb 18, 2010, 4:51:29 PM2/18/10

to

I was taking engineering computer programming classes a little over 10
years ago, so I missed the whole COM thing. It's nice to know it's
out there though.

> Greets

Branimir Maksimovic

unread,

Feb 18, 2010, 5:45:09 PM2/18/10

to

ImpalerCore wrote:
> I was taking engineering computer programming classes a little over 10
> years ago, so I missed the whole COM thing. It's nice to know it's
> out there though.
>

/* C style interface */

typedef struct IUnknownVtbl
{
HRESULT ( __stdcall __RPC_FAR *QueryInterface )(
IUnknown __RPC_FAR * This,
/* [in] */ REFIID riid,
/* [out] */ void __RPC_FAR *__RPC_FAR *ppvObject);

ULONG ( __stdcall __RPC_FAR *AddRef )(
IUnknown __RPC_FAR * This);

ULONG ( __stdcall __RPC_FAR *Release )(
IUnknown __RPC_FAR * This);

} IUnknownVtbl;

interface IUnknown
{
CONST_VTBL struct IUnknownVtbl __RPC_FAR *lpVtbl;
};

;)

Greets

Chris M. Thomasson

unread,

Feb 18, 2010, 6:43:11 PM2/18/10

to

"Branimir Maksimovic" <bm...@hotmail.com> wrote in message
news:4b7dc300$1...@news.x-privat.org...

> ImpalerCore wrote:
>> I was taking engineering computer programming classes a little over 10
>> years ago, so I missed the whole COM thing. It's nice to know it's
>> out there though.
>>
> /* C style interface */

Minimalist generic interfaces for C:

http://clc.pastebin.com/f52a443b1

They work pretty darn well... :^)

Mark

unread,

Feb 18, 2010, 10:36:49 PM2/18/10

to

ImpalerCore wrote:
> My version of array looks like this:
>
> struct my_array
> {
> void* buffer;
> size_t size;
> size_t alloc_size;
> };
> typedef struct my_array my_array_t;
>

[snip]

Isn't it the same what is often called 'vector' ? I know there's a vector
class in C++ standard library, but I refer to a generic dats structure with
the same name.

--
Mark

ImpalerCore

unread,

Feb 18, 2010, 11:10:12 PM2/18/10

to

It is similar in essence to what std::vector provides. I suppose that
they wanted to distinguish the C++ data structure name from a regular
C array. In C land, I mostly see people referring to creating a
resizable or dynamic array, so I went with that nomenclature. I'm not
completely opposed to calling it vector, but at the moment I think
that array is a better name.

Best regards,
John D.

> --
> Mark

Malcolm McLean

unread,

Feb 19, 2010, 2:36:12 AM2/19/10

to

On Feb 19, 6:10 am, ImpalerCore <jadil...@gmail.com> wrote:
> I'm not completely opposed to calling it vector, but at the moment I think
> that array is a better name.
>

A vector is a 1d array
A matrix is a 2d array
An array is a N-dimensional array (including a vector or a matrix).

Michael Foukarakis

unread,

Feb 19, 2010, 3:51:41 AM2/19/10

to

On Feb 19, 9:36 am, Malcolm McLean <malcolm.mcle...@btinternet.com>
wrote:
> An *array* is a N-dimensional *array* (including a vector or a matrix).

Heh, that made me giggle.

Phred Phungus

unread,

Feb 19, 2010, 4:24:53 AM2/19/10

to

This is C:

void *DLGetData(DLLIST *Item,
int *Tag,
size_t *Size)
{
void *p = NULL;

if(Item != NULL)
{
if(Tag != NULL)
{
*Tag = Item->Tag;
}
if(Size != NULL)
{
*Size = Item->Size;
}
p = Item->Object;
}

return p;
}

http://i47.tinypic.com/a4bjax.jpg

How do I force data into this data structure?
--
fred

Michael Foukarakis

unread,

Feb 19, 2010, 4:26:10 AM2/19/10

to

On Feb 18, 7:11 pm, Richard Heathfield <r...@see.sig.invalid> wrote:

> When designing dynamic data structures (lists, stacks, trees, etc), one
> of the many choices open to us is one that was brought up in another
> thread recently - whether to link to existing data (what we might call
> an "index" approach), or to create a copy of the data and store that
> within the structure (a "container" approach).

This is quite interesting. I ran into it a few days ago after I
decided to use GLib and its nice collection of data types in a
networked application. I found it was extremely convenient to keep
pointers (an index) to certain messages that would represent preset
requests/responses between application instances so I only had to
allocate them once (or keep them in a file or whatever) and never
worry about their lifetime or deallocation. All I'd need in case I
wanted to send one was find the appropriate pointer and pass it to
send(), for instance. A container would do me no good in that
particular situation.

> Anyway, I'll kick off with one obvious advantage of each.
>
> Let's say you have data like this:
>
> {
> char *name;
> char *addr;
> char *phone;
>
> }
>
> and you want to be able to search by any of those three criteria. One
> good way would be to create three indices (pointer lists), one for each
> sort order. To have three copies of the data would obviously be silly.

One container/index with a comparator[*] function would be just as
usable, and not at all silly, too. But then the question becomes;
which ordering do you want to impose on the data, which I guess
essentially leads to having 3 indices. ;-)

[*] A void *func(void *data, void *user_data); which could be passed
as an argument to a myarray_foreach() function which would allow us to
define a search function for each of the three (or N) fields...or
something to that effect.

Richard Heathfield

unread,

Feb 19, 2010, 4:58:21 AM2/19/10

to

Malcolm McLean wrote:
> On Feb 19, 6:10 am, ImpalerCore <jadil...@gmail.com> wrote:
>> I'm not completely opposed to calling it vector, but at the moment I think
>> that array is a better name.
>>
> A vector is a 1d array

Right.

> A matrix is a 2d array

Right.

> An array is a N-dimensional array (including a vector or a matrix).

A first order tensor is a 1d array.
A second order tensor is a 2d array.
A third order tensor is a 3d array.

And so on. It doesn't take long before a pattern can be seen to emerge. :-)

bartc

unread,

Feb 19, 2010, 6:05:16 AM2/19/10

to

"Phred Phungus" <Ph...@example.invalid> wrote in message
news:7u73n5...@mid.individual.net...

> Malcolm McLean wrote:
>> On Feb 18, 8:11 pm, Richard <rgrd...@gmail.com> wrote:
>>> Richard Heathfield <r...@see.sig.invalid> writes:
>>>
>>> Please to comp.lang.programming.
>>>
>>> None of what you mention in in the C standard....
>>>
>> The thread is about a mixture of language and data structure issues,
>> so is topical.
>>
>
> This is C:
>
> void *DLGetData(DLLIST *Item,
> int *Tag,
> size_t *Size)
> {
> void *p = NULL;
> if(Item != NULL)
> {
> if(Tag != NULL)
> {
> *Tag = Item->Tag;
> }
> if(Size != NULL)
> {
> *Size = Item->Size;
> }
> p = Item->Object;
> }
> return p;
> }

> How do I force data into this data structure?

I guess, something like the following?

void DLSetData(DLLIST *Item,
int Tag,
size_t Size, void* Object)
{
if(Item != NULL)
{
Item->Tag = Tag;
Item->Size = Size;
Item->Object = Object;
}
}

DLSetData(p, STRING, 9, "testdata");

The node won't own the data in this case. It could probably make it's own
copy, provided it is flat data (ie. no embedded pointers to data that also
needs duplicating).

--
bartc

jacob navia

unread,

Feb 19, 2010, 7:27:57 AM2/19/10

to

Richard Heathfield a �crit :

> When designing dynamic data structures (lists, stacks, trees, etc), one
> of the many choices open to us is one that was brought up in another
> thread recently - whether to link to existing data (what we might call
> an "index" approach), or to create a copy of the data and store that
> within the structure (a "container" approach).
>

You can have both.

In the container library I am writing you can:
(1) Store a pointer to your data and then manage the data yourself, or
(2) Store the data itself.

> Clearly each has its merits.

If you choose (1) you are responsible for allocating/deallocating
data. If you chose (2) it is the container responsability to allocate
and free the data. Note that the container copies the data. If you
receive a pointer to the data you should not free it of course.

jacob navia

unread,

Feb 19, 2010, 7:29:49 AM2/19/10

to

Richard Heathfield a �crit :

>
> You can add a destructor in C, too. You just have to call it explicitly,
> that's all.
>

Would it be useful to store a pointer to a destructor function in the
container?

Till now I haven't see any situation where that would be necessary. The
user can always call the destructor if needed.

Richard Heathfield

unread,

Feb 19, 2010, 8:02:50 AM2/19/10

to

jacob navia wrote:
> Richard Heathfield a �crit :
>> When designing dynamic data structures (lists, stacks, trees, etc),
>> one of the many choices open to us is one that was brought up in
>> another thread recently - whether to link to existing data (what we
>> might call an "index" approach), or to create a copy of the data and
>> store that within the structure (a "container" approach).
>>
>
> You can have both.

Right, because you can use a container-style structure for holding
pointers just as easily as you can hold any other type of object.

This is a strong argument in favour of containment, since it gives the
flexibility to serve both needs. A container can contain an index, but
an index can only index a container.

> In the container library I am writing you can:
> (1) Store a pointer to your data and then manage the data yourself, or
> (2) Store the data itself.
>
>> Clearly each has its merits.
>
> If you choose (1) you are responsible for allocating/deallocating
> data. If you chose (2) it is the container responsability to allocate
> and free the data. Note that the container copies the data. If you
> receive a pointer to the data you should not free it of course.

Yes. I think all of the above is generally true of well-designed containers.

Richard Heathfield

unread,

Feb 19, 2010, 8:05:27 AM2/19/10

to

jacob navia wrote:
> Richard Heathfield a �crit :
>>
>> You can add a destructor in C, too. You just have to call it
>> explicitly, that's all.
>>
>
> Would it be useful to store a pointer to a destructor function in the
> container?

Yes. Well, that's what I do - the pointer is supplied to the constructor.

> Till now I haven't see any situation where that would be necessary.

Well, you have to tell the container's destructor how to destroy
contained objects at /some/ point, whether at construction time or at
destruction time. I just think it's more elegant to store the pointer,
so that the container always has full knowledge of how to tear itself
down. But it's a style issue, I think - I don't know of any compelling
arguments against either scheme.

Malcolm McLean

unread,

Feb 19, 2010, 8:26:30 AM2/19/10

to

On Feb 19, 3:05 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>
> Well, you have to tell the container's destructor how to destroy
> contained objects at /some/ point, whether at construction time or at
> destruction time. I just think it's more elegant to store the pointer,
> so that the container always has full knowledge of how to tear itself
> down. But it's a style issue, I think - I don't know of any compelling
> arguments against either scheme.
>

It's a design issue.

Consider this. We have a generic array container. Now I can write a
routine that takes an array container as an argument, and removes
every item with a prime number index. Why I'd want to do that is
anyone's guess. However for some reason the program requires this
generic function.

If the array has a destructor for its contents, then we can write the
function

int removeprimes(ARRAY *a)

if it doesn't, we must write

int removeprimes(ARRAY *a, void (*destructor)(void *obj))

so far, this is workable.
But now let's say that the array contains elements that are allocated
from a pool, using a fixed-size allocator. The destructor must be

FIXEDMEMORYPOOL *pool;

void destructor(void *obj)
{
fixedfree(pool, obj);
}

That means we can only have one pool at a time. That could well be
unacceptable.

So it's got to be

int removeprimes(ARRAY *a, void (*destructor)(void *obj, void *ptr),
void *ptr)

The caller not only has to know how to destroy the elements in array,
he also has to know which pool they were allocated from. The system
rapidly becomes unworkable, and prone to design mistakes.

If we have a strict policy of every fucntion pointer takes an object
plus an arbitrary pointer, we don't have this problem.

typedef struct
{
void (*destructor)(void *obj, void *ptr);
void *ptr;
void *data;
size_t size;
int N;
} ARRAY;

The ARRAY is messy, but
void removeprimes(ARRAY *a)

is neat and clean.
It's almost always better to have the mess low down,a nd the high-
level interfaces clean.

Chris M. Thomasson

unread,

Feb 19, 2010, 8:36:00 AM2/19/10

to

"Richard Heathfield" <r...@see.sig.invalid> wrote in message
news:qOSdnedqr5k56eDW...@bt.com...

> When designing dynamic data structures (lists, stacks, trees, etc), one of
> the many choices open to us is one that was brought up in another thread
> recently - whether to link to existing data (what we might call an "index"
> approach), or to create a copy of the data and store that within the
> structure (a "container" approach).
>

> Clearly each has its merits. I thought it might be useful to take the
> subject out of its rather heated context elsegroup, and give non-spinny
> people an opportunity to air their views on the subject, spinlessly. (If
> the spin spins, let it. It's of no consequence.)

>
> Anyway, I'll kick off with one obvious advantage of each.

[...]

http://www.boost.org/doc/libs/1_35_0/doc/html/intrusive/intrusive_vs_nontrusive.html

http://groups.google.com/group/comp.lang.c/msg/60621e9cea18d8de

http://groups.google.com/group/comp.lang.c/msg/69b5aa756509c735

ImpalerCore

unread,

Feb 19, 2010, 10:25:38 AM2/19/10

to

On Feb 19, 7:29 am, jacob navia <ja...@nospam.org> wrote:
> Richard Heathfield a écrit :

I think that it certainly can be useful, but it depends on the data
structure.

For distributed data structures like linked lists and trees, I believe
passing the destructor to the interface is simpler. You don't want to
pack a destructor function in each node, and I didn't like having a
special sentinel node that just packs the destructor (it made the
implementation feel more cumbersome). It just felt 'right' to me.

For centralized data structures, typically arrays and hash tables, all
the struct elements are tightly bound. In this case, having a
destructor function pointer (and copy constructor if deep copying is
required) can simplify the interface. The main benefit that I can see
with storing the pointer is that you have one failure point (setting
the right function pointer), versus having the user attempt to either
pass the right destructor function pointer to the interface whenever
an element is erased or destroying the array, or iterate through each
element calling the right destructor. If it's stored locally, get the
container initialization right with the right destructor function
right and it's fire and forget, which I think would simplify the
interface.

I think that along with a destructor, a copy constructor is also a
good part of the interface. It gives you the flexibility to decide
between shallow or deep copies of your pointer based container
objects. Keep in mind that these opinions are not hard-line, as I try
different things out they may change.

It's possible to have the destructor at the object level, but I found
that can get out of hand real quick. I'd have to wrap all the basic
types, and seemed like it was more trouble that it was worth. I
didn't pursue it very far though.

Best regards,
John D.

Richard Harter

unread,

Feb 20, 2010, 10:20:45 AM2/20/10

to

On Thu, 18 Feb 2010 17:11:07 +0000, Richard Heathfield
<r...@see.sig.invalid> wrote:

>When designing dynamic data structures (lists, stacks, trees, etc), one
>of the many choices open to us is one that was brought up in another
>thread recently - whether to link to existing data (what we might call
>an "index" approach), or to create a copy of the data and store that
>within the structure (a "container" approach).
>
>Clearly each has its merits. I thought it might be useful to take the
>subject out of its rather heated context elsegroup, and give non-spinny
>people an opportunity to air their views on the subject, spinlessly. (If
>the spin spins, let it. It's of no consequence.)

Here are a few thoughts.

The relevant wikipedia article uses the terms store-by-value and
store-by-reference. I opine that the important difference lies
in the purpose of the structure. When the structure contains
values we have a repository, a place where the actual object is
stored. When it contains references we have a reference
container.

An analogy is money backed by metal. Value containers store
gold; reference containers store paper money. One unit of gold
can provide backing for an absurdly large number of pieces of
paper money, and so on. (Today paper money is backed by the
belief that it is worth something. I do not know how to
translate that into programming terms.)

Much like paper money, reference containers are cheap and simple.
Value containers, on the other hand, have issues. Objects
require space; the amount of space per object may vary. In other
words there is a storage management issue associated with value
containers. This line of thinking suggests that we separate the
actual storage management from the container.

Then there is the security issue. If we can have multiple
pointers pointing to an object, the integrity of the data is
always open to inadvertent damage. Of course we can use handles
of some kind, but that can get awkward and expensive.

Another issue is that some objects, e.g., integers, occupy no
more space than pointers. In other words, a value container can
be faster and cheaper than reference containers.

And so on. Just a few thoughts.

Richard Harter, c...@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
Infinity is one of those things that keep philosophers busy when they
could be more profitably spending their time weeding their garden.

Richard Bos

unread,

Feb 21, 2010, 10:23:11 AM2/21/10

to

jacob navia <ja...@nospam.org> wrote:

> Richard Heathfield a �crit :
> >
> > You can add a destructor in C, too. You just have to call it explicitly,
> > that's all.
>
> Would it be useful to store a pointer to a destructor function in the
> container?

If you go that way, you might as well go the whole OO hog and put it in
the object itself.

Richard

Malcolm McLean

unread,

Feb 21, 2010, 11:14:04 AM2/21/10

to

On Feb 21, 5:23 pm, ralt...@xs4all.nl (Richard Bos) wrote:

> jacob navia <ja...@nospam.org> wrote:
>
> > Would it be useful to store a pointer to a destructor function in the
> > container?
>
> If you go that way, you might as well go the whole OO hog and put it in
> the object itself.
>

Then the container can't hold plain data, like raw integers. In fact,
in C, it can't hold any object not specifically designed for it,
because it has to know the offset with which to retrieve the
destructor pointer.

Branimir Maksimovic

unread,

Feb 21, 2010, 6:40:17 PM2/21/10

to

You can always provide two alternative implementations for same
container, which will be chosen by user. You can place
pointer to implementation in container struct in same OO way as in user
created data.
One implementation for raw data , second which goes through
generic interface .

Greets

wolfgang.riedel

unread,

Feb 23, 2010, 3:02:36 PM2/23/10

to

On 18 Feb, 21:53, ImpalerCore <jadil...@gmail.com> wrote:
> On Feb 18, 2:32 pm, Malcolm McLean <malcolm.mcle...@btinternet.com>
> wrote:
> > On Feb 18, 9:17 pm, ImpalerCore <jadil...@gmail.com> wrote:> On Feb 18, 12:59 pm, Malcolm McLean <malcolm.mcle...@btinternet.com>
>

>

> My version of array looks like this:
>
> struct my_array
> {
> void* buffer;
> size_t size;
> size_t alloc_size;};
>
> typedef struct my_array my_array_t;
>
> my_array_t* gmy_array_alloc( size_t sz, size_t n, my_bool zeroize );
>

This uses list as sample for a collection, trees and maps are similar
My version of node is
/* let nodeDTR, nodeCmp & nodeSrch be destructor- compare- and
searchfunctions */
struct a_node
{
node* next;
#if defined(something)
node* prev:
#endif
void* data;
size_t input_length;
size_t alloc_length;
#if defined(some other thing)
nodeDTR fn;
#endif
}
and
ListAnchor as
struct a_anch
{
node* first;
node* last:
node* current:
size_t length;
nodeDTR fn;
nodeCmp fnc;
nodeSrch fns;
}
and a simple (there are others: push, queue, insAt etc.) inserter
t_list_err lIns(void* v, size_t sz);
Alloc struct a_node;
Now, if sz > 0 -> copy v to data, set input_length & alloc_length to
sz;
else -> just store v in data, set input_length & alloc_length
to 0;
// input_length & alloc_length, because alloc may change, input not.
update linking according to list policy (my default: after
current, which may be NULL or last).

Now, Listdestruction is simple:
for each node:
if there is a node.nodeDTR, call it
if input_length > 0, free data, using either anchor.nodeDTR or
free.
free node;

So one list can handle a mixture of dynamically allocated and
reference nodes.

Of course, this can be recursive ad lib., if data itself is a struct
like
struct a_token
{
enum parseT type;
union a
{
a_anch a; // another list
char* b;
...
...
and have an inserter, that sets the appropiate node-functions.

The nice thing is, the original listdestructor does not have to be
changed at all.
Served me well, and a lot of things can be implemented on top of it
(stack, queue, priority queue, even trees (which I don't
recommend)).

To the OP: I would create one collection, which 'holds' the data and
two
others with indices, sorted and searched by nodeCmp (a bit like IMS).

Wolfgang