Memory efficiency of lists in TemplateDictionary?


Elliot Foster

Jul 24, 2008, 10:05:58 PM
to google-c...@googlegroups.com
As near as I can figure, the way to make a list of items is:

google::TemplateDictionary *dict, *subdict;
dict = new google::TemplateDictionary("LIST");
for (int i = 0; i < node_count; i++) {
    // One sub-dictionary is created per list element.
    subdict = dict->AddSectionDictionary("ELEMENT");
    subdict->SetValue("KEY", "VAL");
}

I got this impression from the example located at
'http://google-ctemplate.googlecode.com/svn/trunk/doc/example.html' where it
adds a dictionary for every query result.

However, this results in a fair amount of memory being used, so I'm hoping that
I'm missing something. Or am I getting it right, and this is something that
TemplateDictionary was not designed for (holding large-ish record sets)? Is
there a way to get around it?

I'm not expecting the memory usage to be super low, but I would be very happy if
I was doing something wrong.

Examples follow, source files are attached.

With 100 entries in the list, using TemplateDictionary:

$ pmap -d `pidof ctemplate_dict` | egrep 'rw|priv'
0000000000601000 4 rw--- 0000000000001000 0fe:00001 ctemplate_dict
0000000000602000 396 rw--- 0000000000602000 000:00000 [ anon ]
00007f53b147b000 8 rw--- 000000000015b000 0fe:00001 libc-2.7.so
00007f53b147d000 20 rw--- 00007f53b147d000 000:00000 [ anon ]
00007f53b168f000 4 rw--- 000000000000d000 0fe:00001 libgcc_s.so.1
00007f53b190f000 8 rw--- 000000000007f000 0fe:00001 libm-2.7.so
00007f53b1c06000 12 rw--- 00000000000f5000 0fe:00001 libstdc++.so.6.0.9
00007f53b1c09000 76 rw--- 00007f53b1c09000 000:00000 [ anon ]
00007f53b1e55000 12 rw--- 0000000000039000 0fe:00001 libctemplate_nothreads.so.0.0.0
00007f53b2058000 12 rw--- 00007f53b2058000 000:00000 [ anon ]
00007f53b2071000 16 rw--- 00007f53b2071000 000:00000 [ anon ]
00007f53b2075000 8 rw--- 000000000001d000 0fe:00001 ld-2.7.so
00007fffba061000 84 rw--- 00007ffffffea000 000:00000 [ stack ]
mapped: 14192K writeable/private: 660K shared: 0K

Not that it's all that fair, but here is a comparable test (as far as I can
tell) using Clearsilver's HDF:

$ pmap -d `pidof hdf_dict` | egrep 'rw|priv'
000000000060f000 4 rw--- 000000000000f000 0fe:00001 hdf_dict
0000000000610000 132 rw--- 0000000000610000 000:00000 [ anon ]
00007fa535b42000 8 rw--- 000000000015b000 0fe:00001 libc-2.7.so
00007fa535b44000 20 rw--- 00007fa535b44000 000:00000 [ anon ]
00007fa535d56000 4 rw--- 000000000000d000 0fe:00001 libgcc_s.so.1
00007fa535fd6000 8 rw--- 000000000007f000 0fe:00001 libm-2.7.so
00007fa5362cd000 12 rw--- 00000000000f5000 0fe:00001 libstdc++.so.6.0.9
00007fa5362d0000 76 rw--- 00007fa5362d0000 000:00000 [ anon ]
00007fa5364e3000 12 rw--- 00007fa5364e3000 000:00000 [ anon ]
00007fa5364fc000 16 rw--- 00007fa5364fc000 000:00000 [ anon ]
00007fa536500000 8 rw--- 000000000001d000 0fe:00001 ld-2.7.so
00007fff3e4ec000 84 rw--- 00007ffffffea000 000:00000 [ stack ]
mapped: 11696K writeable/private: 384K shared: 0K

The gap widens with 1k nodes:

TemplateDictionary:

$ pmap -d `pidof ctemplate_dict` | egrep 'rw|priv'
0000000000601000 4 rw--- 0000000000001000 0fe:00001 ctemplate_dict
0000000000602000 3348 rw--- 0000000000602000 000:00000 [ anon ]
00007f98a2c5d000 8 rw--- 000000000015b000 0fe:00001 libc-2.7.so
00007f98a2c5f000 20 rw--- 00007f98a2c5f000 000:00000 [ anon ]
00007f98a2e71000 4 rw--- 000000000000d000 0fe:00001 libgcc_s.so.1
00007f98a30f1000 8 rw--- 000000000007f000 0fe:00001 libm-2.7.so
00007f98a33e8000 12 rw--- 00000000000f5000 0fe:00001 libstdc++.so.6.0.9
00007f98a33eb000 76 rw--- 00007f98a33eb000 000:00000 [ anon ]
00007f98a3637000 12 rw--- 0000000000039000 0fe:00001 libctemplate_nothreads.so.0.0.0
00007f98a383a000 12 rw--- 00007f98a383a000 000:00000 [ anon ]
00007f98a3853000 16 rw--- 00007f98a3853000 000:00000 [ anon ]
00007f98a3857000 8 rw--- 000000000001d000 0fe:00001 ld-2.7.so
00007fffab844000 84 rw--- 00007ffffffea000 000:00000 [ stack ]
mapped: 17144K writeable/private: 3612K shared: 0K

Clearsilver HDF:

$ pmap -d `pidof hdf_dict` | egrep 'rw|priv'
000000000060f000 4 rw--- 000000000000f000 0fe:00001 hdf_dict
0000000000610000 528 rw--- 0000000000610000 000:00000 [ anon ]
00007fdaa1da1000 8 rw--- 000000000015b000 0fe:00001 libc-2.7.so
00007fdaa1da3000 20 rw--- 00007fdaa1da3000 000:00000 [ anon ]
00007fdaa1fb5000 4 rw--- 000000000000d000 0fe:00001 libgcc_s.so.1
00007fdaa2235000 8 rw--- 000000000007f000 0fe:00001 libm-2.7.so
00007fdaa252c000 12 rw--- 00000000000f5000 0fe:00001 libstdc++.so.6.0.9
00007fdaa252f000 76 rw--- 00007fdaa252f000 000:00000 [ anon ]
00007fdaa2742000 12 rw--- 00007fdaa2742000 000:00000 [ anon ]
00007fdaa275b000 16 rw--- 00007fdaa275b000 000:00000 [ anon ]
00007fdaa275f000 8 rw--- 000000000001d000 0fe:00001 ld-2.7.so
00007fffaa74b000 84 rw--- 00007ffffffea000 000:00000 [ stack ]
mapped: 12092K writeable/private: 780K shared: 0K
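
As a rough reading of the numbers above, the marginal cost per list entry can be estimated from the growth of the large [ anon ] heap mapping between the 100-entry and 1k-entry runs. The sketch below only redoes that arithmetic; the struct and function names are made up for illustration, and it assumes the heap mapping is where the entries live.

```cpp
#include <cassert>

// Heap ([ anon ]) mapping sizes in KB, copied from the pmap output above.
struct HeapSample {
  double kb_at_100;   // heap size with 100 list entries
  double kb_at_1000;  // heap size with 1000 list entries
};

// Marginal cost of one list entry in KB: heap growth divided by the
// 900 entries added between the two runs.
double PerEntryKb(const HeapSample& s) {
  return (s.kb_at_1000 - s.kb_at_100) / (1000.0 - 100.0);
}

const HeapSample kTemplateDictionary = {396.0, 3348.0};
const HeapSample kClearsilverHdf = {132.0, 528.0};
```

With these inputs, PerEntryKb gives about 3.28 KB per entry for TemplateDictionary and about 0.44 KB per entry for HDF, a ratio of roughly 7.5.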

ctemplate_dict.cc
hdf_dict.cc

Craig Silverstein

Jul 25, 2008, 12:21:18 AM
to google-c...@googlegroups.com
} As near as I can figure, the way to make a list of items is:

Yes, that's right.

} I'm not expecting the memory usage to be super low, but I would be
} very happy if I was doing something wrong.

No, there's nothing you can do to get around it, that I know of. The
"right" solution is to make a TemplateDictionary instance use less
memory than it does now.

There has been some talk of changing the TemplateDictionary
constructor so that variable_dict_, section_dict_, and include_dict_
are allocated lazily, rather than always being allocated in the
constructor. My guess -- untested -- is most of the space in a
TemplateDictionary is going to these hashtables, and lazy construction
could save about 50% of the space used. The cost is probably low in
terms of running time. If you'd like to play around with something
like that, feel free! It would be nice to get both timing effects and
space effects of such a change.
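
To make the lazy-allocation idea concrete, here is a minimal sketch of the pattern. It is not the actual TemplateDictionary code: LazyDict, VariableMap, and HasVariableTable are hypothetical names, and std::unordered_map merely stands in for whatever hashtable the library uses. The point is only that the table is created on first write, so a dictionary that never sets a variable never pays for one.

```cpp
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a dictionary node; NOT the real TemplateDictionary.
class LazyDict {
 public:
  // Allocates the variable table on the first write only.
  void SetValue(const std::string& key, const std::string& value) {
    if (!variables_) {
      variables_.reset(new VariableMap);
    }
    (*variables_)[key] = value;
  }

  // Returns nullptr if the key was never set; never allocates the table.
  const std::string* GetValue(const std::string& key) const {
    if (!variables_) return nullptr;
    VariableMap::const_iterator it = variables_->find(key);
    return it == variables_->end() ? nullptr : &it->second;
  }

  // True once the table exists; useful for measuring how often it is needed.
  bool HasVariableTable() const { return variables_ != nullptr; }

 private:
  typedef std::unordered_map<std::string, std::string> VariableMap;
  std::unique_ptr<VariableMap> variables_;  // null until the first SetValue
  // Section and include tables would follow the same pattern.
};
```

The running-time cost is one null check per access, which fits the guess that the overhead is low, though that would still need measuring.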

Alternately, feel free to file a feature request on code.google.com,
and I'll try to take a look at it when I have a free moment.

craig

Elliot Foster

Jul 25, 2008, 1:56:27 AM
to google-c...@googlegroups.com
Craig Silverstein wrote:
> No, there's nothing you can do to get around it, that I know of. The
> "right" solution is to make a TemplateDictionary instance use less
> memory than it does now.
>
> There has been some talk of changing the TemplateDictionary
> constructor so that variable_dict_, section_dict_, and include_dict_
> are allocated lazily, rather than always being allocated in the
> constructor. My guess -- untested -- is most of the space in a
> TemplateDictionary is going to these hashtables, and lazy construction
> could save about 50% of the space used. The cost is probably low in
> terms of running time. If you'd like to play around with something
> like that, feel free! It would be nice to get both timing effects and
> space effects of such a change.
>
> Alternately, feel free to file a feature request on code.google.com,
> and I'll try to take a look at it when I have a free moment.

I went ahead and created a ticket for it:

http://code.google.com/p/google-ctemplate/issues/detail?id=17

I didn't see any way to mark it as a feature request, though. Sorry about that.

Right now, this concerns me more than a few unfree'd global variables, as the
idea of 100 items in a template stash eating up 396k by themselves is a
frightening thought. I'm writing the app to be embedded with a small-to-medium
memory footprint, so ~300k is a fair amount to dedicate to one stash for one
request. What's surprising is that TemplateDictionary is almost an order of
magnitude larger than HDF.

Hopefully there's some good low-hanging fruit to take the edge off of the memory
footprint. It would be wonderful to get TemplateDictionary within striking
distance of HDF.

Elliot

Craig Silverstein

Jul 28, 2008, 5:46:24 PM
to google-c...@googlegroups.com
As we get closer to the ctemplate 1.0 release, the API is getting more
and more settled. The next release I make will add a few new features
(hopefully the last ones before 1.0!), but the release after that I'd
like to be a deprecation release, where I remove all API calls that
were a good idea at one time, but are now better done some other way.
That way the 1.0 release will be as clean as possible.

Before I do that, I'd like to get feedback from folks as to how
invasive such a change would be. This way I can try to make the
deprecation as painless as possible.

1) I'm currently in final-testing of a new method,
Template::RegisterStringAsTemplate, which replaces
template_from_string.{h,cc}. The new method is both easier to use
and more flexible. It's mostly the same interface as
TemplateFromString, but users of TemplateFromString would need to
make some small changes to their code to move to
RegisterStringAsTemplate. Does anyone foresee big problems for them
if that happens?

2) I'd like to fix a nit in the current design, where
TemplateDictionary::SetModifierData() and
TemplateDictionary::SetAnnotateOutput() are set on a single
dictionary, but affect sub-dictionaries (due to {{>INCLUDE}}) as
well. My plan is to introduce a new class, PerExpandData, which
will hold modifier data and annotation output. You'll then pass
this class to Template::Expand().

This change will require a slight rejiggering of the API. In
particular, new template modifiers you write will now take a
PerExpandData* rather than a ModifierData* -- ModifierData will be
subsumed into this new class -- and if you want annotated template
output, you'll have to use this new class to get it. (Annotated
output is pretty obscure, so I don't think most folks are using
it.)

Any problems if we rejigger the API in this way?

3) template_dictionary.h has a section called "DEPRECATED ESCAPING
FUNCTIONALITY" -- stuff that is now better done with template
modifiers like :html_escape. I'm thinking of getting rid of these
functions entirely -- SetEscapedValue, SetEscapedFormattedValue,
SetEscapedValueAndShowSection -- along with the pre-defined escaped
objects we define, like TemplateDictionary::url_query_escape.

It's still possible to escape in code; you can just call the
relevant escape methods yourself. What you would lose in this
change, is the convenience of having one function call doing the
escaping and setting for you; now it'll be (at least) two steps for
you. It will also be a tad bit slower, since you'll need to create
your own storage for the escaped value, before you add it to the
template dictionary.

Overall, I'm excited to make this change since it will encourage
people to do escaping within the template, where it's safer (and
also, where it's easier to convert to auto-escaping, which is safer
still). Anyone use these methods/instances a lot, who will have
trouble if they go away?
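
For item 3, the two-step "escape, then set" flow might look roughly like the sketch below. Everything in it is a stand-in: HtmlEscape is a hypothetical helper (not ctemplate's own escaper), FakeDict is a plain std::map used in place of a TemplateDictionary, and SetEscaped is an illustrative wrapper. It only shows where the extra caller-owned storage for the escaped value comes from.

```cpp
#include <map>
#include <string>

// Hypothetical HTML escaper for illustration; not ctemplate's implementation.
std::string HtmlEscape(const std::string& in) {
  std::string out;
  out.reserve(in.size());
  for (char c : in) {
    switch (c) {
      case '&':  out += "&amp;";  break;
      case '<':  out += "&lt;";   break;
      case '>':  out += "&gt;";   break;
      case '"':  out += "&quot;"; break;
      case '\'': out += "&#39;";  break;
      default:   out += c;
    }
  }
  return out;
}

// Stand-in for a template dictionary: a plain key/value map.
typedef std::map<std::string, std::string> FakeDict;

void SetEscaped(FakeDict* dict, const std::string& key,
                const std::string& raw) {
  const std::string escaped = HtmlEscape(raw);  // step 1: caller-owned copy
  (*dict)[key] = escaped;                       // step 2: set the value
}
```

The temporary string in step 1 is the "own storage" mentioned above; escaping inside the template with a modifier such as :html_escape avoids it.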

Sorry for the verbiage above, and if you get this far, thanks for
reading! Please feel free to send me any feedback, or better yet,
post it to this list.

craig
