Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Data structure for sparse mapping of values

17 views
Skip to first unread message

bitrex

unread,
Aug 9, 2016, 11:45:06 AM8/9/16
to
I'm working on an embedded project that has to drive a multi-numeral 7
segment LED display. At the moment the logic is working pretty well; the
user can treat the display as a stream in the fashion of "cout" and
write lines of text to it, and the class will automatically handle the
scanning, scrolling (if the amount of text is larger than the width of
the display,etc.)

The class has an internal std::string buffer that's written to, and then
a "frame buffer" that holds the substring which is currently being
scanned to the physical digits. So when the display updates the code has
to look at each character in the buffer and determine the appropriate
output lines to drive on the current numeral.

If I were implementing the entire printable ASCII character set it would
be easy: just have array of values where each printable char input
mapped directly one-to-one to a bitfield. But I'm only implementing a
restricted subset of it, so if the values are all contiguous in a plain
old array to facilitate quick lookup there would be quite a few
unimplemented gaps taking up space. An array holding bitfields in say
0xFF hex format is also clunky for the user to modify.

At the moment I just have a naive implementation where I have a std::map
from a std::string to a pointer to an array of bools, stored in program
memory, it works but it's not ideal, since std::map has a pretty huge
memory footprint, and it seems goofy to store pointers in memory when
the size of the data structure you're actually interested in is smaller
than the width of a pointer type...

Right now it's something like this, and I'm hoping for something with
similar convenience but less memory footprint:

//.h

namespace DisplayConstants
{
extern const bool char_0[7] PROGMEM;
extern const bool char_1[7] PROGMEM;
extern const bool char_2[7] PROGMEM;
extern const bool char_3[7] PROGMEM;
extern const bool char_4[7] PROGMEM;
extern const bool char_5[7] PROGMEM;
extern const bool char_6[7] PROGMEM;
extern const bool char_7[7] PROGMEM;
extern const bool char_8[7] PROGMEM;
extern const bool char_9[7] PROGMEM;
extern const bool char_A[7] PROGMEM;
//etc
extern const bool char_space[7] PROGMEM;
extern const std::map<std::string, const bool*> character_map;
}

//.cpp

namespace DisplayConstants
{
static std::map<std::string, const bool*> create_map()
{
std::map<std::string, const bool*> m;

m["0"] = &char_0[0];
m["1"] = &char_1[0];
m["2"] = &char_2[0];
m["3"] = &char_3[0];
m["4"] = &char_4[0];
m["5"] = &char_5[0];
m["6"] = &char_6[0];
m["7"] = &char_7[0];
m["8"] = &char_8[0];
m["9"] = &char_9[0];
m["A"] = &char_A[0];
m["a"] = &char_A[0];
M[" "] = &char_space[0];
//etc
}

const bool char_0[7] PROGMEM = { 1, 1, 1, 1, 1, 1, 0 };
const bool char_1[7] PROGMEM = { 0, 1, 1, 0, 0, 0, 0 };
const bool char_2[7] PROGMEM = { 1, 1, 0, 1, 1, 0, 1 };
const bool char_3[7] PROGMEM = { 1, 1, 1, 1, 0, 0, 1 };
const bool char_4[7] PROGMEM = { 0, 1, 1, 0, 0, 1, 1 };
const bool char_5[7] PROGMEM = { 1, 0, 1, 1, 0, 1, 1 };
const bool char_6[7] PROGMEM = { 1, 0, 1, 1, 1, 1, 1 };
const bool char_7[7] PROGMEM = { 1, 1, 1, 0, 0, 0, 0 };
const bool char_8[7] PROGMEM = { 1, 1, 1, 1, 1, 1, 1 };
const bool char_9[7] PROGMEM = { 1, 1, 1, 1, 0, 1, 1 };
const bool char_A[7] PROGMEM = { 1, 1, 1, 0, 1, 1, 1 };
const bool char_space[7] PROGMEM = {0, 0, 0, 0, 0, 0, 0};
//etc

const std::map<std::string, const bool*> character_map = create_map();
}

Victor Bazarov

unread,
Aug 9, 2016, 12:19:54 PM8/9/16
to
On 8/9/2016 11:44 AM, bitrex wrote:
> I'm working on an embedded project that has to drive a multi-numeral 7
> segment LED display. [..]
>
> At the moment I just have a naive implementation where I have a std::map
> from a std::string to a pointer to an array of bools, stored in program
> memory, [..]

Keep in mind that saving memory will likely cost you time. If speed is
not what you're after, instead of an array of bools, store a single char
value. Seven times smaller... And if you go for a static array that
maps your symbol to a char representation, you will likely only use 128
bytes (basic character set). You can only uniquely represent 128 values
with a 7-segment display anyway, no?

V
--
I do not respond to top-posted replies, please don't ask

Ben Bacarisse

unread,
Aug 9, 2016, 12:37:25 PM8/9/16
to
bitrex <bit...@de.lete.earthlink.net> writes:

> I'm working on an embedded project that has to drive a multi-numeral 7
> segment LED display.
<snip>
> If I were implementing the entire printable ASCII character set it
> would be easy: just have array of values where each printable char
> input mapped directly one-to-one to a bitfield. But I'm only
> implementing a restricted subset of it, so if the values are all
> contiguous in a plain old array to facilitate quick lookup there would
> be quite a few unimplemented gaps taking up space. An array holding
> bitfields in say 0xFF hex format is also clunky for the user to
> modify.

I'd try an array of pairs. Not sure if using std::pair and std::vector
is worth it, but the idea is just that you search the array
(exhaustively) for the character you are looking to display. For the
segments, I'd use C++'s binary literals:

struct segment_list {
unsigned char character; unsigned char segments;
} slist[] = {
{ '0', 0b1'11'1'11'0 },
{ '1', 0b0'11'0'00'0 },
...
{ 0, 0 }
};

(I'm using the optional 's to show something about the segment groups.)

Obviously this is not 'efficient', but that is unlikely to be much of a
problem in this sort of situation.

<snip>
--
Ben.

Christian Gollwitzer

unread,
Aug 9, 2016, 12:46:43 PM8/9/16
to
Am 09.08.16 um 17:44 schrieb bitrex:
> I'm working on an embedded project that has to drive a multi-numeral 7
> segment LED display. At the moment the logic is working pretty well; the
> user can treat the display as a stream in the fashion of "cout" and
> write lines of text to it, and the class will automatically handle the
> scanning, scrolling (if the amount of text is larger than the width of
> the display,etc.)

How many different letters do you have? The old display drivers for 7
segment displays used hard-wired logic, some Boolean functions on the
input to transform 16 digits into codes for hex display. You could do
something similar. Quine McCluskey is an algorithm to compute a minimal
expression for a single-valued Boolean. Not sure if something exists for
multiple Boolean outputs, where several McCluskeys obviously are not
minimal.

Christian

Christian Gollwitzer

unread,
Aug 9, 2016, 12:50:33 PM8/9/16
to
Am 09.08.16 um 18:46 schrieb Christian Gollwitzer:
Yet another one is the perfect hash. Usually it maps strings to indices.
gperf is an utility which can do that (create code for a list of
strings). It'll not work well for single letters, but if tweaked, then
maybe.

Christian

Scott Lurndal

unread,
Aug 9, 2016, 1:09:07 PM8/9/16
to
bitrex <bit...@de.lete.earthlink.net> writes:

>
>If I were implementing the entire printable ASCII character set it would
>be easy: just have array of values where each printable char input
>mapped directly one-to-one to a bitfield. But I'm only implementing a
>restricted subset of it, so if the values are all contiguous in a plain
>old array to facilitate quick lookup there would be quite a few
>unimplemented gaps taking up space. An array holding bitfields in say
>0xFF hex format is also clunky for the user to modify.

Given that ASCII is 7-bit, you'll need an array of 128 bitfields. If
the bitfield is 8 bits, then you're consuming 128 bytes. That's almost
in the noise. "Mind the gap" shouldn't apply in this case :-)

bitrex

unread,
Aug 9, 2016, 1:33:57 PM8/9/16
to
It's probably at most around 30, the ten numerals and the upper and
lowercase letters (which for a non-alphanumeric 7 segment have to be
approximated rather crudely, some uppercase and lowercase letters map to
the same value, as for example "M" and "m" is simply like a box without
the lower segment, with a bar across the top.)

bitrex

unread,
Aug 9, 2016, 1:40:07 PM8/9/16
to
Thanks. For such a small data set, I'm guessing that simply linear or
binary searching a data structure for the required value won't be
significantly slower than using some kind of hash table like map.

bitrex

unread,
Aug 9, 2016, 1:44:24 PM8/9/16
to
On 08/09/2016 12:46 PM, Christian Gollwitzer wrote:
That's a neat idea...implement it in "virtual logic" rather than a
lookup table.


Scott Lurndal

unread,
Aug 9, 2016, 2:36:12 PM8/9/16
to
That's 62 characters, store the bitmasks in an array of bytes
indexed starting at 0x30 through 0x7a. The number of gaps will
amount to a dozen bytes or so.

if ((character >= ASCII_ZERO)
&& (character <= ASCII_z)) {
bitmask = bitmasks[character - ASCII_ZERO];
} else {
return NO_BITMAP_FOR_CHARACTER;
}

bitrex

unread,
Aug 9, 2016, 2:58:20 PM8/9/16
to
One likes to feel like one is clever, but it does seem like that's the
most straightforward solution... ; )

Marcel Mueller

unread,
Aug 9, 2016, 3:56:31 PM8/9/16
to
On 09.08.16 17.44, bitrex wrote:
> If I were implementing the entire printable ASCII character set it would
> be easy: just have array of values where each printable char input
> mapped directly one-to-one to a bitfield. But I'm only implementing a
> restricted subset of it, so if the values are all contiguous in a plain
> old array to facilitate quick lookup there would be quite a few
> unimplemented gaps taking up space. An array holding bitfields in say
> 0xFF hex format is also clunky for the user to modify.
>
> At the moment I just have a naive implementation where I have a std::map
> from a std::string to a pointer to an array of bools, stored in program
> memory, it works but it's not ideal, since std::map has a pretty huge
> memory footprint, and it seems goofy to store pointers in memory when
> the size of the data structure you're actually interested in is smaller
> than the width of a pointer type...

std::map uses red black trees. They are intended to keep iterators valid
on manipulations. Indeed they consume much memory.
=> Use a sorted vector with binary search.

But, the data type bool[] is bad too as it uses at least one byte per
entry.(Simply because common CPUs cannot handle sub byte addressing in a
way that complies with the rules for C++ pointers.)
=> use std::bitset instead. (It depends on the platform if bitset<7> can
be packed into a single byte. Some platforms require C++ objects to be
aligned at a machine size word.)

Furthermore ASCII has only 7 bits. I.e. you need at most 127 entries to
cover the entire ASCII range. So if you use a
std::array<std::bitset<7>,128> it will take 128 bytes. Really no big deal.

In contrast a sorted vector of pair<char,std::bitset<7>> will take twice
as much space when all entries are defined. So it is only efficient when
you have less than 64 symbols (likely with 7 segments). However the code
to do the binary search will probably take more than you saved.

The same applies to your initialization code since you did not use
constexpr. It will take much more memory than the resulting data
structures, including std::map.
Unfortunately std::bitset is out at this point since it does not support
reasonable constexpr list initializers. So you have to revert to
unsigned char, at least for the initialization.

Last but not least trivial solutions are sometimes quite effective:
switch (character)
{case 0:
return std::bitset<7>(0b1111110);
case 1:
return std::bitset<7>(0b0110000);
//...
}
Most compilers will create reasonably optimized code for this.

If you want to have even more compact solutions put only continuous runs
of values into arrays:

static const bitset<7> Numbers[10] =
{ 0b1111110,
0b0110000,
//...
};
static const bitset<7> Alpha[6] = // or [26]
{ 0b1110111,
//...
};

character -= '0';
if (character <= '9' - '0')
return Numbers[character];
character -= 'A' - '0';
if (character <= 'F' - 'A')
return Alpha[character];
return bitset<7>(0);


Marcel

Vir Campestris

unread,
Aug 9, 2016, 4:09:51 PM8/9/16
to
On 09/08/2016 19:58, bitrex wrote:
> One likes to feel like one is clever, but it does seem like that's the
> most straightforward solution... ; )

https://en.wikipedia.org/wiki/KISS_principle

Andy

Mr Flibble

unread,
Aug 9, 2016, 4:11:33 PM8/9/16
to
KISS is overused, obvious and not always possible: some complicated
problems require complicated solutions.

/Flibble

Christian Gollwitzer

unread,
Aug 9, 2016, 4:55:31 PM8/9/16
to
Am 09.08.16 um 20:36 schrieb Scott Lurndal:
> bitrex <bit...@de.lete.earthlink.net> writes:
>> It's probably at most around 30, the ten numerals and the upper and
>> lowercase letters (which for a non-alphanumeric 7 segment have to be
>> approximated rather crudely, some uppercase and lowercase letters map to
>> the same value, as for example "M" and "m" is simply like a box without
>> the lower segment, with a bar across the top.)
>
> That's 62 characters, store the bitmasks in an array of bytes
> indexed starting at 0x30 through 0x7a. The number of gaps will
> amount to a dozen bytes or so.
>
> if ((character >= ASCII_ZERO)
> && (character <= ASCII_z)) {
> bitmask = bitmasks[character - ASCII_ZERO];
> } else {
> return NO_BITMAP_FOR_CHARACTER;
> }

I second this. The map is definitely not sparse enough to justify a more
complex algorithm.

Christian

0 new messages