Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Weird use of maps.

72 views
Skip to first unread message

James Kuyper

unread,
Aug 28, 2019, 9:30:51 AM8/28/19
to
I had to fix a bug recently in poorly documented code written by someone
else, who was not available for consultation. This code is written in
C++, but uses Qt, which I'm not as familiar with as I'd like. However, I
believe that QMap, QString, and QVariant do not differ from std::map,
std::string, and std::variant in ways that are significant to the
following explanation - please correct me if I'm wrong about this. I
can't show the actual code (it's proprietary), but if you find the
following description confusing, I could post equivalent code.

The context is a message string that gets broken up into multiple parts
for transmission. Each part is transmitted along with a message ID, a
part number, and the total number of parts. The parts were not
guaranteed to arrive at their destination in any particular order. Parts
of multiple different messages might be intermingled with each other.
The receiving code stored, for each a message, a map containing the
parts of the message received so far, to be recombined after all parts
have been received. Here's where it got decidedly odd: the code uses a
QMap<QString, QVariant), using the text of each message part as a key,
and passing the part number as the corresponding value. When all parts
had arrived, it would then sort the map by it's values, and then extract
the corresponding keys to reconstruct the original message.

The bug was that if one part of a message has the same exact contents as
a previously processed part, then the new part number would replace the
existing part number, rather than adding a new entry to the map. As a
result, the number of parts collected would never match the number
originally sent, so the message could never be completed, and eventually
it got discarded. In practice, this situation isn't likely to come up; I
found the bug during testing only because I was testing to confirm
limits on the total message size, so I used a test message containing
many repetitions of "1234567890", in order to more easily count how many
characters I was sending. As soon as I tested with a message that needed
more than 10 parts, a duplicate was guaranteed.

The solution was trivial: use a QMap<int,QString> rather than
QMap<QString,QVariant>, using the part number as the key. There was
absolutely no need in this code to generalize by using QVariant - it was
used solely to store int values. This rearrangement had the added
advantage of avoiding the slow sort step, because QMap already allows
fast extraction of values in key order.

The only explanation I could find for this peculiar design was a short
comment: "I'd prefer using QMap<int,QString>, which would be faster -
but then it wouldn't be a QVariantMap". The string "QVariantMap" occurs
nowhere else anywhere in the relevant code base, and there's nothing
about it at <https://doc.qt.io/>. I have no idea what he was thinking
about when he wrote that comment. Any suggestions?

At a higher level in the same code, all of the messages currently being
received are stored in a QMap<QString, QVariant>, where the key is the
message ID, and value is the QMap containing the corresponding part map.
That code does something that also strikes me as inefficient and
strange, but at least it's not actually buggy:

It first searches for the messageID in the message map. If it is not
found, it creates a new part map, inserts the current part into that
map, and then inserts that part map into the message map. That code
makes sense to me.
However, if it does find the messageID in the message map, it copies the
corresponding part map into a local variable, inserts the new part into
that map, and then copies the part map back into the message map,
replacing the old part map for that same message ID. I find this quite
strange: it wastes time and memory creating a copy of the part map, it
wastes time destroying the old part map when the new part map is copied
in it's place, and finally wastes time when the new part map gets destroyed.
It seems more sensible to me to insert the new part directly into the
part map while it is stored inside the message map. In principle, a
compiler might optimize the actual code accordingly, but I wouldn't
recommend relying on that. Can anyone suggest any good reason why
someone might use the more elaborate approach that was actually chosen?

Note that the code does not do anything else with either the message map
or the local copy of the part map. While the code as a whole is
multi-threaded, everything I described happens within a single thread.
The code contains none of the protections it would have to have if the
message map were shared between threads.

Jens Kallup

unread,
Aug 28, 2019, 10:22:41 AM8/28/19
to
Hello,

QMap, and other template class "extend" the
corespond stdC++ std::map<k,v> class.

So with QMap<QString,int> int_map, can store
data by "key" as QString (std::string) with value
"int":

-> int_map["foo"] = 4;
-> int_map["bar"] = 2;

under stdC++, you can use "emplace".

If you look deeper to Qt, Qt give you more compact
coding.
The fact, that QString based from std::string ,
you can convert std::string to QString to std::string,
there are members like:

QString s1;
std::string s2 = s1.toStdString();
char buffer[] = s2.c_str();

in Qt, you have cheaper worries about converting the
text encoding (ASCII -> UTF-8 <- ASCII)

If you would like transmit some data aka Client/Server
software, you should look to the RFC's descripted
internet protocolls.

So you can define your own "static" data Header like:

ID | 4 Bytes (as int)
CMD | 4 Bytes (as char[4] for "send", or "quit"...
BODY <n> | n Bytes, where n is the number of bytes that
| comes after sending the header.

It is easy to create a QtSocket stream, to send data.
And Qt comes with a SSL layer, so you don't have to code
abstrakt layer with the original ssl .so/.dll.

And, you could create your own/self signed certificates
for your SSL stream.

The only thing, that may be new to Qt beginner's are the
Slots, and Signal handling methods for various events
that can be occur/fired by the programmer/user.

Hope This Helps

Jens

Öö Tiib

unread,
Aug 28, 2019, 12:01:08 PM8/28/19
to
On Wednesday, 28 August 2019 16:30:51 UTC+3, James Kuyper wrote:

> The context is a message string that gets broken up into multiple parts
> for transmission. Each part is transmitted along with a message ID, a
> part number, and the total number of parts.

How can receiver realize that all parts have been received?
Is the message part count known compile time and part numbers
sequential and zero-based?
Then std::array<std::optional<std::string>,Count> might be
simplest.

> Can anyone suggest any good reason why
> someone might use the more elaborate approach that was actually chosen?

It can be anything. For example initially ambitious requirements
did change into simpler and so it was cut out from something
way more overcomplicated. The file change history from repo
(and commit comments) may illuminate it a bit (if such exist).
Results of developers with different skill levels differ by several
orders of magnitude, but no one notices when that inefficiency
is in some non-critical part of program.

Mike Terry

unread,
Aug 28, 2019, 2:19:31 PM8/28/19
to
On 28/08/2019 14:30, James Kuyper wrote:
> I had to fix a bug recently in poorly documented code written by someone
> else, who was not available for consultation. This code is written in
> C++, but uses Qt, which I'm not as familiar with as I'd like. However, I
> believe that QMap, QString, and QVariant do not differ from std::map,
> std::string, and std::variant in ways that are significant to the
> following explanation - please correct me if I'm wrong about this. I
> can't show the actual code (it's proprietary), but if you find the
> following description confusing, I could post equivalent code.
>
> The context is a message string that gets broken up into multiple parts
> for transmission. Each part is transmitted along with a message ID, a
> part number, and the total number of parts. The parts were not
> guaranteed to arrive at their destination in any particular order. Parts
> of multiple different messages might be intermingled with each other.
> The receiving code stored, for each a message, a map containing the
> parts of the message received so far, to be recombined after all parts
> have been received. Here's where it got decidedly odd: the code uses a
> QMap<QString, QVariant), using the text of each message part as a key,
> and passing the part number as the corresponding value. When all parts
> had arrived, it would then sort the map by it's values, and then extract
> the corresponding keys to reconstruct the original message.

That sounds so strange, that I wonder if there's a simple explanation:
the original coder just confused the ordering of the two map parameters?
Perhaps the intention was to have a map by part number. If multiple
part numbers were encountered they would be deemed retransmissions and
it would be OK to just keep the latest one. If duplicate message part
texts are rare, the original (incorrect) code would seem to be working,
and after that there could have been a reluctance to change it, and any
issues coming up would just be worked around. (Like, "hey, seems I need
to sort by values to get things in the right order?". Yes, seems a bit
crazy, but maybe the coder was new. I've seen equally crazy code from
new coders, or coders forced out of their normal coding environment!)
This sounds again like a new coder - perhaps the coder didn't realise it
was possible to retrieve a /reference/ to the part map? Maybe the coder
was searching for how to retrieve a value from a map, and the first
function encountered was a "retrieve by value" function. Under time
pressure perhaps the thought was, "ok I can make it work using this...".
Or perhaps the coder saw how to retrieve a reference, but it was a
reference to a QVariant, not a part map, and the coder was stuck on
this. (Put another way, maybe the part map in message map either was,
or seemed (to the coder) to effectively be a const entry that could only
be updated through replacement, not modification...)

Mike.

Stefan Große Pawig

unread,
Aug 28, 2019, 2:35:14 PM8/28/19
to
James Kuyper <james...@alumni.caltech.edu> writes:
> [...]
> The only explanation I could find for this peculiar design was a short
> comment: "I'd prefer using QMap<int,QString>, which would be faster -
> but then it wouldn't be a QVariantMap". The string "QVariantMap" occurs
> nowhere else anywhere in the relevant code base, and there's nothing
> about it at <https://doc.qt.io/>. I have no idea what he was thinking
> about when he wrote that comment. Any suggestions?

QVariant has a large set of toXXX() methods to get at the contained
type, including possible conversions. Among these is toMap(), which
returns a QMap<QString, QVariant>.

Is that map of yours stored in another QVariant? If so (now or sometime
back in the past), that could explain the requirement to use a
QMap<QString, QVariant>.

-Stefan

Jorgen Grahn

unread,
Aug 28, 2019, 3:33:28 PM8/28/19
to
On Wed, 2019-08-28, James Kuyper wrote:
> I had to fix a bug recently in poorly documented code written by someone
> else, who was not available for consultation. This code is written in
> C++, but uses Qt, which I'm not as familiar with as I'd like. However, I
> believe that QMap, QString, and QVariant do not differ from std::map,
> std::string, and std::variant in ways that are significant to the
> following explanation - please correct me if I'm wrong about this. I
> can't show the actual code (it's proprietary), but if you find the
> following description confusing, I could post equivalent code.
>
> The context is a message string that gets broken up into multiple parts
> for transmission. Each part is transmitted along with a message ID, a
> part number, and the total number of parts. The parts were not
> guaranteed to arrive at their destination in any particular order. Parts
> of multiple different messages might be intermingled with each other.
> The receiving code stored, for each a message, a map containing the
> parts of the message received so far, to be recombined after all parts
> have been received. Here's where it got decidedly odd: the code uses a
> QMap<QString, QVariant), using the text of each message part as a key,
> and passing the part number as the corresponding value. When all parts
> had arrived, it would then sort the map by it's values, and then extract
> the corresponding keys to reconstruct the original message.
>
> The bug was that if one part of a message has the same exact contents as
> a previously processed part, then the new part number would replace the
> existing part number, rather than adding a new entry to the map.

That's such an obvious flaw that I'd not trust the rest of that
programmers choices to make sense. It's simply not a sane way to do
the trivial task of message reassembly.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Chris M. Thomasson

unread,
Aug 28, 2019, 5:54:22 PM8/28/19
to
On 8/28/2019 9:00 AM, Öö Tiib wrote:
> On Wednesday, 28 August 2019 16:30:51 UTC+3, James Kuyper wrote:
>
>> The context is a message string that gets broken up into multiple parts
>> for transmission. Each part is transmitted along with a message ID, a
>> part number, and the total number of parts.
>
> How can receiver realize that all parts have been received?


Think of sending a file using packets. Each one would have an offset
into the file, where it physically belongs. So, the sender would send
the file size to the receiver, get an ack to make sure it got it, retry
if not. Once this is completed, the receiver would create a memory
mapped file of the same size it received. The sender blasts packets each
containing a file offset and payload. The receiver gets a packet, and
writes its payload to the offset in the memory mapped file. The receiver
can ask for missing offsets for any dropped, missing packets. Duplicate
packets will simply overwrite existing offsets.

Here is the idea, I posted as SenderX:

https://groups.google.com/forum/#!original/alt.winsock.programming/-QXuP9174Gg/Qt1LB0tPt-8J

Afaict, it kind of sounds like Zmodem.

James Kuyper

unread,
Aug 28, 2019, 8:46:57 PM8/28/19
to
On 8/28/19 10:22 AM, Jens Kallup wrote:
> Hello,
>
> QMap, and other template class "extend" the
> corespond stdC++ std::map<k,v> class.

Yes, I know that much about Qt. My point was that, as far as I can tell,
the differences between the Qt classes and the standard classes are not
relevant to the issues I was raising. If I'm wrong about that, please
let me know.

James Kuyper

unread,
Aug 28, 2019, 8:58:40 PM8/28/19
to
On 8/28/19 12:00 PM, Öö Tiib wrote:
> On Wednesday, 28 August 2019 16:30:51 UTC+3, James Kuyper wrote:
>
>> The context is a message string that gets broken up into multiple parts
>> for transmission. Each part is transmitted along with a message ID, a
>> part number, and the total number of parts.
>
> How can receiver realize that all parts have been received?

All parts have been received as soon as the number of different parts
that have been collected for a given message ID matches the part count.

> Is the message part count known compile time

No, it varies with the length of the message provided by the user.

> ... and part numbers
> sequential and zero-based?

The part numbers are sequential in the order that the parts should occur
in. They are sent out sequentially, but in general, they don't arrive
sequentially. They are 1-based, not 0-based.

...
>> Can anyone suggest any good reason why
>> someone might use the more elaborate approach that was actually chosen?
>
> It can be anything. For example initially ambitious requirements
> did change into simpler and so it was cut out from something
> way more overcomplicated. The file change history from repo
> (and commit comments) may illuminate it a bit (if such exist).

The entire block of code containing all of the features I described was
added in 2015, and hasn't been changed since. Prior to that, multi-part
messages weren't supported. The commit message didn't provide any
explanation.

I was wrong about the person responsible being unavailable - but I'm
having trouble figuring out how to word an inquiry that doesn't make it
sound like I think he wrote something stupid (since, in fact, that is
what I'm thinking).

> Results of developers with different skill levels differ by several
> orders of magnitude, but no one notices when that inefficiency
> is in some non-critical part of program.

Agreed - the messages are provided by users, and therefore don't arrive
frequently enough for the inefficiency of this approach to be a
significant problem. The inefficiency bothers me, but my boss wouldn't
have let me work on something this low in priority if I weren't
currently stuck between a project I've finished and a project I'm not
allowed to start work on (we're waiting for approval from the client).

James Kuyper

unread,
Aug 28, 2019, 9:05:30 PM8/28/19
to
On 8/28/19 2:19 PM, Mike Terry wrote:
> On 28/08/2019 14:30, James Kuyper wrote:
...
>> The context is a message string that gets broken up into multiple parts
>> for transmission. Each part is transmitted along with a message ID, a
>> part number, and the total number of parts. The parts were not
>> guaranteed to arrive at their destination in any particular order. Parts
>> of multiple different messages might be intermingled with each other.
>> The receiving code stored, for each a message, a map containing the
>> parts of the message received so far, to be recombined after all parts
>> have been received. Here's where it got decidedly odd: the code uses a
>> QMap<QString, QVariant), using the text of each message part as a key,
>> and passing the part number as the corresponding value. When all parts
>> had arrived, it would then sort the map by it's values, and then extract
>> the corresponding keys to reconstruct the original message.
>
> That sounds so strange, that I wonder if there's a simple explanation:
> the original coder just confused the ordering of the two map parameters?

No, that explanation doesn't work. The code includes a comment
indicating that he would have preferred to use QMap<int, QString>, but
failed to clearly explain why he felt that he couldn't use it.


> Perhaps the intention was to have a map by part number. If multiple
> part numbers were encountered they would be deemed retransmissions and
> it would be OK to just keep the latest one. ...

I was rather confused by the fact that every part arrives at least three
times - I suspect that's a bug, but I haven't tracked it down yet.

...
> issues coming up would just be worked around. (Like, "hey, seems I need
> to sort by values to get things in the right order?". Yes, seems a bit
> crazy, but maybe the coder was new.

That's what I suspect.

...
>> However, if it does find the messageID in the message map, it copies the
>> corresponding part map into a local variable, inserts the new part into
>> that map, and then copies the part map back into the message map,
>> replacing the old part map for that same message ID. I find this quite
>> strange: it wastes time and memory creating a copy of the part map, it
>> wastes time destroying the old part map when the new part map is copied
>> in it's place, and finally wastes time when the new part map gets destroyed.
>> It seems more sensible to me to insert the new part directly into the
>> part map while it is stored inside the message map. In principle, a
>> compiler might optimize the actual code accordingly, but I wouldn't
>> recommend relying on that. Can anyone suggest any good reason why
>> someone might use the more elaborate approach that was actually chosen?
>
> This sounds again like a new coder - perhaps the coder didn't realise it
> was possible to retrieve a /reference/ to the part map? Maybe the coder
> was searching for how to retrieve a value from a map, and the first
> function encountered was a "retrieve by value" function. Under time
> pressure perhaps the thought was, "ok I can make it work using this...".
> Or perhaps the coder saw how to retrieve a reference, but it was a
> reference to a QVariant, not a part map, and the coder was stuck on
> this. (Put another way, maybe the part map in message map either was,
> or seemed (to the coder) to effectively be a const entry that could only
> be updated through replacement, not modification...)

Again, that's pretty much the confusion that I suspect happened.

James Kuyper

unread,
Aug 28, 2019, 9:08:35 PM8/28/19
to
On 8/28/19 2:22 PM, Stefan Große Pawig wrote:
> James Kuyper <james...@alumni.caltech.edu> writes:
>> [...]
>> The only explanation I could find for this peculiar design was a short
>> comment: "I'd prefer using QMap<int,QString>, which would be faster -
>> but then it wouldn't be a QVariantMap". The string "QVariantMap" occurs
>> nowhere else anywhere in the relevant code base, and there's nothing
>> about it at <https://doc.qt.io/>. I have no idea what he was thinking
>> about when he wrote that comment. Any suggestions?
>
> QVariant has a large set of toXXX() methods to get at the contained
> type, including possible conversions. Among these is toMap(), which
> returns a QMap<QString, QVariant>.
>
> Is that map of yours stored in another QVariant?

No. The original message map was declared as QMap<QString, QVariant>,
and the QVariant values were filled in with part maps of the type
QMap<QString, int>. I successfully changed the message map to
QMap<QString, QMap<int, QString> > without breaking anything, and that's
because it doesn't interact with any other code that expected QVariants.

James Kuyper

unread,
Aug 28, 2019, 9:11:07 PM8/28/19
to
On 8/28/19 3:33 PM, Jorgen Grahn wrote:
> On Wed, 2019-08-28, James Kuyper wrote:
...
>> The bug was that if one part of a message has the same exact contents as
>> a previously processed part, then the new part number would replace the
>> existing part number, rather than adding a new entry to the map.
>
> That's such an obvious flaw that I'd not trust the rest of that
> programmers choices to make sense. It's simply not a sane way to do
> the trivial task of message reassembly.

I agree.

James Kuyper

unread,
Aug 28, 2019, 9:16:38 PM8/28/19
to
On 8/28/19 9:08 PM, James Kuyper wrote:
> No. The original message map was declared as QMap<QString, QVariant>,
> and the QVariant values were filled in with part maps of the type
> QMap<QString, int>. I successfully changed the message map to
Correction: QMap<QString, QVariant>.

David Brown

unread,
Aug 29, 2019, 4:25:06 AM8/29/19
to
On 29/08/2019 02:58, James Kuyper wrote:
> On 8/28/19 12:00 PM, Öö Tiib wrote:
>> On Wednesday, 28 August 2019 16:30:51 UTC+3, James Kuyper wrote:
>>
>>> The context is a message string that gets broken up into multiple parts
>>> for transmission. Each part is transmitted along with a message ID, a
>>> part number, and the total number of parts.
>>
>> How can receiver realize that all parts have been received?
>
> All parts have been received as soon as the number of different parts
> that have been collected for a given message ID matches the part count.
>
>> Is the message part count known compile time
>
> No, it varies with the length of the message provided by the user.
>
>> ... and part numbers
>> sequential and zero-based?
>
> The part numbers are sequential in the order that the parts should occur
> in. They are sent out sequentially, but in general, they don't arrive
> sequentially. They are 1-based, not 0-based.

I would expect that while the parts /could/ arrive out of order, they
will come mostly roughly in-order.

Since you know how many parts you will get (once you have received one
part), and you know the part numbers are all 1 .. n, it seems strange to
me that you have a map at all. I'd expect to use a vector,
pre-allocated to the right size when the first part of a new message
arrives. Perhaps you'd want the vector elements to be optional<string>,
or just have empty strings for parts that haven't yet arrived. You
might also want to track a counter of "parts waiting" so that you can
easily see that the last part has arrived, without running through the
vector.

>
> ...
>>> Can anyone suggest any good reason why
>>> someone might use the more elaborate approach that was actually chosen?
>>
>> It can be anything. For example initially ambitious requirements
>> did change into simpler and so it was cut out from something
>> way more overcomplicated. The file change history from repo
>> (and commit comments) may illuminate it a bit (if such exist).
>
> The entire block of code containing all of the features I described was
> added in 2015, and hasn't been changed since. Prior to that, multi-part
> messages weren't supported. The commit message didn't provide any
> explanation.
>
> I was wrong about the person responsible being unavailable - but I'm
> having trouble figuring out how to word an inquiry that doesn't make it
> sound like I think he wrote something stupid (since, in fact, that is
> what I'm thinking).
>

There are other possibilities for this odd data structure - other than
stupidity or inexperience. It could be that the requirements or planned
handling system changed underway - maybe the parts had a different
identification system originally rather than simple integers. It could
be that the code was prototyped in a more dynamic language, such as
Python, or at least one where "variant" types are more common, such as
Visual Basic, and the code then got translated into C++. And of course
it could have been written under inhumane deadlines in the small hours
of the night while overdosed on coffee and cold pizza - a lot of odd
code has been written in such circumstances.


>> Results of developers with different skill levels differ by several
>> orders of magnitude, but no one notices when that inefficiency
>> is in some non-critical part of program.
>
> Agreed - the messages are provided by users, and therefore don't arrive
> frequently enough for the inefficiency of this approach to be a
> significant problem. The inefficiency bothers me, but my boss wouldn't
> have let me work on something this low in priority if I weren't
> currently stuck between a project I've finished and a project I'm not
> allowed to start work on (we're waiting for approval from the client).
>

Inefficiency is often not a huge problem - incorrectness is.

James Kuyper

unread,
Aug 29, 2019, 10:06:11 AM8/29/19
to
On 8/29/19 4:24 AM, David Brown wrote:
> On 29/08/2019 02:58, James Kuyper wrote:
...
>> The part numbers are sequential in the order that the parts should occur
>> in. They are sent out sequentially, but in general, they don't arrive
>> sequentially. They are 1-based, not 0-based.
>
> I would expect that while the parts /could/ arrive out of order, they
> will come mostly roughly in-order.

That was my expectation, too. Reality didn't cooperate. During
debugging, I never saw the parts arrive in strict sequential order, and
they often arrived completely jumbled. Sometimes, for small messages,
the final part would be the first one to arrive.

That triggered another bug, but I've simplified my explanation of the
code too much to explain why without adding a lot more details. Suffice
it to say that the author apparently expected the final part to always
be the one that arrived last - despite the fact that, if the parts were
guaranteed to arrive in order, he could have simply appended each part
to a string containing the parts that had already been received. That
bug didn't puzzle me as much as the other issues I've mentioned, which
is why I didn't mention it.

> Since you know how many parts you will get (once you have received one
> part), and you know the part numbers are all 1 .. n, it seems strange to
> me that you have a map at all. I'd expect to use a vector,
> pre-allocated to the right size when the first part of a new message
> arrives. Perhaps you'd want the vector elements to be optional<string>,
> or just have empty strings for parts that haven't yet arrived. You
> might also want to track a counter of "parts waiting" so that you can
> easily see that the last part has arrived, without running through the
> vector.

I thought about the vector approach, but with the map approach, the
size() member serves as parts_received, and the total count of parts is
received with every part, so parts_waiting is just the difference. With
the vector approach I'd have to keep track of one additional quantity
(either parts_received, as I thought, or parts_waiting, per your
suggestion). That adds just a little additional complexity, but enough
to discourage me from using that approach.
...
> There are other possibilities for this odd data structure - other than
> stupidity or inexperience. It could be that the requirements or planned
> handling system changed underway - maybe the parts had a different
> identification system originally rather than simple integers. It could
> be that the code was prototyped in a more dynamic language, such as
> Python, or at least one where "variant" types are more common, such as

That's plausible - some parts of this project are written in python.

...
>> Agreed - the messages are provided by users, and therefore don't arrive
>> frequently enough for the inefficiency of this approach to be a
>> significant problem. The inefficiency bothers me, but my boss wouldn't
>> have let me work on something this low in priority if I weren't
>> currently stuck between a project I've finished and a project I'm not
>> allowed to start work on (we're waiting for approval from the client).
>>
>
> Inefficiency is often not a huge problem - incorrectness is.

The code is incorrect, but cases which trigger the bug are extremely
unlikely. With the inefficiency being small, and the incorrectness
unlikely, this is justifiably a low-priority bug. I definitely have more
important things to work on as soon as the work gets approved (I was
investigating one of those more important things when I found this bug).

Stefan Große Pawig

unread,
Aug 29, 2019, 2:20:13 PM8/29/19
to
[some context restored]

James Kuyper <james...@alumni.caltech.edu> writes:
> On 8/28/19 9:08 PM, James Kuyper wrote:
>> On 8/28/19 2:22 PM, Stefan Große Pawig wrote:
>>> Is that map of yours stored in another QVariant?
>>>
>> No. The original message map was declared as QMap<QString, QVariant>,
>> and the QVariant values were filled in with part maps of the type
>> QMap<QString, int>. I successfully changed the message map to
> Correction: QMap<QString, QVariant>.

But then, IIUC, that fits the bill: the QMap<QString, QVariant> for the
parts /was/ stored in another QVariant (namely, as the value type of the
message map).

If the code used QVariant::value() to fetch the part map from the message
map, there is no restriction on the type of the part map. The restriction
only arises if the caller used QVariant::toMap().

Regardless of what the code originally used, maybe the original author
had just scanned the QVariant interface for anything related to maps,
found only the QMap<QString, QVariant> as baked-in return value and then
assumed that this was the only map type that could be stored in a
QVariant?

-Stefan

James Kuyper

unread,
Aug 29, 2019, 8:14:18 PM8/29/19
to
The key point is that the author of the code freely chose to declare the
part map as QMap<QString, QVariant>, where the QVariant was used
exclusively to hold integers. He also freely choose to declare the
message map as QMap<QString, QVariant>, where the QVariant was used to
store a part map. There was no code outside the code he wrote which gave
him any excuse for using QVariant for both purposes - no code outside of
what he wrote used QVariant to interface with the code he did wrote.
Yes, he did use toMap(), to extract the map from the QVariant, but he
was the one who wrote the code to put the map into the QVariant.

If he'd used QMap<QString, int> for the part map, and QMap<QSTring,
QMap<QString, int> > for the message map, it would all have worked fine
(except for the bug that I fixed), and he would have been able to
extract the map directly, rather than by calling toMap(). So why did he
use QVariant instead? I'm getting the impression that he used QVariant
without bothering to consider whether it was actually needed.

David Brown

unread,
Aug 30, 2019, 2:28:47 AM8/30/19
to
I can imagine a few possible explanations. One is that, as I mentioned
earlier, either the code or the programmer comes from a background of
dynamically typed language - he may simply have felt comfortable with
variant types. Perhaps he was less used to complex types and found
"QMap<QString, QVariant>" to be simpler (in his eyes) than
"QMap<QSTring, QMap<QString, int> >".

For a long time, templates had a reputation for program size bloat in
C++. (In the failed "EC++" "Embedded C++" standard, templates were
banned for that reason. Though $DEITY only knows why namespaces were
banned too.) Some people were left with the impression that you should
avoid instantiating too many different template types, to reduce code
bloat and duplication. Maybe he thought re-using a single map type was
more efficient in program size than using two different types.

(I'm not suggesting that this is a /good/ reason, even if it were the
case that it would give smaller code - something I doubt.)

Öö Tiib

unread,
Aug 30, 2019, 3:15:13 AM8/30/19
to
On Friday, 30 August 2019 09:28:47 UTC+3, David Brown wrote:
>
> I can imagine a few possible explanations. One is that, as I mentioned
> earlier, either the code or the programmer comes from a background of
> dynamically typed language - he may simply have felt comfortable with
> variant types. Perhaps he was less used to complex types and found
> "QMap<QString, QVariant>" to be simpler (in his eyes) than
> "QMap<QSTring, QMap<QString, int> >".

Yes, it may be was translated in his mind from Python's Qt support (like
PySide2.QtCore) where unneeded QVariant usage was harder to notice.
And what was needed was probably QMap<QSTring, QMap<int, QString> >"
anyway.

Manfred

unread,
Sep 13, 2019, 11:49:43 AM9/13/19
to
On 8/30/2019 2:14 AM, James Kuyper wrote:
> If he'd used QMap<QString, int> for the part map, and QMap<QSTring,
> QMap<QString, int> > for the message map, it would all have worked fine
> (except for the bug that I fixed), and he would have been able to
> extract the map directly, rather than by calling toMap(). So why did he
> use QVariant instead? I'm getting the impression that he used QVariant
> without bothering to consider whether it was actually needed.

An abstract reason (i.e. with no sight of the code) is to distinguish
between an empty message and a non existing message:
You know that if you call messageMap["foo"] and the is no "foo" element,
the map creates a default object for you (at least std::map does, I
guess QtMap does it too)
If the value is a variant you get a default variant (which obviously
isn't a map), and if the value is a map you get an empty map, i.e. an
empty message.

Just guessing, as said I have no idea if this might even remotely be the
case.
(BTW of course if I had to try and access an element without
constructing one find() would be the right thing to do)

A similar argument would apply if you chose to use a vector: if you have
a vector of variants you could count how many fragments have been
received by counting how many elements contain strings, instead of using
empty strings for non-received elements, which could be wrong depending
on the transmission protocol.

Disclaimer: even if any of the above could explain what happened with
the code, there is no way to say if it could be in any way a good choice
without more info on the code and its purpose.
From what you have posted it seems to be an overcomplicated solution
for a relatively easy problem anyway, so caution is in order.

James Kuyper

unread,
Sep 13, 2019, 3:04:19 PM9/13/19
to
On 9/13/19 11:49 AM, Manfred wrote:
> On 8/30/2019 2:14 AM, James Kuyper wrote:
>> If he'd used QMap<QString, int> for the part map, and QMap<QSTring,
>> QMap<QString, int> > for the message map, it would all have worked fine
>> (except for the bug that I fixed), and he would have been able to
>> extract the map directly, rather than by calling toMap(). So why did he
>> use QVariant instead? I'm getting the impression that he used QVariant
>> without bothering to consider whether it was actually needed.
>
> An abstract reason (i.e. with no sight of the code) is to distinguish
> between an empty message and a non existing message:

No entry is placed into the message map until the first part of a
message is received (and even then, only if it's a multi-part message -
single-part messages are handled immediately rather than being saved up
waiting for the remaining parts). Therefore, there will always be at
least one part of one message for every entry in the map.
0 new messages