I had to fix a bug recently in poorly documented code written by someone
else, who was not available for consultation. This code is written in
C++, but uses Qt, which I'm not as familiar with as I'd like. However, I
believe that QMap, QString, and QVariant do not differ from std::map,
std::string, and std::variant in ways that are significant to the
following explanation - please correct me if I'm wrong about this. I
can't show the actual code (it's proprietary), but if you find the
following description confusing, I could post equivalent code.
The context is a message string that gets broken up into multiple parts
for transmission. Each part is transmitted along with a message ID, a
part number, and the total number of parts. The parts were not
guaranteed to arrive at their destination in any particular order. Parts
of multiple different messages might be intermingled with each other.
The receiving code stored, for each a message, a map containing the
parts of the message received so far, to be recombined after all parts
have been received. Here's where it got decidedly odd: the code uses a
QMap<QString, QVariant), using the text of each message part as a key,
and passing the part number as the corresponding value. When all parts
had arrived, it would then sort the map by it's values, and then extract
the corresponding keys to reconstruct the original message.
The bug was that if one part of a message has the same exact contents as
a previously processed part, then the new part number would replace the
existing part number, rather than adding a new entry to the map. As a
result, the number of parts collected would never match the number
originally sent, so the message could never be completed, and eventually
it got discarded. In practice, this situation isn't likely to come up; I
found the bug during testing only because I was testing to confirm
limits on the total message size, so I used a test message containing
many repetitions of "1234567890", in order to more easily count how many
characters I was sending. As soon as I tested with a message that needed
more than 10 parts, a duplicate was guaranteed.
The solution was trivial: use a QMap<int,QString> rather than
QMap<QString,QVariant>, using the part number as the key. There was
absolutely no need in this code to generalize by using QVariant - it was
used solely to store int values. This rearrangement had the added
advantage of avoiding the slow sort step, because QMap already allows
fast extraction of values in key order.
The only explanation I could find for this peculiar design was a short
comment: "I'd prefer using QMap<int,QString>, which would be faster -
but then it wouldn't be a QVariantMap". The string "QVariantMap" occurs
nowhere else anywhere in the relevant code base, and there's nothing
about it at <
https://doc.qt.io/>. I have no idea what he was thinking
about when he wrote that comment. Any suggestions?
At a higher level in the same code, all of the messages currently being
received are stored in a QMap<QString, QVariant>, where the key is the
message ID, and value is the QMap containing the corresponding part map.
That code does something that also strikes me as inefficient and
strange, but at least it's not actually buggy:
It first searches for the messageID in the message map. If it is not
found, it creates a new part map, inserts the current part into that
map, and then inserts that part map into the message map. That code
makes sense to me.
However, if it does find the messageID in the message map, it copies the
corresponding part map into a local variable, inserts the new part into
that map, and then copies the part map back into the message map,
replacing the old part map for that same message ID. I find this quite
strange: it wastes time and memory creating a copy of the part map, it
wastes time destroying the old part map when the new part map is copied
in it's place, and finally wastes time when the new part map gets destroyed.
It seems more sensible to me to insert the new part directly into the
part map while it is stored inside the message map. In principle, a
compiler might optimize the actual code accordingly, but I wouldn't
recommend relying on that. Can anyone suggest any good reason why
someone might use the more elaborate approach that was actually chosen?
Note that the code does not do anything else with either the message map
or the local copy of the part map. While the code as a whole is
multi-threaded, everything I described happens within a single thread.
The code contains none of the protections it would have to have if the
message map were shared between threads.