Sounds like we should switch those preprocessor defines to numeric_limits<> functions …
> and there was no Unicode support, which is
> compiled for in Windows by default…
Not sure what this means.
> I have yet to get the cpp port to correctly scan any image but I've
> been working strictly with QR codes so perhaps bar codes will work
> better...
Actually, the qr code support is much more solid in C++ than the 1D codes. Not sure what's failing in your port. Sounds like it's gonna take some tracking down ...
(Presumably) this is okay and VC++ is being overly conservative. We could probably quiet it with a cast but see below.
> Sadly, I won't say that it works as it
> should because, as rare as it is for me to agree with MS, it is wrong
> to take the unary minus of an unsigned type and get a signed one.
You don't. Negating an unsigned is an unsigned of the same width. It's completely legitimate to take the 2s complement of an unsigned. It'll never compare less than zero, but we aren't expecting it to … I hope. (See below.)
The warning is just saying what the standard says. I presume it's there because it's a common mistake but it is an entirely valid expression and entirely well defined (modulo my misinterpretation of the standard.)
What this does point out, though, is that (in my opinion) this code should not be using size_t's. We really want to keep the C++ as close to the Java code as we can so weird (and hard to debug) differences don't crop up. Since Java doesn't have unsigned types, this means we should be using signed types even where unsigned types make sense/seem safe.
In this case, it doesn't make any difference because the final values are always non-negative (since i is used to deference the image and that will fail for negative values) but it's certainly possible for Java code to use negative values as a flag and if we copy expressions like that but change the type to unsigned, we'll definitely get different behavior. Generally this will generate a warning, at least in gcc, but it's just too error prone, too easy to silence with an incorrect cast.
My thinking is that this (and probably other) uses of size_t should be switched to ptrdiff_t. It's more or less that signed equivalent of size_t. ssize_t is the other obvious possibility but the theoretical definition of ssize_t (close to size_t, but also -1) is not as general as Java int. ptrdiff_t is more or less supposed to be able to represent the distance between the beginning and end of large arrays in either direction, which is seems right.
Thoughts?
Any size smaller than int undergoes integral promotion before being used in an expression, so an unsigned char becomes an unsigned int then the - is applied to the unsigned int and it stays an unsigned int, so I'm not sure why VC++ would warn for size_t and not unsigned char. I would understand if - on an unsigned promoted it to the next larger signed type but it doesn't.
So I still hold that it's an invalid warning, so who knows how they decide when to put up invalid warnings.
> It depends on what you mean by "legitimate"
By the definition in the standard.
Strictly speaking the standard doesn't say twos complement, it says
"The negative of an unsigned quantity is computed by subtracting its value from 2^n,
where n is the number of bits in the promoted operand."
> I won't pretend to know the standards specification but I'll just
> point out that you can omit returning a value for a non-void function
> but I don't think you'd call that "valid…"
The standard says that not providing a return value in a value-returning function results in undefined behavior.
There's nothing undefined about the result of unary minus on an unsigned.
> Incidentally, gcc doesn't bother giving a warning for that, either!
You need to use -Wall more often. We compile zxing with -Wall -Wextra -Werror, which is the way I compile all my C/C++/ObjectiveC/ObjectiveC++ code. I usually throw -pedantic in there, too, but zxing relies on auto arrays of variable size which is a non-standard extension.
> I must say that I'd object (not that I'm anyone important. I'm just
> stating my opinion here) to using negative numbers as flags.
I didn't mean to imply that it does (and whether you do or don't isn't an argument I'm personally particularly interested). I only raised it because it would be a compatibility issue if it did.
> ptrdiff_t is supposed to be the result of pointer arithmetic which,
> admittedly, is usually done only for array manipulation but still...
> not all pointer arithmetic is done on arrays.
Actually, ptrdiff_t is defined to be the type of pointer differences; all other valid pointer-pointer operations return bool. And pointer difference is only defined for arrays:
"Unless both pointers point to elements of the same array object, or
one past the last element of the array object, the behavior is undefined."
Of course, it's commonly used in lots of other ways, but strictly speaking that relies on undefined behavior.
> I think the most prudent choice for
> clarity is to simply use type int
sizeof(int) < sizeof(ptrdiff_t) for x86_64 so I still think ptrdiff_t is the right choice.
I just reread this and it reads wrong. It sounds like I'm (pretty obnoxiously) saying I'm not interested in nerdyvideo's opinion. I actually meant "whether we do or don't", in other words, I don't have a strong opinion/am not terribly interested in arguments on whether or not the project as whole uses this pattern. Not a hot button for me.
My apologies for sounding so rude ...
Perhaps. zxing seems to have tended away from STL, perhaps for portability, I don't know. It might require checking to see if it's in an inner loop as well, since it's possible the dynamic allocation could have a performance impact.
But for now it's more an issue of if it's not broke, don't fix it.
> Because none of the variables in question are
> actually the result of pointer arithmetic, I still contend that
> ptrdiff_t is not the right choice.
ptrdiff_t is a signed integral type with range on the same order as that of a pointer. And that's the same definition for valid array indices since array dereference is just pointer arithmetic.
To make the C++ code entirely equivalent to the Java, we'd use int32_t since Java always uses the same widths. But Java also can't have arrays with more than 2G elements and C++ can (not that zxing will ever use it.) Some internal operations return these types (sizeof, pointer differences) so it still makes more sense to me to use them. Word-size related bugs are not common, not as common as signedness bugs.
I think it's in the 1D codes.
> I get the impression that portability is a greater priority for zxing
> than performance.
True. It'd have to be a significant performance difference to outweigh portability. I really doubt it is, but it's worth sanity-checking. Glancing at the Java code should make it clear quickly.
> If so then then it really is "broke" because VC++
> doesn't support this extension.
Sure. I haven't seen anyone raise VC++ issues. I support making it compatible if someone cares enough to file a bug. I only stumbled across it because I was tweaking the warnings to make the compilation more conservative.
> So far, it appears that the STL,
> which is part of the C++ standard, is more portable than compiler
> extensions…
Yeah. I don't know the history here. There are a number of classes in zxing that are pretty close to duplicates of STL. Not sure why STL wasn't used. Not sure if perhaps originally not requiring STL was a goal (but as you mention, as Array now uses std::vector, there is an STL dependence).
> Unless you are actually trying to spite MS programmers
Nope. I don't know when that code was written. It might simply have been an oversight, added without noticing it was an extension since the warnings that would have surfaced it weren't turned on.
I looked at making zxing compile with -pedantic but personally tabled it because I didn't really have a need and as I recall, it was more than just a few lines of code. Don't think I ever did compare the C++ to the Java in those cases. Now I'm a little curious …
In any case, I for one would welcome a patch …
As to the ptrdiff_t issue, in the the case of zxing where commonality with the Java is valuable, I wouldn't object to using int32_t and never use int or size_t. It limits some things, but only in the theoretical sense. There's nothing in zxing that I'm aware of the would benefit from taking advantage of a 64 bit architecture.
> ptrdiff_t is
> not a bad choice, I just think it's misleading 'cause it implies
> something that isn't true (pointer arithmetic)
Personally, I think you're taking too literal of an interpretation of the name. The name matters less. What matters is what it is, which is the signed type of the same width as size_t (at least in every case I know of).
Mixing operands that differ in both signed-ness and width at the same time can be the source of lots of bugs. (Google around … not only will you find people talking about this, at least some of the high ranked search results have their own subtle bugs.)
The rule of thumb is for things that use lengths and indexes, use same-width pairs: e.g., int/unsigned, int32_t/uint32_t, ptrdiff_t/size_t.
If you use size_t and think your collections scale up in size on 64 bit architectures but then use int to represent indexes or index arithmetic, you're likely to have bugs if you ever do scale up over 2G elements. If you're going to use int for indexes, you should use unsigned for sizes, not size_t.
The Java code uses int everywhere because it is, in some sense, the natural size of collections in Java. There are a lot of places in the zxing code where shorts and maybe even bytes could be used. But people don't really bother to be too precise. On the other hand, longs aren't used because Java collections don't deal in longs. So int is the correct native type.
C doesn't have a standard width and size_t and ptrdiff_t were created as types that would represent the natural/maximum sizes for arrays (and in essence, all collections) for the implementation. You can use other pairs as long as your consistent and you don't go over the bounds of size_t/ptrdiff_t.
Ah, right. I remember now. Ref/Counted are intrusive. C++11 adds non-intrusive std::shared_ptr (based on boost::shared_ptr which I'm familiar with.) Maybe some day we can switch to that.