Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Getting the smallest signed type that can represent all the values of a given unsigned type

76 views
Skip to first unread message

Seungbeom Kim

unread,
May 16, 2012, 12:23:50 AM5/16/12
to
Hi all,

The subject explains itself. Let's say you have a pair of objects, a and b,
of an unsigned integer type U, and you want to calculate their difference
in a signed integer type.

Obviously you cannot simply write "a - b", because that will yield an
unsigned integer value in modulo arithmetic. So you want to write

S d = static_cast<S>(a) - static_cast<S>(b);

for a signed integer type S.

The problem is, how do you determine S? Of course, S should be large enough
to represent all the values of U. To be safe, you could just use S=intmax_t
for every U, but that might be an overkill; if U=uint16_t, then S=uint32_t
is enough, and you don't want to use more expensive arithmetic in uint64_t
or uint128_t. So, something like S=int_least[N+1]_t where N=numeric_limits
<U>::digits() would be desirable (+1 because of the sign bit).

Is there a way to automatically determine S from U?

--
Seungbeom Kim


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Alf P. Steinbach

unread,
May 16, 2012, 3:21:11 AM5/16/12
to
On 16.05.2012 06:23, Seungbeom Kim wrote:
> Hi all,
>
> The subject explains itself. Let's say you have a pair of objects, a and b,
> of an unsigned integer type U, and you want to calculate their difference
> in a signed integer type.
>
> Obviously you cannot simply write "a - b", because that will yield an
> unsigned integer value in modulo arithmetic. So you want to write
>
> S d = static_cast<S>(a) - static_cast<S>(b);
>
> for a signed integer type S.
>
> The problem is, how do you determine S? Of course, S should be large enough
> to represent all the values of U. To be safe, you could just use S=intmax_t
> for every U, but that might be an overkill; if U=uint16_t, then S=uint32_t
> is enough, and you don't want to use more expensive arithmetic in uint64_t
> or uint128_t. So, something like S=int_least[N+1]_t where N=numeric_limits
> <U>::digits() would be desirable (+1 because of the sign bit).
>
> Is there a way to automatically determine S from U?
>

I think this should work, although I recommend primarily to avoid using
unsigned types for numbers, and secondly, if one must, use `ptrdiff_t`
for differences (after all, that's what you get for a pointer difference):


<code>
#include <iostream>
#include <typeinfo>
using namespace std;

struct CompleteVoid {};

template< class Type > struct Next;
template<> struct Next<char> { typedef unsigned char T; };
template<> struct Next<signed char> { typedef unsigned char T; };
template<> struct Next<unsigned char> { typedef short T; };
template<> struct Next<short> { typedef unsigned short T; };
template<> struct Next<unsigned short> { typedef int T; };
template<> struct Next<int> { typedef unsigned T; };
template<> struct Next<unsigned> { typedef long T; };
template<> struct Next<long> { typedef unsigned long T; };
template<> struct Next<unsigned long> { typedef long long T; };
template<> struct Next<long long> { typedef unsigned long long T; };
template<> struct Next<unsigned long long> { typedef CompleteVoid T; };

template< bool condition, class A, class B >
struct Choose;

template< class A, class B >
struct Choose< true, A, B > { typedef A T; };

template< class A, class B >
struct Choose< false, A, B > { typedef B T; };

template< class Type >
struct NextLarger
{
private:
typedef typename Next<Type>::T NextUp;
public:
typedef typename Choose<
(sizeof(NextUp) > sizeof(Type)),
NextUp,
typename NextLarger<NextUp>::T
>::T T;
};

template<>
struct NextLarger<CompleteVoid>
{
public:
typedef void T;
};

int main()
{
wcout
<< "The signed type above & covering 'unsigned' is '"
<< typeid( NextLarger<unsigned>::T ).name()
<< "'."
<< endl;
}
</code>


For g++ there is g++-specific function that produces readable type
names, I don't recall its name but you can easily find it.


Cheers & hth.,

- Alf


--

Francis Glassborow

unread,
May 16, 2012, 1:35:55 PM5/16/12
to
On 16/05/2012 05:23, Seungbeom Kim wrote:
> Hi all,
>
> The subject explains itself. Let's say you have a pair of objects, a and b,
> of an unsigned integer type U, and you want to calculate their difference
> in a signed integer type.
>
> Obviously you cannot simply write "a - b", because that will yield an
> unsigned integer value in modulo arithmetic. So you want to write
>
> S d = static_cast<S>(a) - static_cast<S>(b);
>
> for a signed integer type S.
>
> The problem is, how do you determine S? Of course, S should be large enough
> to represent all the values of U. To be safe, you could just use S=intmax_t
> for every U, but that might be an overkill; if U=uint16_t, then S=uint32_t
> is enough, and you don't want to use more expensive arithmetic in uint64_t
> or uint128_t. So, something like S=int_least[N+1]_t where N=numeric_limits
> <U>::digits() would be desirable (+1 because of the sign bit).
>
> Is there a way to automatically determine S from U?

There cannot be a universal solution and on some systems there may be no
solution at all (there is nothing in the standard to prevent all integer
types, both signed and unsigned from having the same number of bits
representing them)

Your best bet is to try using signed long long, which will work as long
as the unsigned type is not unsigned long long and that the long long
types have a larger range than the long types.

Francis

PS if this is a really critical issue you could always add your own
very_long_int type
>


--

Thiago Adams

unread,
May 16, 2012, 1:38:16 PM5/16/12
to
On May 16, 1:23 am, Seungbeom Kim <musip...@bawi.org> wrote:
> Hi all,
>
> The subject explains itself. Let's say you have a pair of objects, a and b,
> of an unsigned integer type U, and you want to calculate their difference
> in a signed integer type.
>
> Obviously you cannot simply write "a - b", because that will yield an
> unsigned integer value in modulo arithmetic. So you want to write
>
> S d = static_cast<S>(a) - static_cast<S>(b);
>
> for a signed integer type S.
>
> The problem is, how do you determine S? Of course, S should be large enough
> to represent all the values of U. To be safe, you could just use S=intmax_t
> for every U, but that might be an overkill; if U=uint16_t, then S=uint32_t
> is enough, and you don't want to use more expensive arithmetic in uint64_t
> or uint128_t. So, something like S=int_least[N+1]_t where N=numeric_limits
> <U>::digits() would be desirable (+1 because of the sign bit).
>
> Is there a way to automatically determine S from U?


I asked myself the same question while developing a bigint
library. I think the general question is auto-type-selection
considering some criteria like speed or size. I postponed
the problem using the worst case long long.

I think we can solve the problem having a conversion table
for all integral types. The table is built considering some
criteria. In your case I think int32 is fine for all cases,
unless you need int64.

[unsigned char, unsigned chart] -> int
[unsigned short, unsigned short] -> int
...more...
[unsigned long long, unsigned long long] -> long long

----

template<class T, class U> struct SelType;

template<>
struct SelType<unsigned char, unsigned char> {
typedef int type;
};

template<>
struct SelType<unsigned short, unsigned short> {
typedef int type;
};

...more...

template<>
struct SelType<unsigned long long, unsigned long long> {
typedef long long type;
};

template<class U>
void Diff(U a, U b)
{
typedef SelType<U, U>::type S;
S d = static_cast<S>(a) - static_cast<S>(b);
}


--

Seungbeom Kim

unread,
May 22, 2012, 2:40:43 PM5/22/12
to
On 2012-05-16 10:38, Thiago Adams wrote:
>
> I think we can solve the problem having a conversion table
> for all integral types. The table is built considering some
> criteria. In your case I think int32 is fine for all cases,
> unless you need int64.
>
> [unsigned char, unsigned chart] -> int
> [unsigned short, unsigned short] -> int
> ....more...
> [unsigned long long, unsigned long long] -> long long

This is not a portable solution because you don't know whether int
will be sufficient for unsigned char or unsigned short; it could be
that sizeof(char) = sizeof(short) = sizeof(int) < sizeof(long) and
that you need to cast unsigned char to signed long.

This is a property of the C++ implementation, so the implementation
trivially knows the solution to this problem, but it's quite hard to
come up with a satisfying solution in a user code; it's just like
trying to invent int_least32_t on your own (without <cstdint>).
I hoped there would be some help from the implementation that I didn't
know, but there apparently isn't.

> ----
>
> template<class T, class U> struct SelType;
>
> template<>
> struct SelType<unsigned char, unsigned char> {
> typedef int type;
> };
>
> template<>
> struct SelType<unsigned short, unsigned short> {
> typedef int type;
> };
>
> ....more...
>
> template<>
> struct SelType<unsigned long long, unsigned long long> {
> typedef long long type;
> };
>
> template<class U>
> void Diff(U a, U b)
> {
> typedef SelType<U, U>::type S;
> S d = static_cast<S>(a) - static_cast<S>(b);
> }

If you're always using the same argument for the two template parameters
for SelType, why do you define SelType so that it takes two parameters?

--
Seungbeom Kim

Thiago Adams

unread,
May 22, 2012, 8:22:58 PM5/22/12
to
On May 22, 3:40 pm, Seungbeom Kim <musip...@bawi.org> wrote:
> On 2012-05-16 10:38, Thiago Adams wrote:
> > I think we can solve the problem having a conversion table
> > for all integral types. The table is built considering some
> > criteria. In your case I think int32 is fine for all cases,
> > unless you need int64.
>
> > [unsigned char, unsigned chart] -> int
> > [unsigned short, unsigned short] -> int
> > ....more...
> > [unsigned long long, unsigned long long] -> long long
>
> This is not a portable solution because you don't know whether int
> will be sufficient for unsigned char or unsigned short; it could be
> that sizeof(char) = sizeof(short) = sizeof(int) < sizeof(long) and
> that you need to cast unsigned char to signed long.
>
> This is a property of the C++ implementation, so the implementation
> trivially knows the solution to this problem, but it's quite hard to
> come up with a satisfying solution in a user code; it's just like
> trying to invent int_least32_t on your own (without <cstdint>).
> I hoped there would be some help from the implementation that I didn't
> know, but there apparently isn't.
>

Yes, the table depends on a specific implementation considering
target and compiler. If it only depends on sizeofs then we can still
select tables automatically using templates, and in the worst case
using #ifdefs with compiler and information provided by user.

I didn’t understand exactly what criteria you want. The title suggests
you want the minimal size, but the sample suggests you want to use the
less expensive in performance. (I guess the solution is not the same)

My problem was a little bit different.
The criteria I need is fast and correct. size doesn't matter.

template<class A, class B>
void Add(A a, B b)
{
// A and B are any type you can imagine but
// they represent positive integers {0, 1, 2 ...}
// I need to suport C++ integral types for A and B.

// All I need is the correct result for a + b!
// overflow not allowed
// size is not important
}

The problem is very easy to understand, but I found it hard
to implement in a generic way. What could be simpler than that?
But the C++ compiler is not helping.

> > ----
>
> > template<class T, class U> struct SelType;
>
> > template<>
> > struct SelType<unsigned char, unsigned char> {
> > typedef int type;
> > };
>
> > template<>
> > struct SelType<unsigned short, unsigned short> {
> > typedef int type;
> > };
>
> > ....more...
>
> > template<>
> > struct SelType<unsigned long long, unsigned long long> {
> > typedef long long type;
> > };
>
> > template<class U>
> > void Diff(U a, U b)
> > {
> > typedef SelType<U, U>::type S;
> > S d = static_cast<S>(a) - static_cast<S>(b);
> > }
>
> If you're always using the same argument for the two template parameters
> for SelType, why do you define SelType so that it takes two parameters?

Sorry, my suggestion was influenced by my problem and the table would
require two input types.
If you can use a single type without loss of generacity I think it's
much simpler.


--

Seungbeom Kim

unread,
May 23, 2012, 3:09:39 PM5/23/12
to
On 2012-05-22 17:22, Thiago Adams wrote:
>
> I didn’t understand exactly what criteria you want. The title suggests
> you want the minimal size, but the sample suggests you want to use the
> less expensive in performance. (I guess the solution is not the same)

Sorry for not being clear. I was thinking that intmax_t would be an
"easy" answer but also that it could be both larger and more expensive
than necessary. It may be that we need two different solutions, just as
<cstdint> provides both int_leastN_t and int_fastN_t.

--
Seungbeom Kim

Seungbeom Kim

unread,
May 23, 2012, 3:12:51 PM5/23/12
to
On 2012-05-15 21:23, Seungbeom Kim wrote:
>
> The problem is, how do you determine S? Of course, S should be large
> enough to represent all the values of U. To be safe, you could just
> use S=intmax_t for every U, but that might be an overkill; if
> U=uint16_t, then S=uint32_t is enough, and you don't want to use
> more expensive arithmetic in uint64_t or uint128_t. So, something
> like S=int_least[N+1]_t where N=numeric_limits <U>::digits() would
> be desirable (+1 because of the sign bit).

A useful tool for programmers to have is a template version of
[u]int_{least,fast}[N]_t: if we had

namespace std
{
template <size_t N> struct int_least
{
typedef ... type; // defined only if a signed integer type
// at least N bits wide exists
};

template <size_t N> struct int_fast; // similarly
}

then the original problem is solved:

typedef std::int_least<std::numeric_limits<U>::digits()+1>::type S;

And it's quite easy for implementations to provide as well, using
recursion and a few specializations for those values of N for which
the implementation has std::int_leastN_t, including 8, 16, 32 and 64.
0 new messages