I want to devise a C89 preprocessor macro MULDIV(a,b,c) which, given positive integer constant expressions a, b, c, produces an integer constant expression evaluable by the preprocessor and equal to a*b/c even though a*b might grossly overflow the range of long. I can assume a, b, c, a*b/c are all in range [1..2000000000].
I came up with something that works well enough for my current application, by combining two techniques:
1) Basic arithmetic tells us that
a*b/c == (((a/c)*c + a%c) * b) / c
== (a/c)*b + ((a%c)*((b/c)*c + b%c))/c
== (a/c)*b + (b/c)*(a%c) + (a%c)*(b%c)/c
This is enough to solve the problem when c<46342:
#define MULDIV1(a,b,c) ( ((a)/(c))*(b) + (b)/(c)*((a)%(c)) + \
((a)%(c))*((b)%(c))/(c) )
#if MULDIV1(18536400,4634100,46341)!=1853640000
#error "MULDIV1(18536400,4634100,46341) fails"
#endif
#if MULDIV1(345437,268303627,46341)!=1999999999
#error "MULDIV1(345437,268303627,46341) fails"
#endif
note: at the expense of simplicity, we can reach c<92682 with:
#define MULDIV2(a,b,c) ( ((a)/(c))*(b)+(b)/(c)*((a)%(c))+\
MD__M( (c), (a)%(c), (b)%(c), (a)%(c)>(c)/2, (b)%(c)>(c)/2 ) )
#define MD__M(c,d,e,f,g) d*(g)+e*(f)-(f&&g)*c+\
(((f?c-d:d)*(g?c-e:e)-(f!=g))/c+(f!=g))*(d&&e?f!=g?-1:1:0)
#if MULDIV2(260756292,710863,92681)!=1999999999
#error "MULDIV2(260756292,710863,92681) fails"
#endif
#if MULDIV2(172572,1074113993,92681)!=1999999999
#error "MULDIV2(172572,1074113993,92681) fails"
#endif
2) In my application, when c is big, it will share common prime factors with a and/or b, and many of these factors will be 2, 3, or 5. Thus the fraction can be reduced. In particular, the powers of 2 can be removed relatively simply:
#define MULDIV3(a,b,c) MULDIV1( (a)/MD__A((a),(c)), (b)/MD__B((a),(b),(c)),\
(c)/MD__A((a),(c))/MD__B((a),(b),(c)) )
#define MD__A(a,c) (((a|c)^((a|c)-1))/2+1)
#define MD__B(a,b,c) MD__A(b,c/MD__A(a,c))
#if MULDIV3(1471048704,1032258064,759250944)!=1999999999
#error "MULDIV3(1471048704,1032258064,759250944) fails"
#endif
#if MULDIV3(944816128,1984188898,1518501888)!=1234567889
#error "MULDIV3(944816128,1984188898,1518501888) fails"
#endif
With slightly more math, and a preprocessor tolerant of huge macros, I managed to also remove enough powers of 3 and 5.
But I'm looking for something simpler and more general. Any idea? I could live with a slightly inaccurate result, provided the relative error is always at most 1/10000.
[Actually I want a*b/c by excess, that is (a*b-1)/c+1, and I have the problem that a and b can be 0, but these are minor details.]
TIA,
Francois Grieu
Why not cast each of a, b, c to double,
then cast result back to long at the end?
(Or use long double or long long instead of
double, if you know your compiler has them.)
Alternatively, ...
At execute-time you'd use Euclid's gcd algorithm
on (a,c), then (b,c/gcd(a,c)), right? You can
do this at compile-time if you unroll Euclid's
loop. Unfortunately I think a few *dozen* passes
through Euclid's loop will be needed in the worst
case. But maybe that worst-case won't apply to
your data, or maybe there's a short-cut....
James Dow Allen
Francois, this is exactly what you *don't do* with a preprocessor. The
simple, more general solution is: don't do it.
--
fred
Because that's not possible in a preprocessor expression?
> (Or use long double or long long instead of
> double, if you know your compiler has them.)
Same?
> Alternatively, ...
> At execute-time you'd use Euclid's gcd algorithm
> on (a,c), then (b,c/gcd(a,c)), right? You can
> do this at compile-time if you unroll Euclid's
> loop. Unfortunately I think a few *dozen* passes
> through Euclid's loop will be needed in the worst
> case. But maybe that worst-case won't apply to
> your data, or maybe there's a short-cut....
That works in the preprocessor, with a fixed depth. But it generates
huge expressions, and won't cut it when c is coprime with a and b.
Francois Grieu
> Because that's not possible in a preprocessor expression?
?
#define long_of_double_muldiv(a, b, c) \
((long) ((((double)(a)) * ((double)(b))) / ((double)(c))))
-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
He's looking for "an integer constant expression evaluable by the
preprocessor", i.e., something usable in a #if.
--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
I'm programming for an embedded platform with no floats, an 8-bit ALU,
2kB RAM, 1 MIPS, and 0.005 s mandated boot time including 0.0035 s
already used. Thus pre-computing tables for timing constants is a must.
Of course I can do it with a spreadsheet, or an auxiliary program, or I
could express the various clock frequencies in some strange unit instead.
But I like to change a frequency expressed in Hertz in a C header and
recompile. I think that the pre-processor and compiler are meant
precisely for that.
Admittedly, when I make a constant table, I use the compiler's
expression evaluator rather than the preprocessor's, and it can handle
double. And I know how to make the compiler generate an error at
compile time. So instead of
#define kTimerValue MULDIV(kf1,kt2*12345,kf2)
#if kTimerValue>0x10000 /* 0x10000 is OK */
#error "kTimerValue out of range for Timer2Value"
#endif
Timer2Value=(unsigned)(kTimerValue); /* 0x10000 may be changed to 0 */
I could use (not tested)
#define ASSERT_S(condition) extern char assert__fail[(condition)?1:-1];
#define kTimerValue ((unsigned long)((double)kf1*kt2*12345/kf2))
ASSERT_S(kTimerValue<=0x10000) /* 0x10000 is OK */
Timer2Value=(unsigned)(kTimerValue); /* 0x10000 may be changed to 0 */
But I lose
- meaningful error message
- control of rounding (at least my ignorance makes me uncertain of
rounding direction, and I fear different conformant compilers may round
differently).
Francois Grieu
#define long_of_double_muldiv(a, b, c) \
((long) ((((double)(a)) * ((double)(b))) / ((double)(c))))
#if long_of_double_muldiv(1, 1, 1)!=1
#error "long_of_double_muldiv broken"
#endif
That fragment won't compile on most C89 platforms, because neither casts
nor floating-point types are permitted in preprocessor expressions.
Francois Grieu
I detected some of my own stupidities after clicking Send,
but you beat me to the debunker. :-)
Hoping to make amends, let me outline another solution that
*might* work.
On Mar 30, 4:52 am, Francois Grieu <fgr...@gmail.com> wrote:
> I want to devise a C89 preprocessor macro MULDIV(a,b,c)
> ... a, b, c, a*b/c are all in range [1..2000000000].
> ...
> I could live with a slightly inaccurate result,
> provided the relative error is always at most 1/10000.
You've got 31-bit inputs, but only need 14-bit precision.
Create macro to give crude estimate of log(x); shift
a and b right enough so that log(a) + log(b) < 32;
do the arithmetic; shift back at the end.
This might be very cumbersome (various cases depending
on log(a)) but should avoid the other difficulties.
As for the preprocessor-usable log(x) estimator,
one or two of the methods under
"Finding integer log base 2 of an integer" at
http://graphics.stanford.edu/~seander/bithacks.html
might work.
James
True, but if you only want to error out, then why not do a
'compile time' assert...
#define cat(x,y) x ## y
#define cat2(x,y) cat(x,y)
#define cassert(x) \
typedef char cat2(cassert_, __LINE__) [(x) ? 1 : -1]
#define long_of_double_muldiv(a, b, c) \
((long) ((((double)(a)) * (b)) / (c)))
cassert(long_of_double_muldiv(1,1,1) == 1); /* ok */
cassert(long_of_double_muldiv(1,2,3) == 2); /* boom */
--
Peter
> I'm programming for an embedded platform, with not floats, 8-bit ALU,
> 2kB RAM, 1 MIPS, and 0.005 s mandated boot time including 0.0035 s
> already used. Thus pre-computing tables for timing constants is a must.
So write a little program or programs to generate your constant tables
as include files - simples.
Phred is correct that the (C) preprocessor isn't the place for this.
You could look at whether a more advanced text manipulation tool such
as "m4" would do the job, but the natural approach is to write a
header generating program.
Yes. I could lower precision to 1/8192 (13 bits).
> Create macro to give crude estimate of log(x); shift
> a and b right enough so that log(a) + log(b) < 32;
> do the arithmetic; shift back at the end.
>
> This might be very cumbersome (various cases depending
> on log(a)) but should avoid the other difficulties.
I fear it does not give the required precision when the result is small; in my mind 65535*65535/858967245 should give exactly 5 (either 4 or 6 is off by 20%) and 65534*65535/858967245 should give exactly 4 (either 3 or 5 is off by 25%).
> As for the preprocessor-usable log(x) estimator,
> one of two of the methods under
> "Finding integer log base 2 of an integer" at
> http://graphics.stanford.edu/~seander/bithacks.html
> might work.
I fear these expand to huge things. The least expansion that I have found so far is the straightforward (not tested)
#define LOG2(x) (30-((x)<2)-((x)<4)-((x)<8)-((x)<16)\
-((x)<32)-((x)<64)-((x)<128)-((x)<256)-((x)<512)-((x)<1024)\
-((x)<2048)-((x)<4096)-((x)<8192)-!((x)>>14)-!((x)>>15)\
-!((x)>>16)-!((x)>>17)-!((x)>>18)-!((x)>>19)-!((x)>>20)\
-!((x)>>21)-!((x)>>22)-!((x)>>23)-!((x)>>24)-!((x)>>25)\
-!((x)>>26)-!((x)>>27)-!((x)>>28)-!((x)>>29)-!((x)>>30))
This macro is needed several times, x itself can be an expression, and thus the whole thing will expand terribly.
François Grieu
Yes. I do that when I can't avoid it (such as to massage tables in
preparation for faster-than-dichotomic search). But integrating that in
the build/make chain is platform-dependent. On the other hand, I have
loads of preprocessor-driven tables constructed by the C compiler in a
way that has proven perfectly portable across 3 very different embedded
processors with C compilers from 3 vendors (Metrowerks, Keil, Cosmic),
plus GCC & Visual Studio for test on PC. So I try to use the
preprocessor when it is possible.
Francois Grieu
What's so terrible about it? The size of the expansion shouldn't make
any real difference unless it exceeds the capacity of your compiler.
Of course that's a real issue if you care about portability to
multiple compilers, some of which might have more restrictive limits.
Well. It's Wednesday 32 o'clock for me here so I'm a bit too tired to
give an actual answer, but just to comment on doing way-too-complex
preprocessor programming in general:
One solution would be to use C++, and use templates for your
computation. These are a Turing-complete language evaluated at compile
time. Anyway, back to C:
Write your solution in non-preprocessor C first, using at most 'long'
variables since that's what the preprocessor can handle. That way you
can at least solve the "real" problem in a language which isn't outright
hostile. Then move the result into macros which call each other. There
are multiprecision libraries out there with code you can likely copy.
Basically "division by hand" in base 2**16, I guess.
If you want recursion or loops, that's OK as far as it goes: Just write
several equal macros calling each other, or a "master" macro which calls
the same macro several times. E.g. for rough approx of square root:
#define ISQRT(x, a) ((x) ? ISQRT_((x), ISQRT_((x), ISQRT_((x), a))) : 0)
#define ISQRT_(x, a) (((a) + (x)/(a)) / 2)
This has the problem that the same expression is expanded several times,
which can hit implementation limits like nesting level of parenthesis,
or parse tree depth. OTOH the size of the expansion itself isn't a
problem as far as the C standard is concerned. Likely it's not a worry
until at least reaching the size of a large C file fed through the
preprocessor. (Nobody reported problems when I posted macros expanding
to a megabyte of code just for fun.)
If the expansion gets too big and slow to compile: Do you really need
preprocessor constants, or just compile-time constants? Then, for
values that will be expanded several times in a macro, thus causing the
macro expansion to explode: Instead stuff the values into enum
constants. Then you only need to expand the constant name several
times. Use 'if(enum_constant)' instead of '#if MACRO_CONSTANT' and
trust the compiler to optimize away the appropriate branch.
Enums can have values in the range of ints, so if you need long values,
split them in int-sized components. Use '##' to generate enum names -
e.g. different names for each iteration or recursion depth level: Where
a C program would say enum { hiho = f(blah) } a macro would say #define
FOOBAR(n, blah) enum { FOOBAR_hiho_##n = F(blah) }.
Instead of an #if ... #error #endif test, if you want one, with enums
you'd need to force a constraint violation instead and hope the compiler
catches it, e.g.
#define ASSERT_DECL(name, c) struct assert_##name { int assert_: (c) ? 1 : -1; }
BTW, if the result gets complex and unreadable, keep the original C
program and add it to the testsuite. Even if you don't want to use it
to generate a file, you can still turn it into a little test program
which verifies that your preprocessor magic got it right.
--
Hallvard
Francois,
if you don't dare to throw awkwardly big and complex expressions at
the preprocessor you could as well stop right now. There simply is no
way to have it nice and easy with a so-so functional expansion
language.
Returning to your original problem: why not resort to true 64-bit
math? The expression of the division will be tedious (and for sure it
will blow an implementation-limit-wise exactly conforming preprocessor
to smithereens) but it should be doable.
#define PP_CMP_U64(a,b) PP_CMP_U64_(a,b)
#define PP_CMP_U64_(a_hi,a_lo,b_hi,b_lo) (((a_hi)>(b_hi)) || \
    ((a_hi)==(b_hi))&&((a_lo)>(b_lo)))
#define PP_U64(a_hi,a_lo) (a_hi),(a_lo)
#define PP_ADD_U64(a,b) PP_ADD_U64_((a),(b))
#define PP_ADD_U64_(a_hi,a_lo,b_hi,b_lo) ((a_hi)+(b_hi)+ \
    ((a_lo)+(b_lo)<(a_lo))),((a_lo)+(b_lo))
#define A PP_U64(0UL,-1UL)
#define B PP_U64(0UL,1UL)
#if PP_CMP_U64(PP_ADD_U64(A,B),A)
# error "A+B > A, ok!"
#endif
...etc...
regards,
Mark
PS: Which is that fantastic 2kB platform that lets you select between
different, and on top of that, conforming (!) C compilers??? I would
like to use that too...
Looks like it. On the other hand, a little math can help immensely (in fact my problem is 99% solved with the modular reduction trick in my original post).
> Returning to your original problem: why not resorting to true 64 Bit
> math? The expression of the division will be tedious (and for sure it
> will blow an implementation-limit-wise exactly conforming preprocessor
> to smithereens) but it should be doable.
> #define PP_CMP_U64(a,b) PP_CMP_U64_(a,b)
> #define PP_CMP_U64_(a_hi,a_lo,b_hi,b_lo) (((a_hi)>(b_hi)) ||
> ((a_hi)==(b_hi))&&((a_lo)>(b_lo)))
> #define PP_U64(a_hi,a_lo) (a_hi),(a_lo)
> #define PP_ADD_U64(a,b) PP_ADD_U64_((a),(b))
> #define PP_ADD_U64_(a_hi,a_lo,b_hi,b_lo) ((a_hi)+(b_hi)+((a_lo)+
> (b_lo)<(a_lo))),((a_lo)+(b_lo))
>
> #define A PP_U64(0UL,-1UL)
> #define B PP_U64(0UL,1UL)
>
> #if PP_CMP_U64(PP_ADD_U64(A,B),A)
> # error "A+B > A, ok!"
> #endif
>
> ...etc...
>
> regards,
Thanks for the idea. Multiplication is the easy part, but I did not find a reasonable way to perform division. A serious problem is the wide dynamic range of the divisor.
> PS: Which is that fantastic 2kB platform that lets you select between
> different, and on top of that, conforming (!) C compilers??? I would
> like to use that too...
I program for several smart card ICs (my typical platform would be like
http://www.st.com/stonline/products/literature/bd/11463.pdf) from
several silicon vendors using different CPUs, and compilers from
different compiler vendors. I also debug my code on a PC under Visual
Studio. As far as I can tell, all my C compilers have mostly conforming
C preprocessors. One of the worst offenders is Visual C, which needs an
extra parenthesis when nesting ?: operators in preprocessor
expressions, and has different trouble with the ?: operator outside the
preprocessor, e.g.
#define FOO(n) ( (n)>31 ? 0 : 0xFFFFFFFF>>(n) )
int main(void)
{
return FOO(42);
}
gives the silly
warning C4293: '>>' : shift count negative or too big, undefined behavior
Imperfect tools... That's all too common.
Francois Grieu
> I want to devise a C89 preprocessor macro MULDIV(a,b,c) which, given
> positive integer constant expressions a b c, produce an integer constant
> expression evaluable by the preprocessor and equal to a*b/c even though
> a*b might grossly overflow the range of long. I can assume a, b, c,
> a*b/c are all in range [1..2000000000].
>
> I came up with something that works well enough for my current
> application, by combining two techniques:
>
> 1) Basic arithmetic tells that
> a*b/c == (((a/c)*c + a%c) * b) / c
> == (a/c)*b + ((a%c)*((b/c)*c + b%c))/c
> == (a/c)*b + (b/c)*(a%c) + (a%c)*(b%c)/c
>
> This is enough to solve the problem when c<46342:
>
> #define MULDIV1(a,b,c) ( ((a)/(c))*(b) + (b)/(c)*((a)%(c)) + \
> ((a)%(c))*((b)%(c))/(c) )
>
> [snip scheme involving removing common factors]
>
> But I'm looking for something simpler and more general. Any idea? I
> could live with a slightly inaccurate result, provided the relative
> error is always at most 1/10000.
I believe I have a solution to this problem. Is anyone
still interested?
> Francois Grieu <fgr...@gmail.com> writes:
>
>> I want to devise a C89 preprocessor macro MULDIV(a,b,c) which, given
>> positive integer constant expressions a b c, produce an integer
>> constant expression evaluable by the preprocessor and equal to a*b/c
>> even though a*b might grossly overflow the range of long. I can assume
>> a, b, c, a*b/c are all in range [1..2000000000].
>>
>> [snip]
>>
>> But I'm looking for something simpler and more general. Any idea? I
>> could live with a slightly inaccurate result, provided the relative
>> error is always at most 1/10000.
>
> I believe I have a solution to this problem. Is anyone still
> interested?
I am.
Thanks,
lacos
I imagine you spent some little time working on it. Posting
takes 10 seconds to format and send. So, why not just post
it?
--
Peter
The short answer is that providing an explanation along with
the macro definition would take a lot more than ten seconds,
and this isn't the sort of thing someone wants to read without
an explanation. (Also I have a rule about not talking when I
don't think anyone is listening.)
I am listening. In fact, I would really appreciate it.
Jason
Here is a sketch.
We start with the identity given in the first posting
a*b/c == (a/c)*b + (b/c)*(a%c) + (a%c)*(b%c)/c
or writing in something more C-like
#define MD(a,b,c) (a/c*b + b/c*(a%c) + a%c *(b%c) /c)
This gets some part of the result, because the first two
terms can be evaluated directly; only the last term might
cause overflow. (Incidentally, normal good practice in
writing macro definitions will put parentheses around
macro parameters wherever they occur in the macro body.
I'm not going to do that, to make the presentation more
compact. Obviously these parentheses should be put in
when writing the actual macro definitions.)
Re-writing the above just slightly,
#define MD(a,b,c) (a/c*b + b/c*(a%c) + Q( (a%c), (b%c), c) )
where Q(a,b,c) means "a*b/c", we have a nice recursive
formulation of the problem. So we would like to do again
something like MD (through a succession of Q0, Q1, Q2, ...)
to get more exact answers. Unfortunately using the same
identity again doesn't help, because (a%c) and (b%c) are
both less than c, so the first two terms of MD (if used
in Q) would be zero.
Suppose we consider an actual example. Suppose we have
a = 186737000, b = 116080197, c = 1430650345. We're looking
to get a*b/c = 15151478. Choose K to be (2**32-1)/a. We
have the rough identity
a*b/c ~~= a*K/c * (b/K)
(because the K's cancel, and we've divided by c before
multiplying, but that's just part of what makes it
an approximation). If we calculate, we get
186737000 * (4294967295 / 186737000) / 1430650345
* (116080197 / (4294967295 / 186737000))
== 15140895
which is a very nice approximation! (Of course, we
got lucky with this one, but we'll deal with that
shortly.) Notice the nice thing about the approximation
formula -- both 'a*K/c' and 'b/K' can be calculated without
fear of overflow in 32 bits. Because K == (2**32-1)/a,
a*K will always fit in 32 bits. So this term can be
computed without ever overflowing (and hence also the
product 'a*K/c * (b/K)' as long as we know a*b/c fits
in 32 bits).
Here is a more fleshed out version of the approximation
given above
a*b/c ~= K*a /c *(b/K) + b%K *a /c + K*a %c *(b/K) /c
What's missing in this second version is the remainders.
Using the identity
(a+b)/c == a/c + b/c + (a%c + b%c)/c
we get an exact identity
a*b/c == K*a /c *(b/K) + b%K *a /c + K*a %c *(b/K) /c
+ (b%K *a %c + K*a %c *(b/K) %c) /c
We can use this identity as the basis of macro definitions
for a series of Q's. To illustrate,
#define QK0(K,a,b,c) ( \
K*a /c *(b/K) + b%K *a /c + Q1( K*a %c, b/K, c ) \
+ (b%K *a %c + R1( K*a %c, b/K, c )) /c \
)
#define Q1(a,b,c) QK1( (0xffffffffUL/a), a, b, c )
... etc ...
where Rn(a,b,c) means "a*b%c". We use a similar recursive
formulation for the Rn's (details left to the reader). The
series terminates when we've gotten the approximation close
enough, eg, perhaps
#define Q5(a,b,c) (a*b/c)
#define R5(a,b,c) (a*b%c)
Well, that's most of it. What's left is some protection
against problem conditions and some engineering. Here
are the main considerations:
1. Have to protect against dividing by a if a is 0.
If a is 0, a*b/c == 0, and a*b%c == 0. Use ?:.
2. It turns out the convergence is faster if we switch
the order of the arguments in the Qn/Rn macros.
Mathematically it doesn't matter, but practically
we want the calls to be
Qn( b/K, K*a %c, c )
and
Rn( b/K, K*a %c, c )
rather than how they are written above.
3. This formulation will give an exact answer when the
result is under about 20,000, provided we choose
the smaller of (a%c) and (b%c) as the first argument
to Q0. Use ?: to do that.
4. When c is large and a*b/c is large, this approach
doesn't quite converge fast enough. Fortunately
there is a simple approximation that works
a*b/c ~= (a/25)*b/(c/25)
Applying Q0 to these arguments yields an approximate
value to better than one part in 10,000. (This time
we want to use the larger of (a%c) and (b%c) as the
first argument).
5. Deciding which realm we are in (and so whether to
use the approximation explained in (4)) can be done
by calculating an approximate answer. It's easy
to get an approximate answer that is good to within
a factor of, oh, 25%, and that's easily good enough
to choose between case (4) and all the others.
Also: there is a more elaborate approach that produces an exact
answer for all values in the range desired (a,b,c <= 2000000000),
but that's more complicated so I thought I'd just give this one.
Folks who are mathematically minded may enjoy working out the
more elaborate approach using the sketched approach as a starting
point.