Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

itoc function - can it be simplified?

85 views
Skip to first unread message

Clive Page

unread,
Mar 31, 2012, 5:15:06 AM3/31/12
to
From time to time I find it convenient to have a function available
which converts from a default integer to a decimal digit string of
minimum length, for example for doing a data type conversion on the fly
when calling a library routine, or in an I/O statement.

My old solution to this did the binary-to-decimal conversion by repeated
division and an ACHAR intrinsic, but now there is no restriction on
using an itoc function in an I/O statement even if it performs internal
file I/O. So the conversion becomes trivial, but determining the
string length in the function declaration is no simpler. Unless I
missed something.

Basically the LOG10 function finds the number of digits, but converting
this to the appropriate integer which works in boundary cases, and for
negative numbers, is a bit more messy. Here's what I use at present,
but I wonder if it can be simplified. By the way, the constant of 5.0
which multiplies the EPSILON value is a bit arbitrary and I found it
only with a bit of trial and error. The part on the continuation line
handles the adjustment in the case of negative integers.


module itoc_mod
implicit none
CONTAINS

function itoc(ivalue) result (string)
integer, intent(in) :: ivalue
integer, parameter :: dp = selected_real_kind(10,50)
character(len=ceiling(log10(real(max(abs(ivalue),1),dp))+5.0*epsilon(1.0_dp))
&
- min(0,max(-1,ivalue))) :: string
write(string, '(i0)') ivalue
end function itoc

end module itoc_mod
!-----------------------------------------------------------
program itoc_test
use itoc_mod
implicit none
integer :: j, n
do j = 0, 9
n = 10**j
print *, '"', itoc(n-1), '", "', itoc(n), '"'
end do
end program


--
Clive Page

mecej4

unread,
Mar 31, 2012, 7:02:12 AM3/31/12
to
Why not use the i0 format?

program itoc_test
implicit none
integer :: j, n
do j = 0, 9
n = 10**j
write (*,'(1x,i0)') n
end do
end program

-- mecej4

Clive Page

unread,
Mar 31, 2012, 7:34:41 AM3/31/12
to
On 31/03/2012 12:02, mecej4 wrote:

> Why not use the i0 format?

I DID use i0 format, the problem is getting this to work in a function.


--
Clive Page

mecej4

unread,
Mar 31, 2012, 7:56:44 AM3/31/12
to
On 3/31/2012 6:34 AM, Clive Page wrote:
> On 31/03/2012 12:02, mecej4 wrote:
>
>> Why not use the i0 format?
>
> I DID use i0 format, the problem is getting this to work in a function.
>
>
Ah, I missed that!

A thought that occurred to me is that computing the logarithm could be
avoided given that only the integer part is of interest.

With IEEE floating point one already has the variable represented with a
mantissa between 1 and 2 plus a biased binary exponent. You should be
able to compute the necessary minimal string length by taking the
exponent, subtracting the bias value and multiplying by the common
logarithm of 2.

-- mecej4

James Van Buskirk

unread,
Mar 31, 2012, 9:13:19 AM3/31/12
to
"Clive Page" <use...@page2.eu> wrote in message
news:9to8el...@mid.individual.net...

> On 31/03/2012 12:02, mecej4 wrote:

>> Why not use the i0 format?

> I DID use i0 format, the problem is getting this to work in a function.

C:\gfortran\clf\itoc>type itoc.i90
pure function itoclen(x)
integer itoclen
integer(ik), intent(in) :: x
character(range(x)+3) tryme

write(tryme,'(i0)') x
itoclen = len_trim(tryme)
end function itoclen

function itoc(x)
integer(ik), intent(in) :: x
character(itoclen(x)) itoc

write(itoc,'(i0)') x
end function itoc

C:\gfortran\clf\itoc>type itoc_test.f90
module mykinds
implicit none
integer, parameter :: ik1 = selected_int_kind(2)
integer, parameter :: ik2 = selected_int_kind(4)
integer, parameter :: ik4 = selected_int_kind(9)
integer, parameter :: ik8 = selected_int_kind(18)
end module mykinds

module ik1_mod
use mykinds, only: ik => ik1
implicit none
private
public itoc_generic
interface itoc_generic
module procedure itoc
end interface itoc_generic
contains
include 'itoc.i90'
end module ik1_mod

module ik2_mod
use mykinds, only: ik => ik2
implicit none
private
public itoc_generic
interface itoc_generic
module procedure itoc
end interface itoc_generic
contains
include 'itoc.i90'
end module ik2_mod

module ik4_mod
use mykinds, only: ik => ik4
implicit none
private
public itoc_generic
interface itoc_generic
module procedure itoc
end interface itoc_generic
contains
include 'itoc.i90'
end module ik4_mod

module ik8_mod
use mykinds, only: ik => ik8
implicit none
private
public itoc_generic
interface itoc_generic
module procedure itoc
end interface itoc_generic
contains
include 'itoc.i90'
end module ik8_mod

module generic_recombination
use ik1_mod, only: itoc => itoc_generic
use ik2_mod, only: itoc => itoc_generic
use ik4_mod, only: itoc => itoc_generic
use ik8_mod, only: itoc => itoc_generic
end module generic_recombination

program itoc_test
use mykinds
use generic_recombination
implicit none
integer(ik1) j1
integer(ik2) j2
integer(ik4) j4
integer(ik8) j8

j1 = -huge(j1)
write(*,'(3(i0,1x),a)') kind(j1),j1,len(itoc(j1)),itoc(j1)
j2 = -huge(j2)
write(*,'(3(i0,1x),a)') kind(j2),j2,len(itoc(j2)),itoc(j2)
j4 = -huge(j4)
write(*,'(3(i0,1x),a)') kind(j4),j4,len(itoc(j4)),itoc(j4)
j8 = -huge(j8)
write(*,'(3(i0,1x),a)') kind(j8),j8,len(itoc(j8)),itoc(j8)
end program itoc_test

C:\gfortran\clf\itoc>gfortran itoc_test.f90 -oitoc_test

C:\gfortran\clf\itoc>itoc_test
1 -127 4 -127
2 -32767 6 -32767
4 -2147483647 11 -2147483647
8 -9223372036854775807 20 -9223372036854775807

--
write(*,*) transfer((/17.392111325966148d0,6.5794487871554595D-85, &
6.0134700243160014d-154/),(/'x'/)); end


Aris

unread,
Mar 31, 2012, 9:25:52 AM3/31/12
to
Clive Page <use...@page2.eu> wrote:
> On 31/03/2012 12:02, mecej4 wrote:
>
>> Why not use the i0 format?
>
> I DID use i0 format, the problem is getting this to work in a function.

Call the function inside TRIM and ADJUSTL.

write(*,*) TRIM(ADJUSTL(itoc(n)))

Then len does not have to be precise, only large enough.

glen herrmannsfeldt

unread,
Mar 31, 2012, 1:24:51 PM3/31/12
to
mecej4 <mec...@nospam.operamail.com> wrote:

(snip)

> A thought that occurred to me is that computing the logarithm could be
> avoided given that only the integer part is of interest.

> With IEEE floating point one already has the variable represented with a
> mantissa between 1 and 2 plus a biased binary exponent. You should be
> able to compute the necessary minimal string length by taking the
> exponent, subtracting the bias value and multiplying by the common
> logarithm of 2.

I usually don't trust the log function not to round off, even
just a little bit.

A loop dividing by 10 each time until the result is less than 10
seems faster and won't have rounding problems.

n=0
do while(j.ge.10)
j=j/10
n=n+1
enddo

(save j first)

-- glen

James Van Buskirk

unread,
Mar 31, 2012, 2:23:42 PM3/31/12
to
"James Van Buskirk" <not_...@comcast.net> wrote in message
news:jl6vtc$q4u$1...@dont-email.me...

> C:\gfortran\clf\itoc>type itoc.i90
> pure function itoclen(x)
> integer itoclen
> integer(ik), intent(in) :: x
> character(range(x)+3) tryme

> write(tryme,'(i0)') x
> itoclen = len_trim(tryme)
> end function itoclen

> function itoc(x)
> integer(ik), intent(in) :: x
> character(itoclen(x)) itoc

> write(itoc,'(i0)') x
> end function itoc

OK, so that was a bit old-fashioned.

C:\gfortran\clf\itoc>type itoc1.i90
function itoc(x)
integer(ik), intent(in) :: x
character(:), allocatable :: itoc
character(range(x)+3) tryme
integer mylen

write(tryme,'(i0)') x
mylen = len_trim(tryme)
!allocate(character(len=mylen,kind=kind('A'))::itoc)
!allocate(itoc,source=trim(tryme))
allocate(itoc,mold=trim(tryme))
itoc = trim(tryme)
end function itoc

C:\gfortran\clf\itoc>type itoc_test1.f90
module mykinds
implicit none
integer, parameter :: ik1 = selected_int_kind(2)
integer, parameter :: ik2 = selected_int_kind(4)
integer, parameter :: ik4 = selected_int_kind(9)
integer, parameter :: ik8 = selected_int_kind(18)
end module mykinds

module ik1_mod
use mykinds, only: ik => ik1
implicit none
private
public itoc_generic
interface itoc_generic
module procedure itoc
end interface itoc_generic
contains
include 'itoc1.i90'
end module ik1_mod

module ik2_mod
use mykinds, only: ik => ik2
implicit none
private
public itoc_generic
interface itoc_generic
module procedure itoc
end interface itoc_generic
contains
include 'itoc1.i90'
end module ik2_mod

module ik4_mod
use mykinds, only: ik => ik4
implicit none
private
public itoc_generic
interface itoc_generic
module procedure itoc
end interface itoc_generic
contains
include 'itoc1.i90'
end module ik4_mod

module ik8_mod
use mykinds, only: ik => ik8
implicit none
private
public itoc_generic
interface itoc_generic
module procedure itoc
end interface itoc_generic
contains
include 'itoc1.i90'
end module ik8_mod

module generic_recombination
use ik1_mod, only: itoc => itoc_generic
use ik2_mod, only: itoc => itoc_generic
use ik4_mod, only: itoc => itoc_generic
use ik8_mod, only: itoc => itoc_generic
end module generic_recombination

program itoc_test
use mykinds
use generic_recombination
implicit none
integer(ik1) j1
integer(ik2) j2
integer(ik4) j4
integer(ik8) j8

j1 = -huge(j1)
write(*,'(3(i0,1x),a)') kind(j1),j1,len(itoc(j1)),itoc(j1)
j2 = -huge(j2)
write(*,'(3(i0,1x),a)') kind(j2),j2,len(itoc(j2)),itoc(j2)
j4 = -huge(j4)
write(*,'(3(i0,1x),a)') kind(j4),j4,len(itoc(j4)),itoc(j4)
j8 = -huge(j8)
write(*,'(3(i0,1x),a)') kind(j8),j8,len(itoc(j8)),itoc(j8)
end program itoc_test

C:\gfortran\clf\itoc>gfortran itoc_test1.f90 -oitoc_test1

C:\gfortran\clf\itoc>itoc_test1
1 -127 0
2 -32767 0
4 -2147483647 4096 -2147483647
8 -9223372036854775807 4096 -9223372036854775807

I couldn't find a syntax that gfortran liked here. Allocate on
assignment didn't work, always resulting in a zero-length result
variable. gfortran didn't like the typespec in the allocate
statement, and neither a source nor a mold in the allocate
statement gave quite correct results, either. Is there a
syntax that works for current gfortran?

C:\gfortran\clf\itoc>gfortran -v
Built by Equation Solution <http://www.Equation.com>.
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=c:/gcc_equation/bin/../libexec/gcc/x86_64-w64-mingw32/4.8.0/
lto-wrapper.exe
Target: x86_64-w64-mingw32
Configured with:
../gcc-4.8-20120318-mingw/configure --host=x86_64-w64-mingw32 -
-build=x86_64-unknown-linux-gnu --target=x86_64-w64-mingw32 --prefix=/home/gfort
ran/gcc-home/binary/mingw32/native/x86_64/gcc/4.8-20120318 --with-sysroot=/home/
gfortran/gcc-home/binary/mingw32/cross/x86_64/gcc/4.8-20120318 --with-gcc --with
-gnu-ld --with-gnu-as --with-gmp=/home/gfortran/gcc-home/binary/mingw32/native/x
86_64/gmp --with-mpfr=/home/gfortran/gcc-home/binary/mingw32/native/x86_64/mpfr
--with-mpc=/home/gfortran/gcc-home/binary/mingw32/native/x86_64/mpc --with-ppl=/
home/gfortran/gcc-home/binary/mingw32/native/x86_64/ppl --with-cloog=/home/gfort
ran/gcc-home/binary/mingw32/native/x86_64/cloog --with-host-libstdcxx='-lstdc++
-lsupc++ -lm' --enable-targets=i686-w64-mingw32,x86_64-w64-mingw32 --enable-cloo
g-backend=ppl --enable-lto --enable-languages=c,c++,fortran --enable-libgomp
--e
nable-threads=win32 --enable-static --enable-shared=lto-plugin --enable-plugins
--enable-ld=yes --enable-libquadmath --enable-libquadmath-support --disable-nls
--disable-tls --disable-win32-registry
Thread model: win32
gcc version 4.8.0 20120318 (experimental) (GCC)

FX

unread,
Apr 1, 2012, 4:29:15 PM4/1/12
to
> A loop dividing by 10 each time until the result is less than 10
> seems faster and won't have rounding problems.

If you really want fast, you need the fast-logarithm approach:

count = 1

if (x >= 100000000) then
count = count + 8
x = x / 100000000
endif

if (x >= 10000) then
count = count + 4
x = x / 10000
endif

if (x >= 100) then
count = count + 2
x = x / 100
endif

if (x >= 10) count = count + 1

--
FX

robert....@oracle.com

unread,
Apr 2, 2012, 5:33:07 AM4/2/12
to
Given the typical speed of integer division instructions,
it would probably be faster to use a straight sequence of
IF statements. Also, because small integers are typically
more common than large integers, it would be best to start
by testing for small numbers of digits.

Robert Corbett

FX

unread,
Apr 5, 2012, 4:07:57 PM4/5/12
to
> Given the typical speed of integer division instructions, it would
> probably be faster to use a straight sequence of IF statements.

Well, I found that assertion hard to believe, so I tested both on 64-bit
random numbers and the division approach on my Intel Core i7 is faster
than the long series of IF by 50%.

> Also, because small integers are typically
> more common than large integers, it would be best to start
> by testing for small numbers of digits.

Yeah, that is true.

--
FX

#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>

uint64_t ilog10_1 (uint64_t x)
{
uint64_t count = 1;

if (x >= 10000000000000000)
count = count + 16,
x = x / 100000000;

if (x >= 100000000)
count = count + 8,
x = x / 100000000;

if (x >= 10000)
count = count + 4,
x = x / 10000;

if (x >= 100)
count = count + 2,
x = x / 100;

if (x >= 10) count = count + 1;

return count;
}

uint64_t ilog10_2 (uint64_t x)
{
if (x < 10UL) return 0;
if (x < 100UL) return 1;
if (x < 1000UL) return 2;
if (x < 10000UL) return 3;
if (x < 100000UL) return 4;
if (x < 1000000UL) return 5;
if (x < 10000000UL) return 6;
if (x < 100000000UL) return 7;
if (x < 1000000000UL) return 8;
if (x < 10000000000UL) return 9;
if (x < 100000000000UL) return 10;
if (x < 1000000000000UL) return 11;
if (x < 10000000000000UL) return 12;
if (x < 100000000000000UL) return 13;
if (x < 1000000000000000UL) return 14;
if (x < 10000000000000000UL) return 15;
if (x < 100000000000000000UL) return 16;
if (x < 1000000000000000000UL) return 17;
if (x < 10000000000000000000UL) return 18;
return 19;
}

int main (void)
{
unsigned s;
for (int i = 0; i < 10000000; i++)
{
uint64_t i = ((uint64_t) arc4random() << 32) + arc4random();
s += ilog10_1 (i);
s += ilog10_1 (i);
s += ilog10_1 (i);
s += ilog10_1 (i);
s += ilog10_1 (i);
s += ilog10_1 (i);
s += ilog10_1 (i);
s += ilog10_1 (i);
s += ilog10_1 (i);
s += ilog10_1 (i);
}

printf ("%u\n", s);
}

glen herrmannsfeldt

unread,
Apr 5, 2012, 4:43:43 PM4/5/12
to
FX <cou...@alussinan.org> wrote:
>> Given the typical speed of integer division instructions, it would
>> probably be faster to use a straight sequence of IF statements.

> Well, I found that assertion hard to believe, so I tested both
> on 64-bit random numbers and the division approach on my Intel
> Core i7 is faster than the long series of IF by 50%.

Many now will do constant integer division as multiplication by
the reciprocal and then selecting the high half of the double length
product.

You can also write the loop as comparing against powers of 10
in a multiplication loop, but there is a complication due to
overflow that the divide loop doesn't have.

>> Also, because small integers are typically
>> more common than large integers, it would be best to start
>> by testing for small numbers of digits.

> Yeah, that is true.

Scaling law (see Zipf's law) says that you will find numbers with
a probability proportional to log(N). Among others, that has been
used as a test for faked data. People don't tend to select random
numbers following a scaling law.

To do some of the test mentioned, such as the successive IF method,
you need to know the largest possible value. Also, loops will take
advantage of branch prediction, where unrolled loops won't.

-- glen

robert....@oracle.com

unread,
Apr 5, 2012, 9:30:16 PM4/5/12
to
Your new program solves a different problem than the
one posed by the OP and solved by the code in your
previous posting. The original problem involved
32-bit integers. For 32-bit integers, at most 9
comparisons would be executed for the straight
sequence of ifs.

Also, your test is biased for large numbers rather
than small numbers.

Even for the 64-bit case, you would almost certainly
get better performance with a solution that did no
divides. An exception would be if code space were an
issue. Consider something along these lines:

int
ilog10_3(unsigned long long x)
{
if (x < 100000000ULL)
if (x < 10000ULL)
if (x < 100ULL)
if (x < 10ULL)
return 1;
else
return 2;
else
if (x < 1000ULL)
return 3;
else
return 4;
else
if (x < 1000000ULL)
if (x < 100000ULL)
return 5;
else

return 6;
else
if (x < 10000000ULL)
return 7;
else
return 8;
else if (x < 10000000000000000ULL)
if (x < 1000000000000ULL)
if (x < 10000000000ULL)
if (x < 1000000000ULL)
return 9;
else
return 10;
else
if (x < 100000000000ULL)
return 11;
else
return 12;
else
if (x < 100000000000000ULL)
if (x < 10000000000000ULL)
return 13;
else
return 14;
else
if (x < 1000000000000000ULL)
return 15;
else
return 16;
else if (x < 1000000000000000000ULL)
if (x < 100000000000000000ULL)
return 17;
else
return 18;
else
return 19;
}

I apologize for posting C code to this group.

Bob Corbett

glen herrmannsfeldt

unread,
Apr 6, 2012, 1:29:30 AM4/6/12
to
robert....@oracle.com wrote:

> On Apr 5, 1:07 pm, "FX" <coud...@alussinan.org> wrote:
>> > Given the typical speed of integer division instructions, it would
>> > probably be faster to use a straight sequence of IF statements.

>> Well, I found that assertion hard to believe, so I tested both on 64-bit
>> random numbers and the division approach on my Intel Core i7 is faster
>> than the long series of IF by 50%.

(snip of C code)

> Your new program solves a different problem than the
> one posed by the OP and solved by the code in your
> previous posting. The original problem involved
> 32-bit integers. For 32-bit integers, at most 9
> comparisons would be executed for the straight
> sequence of ifs.

But how do you know that they are 32 bits?

I suppose they could be C_INT32_T type, but it fails in possibly
bad ways if a larger value actually comes through.

> Also, your test is biased for large numbers rather
> than small numbers.

> Even for the 64-bit case, you would almost certainly
> get better performance with a solution that did no
> divides. An exception would be if code space were an
> issue. Consider something along these lines:

I suppose, but branch prediction won't help at all, and
so it might run pretty slow. If the loop predicts correctly,
it will be fast until the actual loop exit.

> {
> if (x < 100000000ULL)
> if (x < 10000ULL)
> if (x < 100ULL)
> if (x < 10ULL)
> return 1;
> else
> return 2;
> else
> if (x < 1000ULL)
> return 3;
> else
> return 4;

(snip)

and also, unless run a reeeally huge number of times, we already
discussed it more than the time that would be saved.

> I apologize for posting C code to this group.

-- glen

James Van Buskirk

unread,
Apr 6, 2012, 7:50:59 AM4/6/12
to
<robert....@oracle.com> wrote in message
news:2a1c2927-a22b-4eb0...@z5g2000yqj.googlegroups.com...

> I apologize for posting C code to this group.

I thought the point of the thread was to come up with high-level
constructs to make the original code simpler in some sense.
However, it seems to have devolved into an attempt at optimization.
There have been problems with this in that the attempts have been
written in C, the timing code was inappropriate because it didn't
attempt to strain branch prediction and the attempts were all
slow. Therefore we attempt to remedy the situation:

module funcs
use ISO_C_BINDING
implicit none
contains
function ilog10_1(x) bind(C,name='ilog10_1') result(count)
integer(C_INT64_T), value :: x
integer(C_INT64_T) count

count = 1
if (x >= 10000000000000000_C_INT64_T) then
count = count + 16
x = x / 100000000
end if
if (x >= 100000000) then
count = count + 8
x = x / 100000000
end if
if (x >= 10000) then
count = count + 4
x = x / 10000
end if
if (x >= 100) then
count = count + 2
x = x / 100
end if
if (x >= 10) count = count + 1
end function ilog10_1

function ilog10_2(x) bind(C,name='ilog10_2')
integer(C_INT64_T), value :: x
integer(C_INT64_T) ilog10_2
ilog10_2 = 0

if(x < 10_C_INT64_T) then
ilog10_2 = 1
return
end if
if(x < 100_C_INT64_T) then
ilog10_2 = 2
return
end if
if(x < 1000_C_INT64_T) then
ilog10_2 = 3
return
end if
if(x < 10000_C_INT64_T) then
ilog10_2 = 4
return
end if
if(x < 100000_C_INT64_T) then
ilog10_2 = 5
return
end if
if(x < 1000000_C_INT64_T) then
ilog10_2 = 6
return
end if
if(x < 10000000_C_INT64_T) then
ilog10_2 = 7
return
end if
if(x < 100000000_C_INT64_T) then
ilog10_2 = 8
return
end if
if(x < 1000000000_C_INT64_T) then
ilog10_2 = 9
return
end if
if(x < 10000000000_C_INT64_T) then
ilog10_2 = 10
return
end if
if(x < 100000000000_C_INT64_T) then
ilog10_2 = 11
return
end if
if(x < 1000000000000_C_INT64_T) then
ilog10_2 = 12
return
end if
if(x < 10000000000000_C_INT64_T) then
ilog10_2 = 13
return
end if
if(x < 100000000000000_C_INT64_T) then
ilog10_2 = 14
return
end if
if(x < 1000000000000000_C_INT64_T) then
ilog10_2 = 15
return
end if
if(x < 10000000000000000_C_INT64_T) then
ilog10_2 = 16
return
end if
if(x < 100000000000000000_C_INT64_T) then
ilog10_2 = 17
return
end if
if(x < 1000000000000000000_C_INT64_T) then
ilog10_2 = 18
return
end if
ilog10_2 = 19
return
end function ilog10_2

function ilog10_3(x) bind(C,name='ilog10_3')
integer(C_INT64_T), value :: x
integer(C_INT64_T) ilog10_3

if (x < 100000000_C_INT64_T) then
if (x < 10000_C_INT64_T) then
if (x < 100_C_INT64_T) then
if (x < 10_C_INT64_T) then
ilog10_3 = 1
return
else
ilog10_3 = 2
return
end if
else if (x < 1000_C_INT64_T) then
ilog10_3 = 3
return
else
ilog10_3 = 4
return
end if
else if (x < 1000000_C_INT64_T) then
if (x < 100000_C_INT64_T) then
ilog10_3 = 5
return
else
ilog10_3 = 6
return
end if
else if (x < 10000000_C_INT64_T) then
ilog10_3 = 7
return
else
ilog10_3 = 8
return
end if
else if (x < 10000000000000000_C_INT64_T) then
if (x < 1000000000000_C_INT64_T) then
if (x < 10000000000_C_INT64_T) then
if (x < 1000000000_C_INT64_T) then
ilog10_3 = 9
return
else
ilog10_3 = 10
return
end if
else if (x < 100000000000_C_INT64_T) then
ilog10_3 = 11
return
else
ilog10_3 = 12
return
end if
else if (x < 100000000000000_C_INT64_T) then
if (x < 10000000000000_C_INT64_T) then
ilog10_3 = 13
return
else
ilog10_3 = 14
return
end if
else if (x < 1000000000000000_C_INT64_T) then
ilog10_3 = 15
return
else
ilog10_3 = 16
return
end if
else if (x < 1000000000000000000_C_INT64_T) then
if (x < 100000000000000000_C_INT64_T) then
ilog10_3 = 17
return
else
ilog10_3 = 18
return
end if
else
ilog10_3 = 19
return
end if
end function ilog10_3

function ilog10_4(x) bind(C,name='ilog10_4')
integer(C_INT64_T), value :: x
integer(C_INT64_T) ilog10_4
integer(C_INT64_T) alternate
integer lz2
integer(C_INT8_T), parameter :: a(-1:32) = [&
19,19,19,18,17,17,16,16,15,14,14,13,13,12, &
11,11,10,10,9,8,8,7,7,6,5,5,4,4,3,2,2,1,1,0]
integer(C_INT64_T), parameter :: b(0:32) = [&
1000000000000000000_C_INT64_T,1000000000000000000_C_INT64_T,&
1000000000000000000_C_INT64_T, 100000000000000000_C_INT64_T,&
10000000000000000_C_INT64_T, 10000000000000000_C_INT64_T,&
1000000000000000_C_INT64_T, 1000000000000000_C_INT64_T,&
100000000000000_C_INT64_T, 10000000000000_C_INT64_T,&
10000000000000_C_INT64_T, 1000000000000_C_INT64_T,&
1000000000000_C_INT64_T, 100000000000_C_INT64_T,&
10000000000_C_INT64_T, 10000000000_C_INT64_T,&
1000000000_C_INT64_T, 1000000000_C_INT64_T,&
100000000_C_INT64_T, 10000000_C_INT64_T,&
10000000_C_INT64_T, 1000000_C_INT64_T,&
1000000_C_INT64_T, 100000_C_INT64_T,&
10000_C_INT64_T, 10000_C_INT64_T,&
1000_C_INT64_T, 1000_C_INT64_T,&
100_C_INT64_T, 10_C_INT64_T,&
10_C_INT64_T, 1_C_INT64_T, 0_C_INT64_T]
lz2 = ishft(leadz(x),-1)
ilog10_4 = a(lz2)
alternate = a(lz2-1)
if(x >= b(lz2)) ilog10_4 = alternate
end function ilog10_4

subroutine sub(a,n)
integer n
integer(C_INT64_T) a(n)

a = cshift(a,1)
end subroutine sub
end module funcs

module utils
use ISO_C_BINDING
implicit none
private
public utils_init, rdtsc
integer(C_INT8_T) BAD_STUFF(32)
data BAD_STUFF / &
Z'31', Z'C0', Z'40', Z'75', Z'0A', Z'0F', Z'31', Z'48', &
Z'C1', Z'E2', Z'20', Z'48', Z'09', Z'D0', Z'C3', Z'0F', &
Z'31', Z'C3', Z'90', Z'90', Z'90', Z'90', Z'90', Z'90', &
Z'90', Z'90', Z'90', Z'90', Z'90', Z'90', Z'90', Z'90'/
abstract interface
function rdtsc_template() bind(C)
use ISO_C_BINDING
implicit none
integer(C_INT64_T) :: rdtsc_template
end function rdtsc_template
end interface
procedure(rdtsc_template), pointer :: rdtsc => NULL()
interface
function VirtualAlloc(lpAddress, dwSize, flAllocationType, &
flProtect) bind(C, name = 'VirtualAlloc')
use ISO_C_BINDING
implicit none
!GCC$ ATTRIBUTES STDCALL :: VirtualAlloc
type(C_PTR) VirtualAlloc
type(C_PTR), value :: lpAddress
integer(C_SIZE_T), value :: dwSize
integer(C_LONG), value :: flAllocationType
integer(C_LONG), value :: flProtect
end function VirtualAlloc
function GetLastError() bind(C,name='GetLastError')
use ISO_C_BINDING
implicit none
!GCC$ ATTRIBUTES STDCALL :: GetLastError
integer(C_LONG) GetLastError
end function GetLastError
end interface
contains
subroutine utils_init()
type(C_PTR) address
integer(C_INTPTR_T) temp
integer(C_LONG) error
integer(C_INT8_T), pointer :: temp_ptr(:)
type(C_FUNPTR) fun_address
logical :: first = .TRUE.

if(.NOT.first) return
first = .FALSE.
address = VirtualAlloc(C_NULL_PTR, &
size(BAD_STUFF,kind=C_SIZE_T),int(Z'1000',C_LONG), &
int(Z'40',C_LONG))
if(.NOT.C_ASSOCIATED(address)) then
error = GetLastError()
write(*,'(a,z0,a)') "Error Z'",error,"' allocating memory"
stop
end if
call C_F_POINTER(address, temp_ptr, shape(BAD_STUFF))
temp_ptr = BAD_STUFF
temp = transfer(address, temp)
fun_address = transfer(temp, fun_address)
call C_F_PROCPOINTER(fun_address, rdtsc)
end subroutine utils_init
end module utils

program main
use funcs
use ISO_C_BINDING
use utils
implicit none
integer, parameter :: nmethod = 4
type(C_FUNPTR) cfp(4)
integer(C_INT64_T) pow2, pow10, n
integer col
procedure(ilog10_1), pointer :: ilog10
integer out(nmethod)
character(80) fmt
integer(C_INT64_T) ts1,ts2
integer,parameter :: ntrial = 6
integer(C_INT64_T) times(ntrial,nmethod)
integer, parameter :: ndata = 500
integer(C_INT64_T) data(ndata)
integer row, trial
integer nseed
integer, allocatable :: seed(:)
real(C_DOUBLE) harvest(3)
integer(C_INT64_T) chk(ntrial,nmethod)
integer(C_INT64_T) time(ntrial,nmethod)
integer(C_INT64_T) ndigit,lowbits

cfp(1) = C_FUNLOC(ilog10_1)
cfp(2) = C_FUNLOC(ilog10_2)
cfp(3) = C_FUNLOC(ilog10_3)
cfp(4) = C_FUNLOC(ilog10_4)
pow2 = 1
pow10 = 1
n = 0
write(fmt,'(a,i0,a)') '(a19,',nmethod,'(1x,i4),a)'
write(*,fmt) 'input\method',(col,col=1,nmethod)
write(fmt,'(a,i0,a)') '(i19,',nmethod,'(1x,i4),a)'
do
do col = 1, nmethod
call C_F_PROCPOINTER(cfp(col),ilog10)
out(col) = ilog10(n)
end do
write(*,fmt) n,out, &
trim(merge(' *',' ',any(out /= minval(out,1))))
if(n == huge(n)) then
exit
else if(mod(n,2_C_INT64_T) == 1) then
n = n+1
else if(pow10 == range(n)) then
n = 2**pow2-1
pow2 = pow2+1
else if(2**pow2 < 10**pow10) then
n = 2**pow2-1
pow2 = pow2+1
else
n = 10**pow10-1
pow10 = pow10+1
end if
end do
call utils_init
call random_seed()
call random_seed(size=nseed)
allocate(seed(nseed))
call random_seed(get=seed)
ts1 = rdtsc()
seed = ieor(int(seed,C_INT64_T),ts1)
call random_seed(put=seed)
do trial = 1, ntrial
call random_number(harvest)
ndigit = 19*harvest(1)+1
ndigit = min(19,ndigit)
if(ndigit == 19) then
data(trial) = harvest(2)*(huge(ndigit)-10**(ndigit-1))
data(trial) = min(huge(ndigit)-10**(ndigit-1),data(trial))
data(trial) = 10**(ndigit-1)+data(trial)
lowbits = harvest(3)*2_C_INT64_T**30
data(trial) = ieor(data(trial),lowbits)
else
data(trial) = harvest(2)*(10**ndigit-10**(ndigit-1))
data(trial) = 10**(ndigit-1)+data(trial)
lowbits = harvest(3)*2_C_INT64_T**30
if(2_C_INT64_T**30 < data(trial)) then
data(trial) = ieor(data(trial),lowbits)
end if
data(trial) = min(data(trial),10**ndigit-1)
end if
end do
do row = 1, ntrial
do col = 1, nmethod
call C_F_PROCPOINTER(cfp(col),ilog10)
ts1 = rdtsc()
chk(row,col) = 0
do trial = 1, ndata
chk(row,col) = chk(row,col)+ilog10(data(trial))
end do
ts2 = rdtsc()
time(row,col) = ts2-ts1
end do
call sub(data,ndata)
end do
write(fmt,'(a,i0,a)') '((a12,',nmethod,'(1x,i6)))'
write(*,fmt) 'trial\method',(col,col=1,nmethod)
write(fmt,'(a,i0,a)') '((i12,',nmethod,'(1x,i6)))'
write(*,fmt) (row,(time(row,col),col=1,nmethod),row=1,ntrial)
write(*,'(a,i0)') 'checksum = ',sum(chk)
end program main

Output with

gfortran -O2 -fno-range-check ilog10.f90 -oilog10

input\method 1 2 3 4
0 1 1 1 1
1 1 1 1 1
2 1 1 1 1
3 1 1 1 1
4 1 1 1 1
7 1 1 1 1
8 1 1 1 1
9 1 1 1 1
10 2 2 2 2
15 2 2 2 2
16 2 2 2 2
31 2 2 2 2
32 2 2 2 2
63 2 2 2 2
64 2 2 2 2
99 2 2 2 2
100 3 3 3 3
127 3 3 3 3
128 3 3 3 3
255 3 3 3 3
256 3 3 3 3
511 3 3 3 3
512 3 3 3 3
999 3 3 3 3
1000 4 4 4 4
1023 4 4 4 4
1024 4 4 4 4
2047 4 4 4 4
2048 4 4 4 4
4095 4 4 4 4
4096 4 4 4 4
8191 4 4 4 4
8192 4 4 4 4
9999 4 4 4 4
10000 5 5 5 5
16383 5 5 5 5
16384 5 5 5 5
32767 5 5 5 5
32768 5 5 5 5
65535 5 5 5 5
65536 5 5 5 5
99999 5 5 5 5
100000 6 6 6 6
131071 6 6 6 6
131072 6 6 6 6
262143 6 6 6 6
262144 6 6 6 6
524287 6 6 6 6
524288 6 6 6 6
999999 6 6 6 6
1000000 7 7 7 7
1048575 7 7 7 7
1048576 7 7 7 7
2097151 7 7 7 7
2097152 7 7 7 7
4194303 7 7 7 7
4194304 7 7 7 7
8388607 7 7 7 7
8388608 7 7 7 7
9999999 7 7 7 7
10000000 8 8 8 8
16777215 8 8 8 8
16777216 8 8 8 8
33554431 8 8 8 8
33554432 8 8 8 8
67108863 8 8 8 8
67108864 8 8 8 8
99999999 8 8 8 8
100000000 9 9 9 9
134217727 9 9 9 9
134217728 9 9 9 9
268435455 9 9 9 9
268435456 9 9 9 9
536870911 9 9 9 9
536870912 9 9 9 9
999999999 9 9 9 9
1000000000 10 10 10 10
1073741823 10 10 10 10
1073741824 10 10 10 10
2147483647 10 10 10 10
2147483648 10 10 10 10
4294967295 10 10 10 10
4294967296 10 10 10 10
8589934591 10 10 10 10
8589934592 10 10 10 10
9999999999 10 10 10 10
10000000000 11 11 11 11
17179869183 11 11 11 11
17179869184 11 11 11 11
34359738367 11 11 11 11
34359738368 11 11 11 11
68719476735 11 11 11 11
68719476736 11 11 11 11
99999999999 11 11 11 11
100000000000 12 12 12 12
137438953471 12 12 12 12
137438953472 12 12 12 12
274877906943 12 12 12 12
274877906944 12 12 12 12
549755813887 12 12 12 12
549755813888 12 12 12 12
999999999999 12 12 12 12
1000000000000 13 13 13 13
1099511627775 13 13 13 13
1099511627776 13 13 13 13
2199023255551 13 13 13 13
2199023255552 13 13 13 13
4398046511103 13 13 13 13
4398046511104 13 13 13 13
8796093022207 13 13 13 13
8796093022208 13 13 13 13
9999999999999 13 13 13 13
10000000000000 14 14 14 14
17592186044415 14 14 14 14
17592186044416 14 14 14 14
35184372088831 14 14 14 14
35184372088832 14 14 14 14
70368744177663 14 14 14 14
70368744177664 14 14 14 14
99999999999999 14 14 14 14
100000000000000 15 15 15 15
140737488355327 15 15 15 15
140737488355328 15 15 15 15
281474976710655 15 15 15 15
281474976710656 15 15 15 15
562949953421311 15 15 15 15
562949953421312 15 15 15 15
999999999999999 15 15 15 15
1000000000000000 16 16 16 16
1125899906842623 16 16 16 16
1125899906842624 16 16 16 16
2251799813685247 16 16 16 16
2251799813685248 16 16 16 16
4503599627370495 16 16 16 16
4503599627370496 16 16 16 16
9007199254740991 16 16 16 16
9007199254740992 16 16 16 16
9999999999999999 16 16 16 16
10000000000000000 25 17 17 17 *
18014398509481983 25 17 17 17 *
18014398509481984 25 17 17 17 *
36028797018963967 25 17 17 17 *
36028797018963968 25 17 17 17 *
72057594037927935 25 17 17 17 *
72057594037927936 25 17 17 17 *
99999999999999999 25 17 17 17 *
100000000000000000 26 18 18 18 *
144115188075855871 26 18 18 18 *
144115188075855872 26 18 18 18 *
288230376151711743 26 18 18 18 *
288230376151711744 26 18 18 18 *
576460752303423487 26 18 18 18 *
576460752303423488 26 18 18 18 *
1152921504606846975 27 19 19 19 *
1152921504606846976 27 19 19 19 *
2305843009213693951 27 19 19 19 *
2305843009213693952 27 19 19 19 *
4611686018427387903 27 19 19 19 *
4611686018427387904 27 19 19 19 *
9223372036854775807 27 19 19 19 *
trial\method 1 2 3 4
1 17220 14520 12220 8520
2 28120 24250 19660 14240
3 27480 23370 19140 14120
4 27330 23370 19230 14120
5 27330 23210 19220 14130
6 27260 22860 19060 14130
checksum = 78432

James Van Buskirk

unread,
Apr 6, 2012, 10:18:30 AM4/6/12
to
"James Van Buskirk" <not_...@comcast.net> wrote in message
news:jlmlb3$98h$1...@dont-email.me...

> do trial = 1, ntrial
> call random_number(harvest)

> trial\method 1 2 3 4
> 1 17220 14520 12220 8520
> 2 28120 24250 19660 14240
> 3 27480 23370 19140 14120
> 4 27330 23370 19230 14120
> 5 27330 23210 19220 14130
> 6 27260 22860 19060 14130
> checksum = 78432

Careless. Change above to:

do trial = 1, ndata
call random_number(harvest)

And new output is:

trial\method 1 2 3 4
1 25430 21080 17230 7680
2 24200 19820 16540 7300
3 24050 19160 16540 7380
4 24090 19180 16530 7380
5 23980 19240 16580 7360
6 24000 19100 16480 7360
checksum = 123960

At least this got the checksum right. The times are coming out
more consistent, too. The set of times from my first post seems
to have been an anomaly.

Terence

unread,
Apr 7, 2012, 2:59:42 AM4/7/12
to
Surely the fastest way to find the needed display width of a floating point
variable would be based on the following basic idea: to first locate the bit
representation of the exponent and get a first order guess from using this
binary number to index a table of needed field widths, then locate the
higher bits of the mantissa to adjust (by boundary table indexing) whether
to add one digit width or not.

I did something like this quite recently, to determine how to display some
extremely-variable astophysical data in table sets. The original code looped
some 130 billion times.

A less-optimal way is to write the value, formatted, to an internal string,
using a general E format, and analyse the resultant field to make the final
decision.


robert....@oracle.com

unread,
Apr 7, 2012, 7:37:05 PM4/7/12
to
On Apr 6, 11:59 pm, "Terence" <tbwri...@bigpond.net.au> wrote:
> Surely the fastest way to find the needed display width of a floating point
> variable would be based on the following basic idea: to first locate the bit
> representation of the exponent and get a first order guess from using this
> binary number to index a table of needed field widths, then locate the
> higher bits of the mantissa to adjust (by boundary table indexing) whether
> to add one digit width or not.

The OP was concerned with conversion integer values, not
floating-point.

I have seen the function you describe referred to as logd,
I assume in analogy to the IEEE function logb. The analogy
is not exact. I have seen such functions used in the
implementation of the G edit descriptor, where it almost
always causes trouble. The problem is that it does not
take rounding into account. I know of one implementation
that would print a field of asterisks for values close to
but less than powers of ten, because it did not account for
rounding.

Bob Corbett

glen herrmannsfeldt

unread,
Apr 7, 2012, 9:13:29 PM4/7/12
to
robert....@oracle.com wrote:

(snip)
> The OP was concerned with conversion integer values, not
> floating-point.

> I have seen the function you describe referred to as logd,
> I assume in analogy to the IEEE function logb. The analogy
> is not exact. I have seen such functions used in the
> implementation of the G edit descriptor, where it almost
> always causes trouble. The problem is that it does not
> take rounding into account.

I have seen many times the suggestiong of using LOG10(FLOAT(n))
to find the appropriate power of 10 for integers. I suspect that
rounding could cause problems there, though I don't know of
any actual implementations.

> I know of one implementation
> that would print a field of asterisks for values close to
> but less than powers of ten, because it did not account for
> rounding.

Oops.

-- glen

robin....@gmail.com

unread,
Apr 7, 2012, 10:56:02 PM4/7/12
to
On Sunday, 8 April 2012 11:13:29 UTC+10, glen herrmannsfeldt wrote:

> I have seen many times the suggestiong of using LOG10(FLOAT(n))
> to find the appropriate power of 10 for integers. I suspect that
> rounding could cause problems there, though I don't know of
> any actual implementations.

Such a suggestion sounds silly.
It's gross overkill.
All that's required is a simple division by 10
to extract each digit in turn.

robin....@gmail.com

unread,
Apr 7, 2012, 11:09:12 PM4/7/12
to
For fast, neither log nor integer division is required.
Log is silly for what is basically a trivial, straight-forward operation.

Division by 10 is accomplished by a few shifts and adds.
Successive divisions by 10 yield the decimal digits of the converted value in
turn.


Dick Hendrickson

unread,
Apr 8, 2012, 4:40:11 PM4/8/12
to
On 4/7/12 10:09 PM, robin....@gmail.com wrote:
> On Monday, 2 April 2012 06:29:15 UTC+10, FX wrote:
>>> A loop dividing by 10 each time until the result is less than 10
>>> seems faster and won't have rounding problems.
>>
>> If you really want fast, you need the fast-logarithm approach:
>>
>> count = 1
>>
>> if (x>= 100000000) then
>> count = count + 8
>> x = x / 100000000
>> endif
>>
>> if (x>= 10000) then
>> count = count + 4
>> x = x / 10000
>> endif
>>
>> if (x>= 100) then
>> count = count + 2
>> x = x / 100
>> endif
>>
>> if (x>= 10) count = count + 1
>
> For fast, neither log nor integer division is required.
> Log is silly for what is basically a trivial, straight-forward operation.
Today, memory is cheap. If you actually need fast, do a table look-up.
if (x > some big number) then
compute answer
else
answer = table(x)
endif

>
> Division by 10 is accomplished by a few shifts and adds.
> Successive divisions by 10 yield the decimal digits of the converted value in
> turn.
>
Are you sure? ;) Multiplication by 10 is easy, I've not so sure about
divide.

Dick Hendrickson
>

Terence

unread,
Apr 9, 2012, 7:10:52 PM4/9/12
to
I think Robin was thing of MULIPLY by ten, which was often done by adding a
single(2) and a triple (8) left bit shift, in the old days of small memory
and and slow-cpu .


robert....@oracle.com

unread,
Apr 12, 2012, 8:57:47 PM4/12/12
to
On Apr 5, 1:43 pm, glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
> FX <coud...@alussinan.org> wrote:
> >> Given the typical speed of integer division instructions, it would
> >> probably be faster to use a straight sequence of IF statements.
> > Well, I found that assertion hard to believe, so I tested both
> > on 64-bit random numbers and the division approach on my Intel
> > Core i7 is faster than the long series of IF by 50%.
>
> Many now will do constant integer division as multiplication by
> the reciprocal and then selecting the high half of the double length
> product.
>
> You can also write the loop as comparing against powers of 10
> in a multiplication loop, but there is a complication due to
> overflow that the divide loop doesn't have.
>
> >> Also, because small integers are typically
> >> more common than large integers, it would be best to start
> >> by testing for small numbers of digits.
> > Yeah, that is true.
>
> Scaling law (see Zipf's law) says that you will find numbers with
> a probability proportional to log(N). Among others, that has been
> used as a test for faked data. People don't tend to select random
> numbers following a scaling law.

I suspect that the scaling law you cite does not apply to the
distribution of integer values read or written by programs.
In my experience the two integer values that appear most often
are zero and one. My information is admittedly anecdotal.

Bob Corbett

glen herrmannsfeldt

unread,
Apr 13, 2012, 1:22:16 AM4/13/12
to
robert....@oracle.com wrote:

(snip)
>> Scaling law (see Zipf's law) says that you will find numbers with
>> a probability proportional to log(N). Among others, that has been
>> used as a test for faked data. People don't tend to select random
>> numbers following a scaling law.

> I suspect that the scaling law you cite does not apply to the
> distribution of integer values read or written by programs.
> In my experience the two integer values that appear most often
> are zero and one. My information is admittedly anecdotal.

Yes there are some cases where it doesn't work very well,
especially on the low end. Note that according to such scaling
law you can never have zero! (Can a city have a population of
zero and still be a city?)

Also, if you take something like the height of people in feet
you will find that it isn't very close.

Even so, I think you will be surprised how well it works for
some data sets when you might not expect it.

-- glen

robin....@gmail.com

unread,
Apr 14, 2012, 8:48:12 PM4/14/12
to
On Monday, 9 April 2012 06:40:11 UTC+10, Dick Hendrickson wrote:
> On 4/7/12 10:09 PM, ro...@gmail.com wrote:


> > Division by 10 is accomplished by a few shifts and adds.
> > Successive divisions by 10 yield the decimal digits of the converted value in
> > turn.
> >
> Are you sure? ;) Multiplication by 10 is easy, I've not so sure about
> divide.

As I said, division by 10 is easy.
It requires a few shifts and adds.

robin....@gmail.com

unread,
Apr 14, 2012, 8:50:31 PM4/14/12
to
I was not thinking of multiplication.

I was referring to division by 10,
which is accomplished by shifting and adding.

robin....@gmail.com

unread,
Apr 14, 2012, 8:57:09 PM4/14/12
to
Unbelievable.

In PL/I, all that's needed is i = c;

And if leading blanks are not required, i = trim(c); will do.

> --

robin....@gmail.com

unread,
Apr 14, 2012, 9:02:24 PM4/14/12
to
On Sunday, 1 April 2012 00:13:19 UTC+11, James Van Buskirk wrote:
Unbelievable!
All that's required in PL/I is c = i;

And if leading blanks are not required, then c = trim(i); will do.
; will do

robin....@gmail.com

unread,
Apr 14, 2012, 9:07:22 PM4/14/12
to
In PL/I, length(v) will do the job.


IN PL/I, length(v) will do the j
0 new messages