What's the proper isdigit() in Go 1?


dlin

Mar 1, 2012, 4:54:30 AM
to golan...@googlegroups.com

line := []byte{'a','b','c'}

unicode.IsDigit(rune(line[0]))

line is encoded as plain ASCII.
Does that cause a performance issue?

Or is there another isdigit()?

roger peppe

Mar 1, 2012, 5:09:14 AM
to dlin, golan...@googlegroups.com
On 1 March 2012 09:54, dlin <dli...@gmail.com> wrote:
>
> line := []byte{'a','b','c'}
>
> unicode.IsDigit(rune(line[0]))
>
> line  is encoded as normal  ASCII.
> Does that cause performance issue?

i doubt it will cause a performance issue (i think unicode.IsDigit
will be inlined), but if it does, you could always do the
traditional '0' <= r && r <= '9'.
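
A minimal sketch of the traditional test applied to the byte slice from the question. The helper name isDigit is made up for illustration; it is not a standard library function.

```go
package main

import "fmt"

// isDigit is a hypothetical helper (not part of the standard library).
// It reports whether b is one of the ASCII digits '0' through '9'.
func isDigit(b byte) bool {
	return '0' <= b && b <= '9'
}

func main() {
	// Same shape as the slice in the question, with one digit added.
	line := []byte{'a', 'b', 'c', '1'}
	for _, b := range line {
		fmt.Printf("%c %v\n", b, isDigit(b))
	}
}
```

Working on the byte directly also avoids the conversion to rune, which is only safe here because the input is known to be ASCII.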

Anthony Martin

Mar 1, 2012, 5:51:10 AM
to roger peppe, dlin, golan...@googlegroups.com

unicode.IsDigit won't be inlined with the current compilers
since it declares a new variable in its body. There are lots
of potential improvements that can be made to the inlining code
and I'm sure we'll start to see them once Go 1 is out.

Anthony

Anthony Martin

Mar 1, 2012, 6:01:03 AM
to roger peppe, dlin, golan...@googlegroups.com

To be precise, unicode.IsDigit won't be inlined because it
calls unicode.Is, which declares a new variable in its body.

A bit too quick on the trigger,
Anthony

roger peppe

Mar 1, 2012, 6:06:32 AM
to Anthony Martin, dlin, golan...@googlegroups.com

yes, i was wondering that.

actually, i'm a bit surprised that it won't inline functions
that make calls that cannot be inlined. i suppose the
reasoning is that it wouldn't save much because the
call may be made anyway (it could, of course, save quite
a bit in this particular situation).

Stefan Nilsson

Mar 1, 2012, 6:41:35 AM
to golan...@googlegroups.com
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. 
– Donald E. Knuth

I would suggest you do the following (in order):

1. Use unicode.IsDigit in your code whenever you need to check if a character is a digit.
2. If required, measure the performance of your code.
3. If unicode.IsDigit turns out to be a performance bottleneck, try to do something about it.
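
Step 2 could be sketched roughly as below, using testing.Benchmark from a plain program. The input in sample and the helper names countUnicode and countASCII are assumptions made up for illustration; real measurements should use your own data and code path.

```go
package main

import (
	"fmt"
	"testing"
	"unicode"
)

// sample is made-up ASCII input, for illustration only.
var sample = []byte("abc 0123456789 xyz")

// sink stores results so the compiler cannot discard the loops.
var sink int

// countUnicode counts digits in line using unicode.IsDigit.
func countUnicode(line []byte) int {
	n := 0
	for _, c := range line {
		if unicode.IsDigit(rune(c)) {
			n++
		}
	}
	return n
}

// countASCII counts digits in line using the plain ASCII range test.
func countASCII(line []byte) int {
	n := 0
	for _, c := range line {
		if '0' <= c && c <= '9' {
			n++
		}
	}
	return n
}

func main() {
	u := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = countUnicode(sample)
		}
	})
	a := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = countASCII(sample)
		}
	})
	fmt.Println("unicode.IsDigit:", u)
	fmt.Println("ASCII test:     ", a)
}
```

The numbers depend on the compiler version and machine, so treat any difference as a hint to investigate further rather than a verdict.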

Anthony Martin

Mar 1, 2012, 6:57:01 AM
to roger peppe, dlin, golan...@googlegroups.com

You can pass the -llll flag (yes, four 'l's) to the compiler
and it will try to inline non-leaf functions. Compiling the
unicode package like this will indeed cause IsDigit to be
inlined.

Anthony

roger peppe

Mar 1, 2012, 7:20:58 AM
to Anthony Martin, dlin, golan...@googlegroups.com

i had tried this, but it didn't work for me. i'd expect to see a
"can inline" message below.

% cat tst.go
package main

import (
	"unicode"
)

func main() {
	for i := rune(0); i < 200; i++ {
		if unicode.IsDigit(i) {
			println(true)
		}
	}
}
% go tool 6g -lllll -m $%
%

Mue

Mar 1, 2012, 7:30:16 AM
to golan...@googlegroups.com
On Thursday, March 1, 2012 12:41:35 PM UTC+1, snilsson wrote:
> We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
> – Donald E. Knuth
>
> I would suggest you do the following (in order):
>
> 1. Use unicode.IsDigit in your code whenever you need to check if a character is a digit.
> 2. If required, measure the performance of your code.
> 3. If unicode.IsDigit turns out to be a performance bottleneck, try to do something about it.

Yep, full ack, no premature optimization. It often leads to hard-to-maintain code that leaves the "next generation", or oneself, asking later why the hell a solution was chosen.

If there's no real need for an optimization, just don't do it. And once there is, analyze the real performance leaks first, e.g. I/O. In the case of the digit question above, I would like to know what string sizes are needed to see a difference of at least one second.

mue

Andy Balholm

Mar 1, 2012, 11:45:02 AM
to golan...@googlegroups.com
Besides the performance question, there is the question of semantics.

Why are you checking for a digit? If you get a ٦, will you be able to do anything useful with it? If not, you're better off with the pure ASCII test ('0' <= r && r <= '9').

If your input is guaranteed to be pure ASCII, either way would work. I would use the ASCII test, but unless it's in a tight loop, firing up your text editor to change it will probably use more CPU cycles than you would save by changing it.
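
A small sketch of the semantic difference, using the ٦ from above (U+0666, ARABIC-INDIC DIGIT SIX). The helper isASCIIDigit is a made-up name for illustration, and strconv.Atoi stands in for "doing something useful" with the digit.

```go
package main

import (
	"fmt"
	"strconv"
	"unicode"
)

// isASCIIDigit is a hypothetical helper (not part of the standard library).
func isASCIIDigit(r rune) bool {
	return '0' <= r && r <= '9'
}

func main() {
	r := '٦' // U+0666 ARABIC-INDIC DIGIT SIX

	fmt.Println(unicode.IsDigit(r)) // true: it is a Unicode decimal digit
	fmt.Println(isASCIIDigit(r))    // false: it is outside '0'..'9'

	// strconv.Atoi only understands ASCII digits, so this fails.
	_, err := strconv.Atoi(string(r))
	fmt.Println(err != nil) // true
}
```

So code that accepts a character via unicode.IsDigit and then hands it to strconv.Atoi can end up with an error it did not expect.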

Russ Cox

Mar 8, 2012, 10:53:50 AM
to roger peppe, Anthony Martin, dlin, golan...@googlegroups.com
On Thu, Mar 1, 2012 at 06:06, roger peppe <rogp...@gmail.com> wrote:
> actually, i'm a bit surprised that it won't inline functions
> that make calls that cannot be inlined.

s/ that cannot be inlined//

The goal is to preserve information on crashes.
I've spent too long cursing inliners that make
programs undebuggable.

Russ

roger peppe

Mar 8, 2012, 11:16:43 AM
to r...@golang.org, Anthony Martin, dlin, golan...@googlegroups.com

is it possible that in the future we could have a one-to-many PC->line number
mapping and so remove this restriction?

or is that not the issue?

Russ Cox

Mar 8, 2012, 11:55:08 AM
to roger peppe, Anthony Martin, dlin, golan...@googlegroups.com
On Thu, Mar 8, 2012 at 11:16, roger peppe <rogp...@gmail.com> wrote:
> is it possible that in the future we could have a one-to-many PC->line number
> mapping and so remove this restriction?
>
> or is that not the issue?

yes, it is possible. i am still hoping for jetpacks though.

roger peppe

Mar 8, 2012, 12:20:54 PM
to r...@golang.org, Anthony Martin, dlin, golan...@googlegroups.com

jetpacks would be ideal, of course, but i'm easily satisfied.

Ian Lance Taylor

Mar 8, 2012, 1:19:04 PM
to roger peppe, r...@golang.org, Anthony Martin, dlin, golan...@googlegroups.com
roger peppe <rogp...@gmail.com> writes:

> is it possible that in the future we could have a one-to-many PC->line number
> mapping and so remove this restriction?

For the record, DWARF version 4 has this feature. It has a way to
record inlining information for, in effect, each instruction. That
permits debuggers to show a complete backtrace at each instruction,
including inlined functions. You can see this in current versions of
gcc and gdb.

Ian
