Is there an "isAscii" function?


Ken MacDonald

Jun 11, 2014, 10:33:50 AM
to golan...@googlegroups.com
Hi,
I'm looking for a function that will tell me whether a string is entirely valid ASCII. One field in our input has to be checked this way; I already verify that the other fields are valid UTF-8, for which a nice function is available.
Ken

Johann Höchtl

Jun 11, 2014, 10:46:22 AM
to golan...@googlegroups.com
Technically, and AFAIK, the only valid way to check for ASCII-ness is to check that every byte fits in 7 bits. If the highest bit of a byte is set, it is not ASCII.
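
A minimal sketch of that high-bit test in Go, for illustration only. The function name isASCIIBytes is made up here, and the loop indexes the string byte by byte rather than decoding runes:

```go
package main

import "fmt"

// isASCIIBytes (hypothetical name) reports whether every byte of s
// has its highest bit clear, i.e. lies in the range 0x00..0x7F.
func isASCIIBytes(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i]&0x80 != 0 {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isASCIIBytes("hello"))  // true
	fmt.Println(isASCIIBytes("Höchtl")) // false: ö is encoded as two bytes >= 0x80
}
```

Note that the empty string passes this test, which may or may not be what an input validator wants.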

James Bardin

Jun 11, 2014, 10:46:51 AM
to golan...@googlegroups.com

Matthew Kane

Jun 11, 2014, 10:49:23 AM
to James Bardin, golang-nuts
The unicode package has the MaxASCII constant for this purpose.
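
For illustration, one way the constant might be used; this is a sketch, not necessarily the code anyone in the thread had in mind. unicode.MaxASCII is the rune '\u007F'; the function name isASCII is an assumption. Ranging over a string decodes runes, and invalid UTF-8 bytes decode to U+FFFD, which is above MaxASCII and so is rejected too:

```go
package main

import (
	"fmt"
	"unicode"
)

// isASCII (hypothetical name) reports whether every rune in s is at
// most unicode.MaxASCII. Invalid UTF-8 decodes to U+FFFD and fails.
func isASCII(s string) bool {
	for _, r := range s {
		if r > unicode.MaxASCII {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isASCII("golang-nuts")) // true
	fmt.Println(isASCII("caf\u00e9"))   // false
}
```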

--
matt kane
twitter: the_real_mkb / nynexrepublic
http://hydrogenproject.com

James Bardin

Jun 11, 2014, 10:51:16 AM
to golan...@googlegroups.com, j.ba...@gmail.com


On Wednesday, June 11, 2014 10:49:23 AM UTC-4, mkb wrote:
> The unicode package has the MaxASCII constant for this purpose.

I knew it was there somewhere.
I answered too quickly; you should probably check for null bytes as well.


Kevin Gillette

Jun 11, 2014, 11:33:02 AM
to golan...@googlegroups.com, j.ba...@gmail.com


On Wednesday, June 11, 2014 8:51:16 AM UTC-6, James Bardin wrote:
> On Wednesday, June 11, 2014 10:49:23 AM UTC-4, mkb wrote:
> > The unicode package has the MaxASCII constant for this purpose.
> I knew it was there somewhere.

It's one of those nice-to-have things, but the reality is that the maximum ASCII value is well known, well defined, and will never change; it wouldn't have been atrocious to use a "magic number" for that (because if everyone knows about it, it's not magic).

> I answered too quickly, you should probably check for null bytes as well.

The OP asked for an IsASCII function, not an IsBinary function; _everything_ below 0x80 is ASCII, without exception. You might be thinking of IsPrintableASCII, which is not the same thing.
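
To make the distinction concrete, here is a rough sketch of what an IsPrintableASCII-style check could look like. It assumes "printable" means the range 0x20 (space) through 0x7E (tilde); whether space counts is a choice, and the function name is hypothetical:

```go
package main

import "fmt"

// isPrintableASCII (hypothetical name) reports whether every byte of s
// is in 0x20..0x7E. That range is an assumption about what "printable"
// means; it excludes control characters such as NUL, TAB, and DEL.
func isPrintableASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] < 0x20 || s[i] > 0x7e {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isPrintableASCII("hello world")) // true
	fmt.Println(isPrintableASCII("a\x00b"))      // false, although NUL is valid ASCII
}
```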

James Bardin

Jun 11, 2014, 11:48:45 AM
to Kevin Gillette, golan...@googlegroups.com

On Wed, Jun 11, 2014 at 11:33 AM, Kevin Gillette <extempor...@gmail.com> wrote:
> The OP asked for an IsASCII function, not an IsBinary function; _everything_ below 0x80 is ASCII, without exception. You might be thinking of IsPrintableASCII, which is not the same thing.

Nah, I just had it in my head because I recently ran into a problem with a 0x00 byte in a C string. But yeah, technically NUL is a valid ASCII character. (And I wouldn't have flagged that magic number in my own code review either ;) )

rogerjd

Jun 11, 2014, 7:03:31 PM
to golan...@googlegroups.com
import "code.google.com/p/go.exp/utf8string"

func (s *String) IsASCII() bool

IsASCII returns a boolean indicating whether the String contains only ASCII bytes.

Hope this is helpful.

Roger


Rui Ueyama

Jun 11, 2014, 7:09:48 PM
to rogerjd, golang-nuts
On Wed, Jun 11, 2014 at 4:03 PM, rogerjd <rdem...@gmail.com> wrote:
> import "code.google.com/p/go.exp/utf8string"
>
> func (s *String) IsASCII() bool
>
> IsASCII returns a boolean indicating whether the String contains only ASCII bytes.

IsASCII's receiver type is not string but utf8string.String, so you have to write something like utf8string.NewString(s).IsASCII(). That is too much ceremony for such a simple thing. I would just test whether all bytes are less than 0x80.


rogerjd

Jun 11, 2014, 7:29:59 PM
to golan...@googlegroups.com, rdem...@gmail.com
Thanks for pointing that out; sorry.
Roger

rogerjd

Jun 11, 2014, 8:16:41 PM
to golan...@googlegroups.com, rdem...@gmail.com
Does it make sense to contribute James' code to the Go strings package?

func isASCII(s string) bool {
	for _, c := range s {
		if c > 127 {
			return false
		}
	}
	return true
}

As mentioned, it is not a big deal, but it will undoubtedly be useful to others, and we would avoid duplication.
If so, I'd be happy to do that :)
Roger

Ian Lance Taylor

Jun 11, 2014, 8:23:44 PM
to rogerjd, golang-nuts
On Wed, Jun 11, 2014 at 5:16 PM, rogerjd <rdem...@gmail.com> wrote:
>
> Does it make sense to contribute James' code to the go strings package?
>
> func isASCII(s string) bool {
> 	for _, c := range s {
> 		if c > 127 {
> 			return false
> 		}
> 	}
> 	return true
> }
>
>
> As mentioned, is not a big deal, but will undoubtedly be useful to others,
> and we avoid duplication.

I suspect that most people should not be using this function, so I
don't see a purpose for it in the standard library. I'm not denying
that this function can have a use, but I am denying that it should
have a lot of uses. unicode.MaxASCII is enough.

Ian

Rob Pike

Jun 11, 2014, 8:35:37 PM
to Ian Lance Taylor, rogerjd, golang-nuts
There is no simple definition for what people want. Some want values
less than 0x80. Some want no NUL. Some want no DEL. Some want no BS.

Yet it's so easy to make the decision yourself.

It seems there is no compelling need for a function.

-rob
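
As a rough illustration of how easy it is to make that decision yourself, the sketch below turns each of the variants listed above into a switch. The asciiPolicy type, its field names, and the accepts method are all hypothetical, not anything proposed in this thread:

```go
package main

import "fmt"

// asciiPolicy is a hypothetical set of switches, one per variant of
// "ASCII" that a caller might want. The zero value accepts every byte
// below 0x80.
type asciiPolicy struct {
	rejectNUL bool // reject 0x00
	rejectDEL bool // reject 0x7F
	rejectBS  bool // reject 0x08 (backspace)
}

// accepts reports whether every byte of s is below 0x80 and is not
// excluded by the policy.
func (p asciiPolicy) accepts(s string) bool {
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case c >= 0x80:
			return false
		case c == 0x00 && p.rejectNUL:
			return false
		case c == 0x7f && p.rejectDEL:
			return false
		case c == 0x08 && p.rejectBS:
			return false
		}
	}
	return true
}

func main() {
	strict := asciiPolicy{rejectNUL: true, rejectDEL: true, rejectBS: true}
	fmt.Println(asciiPolicy{}.accepts("a\x00b")) // true
	fmt.Println(strict.accepts("a\x00b"))        // false
}
```

In practice a single hand-written loop with the one rule you need is shorter than this, which is arguably the point being made.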

rogerjd

Jun 11, 2014, 10:39:39 PM
to golan...@googlegroups.com, ia...@golang.org, rdem...@gmail.com
Thank you everyone.
Roger