Splitting CamelCaseWords in Go

Tong Sun

Sep 3, 2017, 5:23:59 PM
to golang-nuts
Hi, 

I need to split "CamelCaseWords" into individual words like "Camel Case Words". 
The following is the Perl code that I got for doing just that:

    @words = $words[0] =~ /[A-Z][^A-Z]*/g
        if @words == 1 && $words[0] =~ /^[A-Z]/;

However, I've been staring at it long enough to convince myself that I really don't quite understand how it works. 
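
As a reading aid: the Perl pattern /[A-Z][^A-Z]*/g collects every run made of one ASCII capital followed by any number of non-capitals, and the trailing "if" applies the split only when there is exactly one word and it starts with a capital. A minimal Go sketch of the same pattern follows; splitCamel is a hypothetical name, not something from the thread, and the guard from the Perl "if" is left out.

```go
package main

import (
	"fmt"
	"regexp"
)

// upperRun mirrors the Perl pattern /[A-Z][^A-Z]*/: one ASCII upper-case
// letter followed by any run of characters that are not upper-case letters.
var upperRun = regexp.MustCompile(`[A-Z][^A-Z]*`)

// splitCamel (hypothetical name) returns every non-overlapping match,
// which is what Perl's /g does in list context.
func splitCamel(s string) []string {
	return upperRun.FindAllString(s, -1)
}

func main() {
	fmt.Printf("%q\n", splitCamel("CamelCaseWords")) // ["Camel" "Case" "Words"]
	fmt.Printf("%q\n", splitCamel("CAMELCase"))      // ["C" "A" "M" "E" "L" "Case"]
}
```

Note that anything before the first capital is dropped by this pattern, so "myConstantVariable" would lose its leading "my".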

Anyway, I'm wondering what's the neat way to do it in Go. 

PS. If you must know, I know that I could borrow the algorithm from github.com/danverbraganza/varcaser, but when I tried to use it, I noticed a side effect that makes it work for "myConstantVariable" but not for "GNU PYTHON Standard":

Thanks

Seth Bunce

Sep 3, 2017, 6:06:45 PM
to Tong Sun, golang-nuts
https://play.golang.org/p/-FM2wC22A0

My inclination would be to go for an approach like this. Then make a table-driven unit test and throw everything you can think of at it and iterate until you get what you need.
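
A table-driven check along those lines could be sketched as below. This is not the playground code: splitWords and its lower-then-upper rule are placeholders assumed for illustration, and the rows are only examples. In a real project the loop would live in a _test.go file and report through t.Errorf on a *testing.T; it is written as a plain function here so it runs on its own.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// boundary is a placeholder rule (an assumption, not Seth's code):
// a lower-case ASCII letter directly followed by an upper-case one.
var boundary = regexp.MustCompile(`([a-z])([A-Z])`)

// splitWords is the hypothetical function under test.
func splitWords(s string) string {
	return boundary.ReplaceAllString(s, "${1} ${2}")
}

// cases is the table: add a row for every input you can think of.
var cases = []struct {
	in, want string
}{
	{"", ""},
	{"CamelCaseWords", "Camel Case Words"},
	{"myConstantVariable", "my Constant Variable"},
	{"GNU PYTHON Standard", "GNU PYTHON Standard"},
}

// failures runs every row and describes each mismatch.
func failures() []string {
	var out []string
	for _, c := range cases {
		if got := splitWords(c.in); got != c.want {
			out = append(out, fmt.Sprintf("splitWords(%q) = %q, want %q", c.in, got, c.want))
		}
	}
	return out
}

func main() {
	if f := failures(); len(f) > 0 {
		fmt.Println(strings.Join(f, "\n"))
		return
	}
	fmt.Println("all cases pass")
}
```

When a new input misbehaves, it becomes one more row in the table, and the rule is adjusted until every row passes again.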

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Tong Sun

Sep 3, 2017, 11:49:50 PM
to Seth Bunce, golang-nuts


On Sun, Sep 3, 2017 at 6:05 PM, Seth Bunce wrote:

Thanks a lot Seth!

FTR, I archived it in my personal treasure chest at
with a little bit of enhancement. 


Florian Florensen

Sep 4, 2017, 6:55:25 PM
to golang-nuts
Hi, two approaches would be:

func camelAppend(str string) string {
	w := []rune(str)
	for i := len(w) - 1; i > 1; i-- {
		if unicode.IsUpper(w[i]) {
			w = append(w[:i], append([]rune{' '}, w[i:]...)...)
		}
	}
	return string(w)
}

func camelRegexp(str string) string {
	re := regexp.MustCompile(`([A-Z]+)`)
	str = re.ReplaceAllString(str, ` $1`)
	str = strings.Trim(str, " ")
	return str
}

$ go test -bench=.
goos: darwin
goarch: amd64
BenchmarkCamelAppend-4   3000000       444 ns/op
BenchmarkCamelRegexp-4    200000     11224 ns/op
PASS

Tong Sun

Sep 4, 2017, 9:43:22 PM
to Florian Florensen, golang-nuts
Oh thanks a lot Florian! 

I wish I had received it earlier (my email header said, Created at: Sun, Sep 3, 2017 at 6:03 PM (Delivered after 89531 seconds)), because my own version is embarrassingly complicated:


I'll go with the simple camelRegexp because, to be fair, camelAppend does not handle the cases that camelRegexp handles, e.g., "FooBarBaz GNU PYTHON Standard", and I'll make it not insert a space if there is already one there...
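
One possible way to avoid the doubled space, sketched under assumptions rather than taken from anyone's actual code: keep the same ([A-Z]+) replacement and then collapse whitespace with strings.Fields and strings.Join. The name camelRegexpCollapsed is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// upperRun is the pattern from camelRegexp, compiled once at package level
// so it is not recompiled on every call.
var upperRun = regexp.MustCompile(`([A-Z]+)`)

// camelRegexpCollapsed (hypothetical name) puts a space before every run of
// ASCII capitals, then collapses any run of whitespace into a single space.
func camelRegexpCollapsed(s string) string {
	spaced := upperRun.ReplaceAllString(s, " ${1}")
	// strings.Fields splits on whitespace and drops empty fields, which
	// removes both the leading space and the doubled spaces.
	return strings.Join(strings.Fields(spaced), " ")
}

func main() {
	fmt.Println(camelRegexpCollapsed("FooBarBaz GNU PYTHON Standard"))
	// Foo Bar Baz GNU PYTHON Standard
}
```

The trade-off is that strings.Fields also normalizes whitespace that was already in the input, such as tabs or repeated spaces, which may or may not be wanted.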

PS. did you have to write extra code (not published here) to use `go test -bench=.`?

thx




Florian Florensen

Sep 5, 2017, 1:35:57 PM
to golang-nuts, fl...@posteo.de
I've just joined the group and had to get activated, that's why my answer took so long. Should have waited until then.

Thank you for the notice! I fixed it in the playground: https://play.golang.org/p/kuk6FxesDq.
Although I wrote a benchmark (https://play.golang.org/p/YpnI257SHD), I didn't write tests. Sorry for that!
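
For readers who cannot open the playground link, the general shape of such a benchmark is sketched below. It is not the linked code; the input string is an assumption. With `go test -bench=.`, the Benchmark function sits in a file whose name ends in _test.go and no main is needed. The main function here only exists so the sketch runs on its own, using testing.Benchmark.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"testing"
)

// camelRegexp is the function posted earlier in the thread.
func camelRegexp(str string) string {
	re := regexp.MustCompile(`([A-Z]+)`)
	str = re.ReplaceAllString(str, ` $1`)
	return strings.Trim(str, " ")
}

// BenchmarkCamelRegexp has the signature that `go test -bench=.` looks for.
// The input "CamelCaseWords" is an assumption made for this sketch.
func BenchmarkCamelRegexp(b *testing.B) {
	for i := 0; i < b.N; i++ {
		camelRegexp("CamelCaseWords")
	}
}

func main() {
	// testing.Benchmark runs the same function outside of `go test`.
	res := testing.Benchmark(BenchmarkCamelRegexp)
	fmt.Println(res.N, "iterations,", res.NsPerOp(), "ns/op")
}
```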

I would still use Seth's version, since it correctly splits uppercase words like CAMELCase into ["C" "A" "M" "E" "L" "Case"].

Tong Sun

Sep 5, 2017, 3:34:10 PM
to Florian Florensen, golang-nuts

I'll be religiously avoiding "unicode.IsUpper()", as something mysterious happened in the past:

BTW, for my case, I do need the string,

"FooBarBaz GNU PYTHON Standard"

to be split exactly into

"Foo Bar Baz GNU PYTHON Standard"

I.e., 6 words altogether, no more spacing than this. 
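
That requirement can be met without unicode.IsUpper by inserting a space only where an ASCII lower-case letter is directly followed by an ASCII upper-case letter. The following is a minimal sketch under that assumption; spaceCamel and the two helper names are hypothetical, and the rule is ASCII-only.

```go
package main

import (
	"fmt"
	"strings"
)

// ASCII-only helpers, used here instead of the unicode package.
func isASCIILower(r rune) bool { return r >= 'a' && r <= 'z' }
func isASCIIUpper(r rune) bool { return r >= 'A' && r <= 'Z' }

// spaceCamel (hypothetical name) inserts one space at every
// lower-to-upper transition and leaves everything else untouched.
func spaceCamel(s string) string {
	var b strings.Builder
	var prev rune // zero value is neither lower nor upper
	for _, r := range s {
		if isASCIILower(prev) && isASCIIUpper(r) {
			b.WriteByte(' ')
		}
		b.WriteRune(r)
		prev = r
	}
	return b.String()
}

func main() {
	fmt.Println(spaceCamel("FooBarBaz GNU PYTHON Standard"))
	// Foo Bar Baz GNU PYTHON Standard
}
```

Runs of capitals are never broken up under this rule, so an input like "HTTPServer" would stay as one word.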

Thanks again for helping and for showing me your benchmark code. 



Michael Jones

Sep 5, 2017, 6:19:16 PM
to Tong Sun, Florian Florensen, golang-nuts




--
Michael T. Jones
michae...@gmail.com

Tong Sun

Sep 6, 2017, 9:13:33 AM
to Michael Jones, Florian Florensen, golang-nuts
Almost, https://play.golang.org/p/6Zl_EKqFqT
But thanks a lot! It's significantly shorter/better than my initial version. 

Steven Hartland

Sep 6, 2017, 9:59:29 AM
to golan...@googlegroups.com
Numbers don't match IsLower so how about:
https://play.golang.org/p/Z6q9dJZ7QK
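
To illustrate the point about digits (this is a sketch, not the code behind the link): unicode.IsLower reports false for '2', so a rule based on IsLower alone never splits after a number. One assumed fix is to treat digits like lower-case letters. The names afterWord and spaceCamelDigits are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// afterWord reports whether a split may follow r. Digits are included
// explicitly because unicode.IsLower is false for them.
func afterWord(r rune) bool {
	return unicode.IsLower(r) || unicode.IsDigit(r)
}

// spaceCamelDigits (hypothetical name) inserts a space wherever an
// upper-case rune follows a lower-case letter or a digit.
func spaceCamelDigits(s string) string {
	var b strings.Builder
	var prev rune
	for _, r := range s {
		if afterWord(prev) && unicode.IsUpper(r) {
			b.WriteByte(' ')
		}
		b.WriteRune(r)
		prev = r
	}
	return b.String()
}

func main() {
	fmt.Println(unicode.IsLower('2'))                // false
	fmt.Println(spaceCamelDigits("Version2Release")) // Version2 Release
}
```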

Michael Jones

Sep 6, 2017, 11:16:12 AM
to Steven Hartland, golang-nuts
Precisely, though probably a renaming is in order:


Steven Hartland

Sep 6, 2017, 12:10:06 PM
to Michael Jones, golang-nuts
Your other option is:
splittable := v != ' ' && v == unicode.ToLower(v)
It depends on what you want to happen with non-word characters.
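
A small sketch of what that predicate implies, assuming it is applied to the rune just before an upper-case letter (the thread does not show how it is used, and spaceCamelLoose is a hypothetical name). Any rune that equals its own lower-case form counts, so digits and punctuation become split points too.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splittable follows the expression in the message: not a space, and
// unchanged by unicode.ToLower. Digits and punctuation satisfy it.
func splittable(v rune) bool {
	return v != ' ' && v == unicode.ToLower(v)
}

// spaceCamelLoose (hypothetical name) inserts a space between a
// splittable rune and an upper-case rune that follows it.
func spaceCamelLoose(s string) string {
	var b strings.Builder
	var prev rune
	for i, r := range s {
		// i > 0 guards the first rune, since the zero rune is "splittable".
		if i > 0 && splittable(prev) && unicode.IsUpper(r) {
			b.WriteByte(' ')
		}
		b.WriteRune(r)
		prev = r
	}
	return b.String()
}

func main() {
	fmt.Println(spaceCamelLoose("FooBarBaz GNU PYTHON Standard")) // Foo Bar Baz GNU PYTHON Standard
	fmt.Println(spaceCamelLoose("foo-Bar"))                       // foo- Bar
}
```

The second line of output shows the non-word-character question in practice: a hyphen before a capital also triggers a split.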