Optimising StringBuilder

Roedy Green

Apr 4, 2008, 5:19:34 PM
I'm curious just how clever HotSpot is about optimising StringBuilder.

Here are some possible optimisations.

1. when allocating the StringBuilder, compute the length where
possible to create the exact size.

2. when doing append ( "xxx");
append ("yyy");
convert that to append( "xxxyyy");

3. when doing toString, if char[] internally is the correct length,
steal that array to use in the string, and lazily make a copy (which
should be rarely needed.)

4. convert append ( " " ) to append ( ' ' );

5. convert append ( a + b + c ) to append (a ); append (b); append (c);

If it is not clever, then consider how you might implement these with
a BYTE CODE optimiser.
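
For reference, here is a rough sketch of what optimisations 1, 2, 4 and 5 look like when applied by hand at source level. The class and method names are hypothetical, and the sketch assumes the three arguments are non-null. It says nothing about what HotSpot actually does; it only shows the target form an optimiser would be aiming for.

```java
// Hand-applied versions of optimisations 1, 2, 4 and 5 from the list above.
// Class and method names are hypothetical; arguments are assumed non-null.
class StringBuilderSketch {

    // Straightforward version: default capacity, separate literal appends,
    // a one-character String, and a concatenation inside append().
    static String naive(String a, String b, String c) {
        StringBuilder sb = new StringBuilder();
        sb.append("xxx");
        sb.append("yyy");
        sb.append(" ");
        sb.append(a + b + c);
        return sb.toString();
    }

    // Hand-optimised version producing the same result.
    static String tuned(String a, String b, String c) {
        // 1. exact capacity: 6 literal chars + 1 space + the three arguments
        int length = 6 + 1 + a.length() + b.length() + c.length();
        StringBuilder sb = new StringBuilder(length);
        sb.append("xxxyyy");               // 2. adjacent literals merged
        sb.append(' ');                    // 4. char instead of 1-char String
        sb.append(a).append(b).append(c);  // 5. no temporary concatenation
        return sb.toString();
    }

    public static void main(String[] args) {
        // Both versions build the same string.
        System.out.println(naive("a", "b", "c").equals(tuned("a", "b", "c")));
    }
}
```

Optimisation 3 is left out because it depends on the internals of StringBuilder and String, which cannot be reached from ordinary application code.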
--

Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com

Roedy Green

Apr 4, 2008, 6:14:42 PM
On Fri, 04 Apr 2008 21:19:34 GMT, Roedy Green
<see_w...@mindprod.com.invalid> wrote, quoted or indirectly quoted
someone who said :

>I'm curious just how clever HotSpot is about optimising StringBuilder,

I have written this up more formally at:
http://mindprod.com/project/stringbuilderoptimiser.html

Boudewijn Dijkstra

Apr 18, 2008, 6:05:37 PM
On Fri, 04 Apr 2008 23:19:34 +0200, Roedy Green
<see_w...@mindprod.com.invalid> wrote:

> I'm curious just how clever HotSpot is about optimising StringBuilder,

Why? If string manipulation is your main bottleneck, then probably it's
better not to use StringBuilder. Or to do less string manipulation. Or
both.

> Here are some possible optimisations.
>
> 1. when allocating the StringBuilder, compute the length where
> possible to create the exact size.

This will cost performance if the programmer already chose an acceptable
guesstimate.

> 2. when doing append ( "xxx");
> append ("yyy");
> convert that to append( "xxxyyy");

I don't consider this an optimisation: the transformation itself costs
memory (and performance, if the code isn't used enough), at the expense of
tasks which could very well be much more important than string manipulation.

> 3. when doing toString, if char[] internally is the correct length,
> steal that array to use in the string, and lazily make a copy (which
> should be rarely needed.)

Interesting. Also, it should be possible to determine the usage of the
char[], and so to not keep a reference around when it isn't needed.

> 4. convert append ( " " ) to append ( ' ' );

The source compiler should have done this. Otherwise, if the referenced
string is a variable (which varies in length), it is probably not feasible
(because something will need to check the length anyway).

> 5 convert append ( a + b + c ) to append (a ); append (b); append
> (c);

Should have been done by the source compiler. Even if you have code like
this:
StringBuffer buf = new StringBuffer();
buf.append(a).append(b).append(c);
builder.append(buf);

> If it is not clever, then consider how you might implement these with
> a BYTE CODE optimiser.

--
Made with Opera's revolutionary e-mail program:
http://www.opera.com/mail/

Roedy Green

Apr 20, 2008, 6:44:22 AM
On Sat, 19 Apr 2008 00:05:37 +0200, "Boudewijn Dijkstra"
<use...@bdijkstra.tmfweb.nl> wrote, quoted or indirectly quoted
someone who said :

>Or to do less string manipulation

I can hardly imagine a case where that would be possible, unless you
mean precomputing hunks of strings or in some way doing the string
manipulation more efficiently.

Boudewijn Dijkstra

Apr 20, 2008, 7:57:38 AM
On Sun, 20 Apr 2008 12:44:22 +0200, Roedy Green
<see_w...@mindprod.com.invalid> wrote:

> On Sat, 19 Apr 2008 00:05:37 +0200, "Boudewijn Dijkstra"
> <use...@bdijkstra.tmfweb.nl> wrote, quoted or indirectly quoted
> someone who said :
>
>> Or to do less string manipulation
>
> I can hardly imagine a case where that would be possible, unless you
> mean precomputing hunks of strings or in someway doing the string
> manipulation more efficiently.

Compare
float f; // distance in parsecs
System.out.printf("Distance is %3.2f parsecs.\n", f);
versus
int i; // distance in centiparsecs
String s = String.valueOf(i);
int dot = s.length() - 2;
System.out.printf("Distance is %s.%s parsecs.\n", s.substring(0, dot),
s.substring(dot));
.
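
A self-contained sketch of the same comparison follows. The class and method names are hypothetical. Note that the substring version above needs at least three digits to give a sensible result, so this variant splits the value with integer division instead and assumes a non-negative centiparsec count. Locale.ROOT is there only to pin the decimal point for the floating-point side.

```java
import java.util.Locale;

// Floating-point formatting versus a fixed-point integer representation.
// Names are hypothetical; centiparsecs is assumed to be non-negative.
class ParsecFormatSketch {

    // Floating-point version: the formatter does all the work.
    static String fromFloat(float parsecs) {
        return String.format(Locale.ROOT, "Distance is %3.2f parsecs.", parsecs);
    }

    // Fixed-point version: plain integer arithmetic and appends,
    // with no format-string parsing and no floating-point conversion.
    static String fromCenti(int centiparsecs) {
        int whole = centiparsecs / 100;
        int frac = centiparsecs % 100;
        StringBuilder sb = new StringBuilder(32);
        sb.append("Distance is ").append(whole).append('.');
        if (frac < 10) {
            sb.append('0'); // keep two digits after the point
        }
        sb.append(frac).append(" parsecs.");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(fromFloat(12.34f)); // Distance is 12.34 parsecs.
        System.out.println(fromCenti(1234));   // Distance is 12.34 parsecs.
        System.out.println(fromCenti(5));      // Distance is 0.05 parsecs.
    }
}
```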

Compare
System.out.printf("Distance is %d parsecs.\n", d);
versus
System.out.print("Distance is ");
System.out.print(d);
System.out.println(" parsecs.");
.
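
To check that the two forms really print the same thing, here is a small sketch that captures the output of each in memory. The class and method names are hypothetical. One detail differs from the snippet above: println writes the platform line separator while "\n" is always a line feed, so the printf side uses %n here to make the outputs match on every platform.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.Locale;

// Captures the output of printf and of a print/print/println sequence.
// Names are hypothetical.
class PrintVersusPrintf {

    // One call; the format string is parsed at run time.
    static String viaPrintf(int d) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(bytes);
        out.printf(Locale.ROOT, "Distance is %d parsecs.%n", d);
        out.flush();
        return bytes.toString();
    }

    // Three calls; no format string to parse.
    static String viaPrint(int d) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(bytes);
        out.print("Distance is ");
        out.print(d);
        out.println(" parsecs.");
        out.flush();
        return bytes.toString();
    }

    public static void main(String[] args) {
        System.out.println(viaPrintf(42).equals(viaPrint(42)));
    }
}
```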

Compare
initialisation:
Label myLabel = new Label();
execution:
myLabel.setText(String.format("Distance is %d parsecs.", d));
versus
initialisation:
Label myLabel1 = new Label("Distance is ");
Label myLabel2 = new Label();
Label myLabel3 = new Label(" parsecs.");
execution:
myLabel2.setText(String.valueOf(d));
