String Concatenation optimization

From EggeWiki

I often see developers needlessly using StringBuffer.append in preference to the built-in string concatenation operator. This is based on the idea that the "+" operator always creates a new object. While in general it does, a compiler is allowed to optimize this with an alternative implementation. So is there any speed difference, and should one care? I ran some tests to find out. Here are the tests I created (all of which produce the same result):

<geshi lang="java">
private static final P TEST1a = new P() {
    public String print(int i) {
        return "" + i + ", " + ++i + ", " + ++i;
    }
};

private static final P TEST1b = new P() {
    public String print(int i) {
        return i + ", " + ++i + ", " + ++i;
    }
};

private static final P TEST2 = new P() {
    public String print(int i) {
        StringBuffer sb = new StringBuffer();
        sb.append(i);
        sb.append(", ");
        sb.append(++i);
        sb.append(", ");
        sb.append(++i);
        return sb.toString();
    }
};

private static final P TEST3 = new P() {
    public String print(int i) {
        StringBuilder sb = new StringBuilder();
        sb.append(i);
        sb.append(", ");
        sb.append(++i);
        sb.append(", ");
        sb.append(++i);
        return sb.toString();
    }
};

private static final P TEST4 = new P() {
    public String print(int i) {
        return new StringBuilder()
                .append(i)
                .append(", ")
                .append(++i)
                .append(", ")
                .append(++i).toString();
    }
};
</geshi>
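The claim that the variants produce the same result can be checked in isolation. The following is a minimal sketch, not the original test code: the interface `P` is never shown on this page, so its single-method shape here is an assumption inferred from how the tests use it, and the class name `ConcatVariants` and the helper `allAgree` are hypothetical.

```java
public class ConcatVariants {
    // Assumed shape of the interface used by the tests above.
    public interface P {
        String print(int i);
    }

    // Plain concatenation, as in TEST1a.
    public static final P CONCAT = new P() {
        public String print(int i) {
            return "" + i + ", " + ++i + ", " + ++i;
        }
    };

    // Explicit StringBuilder chain, as in TEST4.
    public static final P BUILDER = new P() {
        public String print(int i) {
            return new StringBuilder()
                    .append(i).append(", ")
                    .append(++i).append(", ")
                    .append(++i).toString();
        }
    };

    // Explicit StringBuffer, as in TEST2.
    public static final P BUFFER = new P() {
        public String print(int i) {
            StringBuffer sb = new StringBuffer();
            sb.append(i).append(", ").append(++i).append(", ").append(++i);
            return sb.toString();
        }
    };

    // True when all three variants return the same string for the given input.
    public static boolean allAgree(int i) {
        String expected = CONCAT.print(i);
        return expected.equals(BUILDER.print(i)) && expected.equals(BUFFER.print(i));
    }

    public static void main(String[] args) {
        for (int i : new int[] {0, 9, 99, -1}) {
            System.out.println(CONCAT.print(i) + " -> all agree: " + allAgree(i));
        }
    }
}
```

Java evaluates the operands of "+" from left to right, so each `++i` sees the value left by the previous one in every variant.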

Here's my test harness:
<geshi lang="java">
public void testToString() {
    List<P> tests = Arrays.asList(TEST1a, TEST1b, TEST2, TEST3, TEST4);
    for (P p : tests) {
        long l = 0;
        long start = System.currentTimeMillis();
        for (int i = 0; i < 10000000; i++) {
            // Summing the lengths uses every result and gives a checksum to assert on.
            l += p.print(i).length();
        }
        assertEquals(246666691L, l);
        // The subtraction must be parenthesized; String - long does not compile.
        System.out.println(p + " " + (System.currentTimeMillis() - start));
    }
}
</geshi>
Here are the results (elapsed milliseconds for 10,000,000 calls):

TEST1a 7391
TEST1b 8954
TEST2 12344
TEST3 7500
TEST4 7376
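To put these timings in proportion, the short sketch below expresses each one relative to the fastest. The numbers are copied from the results above; the class name `RelativeTimes` and the helper `ratio` are hypothetical and exist only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class RelativeTimes {
    // Ratio of a measured time to a baseline time.
    public static double ratio(long millis, long baselineMillis) {
        return (double) millis / baselineMillis;
    }

    public static void main(String[] args) {
        // Timings in milliseconds, copied from the results above.
        Map<String, Long> results = new LinkedHashMap<String, Long>();
        results.put("TEST1a", 7391L);
        results.put("TEST1b", 8954L);
        results.put("TEST2", 12344L);
        results.put("TEST3", 7500L);
        results.put("TEST4", 7376L);

        // Find the fastest run to use as the baseline.
        long fastest = Long.MAX_VALUE;
        for (long t : results.values()) {
            fastest = Math.min(fastest, t);
        }

        // Print each test as a multiple of the baseline.
        for (Map.Entry<String, Long> e : results.entrySet()) {
            System.out.println(String.format(Locale.ROOT, "%-7s %.2fx",
                    e.getKey(), ratio(e.getValue(), fastest)));
        }
    }
}
```

On these numbers the StringBuffer version (TEST2) takes roughly 1.67 times as long as the fastest variant, and TEST1b roughly 1.21 times as long.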

TEST1a and TEST4 run at about the same speed. Running the excellent Jad decompiler, we can see that they produce exactly the same byte code. It might seem strange that TEST1a and TEST1b differ in speed. This is because TEST1b compiles to slightly different byte code:

<geshi lang="java">

   private static final P TEST1a = new P() {
       public String print(int i)
       {
           return (new StringBuilder()).append(i).append(", ").append(++i).append(", ").append(++i).toString();
       }
   };
   private static final P TEST1b = new P() {
       public String print(int i)
       {
           return (new StringBuilder(String.valueOf(i))).append(", ").append(++i).append(", ").append(++i).toString();
       }
   };

</geshi>

Without the initial "", the compiler uses String.valueOf, as it can't pass the integer to the StringBuilder constructor directly: StringBuilder(int) sets the initial capacity, not the initial contents. String.valueOf calls Integer.toString, which builds an intermediate String. StringBuilder.append(int), however, has a better-performing implementation: it assumes a radix of 10 and writes the digits directly into the builder's buffer.
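The difference between the two constructor forms can be seen in a small sketch. The class name `BuilderIntDemo` and its helper methods are hypothetical; only the `StringBuilder` and `String` calls are standard library API.

```java
public class BuilderIntDemo {
    // The shape the decompiler shows for TEST1b: the int is converted to a
    // String first, and that String seeds the builder.
    public static String viaValueOf(int i) {
        return new StringBuilder(String.valueOf(i)).append(", ").toString();
    }

    // The shape the decompiler shows for TEST1a: the builder starts empty
    // and the int is appended directly.
    public static String viaAppend(int i) {
        return new StringBuilder().append(i).append(", ").toString();
    }

    // StringBuilder(int) only reserves capacity, so the builder stays empty.
    public static int lengthAfterIntConstructor(int capacity) {
        return new StringBuilder(capacity).length();
    }

    public static void main(String[] args) {
        System.out.println(viaValueOf(42));
        System.out.println(viaAppend(42));
        System.out.println(lengthAfterIntConstructor(42));
    }
}
```

Both helpers return the same text; they differ only in whether an intermediate String is created for the first number.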

The other thing to note is that StringBuilder is, of course, faster than StringBuffer (compare TEST3 with TEST2), since StringBuffer's methods are synchronized. By letting the compiler optimize the code, you can enjoy the benefits of faster implementations in future versions.
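One commonly cited reason for the gap between the two classes is that StringBuffer's methods are synchronized while StringBuilder's are not. The sketch below checks this by reflection for `append(int)`; the class name `SyncCheck` and its helper are hypothetical, and the check says nothing about how much the synchronization costs on a given JVM.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class SyncCheck {
    // Reports whether the public append(int) method of the given class
    // is declared synchronized.
    public static boolean appendIntIsSynchronized(Class<?> type) {
        try {
            Method append = type.getMethod("append", int.class);
            return Modifier.isSynchronized(append.getModifiers());
        } catch (NoSuchMethodException e) {
            // Not expected for StringBuffer or StringBuilder.
            throw new IllegalStateException(type + " has no append(int)", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("StringBuffer:  " + appendIntIsSynchronized(StringBuffer.class));
        System.out.println("StringBuilder: " + appendIntIsSynchronized(StringBuilder.class));
    }
}
```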