Java String Concatenation Explained

Published Mar 31, 2016. Last updated Apr 12, 2017.
I saw a QuickTip here at Codementor and want to explain it in more detail, so that you know why it is a best practice.

The QuickTip Itself

Here is the best practice I will examine: "If you want to concatenate Strings don't use '+' but a StringBuilder because every time you use '+' a new object is created and you have a lot of unused objects in the memory."

You can read about this best practice in the Quick Tip of Suresh Atta here.

But what does he actually mean by "a new object is created"?

The Example

Let's start with the example from this QuickTip. I altered it a bit to have some content too:

String myString = "";
for(int i = 0; i < 1_000_000; i++) {
    myString += i;
}
System.out.println(myString);

As you can see, I start with an empty String and then loop over one million numbers (1_000_000 is simply a more readable way to write 1000000; underscores in numeric literals have been allowed since Java 7).

Compile the application, but don't run it! It takes a very long time. But why?

As Suresh mentions, this code creates one million objects for nothing and trashes your memory. To be honest, it creates around 2000000 objects, so far more than he suggests.

To verify this, let's look at the bytecode of the generated .class file. You don't need to understand every instruction; I include the full listing for the sake of completeness:

public static void main(java.lang.String...);
    Code:
    0: ldc     #2   // String
    2: astore_1
    3: iconst_0
    4: istore_2
    5: iload_2
    6: ldc     #3   // int 1000000
    8: if_icmpge  36
   11: new     #4  // class java/lang/StringBuilder
   14: dup
   15: invokespecial #5  // Method java/lang/StringBuilder."<init>":()V
   18: aload_1
   19: invokevirtual #6  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
   22: iload_2
   23: invokevirtual #7  // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
   26: invokevirtual #8  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
   29: astore_1
   30: iinc    2, 1
   33: goto    5
   36: getstatic  #9  // Field java/lang/System.out:Ljava/io/PrintStream;
   39: aload_1
   40: invokevirtual #10  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
   43: return

This is the bytecode the Java compiler generates for the application. You can look at it with the following command:

javap -c YourClass.class

The interesting part is between offsets 5 and 33 (the numbers at the start of each line of the listing). That's the for loop of the example. If you look carefully, you can see that every iteration creates a new StringBuilder (offset 11), appends the current contents of myString to it (offset 19) and then appends the current value of i (offset 23). Finally the contents of the StringBuilder are converted with toString and assigned to myString (offsets 26 and 29).

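This is not a performant operation. Object creation costs time and resources, and each iteration also copies the entire string built so far into the new StringBuilder and again into the new String, so the work per iteration grows as the string gets longer. This is why the example code above takes way too long.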
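To make the loop body in the bytecode above easier to follow, here is a rough sketch of the same logic written out in plain Java. The class and method names (DesugaredLoop, concatWithPlus, concatDesugared) are made up for illustration, and the iteration count is a parameter so the example finishes quickly. It mirrors the Java 8 era bytecode shown in this article; newer compilers may translate += differently.

```java
final class DesugaredLoop {

    // The loop as written in the article, with a configurable iteration count.
    static String concatWithPlus(int n) {
        String myString = "";
        for (int i = 0; i < n; i++) {
            myString += i;
        }
        return myString;
    }

    // Roughly what the bytecode does: a fresh StringBuilder and a fresh
    // String in every single iteration.
    static String concatDesugared(int n) {
        String myString = "";
        for (int i = 0; i < n; i++) {
            myString = new StringBuilder()
                    .append(myString)   // copy everything built so far
                    .append(i)          // add the current number
                    .toString();        // copy again into a new String
        }
        return myString;
    }

    public static void main(String[] args) {
        System.out.println(concatWithPlus(5));   // prints 01234
        System.out.println(concatDesugared(5));  // prints 01234
    }
}
```

Both methods produce the same result; the second one just makes the hidden allocations visible.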

The Quick-Tip Solution

Now it is time to look at the solution from the quick tip, a long-standing best practice:

StringBuilder sb = new StringBuilder();
for(int i = 0; i < 1_000_000; i++) {
    sb.append(i);
}
System.out.println(sb.toString());

Here you create the StringBuilder outside of the for loop, append the value of i to the builder and print the result at the end. You can run this application because it finishes in no time (the only time-consuming part is printing the result to the console, because that is an I/O operation).

Now let's look at the bytecode of this version and compare the two solutions:

public static void main(java.lang.String...);
    Code:
       0: new           #2  // class java/lang/StringBuilder
       3: dup           
       4: invokespecial #3  // Method java/lang/StringBuilder."<init>":()V
       7: astore_1      
       8: iconst_0      
       9: istore_2      
      10: iload_2       
      11: ldc           #4  // int 1000000
      13: if_icmpge     28
      16: aload_1       
      17: iload_2       
      18: invokevirtual #5  // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
      21: pop           
      22: iinc          2, 1
      25: goto          10
      28: getstatic     #6  // Field java/lang/System.out:Ljava/io/PrintStream;
      31: aload_1       
      32: invokevirtual #7  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      35: invokevirtual #8  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      38: return        
}

The for loop here is between offsets 10 and 25. As you can see, the loop only appends to the StringBuilder that was created at offset 0, before the loop, without creating a new builder or String. After the loop is over, the contents of the StringBuilder are converted into a String and printed to standard output.

This means we have verified that this old "best practice" still holds, and we know why we should stick with it.
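If you want to see the difference on your own machine, a rough timing sketch like the following can help. It is not a rigorous benchmark (there is no warm-up and the JIT compiler can skew the numbers), the names ConcatTiming, withPlus and withBuilder are invented for this example, and the iteration count is an assumption chosen so that the slow variant still finishes in reasonable time.

```java
final class ConcatTiming {

    // Concatenation with += inside the loop (the slow variant).
    static String withPlus(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    // One StringBuilder created before the loop (the quick-tip variant).
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int n = 20_000; // assumption: small enough to finish quickly

        long start = System.nanoTime();
        String a = withPlus(n);
        long plusMillis = (System.nanoTime() - start) / 1_000_000;

        start = System.nanoTime();
        String b = withBuilder(n);
        long builderMillis = (System.nanoTime() - start) / 1_000_000;

        System.out.println("same result:   " + a.equals(b));
        System.out.println("+=            took " + plusMillis + " ms");
        System.out.println("StringBuilder took " + builderMillis + " ms");
    }
}
```

The absolute numbers depend on your hardware and JVM, so treat them as an indication only. Increasing n should widen the gap between the two variants.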

Outside of the Loop

After reading through this article myself, I wondered: what about string concatenation outside of loops? Should we use a StringBuilder there, or is using + fine?

Let's see what the code tells us.

First of all, here is a simple code block that concatenates some strings into a sentence:

String greeting = "Hello" + " " + "World" + "!";
System.out.println(greeting);

This example is very straightforward: we concatenate these four simple Strings. What happens after we compile?

public static void main(java.lang.String...);
    Code:
       0: ldc           #2                  // String Hello World!
       2: astore_1      
       3: getstatic     #3                  // Field java/lang/System.out:Ljava/io/PrintStream;
       6: aload_1       
       7: invokevirtual #4                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      10: return

No magic here: because all four operands are literals, the compiler evaluates the concatenation at compile time and simply stores the value "Hello World!" as one String. So let's assign "Hello" and "World" to variables:

String hello = "Hello";
String world = "World";
String greeting = hello + " " + world + "!";
System.out.println(greeting);

Now this lets the compiler do some magic:

public static void main(java.lang.String...);
    Code:
       0: ldc           #2                  // String Hello
       2: astore_1      
       3: ldc           #3                  // String World
       5: astore_2      
       6: new           #4                  // class java/lang/StringBuilder
       9: dup           
      10: invokespecial #5                  // Method java/lang/StringBuilder."<init>":()V
      13: aload_1       
      14: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      17: ldc           #7                  // String  
      19: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      22: aload_2       
      23: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      26: ldc           #8                  // String !
      28: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      31: invokevirtual #9                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      34: astore_3      
      35: getstatic     #10                 // Field java/lang/System.out:Ljava/io/PrintStream;
      38: aload_3       
      39: invokevirtual #11                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      42: return

As you can see, the compiler creates a StringBuilder and appends the four building blocks before converting it to the greeting String. So in this case it does not matter if we create a StringBuilder ourselves or concatenate with +.
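The difference between the two cases can also be observed at run time without reading bytecode. The sketch below (class and method names are made up for this example) compares references with == on purpose: a concatenation of literals is a compile-time constant and refers to the same String object as the literal "Hello World!", while a concatenation of ordinary variables produces a new String object at run time.

```java
final class ConstantFolding {

    // All operands are literals, so this is a compile-time constant.
    static String literalConcat() {
        return "Hello" + " " + "World" + "!";
    }

    // The operands are ordinary (non-final) local variables, so the
    // concatenation happens at run time and yields a new String object.
    static String variableConcat() {
        String hello = "Hello";
        String world = "World";
        return hello + " " + world + "!";
    }

    public static void main(String[] args) {
        String expected = "Hello World!";
        // == compares references here on purpose, to show object identity.
        System.out.println(literalConcat() == expected);        // prints true
        System.out.println(variableConcat() == expected);       // prints false
        System.out.println(variableConcat().equals(expected));  // prints true
    }
}
```

In real code you should of course compare Strings with equals; the reference comparison is only used here to make the compiler's work visible.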

How about being a bit trickier and building greeting step by step, the same way we did in the loop?

String hello = "Hello";
String world = "World";

String greeting = hello;
greeting += " ";
greeting += world;
greeting += "!";
System.out.println(greeting);

Now I am being really nasty: the compiler does not see through my trick, and we run into the same problem as with the loop. A lot of StringBuilders get created, which costs memory and time:

public static void main(java.lang.String...);
    Code:
       0: ldc           #2                  // String Hello
       2: astore_1      
       3: ldc           #3                  // String World
       5: astore_2      
       6: aload_1       
       7: astore_3      
       8: new           #4                  // class java/lang/StringBuilder
      11: dup           
      12: invokespecial #5                  // Method java/lang/StringBuilder."<init>":()V
      15: aload_3       
      16: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      19: ldc           #7                  // String  
      21: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      24: invokevirtual #8                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      27: astore_3      
      28: new           #4                  // class java/lang/StringBuilder
      31: dup           
      32: invokespecial #5                  // Method java/lang/StringBuilder."<init>":()V
      35: aload_3       
      36: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      39: aload_2       
      40: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      43: invokevirtual #8                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      46: astore_3      
      47: new           #4                  // class java/lang/StringBuilder
      50: dup           
      51: invokespecial #5                  // Method java/lang/StringBuilder."<init>":()V
      54: aload_3       
      55: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      58: ldc           #9                  // String !
      60: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      63: invokevirtual #8                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      66: astore_3      
      67: getstatic     #10                 // Field java/lang/System.out:Ljava/io/PrintStream;
      70: aload_3       
      71: invokevirtual #11                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      74: return

Naturally we do not see any impact on performance with this little application, but imagine a bigger workflow where you append values to a String here and there to get the final result. A typical example is a query builder that assembles an SQL query based on various criteria.

The instructions at offsets 8, 28 and 47 create the new StringBuilder objects, so each += constructs a new StringBuilder (and, through toString, a new String).
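For the query builder scenario mentioned above, a single StringBuilder that is passed through all the conditional steps avoids these intermediate objects. The following is only a minimal sketch: the class QueryBuilderSketch, the method buildQuery, the users table and its columns are all hypothetical, and the values themselves are left as ? placeholders on the assumption that they would be bound later through a prepared statement.

```java
final class QueryBuilderSketch {

    // Hypothetical helper: builds a SELECT statement for a made-up "users" table.
    // A null criterion means "do not filter on this column".
    static String buildQuery(String name, Integer minAge) {
        StringBuilder sql = new StringBuilder("SELECT * FROM users");
        String glue = " WHERE ";            // first condition starts the WHERE clause
        if (name != null) {
            sql.append(glue).append("name = ?");
            glue = " AND ";                 // later conditions are chained with AND
        }
        if (minAge != null) {
            sql.append(glue).append("age >= ?");
        }
        return sql.toString();              // the only String created from the builder
    }

    public static void main(String[] args) {
        System.out.println(buildQuery(null, null));  // SELECT * FROM users
        System.out.println(buildQuery("Ann", null)); // SELECT * FROM users WHERE name = ?
        System.out.println(buildQuery("Ann", 18));   // SELECT * FROM users WHERE name = ? AND age >= ?
        System.out.println(buildQuery(null, 18));    // SELECT * FROM users WHERE age >= ?
    }
}
```

However many criteria are added, there is one builder and one final toString call.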

Java 8 and join

Java 8 introduced a new String method: join. While we are at it, let's take a look at how join does its job behind the scenes.

String greeting = String.join("", "Hello", " ", "World", "!");
System.out.println(greeting);

This again is a very basic example: I use the empty String as the separator and pass the space character as one of the Strings to join. As expected, the result on the console is Hello World!

Now let's take a look at the bytecode:

public static void main(java.lang.String...);
    Code:
       0: ldc           #2                  // String
       2: iconst_4      
       3: anewarray     #3                  // class java/lang/CharSequence
       6: dup           
       7: iconst_0      
       8: ldc           #4                  // String Hello
      10: aastore       
      11: dup           
      12: iconst_1      
      13: ldc           #5                  // String  
      15: aastore       
      16: dup           
      17: iconst_2      
      18: ldc           #6                  // String World
      20: aastore       
      21: dup           
      22: iconst_3      
      23: ldc           #7                  // String !
      25: aastore       
      26: invokestatic  #8                  // Method java/lang/String.join:(Ljava/lang/CharSequence;[Ljava/lang/CharSequence;)Ljava/lang/String;
      29: astore_1      
      30: getstatic     #9                  // Field java/lang/System.out:Ljava/io/PrintStream;
      33: aload_1       
      34: invokevirtual #10                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      37: return

Well, this is quite a bit of code for such a short program. Let's analyze what's happening!

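At offset 0 it loads the empty String (this is the separator). At offsets 2 and 3 it creates a CharSequence array with four elements, and at offsets 8, 13, 18 and 23 it loads the Strings that make up the greeting and stores them in that array. After this, String.join is called (offset 26) and the values are joined together into the greeting String, which is then displayed. Note that the bytecode only shows the call: the actual concatenation happens inside the join method.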
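In practice join is more useful with a real separator than with the empty String used above. Here is a small sketch of the two overloads, one taking varargs and one taking an Iterable; the class and method names (JoinExamples, joinWords, joinList) are invented for this example.

```java
import java.util.Arrays;
import java.util.List;

final class JoinExamples {

    // Varargs overload: String.join(CharSequence, CharSequence...)
    static String joinWords(String... words) {
        return String.join(" ", words);
    }

    // Iterable overload: String.join(CharSequence, Iterable<? extends CharSequence>)
    static String joinList(List<String> items) {
        return String.join(", ", items);
    }

    public static void main(String[] args) {
        System.out.println(joinWords("Hello", "World!"));             // prints Hello World!
        System.out.println(joinList(Arrays.asList("a", "b", "c")));   // prints a, b, c
    }
}
```

The separator is only placed between elements, so there is no trailing separator to trim.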

Conclusion

Some best practices are good to trust; however, you should know why you are using them. If you "just do it", consider questioning the reasons behind the practice, because it may be an old one and the compiler may have changed how your code behaves in the meantime.

In this article I have shown you different approaches to string concatenation and how they may impact performance if you use them excessively. Besides this, I have confirmed that a best practice still holds and explained the reasons behind a relevant Quick Tip.

Discover and read more posts from Gábor László Hajba
3 Replies
Iavor Dekov
8 years ago

Thank you for this tutorial Gábor, I find it useful and look forward to applying the quick tip to my code when the opportunity arises.

You say that the following code “creates around 2000000 objects.”

for(int i = 0; i < 1_000_000; i++) {
    myString += i;
}

I thought that a new object is created during each iteration of the for-loop. In that case, it would mean that only 1,000,000 objects would be created. Why are 2,000,000 created?

Gabor Laszlo Hajba
8 years ago

Hello Iavor,

sorry for the late answer but I did not get any notification about your comment.

Well, the loop does 1000000 iterations and in each iteration it creates a new StringBuilder object – but it also calls StringBuilder.toString() at the end of the loop to assign the new String value to the myString variable. And if you look at the source code of the StringBuilder.toString method you can see the code in the attachment: it creates a new String object too. This is why 2000000 objects are created.

I write in the article “around 2000000” because some Strings may already exist and they are served from the String pool. But this is another story of the internal behaviour of the JVM.

Iavor Dekov
8 years ago

That makes sense, thanks for your reply Gábor!