How To Split A String In Java By Example

When processing string data in Java, we often want to split a string into substrings around some delimiter. Here we will look at examples of different ways to do this. The two most commonly used approaches are the String.split() method and the StringTokenizer class.

Splitting a String Using the String.split() Method

String Split With Regex

The String object has a split() method that can be used to split a string based on a regular expression.

The signature for the simplest form of the split method is:

public String[] split(String regex)

The above method splits the string around matches of the regular expression and returns a String array of the results. Each element of the array is a substring of the input that is terminated either by a match of the regular expression or by the end of the string. Each match acts as a delimiter that determines where the string is split. The split occurs as many times as required until the end of the input string is reached. Any trailing empty strings are discarded from the result.
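Because the argument is a regular expression and not a literal string, characters such as "." and "|" have a special meaning and need escaping. The following is a minimal sketch of that behaviour, not a definitive recipe; the class name SplitRegexDemo and the helper splitLiteral are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitRegexDemo {

    // Hypothetical helper: quotes the delimiter so that regex
    // metacharacters in it are treated as literal text.
    static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String version = "1.2.3";

        // An unescaped "." matches every character, so only empty
        // strings are produced and all of them are discarded.
        System.out.println(version.split(".").length); // 0

        // Escaping the dot makes it a literal delimiter.
        System.out.println(Arrays.toString(version.split("\\."))); // [1, 2, 3]

        // Pattern.quote() avoids hand-written escaping.
        System.out.println(Arrays.toString(splitLiteral("a|b|c", "|"))); // [a, b, c]

        // A real regex: a comma with optional whitespace around it.
        System.out.println(Arrays.toString("red , green,blue".split("\\s*,\\s*"))); // [red, green, blue]
    }
}
```

Pattern.quote() is usually the safer choice when the delimiter comes from user input or configuration, since it does not depend on knowing which characters are special.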

Let's run a simple JUnit test to show how this works.


    @Test
    public void splitTest() throws Exception {

	String name = "John-Smith";
	String[] output = name.split("-");

	System.out.println(output[0]);
	System.out.println(output[1]);
	System.out.println(Arrays.toString(output));

	assertThat("The first string will be the first name", output[0], is("John"));
	assertThat("The second string will be the surname", output[1], is("Smith"));

    }

The test above splits the String “John-Smith” on the dash into 2 strings. One point made earlier was that trailing empty strings are discarded. This means that if a delimiter is found at the end of the input with nothing after it, no extra empty string is created for it. Note that this applies only to zero-length strings: a trailing string that contains spaces is not empty and is kept. Below is the output from running the test.

John
Smith
[John, Smith]

So if we change our unit test to include extra trailing delimiters, we still end up with the same strings, and the additional empty strings are thrown away.


    @Test
    public void splitTest() throws Exception {

	String name = "John-Smith--";
	String[] output = name.split("-");

	System.out.println(output[0]);
	System.out.println(output[1]);
	System.out.println(Arrays.toString(output));

	assertThat("The first string will be the first name", output[0], is("John"));
	assertThat("The second string will be the surname", output[1], is("Smith"));

    }

The output is:

John
Smith
[John, Smith]
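It is worth checking exactly which strings count as "empty" here. The sketch below, using a hypothetical class SplitTrailingDemo and helper splitOnDash, suggests that only zero-length strings at the end are dropped: a trailing string made of spaces is kept, and so are empty strings at the start or in the middle of the input.

```java
import java.util.Arrays;

public class SplitTrailingDemo {

    // Hypothetical helper that splits on a dash with the default behaviour.
    static String[] splitOnDash(String input) {
        return input.split("-");
    }

    public static void main(String[] args) {
        // Trailing zero-length strings are discarded.
        System.out.println(splitOnDash("John-Smith--").length); // 2

        // A trailing string containing a space is not empty, so it is kept.
        System.out.println(splitOnDash("John-Smith- ").length); // 3

        // Leading and interior empty strings are kept as well.
        String[] parts = splitOnDash("-John--Smith");
        System.out.println(parts.length);           // 4
        System.out.println(Arrays.toString(parts)); // [, John, , Smith]
    }
}
```

If whitespace-only tokens should also be ignored, they have to be filtered out separately after the split, for example by trimming each element and skipping the empty ones.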

String Split With Regex And Limit

This split method also splits the string around matches of a regular expression and returns a String array of the results, but additionally lets you provide a limit. With a positive limit n, the regular expression is applied at most n - 1 times, so the resulting array has at most n elements and the last element holds the rest of the input, including any remaining delimiters.

public String[] split(String regex, int limit)

Let's look at some examples.


    @Test
    public void splitTestWithLimit() throws Exception {

	String name = "John-Harry-Smith--";
	String[] output = name.split("-", 2);

	System.out.println(output[0]);
	System.out.println(output[1]);
	System.out.println(Arrays.toString(output));

	assertThat("The first string will be the first name", output[0], is("John"));
	assertThat("The second string will be the rest of the input", output[1], is("Harry-Smith--"));

    }

Because of the limit, the split occurs at most n - 1 times, so in this case once, giving us the result below.

John
Harry-Smith--
[John, Harry-Smith--]

So we end up with 2 strings.
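A positive limit is handy when only the first delimiter matters. As a rough sketch, the hypothetical helper parseEntry below splits a "key=value" line into at most two parts, so any further "=" characters stay inside the value. The class name, the helper and the sample URL are assumptions made for illustration.

```java
import java.util.Arrays;

public class SplitLimitDemo {

    // Hypothetical helper: splits on the first '=' only, because a
    // limit of 2 allows at most one split.
    static String[] parseEntry(String line) {
        return line.split("=", 2);
    }

    public static void main(String[] args) {
        String[] entry = parseEntry("url=http://example.com?a=1");
        System.out.println(entry[0]); // url
        System.out.println(entry[1]); // http://example.com?a=1

        // With a limit of 3 the split happens at most twice, and the
        // last element keeps the remaining delimiters.
        String[] names = "John-Harry-Smith--".split("-", 3);
        System.out.println(Arrays.toString(names)); // [John, Harry, Smith--]
    }
}
```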

If we pass a negative limit, the split is applied as many times as possible and trailing empty strings are kept. So let's change the limit and see:


    @Test
    public void splitTestWithNegativeLimit() throws Exception {

	String name = "John-Harry-Smith--";
	String[] output = name.split("-", -1);

	System.out.println(output[0]);
	System.out.println(output[1]);
	System.out.println(Arrays.toString(output));

	assertThat("The first string will be the first name", output[0], is("John"));
	assertThat("The second string will be the middle name", output[1], is("Harry"));
	assertThat("The third string will be the surname", output[2], is("Smith"));
	assertThat("The fourth string will be empty", output[3], is(""));
	assertThat("The fifth string will be empty", output[4], is(""));

    }

Here we actually end up with 5 strings in our array, as shown in the output below. We get 1 each for “John”, “Harry” and “Smith”. But we also get 2 empty strings. This is because we specified a negative limit, so it is not discarding the empty strings.

John
Harry
[John, Harry, Smith, , ]
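To compare the three kinds of limit side by side, here is a small sketch that counts the fields of one comma-separated line under each of them. The class SplitLimitComparison and the helper fieldCount are hypothetical names. A limit of 0 behaves the same as calling split(regex) with no limit.

```java
public class SplitLimitComparison {

    // Hypothetical helper: number of fields produced for a given limit.
    static int fieldCount(String line, int limit) {
        return line.split(",", limit).length;
    }

    public static void main(String[] args) {
        String line = "a,b,,";

        // Zero limit: split as often as possible, drop trailing empty strings.
        System.out.println(fieldCount(line, 0));  // 2

        // Negative limit: split as often as possible, keep trailing empty strings.
        System.out.println(fieldCount(line, -1)); // 4

        // Positive limit: at most 3 elements, the last one is the remainder ",".
        System.out.println(fieldCount(line, 3));  // 3
    }
}
```

A negative limit is generally the one to reach for when the number of fields matters, for instance when trailing columns in a delimited record may be blank.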

References

Oracle String split javadoc
