3.6. Strings
Conceptually, Java strings are sequences of Unicode characters. As you have seen in Section 3.3.4, the concept of what exactly a character is has become complicated. And the encoding of the characters into char values has also become complicated.
However, most of the time, you don’t care. You get strings from string literals or from methods, and you operate on them with methods of the String class. The following sections cover the details.
3.6.1. Concatenation
Java, like most programming languages, allows you to use + to join (concatenate) two strings.
String expletive = "Expletive";
String PG13 = "deleted";
String message = expletive + PG13;
The preceding code sets the variable message to the string "Expletivedeleted". (Note the lack of a space between the words: The + operator joins two strings in the order received, exactly as they are given.)
When you concatenate a string with a value that is not a string, the latter is converted to a string. For example,
int age = 13;
String rating = "PG" + age;
sets rating to the string "PG13".
This feature is commonly used in output statements. For example,
IO.println("The answer is " + answer);
is perfectly acceptable and prints what you would expect (and with correct spacing because of the space after the word is).
If you need to put multiple strings together, separated by a delimiter, use the join method:
String all = String.join(" / ", "S", "M", "L", "XL");
// all is the string "S / M / L / XL"
The repeat method produces a string that repeats a given string a number of times:
String repeated = "Java".repeat(3); // repeated is "JavaJavaJava"
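The operations of this section can be tried out together in a minimal sketch. The class name ConcatDemo and its method names are made up for illustration, and the main method uses System.out.println only so that the sketch also runs on older Java versions.

```java
// Hypothetical demo class; the names are illustrative, not from the text.
class ConcatDemo {
    // The int value is converted to a string before it is concatenated.
    static String rating(int age) {
        return "PG" + age;
    }

    // String.join places the delimiter between the elements only.
    static String sizes() {
        return String.join(" / ", "S", "M", "L", "XL");
    }

    // repeat requires Java 11 or later.
    static String banner(String s, int times) {
        return s.repeat(times);
    }

    public static void main(String[] args) {
        System.out.println(rating(13));        // PG13
        System.out.println(sizes());           // S / M / L / XL
        System.out.println(banner("Java", 3)); // JavaJavaJava
    }
}
```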
3.6.2. Static and Instance Methods
At the end of the preceding section, you saw two methods of the String class, join and repeat. There is a crucial difference between these two methods. When you call
String all = String.join(" / ", "S", "M", "L", "XL");
you provide all arguments that the method needs inside the parentheses. Contrast this with the call
String repeated = "Java".repeat(3);
To compute the repetition of a string, two pieces of information are required: the string itself, and the number of times that it should be repeated.
Note that the string is written before the name of the method, with a dot (.) separating the two. The repeat method is an example of an instance method. As you will see in Chapter 4, an instance method has one special argument; in this case, a string. That value precedes the method name. Supplementary arguments are provided after the method name in parentheses.
The String.join method, on the other hand, is a static method. It doesn’t have a special argument. The dot serves a different function, separating the name of the class in which the method is declared from the method name.
To tell the two apart, locate the dot. Is it preceded by a value (such as the string "Java")? Then you are looking at the call to an instance method. Or is it preceded by the name of a class (such as String)? Then it is a static method.
Many of the methods that you have seen so far, including IO.println, Integer.parseInt, and Math.sqrt, are static methods. However, as you learn more about Java, you will mostly use instance methods.
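To see the two call styles side by side, consider the following sketch. The class CallStyles and its methods are hypothetical names chosen for this illustration. In each call, look at what precedes the dot.

```java
// Hypothetical demo class; the names are illustrative, not from the text.
class CallStyles {
    // Instance method call: the value "Java" precedes the dot.
    static String viaInstanceMethod() {
        return "Java".repeat(2);
    }

    // Static method call: the class name String precedes the dot.
    static String viaStaticMethod() {
        return String.join("-", "a", "b", "c");
    }

    // Both styles in one method.
    static int digitsIn(String text) {
        int value = Integer.parseInt(text);   // static: class name before the dot
        String digits = String.valueOf(value); // static: class name before the dot
        return digits.length();                // instance: a string value before the dot
    }
}
```

With these definitions, CallStyles.digitsIn("0042") yields 2, because parsing drops the leading zeroes.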
3.6.3. Indexes and Substrings
Java strings are sequences of char values. As you saw in Section 3.3.4, the char data type is a code unit for representing Unicode code points in the UTF-16 encoding. Some characters can be represented with a single char value, but many characters and symbols require more than one char value.
The length instance method yields the number of char values required for a given string. For example:
String greeting = "Ahoy 🏴‍☠️";
int n = greeting.length(); // is 10
The call s.charAt(n) returns the char value at position n, where n is between 0 and s.length() – 1. (Like C and C++, Java counts positions in a string starting with 0.) For example:
char first = greeting.charAt(0); // first is 65 or 'A'
char last = greeting.charAt(9); // last is 65039
However, these calls are not very useful. The last char value is just a part of the flag symbol, and you won’t generally care what these values are.
Still, you sometimes need to know where a substring is located in a string. Use the indexOf method:
String sub = " ";
int start = greeting.indexOf(sub); // 4
As it happens, the position or index of the space is 4, but the exact value doesn’t matter. It depends on the characters preceding the substring, and the number of char values needed to encode each of them. Always treat an index as an opaque number, not the count of perceived characters preceding it.
You can compute where the next character starts:
int nextStart = start + sub.length(); // 5
The string " " has length 1, but do not hard-code the length of a string. Always use the length method instead.
You can extract a substring from a larger string with the substring method of the String class. For example,
String greeting = "Hello, World!";
int a = greeting.indexOf(",") + 2; // 7
int b = greeting.indexOf("!"); // 12
String s = greeting.substring(a, b);
creates a string consisting of the characters "World".
The second argument of substring is the first position that you do not want to copy. In our case, we copy everything from position 7 up to, but not including, the exclamation mark at position 12.
Note that the string s.substring(a, b) always has length b − a. For example, the substring "World" has length 12 − 7 = 5.
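The indexOf, length, and substring methods work well together when you treat every index as an opaque number. The following sketch shows one possible way of extracting the text between two delimiters. The class SubstringDemo and the method between are hypothetical names, and returning null for a missing delimiter is just one design choice among several.

```java
// Hypothetical helper class; the names are illustrative, not from the text.
class SubstringDemo {
    // Returns the text between the first occurrence of open and the next
    // occurrence of close, or null if either delimiter is missing.
    static String between(String s, String open, String close) {
        int start = s.indexOf(open);
        if (start < 0) return null;        // indexOf returns -1 if there is no match
        int from = start + open.length();  // first position after the opening delimiter
        int to = s.indexOf(close, from);   // search only from that position onward
        if (to < 0) return null;
        return s.substring(from, to);      // the result has length to - from
    }
}
```

For example, SubstringDemo.between("Hello, World!", ", ", "!") yields "World". No position or length is hard-coded anywhere in the method.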
3.6.4. Strings Are Immutable
The String class gives no methods that let you change a character in an existing string. If you want to turn greeting into "Help!", you cannot directly change the last positions of greeting into 'p' and '!'. If you are a C programmer, this can make you feel pretty helpless. How are we going to modify the string? In Java, it is quite easy: Concatenate the substring that you want to keep with the characters that you want to replace.
String greeting = "Hello";
int n = greeting.indexOf("lo");
greeting = greeting.substring(0, n) + "p!";
This assignment changes the current value of the greeting variable to "Help!".
Since you cannot change the individual characters in a Java string, the documentation refers to the objects of the String class as immutable. Just as the number 3 is always 3, the string "Hello" will always contain the code-unit sequence for the characters H, e, l, l, o. You cannot change these values. Yet you can, as you just saw, change the contents of the string variable greeting and make it refer to a different string, just as you can make a numeric variable currently holding the value 3 hold the value 4.
Isn’t that a lot less efficient? It would seem simpler to change the characters than to build up a whole new string from scratch. Well, yes and no. Indeed, it is some amount of work to generate a new string that holds the concatenation of "Hel" and "p!". But immutable strings have one great advantage: The compiler can arrange that strings are shared.
To understand how this works, think of the various strings as sitting in a common pool. String variables then point to locations in the pool. If you copy a string variable, both the original and the copy share the same characters.
Overall, the designers of Java decided that the efficiency of sharing outweighs the inefficiency of string creation. Look at your own programs; most of the time, you probably don’t change strings—you just compare them. (There is one common exception—assembling strings from individual characters or from shorter strings that come from the keyboard or a file. For these situations, Java provides a separate class—see Section 3.6.9.)
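You can observe immutability with a small experiment. In the following sketch, the class ImmutableDemo and its methods are hypothetical names. The toHelp method assumes that its argument contains "lo"; otherwise, the substring call throws an exception.

```java
// Hypothetical demo class; the names are illustrative, not from the text.
class ImmutableDemo {
    // Builds a new string. The argument string is not modified.
    // Assumes that greeting contains "lo".
    static String toHelp(String greeting) {
        int n = greeting.indexOf("lo");
        return greeting.substring(0, n) + "p!";
    }

    // Shows that reassigning one variable leaves the original string intact.
    static String aliasAfterReassignment() {
        String greeting = "Hello";
        String alias = greeting;      // both variables refer to the same string
        greeting = toHelp(greeting);  // greeting now refers to a different string
        return alias;                 // still the characters H, e, l, l, o
    }
}
```

The second method returns "Hello", even though the greeting variable was changed to refer to "Help!".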
3.6.5. Testing Strings for Equality
To test whether two strings are equal, use the equals method. The expression
s.equals(t)
returns true if the strings s and t are equal, false otherwise. Note that s and t can be string variables or string literals. For example, the expression
"Hello".equals(greeting)
is perfectly legal. To test whether two strings are identical except for the upper/lowercase letter distinction, use the equalsIgnoreCase method.
"Hello".equalsIgnoreCase("hello")
Do not use the == operator to test whether two strings are equal! It only determines whether or not the strings are stored in the same location. Sure, if strings are in the same location, they must be equal. But it is entirely possible to store multiple copies of identical strings in different places.
String greeting = "Hello"; // initialize greeting to a string
greeting == "Hello" // true
greeting.substring(0, greeting.indexOf("l")) == "He" // false
greeting.substring(0, greeting.indexOf("l")).equals("He") // true
If the virtual machine always arranged for equal strings to be shared, then you could use the == operator for testing equality. But only string literals are shared, not strings that are computed at runtime. Therefore, never use == to compare strings. Always use equals instead.
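Here is a minimal sketch that contrasts the comparisons discussed in this section. The class EqualityDemo and its methods are hypothetical names chosen for illustration.

```java
// Hypothetical demo class; the names are illustrative, not from the text.
class EqualityDemo {
    // A string computed at runtime. Its contents are "He".
    static String computedPrefix() {
        String greeting = "Hello";
        return greeting.substring(0, greeting.indexOf("l"));
    }

    // Compares contents, which is what you almost always want.
    static boolean sameContents(String s, String t) {
        return s.equals(t);
    }

    // Compares contents, ignoring upper/lowercase differences.
    static boolean sameIgnoringCase(String s, String t) {
        return s.equalsIgnoreCase(t);
    }

    public static void main(String[] args) {
        String prefix = computedPrefix();
        System.out.println(sameContents(prefix, "He"));         // true
        System.out.println(sameIgnoringCase("Hello", "hello")); // true
        // prefix == "He" would only test whether both are stored in the
        // same location, which is not the case for a computed string.
    }
}
```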
3.6.6. Empty and Null Strings
The empty string "" is a string of length 0. You can test whether a string is empty by calling
if (str.length() == 0)
or
if (str.equals(""))
or, for optimum efficiency,
if (str.isEmpty())
An empty string is a Java object that holds the string length (namely, 0) and empty contents. However, a String variable can also hold a special value, called null, that indicates that no object is currently associated with the variable. To test whether a string is null, use
if (str == null)
Sometimes, you need to test that a string is neither null nor empty. Then use
if (str != null && str.length() != 0)
You need to test that str is not null first. As you will see in Chapter 4, it is an error to invoke a method on a null value.
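If you need this combined test in several places, you might wrap it in a helper method. The following sketch is one way to do that. The class BlankCheck and the method hasText are hypothetical names, not part of the standard library.

```java
// Hypothetical helper class; the names are illustrative, not from the text.
class BlankCheck {
    // Returns true if str is neither null nor empty.
    static boolean hasText(String str) {
        // The null test must come first. If str is null, the && operator
        // skips the second operand, so isEmpty is never called on null.
        return str != null && !str.isEmpty();
    }
}
```

Note that a string consisting of a single space is not empty, so hasText(" ") is true.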
3.6.7. The String API
The String class in Java contains close to 100 methods. The following API note summarizes the most useful ones.
These API notes, found throughout the book, will help you understand the Java Application Programming Interface (API). Each API note starts with the name of a class, such as java.lang.String. (The significance of the so-called package name java.lang is explained in Chapter 4.) The class name is followed by the names, explanations, and parameter descriptions of one or more methods. A parameter variable of a method is the variable that receives a method argument. For example, as you will see in the first API note below, the charAt method has a parameter called index of type int. If you call the method, you supply an argument of that type, such as str.charAt(0).
The API notes do not list all methods of a particular class but present the most commonly used ones in a concise form. For a full listing, consult the online documentation (see Section 3.6.8).
The number following the class name is the JDK version number in which it was introduced. If a method has been added later, it has a separate version number.
3.6.8. Reading the Online API Documentation
As you just saw, the String class has lots of methods. Furthermore, there are thousands of classes in the standard libraries, with many more methods. It is plainly impossible to remember all useful classes and methods. Therefore, it is essential that you become familiar with the online API documentation that lets you look up all classes and methods in the standard library. You can download the API documentation from Oracle and save it locally, or you can point your browser to https://docs.oracle.com/en/java/javase/25/docs/api.
The API documentation has a search box (see Figure 3.2). Older versions have frames with lists of packages and classes. You can still get those lists by clicking on the Frames menu item. For example, to get more information on the methods of the String class, type “String” into the search box and select the type java.lang.String, or locate the link in the frame with class names and click it. You get the class description, as shown in Figure 3.3.
Figure 3.2: The Java API documentation
Figure 3.3: Class description for the String class
When you scroll down, you reach a summary of all methods, sorted in alphabetical order (see Figure 3.4). Click on any method name for a detailed description of that method (see Figure 3.5). For example, if you click on the compareToIgnoreCase link, you’ll get the description of the compareToIgnoreCase method.
Figure 3.4: Method summary of the String class
Figure 3.5: Detailed description of a String method
3.6.9. Building Strings
Occasionally, you need to build up strings from shorter strings, such as keystrokes or words from a file. It would be inefficient to use string concatenation for this purpose. Every time you concatenate strings, a new String object is constructed. This is time consuming and wastes memory. Using the StringBuilder class avoids this problem.
Follow these steps if you need to build a string from many small pieces. First, construct an empty string builder:
StringBuilder builder = new StringBuilder();
You can also provide initial content:
StringBuilder builder = new StringBuilder("INVOICE\n");
Each time you need to add another part, call the append method.
builder.append(str); // appends a string
builder.appendCodePoint(cp); // appends a single code point
The latter method is occasionally useful when you need to compute a code point. Here is an example. Flag emojis are made up of two code points, each in the range from 127462 (regional indicator symbol letter A) to 127487 (regional indicator symbol letter Z). Now suppose you have a country string such as "IT". Then you can compute the code points as follows:
final int REGIONAL_INDICATOR_SYMBOL_LETTER_A = 127462;
String country = . . .;
builder.appendCodePoint(country.charAt(0) - 'A' + REGIONAL_INDICATOR_SYMBOL_LETTER_A);
builder.appendCodePoint(country.charAt(1) - 'A' + REGIONAL_INDICATOR_SYMBOL_LETTER_A);
When you are done building the string, call the toString method. You will get a String object with the character sequence contained in the builder.
String completedString = builder.toString();
Cleverly, the StringBuilder methods return the builder object, so that you can chain multiple method calls:
String completedString = new StringBuilder()
    .append(str)
    .appendCodePoint(cp)
    .toString();
The String class doesn’t have a method to reverse the Unicode characters of a string, but StringBuilder does. To reverse a string, use this code snippet:
String reversed = new StringBuilder(original).reverse().toString();
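The following sketch puts the StringBuilder steps of this section together. The class BuilderDemo and its methods are hypothetical names. The flag method assumes that the country string consists of exactly two uppercase letters between A and Z, and it does no validation.

```java
// Hypothetical demo class; the names are illustrative, not from the text.
class BuilderDemo {
    static final int REGIONAL_INDICATOR_SYMBOL_LETTER_A = 127462;

    // Assumes country is two uppercase letters A-Z, such as "IT".
    static String flag(String country) {
        return new StringBuilder()
            .appendCodePoint(country.charAt(0) - 'A' + REGIONAL_INDICATOR_SYMBOL_LETTER_A)
            .appendCodePoint(country.charAt(1) - 'A' + REGIONAL_INDICATOR_SYMBOL_LETTER_A)
            .toString();
    }

    // Assembles many small pieces with a single builder.
    static String joinWords(String... words) {
        StringBuilder builder = new StringBuilder();
        for (String word : words) {
            if (builder.length() > 0) builder.append(' '); // separator before all but the first word
            builder.append(word);
        }
        return builder.toString();
    }

    static String reversed(String original) {
        return new StringBuilder(original).reverse().toString();
    }
}
```

The string returned by flag("IT") has length 4, since each regional indicator code point needs two char values.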
The following API notes contain the most important methods for the StringBuilder class.
3.6.10. Text Blocks
The text block feature, added in Java 15, makes it easy to provide string literals that span multiple lines. A text block starts with """, followed by a line feed. The block ends with another """:
String greeting = """
    Hello
    World
    """;
A text block is easier to read and write than the equivalent string literal:
"Hello\nWorld\n"
This string contains two \n: one after Hello and one after World. The newline after the opening """ is not included in the string literal.
If you don’t want a newline after the last line, put the closing """ immediately after the last character:
String prompt = """
    Hello, my name is Hal.
    Please enter your name:""";
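You can check the placement of newlines by comparing text blocks with ordinary string literals. In this sketch, the class TextBlockBasics and its method names are hypothetical. The code needs Java 15 or later.

```java
// Hypothetical demo class; the names are illustrative, not from the text.
class TextBlockBasics {
    // The closing delimiter is on its own line, so the string
    // ends with a newline after World.
    static String greeting() {
        return """
            Hello
            World
            """;
    }

    // The closing delimiter follows the last character directly,
    // so there is no newline at the end.
    static String prompt() {
        return """
            Hello, my name is Hal.
            Please enter your name:""";
    }
}
```

The first method returns the same string as "Hello\nWorld\n", and the second one returns "Hello, my name is Hal.\nPlease enter your name:".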
Text blocks are particularly suited for including code in some other language, such as SQL or HTML. You can just paste it between the triple quotes:
String html = """
    <div class="Warning">
        Beware of those who say "Hello" to the world
    </div>
    """;
All escape sequences from regular strings work the same way in text blocks.
Note that you don’t have to use escape sequences with the quotation marks around Hello. There are just two situations where you need to use the \" escape sequence in a text block:
- If the text block ends in a quotation mark
- If the text block contains a sequence of three or more quotation marks
Unfortunately, you still need the escape sequence \\ to denote a backslash in a text block.
There is one escape sequence that only works in text blocks. A \ directly before the end of a line joins this line and the next. For example,
"""
Hello, my name is Hal. \
Please enter your name:""";
is the same as
"Hello, my name is Hal. Please enter your name:"
Line endings are normalized by removing trailing whitespace and changing any Windows line endings (\r\n) to simple newlines (\n). If you need to preserve trailing spaces, turn the last one into a \s escape. In fact, that’s what you probably want for prompt strings. The following string ends in a space:
"""
Hello, my name is Hal.
Please enter your name:\s""";
The story is more complex for leading whitespace. Consider a typical variable declaration that is indented from the left margin. You can indent the text block as well:
    String html = """
        <div class="Warning">
            Beware of those who say "Hello" to the world
        </div>
        """;
The indentation that is common to all lines in the text block is subtracted. The actual string is
"<div class=\"Warning\">\n    Beware of those who say \"Hello\" to the world\n</div>\n"
Note that there are no indentations in the first and third lines.
You can always avoid this indentation stripping by having no whitespace in the last line, before the closing """. But many programmers seem to find that it looks neater when text blocks are indented. Your IDE may cheerfully offer to indent all text blocks, using tabs or spaces.
Java wisely does not prescribe the width of a tab. The whitespace prefix has to match exactly for all lines in the text block.
Entirely blank lines are not considered when stripping common indentation. However, the whitespace before the closing """ is significant. Be sure to indent to the end of the whitespace that you want to have stripped.
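The rules for escapes and indentation stripping are easiest to absorb by experiment. The following sketch compares several text blocks with the equivalent string literals. The class TextBlockRules and its method names are hypothetical, and the sketch assumes that all indentation consists of spaces.

```java
// Hypothetical demo class; the names are illustrative, not from the text.
// Text blocks need Java 15 or later.
class TextBlockRules {
    // A backslash at the end of a line joins it with the next line.
    static String joined() {
        return """
            Hello, my name is Hal. \
            Please enter your name:""";
    }

    // \s denotes a space that survives the removal of trailing whitespace.
    static String promptWithTrailingSpace() {
        return """
            Please enter your name:\s""";
    }

    // The twelve spaces common to all lines, including the line with the
    // closing delimiter, are removed. The extra four before Beware remain.
    static String html() {
        return """
            <div class="Warning">
                Beware of those who say "Hello" to the world
            </div>
            """;
    }

    // The closing delimiter is indented four spaces less than the content,
    // so four spaces remain in front of Hello.
    static String keepsIndent() {
        return """
                Hello
            """;
    }
}
```

The last method returns "    Hello\n", which shows that the whitespace before the closing delimiter takes part in determining the common indentation.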
