Strings

Strings are sequences of characters, such as "Hello". Java does not have a built-in string type. Instead, the standard Java library contains a predefined class called, naturally enough, String. Each quoted string is an instance of the String class:

String e = ""; // an empty string
String greeting = "Hello";

Concatenation

Java, like most programming languages, allows you to use the + sign to join (concatenate) two strings together.

String expletive = "Expletive";
String PG13 = "deleted";
String message = expletive + PG13;

The above code makes the value of the string variable message "Expletivedeleted". (Note the lack of a space between the words: the + sign joins two strings together in the order received, exactly as they are given.)

When you concatenate a string with a value that is not a string, the latter is converted to a string. (As you will see in Chapter 5, every Java object can be converted to a string.) For example:

int age = 13;
String rating = "PG" + age;

sets rating to the string "PG13".

This feature is commonly used in output statements; for example,

System.out.println("The answer is " + answer);

is perfectly acceptable and will print what one would want (and with the correct spacing because of the space after the word is).
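The concatenation rules above can be tried out with a small sketch like the following. The class name ConcatDemo and the helper rating are made up for illustration. The last two lines rely on the fact that + is evaluated from left to right, which is why parentheses matter when numbers are involved.

```java
class ConcatDemo {
    // Hypothetical helper: joins a label and a number. The int is converted
    // to its string form before the two pieces are joined.
    static String rating(String prefix, int age) {
        return prefix + age;
    }

    public static void main(String[] args) {
        String message = "Expletive" + "deleted"; // no space is inserted
        System.out.println(message);              // Expletivedeleted
        System.out.println(rating("PG", 13));     // PG13

        // + is evaluated left to right, so parentheses change the result.
        System.out.println("Sum: " + 1 + 2);      // Sum: 12
        System.out.println("Sum: " + (1 + 2));    // Sum: 3
    }
}
```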

Substrings

You extract a substring from a larger string with the substring method of the String class. For example,

String greeting = "Hello";
String s = greeting.substring(0, 4);

creates a string consisting of the characters "Hell". Java counts the characters in strings in a peculiar fashion: the first character in a string has position 0, just as in C and C++. (In C, there was a technical reason for counting positions starting at 0, but that reason has long gone away, and only the nuisance remains.)

For example, the character 'H' has position 0 in the string "Hello", and the character 'o' has position 4. The second parameter of substring is the first position that you do not want to copy. In our case, we want to copy the characters in positions 0, 1, 2, and 3 (from position 0 to position 3 inclusive). As substring counts it, this means from position 0 inclusive to position 4 exclusive.

There is one advantage to the way substring works: it is easy to compute the length of the substring. The string s.substring(a, b) always has b - a characters. For example, the substring "Hell" has length 4 - 0 = 4.
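As a quick check of the inclusive/exclusive rule, here is a minimal sketch. The class SubstringDemo and the helper slice are illustrative names, not part of the Java library.

```java
class SubstringDemo {
    // Hypothetical helper: returns the characters from position begin
    // (inclusive) up to position end (exclusive).
    static String slice(String s, int begin, int end) {
        return s.substring(begin, end);
    }

    public static void main(String[] args) {
        String greeting = "Hello";
        String s = slice(greeting, 0, 4);
        System.out.println(s);                     // Hell
        System.out.println(s.length());            // 4, that is, 4 - 0
        System.out.println(slice(greeting, 1, 3)); // el
    }
}
```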

String Editing

To find out the length of a string, use the length method. For example:

String greeting = "Hello";
int n = greeting.length(); // is 5.

Just as char denotes a Unicode character, String denotes a sequence of Unicode characters. You can get at individual characters of a string: s.charAt(n) returns the Unicode character at position n, where n is between 0 and s.length() - 1. For example,

char last = greeting.charAt(4); // position 4 holds 'o'

However, the String class gives no methods that let you change a character in an existing string. If you want to turn greeting into "Hell!", you cannot directly change the last position of greeting into a '!'. If you are a C programmer, this will make you feel pretty helpless. How are you going to modify the string? In Java, it is quite easy: take the substring that you want to keep, and then concatenate it with the characters that you want to put in place of the rest.

greeting = greeting.substring(0, 4) + "!";

This changes the current value of the greeting variable to "Hell!".

Since you cannot change the individual characters in a Java string, the documentation refers to the objects of the String class as being immutable. Just as the number 3 is always 3, the string "Hello" will always contain the character sequence 'H', 'e', 'l', 'l', 'o'. You cannot change these values. As you just saw, however, you can change the contents of the string variable greeting and make it refer to a different string, just as you can make a numeric variable currently holding the value 3 hold the value 4.
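The difference between changing a string and changing a string variable can be seen in a short sketch. The helper withCharAt is a hypothetical name. It builds a new string by the substring-and-concatenate technique, and a second variable that still refers to the old string shows that the original characters were never touched.

```java
class ImmutableDemo {
    // Hypothetical helper: builds a new string in which the character at
    // position index is replaced by c. The original string is left as it was.
    static String withCharAt(String s, int index, char c) {
        return s.substring(0, index) + c + s.substring(index + 1);
    }

    public static void main(String[] args) {
        String greeting = "Hello";
        String original = greeting;     // a second reference to "Hello"
        greeting = withCharAt(greeting, 4, '!');
        System.out.println(greeting);   // Hell!
        System.out.println(original);   // Hello
    }
}
```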

Isn't that a lot less efficient? It would seem simpler to change the characters than to build up a whole new string from scratch. Well, yes and no. Indeed, it isn't efficient to generate a new string that holds the concatenation of "Hell" and "!". But immutable strings have one great advantage: The compiler can arrange that strings are shared.

To understand how this works, think of the various strings as sitting in a common pool. String variables then point to locations in the pool. If you copy a string variable, both the original and the copy share the same characters. Overall, the designers of Java decided that the efficiency of sharing outweighs the inefficiency of string editing by extracting substrings and concatenating.

Look at your own programs; we suspect that most of the time, you don't change strings—you just compare them. Of course, there are some cases in which direct manipulation of strings is more efficient. (One example is when assembling strings from individual characters that come from a file or the keyboard.) For these situations, Java provides a separate StringBuffer class that we describe in Chapter 12. If you are not concerned with the efficiency of string handling (which is not a bottleneck in many Java applications anyway), you can ignore StringBuffer and just use String.
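For the character-by-character case mentioned above, a minimal sketch with StringBuffer might look as follows. It uses only the append and toString methods. The class name BufferDemo and the helper assemble are made up for this example. (Later JDK releases also offer StringBuilder, an unsynchronized class with the same methods.)

```java
class BufferDemo {
    // Hypothetical helper: assembles a string one character at a time
    // without creating a new String object for every intermediate result.
    static String assemble(char[] chars) {
        StringBuffer buffer = new StringBuffer();
        for (int i = 0; i < chars.length; i++) {
            buffer.append(chars[i]);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        char[] input = { 'H', 'e', 'l', 'l', 'o' };
        System.out.println(assemble(input)); // Hello
    }
}
```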

C programmers generally are bewildered when they see Java strings for the first time, because they think of strings as arrays of characters:

char greeting[] = "Hello";

That is the wrong analogy: a Java string is roughly analogous to a char* pointer,

char* greeting = "Hello";

When you replace greeting with another string, the Java code does roughly the following:

char* temp = malloc(6);
strncpy(temp, greeting, 4);
strncpy(temp + 4, "!", 2);
greeting = temp;

Sure, now greeting points to the string "Hell!". And even the most hardened C programmer must admit that the Java syntax is more pleasant than a sequence of strncpy calls. But what if we make another assignment to greeting?

greeting = "Howdy";

Don't we have a memory leak? After all, the original string was allocated on the heap. Fortunately, Java does automatic garbage collection. If a block of memory is no longer needed, it will eventually be recycled.

If you are a C++ programmer and use the string class defined by ANSI C++, you will be much more comfortable with the Java String type. C++ string objects also perform automatic allocation and deallocation of memory. The memory management is performed explicitly by constructors, assignment operators, and destructors. However, C++ strings are mutable—you can modify individual characters in a string.

Testing Strings for Equality

To test whether or not two strings are equal, use the equals method; the expression

s.equals(t)

returns true if the strings s and t are equal, false otherwise. Note that s and t can be string variables or string constants. For example,

"Hello".equals(command)

is perfectly legal. To test if two strings are identical except for the upper/lowercase letter distinction, use the equalsIgnoreCase method.

"Hello".equalsIgnoreCase("hello")

Do not use the == operator to test if two strings are equal! It only determines whether or not the strings are stored in the same location. Sure, if strings are in the same location, they must be equal. But it is entirely possible to store multiple copies of identical strings in different places.

String greeting = "Hello"; //initialize greeting to a string
if (greeting == "Hello") . . .
   // probably true
if (greeting.substring(0, 4) == "Hell") . . .
   // probably false

If the virtual machine always arranged for equal strings to be shared, then you could use == for testing equality. But only string constants are shared, not strings that are the result of operations like + or substring. Therefore, never use == to compare strings, or you will have a program with the worst kind of bug: an intermittent one that seems to occur randomly.
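The following sketch contrasts the two kinds of comparison. The class name EqualityDemo and the helper sameContents are illustrative. The result of the == line is deliberately not annotated, because it depends on where the virtual machine happens to store the strings.

```java
class EqualityDemo {
    // Hypothetical helper: compares the characters of the two strings,
    // not their storage locations.
    static boolean sameContents(String s, String t) {
        return s.equals(t);
    }

    public static void main(String[] args) {
        String greeting = "Hello";
        String part = greeting.substring(0, 4);

        System.out.println(sameContents(part, "Hell"));        // true
        System.out.println("Hello".equalsIgnoreCase("hello")); // true

        // Compares locations only; do not rely on the result.
        System.out.println(part == "Hell");
    }
}
```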

If you are used to the C++ string class, you have to be particularly careful about equality testing. The C++ string class does overload the == operator to test for equality of the string contents. It is perhaps unfortunate that Java goes out of its way to give strings the same “look and feel” as numeric values but then makes strings behave like pointers for equality testing. The language designers could have redefined == for strings, just as they made a special arrangement for +. Oh well, every language has its share of inconsistencies.

C programmers never use == to compare strings but use strcmp instead. The Java method compareTo is the exact analog to strcmp. You can use

if (greeting.compareTo("Help") == 0) . . .

but it seems clearer to use equals instead.
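To see how compareTo reports ordering, consider this small sketch. The helper order is a made-up name. It reduces the result to -1, 0, or 1, because only the sign of the value returned by compareTo is meaningful.

```java
class CompareDemo {
    // Hypothetical helper: reduces the result of compareTo to -1, 0, or 1.
    static int order(String a, String b) {
        int c = a.compareTo(b);
        return c < 0 ? -1 : (c > 0 ? 1 : 0);
    }

    public static void main(String[] args) {
        System.out.println(order("Hello", "Help")); // -1: 'l' comes before 'p'
        System.out.println(order("Help", "Hello")); // 1
        System.out.println(order("Help", "Help"));  // 0: the strings are equal
    }
}
```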

The String class in Java contains more than 50 methods. A surprisingly large number of them are sufficiently useful so that we can imagine using them frequently. The following API note summarizes the ones we found most useful.

You will find these API notes throughout the book to help you understand the Java Application Programming Interface (API). Each API note starts with the name of a class such as java.lang.String—the significance of the so-called package name java.lang will be explained in Chapter 5. The class name is followed by the names, explanations, and parameter descriptions of one or more methods.

We typically do not list all methods of a particular class but instead select those that are most commonly used, and describe them in a concise form. For a full listing, consult the on-line documentation.

We also list the version number in which a particular class was introduced. If a method has been added later, it has a separate version number.

java.lang.String 1.0

  • char charAt(int index)

    returns the character at the specified location.

  • int compareTo(String other)

    returns a negative value if the string comes before other in dictionary order, a positive value if the string comes after other in dictionary order, or 0 if the strings are equal.

  • boolean endsWith(String suffix)

    returns true if the string ends with suffix.

  • boolean equals(Object other)

    returns true if the string equals other.

  • boolean equalsIgnoreCase(String other)

    returns true if the string equals other, except for upper/lowercase distinction.

  • int indexOf(String str)

  • int indexOf(String str, int fromIndex)

    return the start of the first substring equal to str, starting at index 0 or at fromIndex.

  • int lastIndexOf(String str)

  • int lastIndexOf(String str, int fromIndex)

    return the start of the last substring equal to str, starting at the end of the string or at fromIndex.

  • int length()

    returns the length of the string.

  • String replace(char oldChar, char newChar)

    returns a new string that is obtained by replacing all characters oldChar in the string with newChar.

  • boolean startsWith(String prefix)

    returns true if the string begins with prefix.

  • String substring(int beginIndex)

  • String substring(int beginIndex, int endIndex)

    return a new string consisting of all characters from beginIndex until the end of the string or until endIndex (exclusive).

  • String toLowerCase()

    returns a new string containing all characters in the original string, with uppercase characters converted to lower case.

  • String toUpperCase()

    returns a new string containing all characters in the original string, with lowercase characters converted to upper case.

  • String trim()

    returns a new string by eliminating all leading and trailing spaces in the original string.
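Several of the methods in this API note can be combined in a short sketch. The class StringApiDemo and the helper extension are hypothetical; the helper merely illustrates how trim, toLowerCase, lastIndexOf, and the one-argument substring fit together.

```java
class StringApiDemo {
    // Hypothetical helper: returns the lowercase text after the last '.',
    // or an empty string if the name contains no '.'.
    static String extension(String fileName) {
        String name = fileName.trim().toLowerCase();
        int dot = name.lastIndexOf(".");
        return dot < 0 ? "" : name.substring(dot + 1);
    }

    public static void main(String[] args) {
        System.out.println(extension("  Report.TXT "));  // txt
        System.out.println("Hello".indexOf("l"));        // 2
        System.out.println("Hello".lastIndexOf("l"));    // 3
        System.out.println("Hello".replace('l', 'L'));   // HeLLo
        System.out.println("Hello".startsWith("He"));    // true
        System.out.println("Hello".endsWith("lo"));      // true
        System.out.println("Hello".toUpperCase());       // HELLO
    }
}
```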

Reading the On-line API Documentation

As you just saw, the String class has lots of methods. Furthermore, there are hundreds of classes in the standard libraries, with many more methods. It is plainly impossible to remember all useful classes and methods. Therefore, it is essential that you become familiar with the on-line API documentation that lets you look up all classes and methods in the standard library. The API documentation is part of the Java SDK. It is in HTML format.

Point your web browser to the docs/api/index.html file of your Java SDK installation. You will see a screen as in Figure 3-2.

Figure 3-2. The three panes of the API documentation

The screen is organized into three windows. A small window on the top left shows all available packages. Below it, a larger window lists all classes. Click on any class name, and the API documentation for the class is displayed in the large window to the right (see Figure 3-3). For example, to get more information on the methods of the String class, scroll the second window until you see the String link, then click on it.

Figure 3-3. Class description for the String class

Then scroll the window on the right until you reach a summary of all methods, sorted in alphabetical order (see Figure 3-4). Click on any method name for a detailed description of the method (see Figure 3-5). For example, if you click on the compareToIgnoreCase link, you get the description of the compareToIgnoreCase method.

Figure 3-4. Method summary of the String class

Figure 3-5. Detailed description of a String method

Bookmark the docs/api/index.html page in your browser right now.

Reading Input

You saw that it is easy to print output to the “standard output device” (that is, the console window) just by calling System.out.println. Unfortunately, it is quite a bit more complex to read keyboard input from the “standard input device.”

However, it is easy to supply a dialog box for keyboard input. The method call

JOptionPane.showInputDialog(promptString)

puts up a dialog box that prompts the user for input (see Figure 3-6). The return value is the string that the user typed.

Figure 3-6. An input dialog

For example, here is how you can query the name of the user of your program:

String name = JOptionPane.showInputDialog("What is your name?");

To read in a number, you have to work a little harder. The JOptionPane.showInputDialog method returns a string, not a number. You use the Integer.parseInt or Double.parseDouble method to convert the string to its numeric value. For example,

String input = JOptionPane.showInputDialog("How old are you?");
int age = Integer.parseInt(input);

If the user types 40, then the string variable input is set to the string "40". The Integer.parseInt method converts the string to its numeric value, the number 40.

If the parameter of the parseInt method contains non-digits, then the method throws an exception. Unless your program “catches” the exception, the virtual machine terminates the program and prints an error message to the console. You will see in Chapter 11 how to catch exceptions.
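As a preview of what catching the exception might look like, here is a minimal sketch. The helper parseOrDefault and its fallback parameter are made up for illustration. The sketch also guards against null, since showInputDialog returns null when the user cancels the dialog. It does not open a dialog itself, so it can be run from the console.

```java
class ParseDemo {
    // Hypothetical helper: converts the input to an int, or returns the
    // fallback value if the input is null or is not a valid integer.
    static int parseOrDefault(String input, int fallback) {
        if (input == null) {
            return fallback; // for example, the user cancelled the dialog
        }
        try {
            return Integer.parseInt(input.trim());
        } catch (NumberFormatException e) {
            return fallback; // the input contained something other than an integer
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("40", -1));    // 40
        System.out.println(parseOrDefault("forty", -1)); // -1
        System.out.println(parseOrDefault(null, -1));    // -1
    }
}
```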

The program in Example 3-2 asks for the user's name and age and then prints out a message like

Hello, Cay. Next year, you'll be 41

When you run the program, you will see that a first dialog appears to prompt for the name. The dialog goes away, and a second dialog asks for the age. Finally, the reply is displayed in the console window, not in a dialog window. This is not very elegant, of course. You will see in later chapters how to program much more pleasant user interfaces. For now, we'll stick to JOptionPane.showInputDialog and System.out.println because they are easy to use.

Note that the program ends with the method call:

System.exit(0);

Whenever your program calls JOptionPane.showInputDialog, you need to end it with a call to System.exit(0). The reason is a bit technical. Showing a dialog box starts a new thread of control. When the main method exits, the new thread does not automatically terminate. To end all threads, you need to call the System.exit method. (For more information on threads, see Chapter 1 of Volume 2.)

The System.exit method receives an integer parameter, the “exit code” of the program. By convention, a program exits with exit code 0 if it completed successfully, and with a non-zero exit code otherwise. You can use different exit codes to indicate different error conditions. The exiting program communicates the exit code to the operating system. Shell scripts and batch files can then test the exit code.

Finally, note the line

import javax.swing.*;

at the beginning of the program. The JOptionPane class is defined in the javax.swing package. Whenever you use a class that is not defined in the basic java.lang package, you need to use an import directive. We will look at packages and import directives in more detail in Chapter 5.

Example 3-2 InputTest.java

import javax.swing.*;

public class InputTest
{
   public static void main(String[] args)
   {
      // get first input
      String name = JOptionPane.showInputDialog
         ("What is your name?");

      // get second input
      String input = JOptionPane.showInputDialog
         ("How old are you?");

      // convert string to integer value
      int age = Integer.parseInt(input);

      // display output on console
      System.out.println("Hello, " + name +
         ". Next year, you'll be " + (age + 1));

      System.exit(0);
   }
}

javax.swing.JOptionPane 1.2

  • static String showInputDialog(Object message)

    displays a dialog box with a message prompt, an input field, and “Ok” and “Cancel” buttons. The method returns the string that the user typed.

java.lang.System 1.0

  • static void exit(int status)

    terminates the virtual machine and passes the status code to the operating system. By convention, a non-zero status code indicates an error.

Formatting Output

You can print a number x to the console with the statement System.out.print(x). That command will print x with the maximum number of non-zero digits for that type. For example,

double x = 10000.0 / 3.0;
System.out.print(x);

prints

3333.3333333333335

That is a problem if you want to display, for example, dollars and cents.

You can control the display format to arrange your output neatly. The NumberFormat class in the java.text package has three methods that yield standard formatters for

  • numbers

  • currency values

  • percentage values

Suppose that the United States locale is your default locale. (A locale is a set of specifications for country-specific properties of strings and numbers, such as collation order, currency symbol, and so on. Locales are an important concept for writing internationalized applications—programs that are acceptable to users from countries around the world. We will discuss internationalization in Volume 2.) Then, the value 10000.0 / 3.0 will print as

3,333.333
$3,333.33
333,333%

in these three formats. As you can see, the formatter adds the commas that separate the thousands, currency symbols ($), and percent signs.

To obtain a formatter for the default locale, use one of the three methods:

NumberFormat.getNumberInstance()
NumberFormat.getCurrencyInstance()
NumberFormat.getPercentInstance()

Each of these methods returns an object of type NumberFormat. You can use that object to format one or more numbers. You then apply the format method to the NumberFormat object to get a string that contains the formatted number. Once you have the formatted string, you will probably simply display the newly formatted number by printing the string:

double x = 10000.0 / 3.0;
NumberFormat formatter = NumberFormat.getNumberInstance();
String s = formatter.format(x); // the string "3,333.333"
System.out.println(s);
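The three standard formatters can be compared side by side with a sketch like this one. To keep the output independent of the machine's default locale, it passes Locale.US explicitly instead of calling the no-argument methods. The class name FormatDemo and its helper methods are illustrative.

```java
import java.text.NumberFormat;
import java.util.Locale;

class FormatDemo {
    // Hypothetical helpers: each formats x with one of the three standard
    // formatters, using the United States locale.
    static String asNumber(double x) {
        return NumberFormat.getNumberInstance(Locale.US).format(x);
    }

    static String asCurrency(double x) {
        return NumberFormat.getCurrencyInstance(Locale.US).format(x);
    }

    static String asPercent(double x) {
        return NumberFormat.getPercentInstance(Locale.US).format(x);
    }

    public static void main(String[] args) {
        double x = 10000.0 / 3.0;
        System.out.println(asNumber(x));   // 3,333.333
        System.out.println(asCurrency(x)); // $3,333.33
        System.out.println(asPercent(x));  // 333,333%
    }
}
```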

You also may want to set the minimum and maximum number of integer digits or fractional digits to display. You can do this with the setMinimumIntegerDigits, setMinimumFractionDigits, setMaximumIntegerDigits, and setMaximumFractionDigits methods in the NumberFormat class. For example,

double x = 10000.0 / 3.0;
NumberFormat formatter = NumberFormat.getNumberInstance();
formatter.setMaximumFractionDigits(4);
formatter.setMinimumIntegerDigits(6);
String s = formatter.format(x); // the string "003,333.3333"

Setting the maximum number of fractional digits is often useful. The last displayed digit is rounded; by default, the formatter rounds to the nearest value and breaks exact ties toward the even digit. If you want to show trailing zeroes, set the minimum number of fractional digits to the same value as the maximum. Otherwise, you should leave the minimum number of fractional digits at the default value, 0.

Setting the number of integer digits is much less common. By specifying a minimum number, you force leading zeroes for smaller values. Specifying a maximum number is downright dangerous—the displayed value is silently truncated, yielding a nicely formatted but very wrong result.
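The effect of the digit settings, including the truncation hazard, can be sketched as follows. The class DigitsDemo and its helpers are hypothetical, and Locale.US is passed explicitly so that the separators do not depend on the default locale.

```java
import java.text.NumberFormat;
import java.util.Locale;

class DigitsDemo {
    // Hypothetical helper: formats x with the given minimum number of integer
    // digits and the given minimum and maximum number of fractional digits.
    static String format(double x, int minInt, int minFrac, int maxFrac) {
        NumberFormat formatter = NumberFormat.getNumberInstance(Locale.US);
        formatter.setMinimumIntegerDigits(minInt);
        formatter.setMinimumFractionDigits(minFrac);
        formatter.setMaximumFractionDigits(maxFrac);
        return formatter.format(x);
    }

    // Hypothetical helper: shows the danger of limiting the integer digits.
    static String truncated(double x, int maxInt) {
        NumberFormat formatter = NumberFormat.getNumberInstance(Locale.US);
        formatter.setMaximumIntegerDigits(maxInt);
        return formatter.format(x);
    }

    public static void main(String[] args) {
        double x = 10000.0 / 3.0;
        System.out.println(format(x, 6, 0, 4));   // 003,333.3333
        System.out.println(format(x, 1, 2, 2));   // 3,333.33
        System.out.println(format(2.5, 1, 2, 2)); // 2.50 (trailing zero supplied)
        System.out.println(truncated(x, 2));      // 33.333 (leading digits lost)
    }
}
```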

If you are familiar with the C printf function and are longing for its simplicity, check out the Format class at http://www.horstmann.com/corejava.html. It is a Java class that faithfully replicates the behavior of printf. For example, Format.printf("%8.2f", 10000.0 / 3.0) prints the string " 3333.33" (with a leading space to yield a field width of 8 characters, and 2 digits after the decimal point).
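If you are working with JDK 5.0 or later (an assumption that goes beyond the SDK version described in this chapter), the standard library itself accepts printf-style format strings through String.format. The sketch below uses that method rather than the Format class mentioned above. The class name PrintfDemo and the helper fixed are illustrative.

```java
import java.util.Locale;

class PrintfDemo {
    // Hypothetical helper: formats x in a field of width 8 with 2 digits
    // after the decimal point. Requires JDK 5.0 or later.
    static String fixed(double x) {
        return String.format(Locale.US, "%8.2f", x);
    }

    public static void main(String[] args) {
        // The brackets make the leading space visible.
        System.out.println("[" + fixed(10000.0 / 3.0) + "]"); // [ 3333.33]
    }
}
```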

You can also obtain number formats that are appropriate for different locales. For example, let us look up the number formats that are used by the German locale and use them to print our test output. There is a predefined object named Locale.GERMANY of a type called Locale that knows about German number formatting rules. When we pass that Locale object to the getNumberInstance method, we obtain a formatter that follows those German rules.

double x = 10000.0 / 3.0;
NumberFormat formatter
   = NumberFormat.getNumberInstance(Locale.GERMANY);
System.out.println(formatter.format(x));
formatter = NumberFormat.getCurrencyInstance(Locale.GERMANY);
System.out.println(formatter.format(x));

This code prints the numbers:

3.333,333
3.333,33 DM

Note that the German convention for periods and commas in numbers is the exact opposite of the U.S. convention: a comma is used as the decimal separator, and a period is used to separate thousands. Also, the formatter knows that the currency symbol (DM) is placed after the number.
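The two conventions can be compared directly with a short sketch. The class LocaleDemo and the helper formatFor are made-up names. The currency line carries no expected output, because the currency symbol that the formatter prints depends on the version of the Java runtime.

```java
import java.text.NumberFormat;
import java.util.Locale;

class LocaleDemo {
    // Hypothetical helper: formats x as a plain number for the given locale.
    static String formatFor(Locale locale, double x) {
        return NumberFormat.getNumberInstance(locale).format(x);
    }

    public static void main(String[] args) {
        double x = 10000.0 / 3.0;
        System.out.println(formatFor(Locale.US, x));      // 3,333.333
        System.out.println(formatFor(Locale.GERMANY, x)); // 3.333,333

        // The currency symbol shown here varies with the runtime version.
        System.out.println(
            NumberFormat.getCurrencyInstance(Locale.GERMANY).format(x));
    }
}
```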

java.text.NumberFormat 1.1

  • static NumberFormat getCurrencyInstance()

    returns a NumberFormat object to convert currency values to strings using the conventions of the current locale.

  • static NumberFormat getNumberInstance()

    returns a NumberFormat object to format numbers using the conventions of the current locale.

  • static NumberFormat getPercentInstance()

    returns a NumberFormat object to convert percentages to strings.

  • String format(double number)

    returns a string that contains the formatted number.

  • void setMaximumFractionDigits(int digits)

    Parameters: digits - the number of digits to display

    sets the maximum number of digits after the decimal point for the format object. The last displayed digit is rounded.

  • void setMaximumIntegerDigits(int digits)

    Parameters: digits - the number of digits to display

    sets the maximum number of digits before the decimal point for the format object. Use this method with extreme caution. If you specify too few digits, then the number is simply truncated, displaying a dramatically wrong result!

  • void setMinimumFractionDigits(int digits)

    Parameters: digits - the number of digits to display

    sets the minimum number of digits after the decimal point for the format object. If the number has fewer fractional digits than the minimum, then trailing zeroes are supplied.

  • void setMinimumIntegerDigits(int digits)

    Parameters: digits - the number of digits to display

    sets the minimum number of digits before the decimal point for the format object. If the number has fewer digits than the minimum, then leading zeroes are supplied.
