This chapter is from the book

17.2 Parsing Strings by Using StringTokenizer

A common task when doing network programming is to break a large string down into various constituents. A developer could accomplish this task by using low-level String methods such as indexOf and substring to return substrings bounded by certain delimiters. However, the Java platform has a built-in class to simplify this process: the StringTokenizer class. This class isn't specific to network programming (the class is located in java.util, not java.net), but because string processing tends to be a large part of client-server programming, we discuss it here.
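For comparison, the low-level approach might look roughly like the sketch below. The class name ManualSplit and its split helper are hypothetical, introduced only for illustration, and the sketch handles just a single delimiter character. Like StringTokenizer, it skips empty pieces between adjacent delimiters.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits on one delimiter character by hand. */
class ManualSplit {
  static List<String> split(String input, char delimiter) {
    List<String> parts = new ArrayList<>();
    int start = 0;
    int end;
    // Walk from delimiter to delimiter, copying out the text in between.
    while ((end = input.indexOf(delimiter, start)) != -1) {
      if (end > start) {  // ignore empty pieces between adjacent delimiters
        parts.add(input.substring(start, end));
      }
      start = end + 1;
    }
    if (start < input.length()) {  // text after the last delimiter
      parts.add(input.substring(start));
    }
    return parts;
  }

  public static void main(String[] args) {
    // Prints [GET, /index.html, HTTP/1.0]
    System.out.println(split("GET /index.html HTTP/1.0", ' '));
  }
}
```

Even for one delimiter, the bookkeeping for the start and end indexes is easy to get wrong, which is the motivation for a dedicated tokenizer class.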

The StringTokenizer Class

The idea is that you build a tokenizer from an initial string, then retrieve tokens one at a time with nextToken. The tokens are bounded by a set of delimiters that you either define when the tokenizer is created or supply as an optional argument to nextToken. You can also find out how many tokens remain (countTokens) or simply test whether any tokens remain (hasMoreTokens). The most common methods are summarized below.

Constructors

public StringTokenizer(String input)

This constructor builds a tokenizer from the input string, using white space (space, tab, newline, carriage return, and form feed) as the set of delimiters. The delimiters will not be included as part of the tokens returned.

public StringTokenizer(String input, String delimiters)

This constructor creates a tokenizer from the input string, using the specified delimiters. The delimiters will not be included as part of the tokens returned.

public StringTokenizer(String input, String delimiters, boolean includeDelimiters)

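This constructor builds a tokenizer from the input string, using the specified delimiters. If the third argument is true, the delimiters are also returned as tokens, with each delimiter character returned as a separate one-character token. If it is false, the delimiters are skipped, as with the two-argument constructor.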
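The following minimal sketch contrasts the three constructors on small inputs. The class name ConstructorDemo, the drain helper, and the sample strings are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ConstructorDemo {
  /** Collects all remaining tokens into a list so they are easy to inspect. */
  static List<String> drain(StringTokenizer tok) {
    List<String> tokens = new ArrayList<>();
    while (tok.hasMoreTokens()) {
      tokens.add(tok.nextToken());
    }
    return tokens;
  }

  public static void main(String[] args) {
    String input = "a=1;b=2";
    // Default white-space delimiters: prints [a, b, c]
    System.out.println(drain(new StringTokenizer("a b\tc")));
    // Explicit delimiters, not returned: prints [a, 1, b, 2]
    System.out.println(drain(new StringTokenizer(input, "=;")));
    // Delimiters returned as one-character tokens: prints [a, =, 1, ;, b, =, 2]
    System.out.println(drain(new StringTokenizer(input, "=;", true)));
  }
}
```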

Methods

public String nextToken()

This method returns the next token. The method throws a NoSuchElementException if no characters remain or only delimiter characters remain.

public String nextToken(String delimiters)

This method changes the set of delimiters, then returns the next token. The new delimiter set remains in effect for subsequent calls. The method throws a NoSuchElementException if no characters remain or only delimiter characters remain.
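As a rough illustration of switching delimiters partway through a string, the sketch below parses input shaped like "key=v1,v2". The class name DelimiterSwitchDemo, the parse helper, and the input format are hypothetical. Note that the old delimiter is kept in the new set: after the first token is returned, the tokenizer's position is still at the "=" character, so a new set without "=" would leave that character at the front of the next token.

```java
import java.util.StringTokenizer;

class DelimiterSwitchDemo {
  /** Returns {key, firstValue, secondValue} for input shaped like "key=v1,v2". */
  static String[] parse(String input) {
    StringTokenizer tok = new StringTokenizer(input, "=");
    String key = tok.nextToken();
    // Switch to "=" plus ","; keeping "=" lets the tokenizer skip the "="
    // that still sits at the current position.
    String first = tok.nextToken("=,");
    // No argument: the "=," set from the previous call is still in effect.
    String second = tok.nextToken();
    return new String[] { key, first, second };
  }

  public static void main(String[] args) {
    // Prints color | red | green
    System.out.println(String.join(" | ", parse("color=red,green")));
  }
}
```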

public int countTokens()

This method returns the number of tokens remaining, based on the current set of delimiters.
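One plausible use of countTokens is to size an array before extracting the tokens, as in this small sketch. The class name CountTokensDemo and the toArray helper are assumptions for illustration.

```java
import java.util.StringTokenizer;

class CountTokensDemo {
  /** Copies every token of the input into an array of exactly the right size. */
  static String[] toArray(String input, String delimiters) {
    StringTokenizer tok = new StringTokenizer(input, delimiters);
    // countTokens only counts; it does not consume any tokens.
    String[] result = new String[tok.countTokens()];
    for (int i = 0; i < result.length; i++) {
      result[i] = tok.nextToken();
    }
    return result;
  }

  public static void main(String[] args) {
    String[] parts = toArray("www.example.com", ".");
    // Prints 3 tokens, last is com
    System.out.println(parts.length + " tokens, last is " + parts[parts.length - 1]);
  }
}
```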

public boolean hasMoreTokens()

This method determines whether any tokens remain, based on the current set of delimiters. Most applications should either check for tokens before calling nextToken or catch the NoSuchElementException that nextToken throws when none remain. Note that in some JDK implementations, notably early releases, hasMoreTokens has the side effect of advancing the internal position past leading delimiters. This yields unexpected results in the rare but possible sequence of checking hasMoreTokens with one delimiter set, then calling nextToken with another delimiter set, so that combination is best avoided.
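The two defensive styles can be sketched side by side as follows. The class name SafeTokenDemo, both helper methods, and the fallback parameter are hypothetical names chosen for this example.

```java
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

class SafeTokenDemo {
  /** Style 1: test with hasMoreTokens before calling nextToken. */
  static String firstTokenChecked(String input, String fallback) {
    StringTokenizer tok = new StringTokenizer(input);
    return tok.hasMoreTokens() ? tok.nextToken() : fallback;
  }

  /** Style 2: call nextToken directly and catch the exception. */
  static String firstTokenCaught(String input, String fallback) {
    try {
      return new StringTokenizer(input).nextToken();
    } catch (NoSuchElementException e) {
      return fallback;  // input was empty or contained only delimiters
    }
  }

  public static void main(String[] args) {
    System.out.println(firstTokenChecked(" hi there", "none"));  // prints hi
    System.out.println(firstTokenCaught("   ", "none"));         // prints none
  }
}
```

Checking first is usually the more readable choice in a loop; catching the exception suits cases where a missing token really is an error condition.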

Example: Interactive Tokenizer

A good way to get a feel for how StringTokenizer works is to try a variety of test cases. Listing 17.4 gives a simple class that lets you enter an input string and a set of delimiters on the command line and prints the resulting tokens, one per line.

Listing 17.4 TokTest.java

import java.util.StringTokenizer;

/** Prints the tokens resulting from treating the first
 * command-line argument as the string to be tokenized
 * and the second as the delimiter set.
 */

public class TokTest {
 public static void main(String[] args) {
  if (args.length == 2) {
   String input = args[0], delimiters = args[1];
   StringTokenizer tok = 
    new StringTokenizer(input, delimiters);
   while (tok.hasMoreTokens()) {
    System.out.println(tok.nextToken());
   }
  } else {
   System.out.println
    ("Usage: java TokTest string delimiters");
  }
 }
}

Here is TokTest in action:

> java TokTest http://www.microsoft.com/~gates/ :/.
http
www
microsoft
com
~gates
> java TokTest "if (tok.hasMoreTokens()) {" "(){. "
if
tok
hasMoreTokens