This chapter is from the book

17.5 The URL Class

The URL class provides simple access to URLs. The class automatically parses a string for you, letting you retrieve the protocol (e.g., http), host (e.g., java.sun.com), port (e.g., 80), and filename (e.g., /reports/earnings.html) separately. The URL class also provides an easy-to-use interface for reading remote files.

Reading from a URL

Although writing a client that explicitly connects to an HTTP server and retrieves a URL was quite simple, this task is so common that the Java programming language provides a helper class: java.net.URL. We saw this class when we looked at applets (see Section 9.5, "Other Applet Methods"): an object of this type needed to be passed to getAppletContext().showDocument. However, the URL class can also be used to parse a string representing a URL and to read the contents. An example of parsing a URL and reading its contents is shown in Listing 17.11.

Listing 17.11 UrlRetriever2.java

import java.net.*;
import java.io.*;

/** Read a remote file using the standard URL class
 * instead of connecting explicitly to the HTTP server.
 */

public class UrlRetriever2 {
 public static void main(String[] args) {
  checkUsage(args);
  try {
   URL url = new URL(args[0]);
   BufferedReader in = new BufferedReader(
    new InputStreamReader(url.openStream()));
   String line;
   while ((line = in.readLine()) != null) {
    System.out.println("> " + line);
   }
   in.close();
  } catch(MalformedURLException mue) { // URL constructor
   System.out.println(args[0] + " is an invalid URL: " + mue);
  } catch(IOException ioe) { // Stream constructors
   System.out.println("IOException: " + ioe);
  }
 }

 private static void checkUsage(String[] args) {
  if (args.length != 1) {
   System.out.println("Usage: UrlRetriever2 <URL>");
   System.exit(-1);
  }
 }
}

Here is the UrlRetriever2 in action:

Prompt> java UrlRetriever2 http://www.whitehouse.gov/
> <HTML>
> <HEAD>
> <TITLE>Welcome To The White House</TITLE>
> </HEAD>
> ... Remainder of HTML document omitted ...
> </HTML>

This implementation prints only the resultant document, not the HTTP response lines included in the output of the original "raw" UrlRetriever class. However, another Java class called URLConnection will supply this information. Create a URLConnection object by calling the openConnection method of an existing URL, then use methods such as getContentType and getLastModified to retrieve the response header information. See the on-line API for java.net.URLConnection for more details.
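The openConnection approach just described can be sketched as follows. This is a minimal illustration, not code from the book: the class name HeaderPeek and the method describe are hypothetical, and the formatting of the summary string is an arbitrary choice.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.util.Date;

/** Hypothetical helper (not from the book): summarize header
 *  information obtained through a URLConnection.
 */
class HeaderPeek {

  /** Returns the content type and last-modified time as one line. */
  static String describe(URL url) throws IOException {
    URLConnection connection = url.openConnection();
    // Asking for a header field connects implicitly if necessary.
    String type = connection.getContentType();    // null if not known
    long modified = connection.getLastModified(); // 0 if not known
    String when =
      (modified == 0) ? "unknown" : new Date(modified).toString();
    return "Content-Type: " + type + ", Last-Modified: " + when;
  }

  // new URL(String) is deprecated on recent JDKs but still available.
  @SuppressWarnings("deprecation")
  public static void main(String[] args) throws IOException {
    if (args.length != 1) {
      System.out.println("Usage: HeaderPeek <URL>");
      return;
    }
    System.out.println(describe(new URL(args[0])));
  }
}
```

Both values depend on what the server chooses to send, so the sketch treats a missing last-modified time as "unknown" rather than printing the raw 0.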

Other Useful Methods of the URL Class

The most valuable use of a URL object is to use the constructor to parse a string representation and then to use openStream to provide an InputStream for reading. However, the class is useful in a number of other ways, as outlined in the following sections.

public URL(String absoluteSpec)
public URL(URL base, String relativeSpec)
public URL(String protocol, String host, String file)
public URL(String protocol, String host, int port, String file)

These four constructors build a URL in different ways. All four can throw a MalformedURLException, for example when the string has no protocol or names an unknown one.
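As a rough sketch of how the relative and component-based constructors behave, consider the following. The class name UrlConstructorsDemo, its helper methods, and the www.example.com addresses are all illustrative assumptions, not part of the book's code.

```java
import java.net.MalformedURLException;
import java.net.URL;

/** Hypothetical demo (not from the book) of the URL constructors. */
class UrlConstructorsDemo {

  /** Resolves relativeSpec against the URL given by baseSpec. */
  @SuppressWarnings("deprecation") // constructors are deprecated on recent JDKs
  static URL resolve(String baseSpec, String relativeSpec)
      throws MalformedURLException {
    URL base = new URL(baseSpec);
    return new URL(base, relativeSpec);
  }

  /** Builds a URL from its separate components. */
  @SuppressWarnings("deprecation")
  static URL fromParts(String protocol, String host, int port, String file)
      throws MalformedURLException {
    return new URL(protocol, host, port, file);
  }

  @SuppressWarnings("deprecation")
  public static void main(String[] args) throws MalformedURLException {
    // The relative name replaces the last segment of the base file.
    System.out.println(
      resolve("http://www.example.com/reports/index.html", "earnings.html"));
    System.out.println(
      fromParts("http", "www.example.com", 8080, "/reports/earnings.html"));
    try {
      new URL("not a url"); // no protocol, so the constructor rejects it
    } catch (MalformedURLException mue) {
      System.out.println("Rejected: " + mue.getMessage());
    }
  }
}
```

With these inputs, the first call yields http://www.example.com/reports/earnings.html and the second yields http://www.example.com:8080/reports/earnings.html.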

public String getFile()

This method returns the filename (URI) part of the URL. See the output following Listing 17.12.

public String getHost()

This method returns the hostname part of the URL. See the output following Listing 17.12.

public int getPort()

This method returns the port if one was explicitly specified. If not, it returns –1 (not 80). See the output following Listing 17.12.
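Because getPort reports -1 when no port appears in the URL, code that needs a usable port number has to fall back to the protocol's default itself. One way to do so, sketched here with a hypothetical PortDemo class and effectivePort method, is to consult getDefaultPort, which java.net.URL also provides.

```java
import java.net.MalformedURLException;
import java.net.URL;

/** Hypothetical helper (not from the book): pick a usable port. */
class PortDemo {

  /** Returns the explicit port if present, else the protocol default. */
  static int effectivePort(URL url) {
    int port = url.getPort(); // -1 when the URL names no port
    return (port != -1) ? port : url.getDefaultPort();
  }

  @SuppressWarnings("deprecation") // new URL(String) is deprecated on recent JDKs
  public static void main(String[] args) throws MalformedURLException {
    String[] specs = (args.length > 0) ? args : new String[] {
      "http://www.example.com/",      // illustrative addresses only
      "http://www.example.com:8080/"
    };
    for (String spec : specs) {
      URL url = new URL(spec);
      System.out.println(spec + " getPort=" + url.getPort()
                         + " effective=" + effectivePort(url));
    }
  }
}
```

For the two sample addresses, getPort gives -1 and 8080, while effectivePort gives 80 and 8080.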

public String getProtocol()

This method returns the protocol part of the URL (e.g., http). See the output following Listing 17.12.

public String getRef()

The getRef method returns the "reference" (i.e., section heading) part of the URL. See the output following Listing 17.12.

public final InputStream openStream()

This method returns the input stream that can be used for reading, as used in the UrlRetriever2 class. The method can also throw an IOException.
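Since openStream hands back an ordinary InputStream, the caller is responsible for closing it. The following sketch collects the lines of a document and uses a try-with-resources block so the stream is closed even if reading fails partway. The class UrlLines and the method readLines are hypothetical names, and decoding the content as UTF-8 is an assumption that will not hold for every document.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not from the book): read a URL line by line. */
class UrlLines {

  /** Returns every line of the document, assuming UTF-8 text. */
  static List<String> readLines(URL url) throws IOException {
    List<String> lines = new ArrayList<>();
    // try-with-resources closes the reader, and with it the stream.
    try (BufferedReader in = new BufferedReader(
           new InputStreamReader(url.openStream(),
                                 StandardCharsets.UTF_8))) {
      String line;
      while ((line = in.readLine()) != null) {
        lines.add(line);
      }
    }
    return lines;
  }

  @SuppressWarnings("deprecation") // new URL(String) is deprecated on recent JDKs
  public static void main(String[] args) throws IOException {
    if (args.length != 1) {
      System.out.println("Usage: UrlLines <URL>");
      return;
    }
    for (String line : readLines(new URL(args[0]))) {
      System.out.println("> " + line);
    }
  }
}
```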

public URLConnection openConnection()

This method yields a URLConnection that can be used to retrieve header lines and (for POST requests) to supply data to the HTTP server. The POST method is discussed in Chapter 19 (Server-Side Java: Servlets).

public String toExternalForm()

This method gives the string representation of the URL, useful for printouts. This method is identical to toString.

Listing 17.12 gives an example of some of these methods.

Listing 17.12 UrlTest.java

import java.net.*;

/** Read a URL from the command line, then print
 * the various components.
 */

public class UrlTest {
 public static void main(String[] args) {
  if (args.length == 1) {
   try {
    URL url = new URL(args[0]);
    System.out.println
     ("URL: " + url.toExternalForm() + "\n" +
      " File:   " + url.getFile() + "\n" +
      " Host:   " + url.getHost() + "\n" +
      " Port:   " + url.getPort() + "\n" +
      " Protocol: " + url.getProtocol() + "\n" +
      " Reference: " + url.getRef());
   } catch(MalformedURLException mue) {
    System.out.println("Bad URL.");
   }
  } else
   System.out.println("Usage: UrlTest <URL>");
 }
}

Here's UrlTest in action:

> java UrlTest http://www.irs.gov/mission/#squeeze-them-dry
URL: http://www.irs.gov/mission/#squeeze-them-dry
 File:   /mission/
 Host:   www.irs.gov
 Port:   -1
 Protocol: http
 Reference: squeeze-them-dry