Using HTTP from Within a Java Program
Before leaving the world of HTTP to look at the basic principles of servlets and JSPs, you might find it useful to know how to retrieve pages from a Web server using a Java client. You can use this technique to pull information from an existing Web site.
A simple example would be a Java program that needs a current exchange rate for a particular currency. The Java program could connect to a Web site that is known to display currency exchange rates (many news and financial sites display this information). The resultant response body can be examined and the exchange rate extracted from the other data (including the HTML formatting) on the Web page.
The difficulty in writing a data-extraction program such as this currency exchange rate example lies in locating the required information among the many HTML elements and other data on the page.
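As a rough illustration of that extraction step, the sketch below pulls a rate out of a fragment of HTML with a regular expression. The markup, the currency values, and the names RateExtractor and findRate are all made up for the example; a real page will be laid out differently and the pattern would have to be adapted to it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RateExtractor {
    // Assumption: the page shows each rate in a table cell that directly
    // follows a cell holding the three-letter currency code.
    private static final Pattern RATE_CELL = Pattern.compile(
        "<td>\\s*([A-Z]{3})\\s*</td>\\s*<td>\\s*([0-9]+(?:\\.[0-9]+)?)\\s*</td>");

    /** Returns the rate text for the given currency code, or null if absent. */
    public static String findRate(String html, String currency) {
        Matcher m = RATE_CELL.matcher(html);
        while (m.find()) {
            if (m.group(1).equals(currency)) {
                return m.group(2);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical response body; the figures are illustrative only.
        String html = "<html><body><table>"
            + "<tr><td>EUR</td><td>0.8912</td></tr>"
            + "<tr><td>GBP</td><td> 0.6250 </td></tr>"
            + "</table></body></html>";
        System.out.println("GBP rate: " + findRate(html, "GBP")); // prints GBP rate: 0.6250
    }
}
```

A pattern like this is brittle: any change to the page layout can break it, which is why the surrounding markup usually dominates the effort.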
NOTE
With the trend toward writing Web pages as well-formed XML documents (XHTML, or XML itself), you can use the SAX and DOM parsers provided by JAXP, which is included in Java 1.4 and later, to simplify extracting page data from the markup.
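As a minimal sketch of the approach described in this note, the following program parses a small well-formed page with the JAXP DOM parser and reads the text of one element. The page content, the id value "rate", and the names XhtmlRateReader and textOfSpan are assumptions for the example. The sample has no DOCTYPE declaration; a real XHTML page usually has one, and the parser may then try to fetch the DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XhtmlRateReader {
    /** Returns the trimmed text of the span with the given id, or null if absent. */
    public static String textOfSpan(String xhtml, String id) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
            NodeList spans = doc.getElementsByTagName("span");
            for (int i = 0; i < spans.getLength(); i++) {
                Element span = (Element) spans.item(i);
                if (id.equals(span.getAttribute("id"))) {
                    return span.getTextContent().trim();
                }
            }
            return null;
        } catch (Exception ex) {
            // Parsing fails if the page is not well-formed XML.
            throw new IllegalArgumentException("Could not parse page as XML", ex);
        }
    }

    public static void main(String[] args) {
        // Hypothetical well-formed page; the figure is illustrative only.
        String page = "<html><body><p>USD/EUR: "
            + "<span id=\"rate\">0.8912</span></p></body></html>";
        System.out.println(textOfSpan(page, "rate")); // prints 0.8912
    }
}
```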
Handling HTTP is very simple, as the java.net.URL class encapsulates the functionality you need. Listing 3.1 shows a simple program that uses GET to request a page from the URL given as the command-line parameter and display it on System.out.
Listing 3.1 URLGet.java
import java.io.*;
import java.net.*;

public class URLGet {
    public static void main(String[] args) {
        BufferedReader in=null;
        if (args.length == 1) {
            try {
                URL url = new URL(args[0]);
                in = new BufferedReader(
                    new InputStreamReader(url.openStream()));
                String line=null;
                while ((line=in.readLine()) != null)
                    System.out.println(line);
            }
            catch (MalformedURLException ex) {
                System.err.println(ex);
            }
            catch (FileNotFoundException ex) {
                System.err.println("Failed to open stream to URL: "+ex);
            }
            catch (IOException ex) {
                System.err.println("Error reading URL content: "+ex);
            }
            if (in != null)
                try {in.close();} catch (IOException ex) {}
        }
        else
            System.err.println ("Usage: URLGet URL");
    }
}
The URL object created from the first command-line argument identifies the required page. Calling its openStream() method sends the HTTP request to the server and makes the response body available as an InputStream.
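To try openStream() without depending on an external Web site, the sketch below starts a throwaway HTTP server on the local machine and fetches a page from it. It assumes Java 8 or later and the com.sun.net.httpserver package that ships with the JDK; the class name OpenStreamDemo, the /hello path, and the page content are invented for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class OpenStreamDemo {
    /** Reads the whole response body for the URL, one line at a time. */
    public static String fetch(URL url) throws IOException {
        StringBuilder body = new StringBuilder();
        // openStream() is shorthand for openConnection().getInputStream();
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
        }
        return body.toString();
    }

    /** Serves one hypothetical page on a free local port and fetches it. */
    public static String fetchFromLocalServer() {
        try {
            HttpServer server =
                HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/hello", exchange -> {
                byte[] page = "<html><body>Hello</body></html>"
                    .getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, page.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(page);
                }
            });
            server.start();
            try {
                int port = server.getAddress().getPort();
                return fetch(new URL("http://127.0.0.1:" + port + "/hello"));
            } finally {
                server.stop(0);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.print(fetchFromLocalServer());
    }
}
```

The fetch() method works with any URL that the program can reach; the local server is only there so that the example has something to request.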
As an example of a more sophisticated request, the program in Listing 3.2 will accept a command-line URL and an optional list of HTML form parameters. If any form parameters are specified, a POST request is issued; otherwise, a GET request is used.
Listing 3.2 URLRequest.java
import java.io.*;
import java.net.*;
import java.util.*;

public class URLRequest {
    public static void main(String[] args) {
        BufferedReader in = null;
        if (args.length>0) {
            try {
                URL url = new URL(args[0]);
                URLConnection connection = url.openConnection();
                connection.setRequestProperty(
                    "User-Agent","Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
                if (args.length > 1) {
                    connection.setDoOutput(true);
                    Writer post = new OutputStreamWriter(
                        connection.getOutputStream());
                    for (int i=1; i<args.length; i++) {
                        if (i > 1)
                            post.write('&');
                        post.write(encodeParameter(args[i]));
                    }
                    post.write("\r\n");
                    post.close();
                }
                connection.connect();
                Map headers = connection.getHeaderFields();
                Iterator it = headers.keySet().iterator();
                while (it.hasNext()) {
                    String key = (String)it.next();
                    System.out.println(key+": "+headers.get(key));
                }
                System.out.println();
                in = new BufferedReader(new InputStreamReader(
                    connection.getInputStream()));
                String line=null;
                while ((line=in.readLine()) != null)
                    System.out.println(line);
            }
            catch (MalformedURLException ex) {
                System.err.println(ex);
            }
            catch (FileNotFoundException ex) {
                System.err.println("Failed to open stream to URL: "+ex);
            }
            catch (IOException ex) {
                System.err.println("Error reading URL content: "+ex);
            }
            if (in != null)
                try {in.close();} catch (IOException ex) {}
        }
        else {
            System.err.println ("Usage: URLRequest URL (uses GET)");
            System.err.println (
                "       URLRequest URL parameters... (uses POST)");
        }
    }

    private static String encodeParameter(String parameter) {
        StringBuffer result = new StringBuffer();
        try {
            String name = null;
            String value = "";
            int ix = parameter.indexOf('=');
            if (ix == -1)
                name = parameter;
            else {
                name = parameter.substring(0,ix);
                value = parameter.substring(ix+1);
            }
            result.append(name);
            result.append('=');
            result.append(URLEncoder.encode(value,"UTF-8"));
        }
        catch (UnsupportedEncodingException ex) {
            System.err.println(ex);
        }
        return result.toString();
    }
}
Listing 3.2 shows most of the salient features of the java.net.URL and java.net.URLConnection classes that are used to access a Web server. The encodeParameter() method encodes the value of each request parameter in the application/x-www-form-urlencoded format used for HTML form data; the java.net.URLEncoder and java.net.URLDecoder classes support this encoding scheme.
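To see what this encoding does to a parameter value, the short sketch below encodes a string containing characters that have special meaning in form data and then decodes it again. The class name FormEncodingDemo and the sample value are made up for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class FormEncodingDemo {
    /** Encodes a form parameter value using UTF-8. */
    public static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(ex);
        }
    }

    /** Reverses encode(). */
    public static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        String value = "fish & chips=50%";
        String encoded = encode(value);
        System.out.println(encoded);         // prints fish+%26+chips%3D50%25
        System.out.println(decode(encoded)); // prints fish & chips=50%
    }
}
```

Spaces become plus signs, and the characters &, =, and % are replaced with percent escapes so that they cannot be confused with the separators between parameters.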
Listing 3.2 also shows how to define header fields in the request by setting the User-Agent field to masquerade as Internet Explorer Version 5.01. The header fields in the response are displayed before the response body.
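The following sketch isolates the two header operations: setting a request header before the connection is made, and reading one response header by name afterward. It runs against a throwaway local server so that it does not depend on an external site, and it assumes Java 8 or later with the JDK's com.sun.net.httpserver package. The class name HeaderDemo, the X-Seen-User-Agent response header, and the ExampleClient/1.0 agent string are all invented for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class HeaderDemo {
    /**
     * Sends a GET request with the given User-Agent to a local test server
     * and returns the response header in which the server echoes it back.
     */
    public static String roundTrip(String userAgent) {
        try {
            HttpServer server =
                HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                // Copy the request's User-Agent into a made-up response header.
                String seen = exchange.getRequestHeaders().getFirst("User-Agent");
                exchange.getResponseHeaders().set("X-Seen-User-Agent", seen);
                byte[] page = "ok".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, page.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(page);
                }
            });
            server.start();
            try {
                int port = server.getAddress().getPort();
                URL url = new URL("http://127.0.0.1:" + port + "/");
                URLConnection connection = url.openConnection();
                // Request headers must be set before the connection is made.
                connection.setRequestProperty("User-Agent", userAgent);
                // Asking for a response header sends the request if needed.
                String echoed = connection.getHeaderField("X-Seen-User-Agent");
                connection.getInputStream().close();
                return echoed;
            } finally {
                server.stop(0);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("ExampleClient/1.0")); // prints ExampleClient/1.0
    }
}
```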