URLReader

Copyright (c) 2002 John Henckel, john@formulus.DELETETHIS.com

Permission to use, copy, modify, distribute and sell this software and its documentation for any purpose is hereby granted without fee, provided that the above copyright notice appear in all copies. All programs contained herein are provided to you "as is". The implied warranties of merchantability and fitness for a particular purpose are expressly disclaimed.

source code

URLReader

This is a Java class that makes it easier to read data from resources on the web.

Quick and dirty way to use URLReader

The simplest way to use URLReader is with one of its static methods. You just call the method and get the data from the resource. If the resource is text (like HTML), use one of the string methods; otherwise use the bytes method.

That's it! That's all you need to know. If you want to do something fancier, like passing cookies, setting the referer, or using the POST method, then you need to keep reading.
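
For comparison, here is roughly the work that such a static helper saves, written with plain JDK classes. This is only a sketch and not URLReader's source: the names FetchSketch, readBytes, readString, and roundTrip are hypothetical, and UTF-8 is assumed as the text encoding. The demo reads a temporary file through a file: URL so that it runs without a network.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper class; not part of URLReader.
class FetchSketch {
    /** Reads the whole resource into a byte array (the "bytes" flavour). */
    static byte[] readBytes(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    /** Reads the resource as text (the "string" flavour); UTF-8 is an assumption. */
    static String readString(URL url) throws IOException {
        return new String(readBytes(url), StandardCharsets.UTF_8);
    }

    /** Network-free demo: writes text to a temp file and reads it back via a file: URL. */
    static String roundTrip(String text) {
        try {
            File f = File.createTempFile("fetch", ".txt");
            f.deleteOnExit();
            Files.write(f.toPath(), text.getBytes(StandardCharsets.UTF_8));
            return readString(f.toURI().toURL());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("<html>hello</html>"));
    }
}
```

The same readString call works with an http: URL, but it gives you no control over headers, cookies, or the request method.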

The more complicated way to use a URLReader object

A URLReader object can be in one of these states:

  1. zombie - doesn't have a valid URL
  2. ready - has a URL, but isn't connected yet
  3. open - the data can be read from the resource
  4. done - end of data, or any kind of error

These are the steps to create and use a URLReader object.

  1. create the object
  2. call as many "set" methods as you like, such as setRequestProperty, setIfModifiedSince, setPost, setURL, or setRequestMethod.
  3. now the URLReader object is "ready" to be connected
  4. call the connect() method and check the status code (this step is optional)
  5. call any of the "get" methods, such as getResponse-, getHeader-, getContent-, getBufferedReader, or getInputStream.
  6. after you are done getting the resource, you can throw it away, or you can call disconnect. After you call disconnect the URLReader object is once again in "ready" state.
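
URLReader's own method signatures are not listed on this page, so the sketch below uses the JDK's HttpURLConnection instead, whose "set" methods share several of the names in step 2. It only illustrates the "configure first, connect later" order of steps 1 to 3. The class name ConfigureFirst, the method prepare, and the example.com addresses are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical demo class; shows the JDK equivalent of steps 1-3 only.
class ConfigureFirst {
    /** Creates and configures a connection without sending anything yet. */
    static HttpURLConnection prepare(String address) {
        try {
            // Step 1: create the object. openConnection() performs no network I/O.
            HttpURLConnection conn =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
            // Step 2: every setting has to be made before the connection is opened.
            conn.setRequestProperty("Referer", "http://example.com/start");
            conn.setIfModifiedSince(0L);
            conn.setRequestMethod("POST");
            conn.setDoOutput(true); // required if a request body will be written
            // Step 3: configured, but not yet connected.
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection conn = prepare("http://example.com/form");
        System.out.println(conn.getRequestMethod());
        System.out.println(conn.getRequestProperty("Referer"));
        // Steps 4 and 5 would be conn.connect(), conn.getResponseCode() and
        // conn.getInputStream(); they need a reachable server and are omitted.
    }
}
```

Unlike a URLReader, a plain HttpURLConnection cannot go back to the ready state after step 6; you have to create a new one.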

After disconnect, you can re-connect and read the data again (and again...), or you can change the URL and read different data. Just remember that everything you set, like postData or usesCaches, will stay that way unless you change it! If you try to read data from a zombie, then "null" is returned.

To open the URLReader, call the "connect" method. Any of the getResponse, etc. methods will also connect automatically if the URLReader is ready. The "ready()" method returns true if the object is not connected but is ready to connect. After a connect, the disconnect() method restores the ready state.
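
The four states and the transitions described above can be modelled as a tiny state machine. This is a hypothetical model, not URLReader's source: the class ReaderLifecycle and its read() stand-in for the "get" methods are invented here, and two behaviours are assumptions of the sketch (reading in the done state returns null, and setting a URL always resets the state to ready).

```java
// Hypothetical model of the zombie/ready/open/done lifecycle; no networking involved.
class ReaderLifecycle {
    enum State { ZOMBIE, READY, OPEN, DONE }

    private State state = State.ZOMBIE; // a new object has no URL yet
    private String url;

    /** A usable URL makes the object ready; null or empty makes it a zombie. */
    void setURL(String url) {
        this.url = url;
        state = (url == null || url.isEmpty()) ? State.ZOMBIE : State.READY;
    }

    /** True only when a URL is set and no connection is open. */
    boolean ready() {
        return state == State.READY;
    }

    /** Opens the connection if ready; returns whether the object is now open. */
    boolean connect() {
        if (state == State.READY) {
            state = State.OPEN;
        }
        return state == State.OPEN;
    }

    /** Stand-in for the "get" methods: null for a zombie, auto-connect when ready. */
    String read() {
        if (state == State.ZOMBIE) {
            return null;
        }
        if (state == State.READY) {
            connect();
        }
        if (state != State.OPEN) {
            return null; // assumption: nothing more to read once done
        }
        state = State.DONE; // pretend the whole resource was consumed
        return "data from " + url;
    }

    /** Returns an open or done object to the ready state. */
    void disconnect() {
        if (state == State.OPEN || state == State.DONE) {
            state = State.READY;
        }
    }

    State state() {
        return state;
    }

    public static void main(String[] args) {
        ReaderLifecycle r = new ReaderLifecycle();
        System.out.println(r.read());   // null, because it is still a zombie
        r.setURL("http://example.com/");
        System.out.println(r.ready());  // true
        System.out.println(r.read());   // data from http://example.com/
        System.out.println(r.state());  // DONE
        r.disconnect();
        System.out.println(r.ready());  // true
    }
}
```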

You might ask yourself, "Why do we need this stuff? Aren't the URL and the HttpURLConnection classes good enough?" Well, yes, they are almost good enough. Sun's URLConnection classes have three major flaws that are fixed by this class.

  1. URLConnection cannot retry. You get one chance and if 400 or 504 is returned, then TOO BAD. If you want to try again you have to throw away the URLConnection and start over with a new one.
  2. URLConnection does not maintain cookies. This is especially a problem if followRedirects is enabled, because any cookies that were set during the redirect will be lost.
  3. The URLConnection.getHeaderField(String) method does not return multiple values. If a header appears more than once, only the last one is returned. URLReader fixes this problem.
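
To make flaws 2 and 3 concrete, the sketch below collects every value of a repeated header and turns several Set-Cookie values into one Cookie request header. It is not URLReader's code; CookieSketch, allValues, and cookieHeader are hypothetical names. The input map has the shape returned by the JDK's URLConnection.getHeaderFields() (available since J2SE 1.4), but the demo builds it by hand so no server is needed. Cookie attributes such as Path or expiry are simply dropped, which a real cookie store would not do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; a simplification, not a full cookie implementation.
class CookieSketch {
    /** Collects every value of a header, matching the name case-insensitively. */
    static List<String> allValues(Map<String, List<String>> headers, String name) {
        List<String> out = new ArrayList<String>();
        for (Map.Entry<String, List<String>> e : headers.entrySet()) {
            // The JDK map uses a null key for the status line, so guard against it.
            if (e.getKey() != null && e.getKey().equalsIgnoreCase(name)) {
                out.addAll(e.getValue());
            }
        }
        return out;
    }

    /** Builds a Cookie header from Set-Cookie values; a later value for the same name wins. */
    static String cookieHeader(List<String> setCookieValues) {
        Map<String, String> jar = new LinkedHashMap<String, String>();
        for (String v : setCookieValues) {
            String pair = v.split(";", 2)[0].trim(); // keep only "name=value"
            int eq = pair.indexOf('=');
            if (eq > 0) {
                jar.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> c : jar.entrySet()) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(c.getKey()).append('=').append(c.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> headers = new LinkedHashMap<String, List<String>>();
        headers.put(null, Arrays.asList("HTTP/1.1 302 Found"));
        headers.put("Set-Cookie", Arrays.asList("sid=abc123; Path=/", "lang=en; HttpOnly"));
        System.out.println(cookieHeader(allValues(headers, "set-cookie")));
        // prints: sid=abc123; lang=en
    }
}
```

With followRedirects enabled, the intermediate responses never reach your code, so a helper like this only works if you follow each redirect yourself and carry the cookies along.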

A note about SSL: this class does not have any special code for SSL. However, it can handle SSL connections just fine. All you need to do is put "https:" in front of the URL. Make sure you have installed JSSE and configured it. The configuration must add an "https" URLStreamHandler implementation to the pkgs list:


  System.setProperty("java.protocol.handler.pkgs",
                     "com.sun.net.ssl.internal.www.protocol");
  
This lets the URLStreamHandlerFactory know how to handle the https protocol. The result of "openConnection" will be an instance of HttpsURLConnection, which is a subclass of HttpURLConnection.
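
A quick way to check that an https handler is in place is to open (but not connect) an https URL and look at the class of the result. The sketch assumes J2SE 1.4 or later, where JSSE is bundled and the class is javax.net.ssl.HttpsURLConnection; with the separately installed JSSE configured as above, the class would come from the com.sun.net.ssl package instead. HttpsCheck and open are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLConnection;
import javax.net.ssl.HttpsURLConnection;

// Hypothetical check; openConnection() creates the object without any network I/O.
class HttpsCheck {
    static URLConnection open(String address) {
        try {
            return URI.create(address).toURL().openConnection();
        } catch (IOException e) {
            // Also covers MalformedURLException, e.g. when no https handler is found.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URLConnection conn = open("https://example.com/");
        System.out.println(conn instanceof HttpsURLConnection);
        System.out.println(conn instanceof HttpURLConnection);
    }
}
```

On a standard JDK with the default handlers, both lines should print true.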

