Revision 3 as of 2009-09-20 21:46:17 (comment: converted to 1.6 markup)
Use Case: Access some URLs
Description
I have some URLs, and want to access those documents in my application.
Background
This is the most obvious use of an HTTP client component. It is important for the ease of use of the API.
This use case is about a list of URLs rather than a single URL to highlight the difference between initializations and per-access activities.
Variations
- I want to save the documents to the file system, too.
Related / Out of Scope
- access from multiple threads
- access with session cookie
- access with authentication
- access through proxy
Discussion
There is a trade-off between ease of use and flexibility. The standard Java library provides the required functionality for this use case in the java.net package through the URL and HttpURLConnection classes. These are easy to use, but exhibit major drawbacks once you require flexibility.
The HttpClient project was originally started because HttpURLConnection was not good enough. HttpClient is not as easy to use, and is unlikely ever to be. That said, it should not be unnecessarily hard to use either.
If accessing documents is your only concern, you may be better off using the classes built into Java. But as soon as you have to deal with timeouts, error handling, different cookie specs, connection management, multiple user sessions and similar requirements, HttpClient starts to shine.
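For comparison, the built-in route mentioned above can be sketched with java.net alone. This is a minimal sketch, not a recommended implementation: the class name `BuiltInFetch`, the method names, the 5 second timeouts and the choice to treat any status other than 200 as an error are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named for this example only.
class BuiltInFetch {

    /** Fetches one document and returns its body as raw bytes. */
    static byte[] fetch(String address) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        // Assumed timeouts; nothing on this page prescribes these values.
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);

        int status = conn.getResponseCode();
        if (status != HttpURLConnection.HTTP_OK) {
            throw new IOException("Unexpected status " + status + " for " + address);
        }
        // Closing the stream lets the JDK reuse the underlying connection.
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }

    /** Fetches every document in the list, in order. */
    static List<byte[]> fetchAll(List<String> addresses) throws IOException {
        List<byte[]> documents = new ArrayList<>();
        for (String address : addresses) {
            documents.add(fetch(address));
        }
        return documents;
    }
}
```

Note that there is no explicit initialization step here: every setting has to be repeated on each connection, which is one place where the lack of flexibility shows.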
Solution
Standard Solution
Additional preparations are required to make this solution work for HTTPS.
- create an HttpClient object
- for each URL:
  - create a GetMethod for the URL
  - ask the HttpClient to execute the GetMethod to obtain an HttpResponse
  - get the HttpIncomingEntity type you want from the HttpResponse
  - once you're done with the entity, release the connection
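The shape of these steps (one-time initialization, then per-URL work, then releasing the connection) can be illustrated with a different client: the JDK's own java.net.http.HttpClient, available since Java 11. This is not the HttpClient API this page describes, and the names GetMethod and HttpIncomingEntity have no counterpart in it. The class `ClientLoopSketch` and the callback interface `BodyConsumer` are hypothetical names introduced only for this sketch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

// Hypothetical class illustrating the init / per-access split with the JDK client.
class ClientLoopSketch {

    /** Hypothetical callback that receives each response body as a stream. */
    interface BodyConsumer {
        void accept(String url, InputStream body) throws IOException;
    }

    static void fetchAll(List<String> urls, BodyConsumer consumer)
            throws IOException, InterruptedException {
        // Initialization: the client is created once and shared by all requests.
        HttpClient client = HttpClient.newHttpClient();

        for (String url : urls) {
            // Per-access: build one GET request for this URL.
            HttpRequest get = HttpRequest.newBuilder(URI.create(url)).GET().build();

            // Execute the request; the body is exposed as a stream.
            HttpResponse<InputStream> response =
                    client.send(get, HttpResponse.BodyHandlers.ofInputStream());

            // Closing the body stream plays the role of "release the connection".
            try (InputStream body = response.body()) {
                consumer.accept(url, body);
            }
        }
    }
}
```

The point of the sketch is only the division of labor: anything configured on the client applies to every request, while the request and the response body live for a single access.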
Bare Bones Solution
The bare bones solution will not work for HTTPS.
- create an NIOHttpClientConnection object
- for each URL:
  - create a GetMethod for the URL
  - send the GetMethod over the HttpConnection to obtain an HttpResponse
  - get the HttpIncomingEntity you want from the HttpResponse
  - reset the connection (? - implementation detail, to be decided later)
Variation: Save to File System
In addition to one of the solutions above:
- for each URL:
  - the HttpIncomingEntity you want is a StreamIcEntity
  - get the InputStream from the StreamIcEntity
  - open a FileOutputStream
  - read chunks of data from the input stream into a buffer and write them to the output stream. 4K or 16K are common choices for the buffer size.
  - close the input and output streams
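The copy loop in the last steps uses only java.io and can be sketched as follows. The class name `SaveToFile` and the method `save` are hypothetical, and the InputStream parameter stands in for whatever stream the entity would hand out. The 4K buffer is one of the sizes mentioned above.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper class, named for this example only.
class SaveToFile {

    /**
     * Copies everything from the input stream into the target file
     * and returns the number of bytes written. Both streams are closed
     * when the method returns, even if the copy fails.
     */
    static long save(InputStream in, File target) throws IOException {
        byte[] buffer = new byte[4096]; // 4K buffer; 16K would work the same way
        long total = 0;
        try (InputStream source = in;
             OutputStream out = new FileOutputStream(target)) {
            int n;
            // read() returns -1 once the end of the stream is reached
            while ((n = source.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        }
        return total;
    }
}
```

Writing only the `n` bytes actually read matters: the final chunk is usually shorter than the buffer, and writing the whole buffer would append stale data to the file.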