I was just writing some HTTP-related code using java.net.URL when I noticed that Apache httpclient 4.0's API seems to want java.net.URI instances. "Why's that, I wonder?" The answer, it seems, is that Java's java.net.URL class is broken: its equals() method is blocking! It goes out on the network and does a DNS lookup of the hostname. This is very unfortunate, since in every other way that class is what I want.
From the Javadoc for java.net.URL#equals:
Two hosts are considered equivalent if both host names can be resolved into the same IP addresses; else if either host name can't be resolved, the host names must be equal without regard to case; or both host names equal to null. Since hosts comparison requires name resolution, this operation is a blocking operation. (Emphasis mine)
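For contrast, here is a minimal sketch of how java.net.URI compares values. As far as I can tell from its Javadoc, URI#equals is purely syntactic (scheme and host compared ignoring case, path compared exactly), so no name resolution happens. The class and method names below are my own, made up for illustration.

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

final class UriEqualityDemo {
    // Hypothetical helper: compares two URI strings with URI.equals(),
    // which looks only at the parsed components and never hits the network.
    static boolean sameUri(String a, String b) {
        return URI.create(a).equals(URI.create(b));
    }

    // Hypothetical helper: counts how many distinct URIs a HashSet ends up holding.
    static int distinctCount(String... texts) {
        Set<URI> seen = new HashSet<>();
        for (String text : texts) {
            seen.add(URI.create(text));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // Scheme and host are compared without regard to case.
        System.out.println(sameUri("http://EXAMPLE.com/a", "http://example.com/a")); // true
        // The path is compared exactly, so case matters here.
        System.out.println(sameUri("http://example.com/a", "http://example.com/A")); // false
        // hashCode() is consistent with equals(), so URIs are safe as set members or map keys.
        System.out.println(distinctCount("http://example.com/a", "HTTP://EXAMPLE.COM/a")); // 1
    }
}
```

The flip side is that two different hostnames pointing at the same machine are simply different URIs, which is what I'd want for cache keys anyway.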
Good times. So, to avoid arbitrary thread "hanging" at some point down the road, I guess I'll use java.net.URI. Too bad these are all valid URIs, yet nonsense in an HTTP context: "mailto:email@example.com", "abc:123", "quux".
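One way to cope is a small guard in front of the HTTP code. This is only a sketch under my own assumptions: the helper name isHttpUri is made up, and "usable for HTTP" is taken to mean "scheme is http or https and a host is present", which may be stricter or looser than what you need.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class HttpUris {
    // Hypothetical helper: true only for URIs with an http/https scheme and a host.
    static boolean isHttpUri(String text) {
        final URI uri;
        try {
            uri = new URI(text);
        } catch (URISyntaxException e) {
            return false; // not even a syntactically valid URI
        }
        String scheme = uri.getScheme();
        if (scheme == null) {
            return false; // relative reference such as "quux"
        }
        boolean httpScheme = scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https");
        // Opaque URIs like "mailto:..." have no host component at all.
        return httpScheme && uri.getHost() != null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "mailto:email@example.com", "abc:123", "quux", "http://www.google.com"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isHttpUri(sample));
        }
    }
}
```

Of the four samples, only "http://www.google.com" passes the check.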
This raises the question: what precisely is the difference between a URI and a URL? Tons has been written on this (Google it), but I'll add my semi-informed $0.02 as well:
URI: an identifier (name) for a resource. It doesn't necessarily say anything about how to locate the identified resource, but sometimes does. e.g. "/foo", "http://test.com/bar", "x:y:z/a/b/c"
URL: a URI that MUST include how to locate the resource. i.e. it starts with "http", "https", "ftp", etc. e.g. "http://www.google.com", "https://bank.com", "http://abc.com/foo/bar/baz.html"
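To make the two definitions a bit more concrete, here is a rough sketch of what java.net.URI reports for the examples above. The describe helper is my own invention; it just prints the scheme, the host, and whether the URI is "opaque" (java.net.URI's term for an absolute URI whose scheme-specific part does not start with a slash).

```java
import java.net.URI;

final class UriParts {
    // Hypothetical helper: summarizes the components java.net.URI parsed out.
    static String describe(String text) {
        URI uri = URI.create(text);
        return "scheme=" + uri.getScheme()
                + ", host=" + uri.getHost()
                + ", opaque=" + uri.isOpaque();
    }

    public static void main(String[] args) {
        // A relative reference: no scheme, so nothing says how to fetch it.
        System.out.println(describe("/foo"));
        // A locator in the sense above: scheme plus host.
        System.out.println(describe("http://test.com/bar"));
        // Has a scheme, but no host to go looking on.
        System.out.println(describe("x:y:z/a/b/c"));
    }
}
```

Only the second example carries both a scheme and a host, which is roughly what my "URL" definition boils down to in practice.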
So, URI is very general, and URLs are a specialization of URIs. There is another subset of URI called URN that adds even more complexity, so I'm going to mostly ignore that here. I'll just paraphrase from the SO link below and say that URNs are supposed to be a unique name (over time and space) for a resource, and they say nothing about locating said resource.