[Python-talk] New Kent's Korner: urllib2 cookbook
python at venix.com
Tue Mar 25 09:19:44 EDT 2008
On Tue, 2008-03-25 at 08:30 -0400, Kent Johnson wrote:
> Lloyd Kvam wrote:
> > On Tue, 2008-03-25 at 06:23 -0400, Kent Johnson wrote:
> >> Draft notes for the next Kent's Korner presentation are available at
> >> http://personalpages.tds.net/~kent37/kk/00010.html
> >> Comments welcome.
> > Looks good. If you want to check that a web site is functioning, the
> > loss of status information mentioned at the end (Other Resources
> > section) is a killer.
> Maybe I gave the wrong impression. The status of the response that is
> finally returned is available in f.code. That should be plenty for
> checking for a functioning web site. But imagine this scenario:
>   GET /some/old/location
>     Returns 301 - permanent redirect - to /new/location
>   GET /new/location
>     Returns 200
> The two requests occur in a single call to urlopen(). f.code will be
> 200. f.geturl() will be /new/location, so you can detect that a redirect
> took place, but there is no way to distinguish a 301 (permanent) redirect
> from a 302 (temporary) one. The handler in feedparser remembers the
> *original* code as well as the final one.
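
A minimal sketch of a redirect handler that remembers the original status
code, in the spirit of what the quoted message says feedparser does. This is
not feedparser's code. The class name, the function name, and the
original_status attribute are made up for illustration, and the import
fallback assumes that urllib.request is the Python 3 name for urllib2.

```python
try:                                   # Python 2, as discussed in this thread
    import urllib2 as request
except ImportError:                    # Python 3 renamed the module
    import urllib.request as request


class RedirectStatusHandler(request.HTTPRedirectHandler):
    """Follow redirects as usual, but record the first redirect status."""

    def http_error_302(self, req, fp, code, msg, headers):
        # Let the stock handler follow the redirect (this re-enters the
        # opener, so any later redirects are handled inside this call).
        result = request.HTTPRedirectHandler.http_error_302(
            self, req, fp, code, msg, headers)
        if result is not None:
            # 'original_status' is a hypothetical attribute, not part of
            # urllib2. The outermost call assigns last, so with a chain of
            # redirects the value kept is the status of the first one.
            result.original_status = code
        return result

    # The stock handler shares one method for these codes; do the same.
    http_error_301 = http_error_303 = http_error_307 = http_error_302


def open_with_redirect_status(url):
    """Return (original_status, final_status, final_url) for url.

    original_status is None when no redirect took place.
    """
    opener = request.build_opener(RedirectStatusHandler)
    f = opener.open(url)
    try:
        return getattr(f, 'original_status', None), f.code, f.geturl()
    finally:
        f.close()
```

With something like this, the scenario above would report 301 as the original
status, 200 as the final status, and /new/location (as a full URL) as the
final location.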
If I recall correctly, the 200 status gets discarded, so a reference to
f.code will raise an AttributeError. I know I've been using a modified
urllib2 simply to get better status reporting (and the ability to
issue a HEAD request).
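
For reference, a HEAD request can also be issued without patching urllib2 by
overriding Request.get_method(). The sketch below is only an illustration, not
the modified module mentioned above. HeadRequest and check_site are
hypothetical names, and the timeout argument to urlopen() assumes Python 2.6
or later (or Python 3, where the module is urllib.request).

```python
try:                                   # Python 2
    import urllib2 as request
except ImportError:                    # Python 3 renamed the module
    import urllib.request as request


class HeadRequest(request.Request):
    """A Request that is sent as HEAD instead of GET."""

    def get_method(self):
        return 'HEAD'


def check_site(url, timeout=10):
    """Return the HTTP status code of a HEAD request to url.

    Error statuses such as 404 or 500 are returned rather than raised.
    Failures below the HTTP level (DNS, refused connection) still raise
    URLError, so the caller can tell "bad status" from "unreachable".
    """
    try:
        f = request.urlopen(HeadRequest(url), timeout=timeout)
    except request.HTTPError as e:
        # HTTPError carries the status code of the failed response.
        return e.code
    try:
        return f.code
    finally:
        f.close()
```

A site check could then be as simple as comparing check_site(url) with 200.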
The Python 2.5 urllib2.py is pretty different from the customized one I
have in my lib. I'll try to put together a useful list of the changes so
that you can have it for any discussion.