[Python-talk] Can python program read index.jsp web page?

Kent Johnson kent37 at tds.net
Fri Aug 10 07:55:58 EDT 2007


Alex Hewitt wrote:
> On Fri, 2007-08-10 at 00:24 -0400, Kent Johnson wrote:
>>> Can a Python program read these pages? I tried
>>> accessing them using urllib but no joy. 
>> Should be able to. It's just text on the wire after all. What did you 
>> try? What happened?

Still wondering...

>> What happens when you go to the page in the browser? Is there any kind 
>> of authentication? What headers & status do you get from
>> curl -i http://my.server.com/index.jsp
>> ?
> 
> I don't see anything active in the browser, no popup or anything like
> that but get the impression that somehow I've been logged in silently.
> If I can capture that traffic, assuming there is a handshake going on, I
> might be able to write Python code to mimic what's going on.

Possibly the browser has an authentication cookie that is allowing you 
to bypass some kind of login. If you are on a Windows client there may 
be other magic methods to authenticate, I'm not sure.
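
If it does turn out to be a cookie, here is a rough sketch of two ways to
deal with it from Python. It assumes Python 3's urllib.request and a
session cookie named JSESSIONID, which is the usual name in Java servlet
containers but only a guess for this app. The helper names and the cookie
value are made up for illustration.

```python
import http.cookiejar
import urllib.request


def request_with_cookie(url, name, value):
    """Build a request that carries a cookie copied by hand from the browser."""
    req = urllib.request.Request(url)
    req.add_header("Cookie", "%s=%s" % (name, value))
    return req


def make_cookie_opener():
    """Return (opener, jar); the opener stores and resends cookies like a browser."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


# Hypothetical usage; the cookie value is a placeholder, not a real session id:
#   req = request_with_cookie("http://my.server.com/index.jsp", "JSESSIONID", "abc123")
#   page = urllib.request.urlopen(req).read()
```

The first helper is for replaying a session the browser already has. The
second is for the case where the Python program performs the login itself
and needs to keep whatever cookie the server hands back.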

It would be very useful to see the HTTP status and headers coming back 
from the server on your request. Presumably they include either an 
authentication request or a redirect to a login page. Did you try 
looking at them in curl?
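
If curl is not handy, roughly the same information can be pulled out with
Python itself. This is a minimal sketch assuming Python 3's urllib.request.
The function names are invented for the example, and diagnose() is only a
crude guess at what the server wants, not a complete classification.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Keep urllib from following redirects so the 3xx reply stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def lower_keys(items):
    """Header names are case-insensitive, so normalise them to lower case."""
    return {name.lower(): value for name, value in items}


def fetch_status_and_headers(url):
    """Return (status, headers) for url without following redirects."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=10) as response:
            return response.status, lower_keys(response.headers.items())
    except urllib.error.HTTPError as err:
        # 3xx, 4xx and 5xx replies land here and still carry headers.
        return err.code, lower_keys(err.headers.items())


def diagnose(status, headers):
    """Guess what the server is asking for; expects lower-case header names."""
    if status == 401:
        return "auth challenge: " + headers.get("www-authenticate", "(no scheme given)")
    if status in (301, 302, 303, 307, 308):
        return "redirect to " + headers.get("location", "(no Location header)")
    if "set-cookie" in headers:
        return "server set a cookie (possibly a session)"
    return "no authentication hint in status or headers"


# Hypothetical usage against the URL from the curl example:
#   status, headers = fetch_status_and_headers("http://my.server.com/index.jsp")
#   print(status, diagnose(status, headers))
```

Redirects are deliberately not followed here, because a silent redirect to
a login page is exactly the kind of thing that is being looked for.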

>>> My motive in
>>> doing this is to use a Python program to exercise some of the
>>> application functions but I can't do that if I can't read the pages in
>>> the first place.

Not necessarily true. For example, you can send a POST or GET request to 
the app with parameters that make it do something useful without 
having read the page containing the form or link that a user would use 
in a browser.
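
A sketch of what that could look like, assuming Python 3's urllib. The
parameter names "action" and "q" are invented for the example; the real
ones have to come from the form's HTML or from whoever knows the server
app. The helper also assumes the URL has no query string of its own.

```python
import urllib.parse
import urllib.request


def build_request(url, params, method="POST"):
    """Build a GET or POST request carrying form-style parameters."""
    query = urllib.parse.urlencode(params)
    if method == "GET":
        # GET parameters travel in the URL itself.
        return urllib.request.Request(url + "?" + query, method="GET")
    # POST parameters travel in the body, encoded the way an HTML form does it.
    return urllib.request.Request(url, data=query.encode("ascii"), method="POST")


# Hypothetical usage; nothing is sent until urlopen() is called:
#   req = build_request("http://my.server.com/index.jsp",
#                       {"action": "search", "q": "python"})
#   reply = urllib.request.urlopen(req).read()
```

Building the request separately from sending it makes it easy to print
req.full_url and req.data and compare them with what the browser sends.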

Can you get any cooperation from someone who knows the server app? Because 
what you really want to know is, what is the API the server exposes to 
the client (in the form of URLs).

BTW if the web app makes heavy use of JavaScript that could be another 
deal-killer. But if it is plain HTML you should be able to do what you want.

Kent