[Python-talk] Kent's Korner?

Lloyd Kvam python at venix.com
Fri Oct 19 11:03:45 EDT 2007


On Fri, 2007-10-19 at 09:00 -0400, Ric Werme wrote:
> One thing I was thinking of using Beautiful Soup for was in code I need
> for a Periodicals Mail permit to convert people's addresses to the ZIP+4
> code.  See http://zip4.usps.com/zip4/welcome.jsp .  The result is a page
> with a 200 line body (and 700 line head) and changes every so often, I
> think in part to annoy bulk converters (I do a dozen or two a month).
> 
> Whenever they change it, I usually find out a few days before I print
> labels, so it's a scramble to fish out the results from the new style.
> 

I think you'll find BeautifulSoup will make your code more dependent on
the site structure and less dependent on the actual strings that get
used.  That's usually a win, since the overall structure tends to be more
stable than the literals you seem to be relying on in your state machine.
Things like line breaks are relatively unimportant in HTML source, but
the state machine depends on them.
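
To make the line-break point concrete, here is a rough sketch of
structure-based extraction.  It uses the standard library's html.parser
as a stand-in for BeautifulSoup so that it runs with nothing extra
installed; the idea is the same.  The markup, the "zip" class name, and
the ZIP+4 value are all made up for illustration and are not what the
USPS page actually serves.

```python
from html.parser import HTMLParser


class CellText(HTMLParser):
    """Collect the text of every <td> whose class attribute equals `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted   # hypothetical class name, not the real USPS markup
        self.inside = False    # True while we are inside a matching <td>
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td" and dict(attrs).get("class") == self.wanted:
            self.inside = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.cells[-1] += data


def extract(html, wanted="zip"):
    """Return the whitespace-normalized text of each matching cell."""
    parser = CellText(wanted)
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so line breaks inside a cell do not matter.
    return [" ".join(cell.split()) for cell in parser.cells]


# The same (invented) result cell, once on a single line and once reflowed.
one_line = '<table><tr><td class="zip">12345-6789</td></tr></table>'
reflowed = '<table>\n<tr>\n  <td\n class="zip">\n12345-6789\n</td>\n</tr>\n</table>'

assert extract(one_line) == extract(reflowed) == ["12345-6789"]
```

Both inputs give the same answer because the parser keys on the tag and
its attribute, not on where the newlines fall.  A line-oriented state
machine would need changes to cope with the second layout.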

On the other hand, your state machine does look like it's easy to work
on.  I would guess that you could invest the time to get BeautifulSoup
working and then see which approach is easier to keep on track.

-- 
Lloyd Kvam
Venix Corp
DLSLUG/GNHLUG library
http://www.librarything.com/catalog.php?view=dlslug


