[Python-talk] Calling .next() - Re: PySIG notes, 26-April-2007
Bill Sconce
sconce at in-spec-inc.com
Mon Apr 30 15:42:10 EDT 2007
On Fri, 27 Apr 2007 17:28:59 -0400
Ted Roche <tedroche at tedroche.com> wrote:
> Thirteen attendees made it to the April meeting of the Python Special
> Interest Group...
Superlative writeup, Ted! (Once again.) Thanks! And thanks to
everyone who made it such a success (especially to Janet for the
cookies, and to Ted and Mark for the milk). We're over the
one-gallon-per-meeting threshold, evidently. :)
I promised to write up a twist about iterators which Kent showed us.
Here it is.
Background: to process a file, you often write something like:
#0 --------------------------------------
    csvs = open('invoices.csv', 'r')
    for csv_line in csvs:
        do_something(csv_line)
        ...
    csvs.close()
I.e., loop through the file, doing something for each line. Now say
that the first two lines of your file are something which you wish to
ignore, such as column-header lines in a CSV file from a spreadsheet.
You might write it like this (and I used to, before Kent's presentation):
#1 --------------------------------------
    csvs = open('invoices.csv', 'r')
    c = 0
    for csv_line in csvs:
        c += 1
        if c < 3:
            continue
        do_something(csv_line)
        ...
    csvs.close()
Here's a better way, based on the fact that you can call an iterator
explicitly and then "use the rest" of the iterator in a for loop in
the usual (implicit) way:
#2 --------------------------------------
    csvs = open('invoices.csv', 'r')
    csvs.next()
    csvs.next()
    for csv_line in csvs:
        do_something(csv_line)
        ...
    csvs.close()
Explanation: as Kent explained, a file is an iterator; the for loop
operates by calling the iterator's .next() method under the covers.
Here we "use up" the first two values yielded by the file's iterator
and THEN enter the "for" loop.
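The "under the covers" part can be spelled out by hand. What follows is only a rough sketch, not the interpreter's actual implementation. It uses the built-in next(), which exists since Python 2.6 and is the only spelling in Python 3 (where the method itself is named __next__). An in-memory io.StringIO with made-up invoice lines stands in for invoices.csv.

```python
import io

# Hypothetical stand-in for invoices.csv: two header lines, then data.
csvs = io.StringIO("Invoices\nnumber,amount\n1001,25.00\n1002,40.00\n")

next(csvs)  # use up header line 1
next(csvs)  # use up header line 2

# Roughly what "for csv_line in csvs:" does on our behalf.
rows = []
it = iter(csvs)                 # ask the object for its iterator
while True:
    try:
        csv_line = next(it)     # fetch the next line
    except StopIteration:       # raised when the lines run out
        break
    rows.append(csv_line.rstrip("\n"))

print(rows)
```

Because the two explicit next() calls and the loop draw from the same iterator, the loop simply picks up where the explicit calls left off.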
Although this is a trivial case it's realistic and very Pythonic. It
simplifies and gets rid of cruft. You get:
  o Improved readability
      + Less clutter inside the loop, where the real work is
      + Less clutter period(*)
  o Less unit testing
      + For this writeup I had to retest #1 a few times to get it right.
      + (How many times? I'm not talking.)
      + (#2 ran the first time.)
  o Better maintainability
      + Code which isn't there doesn't break.
      + You don't have to read through it later if it isn't there.
(*) Just writing the clutter takes a lot of cycles:
  o What do I want to call the local variable?
  o Do I want to increment it before the loop body? After?
  o Will I forget to increment it? :)
  o Do I want to start the control variable at 0 and count up?
  o Do I want to start the control variable at 1 and count up?
  o Do I want to start the control variable at 2 and count down?
  o Will I be off by one? (etc.)
It's easier to do these things in Python than in other languages,
but what's easier still is to not do them.
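If the number of lines to skip is not fixed at two, one alternative is itertools.islice from the standard library, which avoids the counter entirely. This was not part of Kent's presentation; it is just a sketch. The helper name skip_lines and the sample data are hypothetical.

```python
import io
from itertools import islice

def skip_lines(lines, n):
    """Return an iterator over `lines` with the first n items skipped."""
    return islice(lines, n, None)

# Hypothetical stand-in for invoices.csv: two header lines, then data.
csvs = io.StringIO("Invoices\nnumber,amount\n1001,25.00\n1002,40.00\n")

data = [line.rstrip("\n") for line in skip_lines(csvs, 2)]
print(data)
```

If the file has fewer than n lines, islice simply yields nothing rather than raising an error.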
There's no need to ever code like #1 again. Thanks, Kent!
-Bill
P.S. You could even have occasion to call .next() *inside* a loop:
#3 --------------------------------------
    csvs = open('invoices.csv', 'r')
    csvs.next()
    csvs.next()
    for csv_line in csvs:
        do_something(csv_line)
        if has_a_trailer_record(csv_line):
            trailer_record = csvs.next()
            do_something_else(trailer_record)
        ...
    csvs.close()
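One caveat with #3: if the line announcing a trailer happens to be the last line in the file, the inner .next() raises StopIteration. A hedged sketch of one way to guard against that uses the built-in next() with a default value (Python 2.6 and later, including Python 3). The has_a_trailer_record rule and the sample data below are invented for illustration; the original leaves them unspecified.

```python
import io

def has_a_trailer_record(line):
    # Hypothetical rule: a line starting with "T," announces a trailer.
    return line.startswith("T,")

# Hypothetical data: two headers, one invoice, a complete trailer pair,
# then a trailer announcement with nothing after it.
csvs = io.StringIO(
    "h1\nh2\n1001,25.00\nT,follows\ntotal,25.00\nT,follows\n"
)
next(csvs)
next(csvs)

seen = []
for csv_line in csvs:
    seen.append(("line", csv_line.strip()))
    if has_a_trailer_record(csv_line):
        # The default (None) is returned instead of raising StopIteration.
        trailer_record = next(csvs, None)
        if trailer_record is None:
            seen.append(("missing trailer", None))
            break
        seen.append(("trailer", trailer_record.strip()))

print(seen)
```

Whether a missing trailer should be tolerated or treated as an error depends on the file format; the point is only that the choice can be made explicit.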