[Python-talk] Excel vs. SQLite w Python ?
Ric Werme
ewerme at comcast.net
Sun May 18 12:30:54 EDT 2008
Peter Courlis wonders:
> Thinking further about this, I wondered if there was a Python module
> that interfaces Excel (or perhaps Perl) ?
Beats me, I never used Excel. I do use gnumeric for tracking snow fall
and snow depths around New England. Gnumeric saves things in compressed
XML which always amuses me - it they hadn't used such a bloated encoding
they wouldn't need to compress it. :-)
At any rate, it is pretty easy to parse, the whole program is attached.
Don't expect elegance or error tolerance, if it breaks, it affects only me.
The result is stuck in some other HTML, the result is at
http://wermenh.com/sdd/ne-0708.html . If Excel has a XML output it
might be as easy to handle, at least once you get past all the Microsoft
crud that is likely to be there.
> A third alternative would be to use MySql application.
MySQL works too, see http://wermenh.com/wx/vantage_software.html for how I
handle some of my weather data for http://home.comcast.net/~ewerme/wx/current.h
tm
-Ric Werme
# Program to scan ne.gnumeric to pull out the snowfall and SDD information,
# and create the table content that will be used to create the web page.
from xml.dom.minidom import parse, parseString
import sys
# The .gnumeric file xml data looks something like:
# <gnm:Cell Col="0" Row="16" ValueType="60">monthly</gnm:Cell>
# <gnm:Cell Col="1" Row="16" ValueType="60">Poland Spring ME</gnm:Cell>
# <gnm:Cell Col="4" Row="16" ValueType="40">5.6</gnm:Cell>
# <gnm:Cell Col="5" Row="16" ValueType="30">22</gnm:Cell>
# <gnm:Cell Col="6" Row="16" ValueType="40">15.7</gnm:Cell>
# <gnm:Cell Col="7" Row="16" ValueType="30">142</gnm:Cell>
ne = parse(sys.argv[1])
cells = ne.getElementsByTagName('gnm:Cell')
f = open('ne.monthly', 'w')
rowtype = None
needeol = 0
summary = []
# Scan each row for active data (the first column will say "monthly") and
# translate the data in to HTML. We have to accumulate the total snowfall
# and SDD because the XML data for the "summary" lines will be like:
# <gnm:Cell Col="0" Row="41" ValueType="60">summary</gnm:Cell>
# <gnm:Cell Col="1" Row="41" ValueType="60">Poland Spring ME</gnm:Cell>
# <gnm:Cell Col="2" Row="41">=C17+E17+G17+I17+K17+M17+O17+Q17</gnm:Cell>
# <gnm:Cell Col="3" Row="41">=D17+F17+H17+J17+L17+N17+P17+R17</gnm:Cell>
# <gnm:Cell Col="4" Row="41">=D42/C42</gnm:Cell>
# <gnm:Cell Col="5" Row="41" ExprID="2"/>
# Spreadsheet cell data is recorded left-to-right, top-to-bottom, and we
# believe that. The column 0 data is used to start a new line and finish
# the old one. When we see a summary line, we skip the rest of the data
# because we've computed what we need. That's kept in a list of tuples
# at summary.
for element in cells:
col = int(element.getAttribute('Col'))
row = int(element.getAttribute('Row'))
try:
valuetype = int(element.getAttribute('ValueType'))
except:
print '*** Bogus ValueType at row %d+1, col %d+1, content %s' % \
(row, col, element.childNodes[0].wholeText)
valuetype = 10
if valuetype == 10: value = None
else: value = element.childNodes[0].wholeText
# print 'row %s, col %s, value %s' % (row, col, value)
if col == 0:
rowtype = value
if needeol == 1:
f.write(' </tr>\n')
summary.append((placename, totalsnow, totalsdd))
needeol = 0
if rowtype == 'summary':
break
totalsnow = 0.0
totalsdd = 0.0
elif col == 1:
placename = value
if rowtype == 'monthly':
f.write(' <tr align=right>\n <th nowrap>%s</th>' % value)
needeol = 1
expect_col = 2
elif needeol == 1 and (valuetype == 30 or valuetype == 40):
# Months of no activity have no cell data.
while expect_col < col:
expect_col += 2
f.write(' <td></td>\n <td></td>\n')
expect_col = col + 1
# Print snow, sdd, snow, sdd data across array
f.write(' <td>%s</td>\n' % value)
# Even columns are snow, odd are SDD, we have to sum them up ourselves.
if col & 1:
totalsdd += float(value)
else:
totalsnow += float(value)
f.close()
f = open('ne.summary', 'w')
# f = sys.stdout
# print 'Summary:'
for (placename, snow, sdd) in summary:
f.write(' <tr align=right>\n <th nowrap>%s</th>\n' % placename)
f.write(' <td>%g</td>\n' % snow)
f.write(' <td>%g</td>\n' % sdd)
if snow > 0.0:
pq = '%.1f' % (sdd / snow)
else:
pq = '...'
# print placename, snow, sdd, pq
f.write(' <td>%s</td>\n </tr>\n' % pq)
f.close()
More information about the Python-talk
mailing list