|
Having experimented with the small-scale simulations, you may be wondering how they were written and put onto the Web. This little article gives a short non-technical account of how we did it. Links to detailed references can be found at the end.
Our modellers wrote the simulations in Excel, as spreadsheets. This is something they are very familiar with, but it did pose some problems in getting the models onto the Web. Which, of course, we wanted to do, so that we could publish them to the widest possible audience. If you run Windows and have Excel on your computer, perhaps you know that if you tell Internet Explorer to
go to a Web address that points at a spreadsheet, then it will start up Excel and display the spreadsheet for you. The problem is that not everyone uses Internet Explorer or Windows. Even for those who do, Excel is not the best solution, because it forces all the simulations' inputs and outputs into a grid, which is not too easy to read. We wanted to display our simulations
using the full freedom of layout permitted on the Web.
The answer was to write a "spreadsheet evaluator": a program that reads spreadsheet files, extracts the formulae from them, and evaluates these to get the answers. This ran on our Web server, and could display its results as HTML. In essence, this is simple, but as John McCabe said in his novel Stickleback, essence and programming are contradictory notions.
Programming is never simple, and always takes longer than you expect, even after you have allowed for the fact that it takes longer than you expect.
One problem was making the evaluator read the spreadsheets. Normal Excel files are very complex, and hard to get information about. So I didn't even try; instead, I asked our modellers to save their simulations in a simpler format called SYLK. These files are plain text, and contain the formulae in conventional mathematical notation, so it's easy to use standard
computing techniques to read them.
When evaluating a spreadsheet, it is important to do the calculations in the right order. For example, consider the house prices simulation. This contains a formula which calculates supply from cost of land, and another which calculates equilibrium price from supply. If the spreadsheet evaluator works out the second before the
first, it won't have a value for supply. So before doing any calculations, it has to determine how all the spreadsheet cells depend on one another, and then sort the formulae so that the ones which depend on nothing else are calculated first, then the ones which depend on those are calculated, and so on until we have the results.
The evaluator is actually part of a larger project, which I call Model Master, for helping people write spreadsheets safely. Why should that be necessary? Consider this example:
Two spreadsheets with 15,000 cells were used to project the market for CAD equipment. The spreadsheet's author programmed the spreadsheet to round its numbers off to whole dollars. But this made the inflation multiplier, which should have been 1.06, become 1. Without inflation, the spreadsheet underestimated the market by $36 million.
or this:
A Florida construction company used the command @sum in a Lotus spreadsheet to total numbers in a range of cells. When somebody put a new item into the spreadsheet, they overlooked changing the range, so @sum ignored the new item. This caused the company to underbid the project by a quarter of a million dollars. The company sued Lotus.
The problem is that spreadsheets can be very hard to read and debug: because they don't use meaningful names, but cell identifiers such as C16; because the contents of a spreadsheet can't be printed in the way a normal program can. So I have been developing a set of tools, Model Master, for reading and analysing spreadsheets, and to implement a programming language that
is less error-prone for writing spreadsheets than is Excel. The spreadsheet evaluator was a by-product of this work.
Anyway, once the evaluator was working, it had to be linked to our Web server so that when students submitted data from a simulation input form, it would go to the server, thence to the evaluator, which would work out the results, and send them back as an HTML page. Linking programs like this to Web servers is easy if one uses a programming language called Java. Java is
portable, meaning that it's easy to write programs on one machine and run them on another. So it was possible to develop and test on a laptop which had its own Web server, and when the programs were working there, upload them to the Biz/Ed server.
Although portable, Java is philosophically defective. It is designed so that one has to treat everything in the program as an "object": an entity with its own location and identity, which can change with time. This makes sense in some situations, but is completely inappropriate when programming with mathematical entities such as the numbers that make up a spreadsheet.
I overcame this by using another language named Kawa. This is as portable as Java (it compiles programs into Java code), but has a much more "mathematical" approach, viewing the world in terms of numbers and other mathematical entities, and functions that operate on these. That's what one needs when doing this kind of computer-science work.
Another nice thing about Kawa is that it is free, being provided by GNU, an organisation devoted to the ideal that, as a matter of personal liberty:
Free software is a matter of freedom: people should be free to use software in all the ways that are socially useful. Software differs from material objects - such as chairs, sandwiches, and gasoline - in that it can be copied and changed much more easily. These possibilities make software as useful as it is; we believe software users should be able to make use of them.
As mentioned at the start of this article, we wrote the spreadsheet evaluator so that our modellers could exploit the freedom of layout that the Web offers, allowing inputs and outputs to appear anywhere in a Web page - in paragraphs, in headings, in tables - rather than being constrained to the grid of a spreadsheet. Since HTML - the language in which Web pages are written -
can only describe static pages whose content never changes, it is not suitable for the models' output pages, where the content changes on each run. So I devised a system which allows our staff to write HTML, but with special commands in it where the spreadsheet output should appear. For example, the house price model calculates its "New Price" result in cell 16 of
the spreadsheet. So, for example, with this system, to make the output page, we could write
<TABLE>
<TR>
<TD>New Price</TD>
<TD>[ (output "C16") ]</TD>
</TR>
...
</TABLE>
Here, the TABLE command makes a table, the TR command makes a row, and the TD command makes a cell. If you look at the output page, you should be able to see which part this refers to. And the (output "C16") command causes the output from cell C16 to appear. The system actually works by converting the page description into a program (in Kawa), which then
runs on our server and prints its output every time a model is run.
|