Presenting math on the web

This is a long post about ways to present math on the web, in the context of what I have done with The Handbook of Mathematical Discourse and abstractmath.org (Abmath).  “Ways to present math” include both organization and production technology.

The post is motivated by and focused on my plans to reconstruct Abmath this fall, when I will not be teaching. During the last couple of years I have experimented with several possibilities for the reconstruction (while doing precious little on the actual website) and have come to a tentative conclusion about how I will do it. I am laying all this out here, past history and future plans, in the hope that readers will have suggestions that will help the process (or change my mind).

I set out to write both the Handbook and Abmath using the same ideas about how math should be presented on the web. They came out differently. I now think I went wrong in some of the ways I organized Abmath, and that I need to reconstruct it so that it is more like the Handbook. On the other hand, I have decided to stick with the production method I used for Abmath. I will explain.

Organization

My concept for both these works was that they would have these properties:

1) Each work would be a cloud of articles. They would have little or no hierarchy.  They would consist of lots of short articles, not organized into chapters, sections and subsections.

2) The articles would be densely hyperlinked with each other and with the rest of the web. The reader would use the links to move from article to article. The articles might occur in alphabetical order in the production file but to the reader the order would be irrelevant.

I wanted the works to be organized that way because that is what I want from an information-presenting website, and I want that because I am a grasshopper. Wikipedia and the nLab are each organized as a cloud of articles. I started writing the Handbook in the late nineties, before Wikipedia began.

The Handbook exists in two forms. The web version is a hypertext PDF file that consists of short articles with extensive interlinking. The printed book has the same short articles arranged in alphabetical order. In the book form, the links are replaced by page references (“paper hyperlinks”). In both forms some links are arranged as lists of related topics.

Abstractmath.org is a large, interlinked collection of html pages.  They are organized in four large sections with many subsections.

Many entrances

For this cloud-of-articles arrangement to work, there must be many entrances into the website, so that a reader can find what they want. The Handbook has a list of entries in alphabetical order. Certain entries (for example the entries on attitudes, on behaviors, and on multiple meanings) have internal lists of links to examples of what that entry discusses. In addition, the paper version has an index that (in theory) provides links to all important occurrences of each concept in the book. This index is not included in the current hypertext version, although the LaTeX package hyperref would make it possible to include it. On the other hand, the hypertext version has the PDF search capability.

Abmath has a table of contents, listing articles in hierarchical form, as well as an index, which is different from the Handbook index in that it gives only one link from each word or phrase. In addition, it has header sections that briefly describe the contents of each main section and (in some cases) subsection. It also has a Diagnostic Examples section (currently fragmentary) in which each entry describes a particular problem that someone may have in understanding abstract math, with links to where it is discussed. The website currently has no search capability.

The Handbook is really a cloud of articles, and Abmath is not. I made a serious mistake imposing a hierarchy on Abmath, and that is the main thing I want to correct when I reconstruct it.  Basically, I want to dissolve the hierarchy into a cloud of articles.

Production methods

The Handbook was composed using LaTeX. It originally existed in hypertext form (in a PDF file) and lived on the web for several years, generating many useful suggestions. I wrote a LaTeX header that could be set to produce either PDF output with hyperlinks or PDF output formatted as a book with paper hyperlinks; the latter form was eventually published as a book.

I used a number of Awk programs to gather the various kinds of links.  For example, every entry referring to a math word that has multiple meanings was marked and an Awk program gathered them into a list of links.
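
For illustration, here is a minimal Python sketch of that kind of gathering pass. It is not the Awk code used for the Handbook. The \entry{...} command and the \mmflag marker are hypothetical markup invented for this example, and the sketch assumes each entry's label is the same as its name. The only real LaTeX command it relies on is hyperref's \hyperref[label]{text}.

```python
import re

# Hypothetical markup (an assumption, not the Handbook's real source format):
# each entry starts with \entry{name}, and entries about a word with
# multiple meanings contain the marker \mmflag somewhere in their text.
ENTRY_RE = re.compile(r"\\entry\{([^}]*)\}")
MARKER = r"\mmflag"


def gather_marked_entries(source):
    """Return the names of entries whose text contains MARKER, in source order."""
    marked = []
    current = None
    for line in source.splitlines():
        match = ENTRY_RE.search(line)
        if match:
            current = match.group(1)
        if MARKER in line and current is not None and current not in marked:
            marked.append(current)
    return marked


def as_link_list(names):
    """Format entry names as hyperref links, one per line.

    Assumes each entry carries a label equal to its name.
    """
    return "\n".join(r"\hyperref[%s]{%s}" % (name, name) for name in names)


SAMPLE = r"""
\entry{function}
A function ... \mmflag
\entry{group}
A group is ...
\entry{or}
The word ``or'' ... \mmflag
"""

print(as_link_list(gather_marked_entries(SAMPLE)))
```

Run on the sample, this lists the entries "function" and "or" and skips "group", which is not marked. The generated lines could then be pasted or \input into the entry that collects the links.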

I generated the html pages for Abmath using Microsoft Word and MathType. MathType is very easy to use and has the capability (recently acquired) of converting all the math entries that it generated into TeX. The method used for Abmath has several defects. You can’t apply Awk (or nowadays Python) programs to a Word document since it is in a proprietary format. Another problem is that the appearance of the result varies from browser to browser.

But the Abmath method also has advantages. It produces html documents that can be read in windows that you can make narrower or wider, and the text will adjust. PDF files are fixed width and rigid, and I find that clicking on links in them requires annoyingly precise fingerwork.

So my original thought was to go back to LaTeX for the new version of Abmath. There are several ways to produce html files from LaTeX, and converting the MathType entries to TeX provides a big headstart on converting the Word files into text files.  Then I could use Awk to do a lot of bookkeeping and cut the hyperlink errors, the way I did with the Handbook.
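
As an example of the kind of bookkeeping I mean, here is a minimal Python sketch (Python rather than Awk) that reports in-page links with no matching target. It is only an illustration under stated assumptions: it looks at a single HTML page, considers only links of the form href="#name", and treats any id attribute or a name attribute on an anchor as a target. The function name broken_internal_links and the sample page are made up for the example. It uses only the standard library's html.parser module.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect fragment targets (id/name) and in-page links (href="#...")."""

    def __init__(self):
        super().__init__()
        self.targets = set()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.targets.add(attrs["id"])
        if tag == "a":
            # Older pages often mark targets with <a name="...">.
            if attrs.get("name"):
                self.targets.add(attrs["name"])
            href = attrs.get("href")
            if href and href.startswith("#") and len(href) > 1:
                self.links.append(href[1:])


def broken_internal_links(html_text):
    """Return the fragment names that are linked to but never defined."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    # Check only after the whole page is parsed, so forward links are fine.
    return [link for link in collector.links if link not in collector.targets]


PAGE = """
<h2 id="sets">Sets</h2>
<p>See <a href="#functions">functions</a> and <a href="#relations">relations</a>.</p>
<h2 id="functions">Functions</h2>
"""

print(broken_internal_links(PAGE))
```

On the sample page this reports "relations", the one link with no target. Checking links between separate pages would need an extra pass that collects the targets of every file first.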

So at first I was quite nostalgic about the wonderful time I had doing the Handbook in LaTeX — until I remembered all the fussing I did to include illustrations and marginal remarks. (I couldn’t just put the illo there and leave it.) Until I remembered how slowly the resulting PDF file loads because there seems to be no way to break it into individual article files without breaking the links.

And then I found that (as far as I could determine) there is no HTMLTeX that produces a reasonable HTML file from any TeX file the way PDFTeX produces a PDF file from any TeX file, using Knuth’s TeX program. In fact all the TeX to HTML systems I investigated don’t use Knuth’s program at all; they just have code in some programming language that reads a TeX file and interprets what the programmer felt like interpreting. I would love to be contradicted concerning this.

So now my thought is to stick with Word and MathType.  And to do textual manipulation I will have to learn Word Basic.  I just ordered two books on Word Basic. I would rather learn Python, but I have to work with what I have already done.  Stay tuned.
