Writing and Markup (January 15, 2004)

What I really want is to write these pages without worrying about markup, and still have full control when I finally present them. As should be clear by now from everything I've said about it, the pages on this site are written and stored as XML, then converted to the XHTML you are probably reading. I've already written about the advantages and my reasons for doing this, but the downside is the amount of typing I've forced on myself by using XML.

It would be easier, I guess, if I were using HTML, since there are so many good editors for HTML these days. Most of them give you a WYSIWYG view of the page, so writing feels more or less like using a word processor. For the most part, you just type, and use keyboard shortcuts to mark up boldface, italics, underlines, headings, and so on. Occasionally you have to jump to a menu to insert a table or something, but generally you can spend most of your time writing.

Using XML means I can do neat things, like generate different versions of the website, or parse the pages to find links and inter-page references, make lists of pages, and so on. I like that aspect of it. What I don't like so much is having to surround my writing with silly little tags like

   <p></p>   <b></b>   <ul><li></li></ul>   <code></code>   

and so on. I use jEdit to write the pages, and that helps a lot (it closes tags automatically, and checks page validity against the DTD), but it still feels like I am constantly aware of markup while typing.
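As an illustration of the "parse the pages to look for links" idea mentioned above, here is a minimal Python sketch. The page format is an assumption: the `page`, `section`, `p`, and `link` element names and the `href` attribute are invented for the example and are not this site's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: element and attribute names are made up for this sketch.
PAGE = """<page title="Writing and Markup">
  <section>
    <p>See <link href="advantages.xml">the advantages</link> and
       <link href="editors.xml">the editors</link>.</p>
  </section>
</page>"""


def find_links(xml_text):
    """Return (href, link text) pairs for every <link> element in the page."""
    root = ET.fromstring(xml_text)
    return [(el.get("href"), (el.text or "").strip()) for el in root.iter("link")]


links = find_links(PAGE)
for href, text in links:
    print(href, "->", text)
```

Because the page is well-formed XML, a few lines of a standard parser are enough; the same walk could just as easily build a list of pages or check inter-page references.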

There are solutions, and the one I'm currently considering is a mixture of XML and structured text. Structured text means different things in different contexts, but here I mean Wiki-style structured text (WST). The people who developed the various Wiki tools decided to make it easy to format Wiki pages by letting you type what you want with almost no markup at all, or rather, with WST, which is very, very simple. You just type. When you want something in bold, you surround it with *asterisks*; when you want it underlined, you surround it with _underscores_. These few conventions are so unobtrusive that it feels just like writing an email. And the cool thing is, when you finally see the rendered Wiki page, you have boldface, italics, headings, lists, links, and so forth. If I were using WST, then *this* _little piece of text_ would appear to you with "this" in bold and "little piece of text" underlined.
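The two conventions described above can be sketched with a pair of regular expressions. This is only a toy in Python, not how any particular Wiki engine does it, and the choice of `<b>` and `<u>` as output tags is an assumption.

```python
import re
from html import escape


def wst_inline(text):
    """Convert *bold* and _underlined_ spans to XHTML inline elements."""
    # Escape &, < and > first so plain text cannot inject markup.
    out = escape(text, quote=False)
    # *text* becomes bold; the span may not cross a line break.
    out = re.sub(r"\*([^*\n]+)\*", r"<b>\1</b>", out)
    # _text_ becomes underlined, under the same restriction.
    out = re.sub(r"_([^_\n]+)_", r"<u>\1</u>", out)
    return out


print(wst_inline("then *this* _little piece of text_ would appear"))
```

A real engine has to be more careful than this: the underscore rule, for instance, would also fire on identifiers such as some_file_name, which is one reason Wiki dialects add escaping rules.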

I like WST a lot. Once you learn the basic rules, writing a page is just like typing in a text editor. There is really nothing that gets in the way. So my new idea is to use XML to structure the pages of my website, but once I begin a <section></section>, I just type using WST. When my XSL encounters one of those <section> elements, it passes the contents to a Wiki-markup engine like SnipSnap's Radeox, which spits out valid XHTML. I am waiting until the rest of my XSL is stable enough to do this, since I am likely to break things when I try to link an extension function into the XSL.
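The shape of that pipeline can be sketched outside XSL. The Python below is a stand-in, not the XSL extension function or the Radeox API: `render_wst` is a hypothetical toy renderer that only knows paragraphs and *bold*, and the `page` wrapper element and the `div class="section"` output are assumptions.

```python
import re
import xml.etree.ElementTree as ET
from html import escape


def render_wst(text):
    """Toy stand-in for a Wiki engine: blank-line paragraphs and *bold* only."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    rendered = []
    for p in paragraphs:
        body = re.sub(r"\*([^*\n]+)\*", r"<b>\1</b>", escape(p, quote=False))
        rendered.append("<p>" + body + "</p>")
    return "\n".join(rendered)


def page_to_xhtml(xml_text):
    """Hand the text of every <section> to the renderer and wrap the result."""
    root = ET.fromstring(xml_text)
    parts = []
    for section in root.iter("section"):
        parts.append('<div class="section">\n' + render_wst(section.text or "") + "\n</div>")
    return "\n".join(parts)


PAGE = """<page>
  <section>
I like *WST* a lot.

You just type.
  </section>
</page>"""

print(page_to_xhtml(PAGE))
```

The XML still carries the page structure, while everything inside a section is plain typing; only the renderer needs to know the Wiki conventions.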

The question is, how does a Wiki accomplish this stunning feat? Well, WST simplifies the markup problem by saying that there are really just a few basic types of markup that you want: bold, italic, underline, headings, lists, links, and so on. I think the standard WST includes just over a dozen of these. The Wiki developers then figured out the least amount of typing required to get those dozen variations, and followed conventions people were already using in plain-text email, like ** or __ to highlight text. They also restricted the variations in layout. In other words, they simplified the markup by simplifying the problem.

Compared to DocBook, LaTeX, or SGML markup, WST can't really do much for you. Those let you write complete books with as much control as you could possibly need. DocBook, for example, has dozens of tags for formatting documents, in part because "real" writers need that much control over how a "real" book is laid out.

For my purposes in writing a basic website, WST seems like a good solution. Radeox allows you to compose your own definition of a WST, so if I need to I can come up with weird little conventions, for example making $% the markup for a filename or something. But the more I deviate from the standard, the more I need to remember; WST works because there is little to remember. Sadly, what I will probably never have, at the end of the day, is perfect control of presentation using a simple-to-type markup.
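To make the "compose your own definition" idea concrete, here is a small Python sketch of a rule table with one custom rule added. It does not use Radeox or imitate its API; the rule format, the `$%name$%` delimiters (the marker on both sides of the filename), and the `code class="filename"` output are all assumptions for illustration.

```python
import re
from html import escape

# Each rule is (compiled pattern, replacement); rules are applied in order.
STANDARD_RULES = [
    (re.compile(r"\*([^*\n]+)\*"), r"<b>\1</b>"),
    (re.compile(r"_([^_\n]+)_"), r"<u>\1</u>"),
]

# Hypothetical custom convention: $%name$% marks a filename.
FILENAME_RULE = (re.compile(r"\$%(.+?)\$%"), r'<code class="filename">\1</code>')


def render(text, rules):
    """Escape the text, then apply each markup rule in turn."""
    out = escape(text, quote=False)
    for pattern, replacement in rules:
        out = pattern.sub(replacement, out)
    return out


MY_RULES = [FILENAME_RULE] + STANDARD_RULES
print(render("Edit $%site.xsl$% *carefully*.", MY_RULES))
```

Every extra entry in the table is one more convention to keep in your head, and rules can interfere with each other (a filename containing underscores would trip the underline rule here), which is the trade-off described above.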

You can comment on this on the jRoller site, the host for the blog entry above.