
Copyright (C) by the contributors. Some rights reserved, license BY-SA.

Sponsored by the Wiki Symposium and the Nuveon GmbH.


It is my humble opinion that much of the success of HTML derives from its bastard nature of being somewhere in between the ideals of pure semantic and pure presentation markup. Many feel that semantic markup is by definition superior to presentation markup, but I think there are advantages to both approaches, and I don't think that a markup language need apologize for retaining some of its presentation flavor rather than being purely semantic.

The advantages of semantic markup are compelling. It gives you more flexibility in presentation (especially when combined with style sheets), and enables more sophisticated processing of the marked up text.

That said, let us take a critical look at the relative advantages of presentation markup, especially as the basis for wikis. I argue that it is simpler, less likely to be gotten wrong, and provides more direct feedback to users.

Consider the correct markup for indicating phrases in a foreign language under both approaches. There is a well established typographic tradition that such phrases be italic. Thus, in a presentation framework, the correct markup is:

Presentation markup has a certain //je ne sais quoi.//

In HTML, the situation is considerably more complex. There are a number of simple tags that have the same presentation as above in standard display contexts, including the presentation markup <i>je ne sais quoi</i>, but none of the semantic tags have quite the right meaning. Most typists will probably use <em>, under the belief that <em> is superior to <i> because it's semantic, but the phrase isn't necessarily emphasized, so this semantic markup isn't very accurate. None of <cite>, <var>, or <address> has the desired meaning either.

In fact, HTML does have a way to indicate the meaning precisely. It is as follows:

<style type="text/css">
:lang(fr) { font-style: italic }
</style>
Semantic markup lacks a certain <span lang="fr">je ne sais quoi.</span>

This markup even has the advantage of being more likely to be read correctly by screen reader software. Yet I think a survey would find that correct semantic markup accounts for a tiny fraction of all such instances on the web. Incorrect semantic markup is almost certainly the majority, and presentation markup (correct, but not useful to screen readers) is probably a significant minority.

In sum, the goal of achieving correct semantic markup is more ambitious than the corresponding goal of correct presentation markup. There are many more choices, and, perhaps most important in a wiki context, very little in the way of feedback to indicate that the markup is wrong. By contrast, with presentation markup the feedback is clear and direct - it looks wrong.

Another reason to believe Wiki markup has more of a presentation flavor than HTML is the lack of style sheets (at least under author control). This virtually guarantees that semantic markup will be chosen based on its rendering to the desired presentation, rather than the semantic meaning.

This is not to say that Wiki markup is purely presentation. Quite the contrary, many elements such as headers and lists have a strong semantic flavor, and the mapping from semantics to presentation is legitimately diverse. If you copied text from the Wikipedia to this wiki (for example), you'd want your headers and bullets to look consistent here, rather than copying the presentation from the Wikipedia.

Instead, I recommend we honor both traditions, and celebrate the relative simplicity of presentation markup when the semantic waters get deep and murky. One symbolic way to do that would be for the preferred XHTML for the // (italic) and ** (bold) wiki markup to be <i> and <b> tags, respectively.
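
As a rough illustration of that recommendation, here is a minimal Python sketch of an inline renderer that maps wiki emphasis straight to presentational tags. It assumes // delimits italics (as in the example above) and ** delimits bold. The function name render_inline and the rule table are hypothetical and not taken from any wiki engine, and the sketch deliberately ignores complications such as the // inside URLs.

```python
import html
import re

# Hypothetical rule table: wiki delimiter pattern -> presentational tag.
# Assumption: ** marks bold and // marks italics.
_INLINE_RULES = [
    (re.compile(r"\*\*(.+?)\*\*"), "b"),
    (re.compile(r"//(.+?)//"), "i"),
]


def render_inline(text: str) -> str:
    """Escape the text, then turn **bold** and //italic// spans into <b> and <i>."""
    # Escape first so literal angle brackets in the source cannot inject tags.
    out = html.escape(text, quote=False)
    for pattern, tag in _INLINE_RULES:
        out = pattern.sub(rf"<{tag}>\1</{tag}>", out)
    return out


print(render_inline("Presentation markup has a certain //je ne sais quoi.//"))
```

The point of the sketch is how little the renderer has to decide: it never guesses whether the author meant emphasis, a foreign phrase, or a title, so there is no semantic claim for it to get wrong.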

Perhaps more importantly, we should recognize that indented blocks have many possible semantic meanings, and if we focus only on the single semantic meaning of Quoting, that guarantees their misuse.

-- Raph Levien 2007-01-01

Matthew Paul Thomas has a pretty good essay making similar points to what I was trying to say above. His concluding paragraph:

<blockquote> So if you want to use bold or italics, and HTML doesn’t have a semantic element for what you mean, use b or i. If you’re not sure which semantic element to use, use b or i. And if you’re creating an authoring tool for people who won’t know or care about semantics, please leave the semantic markup alone, and just stick to b and i. Thankyou. </blockquote>

In that essay, he mentions Markdown, and in a followup, he specifically addresses Wikis.

Paul Ford has some interesting things to say about pidgin and creole in a piece that apparently aired on NPR's All Things Considered, and expands on these ideas here.

And for some fun, Clinton Forbes goes after semantic markup zealots here. I'm not sure whether he had Faruk Ateş in mind when he wrote that.

Jukka Korpela makes a strong case that <em> and <strong> are little more than aliases for <i> and <b> to satisfy purists.

Lastly, I vote for "Won't somebody please think of the gerbils?" (by diveintomark) as the appropriate response to mindless advocates of semantic markup.

-- Raph Levien 2007-01-07


« This page (revision-5) was last changed on 26-Jun-2007 18:42 by