I tend to agree that there is something wrong with the list markup; however, I'm not sure the problem is the actual character used as the list marker. I encountered exactly the same problems with numbered lists -- because MoinMoin uses the # character for comments and pragmas, and it's hardcoded, handled before the parser plugin even sees the text. Sure, that's a problem of this particular wiki engine, but many other engines also use the # character in a similar way -- maybe not hardcoded, but that still gives the same trouble in mixed mode.
-one
-**two**
--two-one **still two-one**
---two-one-one
-three
I don't believe that changing the list character to "-" will really solve the problem at its roots -- this too looks to me like a cure for the symptoms only. And somehow changing it to a more widely used character doesn't seem right. It's just inviting more problems. This is in fact plainly visible by the need for additional escape character markup. The list-bold ambiguity can also be easily solved with a similar technique, an "invisible space" similar to the one used for }}}. But that's not a "clean" solution. It doesn't feel right.
*one
*~**two**
**two-one ~**still two-one**
***two-one-one
*three
Many wiki engines select a different markup for nested lists -- indentation. This has several nice benefits: it allows a single character to be used for list markup (making any double-character combinations safe from ambiguity), it visually highlights the list items in a large body of text (which greatly improves readability and ease of editing) and, last but not least, it doesn't look extremely ugly, like the multiple-character nested lists do. Unfortunately, this approach also has some huge disadvantages that make it practically impossible to accept: it uses InvisibleMarkup, requiring users to count the spaces; it uses indentation, which is extremely obnoxious in most browsers due to the lack of autoindentation and of a "just working" tab key; and it requires manual fixing of the indentation whenever text is moved between lists created by different users, with different indentation depths. This approach is also hard to implement in mixed mode in the wikis that use the other technique -- because they usually already use indentation for pre blocks.
*one
***two**
  *two-one **still two-one**
    *two-one-one
*three
Of course, using an otherwise unused character for marking lists would work. This is why nobody but me is complaining about the numbered lists -- the # is actually rare. So, use of "+", or "@", or "%", or even something like "~", or "^" would get us out of the trouble -- by introducing an artificial, totally unfamiliar markup unlike anything else. Let's face it, both "-" and "*" are the most popular bullet characters: "-" in all kinds of plain-text files, and "*" in wikis. Use of any other character hardly seems appropriate, unless the actual construct to be marked up is very rare. Lists are not rare, at least not the one-level ones.
+one
+**two**
++two-one **still two-one**
+++two-one-one
+three
The RequireSpaceAfterBulletProposal provides yet another solution -- it makes the markup used for bullets unique in Creole, by adding a "list indicator character" to it, in this case a space. This makes the list markup "at least two characters long" and unique, except for the cases when it's immediately adjacent to other markup-sensitive characters. It also has some of the advantages of indented lists -- visual highlighting of the beginning of the list item. I think that it's already obvious that this is the solution I prefer.
* one
* **two**
** two-one **still two-one**
*** two-one-one
* three
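To make the "space after bullet" rule concrete, here is a minimal sketch of how a line-based parser might tell a list item from a line that merely starts with bold text. The regular expression and the `classify` helper are my own illustration, not part of any proposal or of any existing Creole parser.

```python
import re

# Assumed rule: one or more asterisks, then a mandatory space, then the item text.
LIST_ITEM = re.compile(r"^(\*+) (.*)$")


def classify(line):
    """Return ("list", depth, text) for a bullet line, else ("text", 0, line)."""
    match = LIST_ITEM.match(line)
    if match:
        # The number of asterisks before the space is the nesting depth.
        return ("list", len(match.group(1)), match.group(2))
    # No space after the leading asterisks: treat the line as ordinary text,
    # so a line opening with **bold** is never mistaken for a nested item.
    return ("text", 0, line)


for sample in ["* one", "* **two**", "** two-one **still two-one**", "**bold** text"]:
    print(classify(sample))
```

With this rule a single regular expression is enough, and no lookahead beyond the first space is needed.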
In plain-text files one can encounter one more way of distinguishing the nesting level of a list. Actually, you can see it also in books, and posters, and menus in restaurants, and flyers. The technique is based on using varying shapes of the actual bullets -- usually smaller with each deeper nesting level. I know of no wiki engine (apart from my own puny experiments) that would use this technique, hence it is new in the context of Creole. While it doesn't collide with any of Creole's markup, there can still be semantic ambiguity when the characters used for markup are chosen without some thought -- especially in the case of hyphens. This can be avoided by the user himself, by picking different characters when there is ambiguity, but it's not exactly clean.
*one
***two**
+two-one **still two-one**
-two-one-one
*three
When looking at these examples, I find the indented list the least confusing and most beautiful, with the "space after bullet" one right behind it. The rest are just horrible to me. Which ones do you like best? Are there any other approaches we missed?
-- RadomirDopieralski, 2007-02-23
Actually, an "invisible space" is already available in Creole, although the markup for it is a little elaborate, yet obvious: {{{}}}.
-- RadomirDopieralski, 2007-02-23
Radomir, (quoting you in italics)
But that's not a "clean" solution. It doesn't feel right.
It feels right to me. Usually this takes a while when you were used to something else ;). It feels right because something that is used quite often is cleanly distinguishable for me as a human reader. Again, ** **second level bold** will cause the same discontent amongst wiki users as '''''bold italics''''' is causing amongst Wikipedia users. We already talked about it in Talk.Bold And Italics. It's not only the ambiguity problem for parsers, you know.
This is in fact plainly visible by the need for additional escape character markup.
You write as if we would introduce an escape character only for this proposal. We need a general escape character anyway.
The list-bold ambiguity can also be easily solved with a similar technique, an "invisible space" similar to the one used for }}}.
You are still thinking in terms of "some kind of character to distinguish between bold and lists"; you later call it the "list indicator character". You are still thinking in terms of the Require Space After Bullet Proposal. With two distinguishable characters for bold and lists you don't need this artificial "list indicator character" anymore. The escape character in this proposal is solely for the EdgeCases of using minus as the first character, while with the Require Space After Bullet Proposal we need an "invisible space" escape character for 25% of list usage. People only have to learn the use of the tilde if it happens that they need a minus to indicate a negative number as the first text in a line and don't want to use the nowiki markup; this is an EdgeCase (0.1%?). With the Require Space After Bullet Proposal we would design our markup around EdgeCases, because we would trade away easy usage of "lists with bold" (25%) for being able to use "negative numbers as first literal in a line" (0.01%?).
Many wiki engines select a different markup for nested lists -- indentation.
We have already been through this discussion. We don't want the user to count whitespace. Creole does not rely on whitespace in front of elements. But we should document this better, I think. For a user it doesn't matter if a second-level element is indented with two or three characters, but for a simple line-based regex parser it does, because it is hard to do a "look ahead/behind", right?
that's how your parser expects it: one two three and that's how the user does it: one two three
Now tell me in the wink of an eye what the difference in my second example is. The "wink of an eye" is important here: as soon as the user has to count, this becomes a root of confusion and errors.
Coming back to the general escape character issue, I therefore think that whitespace, as proposed in the Add No Wiki Escape Proposal, is not a good character. It might work in the case of nowiki markup, but not when you try to escape something at the beginning of a line. To be consistent we should have one general escape character that works everywhere, but let us discuss that in the new Escape Character Proposal, not here.
Of course, using an otherwise unused character for marking lists would work. This is why nobody but me is complaining about the numbered lists
I hardly use the numbered lists myself. I could even live with it if it becomes an addition, but it's just too frequently used, I guess (is it?). But we should not go into this discussion here.
Which ones do you like best? Are there any other approaches we missed?
Using hyphens for lists of course.
--ChristophSauer, 2007-02-23
I really try to think outside the box, that's why I enumerated all the sane markups I can think of -- no matter how wrong they seem to me.
I don't think we need a general escape character -- ambiguous lists are the only place where it is required (headings and tables are required to start at the first column, so it's easy to escape them with a space, like in a pre block; you don't even have to explain it in the spec). I can't even see a sane way of actually implementing the escape character -- looking at the examples on the proposal pages, the escape character doesn't apply to a single character, but to an undefined, context-dependent piece of text. An escape character also introduces something pretty evil: markup that has no visible effect on the rendered page. Really, please go and try to explain the idea of an escape character to a non-programmer.
As for indenting lists for nesting -- I don't really advocate it, I've already written that it's unacceptable in Creole. It just gives the best appearance. But you are wrong about the space-counting thing, actually so wrong that I suspect you did it on purpose. You don't have to count single spaces and match the indentation perfectly -- you can MakeTheMachineWorkHarder and just recognize changes in indentation, not the exact number of spaces used. This works very well in MoinMoin and other wikis that use this technique. That's just for the record, as it's not going to be used in Creole anyway (I think).
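To illustrate what "recognizing changes in indentation" could mean, here is a minimal sketch that keeps a stack of the indentation widths seen so far and derives the nesting depth from it. This is only my own illustration of the idea, with a hypothetical `nesting_levels` function; it is not MoinMoin's actual code.

```python
def nesting_levels(lines):
    """Map each line to a nesting depth using relative indentation only."""
    stack = []   # indentation widths of the currently open levels
    levels = []
    for line in lines:
        width = len(line) - len(line.lstrip(" "))
        # Close every open level that is indented deeper than this line.
        while stack and width < stack[-1]:
            stack.pop()
        # A width greater than the innermost open level opens a new level.
        if not stack or width > stack[-1]:
            stack.append(width)
        levels.append(len(stack))
    return levels


# Two users, two different indentation depths, same resulting structure.
print(nesting_levels(["* one", "  * two", "* three"]))
print(nesting_levels(["* one", "   * two", "* three"]))
```

Both calls yield the same depths, because only the comparison with the previous widths matters, never the absolute number of spaces.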
Now, for the "speciality" of the cases. Have you recently read any non-technical book? One that has some action? I do sometimes read such books, and I also read various stories published on the Internet, often in wikis. A good, dialogue-packed story has more than 70% of paragraphs starting with a hyphen. Look at this particular wiki. Every other page has about 25% of paragraphs starting with a hyphen. I don't think there is a single use of an asterisk other than to show the actual asterisk in the text here.
I don't mind using single hyphens for lists -- it is so normal and common that it even has its own name: "hyphenated list". It's the use of multiple hyphens for nested lists that I'm opposing -- it's a totally new invention -- there is exactly one wiki that uses it on wikimatrix.org: PukiWiki. It looks horrible. I could look at it and think for hours and never guess it's a list. It conflicts with markups for signature, en-dash, em-dash, and horizontal line -- not to mention the Markdown-like headings, if one aims for a mixed mode. Finally, it looks extremely ugly. And beauty is very important when you want passionate users who contribute their hard work just because they like it. Ugliness really reduces user performance; I can point you to actual usability experiments.
Looking at wikimatrix.org I can see that actually two wiki engines use the "different bullet for different levels" approach: SnipSnap and LunaWiki. Also, two wikis use an "exotic" character for lists: ProntoWiki and PodWiki. And I see one approach I didn't think of: WikkaWiki uses hyphens that can be "indented" with a visible character, the tilde. One can imagine periods or colons used in a similar manner. Ok, this is ugly too :)
I believe that good list markup should have the following features:
- The users should be able to tell what the particular thing is without needing to read a user manual or experiment -- just by looking at it. And without having to scroll to see the start of the list. This works well with single-level lists made with hyphens or asterisks. It also works well with indented lists. Repeated bullets are just alien and artificial. This feature I view as the most important.
- The list must be easy to navigate -- first with one's eyes, then with the cursor. It must be easy to locate the end of an item and the beginning of the next one -- and also the beginning and end of the whole list. The asterisk alone is bad at this, as it has a text color similar to an average letter -- at least in fonts made for reading prose, not coding. The hyphen is not good either -- it appears very frequently in the body of text. You can either use a character with some very dark or very light color, or lighten it using whitespace. Indented lists do a marvelous job here too.
- The list must be easy to edit. This means changing the order and nesting level of items, moving items between different lists, turning paragraphs into lists and vice versa. This also partially relies on navigation, but also on the number of characters used, complexity of the markup, availability of keys on the keyboard. Here indentation does a horrible job, but multiplying the list bullet is not really much better.
These three points are my main concerns. If we could limit the nesting level of lists to two, I wouldn't hesitate, and would recommend this:
* first list item
* second list item
- first sublist item
- second sublist item
* third list item
- first sublist item
It's actually the most popular markup for (not numbered) lists I've seen in text files when nested lists were involved. The other, even more popular approach, was to use a numbered (or otherwise enumerated) list mixed with a bullet list. Or a numbered list with several levels of numbering, like 1.2.4. Note how the compulsory space after the bullets increases readability and navigability immensely. But this nice approach breaks if you need more nesting levels. Introducing additional bullet characters, like "+", "@", "%", ".", "~" is artificial. Indentation is evil. Repeating hyphens is ugly.
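For what it's worth, the two-level variant is also trivial to parse. Here is a minimal sketch, assuming "*" marks the first level, "-" marks the second, and the space after the bullet is compulsory; the `BULLET_LEVELS` mapping and the `parse_two_level` function are hypothetical names of mine, not taken from any wiki engine.

```python
# Assumed mapping from bullet character to nesting level.
BULLET_LEVELS = {"*": 1, "-": 2}


def parse_two_level(lines):
    """Group "-" sub-items under the preceding "*" item."""
    items = []
    for line in lines:
        bullet, space, text = line.strip().partition(" ")
        level = BULLET_LEVELS.get(bullet)
        if level is None or not space:
            continue  # not a list line, e.g. "-5 degrees" has no bullet + space
        if level == 1:
            items.append((text, []))
        elif items:
            # A sub-item with no preceding top-level item is silently skipped.
            items[-1][1].append(text)
    return items


example = [
    "* first list item",
    "* second list item",
    "- first sublist item",
    "- second sublist item",
    "* third list item",
    "- first sublist item",
]
for text, children in parse_two_level(example):
    print(text, children)
```

Because the bullet has to be followed by a space, a line starting with a negative number is not picked up as a sub-item.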
Please tell me if I'm repeating myself :) -- RadomirDopieralski, 2007-02-23