Copyright (C) by the contributors. Some rights reserved, license BY-SA.


 

"treat line breaks as whitespace - precedent, good for Hebrew" -- How is that? I run a Hebrew wiki and we treat line breaks as <br>. --Yonat

Why linebreaks are evil?

I've been meaning to take up this subject for a long time, but I didn't want to present my arguments in the wrong way, or to miss any important argument, so I prepared carefully. This is going to be a rant against converting newlines to <br>, as you can easily guess. OK, let's begin.

Target audience

The justification of the current handling of newlines I've heard goes something like this: "this is intuitive, people coming to wikis from MS Word or blogging software will expect it".

So new wiki users will be happy to have this feature in Creole, as it means fewer new things for them to learn, right?

Meanwhile, people who already use wikis will stumble on it and find it at least as awkward as I do. And it's not just a matter of time until they get used to it -- because it's not a "standard" they can get used to. 99% of wikis that don't use Creole handle newlines as spaces. Many web forums, message boards and less advanced blogging software handle newlines like that too -- simply because that's how HTML does it, and because it's easy to translate such text directly into HTML. Practically all professional or semi-professional typesetting languages treat a single newline the same as a space -- from HTML, through PostScript and Rich Text Format, up to LaTeX and TeX. A single newline has no meaning other than a simple space. And, by the way, the same goes for handling multiple spaces as a single one (although RTF is different in this regard).

This means that while new users are happily typing their text the way they think it should work, experienced users need to maintain this "split personality", remembering where they can press Enter and where they can't because it will produce an invalid layout (mind you, there are literally 2 or 3 cases where a newline is actually considered correct and needed in typesetting).

Now, the oldies will eventually die off, and we need to look with hope to the new generation of blogg^H^H^H^H^Hwikizens. But even they, as they discover more advanced software, will have to live with the "schism" between the ways newlines are handled. And it comes at an especially sensitive "transition" time, when a small obstacle like that can make them give up and stay forever in the newbie world of MS Word and blogs, locked out of professional software and non-Creole wikis.

Usability experts know of a thing called the "myth of the experienced user" -- a trap that interface designers often fall into by designing two interfaces, one for newbies and one for experienced users, thinking that when newbies become experienced they can switch to the more powerful but also more complicated interface. But this never happens, and users stay locked in the "newbie" interface forever, simply because using the less complicated interface does not make them any more experienced with the more powerful one.

I'm strongly convinced that this kind of "pro-newbie" decision creates a similar chasm -- not between interfaces of a single application, but between different wikis. Creole is not intended to be the one and only wiki markup. We don't want to lock users out of other wikis, we want to introduce them gently to them. Thus, Creole should be simple and easy to learn, but it should not be substantially different from other wiki markups.

Technical difficulties

Everyone who has tried to implement a Creole parser knows that this rule, together with some other special cases, increases the parser's complexity considerably, making it much harder to create, debug and extend. A parser that is hard to write means worse adoption across different wiki engines and more accidental inconsistencies between implementations. That's on the side of the developers.

Difficulties on the side of users usually involve copying and pasting of text from their e-mails or text editor -- different line-lengths will result in text that looks like this:

Lorem ipsum dolor sit amet, consectetuer
adipiscing
elit, sed diam nonummy nibh euismod tincidunt
ut
laoreet dolore magna aliquam erat volutpat.
Ut wisi
enim ad minim veniam, quis nostrud exerci
tation
ullamcorper suscipit lobortis nisl ut aliquip
ex ea 
commodo consequat.

That's the effect of mixing the browser's automatic line wrapping with the user's manual one. The same thing happens when the wiki's textarea has a different width than the rendered page -- when a wiki site has a large sidebar, for example.
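The effect can be reproduced with a small sketch. It is only an illustration under stated assumptions: `textwrap` stands in for the browser's wrapping, the widths (46 columns for the pasted text, 40 for the page) are made up, and the function names are hypothetical.

```python
import textwrap

TEXT = (
    "Lorem ipsum dolor sit amet, consectetuer adipiscing elit, "
    "sed diam nonummy nibh euismod tincidunt ut laoreet dolore "
    "magna aliquam erat volutpat."
)


def render_newline_as_br(source, page_width):
    """Treat every newline as a forced break, then let the 'browser' wrap."""
    lines = []
    for hard_line in source.split("\n"):
        # An empty hard line would still occupy one rendered line.
        lines.extend(textwrap.wrap(hard_line, width=page_width) or [""])
    return lines


def render_newline_as_space(source, page_width):
    """Treat newlines as ordinary whitespace, then let the 'browser' wrap."""
    return textwrap.wrap(" ".join(source.split()), width=page_width)


# Hypothetical widths: text hard-wrapped at 46 columns (say, by a mail
# client) and pasted into a wiki whose rendered page is 40 columns wide.
pasted = textwrap.fill(TEXT, width=46)
ragged = render_newline_as_br(pasted, page_width=40)
clean = render_newline_as_space(pasted, page_width=40)

if __name__ == "__main__":
    print("\n".join(ragged))
    print("---")
    print("\n".join(clean))
```

Both renderings contain exactly the same words; only the first one has the short orphaned lines, because the manual breaks and the automatic ones do not line up.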

The wiki admins will also have a hard time with this. Text produced by the wiki users will practically lock them in one layout, because changing (especially decreasing) the line length will lead to the effect above.

Even worse, for "flowing" layouts, users that have non-standard font size or window size will experience the effect above.

I don't even want to think about printing wiki pages.

Now, cleaning up this mess automatically is impossible (while it is possible to convert the text the other way around) -- it involves manually browsing through the text and removing all the spurious newlines. I've done it several times. It's debilitating.

Often just looking for the spurious newlines is hard -- because with an automatically wrapped textarea they are InvisibleMarkup. Another reason to drop them.

Solution

Remove the rule about newlines from the Creole specification. Add a separate rule for forcing a linebreak when it is really absolutely required -- something like "
" or "##" or "||" at the end of the line, for example.

If we absolutely need to accommodate the blog users, then just don't specify the newline handling in the spec -- allow different wikis to use what is best for their user base. But I'm all for specifying that single newlines shall be ignored and treated as spaces.

-- RadomirDopieralski, 2006-12-11
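As a rough illustration of the rule argued for above (single newlines and runs of spaces collapse to one space), here is a minimal sketch. The blank-line paragraph separator and the function name `to_paragraphs` are assumptions made for the example, not something this page specifies.

```python
import re


def to_paragraphs(source):
    """Split on blank lines; inside a paragraph, any run of whitespace
    (including single newlines) becomes one space."""
    blocks = re.split(r"\n\s*\n", source.strip())
    return [" ".join(block.split()) for block in blocks if block.strip()]


if __name__ == "__main__":
    sample = "Lorem ipsum dolor\nsit   amet,\n\nconsectetuer\nadipiscing elit."
    for paragraph in to_paragraphs(sample):
        print("<p>%s</p>" % paragraph)
```

With this handling, text pasted with hard line breaks reflows to whatever width the page happens to have.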

I found some surveys about how different wiki engines treat newlines on MeatBall: http://www.usemod.com/cgi-bin/mb.pl?ParagraphFormattingRules

Shall we start a discussion on ChangeLinebreakMarkupProposal?


To start moving this forward, I'd like to propose "\\\s*$" as the regular expression for forcing line breaks. It's simple to type, looks similar to the markup used for marking newlines in many markup languages, and is consistent with the use of "\" as an escape character.

The requirement for an end of line makes this markup more WYSIWYG, but makes it unsuitable for use in table cells. "\\(\s*$|\s)" could be used instead...

-- RadomirDopieralski, 2006-01-01
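The two expressions proposed above could be tried out roughly as follows. This is a sketch under assumptions: the marker is taken to be two literal backslashes (as the follow-up comment below spells out), `[ \t]` is used in place of `\s` so that a match cannot swallow the following newlines, and the names `EOL_BREAK`, `ANYWHERE_BREAK` and `force_breaks` are invented for the example.

```python
import re

# Assumption: the marker is two literal backslashes.
# End-of-line variant: marker, optional trailing blanks, end of line.
EOL_BREAK = re.compile(r"\\\\[ \t]*$", re.MULTILINE)

# Table-cell-friendly variant: the marker may also be followed by a blank
# in the middle of a line.
ANYWHERE_BREAK = re.compile(r"\\\\(?:[ \t]*$|[ \t])", re.MULTILINE)


def force_breaks(source, pattern=EOL_BREAK):
    """Replace every forced-break marker matched by `pattern` with <br />."""
    return pattern.sub("<br />", source)


if __name__ == "__main__":
    print(force_breaks("first line\\\\\nsecond line"))
    print(force_breaks("| cell one \\\\ still cell one |", ANYWHERE_BREAK))
```

The end-of-line variant leaves a marker in the middle of a line untouched, which is exactly why it cannot be used inside a table cell written on one line.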

For me, the difference between a line break and a new paragraph is the space between the two lines. Am I wrong? Is there a semantic difference?

So, my first thought was to go even further: a single newline is sufficient to change *paragraphs* (<p></p>). Two or more newlines would be treated as one and only one paragraph change. Food for thought...

In Microsoft Word, a newline is considered a new paragraph (marked by a pilcrow). In Word 2007, the spacing between paragraphs is now obvious in the default template (10 points after a paragraph).

Line break

To break a line (<br />), any \s+\\\\\s+ (that is, two backslashes (\\) preceded and followed by at least one whitespace character) would be sufficient in my opinion. The rendering engine would eliminate these spaces.

Then, the questions are:

  1. Why do we *absolutely* need two newlines for paragraphs? (The spec could say: at least one newline is necessary to create a new paragraph.)
  2. Is one backslash sufficient to break a line?
  3. Do we need to put a newline after it, or is one whitespace character sufficient?
  4. What about one whitespace character before?
  5. Should two or more consecutive whitespace characters be treated as only one space?

Remarks:

  • The backslash key is not easy to find on several keyboard layouts.

-- EricChartre, 2006-01-03
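For comparison, the alternative just described (any run of newlines starts a new paragraph, and two backslashes surrounded by whitespace force a line break) might look like this. It is only a sketch: the function name `render` is hypothetical, the open questions listed above are settled arbitrarily (the marker needs whitespace on both sides, and that whitespace is dropped), and no HTML escaping is done.

```python
import re

# Two literal backslashes with at least one whitespace character on each
# side; the surrounding whitespace is removed together with the marker.
BREAK = re.compile(r"\s+\\\\\s+")


def render(source):
    """One or more newlines separate paragraphs; the marker becomes <br />."""
    with_breaks = BREAK.sub("<br />", source.strip())
    paragraphs = [p.strip() for p in re.split(r"\n+", with_breaks) if p.strip()]
    return "\n".join("<p>%s</p>" % p for p in paragraphs)


if __name__ == "__main__":
    print(render("one \\\\ two\nthree\n\n\nfour"))
```

Because the marker pattern also consumes a newline that follows it, a marker at the end of a line joins the next line into the same paragraph in this sketch; a real specification would have to decide that case explicitly.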


This particular version was published on 03-Jan-2007 20:01 by 64.254.230.132.