Copyright (C) by the contributors. Some rights reserved, license BY-SA.

Sponsored by the Wiki Symposium and the Nuveon GmbH.


This page (revision-23) was last changed on 11-Jan-2007 14:18 by ChuckSmith  

This page was created on 05-Sep-2006 17:10 by Yonat

Difference between version and

At line 2 added 4 lines
Moved part of the discussion to [Talk.ChangeLinebreakMarkupProposal].
At line 3 changed one line
== Why linebreaks are evil? ==
Now, as to my proposition of the __forced line break__ markup, it's not really supposed to be a backslash and a space -- rather it's a backslash at the end of the line. The whitespace is added for the reasons mentioned above. Note that this "fits" with the use of the backslash as an escape character -- an "escaped" newline is treated literally. Using a double backslash has the problem of conflicting with the markup for an escaped literal backslash.
At line 5 changed one line
I've been meaning to pick up this subject for a long time, but I didn't want to present my arguments the wrong way or miss any important argument, so I prepared for this carefully. This is going to be a rant against converting newlines to <br>, as you can easily guess. OK, let's begin.
I think that requiring a newline in the markup for a forced line break, where possible, is a good idea -- it makes the code more WYSIWYG and thus ReadableMarkup -- you don't need to scan the whole line to spot line breaks -- you only need to look at the actual ends of lines in the raw code. Of course, this is a problem if there are places where a newline is forbidden -- I think that's an indication of a greater problem in the markup, though.
At line 7 changed one line
=== Target audience ===
-- RadomirDopieralski, 2006-01-03
At line 9 changed one line
The justification of the current handling of newlines I've heard goes something like this: "this is intuitive, people coming to wikis from MS Word or blogging software will expect it".
Good points.
At line 11 changed one line
So new wiki users will be happy to have this feature in Creole, as it means fewer new things for them, right?
So, to summarize your proposition with my own:
At line 13 changed one line
In the meantime, people who already use wikis will stumble on it and find it at least as awkward as I do. And it's not just a matter of time until they get used to it -- because it's not a "standard" they can get used to. 99% of wikis that don't use Creole handle newlines as spaces. Many web forums, message boards and less advanced blogging software handle newlines like that too -- simply because that's how HTML does it, and because it's easy to translate such text directly into HTML. Practically all professional or semi-professional typesetting languages treat a single newline the same as a space -- from HTML, through PostScript and Rich Text Format, up to LaTeX and TeX. A single newline has no meaning other than a simple space. And, by the way, the same goes for handling multiple spaces as a single one (although RTF is different in this regard).
Rule #1: Paragraphs are separated by at least two newlines (CR/LF, LF or FF). Any amount of whitespace before, in between or after is ignored. More than two consecutive newlines count as one and only one paragraph break.
At line 15 changed one line
This means that while new users are happily typing their text the way they think it should work, the experienced users need to maintain this "split personality", remembering where they can press Enter and where they can't because it would produce an invalid layout (mind you, there are literally 2 or 3 cases when a newline is actually considered correct and needed in typesetting).
New paragraph: {{{/(\s*\n\s*\n\s*)/s}}} (Perl expression with the /s modifier)
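A minimal sketch of rule #1, ported from the Perl expression above to Python's `re` module. The function name `split_paragraphs` and the sample text are made up for illustration; this is not taken from any actual Creole parser.

```python
import re

# Same pattern as the Perl expression above: a run of whitespace
# that contains at least two newlines.
PARAGRAPH_BREAK = re.compile(r"\s*\n\s*\n\s*")

def split_paragraphs(text):
    """Split text into paragraphs; single newlines stay inside a paragraph."""
    parts = PARAGRAPH_BREAK.split(text.strip())
    return [p for p in parts if p]

sample = "First paragraph,\nstill first.\n \n\n\nSecond paragraph.\r\n\r\nThird."
paragraphs = split_paragraphs(sample)
print(paragraphs)
```

Because `\s` already matches CR and LF, the CR/LF case needs no special handling here, and any number of blank lines collapses into a single paragraph break.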
At line 17 changed one line
Now, the oldies will eventually die off, and we need to look with hope at the new generation of blogg^H^H^H^H^Hwikizens. But even they, as they discover more advanced software, will have to maintain the "schism" between the ways newlines are handled. And it's an especially sensitive "transition" time, when a small obstacle like that can make them give up and forever stay in the newbie world of MS Word and blogs, locked out from professional software and non-Creole wikis.
Rule #2: To break a line, one must escape a newline. The rule should be permissive enough to allow any amount of whitespace (including CR/LF or LF) after the backslash. A space before it is not necessary. Rule of least surprise.
At line 19 changed one line
Usability experts know of a thing called the "myth of the experienced user" -- a trap that interface designers often fall into by designing two interfaces, one for newbies and one for experienced users -- thinking that when a newbie becomes experienced, they can switch to the more powerful but also more complicated interface. But this never happens, and users are locked forever in the "newbie" interface, simply because using the less complicated interface doesn't make them any more experienced with the more powerful one.
Line break: {{{/(\s*\\\s*\n\s*)/s}}} (Perl expression with the /s modifier)
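A minimal sketch of rule #2, again porting the Perl expression above to Python. The name `apply_forced_breaks` and the `<br />` output marker are assumptions for illustration only.

```python
import re

# Same pattern as the Perl expression above: a backslash followed by a
# newline, with optional whitespace before, in between and after.
FORCED_BREAK = re.compile(r"\s*\\\s*\n\s*")

def apply_forced_breaks(paragraph, marker="<br />"):
    """Replace every escaped newline in one paragraph with a break marker."""
    # A function is used as the replacement so that the marker is
    # inserted literally, whatever characters it contains.
    return FORCED_BREAK.sub(lambda match: marker, paragraph)

sample = "Roses are red, \\  \nviolets are blue,\\\nsugar is sweet."
result = apply_forced_breaks(sample)
print(result)
```

Applied to a whole page rather than to a single paragraph, the trailing `\s*` would also swallow a following blank line, so this sketch assumes the text has already been split into paragraphs.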
At line 21 changed 2 lines
I'm strongly convinced that this kind of "pro-newbie" decision creates a similar chasm -- not between interfaces of a single application, but between different wikis.
Creole is not intended to be the one and only wiki markup. We don't want to lock users out of other wikis, we want to introduce them gently to them. Thus, Creole should be simple and easy to learn, but it should not be substantially different from other wiki markups.
Rule #3: Tabs are treated as spaces.
At line 24 changed one line
=== Technical difficulties ===
Rule #4: Consecutive spaces or tabs are treated as one space only.
At line 26 changed 2 lines
Everyone who has tried to implement a Creole parser knows that this rule, together with some other special cases, increases the parser's complexity considerably, making it much harder to create, debug and extend. A hard-to-write parser means worse adoption across different wiki engines and more accidental inconsistencies between implementations.
That's on the side of the developers.
Rule #5: Spaces or tabs at the beginning of a line are ignored.
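Rules #3 to #5 can be sketched as two substitutions. This is only an illustration in Python; the function name `normalize_spaces` is hypothetical.

```python
import re

def normalize_spaces(text):
    """Apply rules #3-#5: tabs count as spaces, runs collapse, indentation is dropped."""
    # Rules #3 and #4: any run of spaces and tabs becomes a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Rule #5: after collapsing, indentation is at most one space per line.
    text = re.sub(r"^ ", "", text, flags=re.MULTILINE)
    return text

sample = "\tfoo \t  bar\n    baz\tqux"
normalized = normalize_spaces(sample)
print(repr(normalized))
```

Newlines are left untouched here, so the paragraph and line break rules can still be applied afterwards.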
At line 29 changed one line
Difficulties on the side of users usually involve copying and pasting text from their e-mails or text editors -- different line lengths will result in text that looks like this:
What about FF (\f) or vertical tabs (\v)?
At line 31 changed 13 lines
Lorem ipsum dolor sit amet, consectetuer
elit, sed diam nonummy nibh euismod tincidunt
laoreet dolore magna aliquam erat volutpat.
Ut wisi
enim ad minim veniam, quis nostrud exerci
ullamcorper suscipit lobortis nisl ut aliquip
ex ea
commodo consequat.
-- [EricChartre], 2007-01-05
At line 45 changed one line
That's the effect of mixing the browser's automatic line wrapping with the user's manual one. The same thing happens when the wiki's textarea has a different width than the rendered page -- when a wiki site has a large sidebar, for example.
I would say the "new paragraph" in rule #1 is rather {{{/(\s*\n(\s*\n)+)/s}}}. Rule #2 would be much simpler to understand if we just stripped all spaces and tabs from the end of lines before parsing. I don't know where you got #5 from; it wasn't mentioned anywhere -- treating indented lines differently from non-indented ones is still an option -- just not used anywhere. One point, though, is that the amount of indentation shouldn't be relevant (no space counting). The form feed and vertical tab characters are treated as spaces -- you have the vertical line and headings to give structure to your text.
At line 47 changed one line
The wiki admins will also have a hard time with this. Text produced by the wiki users will practically lock them into one layout, because changing (especially decreasing) the line length will lead to the effect above.
-- RadomirDopieralski, 2007-01-05
At line 49 changed one line
Even worse, for "flowing" layouts, users who have a non-standard font size or window size will experience the effect above.
It would be easier, indeed, to strip all consecutive whitespace from the end of lines before parsing. Most regular expressions would be simpler and we wouldn't have to use the /s modifier. However, wouldn't it create the need for a new parsing pass?
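To make the trade-off concrete, here is a sketch in Python of what such a pre-pass could look like and how the two patterns might shrink afterwards. The names `strip_line_ends`, `SIMPLE_PARAGRAPH_BREAK` and `SIMPLE_FORCED_BREAK` are made up, and normalizing CR/LF to LF in the same step is an extra assumption.

```python
import re

def strip_line_ends(text):
    """Pre-pass: normalize CR/LF to LF and strip spaces and tabs at line ends."""
    text = text.replace("\r\n", "\n")
    return re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)

# After the pre-pass, whitespace-only lines are empty, so the patterns
# no longer need to allow whitespace around the newlines.
SIMPLE_PARAGRAPH_BREAK = re.compile(r"\n{2,}")
SIMPLE_FORCED_BREAK = re.compile(r" *\\\n")

raw = "one \\  \ntwo\t\n \t \nthree\r\n\r\nfour"
clean = strip_line_ends(raw)
blocks = SIMPLE_PARAGRAPH_BREAK.split(clean)
rendered = [SIMPLE_FORCED_BREAK.sub("<br />", block) for block in blocks]
print(rendered)
```

In this sketch the pre-pass is one more substitution over the whole text, so it does cost an extra pass, but the later patterns no longer have to account for stray whitespace.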
At line 51 changed one line
I don't even want to think about printing wiki pages.
I took some creative liberty for #5 ;-) while I was on the subject of spaces. See the talk about [quoting|Talk.Quoting].
At line 53 changed one line
Now, cleaning up this mess automatically is impossible (while it is possible to convert the text the other way around) -- it involves manually browsing through the text and removing all the spurious newlines. I've done it several times. It's debilitating.
-- [EricChartre], 2007-01-05
At line 55 changed one line
Often just **looking** for the spurious newlines is hard -- because with the automatically wrapped textarea they are InvisibleMarkup. Another reason to drop them.
Looking at the "characters not in Creole" at [Terms], I can imagine using several other characters for forcing a line end:
* {{{#}}} or some combination of it (possible conflict with wikis using {{{##...##}}} markup).
* {{{$}}} or some combination of it (possible conflict with wikis using {{{$$...$$}}} markup).
* {{{=}}} or some combination of it, possibly only at the end of line
* {{{|}}} or some combination, would make it impossible to use inside table cells
At line 51 added one line
Note that we probably don't need or want a markup for forcing a break of an empty line.
At line 58 changed one line
=== Solution ===
So, any ideas?
At line 60 changed one line
Remove the rule about newlines from the Creole specification. Add a separate rule for forcing a linebreak when it is really absolutely required -- something like "\\" or "##" or "||" at the end of the line, for example.
-- RadomirDopieralski, 2007-01-05
At line 62 changed one line
If we absolutely need to accommodate the blog users, then just don't specify the newline handling in the spec -- allow different wikis to use what is best for their user base. But I'm all for specifying that single newlines shall be ignored and treated as spaces.
Consider {{{%}}}, too. If {{{%%%}}} is used for breaking lines, as in [PhpWiki|http://phpwiki.sourceforge.net/phpwiki-1.2/index.php?TextFormattingRules], it would be [NotNew].
At line 64 changed one line
-- [RadomirDopieralski], 2006-12-11
-- MicheleTomaiuolo, 2007-01-05
At line 66 changed one line
I found some surveys about how different wiki engines treat newlines on MeatBall: http://www.usemod.com/cgi-bin/mb.pl?ParagraphFormattingRules
Good point, Michele! Do you know of any other markups for newlines? Is "%" used by other wiki engines? Would it conflict? What advantages and disadvantages of "{{{%%%}}}" can you imagine?
-- RadomirDopieralski, 2007-01-06
Looking at popular wikis at [List of Line Break Markup], it seems clear to me that we should go with {{{ \\ }}}, as it's the only commonly used markup for line breaks.
-- [ChuckSmith], 2007-Jan-10
Version Date Modified Size Author Changes ... Change note
23 11-Jan-2007 14:18 4.467 kB ChuckSmith to previous fixed link to list of line break
22 10-Jan-2007 19:45 4.467 kB RadomirDopieralski to previous | to last my mistake :)
21 10-Jan-2007 19:20 5.191 kB RadomirDopieralski to previous | to last advantages and disadvantages of tilde