Copyright (C) by the contributors. Some rights reserved, license BY-SA.

Sponsored by the Wiki Symposium and the Nuveon GmbH.

 

List of attachments

|=Kind|=Attachment Name|=Size|=Version|=Date Modified|=Author|=Change note|
|rl|creole.rl|33.6 kB|1|20-May-2007 15:05|MarkWharton|creole.rl|
| |makefile|0.7 kB|1|20-May-2007 15:05|MarkWharton|makefile|

This page (revision-7) was last changed on 16-Oct-2007 20:16 by YvesPiguet  

This page was created on 27-Dec-2006 07:47 by MarkWharton

==About Me==
==My Commitment==
I thought it might be a good time for me to make a public commitment to Creole. That commitment is that whatever is decided for the //final// Creole spec, here at wikicreole.org, I will support 100% in my wiki implementation (barring ineptitude on my part to implement ;-). I might not agree with or be happy with everything in the end, but being involved in the process means I have nothing to complain about, and after all, the sum of Creole should/will be greater than its parts.
Sure, all of this should be obvious; nonetheless, I believe it is worth stating here directly.
==New Developments==
Just thought others might be interested... I have recently discovered the [[http://www.cs.queensu.ca/~thurston/ragel/|Ragel State Machine Compiler]], which is a most excellent software development tool! I have completely rewritten my Creole parser using Ragel, and the construction is so much clearer and more efficient - it's unbelievable. For example, with my first attempt I have been able to build a single-pass, line-buffered parser (with limited forward scanning to determine the line type before passing the line content to the state machine). It's all very rough at the moment, but I plan to develop something for general release in the next couple of weeks (and ideally before the spec is finalized, so any last-minute issues can be discussed). Of course, [Yves Piguet]'s [Nyctergatis] is exceptional and the amount of work he has put into it should not be diminished. I think the main issue we still need to resolve is escaping. As I mentioned earlier, escaping whole Creole sequences does not work, and that requirement causes trouble for the state machine approach (at least it does for me).
==Ragel Based Creole Parser==
Here is my command line tool to parse Creole input and output XHTML. It is based on the Ragel State Machine Compiler.
The creole.rl source file attached contains a mix of Ragel and C. A creole.c target file can be generated with the Ragel tools as follows:
{{{
ragel.exe -s creole.rl | rlgen-cd.exe -G2 -o creole.c
}}}
Note: The creole.rl source is released under the GNU General Public License. A copy of the license is included in the file. It is also available from [[http://www.gnu.org/|GNU]].
===Building===
Development and testing were performed using Microsoft Visual C++ 2005 Express Edition on Windows XP Service Pack 2. The creole.c compiles with NMAKE and a suitable makefile is attached which can be used as follows:
{{{
call "%MSDEV%\bin\Vcvars32.bat" x86
nmake /f makefile all
}}}
===Options===
The parser supports options for escaping text and treating linebreaks as linebreaks.
{{{
creole -escape -linebreaks < creole.txt > creole.htm
}}}
By default, the parser does not escape text (except for the special case of a space with preformatted block ends), and any linebreaks within paragraphs are discarded. When the escape option is used, a tilde (~) followed by a non-alphanumeric character {{{[^\t 0-9A-Za-z]}}} inside a //regular// text section is not output, and the following character loses any special meaning. Regular text sections exclude freelinks, link sections, image sections, nowiki and preformatted blocks, and placeholders. This escape design provides a simple, safe and effective escaping mechanism which does not force content creators to change important filenames, URLs, etc. to avoid accidentally escaping them. Furthermore, link sections, image sections, nowiki and placeholders accept the first 2 (or 3) characters as the openers and the last 2 (or 3) characters as the closers. This //waiting// technique allows natural nesting of special characters to achieve results which might otherwise require escaping (e.g. {{{[[Home|[{{home.jpg|{Home!~}}}]]]}}} produces {{{<a href="Home">[<img src="home.jpg" title="{Home!}" />]</a>}}}).\\//Note: It was necessary to escape the example to produce the desired effect here.//
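The tilde rule for regular text can be sketched in plain C as follows. This is only an illustration of the rule as described above, not code taken from creole.rl: the names {{{creole_unescape}}} and {{{is_plain}}} are hypothetical, and leaving a tilde at the very end of the input untouched is an assumption. A real parser would also have to make sure the character following a dropped tilde bypasses markup recognition, which this sketch does not attempt.

```c
#include <stddef.h>

/* Returns 1 if c is in the set [\t 0-9A-Za-z], i.e. a character that a
 * preceding tilde does NOT escape. */
static int is_plain(unsigned char c)
{
    return c == '\t' || c == ' ' ||
           (c >= '0' && c <= '9') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z');
}

/* Hypothetical helper (not from creole.rl): copies regular text from `in`
 * to `out`, dropping each tilde that is followed by a character outside
 * [\t 0-9A-Za-z] and copying that following character literally.
 * `out` must have room for strlen(in) + 1 bytes.
 * Returns the number of bytes written, excluding the terminating NUL. */
size_t creole_unescape(const char *in, char *out)
{
    size_t n = 0;

    while (*in != '\0') {
        if (in[0] == '~' && in[1] != '\0' &&
            !is_plain((unsigned char)in[1])) {
            out[n++] = in[1];   /* keep the escaped character, drop the tilde */
            in += 2;
        } else {
            out[n++] = *in++;   /* ordinary character, or a tilde that escapes nothing */
        }
    }
    out[n] = '\0';
    return n;
}
```

With this sketch, {{{~*not bold~*}}} becomes {{{*not bold*}}}, while {{{file~1.txt}}} is left unchanged because the tilde is followed by a digit.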
|=Version|=Date Modified|=Size|=Author|=Change note|
|7|16-Oct-2007 20:16|6.722 kB|YvesPiguet|restore (vandalism)|
|6|16-Oct-2007 20:00|6.731 kB|211.7.138.14| |
|5|20-May-2007 16:17|6.722 kB|MarkWharton|Ragel Based Creole Parser|
|4|14-May-2007 01:04|4.451 kB|MarkWharton|new developments|
|3|15-Apr-2007 04:48|3.327 kB|MarkWharton|my commitment|
|2|10-Apr-2007 12:12|2.707 kB|MarkWharton|bit of an update...|
|1|27-Dec-2006 07:47|1.184 kB|MarkWharton|introduce myself|