I think so. Please be kind if I made a stupid regexp mistake.
{{{
grep '^-[^-]' frwiki-20070131-pages-articles.xml | wc
20064 316762 2104799
grep '[^<]' frwiki-20070131-pages-articles.xml | wc
32184660 229584507 2007434942
ls -l frwiki-20070131-pages-articles.xml
-rw-r--r-- 1 XX YY 2012412288 Feb 10 02:08 frwiki-20070131-pages-articles.xml
}}}
|
The database must come from [[http://download.wikimedia.org/]]. I remember having picked |
it because I had enough room on my hard disk. Someone with more room and bandwidth can |
check with en.wikipedia.org. |
|
-- [[YvesPiguet]], 2007-Apr-27 |
{{{ |
$ cat plwiki-20070420-pages-articles.xml.bz2 | bunzip2 | egrep '^-[^-]' | wc -l |
39380 |
}}} |
{{{ |
$ wget 'http://download.wikimedia.org/enwiki/20070402/enwiki-20070402-pages-articles.xml.bz2' -O - | bunzip2 | egrep '^-[^-]' | wc -l |
}}} |
|
In progress... but you don't need much room if you do it this way :) |
|
-- [[Radomir Dopieralski]], 2007-Apr-27 |
|
You're right, I should have written //someone with more room **or** bandwidth//. I've
just checked with {{{more}}} instead of {{{wc}}}: some hyphens are for ad hoc lists which |
would be rendered as multiple one-item lists (probably harmless), and some are for |
dialogs; see e.g. [[http://fr.wikipedia.org/wiki/Pythagore#Politique]]. |
|
On the other hand, I wouldn't like the question to be hidden behind statistics, which |
everybody will interpret differently. It seems pretty obvious that tens of gigabytes cannot be |
moved easily to an incompatible engine. I'd prefer to have a good Creole and see it |
adopted by small and future wikis than fuzzy recommendations which every implementer |
bends to her taste, making it irrelevant. |
|
-- [[YvesPiguet]], 2007-Apr-27 |
|
{{{ |
$ wget 'http://download.wikimedia.org/enwiki/20070402/enwiki-20070402-pages-articles.xml.bz2' -O - | bunzip2 | egrep '^\s*-[^-]' | wc -l |
45107 |
}}} |
It just finished, so I thought I'd post the result anyway. Comparing the sizes of the page databases, it's no wonder Chuck and Christoph were so surprised at the use of hyphens for dialogs.
|
By the way, edit conflicts pass silently on this wiki and destroy posts. Can something be done about it?
|
-- [[Radomir Dopieralski]], 2007-Apr-27 |
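
A side note on the counts above: the frwiki and plwiki figures use {{{^-[^-]}}}, while the enwiki figure uses {{{^\s*-[^-]}}}, which also accepts leading whitespace, so the numbers are not strictly comparable. The sketch below shows the difference on a tiny made-up sample (the sample lines are hypothetical, not taken from any dump). It spells the whitespace class as {{{[[:space:]]}}}, the POSIX form, because {{{\s}}} is a GNU grep extension.

```shell
#!/bin/sh
# Hypothetical sample standing in for a few lines of a pages-articles dump.
sample=$(mktemp)
cat > "$sample" <<'EOF'
- a dialog line or ad hoc list item
  - an indented hyphen line
-- a double hyphen, as used before signatures
---- a horizontal rule
<text>not a hyphen line</text>
-
EOF

# Pattern from the first posts: a hyphen in column one followed by a non-hyphen.
strict=$(grep -c '^-[^-]' "$sample")

# Variant that also allows leading whitespace before the hyphen.
loose=$(grep -c '^[[:space:]]*-[^-]' "$sample")

echo "column-one hyphens: $strict"
echo "with leading whitespace allowed: $loose"

rm -f "$sample"
```

On this sample the first pattern counts one line and the second counts two: only the indented line differs. Neither pattern counts a bare {{{-}}} on its own line, since {{{[^-]}}} needs a character to match.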
|
It doesn't matter: MediaWiki will not implement Creole in MixedMode, see [[ImagesReasoning#Collisions with MediaWikis Template Format.]] |
|
//I'd prefer to have a good Creole and see it |
adopted by small and future wikis than fuzzy recommendations which every implementer |
bends to her taste, making it irrelevant.// |
|
I don't think so: We are not creating Creole for your particular software, but for all Wiki engines out there, or to be more exact, for the users that author them. Our mission is to create a common wiki markup for the wiki ohana. Engine developers will not dump their proven markup, just because we think that we know it better. That's why it is called CREOLE. Your aims are obviously incompatible with the aims of creole. |
|
-- [[ChristophSauer]], 2007-04-29 |
|
//It doesn't matter: MediaWiki will not implement Creole in MixedMode, see ImagesReasoning#Collisions with MediaWikis Template Format.// |
|
We know. It started as an answer to Chuck's "worries". It's totally obvious to me that MediaWiki won't include Creole's list markup in its main engine.
|
//I don't think so: We are not creating Creole for your particular software, but for all Wiki engines out there, or to be more exact, for the users that author them.// |
|
That's something I'd understood and accepted, thank you. |
|
//Our mission is to create a common wiki markup for the wiki ohana. Engine developers will not dump their proven markup, just because we think that we know it better.// |
|
That's nevertheless something clearly stated in [[Implementation]]: "gradual alignment - improve bad syntax from your markup with more user-friendly syntax from Creole". |
|
//That's why it is called CREOLE. Your aims are obviously incompatible with the aims of creole.// |
|
It isn't that obvious. My aim for Creole is its success. I've advocated features I don't need because they're used in millions of wiki pages, such as indented paragraphs. I've implemented (optional) wikiwords, automatic URL conversion to links, interwikis and images, even if I don't like or need them. I've released NME as open source because I've been asked to. I don't regret any of that, but please consider it before making a final judgment on my motivations.
|
-- [[YvesPiguet]], 2007-Apr-29 |
|
Again, I am sorry, Yves. I really appreciate your engagement and your judgment. Tests show that your implementation is the best out there so far, so please keep the faith. I didn't thank you for that analysis; I do so now. I didn't know that the French Wikipedia uses hyphens at the beginning of lines that much. I am ready to give up on the [[HyphenListMarkupProposal]] to reach a consensus, see [[Talk]]. Thanks, Yves.
|
-- [[ChristophSauer]], 2007-04-30 |