Summary of WikiUseful TextFormattingRegularExpressions:

	* '''Expressions for inline tags'''
 Quoted emphasis:
 ('')(.*?)\1  ->  $2
 Quoted strong:
 (''')(.*?)\1  ->  $2

	* '''Expressions for block level tags'''
 Preformatted code (space at beginning of line):
 ^ +(.*)\n  ->
$1
with a subsequent match and replace of
  ->  
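The inline expressions listed at the top of this summary can be tried out with a short script. This is only a sketch in Python rather than whatever language HtagWiki uses. The replacement markup is not visible on this page, so the `<strong>` and `<em>` tags, and the function names `format_inline` and `format_strong_greedy`, are assumptions made for illustration.

```python
import re

# Patterns copied from the summary; Python spells "$2" as r"\2".
STRONG = re.compile(r"(''')(.*?)\1")
EMPHASIS = re.compile(r"('')(.*?)\1")
# Greedy variant, for comparison only.
GREEDY_STRONG = re.compile(r"'{3}(.*)'{3}")


def format_inline(text: str) -> str:
    """Apply the strong rule before the emphasis rule.

    Running the three-quote rule first keeps ''' from being read as
    '' followed by a stray apostrophe.
    """
    text = STRONG.sub(r"<strong>\2</strong>", text)  # assumed markup
    return EMPHASIS.sub(r"<em>\2</em>", text)        # assumed markup


def format_strong_greedy(text: str) -> str:
    """Same idea with a greedy (.*), to show how far it reaches."""
    return GREEDY_STRONG.sub(r"<strong>\1</strong>", text)


if __name__ == "__main__":
    sample = "'''a''' x '''b'''"
    print(format_inline(sample))
    print(format_strong_greedy(sample))
```

With two strong spans on one line, the lazy `(.*?)` closes each span at the nearest delimiter, while the greedy `(.*)` runs to the last delimiter on the line and swallows everything in between.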
	* '''Expressions for nested group tags (such as bullets)'''
 Ordered list (tab-numeral-space at beginning of line):
 ^\t[0-9]\s*(.*)(?:\n?|$)  ->  
 $1
 with a subsequent match and replace of
 (?)((?:.*)+)(?!)  ->  $1
 Unordered list (tab-asterisk-space at beginning of line with consideration for previous unordered list parsing):
 ^\t\*\s*(.*)(?:\n?|$)  ->  $1
 with a subsequent match and replace of
 (?|)((?:.*)+)(?!|)  ->

	* '''Expressions for parsing links'''
 standard protocols, not image:
 (http|news|ftp|mailto)\:(\/\/)?((?:\S(?!\.gif|\.jpg|\.jpeg))+)\s  -->  $3
 standard protocols, inline image:
 (?

	* '''Expressions for alternate formatting rules (i.e. rules not native to TheOriginalWiki)'''

The above are the rules used for HtagWiki. I would be really interested to know what Ward uses, so we could analyse possible problems with the various RegularExpression''''''s. I have also started to realise that after a certain number of rules it starts getting inefficient, and of course the fastest method (that I can think of) would be to implement a custom WikiParser. It doesn't seem to be an issue yet, though. -- SvenNeumann

----

The following paragraph was moved from TextFormattingRegularExpressions.

This page could have been really, genuinely useful to my endeavors over the last two days, and I was very happy when I found it. Unfortunately, I must say I find many of the expressions here dodgy at best and just wrong at worst. Could we turn this into a resource for the (many) ways that WikiAuthors have implemented this? -- SvenNeumann

Let's use this page to discuss the RegularExpression''''''s and use the other page as a document.

''Great idea!''

''Maybe I'm just a bit slow, but some of these RegExes give me a splittin' headache :)''

----

The emphasis RegularExpression ''/'{3}(.*?)'{3}/'' looks like it would match situations where the apostrophes contain nothing. Wouldn't ''/'{3}(.*)'{3}/'' be simpler? And the horizontal line RegularExpression should match ''exactly'' 4 dashes, not 4 or more. -- AdewaleOshineye

Also, in HtagWiki I use backreferencing to make it easy to match several alternate formatting rules in one simple expression, for example (''|//)(.*?)\1 to allow the slash notation too. I also believe matching blank apostrophe parts is accurate, as long as the rules are ordered so that the expression gobbling the most quotes runs first. -- sn

----

Let's try this one: ^ +(.*)\n','
$1
to mark all monospaced lines and
for deletion afterwards, to merge the lines into a single block. Is that accurate?
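One way to check the two-pass idea above is to run it on a small sample. The sketch below is in Python and rests on assumptions: the markup around `$1` is not visible on this page, so `<pre>` tags are assumed, the second pass is assumed to delete a closing tag that is immediately followed by an opening tag on the next line, and `format_preformatted` is a hypothetical name.

```python
import re

# First pass: a line that starts with one or more spaces is monospaced.
# re.MULTILINE lets ^ match at the start of every line.
PRE_LINE = re.compile(r"^ +(.*)\n", re.MULTILINE)


def format_preformatted(text: str) -> str:
    # Pass 1: wrap every indented line in its own (assumed) <pre> element.
    marked = PRE_LINE.sub(r"<pre>\1</pre>\n", text)
    # Pass 2: where one block ends and the next begins on the following
    # line, drop the tag pair so adjacent lines merge into a single block.
    return marked.replace("</pre>\n<pre>", "\n")


if __name__ == "__main__":
    page = "intro\n one\n two\noutro\n"
    print(format_preformatted(page))
```

Under these assumptions the approach does merge adjacent lines into one block. Two limits are worth noting: ` +` consumes every leading space, so indentation inside the block is lost, and a final indented line with no trailing newline is not matched at all.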
    
    ----
    
What is an efficient, quick-and-dirty way to implement the spell checker? I tried a RegularExpression, but that took over 14 seconds for a small page and a 270k dictionary. A simple search for each word takes only 0.4 seconds. Of course, indexing would speed this up. But perhaps I'm approaching this from the wrong end too. How is simple spell checking usually implemented? -- sn

''By the users... it's a wiki, after all''.
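As a rough illustration of the indexing idea, a hash-based set gives constant-time lookups on average, so each word on the page is checked once without scanning the dictionary. This is a minimal sketch in Python, not a description of how any particular wiki does it; `build_index` and `unknown_words` are hypothetical names, and the word pattern (ASCII letters only) is an assumption.

```python
import re

# Assumed notion of a "word": a run of ASCII letters.
WORD = re.compile(r"[A-Za-z]+")


def build_index(dictionary_words):
    """Index the dictionary once; membership tests are then O(1) on average."""
    return frozenset(word.lower() for word in dictionary_words)


def unknown_words(page_text, index):
    """Return words not found in the index, in order of first appearance."""
    unknown = []
    seen = set()
    for word in WORD.findall(page_text):
        key = word.lower()
        if key not in index and key not in seen:
            seen.add(key)
            unknown.append(word)
    return unknown


if __name__ == "__main__":
    index = build_index(["the", "rules", "are", "ordered"])
    print(unknown_words("The rules are orderd, the rulez", index))
```

The index is built once and reused for every page, so the cost per page depends on the number of words on the page rather than on the size of the dictionary.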