Rough In New Sanitizer

Start work (which may not pan out) on a new sanitizer. Right now, it passes
all but one of the HTML5lib Sanitizer's unit tests, but it does little
to ensure well-formedness. This is not an issue for Maruku-processed
content, but it is a concern for <nowiki> blocks.

(One solution would be to use the HTML5lib parser on <nowiki> blocks.)
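
To illustrate the well-formedness concern, here is a minimal sketch of a stack-based tag-balance check in plain Ruby. The helper name `balanced_tags?` and the `VOID_ELEMENTS` list are hypothetical and not part of this commit; a regex scan like this is only a rough approximation and is no substitute for running a real parser such as HTML5lib over the block.

```ruby
# Hypothetical helper, not part of this commit: checks that every opening
# tag in an HTML fragment is closed in the right order.
# Assumption: a short, incomplete list of void elements is enough for a sketch.
VOID_ELEMENTS = %w[br hr img input meta link].freeze

def balanced_tags?(fragment)
  stack = []
  # Capture: optional leading slash, tag name, optional trailing slash.
  fragment.scan(%r{<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*?(/?)>}) do |closing, name, self_closing|
    name = name.downcase
    # Void and self-closing tags never need a matching close.
    next if VOID_ELEMENTS.include?(name) || self_closing == "/"
    if closing == "/"
      # A close tag must match the most recently opened tag.
      return false unless stack.pop == name
    else
      stack.push(name)
    end
  end
  # Anything left on the stack was opened but never closed.
  stack.empty?
end
```

A fragment such as `<p><b>bold</p>` is rejected by this check, which is the kind of input a token-by-token sanitizer can let through unchanged.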

In any case, this baby is 3 times as fast as the HTML5lib sanitizer.
Jacques Distler 2008-05-20 17:02:10 -05:00
parent f8e74e53bd
commit 800880f382
15 changed files with 3657 additions and 12 deletions

@@ -4,7 +4,7 @@ module MaRuKu; module Out; module HTML
   def convert_to_mathml_itex2mml(kind, tex)
     begin
       if not $itex2mml_parser
-        require 'sanitize'
+        require 'stringsupport'
         require 'itextomml'
         $itex2mml_parser = Itex2MML::Parser.new
       end