Commit graph

5 commits

Author SHA1 Message Date
Jacques Distler
800880f382 Rough In New Sanitizer
Start work (which may not pan out) on a new sanitizer. Right now, it passes
all but 1 of the HTML5lib Sanitizer's unit tests. But it doesn't do much
of anything to ensure well-formedness. This is not an issue for Maruku-processed
content, but it is a concern for <nowiki> blocks.

(One solution would be to use the HTML5lib parser on <nowiki> blocks.)

In any case, this baby is 3 times as fast as the HTML5lib sanitizer.
2008-05-20 17:02:10 -05:00
Jacques Distler
4dd70af5ae HTML5lib is Back.
Synced with latest version of HTML5lib, which fixes problem with Astral plane characters.
I should really do some tests, but the HTML5lib Sanitizer seems to be 2-5 times slower than the old sanitizer.
2007-05-30 10:45:52 -05:00
Jacques Distler
e1a6827f1f Rollback Switch to HTML5lib
Apparently, HTML5lib does not handle astral plane unicode characters correctly.
Which makes it useless.
Return to the previous sanitizer.
2007-05-29 23:57:39 -05:00
Jacques Distler
6b21ac484f HTML5lib Sanitizer
Replaced native Sanitizer with HTML5lib version.
Synced with latest Maruku.
2007-05-25 20:52:27 -05:00
Jacques Distler
e179508377 Sanitization now preserves case-sensitive element and attribute names (necessary to support SVG).
Unit tests, galore.
2007-02-23 11:32:06 -06:00