Commit graph

4 commits

Author SHA1 Message Date
Jacques Distler ca1e8de89c Minor Cleanups
Remove a no-longer-needed function.
' -> &39;
Fix regexp for tag chunk.
2008-05-22 02:46:45 -05:00
Jacques Distler f6508de6dd Whoops!
In some circumstances, the new Sanitizer was double-escaping text nodes.
Fixed (with unit test).
2008-05-21 14:14:43 -05:00
Jacques Distler 45405fc97e New Sanitizer Goes Live
The new sanitizer seems to work well (cuts the time required
to produce the Instiki Atom feed in half). Our strategy is to
use HTML5lib for <nowiki> content, but to use the new sanitizer
for content that has been processed by Maruku (and hence is
well-formed).

The one broken unit test won't affect us (since it dealt with
very malformed HTML).
2008-05-21 02:06:31 -05:00
Jacques Distler 800880f382 Rough In New Sanitizer
Start work (which may not pan out) on a new sanitizer. Right now, it passes
all but 1 of the HTML5lib Sanitizer's unit tests. But it doesn't do much
of anything to ensure well-formedness. This is not an issue for Maruku-processed
content, but it is a concern for <nowiki> blocks.

(One solution would be to use the HTML5lib parser on <nowiki> blocks.)

In any case, this baby is 3 times as fast as the HTML5lib sanitizer.
2008-05-20 17:02:10 -05:00