11 commits

Author SHA1 Message Date
Jacques Distler a6429f8c22 Ruby 1.9 Compatibility
Completely removed the html5lib sanitizer.
Fixed the string-handling to work in both
Ruby 1.8.x and 1.9.2. There are still,
inexplicably, two functional tests that
fail. But the rest seems to work quite well.
2009-11-30 16:28:18 -06:00
Jacques Distler 510e44a61a More tests 2009-09-26 00:36:28 -05:00
Jacques Distler dcd3e63ae8 Nowiki Include
Previously,
   <nowiki>[[!include foo]]</nowiki>
would produce some garbage, like
   chunk18226682includechunk
instead of the desired rendered text,
   [[!include foo]]

Fixed.
2008-12-20 23:24:50 -06:00
Jacques Distler 34fcd7943a Some Tests
Some functional tests for 'delete orphaned pages by category'.
2008-12-07 00:24:25 -06:00
Jacques Distler 513b2b16c1 Better
Put the "safe" XHTML sanitization in lib/sanitize.rb, rather than in lib/chunks/nowiki.rb.
D'oh!
2008-12-01 10:29:46 -06:00
Jacques Distler 758325923f Fix another ill-Formedness hole
The html5lib sanitizer does not necessarily produce well-formed output.
Take some "bad" input, wrap it in a <nowiki> tag and -- bingo! -- you get
ill-formed output.

Fixed. (Though, probably, one should fix the html5lib sanitizer, instead.)
2008-11-30 21:44:52 -06:00
Jacques Distler 45405fc97e New Sanitizer Goes Live
The new sanitizer seems to work well (cuts the time required
to produce the Instiki Atom feed in half). Our strategy is to
use HTML5lib for <nowiki> content, but to use the new sanitizer
for content that has been processed by Maruku (and hence is
well-formed).

The one broken unit test won't affect us (since it dealt with
very malformed HTML).
2008-05-21 02:06:31 -05:00
Jacques Distler 800880f382 Rough In New Sanitizer
Start work (which may not pan out) on a new sanitizer. Right now, it passes
all but 1 of the HTML5lib Sanitizer's unit tests. But it doesn't do much
of anything to ensure well-formedness. This is not an issue for Maruku-processed
content, but it is a concern for <nowiki> blocks.

(One solution would be to use the HTML5lib parser on <nowiki> blocks.)

In any case, this baby is 3 times as fast as the HTML5lib sanitizer.
2008-05-20 17:02:10 -05:00
Jacques Distler 41346bf8bd Efficiency: Entity handling
Previously, used a regexp to find and convert named entities in the content.
Now use a more efficient algorithm.
Similar tweak for converting NCRs before checking whether text is valid utf-8.
2008-05-17 01:43:11 -05:00
Jacques Distler 1259e16a4a A Couple of Unit Tests 2007-09-23 00:03:58 -05:00
Jacques Distler 69b62b6f33 Checkout of Instiki Trunk 1/21/2007. 2007-01-22 07:43:50 -06:00