The html5lib sanitizer does not necessarily produce well-formed output.
Take some "bad" input, wrap it in a <nowiki> tag and -- bingo! -- you get
ill-formed output.
Fixed. (Though, probably, one should fix the html5lib sanitizer, instead.)
Updated to Rails 2.2.2.
Added a couple more Ruby 1.9 fixes, but that's pretty much at a standstill,
until one gets Maruku and HTML5lib working right under Ruby 1.9.
Fix Session CookieOverflow bug when rescuing an InstikiValidation error.
Fix some random things which will cause problems with Ruby 1.9. (Plenty
more where those came from.)
Implement amsthm-like Theorem environments with Maruku.
Support is based on Maruku "div"s with special class-names.
Classes
num_*
produce numbered environments, and
un_*
produce un-numbered environments, where * is one of
theorem (for Theorem)
lemma (for Lemma)
prop (for Proposition)
cor (for Corollary)
def (for Definition)
example (for Example)
remark (for Remark)
note (for Note)
In addition, the class
proof
produces a Proof environment.
The LaTeX export works as expected, and these also work in the S5 view.
Bumped version number.
Make session secret persist across restarts. (Been meaning to do this for
a while: no more "stale cookie" warnings fter restarting the server.
Avoid cookie overflow in session store.
IE7+MathPlayer do *not* like the charset parameter to be set in the
Content-Type header. Forcing Rails to omit that parameter is surprisingly
difficult.
The new sanitizer seems to work well (cuts the time required
to produce the Instiki Atom feed in half). Our strategy is to
use HTML5lib for <nowiki> content, but to use the new sanitizer
for content that has been processed by Maruku (and hence is
well-formed).
The one broken unit test won't affect us (since it dealt with
very malformed HTML).
Start work (which may not pan out) on a new sanitizer. Right now, it passes
all but 1 of the HTML5lib Sanitizer's unit tests. But it doesn't do much
of anything to ensure well-formedness. This is not an issue for Maruku-processed
content, but it is a concern for <nowiki> blocks.
(One solution would be to use the HTML5lib parser on <nowiki> blocks.)
In any case, this baby is 3 times as fast as the HTML5lib sanitizer.
The "optimization" of using arrays instead of regexps to
implement to_utf8 and is_utf8? (and their brethren) is
actually no faster. Go back to the logically-clearer implementation.
Previously, used a regexp to find and convert named entities in the content.
Now use a more efficient algorithm.
Similar tweak for converting NCRs before checking whether text is valid utf-8.
Make remove_orphaned_pages work in a proxied situation.
Also, "fix" a busted functional test. I'm not happy with
this one. We're enforcing plain-text titles (which, I think,
is the correct thing to do), but sending them as type="html",
which then requires double-encoding.