When a Web uses one of the Markdown Text Filters, and you export
all the pages as a zip file, you'd like the MathML and SVG to
render when the pages are viewed locally. This means saving them
with a .xhtml extension. Users of non-XHTML-capable browsers or
Textile users should still get .html files.
Ruby's String.sub!(pattern, replacement) routine is fundamentally
broken. But the block version works fine.
Using the broken routine in the Chunk handler was a subtle mistake.
WikiWord (and the like) could wreak havoc in equations. Protect them
(the way <a>, <pre> and <code> blocks are protected).
For some reason, this doesn't seem to work in inline equations.
Maruku is doing something funny there ... => one failing Unit Test.
For the file_list action, include the pages which link to the given file(s).
This required rejiggering so that that information is actually retained in the database.
Unfortunately, you'll actually need to revise the page(s) in question, because that's the
only time this information is updated in the database.
Some more tests from Clint Ruoho. The main branch of Instiki (and, I guess,
the old sanitizer) are vulnerable.
Also: under Ruby 1.8.x, CGI.unescapeHTML screws up horribly decoding NCRs
which represent high-bit ASCII characters. UTF-8 agrees with 7-bit ASCII,
but CGI.unescapeHTML doesn't seem to know that they disagree for i>127.
CMyApp is a WikiWord (at least, on other Wiki systems, like TWiki).
Should allow that here
Also, choose a more obscure name for the thread-local variable tracking
included chunks.
Use "Thread.current[:included_by]" instead of the Class variable,
"@@included_by".
The former will work on some newfangled multi-threaded Webserver stack,
which uses separate threads to handle multiple simlutaneous requests
(one request/thread). Dunno that the rest of the application is
thread-safe, but using a class variable, in this context, probably isn't.
Thanks to Sam Ruby for the suggestion.
Another request from the old (and apparently defunct) Instiki Bug Tracker:
allow single letter WikiLinks, e.g. "[[a]]". Requested by a Japanese user.
Fixed.
Another very amusing 3-year old bug from the main Instiki Bug Tracker
(don't they ever fix anything?): the chunk-handling code was supposed
to prevent recursive [[!include ...]] statements. Alas, instead of
actually preventing them it would -- when it encountered a recursive
include -- churn away until Rails ran out of stack space.
Fixed.
Previously,
<nowiki>[[!include foo]]</nowiki>
would produce some garbage, like
chunk18226682includechunk
instead of the desired rendered text,
[[!include foo]]
Fixed.
By default, Rails will cache
example.com/mywiki/show/SomePage
and
www.example.com/mywiki/show/SomePage
In Instiki, this just leads to stale cached pages and frustration.
Fix that behaviour.
Be a little gentler in recovering from Instiki::ValidationErrors, when saving a page.
Previously, we threw away all the user's changes upon the redirect. Now we attempt
to salvage what he wrote.
Links to a published web should be to the 'publish' action, not to the
'show' action. Previously, the published status of the source, not the target
was used.
Also, correct display of the Navigation Links for the 'published' action.
The html5lib sanitizer does not necessarily produce well-formed output.
Take some "bad" input, wrap it in a <nowiki> tag and -- bingo! -- you get
ill-formed output.
Fixed. (Though, probably, one should fix the html5lib sanitizer, instead.)
Updated to Rails 2.2.2.
Added a couple more Ruby 1.9 fixes, but that's pretty much at a standstill,
until one gets Maruku and HTML5lib working right under Ruby 1.9.
The new sanitizer seems to work well (cuts the time required
to produce the Instiki Atom feed in half). Our strategy is to
use HTML5lib for <nowiki> content, but to use the new sanitizer
for content that has been processed by Maruku (and hence is
well-formed).
The one broken unit test won't affect us (since it dealt with
very malformed HTML).
Start work (which may not pan out) on a new sanitizer. Right now, it passes
all but 1 of the HTML5lib Sanitizer's unit tests. But it doesn't do much
of anything to ensure well-formedness. This is not an issue for Maruku-processed
content, but it is a concern for <nowiki> blocks.
(One solution would be to use the HTML5lib parser on <nowiki> blocks.)
In any case, this baby is 3 times as fast as the HTML5lib sanitizer.
The "optimization" of using arrays instead of regexps to
implement to_utf8 and is_utf8? (and their brethren) is
actually no faster. Go back to the logically-clearer implementation.
Previously, used a regexp to find and convert named entities in the content.
Now use a more efficient algorithm.
Similar tweak for converting NCRs before checking whether text is valid utf-8.
Upgraded to Rails 2.0.2, except that we maintain
vendor/rails/actionpack/lib/action_controller/routing.rb
from Rail 1.2.6 (at least for now), so that Routes don't change. We still
get to enjoy Rails's many new features.
Also fixed a bug in Chunk-handling: disable WikiWord processing in tags (for real this time).
OK. This is a better way: define a custom TreeWalker which converts named entities to utf-8 as it goes. This avoids having to do an extra tree traversal in sanitize_rexml, AND avoids the trainwreck that is html5/inputstream.rb.
My REXML::Element.to_ncr (and REXML::Element.to_utf8) is horribly slow. For long documents, it proves more efficient to serialize to a string, apply String.to_ncr (or String.to_utf8) and then Sanitize the string.