Apply the same methodology, as in Revision 432,
to the category chunk-handler. This completes
the replacement of all the code that looks like
if string.is_utf8?
do something
else
complain
end
with code that looks like
string.purify
do something
Previously, if the user tried to submit content which was
malformed utf-8, Instiki would complain loudly to him.
A slightly more user-friendly approach was suggested by
the latest Rails 2.3.4, and a conversation with Sam Ruby
(who suggested some improvements).
Now, instead of complaining, we remove the offending bytes,
leaving a well-formed utf-8 string, which we pretend is what
the user meant to submit.
Should be to the published action. This
didn't work right for inter-web links.
(Reported by Mike Shulman)
Also, change some .length's to .size's
(for Andrew Stacey)
Refactor the upgrade_instiki rake task.
Based on the (very nice) JHerdman's
64d305f2a8
but defaults to 'production' environment, instead.
Instiki users don't know about production/development/test.
Instiki defaults to 'production'. So should its associated rake tasks.
Added the ability to rename existing pages.
[[!redirects Some Page Name]] redirects Wikilinks [[Some Page Name]] to
the current page (assuming "Some Page Name" does not exist).
Real pages trump redirects (though this may change, depending on
user feedback).
From Jason Blevins:
Create a "History" page for each wiki page.
Link to it, and to the "Diff" page from "Recently Revised".
Also, correct a bug in listing/deleting links to uploaded
video and audio files.
Using <object> and <embed> were forbidden for obvious
security reasons. Instiki now permits embedding video
via the HTML5 <video> element (Ogg/Theora encoded videos
only, with .ogg or .ogv extensions). You can even upload
videos with
[[foo.ogg:video]]
Instiki now support x-sendfile. See the Proxying page for
configuring Apache (with the x-sendfile module). Lighttpd
should work similarly.
Update Rails to latest Edge (hopefully converging on RC2!).
Instiki now runs on the Rails 2.3.0 Candidate Release.
Among other improvements, this means that it now
automagically selects between WEBrick and Mongrel.
Just run
./instiki --daemon
On Webs with file uploads enabled, uploaded files were stored
(in version 0.16.1 and earlier) in the public/ directory.
This was a security threat. A miscreant could upload a .html file.
When a user clicked on the link to the file, it was opened (unsanitized)
in the browser.
As of version 0.16.2, uploaded files are stored in the webs/
directory. Now, when the user clicks on the link, the file is sent
with the
Content-Disposition: attachment
header set, which causes the file to be downloaded, rather than opened
in the browser. As always, files downloaded from the internets should be
treated with caution. At least, this way, they are not aoutomatically
opened in the browser.
To move your existing uploaded files to the new location, do a
rake upgrade_instiki
When a Web uses one of the Markdown Text Filters, and you export
all the pages as a zip file, you'd like the MathML and SVG to
render when the pages are viewed locally. This means saving them
with a .xhtml extension. Users of non-XHTML-capable browsers or
Textile users should still get .html files.
Ruby's String.sub!(pattern, replacement) routine is fundamentally
broken. But the block version works fine.
Using the broken routine in the Chunk handler was a subtle mistake.
WikiWord (and the like) could wreak havoc in equations. Protect them
(the way <a>, <pre> and <code> blocks are protected).
For some reason, this doesn't seem to work in inline equations.
Maruku is doing something funny there ... => one failing Unit Test.
For the file_list action, include the pages which link to the given file(s).
This required rejiggering so that that information is actually retained in the database.
Unfortunately, you'll actually need to revise the page(s) in question, because that's the
only time this information is updated in the database.
Some more tests from Clint Ruoho. The main branch of Instiki (and, I guess,
the old sanitizer) are vulnerable.
Also: under Ruby 1.8.x, CGI.unescapeHTML screws up horribly decoding NCRs
which represent high-bit ASCII characters. UTF-8 agrees with 7-bit ASCII,
but CGI.unescapeHTML doesn't seem to know that they disagree for i>127.
CMyApp is a WikiWord (at least, on other Wiki systems, like TWiki).
Should allow that here
Also, choose a more obscure name for the thread-local variable tracking
included chunks.
Use "Thread.current[:included_by]" instead of the Class variable,
"@@included_by".
The former will work on some newfangled multi-threaded Webserver stack,
which uses separate threads to handle multiple simlutaneous requests
(one request/thread). Dunno that the rest of the application is
thread-safe, but using a class variable, in this context, probably isn't.
Thanks to Sam Ruby for the suggestion.
Another request from the old (and apparently defunct) Instiki Bug Tracker:
allow single letter WikiLinks, e.g. "[[a]]". Requested by a Japanese user.
Fixed.
Another very amusing 3-year old bug from the main Instiki Bug Tracker
(don't they ever fix anything?): the chunk-handling code was supposed
to prevent recursive [[!include ...]] statements. Alas, instead of
actually preventing them it would -- when it encountered a recursive
include -- churn away until Rails ran out of stack space.
Fixed.
Previously,
<nowiki>[[!include foo]]</nowiki>
would produce some garbage, like
chunk18226682includechunk
instead of the desired rendered text,
[[!include foo]]
Fixed.
By default, Rails will cache
example.com/mywiki/show/SomePage
and
www.example.com/mywiki/show/SomePage
In Instiki, this just leads to stale cached pages and frustration.
Fix that behaviour.
Be a little gentler in recovering from Instiki::ValidationErrors, when saving a page.
Previously, we threw away all the user's changes upon the redirect. Now we attempt
to salvage what he wrote.
Links to a published web should be to the 'publish' action, not to the
'show' action. Previously, the published status of the source, not the target
was used.
Also, correct display of the Navigation Links for the 'published' action.
The html5lib sanitizer does not necessarily produce well-formed output.
Take some "bad" input, wrap it in a <nowiki> tag and -- bingo! -- you get
ill-formed output.
Fixed. (Though, probably, one should fix the html5lib sanitizer, instead.)
Updated to Rails 2.2.2.
Added a couple more Ruby 1.9 fixes, but that's pretty much at a standstill,
until one gets Maruku and HTML5lib working right under Ruby 1.9.
The new sanitizer seems to work well (cuts the time required
to produce the Instiki Atom feed in half). Our strategy is to
use HTML5lib for <nowiki> content, but to use the new sanitizer
for content that has been processed by Maruku (and hence is
well-formed).
The one broken unit test won't affect us (since it dealt with
very malformed HTML).
Start work (which may not pan out) on a new sanitizer. Right now, it passes
all but 1 of the HTML5lib Sanitizer's unit tests. But it doesn't do much
of anything to ensure well-formedness. This is not an issue for Maruku-processed
content, but it is a concern for <nowiki> blocks.
(One solution would be to use the HTML5lib parser on <nowiki> blocks.)
In any case, this baby is 3 times as fast as the HTML5lib sanitizer.
The "optimization" of using arrays instead of regexps to
implement to_utf8 and is_utf8? (and their brethren) is
actually no faster. Go back to the logically-clearer implementation.
Previously, used a regexp to find and convert named entities in the content.
Now use a more efficient algorithm.
Similar tweak for converting NCRs before checking whether text is valid utf-8.
Upgraded to Rails 2.0.2, except that we maintain
vendor/rails/actionpack/lib/action_controller/routing.rb
from Rail 1.2.6 (at least for now), so that Routes don't change. We still
get to enjoy Rails's many new features.
Also fixed a bug in Chunk-handling: disable WikiWord processing in tags (for real this time).
OK. This is a better way: define a custom TreeWalker which converts named entities to utf-8 as it goes. This avoids having to do an extra tree traversal in sanitize_rexml, AND avoids the trainwreck that is html5/inputstream.rb.
My REXML::Element.to_ncr (and REXML::Element.to_utf8) is horribly slow. For long documents, it proves more efficient to serialize to a string, apply String.to_ncr (or String.to_utf8) and then Sanitize the string.