Commit graph

28 commits

Author SHA1 Message Date
Jacques Distler ca1e8de89c Minor Cleanups
Remove a no-longer-needed function.
' -> &39;
Fix regexp for tag chunk.
2008-05-22 02:46:45 -05:00
Jacques Distler f6508de6dd Whoops!
In some circumstances, the new Sanitizer was double-escaping text nodes.
Fixed (with unit test).
2008-05-21 14:14:43 -05:00
Jacques Distler 45405fc97e New Sanitizer Goes Live
The new sanitizer seems to work well (cuts the time required
to produce the Instiki Atom feed in half). Our strategy is to
use HTML5lib for <nowiki> content, but to use the new sanitizer
for content that has been processed by Maruku (and hence is
well-formed).

The one broken unit test won't affect us (since it dealt with
very malformed HTML).
2008-05-21 02:06:31 -05:00
Jacques Distler 800880f382 Rough In New Sanitizer
Start work (which may not pan out) on a new sanitizer. Right now, it passes
all but 1 of the HTML5lib Sanitizer's unit tests. But it doesn't do much
of anything to ensure well-formedness. This is not an issue for Maruku-processed
content, but it is a concern for <nowiki> blocks.

(One solution would be to use the HTML5lib parser on <nowiki> blocks.)

In any case, this baby is 3 times as fast as the HTML5lib sanitizer.
2008-05-20 17:02:10 -05:00
Jacques Distler 14afed5893 Test for Entity-handling 2008-05-17 15:02:16 -05:00
Jacques Distler 41346bf8bd Efficiency: Entity handling
Previously, used a regexp to find and convert named entities in the content.
Now use a more efficient algorithm.
Similar tweak for converting NCRs before checking whether text is valid utf-8.
2008-05-17 01:43:11 -05:00
Jacques Distler 5ca0760f7c Efficiency: Sanitize Once
Envoke the HTML5lib Sanitizer just once (when the content is finally rendered),
rather than each time it passes through the chunk-handler.
2008-05-15 01:22:13 -05:00
Jacques Distler 5dd0507acc Support svg:foreignObject
Fixes to the html5lib sanitizer and maruku to support the SVG <foreignObject> element.
Also update to the latest REXML.
2008-02-03 23:56:17 -06:00
Jacques Distler 38ae064b8a Bundle Latest REXML
Sam Ruby has been doing a bang-up job fixing the bugs in REXML.
Who knows when these improvements will trickle down to vendor distributions of Ruby.
In the meantime, let's bundle the latest version of REXML with Instiki.
We check the version number of the bundled REXML against that of the System REXML, and use whichever is later.
2008-01-11 23:53:29 -06:00
Jacques Distler e74deb0cfb Unit test
Add a unit test for previous WikiWord fix.
2007-12-21 08:53:45 -06:00
Jacques Distler 6873fc8026 Upgrade to Rails 2.0.2
Upgraded to Rails 2.0.2, except that we maintain

   vendor/rails/actionpack/lib/action_controller/routing.rb

from Rail 1.2.6 (at least for now), so that Routes don't change. We still
get to enjoy Rails's many new features.

Also fixed a bug in Chunk-handling: disable WikiWord processing in tags (for real this time).
2007-12-21 01:48:59 -06:00
Jacques Distler 9c55037626 Some more tests to track down Diego Restrepo's bug 2007-10-28 14:04:30 -05:00
Jacques Distler 5208bbf0af Sanitize url refs in SVG attributes
Add some tests.
Sync with latest HTML5lib (includes above sanitization improvements).
2007-10-27 17:34:29 -05:00
Jacques Distler ae82f1be49 Whoops!
Fix an inadvertently broken test.
2007-10-26 16:09:50 -05:00
Jacques Distler 8ce5016b41 UTF-8 Bug
Create a test case for utf-8 bug reported by Diego Restrepo. Seems to be related to WikiWord chunk handling.
Add some other tests, and fix a minor bug in vendor/plugins/maruku/lib/maruku/ext/math/latex_fix.rb.
2007-10-26 00:48:43 -05:00
Jacques Distler 1911d18f65 Performance
OK. This is a better way: define a custom TreeWalker which converts named entities to utf-8 as it goes. This avoids having to do an extra tree traversal in sanitize_rexml, AND avoids the trainwreck that is html5/inputstream.rb.
2007-10-14 21:07:46 -05:00
Jacques Distler 198d7847bd Performance
My REXML::Element.to_ncr (and REXML::Element.to_utf8) is horribly slow. For long documents, it proves more efficient to serialize to a string, apply String.to_ncr (or String.to_utf8) and then Sanitize the string.
2007-10-13 16:32:04 -05:00
Jacques Distler 5dd75d4cb0 File Upload Links
I like this a little better.
2007-10-09 23:56:55 -05:00
Jacques Distler 402de89abf Tests for Rev 171
One test is still broken. Will fix.
2007-10-09 03:16:07 -05:00
Jacques Distler 2484542f12 Security: HTTP GET Bypassed Spam Protection
Apparently, the form_spam_protect plugin only works with HTTP POST, not GET.
Unsafe operations (save and file-upload) should be POSTs anyway.
Fixed.

Also, two broken tests fixed. Only two Unit Tests now fail: both are minor bugs in XHTMLDiff.
2007-10-07 01:59:50 -05:00
Jacques Distler be8bb3d06d InterWeb Links
From Jason Blevins:  [[Web Name:Page Name]] or [[Web Name:Page Name|alternate label]] produce inter-Web links on the same Instiki installation.
2007-10-06 16:04:11 -05:00
Jacques Distler 3f5d804c22 Testcases for Recent XSS flaws
Testcases for unsanitized chunk-handling.
2007-09-11 20:49:56 -05:00
Jacques Distler d0e834978a Fix Broken Tests
In preparation for adding new tests, let's fix the existing ones.
3 Unit tests and one Functional test still fail.

* Two unit tests are bugs in xhtmldiff
* One is a bug in Maruku
* A file upload functional test fails, for reasons that escape me.
2007-09-11 12:04:26 -05:00
Jacques Distler 3de374d6c1 More fixes, sync with HTML5lib
Do a better job with the wrapper <div>s added by xhtmldiff and Maruku's to_html_tree method.
More tests fixed.
2007-06-13 23:05:15 -05:00
Jacques Distler 3ca33e52b5 Cleanup
Got rid of redcloth_for_tex.
Fixed almost all the busted tests.
2007-06-13 01:56:44 -05:00
Jacques Distler 2da672ec5b Many Minor Fixes
Fixed a whole bunch of minor stuff.
Had a go at getting some of the plethora of broken tests to pass.
2007-06-12 17:37:55 -05:00
Jacques Distler 3b6cd309ff Sync with Instiki Trunk
Sync with Revision 519 of Instiki trunk (2007/5/7).
2007-05-11 11:47:38 -05:00
Jacques Distler 69b62b6f33 Checkout of Instiki Trunk 1/21/2007. 2007-01-22 07:43:50 -06:00