Jacques Distler
ca1e8de89c
Minor Cleanups
...
Remove a no-longer-needed function.
' -> &#39;
Fix regexp for tag chunk.
2008-05-22 02:46:45 -05:00
Jacques Distler
f6508de6dd
Whoops!
...
In some circumstances, the new Sanitizer was double-escaping text nodes.
Fixed (with unit test).
2008-05-21 14:14:43 -05:00
Jacques Distler
45405fc97e
New Sanitizer Goes Live
...
The new sanitizer seems to work well (cuts the time required
to produce the Instiki Atom feed in half). Our strategy is to
use HTML5lib for <nowiki> content, but to use the new sanitizer
for content that has been processed by Maruku (and hence is
well-formed).
The one broken unit test won't affect us (since it dealt with
very malformed HTML).
2008-05-21 02:06:31 -05:00
Jacques Distler
800880f382
Rough In New Sanitizer
...
Start work (which may not pan out) on a new sanitizer. Right now, it passes
all but 1 of the HTML5lib Sanitizer's unit tests. But it doesn't do much
of anything to ensure well-formedness. This is not an issue for Maruku-processed
content, but it is a concern for <nowiki> blocks.
(One solution would be to use the HTML5lib parser on <nowiki> blocks.)
In any case, this baby is 3 times as fast as the HTML5lib sanitizer.
2008-05-20 17:02:10 -05:00
Jacques Distler
f8e74e53bd
Rollback
...
The "optimization" of using arrays instead of regexps to
implement to_utf8 and is_utf8? (and their brethren) is
actually no faster. Go back to the logically-clearer implementation.
2008-05-18 13:22:38 -05:00
Jacques Distler
dfe22be5ff
Minor tweak
...
This is slightly better.
2008-05-17 02:32:20 -05:00
Jacques Distler
41346bf8bd
Efficiency: Entity handling
...
Previously, used a regexp to find and convert named entities in the content.
Now use a more efficient algorithm.
Similar tweak for converting NCRs before checking whether text is valid utf-8.
2008-05-17 01:43:11 -05:00
Jacques Distler
5ca0760f7c
Efficiency: Sanitize Once
...
Invoke the HTML5lib Sanitizer just once (when the content is finally rendered),
rather than each time it passes through the chunk-handler.
2008-05-15 01:22:13 -05:00
Jacques Distler
6359d06ed1
Bug in Include Chunk-handler
...
Fix the chunk-handler for [[!include ...]] so that it behaves as expected.
2008-01-16 11:28:43 -06:00
Jacques Distler
4586614914
Misc Cleanup
...
Cleaned up some dependencies, and added a mime_types.yml file for Mongrel-compatibility.
2008-01-14 14:46:38 -06:00
Jacques Distler
ebc409e1a0
Ensure the_content REALLY is utf-8
...
Our check that the_content was valid utf-8 was rather busted.
This one works right. In particular, we needed to expand NCRs before checking.
2008-01-03 15:27:03 -06:00
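A minimal sketch of the idea in the entry above: expand numeric character references first, then validate, so that a forbidden character cannot slip through disguised as an NCR. The method names `expand_ncrs` and `utf8_clean?` are hypothetical, not the ones used in Instiki, and rejecting U+FFFE and U+FFFF here is an assumption borrowed from the "More Unicode Fun" entry.

```ruby
# Hypothetical helpers; not the names or code used in Instiki.

# Replace decimal (&#233;) and hexadecimal (&#xE9;) numeric character
# references with the characters they denote. References to surrogates
# or to code points above U+10FFFF are left untouched.
def expand_ncrs(text)
  text.gsub(/&#(?:(\d+)|[xX]([0-9a-fA-F]+));/) do |match|
    code = $1 ? $1.to_i : $2.to_i(16)
    if code > 0x10FFFF || (0xD800..0xDFFF).cover?(code)
      match
    else
      [code].pack('U')
    end
  end
end

# True when the bytes are well-formed UTF-8 and, after NCR expansion,
# contain neither U+FFFE nor U+FFFF.
def utf8_clean?(text)
  candidate = text.dup.force_encoding(Encoding::UTF_8)
  return false unless candidate.valid_encoding?

  expand_ncrs(candidate).codepoints.none? { |cp| cp == 0xFFFE || cp == 0xFFFF }
end
```

Checking the raw bytes alone would accept `&#xFFFF;`, since the reference itself is plain ASCII; expanding first is what exposes it.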
Jacques Distler
c8196cbe41
More Unicode Fun
...
From Philip Taylor (via Henri Sivonen): disallow U+fffe and U+ffff.
2008-01-01 22:00:07 -06:00
Jacques Distler
6873fc8026
Upgrade to Rails 2.0.2
...
Upgraded to Rails 2.0.2, except that we maintain
vendor/rails/actionpack/lib/action_controller/routing.rb
from Rails 1.2.6 (at least for now), so that Routes don't change. We still
get to enjoy Rails's many new features.
Also fixed a bug in Chunk-handling: disable WikiWord processing in tags (for real this time).
2007-12-21 01:48:59 -06:00
Jacques Distler
0f6889e09f
Fix Unicode bug
...
Fix Diego Restrepo's bug (see Rev 184).
Update to latest HTML5lib.
2007-12-17 03:17:43 -06:00
Jacques Distler
207fb1f7f2
New Version
...
Sync with Latest Instiki Trunk.
Migrate to Rails 1.2.5.
Bump version number.
2007-10-15 12:16:54 -05:00
Jacques Distler
de125367b0
Update RDOC documentation.
...
Update the documentation for sanitize.rb, to match current behaviour.
2007-10-14 22:22:18 -05:00
Jacques Distler
1911d18f65
Performance
...
OK. This is a better way: define a custom TreeWalker which converts named entities to utf-8 as it goes. This avoids having to do an extra tree traversal in sanitize_rexml, AND avoids the trainwreck that is html5/inputstream.rb.
2007-10-14 21:07:46 -05:00
Jacques Distler
198d7847bd
Performance
...
My REXML::Element.to_ncr (and REXML::Element.to_utf8) is horribly slow. For long documents, it proves more efficient to serialize to a string, apply String.to_ncr (or String.to_utf8) and then Sanitize the string.
2007-10-13 16:32:04 -05:00
Jacques Distler
5dd75d4cb0
File Upload Links
...
I like this a little better.
2007-10-09 23:56:55 -05:00
Jacques Distler
fbdf4c5dfe
Fix Broken Test
...
Was not picking up user-supplied alt text in [[filename|Alt text:pic]].
Fixed.
2007-10-09 11:02:44 -05:00
Jacques Distler
0eb723e125
Accessibility: Use Uploaded File Descriptions
...
The file upload dialog asks for a description of the image or file to be uploaded. Use this as the default alt-text for the image and as a title attribute for a file link.
2007-10-09 02:51:38 -05:00
Jacques Distler
be8bb3d06d
InterWeb Links
...
From Jason Blevins: [[Web Name:Page Name]] or [[Web Name:Page Name|alternate label]] produce inter-Web links on the same Instiki installation.
2007-10-06 16:04:11 -05:00
Jacques Distler
3a3cfeaa9b
Drop URI Chunk-handling
...
The URIChunk and LocalURIChunk handlers were
1) Slow
2) Buggy (prone to produce ill-formed pages in edge cases)
3) Of dubious utility
So I ditched them. No auto-linked URLs, but who cares?
2007-10-05 16:25:41 -05:00
Jacques Distler
08857ebe8e
Fix Markdown (non-math) Engine, Tweak Themes
...
More tweaks to the supplied S5 themes.
Fixed a minor regression in the non-Math Markdown engine.
2007-09-14 18:09:24 -05:00
Jacques Distler
54aada824c
Use Standard PageRenderer for S5 Content
...
From Jason Blevins: use the standard PageRenderer class to render S5 content. This way, WikiWords (etc) are processed in S5 slideshows.
2007-09-14 10:43:03 -05:00
Jacques Distler
119ab342dc
Security: Sanitize <nowiki>
...
Another XSS hole: the contents of <nowiki>...</nowiki> were not being sanitized.
2007-09-10 22:35:50 -05:00
Jacques Distler
9035c98dc5
Bugfix: Category listings
...
Fixed bug where clicking on a category link would stomp on the "All Pages" listing.
2007-09-09 23:20:06 -05:00
Jacques Distler
5b182bd228
HTML5lib Bug
...
Fixed a bug in the HTML5lib tokenizer (affects S5 slideshows).
Some miscellaneous code cleanup. In particular, don't bother with zapping control characters;
instead, rely on is_utf8? method to raise an exception (which we do anyway).
2007-09-06 10:40:48 -05:00
Jacques Distler
5ff1b7f6da
XSS Security Fix
...
There was an XSS vulnerability in the handling of categories. Now they are escaped.
2007-09-02 00:33:28 -05:00
Jacques Distler
6fd6be8fea
Sanitizer Fix
...
Whoops! Looks like Ryan changed the API for the HTML5 sanitizer. Bad, bad, bad.
Fixed now.
2007-08-30 16:06:20 -05:00
Jacques Distler
1bc5da0053
Use XHTMLSerializer, where appropriate.
2007-07-04 18:53:03 -05:00
Jacques Distler
8ccaad85a5
Sync with latest HTML5lib and latest Maruku
2007-07-04 17:36:59 -05:00
Jacques Distler
3de374d6c1
More fixes, sync with HTML5lib
...
Do a better job with the wrapper <div>s added by xhtmldiff and Maruku's to_html_tree method.
More tests fixed.
2007-06-13 23:05:15 -05:00
Jacques Distler
3ca33e52b5
Cleanup
...
Got rid of redcloth_for_tex.
Fixed almost all the busted tests.
2007-06-13 01:56:44 -05:00
Jacques Distler
2da672ec5b
Many Minor Fixes
...
Fixed a whole bunch of minor stuff.
Had a go at getting some of the plethora of broken tests to pass.
2007-06-12 17:37:55 -05:00
Jacques Distler
a68d1aa8f3
Sanitizer API documentation now online
...
See:
http://golem.ph.utexas.edu/~distler/code/rdoc/sanitize/
2007-06-08 23:51:30 -05:00
Jacques Distler
f818238dd3
Consolidation
...
Shuffled around a couple of files.
2007-06-08 22:39:37 -05:00
Jacques Distler
3bf560c3b3
Updated to Latest HTML5lib
...
Synced with latest HTML5lib.
Added some RDoc-compatible documentation to the sanitizer.
2007-06-08 17:26:00 -05:00
Jacques Distler
8badd0766a
Enhancements to sanitize.rb
...
Options, options, ... options.
2007-06-08 01:23:09 -05:00
Jacques Distler
0298868573
Fix S5 Unicode
...
Make sure sanitize_xhtml and sanitize_html are set to utf-8 encoding.
Also, a stylesheet tweak.
2007-06-07 17:30:42 -05:00
Jacques Distler
e1acebe6e4
Bugfix
...
Me stoopid.
2007-06-05 18:06:26 -05:00
Jacques Distler
f0cf0ec625
Sanitize REXML trees
...
OK. Enabled sanitization of rexml trees instead of strings.
My timing tests seem to be erratic. Can't tell whether this is really faster.
2007-06-05 17:13:44 -05:00
Jacques Distler
bd8ba1f4b1
REXML Trees
...
Synced with latest HTML5lib.
Added preliminary support (currently disabled) for sanitizing REXML trees.
2007-06-05 16:34:49 -05:00
Jacques Distler
4dd70af5ae
HTML5lib is Back.
...
Synced with latest version of HTML5lib, which fixes problem with Astral plane characters.
I should really do some tests, but the HTML5lib Sanitizer seems to be 2-5 times slower than the old sanitizer.
2007-05-30 10:45:52 -05:00
Jacques Distler
e1a6827f1f
Rollback Switch to HTML5lib
...
Apparently, HTML5lib does not handle astral plane unicode characters correctly.
Which makes it useless.
Return to the previous sanitizer.
2007-05-29 23:57:39 -05:00
Jacques Distler
6b21ac484f
HTML5lib Sanitizer
...
Replaced native Sanitizer with HTML5lib version.
Synced with latest Maruku.
2007-05-25 20:52:27 -05:00
Jacques Distler
b0e063451f
Sanitize Tweak
...
Add 'cite' to the list of attributes whose values are URIs.
2007-04-28 02:09:21 -05:00
Jacques Distler
9b55a75570
More SVG Elements and Attributes
...
Added <tspan> and <marker>, as well as a slew of related SVG attributes.
Also an SVG-related stylesheet tweak
2007-04-27 21:52:29 -05:00
Jacques Distler
6ca6525ff7
Add another SVG attribute to Sanitize.
...
Add 'stroke-opacity' to list of allowed SVG attributes.
2007-04-20 16:09:55 -05:00
Jacques Distler
0db06a9fa3
To be really XML-safe, don't emit XHTML+MathML named entities. (Ported MathML::Entities to Ruby.)
2007-03-29 03:30:10 -05:00
Jacques Distler
7adac51d6d
Sync with latest Instiki trunk. Changes:
...
1) Upgrade Rails to 1.2.3
2) Revert RedCloth to previous version (who %#$@ cares?)
3) Preserve the Rails Security fix to vendor/rails/actionpack/lib/action_controller/caching.rb from Revision 80.
2007-03-18 11:56:12 -05:00
Jacques Distler
d74116dc67
Ensure that input is bona fide utf-8.
2007-03-07 21:06:39 -06:00
Jacques Distler
f208d50032
Bah!
2007-02-24 23:07:25 -06:00
Jacques Distler
507a17aade
More lenient URI scheme matching in sanitize.
2007-02-24 22:47:31 -06:00
Jacques Distler
f9dcfa5af0
Make list of attributes whose values are scanned for acceptable URI schemes customizable.
2007-02-24 11:55:40 -06:00
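A rough sketch of what scheme checking with a customizable attribute list could look like. Everything here is an assumption for illustration: the method name `scrub_uri_attributes`, the keyword arguments, and both default lists are hypothetical and are not taken from sanitize.rb.

```ruby
# Hypothetical names and lists; not taken from Instiki's sanitize.rb.
ALLOWED_SCHEMES = %w[http https mailto ftp].freeze
URI_ATTRIBUTES  = %w[href src cite].freeze

# Return a copy of +attrs+ (a Hash of attribute name => value) without
# any URI-valued attribute whose scheme is not on the allowed list.
# Values with no scheme at all (relative URIs) are kept.
def scrub_uri_attributes(attrs, uri_attributes: URI_ATTRIBUTES, schemes: ALLOWED_SCHEMES)
  attrs.reject do |name, value|
    next false unless uri_attributes.include?(name)

    # Strip control characters and whitespace before looking for the
    # scheme, so that "java\tscript:" is not mistaken for a relative URI.
    compact = value.gsub(/[[:cntrl:][:space:]]+/, '').downcase
    scheme = compact[/\A([a-z][a-z0-9+.\-]*):/, 1]
    !scheme.nil? && !schemes.include?(scheme)
  end
end
```

Passing `uri_attributes: %w[href src cite data]` would extend the check to one more attribute without touching the defaults, which is the kind of customization the entry above describes.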
Jacques Distler
d8e06f6db9
Sanitize URI schemes.
2007-02-23 13:34:58 -06:00
Jacques Distler
e179508377
Sanitization now preserves case-sensitive element and attribute names (necessary to support SVG).
...
Unit tests, galore.
2007-02-23 11:32:06 -06:00
Jacques Distler
2fa1e08c96
Tweak dependencies of sanitize.rb
2007-02-22 01:16:18 -06:00
Jacques Distler
bacae2c468
Finally! XSS-protection, done right.
...
If you want something done right, ...
2007-02-22 01:06:53 -06:00
Jacques Distler
0aafedb2df
More XSS fixes.
...
Started fixing file uploads.
2007-02-21 12:10:47 -06:00
Jacques Distler
88c6f27e14
Bah! *Someone* will care about those other Text-filters.
2007-02-20 08:18:48 -06:00
Jacques Distler
e727507ac8
Zap gremlins.
...
Close cross-site scripting hole.
2007-02-19 23:15:39 -06:00
Jacques Distler
fc15848517
Configure equation-numbering as we like it.
2007-02-14 22:19:37 -06:00
Jacques Distler
ff63e894b2
Sync with latest Maruku.
...
Finally able to ditch BlueCloth completely.
2007-02-14 20:32:24 -06:00
Jacques Distler
d4b947462b
Whoops! Missed one.
2007-02-10 23:17:16 -06:00
Jacques Distler
63e217bcfd
Moved Maruku (and its dependencies) and XHTMLDiff (and its dependencies) to vendor/plugins/ .
...
Synced with Instiki SVN.
2007-02-10 23:03:15 -06:00
Jacques Distler
0ac586ee25
Sync with latest Maruku.
2007-02-04 19:36:33 -06:00
Jacques Distler
8c52f28864
Replaced diff.rb with xhtmldiff.rb, which (unlike its predecessor) produces well-formed redline documents.
2007-02-03 22:52:48 -06:00
Jacques Distler
86e9c70a26
Fix regression in Maruku.
2007-02-02 01:00:02 -06:00
Jacques Distler
f406318168
Sync with Maruku.
2007-01-24 17:14:50 -06:00
Jacques Distler
488dd334f7
Support for IE+MathPlayer.
...
Sync with latest Maruku.
2007-01-24 10:53:10 -06:00
Jacques Distler
1c05a94d1b
Updated to latest Maruku.
2007-01-23 09:26:45 -06:00
Jacques Distler
ceb0931bb3
Sync to latest Maruku. Tweak to CSS stylesheet.
2007-01-22 11:34:51 -06:00
Jacques Distler
b19e1e4f47
Bring up to current.
2007-01-22 08:36:51 -06:00
Jacques Distler
69b62b6f33
Checkout of Instiki Trunk 1/21/2007.
2007-01-22 07:43:50 -06:00