instiki

Author	SHA1	Message	Date
Jacques Distler	800880f382	Rough In New Sanitizer Start work (which may not pan out) on a new sanitizer. Right now, it passes all but 1 of the HTML5lib Sanitizer's unit tests. But it doesn't do much of anything to ensure well-formedness. This is not an issue for Maruku-processed content, but it is a concern for <nowiki> blocks. (One solution would be to use the HTML5lib parser on <nowiki> blocks.) In any case, this baby is 3 times as fast as the HTML5lib sanitizer.	2008-05-20 17:02:10 -05:00
Jacques Distler	f8e74e53bd	Rollback The "optimization" of using arrays instead of regexps to implement to_utf8 and is_utf8? (and their brethren) is actually no faster. Go back to the logically-clearer implementation.	2008-05-18 13:22:38 -05:00
Jacques Distler	dfe22be5ff	Minor tweak This is slightly better.	2008-05-17 02:32:20 -05:00
Jacques Distler	41346bf8bd	Efficiency: Entity handling Previously, used a regexp to find and convert named entities in the content. Now use a more efficient algorithm. Similar tweak for converting NCRs before checking whether text is valid utf-8.	2008-05-17 01:43:11 -05:00
Jacques Distler	5ca0760f7c	Efficiency: Sanitize Once Invoke the HTML5lib Sanitizer just once (when the content is finally rendered), rather than each time it passes through the chunk-handler.	2008-05-15 01:22:13 -05:00
Jacques Distler	6359d06ed1	Bug in Include Chunk-handler Fix the chunk-handler for [[!include ...]] so that it behaves as expected.	2008-01-16 11:28:43 -06:00
Jacques Distler	4586614914	Misc Cleanup Cleaned up some dependencies, and added a mime_types.yml file for Mongrel-compatibility.	2008-01-14 14:46:38 -06:00
Jacques Distler	ebc409e1a0	Ensure the_content REALLY is utf-8 Our check that the_content was valid utf-8 was rather busted. This one works right. In particular, we needed to expand NCRs before checking.	2008-01-03 15:27:03 -06:00
Jacques Distler	c8196cbe41	More Unicode Fun From Philip Taylor (via Henri Sivonen): disallow U+fffe and U+ffff.	2008-01-01 22:00:07 -06:00
Jacques Distler	6873fc8026	Upgrade to Rails 2.0.2 Upgraded to Rails 2.0.2, except that we maintain vendor/rails/actionpack/lib/action_controller/routing.rb from Rails 1.2.6 (at least for now), so that Routes don't change. We still get to enjoy Rails's many new features. Also fixed a bug in Chunk-handling: disable WikiWord processing in tags (for real this time).	2007-12-21 01:48:59 -06:00
Jacques Distler	0f6889e09f	Fix Unicode bug Fix Diego Restrepo's bug (see Rev 184). Update to latest HTML5lib.	2007-12-17 03:17:43 -06:00
Jacques Distler	207fb1f7f2	New Version Sync with Latest Instiki Trunk. Migrate to Rails 1.2.5. Bump version number.	2007-10-15 12:16:54 -05:00
Jacques Distler	de125367b0	Update RDOC documentation. Update the documentation for sanitize.rb, to match current behaviour.	2007-10-14 22:22:18 -05:00
Jacques Distler	1911d18f65	Performance OK. This is a better way: define a custom TreeWalker which converts named entities to utf-8 as it goes. This avoids having to do an extra tree traversal in sanitize_rexml, AND avoids the trainwreck that is html5/inputstream.rb.	2007-10-14 21:07:46 -05:00
Jacques Distler	198d7847bd	Performance My REXML::Element.to_ncr (and REXML::Element.to_utf8) is horribly slow. For long documents, it proves more efficient to serialize to a string, apply String.to_ncr (or String.to_utf8) and then Sanitize the string.	2007-10-13 16:32:04 -05:00
Jacques Distler	5dd75d4cb0	File Upload Links I like this a little better.	2007-10-09 23:56:55 -05:00
Jacques Distler	fbdf4c5dfe	Fix Broken Test Was not picking up user-supplied alt text in [[filename\|Alt text:pic]]. Fixed.	2007-10-09 11:02:44 -05:00
Jacques Distler	0eb723e125	Accessibility: Use Uploaded File Descriptions The file upload dialog asks for a description of the image or file to be uploaded. Use this as the default alt-text for the image and as a title attribute for a file link.	2007-10-09 02:51:38 -05:00
Jacques Distler	be8bb3d06d	InterWeb Links From Jason Blevins: [[Web Name:Page Name]] or [[Web Name:Page Name\|alternate label]] produce inter-Web links on the same Instiki installation.	2007-10-06 16:04:11 -05:00
Jacques Distler	3a3cfeaa9b	Drop URI Chunk-handling The URIChunk and LocalURIChunk handlers were 1) Slow 2) Buggy (prone to produce ill-formed pages in edge cases) 3) Of dubious utility. So I ditched them. No auto-linked URLs, but who cares?	2007-10-05 16:25:41 -05:00
Jacques Distler	08857ebe8e	Fix Markdown (non-math) Engine, Tweak Themes More tweaks to the supplied S5 themes. Fixed a minor regression in the non-Math Markdown engine.	2007-09-14 18:09:24 -05:00
Jacques Distler	54aada824c	Use Standard PageRenderer for S5 Content From Jason Blevins: use the standard PageRenderer class to render S5 content. This way, WikiWords (etc) are processed in S5 slideshows.	2007-09-14 10:43:03 -05:00
Jacques Distler	119ab342dc	Security: Sanitize <nowiki> Another XSS hole: the contents of <nowiki>...</nowiki> was not being sanitized.	2007-09-10 22:35:50 -05:00
Jacques Distler	9035c98dc5	Bugfix: Category listings Fixed bug where clicking on a category link would stomp on the "All Pages" listing.	2007-09-09 23:20:06 -05:00
Jacques Distler	5b182bd228	HTML5lib Bug Fixed a bug in the HTML5lib tokenizer (affects S5 slideshows). Some miscellaneous code cleanup. In particular, don't bother with zapping control characters; instead, rely on is_utf8? method to raise an exception (which we do anyway).	2007-09-06 10:40:48 -05:00
Jacques Distler	5ff1b7f6da	XSS Security Fix There was an XSS vulnerability in the handling of categories. Now they are escaped.	2007-09-02 00:33:28 -05:00
Jacques Distler	6fd6be8fea	Sanitizer Fix Whoops! Looks like Ryan changed the API for the HTML5 sanitizer. Bad, bad, bad. Fixed now.	2007-08-30 16:06:20 -05:00
Jacques Distler	1bc5da0053	Use XHTMLSerializer, where appropriate.	2007-07-04 18:53:03 -05:00
Jacques Distler	8ccaad85a5	Sync with latest HTML5lib and latest Maruku	2007-07-04 17:36:59 -05:00
Jacques Distler	3de374d6c1	More fixes, sync with HTML5lib Do a better job with the wrapper <div>s added by xhtmldiff and Maruku's to_html_tree method. More tests fixed.	2007-06-13 23:05:15 -05:00
Jacques Distler	3ca33e52b5	Cleanup Got rid of redcloth_for_tex. Fixed almost all the busted tests.	2007-06-13 01:56:44 -05:00
Jacques Distler	2da672ec5b	Many Minor Fixes Fixed a whole bunch of minor stuff. Had a go at getting some of the plethora of broken tests to pass.	2007-06-12 17:37:55 -05:00
Jacques Distler	a68d1aa8f3	Sanitizer API documentation now online See: http://golem.ph.utexas.edu/~distler/code/rdoc/sanitize/	2007-06-08 23:51:30 -05:00
Jacques Distler	f818238dd3	Consolidation Shuffled around a couple of files.	2007-06-08 22:39:37 -05:00
Jacques Distler	3bf560c3b3	Updated to Latest HTML5lib Synced with latest HTML5lib. Added some RDoc-compatible documentation to the sanitizer.	2007-06-08 17:26:00 -05:00
Jacques Distler	8badd0766a	Enhancements to sanitize.rb Options, options, ... options.	2007-06-08 01:23:09 -05:00
Jacques Distler	0298868573	Fix S5 Unicode Make sure sanitize_xhtml and sanitize_html are set to utf-8 encoding. Also, a stylesheet tweak.	2007-06-07 17:30:42 -05:00
Jacques Distler	e1acebe6e4	Bugfix Me stoopid.	2007-06-05 18:06:26 -05:00
Jacques Distler	f0cf0ec625	Sanitize REXML trees OK. Enabled sanitization of REXML trees instead of strings. My timing tests seem to be erratic. Can't tell whether this is really faster.	2007-06-05 17:13:44 -05:00
Jacques Distler	bd8ba1f4b1	REXML Trees Synced with latest HTML5lib. Added preliminary support (currently disabled) for sanitizing REXML trees.	2007-06-05 16:34:49 -05:00
Jacques Distler	4dd70af5ae	HTML5lib is Back. Synced with latest version of HTML5lib, which fixes problem with Astral plane characters. I should really do some tests, but the HTML5lib Sanitizer seems to be 2-5 times slower than the old sanitizer.	2007-05-30 10:45:52 -05:00
Jacques Distler	e1a6827f1f	Rollback Switch to HTML5lib Apparently, HTML5lib does not handle astral plane unicode characters correctly. Which makes it useless. Return to the previous sanitizer.	2007-05-29 23:57:39 -05:00
Jacques Distler	6b21ac484f	HTML5lib Sanitizer Replaced native Sanitizer with HTML5lib version. Synced with latest Maruku.	2007-05-25 20:52:27 -05:00
Jacques Distler	b0e063451f	Sanitize Tweak Add 'cite' to the list of attributes whose values are URI's.	2007-04-28 02:09:21 -05:00
Jacques Distler	9b55a75570	More SVG Elements and Attributes Added <tspan> and <marker>, as well as a slew of related SVG attributes. Also an SVG-related stylesheet tweak	2007-04-27 21:52:29 -05:00
Jacques Distler	6ca6525ff7	Add another SVG attribute to Sanitize. Add 'stroke-opacity' to list of allowed SVG attributes.	2007-04-20 16:09:55 -05:00
Jacques Distler	0db06a9fa3	To be really XML-safe, don't emit XHTML+MathML named entities. (Ported MathML::Entities to Ruby.)	2007-03-29 03:30:10 -05:00
Jacques Distler	7adac51d6d	Sync with latest Instiki trunk. Changes: 1) Upgrade Rails to 1.2.3 2) Revert RedCloth to previous version (who %#$@ cares?) 3) Preserve the Rails Security fix to vendor/rails/actionpack/lib/action_controller/caching.rb from Revision 80.	2007-03-18 11:56:12 -05:00
Jacques Distler	d74116dc67	Ensure that input is bona fide utf-8.	2007-03-07 21:06:39 -06:00
Jacques Distler	f208d50032	Bah!	2007-02-24 23:07:25 -06:00

72 commits