Add a couple of XSS tests.

Some more tests from Clint Ruoho. The main branch of Instiki (and, I guess,
the old sanitizer) are vulnerable.

Also: under Ruby 1.8.x, CGI.unescapeHTML screws up horribly decoding NCRs
which represent high-bit ASCII characters. UTF-8 agrees with 7-bit ASCII,
but CGI.unescapeHTML doesn't seem to know that they disagree for i>127.
This commit is contained in:
Jacques Distler 2009-01-05 16:25:27 -06:00
parent 8832dd3438
commit 52c1f74ecc
4 changed files with 90 additions and 3 deletions

View file

@ -133,7 +133,7 @@ module Sanitizer
if node.attributes
node.attributes.delete_if { |attr,v| !ALLOWED_ATTRIBUTES.include?(attr) }
ATTR_VAL_IS_URI.each do |attr|
val_unescaped = CGI.unescapeHTML(node.attributes[attr].to_s).gsub(/`|[\000-\040\177\s\200-\240]/,'').downcase
val_unescaped = node.attributes[attr].to_s.unescapeHTML.gsub(/`|[\000-\040\177\s]+|\302[\200-\240]/,'').downcase
if val_unescaped =~ /^[a-z0-9][-+.a-z0-9]*:/ and !ALLOWED_PROTOCOLS.include?(val_unescaped.split(':')[0])
node.attributes.delete attr
end