Synced with latest HTML5lib. Added some RDoc-compatible documentation to the sanitizer.
Some more tweaks.
Synced with latest HTML5lib. Added preliminary support (currently disabled) for sanitizing REXML trees.