Synced with the latest HTML5lib. Added preliminary support (currently disabled) for sanitizing REXML trees.
Synced with the latest version of HTML5lib, which fixes a problem with astral-plane characters. I should really run some tests, but the HTML5lib sanitizer seems to be 2-5 times slower than the old sanitizer.