Fix a Ruby 1.9 Character Encoding Bug

Wow, this stuff is complicated!
Some things really want to be UTF-8;
others really want to be byte strings.
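
A minimal sketch of the distinction the message is pointing at, assuming Ruby 1.9 or later (where `String#force_encoding` exists). The variable names are illustrative and do not come from the commit. The same five bytes are tagged once as a byte string and once as UTF-8, and only the interpretation changes:

```ruby
# Illustrative only: one byte sequence, two encodings.
# Requires Ruby 1.9+ (String#force_encoding does not exist in 1.8).
raw = "caf\xC3\xA9"

bytes = raw.dup.force_encoding("ASCII-8BIT")  # treated as raw bytes
text  = raw.dup.force_encoding("UTF-8")       # treated as UTF-8 characters

byte_len   = bytes.length                     # counts bytes
char_len   = text.length                      # counts characters
same_bytes = (bytes.bytes.to_a == text.bytes.to_a)

puts "bytes: #{byte_len}, chars: #{char_len}, same underlying bytes: #{same_bytes}"
```

`force_encoding` only relabels the string; it never rewrites the bytes. That is why code which matches byte-level patterns and code which works on characters must each be handed the view it expects.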
Jacques Distler 2009-12-01 12:03:15 -06:00
parent e3832c6f79
commit 34b63a8375
3 changed files with 16 additions and 4 deletions

@@ -30,9 +30,9 @@ class String
   # returns a valid utf-8 string, purged of any subsequences of illegal bytes.
   #--
   def purify
-    text = check_ncrs
-    if text.respond_to?(:encoding)
-      text.split(//).collect{|c| c.as_bytes}.grep(UTF8_REGEX).join.as_utf8
+    text = self.dup.check_ncrs.as_utf8
+    if text.respond_to?(:force_encoding)
+      text.chars.collect{|c| c.as_bytes}.grep(UTF8_REGEX).join.as_utf8
     else
       text.split(//u).grep(UTF8_REGEX).join
     end
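
The idea behind the Ruby 1.9 branch above (walk the string character by character and keep only the well-formed ones) can be sketched without the repository's helpers. This is not the author's implementation: `purify_sketch` is a hypothetical name, and it swaps the `UTF8_REGEX` / `as_bytes` / `as_utf8` approach for Ruby's built-in `String#valid_encoding?`, so it may accept characters that the real `UTF8_REGEX` rejects.

```ruby
# Hypothetical stand-in for String#purify; assumes Ruby 1.9 or later.
# Drops every character whose bytes are not well-formed UTF-8.
def purify_sketch(str)
  # Work on a copy so the caller's string keeps its original encoding tag.
  text = str.dup.force_encoding("UTF-8")
  # String#chars yields malformed bytes as separate invalid "characters",
  # so they can be filtered out individually.
  text.chars.select { |c| c.valid_encoding? }.join
end

cleaned = purify_sketch("abc\xFFdef")
puts cleaned
```

Copying before `force_encoding` mirrors the `self.dup` added in the diff: re-tagging the receiver in place would change the encoding of the caller's string as a side effect.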