Fix a Ruby 1.9 Character Encoding Bug
Wow, this stuff is complicated! Some things really want to be UTF-8; others really want to be byte strings.
parent e3832c6f79
commit 34b63a8375
3 changed files with 16 additions and 4 deletions
@@ -30,9 +30,9 @@ class String
   # returns a valid utf-8 string, purged of any subsequences of illegal bytes.
   #--
   def purify
-    text = check_ncrs
-    if text.respond_to?(:encoding)
-      text.split(//).collect{|c| c.as_bytes}.grep(UTF8_REGEX).join.as_utf8
+    text = self.dup.check_ncrs.as_utf8
+    if text.respond_to?(:force_encoding)
+      text.chars.collect{|c| c.as_bytes}.grep(UTF8_REGEX).join.as_utf8
     else
       text.split(//u).grep(UTF8_REGEX).join
     end
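
For readers unfamiliar with the UTF-8 versus byte-string distinction the commit message mentions, here is a minimal sketch of the same idea, assuming Ruby 1.9 or later. The module EncodingSketch and its methods are hypothetical illustrations; they are not the as_bytes, as_utf8, or UTF8_REGEX helpers used in the hunk above, which are defined elsewhere in the repository. The sketch also swaps the regex filter for String#valid_encoding?, a different technique from the one in the commit.

```ruby
# Hypothetical helpers for illustration only; they are NOT the as_bytes,
# as_utf8, or UTF8_REGEX that the hunk above relies on.
module EncodingSketch
  module_function

  # Same bytes, labelled as raw binary data (one "character" per byte).
  def bytes_view(str)
    str.dup.force_encoding(Encoding::BINARY)
  end

  # Same bytes, labelled as UTF-8. No transcoding and no validation happens.
  def utf8_view(str)
    str.dup.force_encoding(Encoding::UTF_8)
  end

  # Split into characters and keep only the well-formed ones.
  # Uses String#valid_encoding? instead of a byte-level regex.
  def purge_invalid(str)
    utf8_view(str).chars.select(&:valid_encoding?).join
  end
end

# "caf\xC3\xA9" is well-formed UTF-8; "\xFF" and the truncated "\xE2\x82" are not.
dirty = "caf\xC3\xA9 \xFFok\xE2\x82"
clean = EncodingSketch.purge_invalid(dirty)

puts dirty.valid_encoding?
puts clean.valid_encoding?
puts clean.length                            # counted in characters
puts EncodingSketch.bytes_view(clean).length # counted in bytes
```

The dup calls matter for the same reason the commit adds self.dup: force_encoding relabels its receiver in place, so working on a copy leaves the caller's string untouched.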