Better algorithm to deal with encodings. Moved fallback rescue message from view to encode library.

This helps fix cases where UTF-8 is wrongly identified as ISO-8859-1. We only try to convert a string when the detector reports 100% confidence in the charset; otherwise, we fall back to UTF-8.
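
A minimal sketch of the rule described above, not the code from this commit. It assumes a hypothetical `to_utf8` helper and an injectable `detector` callable that stands in for `CharlockHolmes::EncodingDetector.detect` and returns a hash with `:encoding` and `:confidence` keys. To stay dependency-free, the sketch converts with Ruby's built-in `String#encode` rather than `CharlockHolmes::Converter`, and it reads the charset from the detection result itself.

```ruby
# Hypothetical helper illustrating "convert only at 100% confidence".
# `detector` is an assumption: any callable that returns a hash such as
# { :encoding => "ISO-8859-1", :confidence => 100 }, or nil.
def to_utf8(message, detector)
  return nil unless message

  # A failing detector is treated the same as "nothing detected".
  detected = begin
    detector.call(message) || {}
  rescue StandardError
    {}
  end

  if detected[:encoding] && detected[:confidence] == 100
    # Fully confident: reinterpret the bytes in the detected charset,
    # then transcode them to UTF-8.
    message.dup.force_encoding(detected[:encoding]).encode("UTF-8")
  else
    # Not sure about the charset: assume the bytes are already UTF-8.
    message.dup.force_encoding("UTF-8")
  end
rescue StandardError
  # Unknown charset names and failed conversions end up here, so the
  # caller always gets a printable string back.
  "--broken encoding: #{detected && detected[:encoding]}"
end
```

With this shape the view can call the helper without its own `rescue`, because every failure path already returns a string.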
Gabriel Mazetto 2012-05-26 20:15:06 -03:00
parent 48a36851e6
commit 50c2c16a4d
2 changed files with 7 additions and 4 deletions


@@ -8,7 +8,7 @@
 %strong.cgray= commit.author_name
 –
 = image_tag gravatar_icon(commit.author_email), :class => "avatar", :width => 16
-%span.row_title= truncate(commit.safe_message, :length => 50) rescue "--broken encoding"
+%span.row_title= truncate(commit.safe_message, :length => 50)
 %span.right.cgray
 = time_ago_in_words(commit.committed_date)


@@ -8,16 +8,19 @@ module Gitlabhq
 def utf8 message
 return nil unless message
-encoding = detect_encoding(message)
-if encoding
+detect = CharlockHolmes::EncodingDetector.detect(message) rescue {}
+# It's better to default to UTF-8 as sometimes it's wrongly detected as another charset
+if detect[:encoding] && detect[:confidence] == 100
 CharlockHolmes::Converter.convert(message, encoding, 'UTF-8')
 else
 message
 end.force_encoding("utf-8")
 # Prevent app from crash cause of
 # encoding errors
 rescue
-""
+"--broken encoding: #{encoding}"
 end
 def detect_encoding message