XSS Vulnerability in Ruby on Rails
Received: by 10.204.36.197 with SMTP id u5mr400180bkd.25.1252020860295;
Thu, 03 Sep 2009 16:34:20 -0700 (PDT)
Received: by 10.204.36.197 with SMTP id u5mr400179bkd.25.1252020860247;
Thu, 03 Sep 2009 16:34:20 -0700 (PDT)
Return-Path: <koziar...@gmail.com>
Received: from mail-bw0-f222.google.com (mail-bw0-f222.google.com [209.85.218.222])
by gmr-mx.google.com with ESMTP id 16si48541fxm.2.2009.09.03.16.34.19;
Thu, 03 Sep 2009 16:34:19 -0700 (PDT)
Received-SPF: pass (google.com: domain of koziar...@gmail.com designates 209.85.218.222 as permitted sender) client-ip=209.85.218.222;
Authentication-Results: gmr-mx.google.com; spf=pass (google.com: domain of koziar...@gmail.com designates 209.85.218.222 as permitted sender) smtp.mail=koziar...@gmail.com; dkim=pass (test mode) header...@gmail.com
Received: by bwz22 with SMTP id 22so309330bwz.9
for <rubyonrails-security@googlegroups.com>; Thu, 03 Sep 2009 16:34:18 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=gmail.com; s=gamma;
h=domainkey-signature:received:received:sender:message-id:date:from
:user-agent:mime-version:to:subject:x-enigmail-version:openpgp
:content-type;
bh=7jQxVC0+O3K/heHEnb+6DzTFsi941DvQlDVa3xe1BeA=;
b=oT+Cgf9DVLxYNgBjr2I4mw6QPleuj6FE+C+0kdIZl42LnGtyip9SLmLq3b4162gHJh
O+/ybsTkmpkrDmw63gvzv34N9qTFq7FpELocsGzjT412evw5Wh28MqsLm97YrDMRA+V0
qnuziZtnmJpEXn5OA7fFjpRG8KwEafr01/3E4=
DomainKey-Signature: a=rsa-sha1; c=nofws;
d=gmail.com; s=gamma;
h=sender:message-id:date:from:user-agent:mime-version:to:subject
:x-enigmail-version:openpgp:content-type;
b=YOSYAGUYw0g2shlQ5VJe1ryDjdft77TGPOxaHqboCNsBRhqHocwVTNh7D7KmEU0m2u
V9oerylBYk18pv+hFfkxpEko/YHCXw5wrM+8J+XquZORQBx3sKgXVS55gL+nrGsuZZFv
S/Zj3H5/xC3xw64kcDO66PR2oQ3I8p0QrNWBc=
Received: by 10.102.13.21 with SMTP id 21mr4448116mum.100.1252020858792;
Thu, 03 Sep 2009 16:34:18 -0700 (PDT)
Return-Path: <koziar...@gmail.com>
Received: from Macintosh-4.local (158.96.124.202.static.snap.net.nz [202.124.96.158])
by mx.google.com with ESMTPS id y6sm897438mug.40.2009.09.03.16.34.12
(version=SSLv3 cipher=RC4-MD5);
Thu, 03 Sep 2009 16:34:17 -0700 (PDT)
Sender: Michael Koziarski <koziar...@gmail.com>
Message-ID: <4AA05271.9060...@koziarski.com>
Date: Fri, 04 Sep 2009 11:34:09 +1200
From: Michael Koziarski <mich...@koziarski.com>
User-Agent: Thunderbird 2.0.0.23 (Macintosh/20090812)
MIME-Version: 1.0
To: rubyonrails-security@googlegroups.com
Subject: XSS Vulnerability in Ruby on Rails
X-Enigmail-Version: 0.96.0
OpenPGP: id=10F695F3
Content-Type: multipart/mixed;
boundary="------------080309000404080903030209"
This is a multi-part message in MIME format.
--------------080309000404080903030209
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
XSS Vulnerability in Ruby on Rails
==================================
There is a vulnerability in the escaping code for the form helpers in
Ruby on Rails. Attackers who can inject deliberately malformed Unicode
strings into the form helpers can defeat the escaping checks and inject
arbitrary HTML.
Versions Affected: 2.0.0 and *all* subsequent versions.
Not affected: Applications running on Ruby 1.9
Fixed Versions: 2.3.4, 2.2.3
Candidate CVE: CVE-2009-3009
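As a rough illustration of what "malformed" means here, the following sketch checks whether a byte string is well-formed UTF-8. It relies on `String#valid_encoding?`, which is only available on Ruby 1.9 and later (the same method the 2.2 and 2.3 patches call on 1.9). The helper name `well_formed_utf8?` and the sample bytes are assumptions made for this example; they are not taken from the vulnerability report.

```ruby
# Hypothetical helper, not part of Rails: reports whether the given bytes
# form valid UTF-8. Requires Ruby 1.9 or later for String#valid_encoding?.
def well_formed_utf8?(input)
  input.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

# A well-formed string: plain ASCII is always valid UTF-8.
good = 'Accept & Checkout'

# An illustrative malformed string (assumed example): 0xE3 announces a
# three-byte character, but the next byte is a double quote instead of a
# continuation byte.
bad = [0xE3, 0x22].pack('C*')

puts well_formed_utf8?(good)  # prints true
puts well_formed_utf8?(bad)   # prints false
```

A check like this only detects the problem; the patches attached to this message go further and strip the offending bytes before escaping.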
Impact
------
Because most databases either reject or actively cleanse malformed
Unicode strings, this vulnerability is most likely to be exploited by
non-persistent attacks. However, persistent attacks may still be
possible in some configurations.
*All* users of affected versions are advised to upgrade to a fixed version.
Releases
--------
The 2.3.4 and 2.2.3 releases, which contain fixes for this issue
amongst others, will be made available later today and tomorrow.
Patches
-------
To provide the fixes for users who are running unsupported releases,
or who are unable to upgrade at present, we have provided patches
against all affected stable release branches.
The patches are in a format suitable for git-am and consist of two
changesets: the code for cleansing multi-byte strings, and the
introduction of that code to the relevant helpers.
* 2-0-CVE-2009-3009.patch - Patch for 2.0 series
* 2-1-CVE-2009-3009.patch - Patch for 2.1 series
* 2-2-CVE-2009-3009.patch - Patch for 2.2 series
* 2-3-CVE-2009-3009.patch - Patch for 2.3 series
Please note that only the 2.2.x and 2.3.x series are supported at
present. Users of earlier unsupported releases are advised to upgrade
sooner rather than later, as we cannot guarantee that fixes for future
issues will be backported in this manner.
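The two changesets described in this section (clean the string, then escape it) can be sketched roughly as follows. This is a simplified illustration for a current Ruby, not the patched code: `clean_sketch`, `escape_once_sketch` and `HTML_ESCAPES` are hypothetical names, and `String#scrub` (Ruby 2.1 and later) stands in for the per-character regular expression check that the patches use on Ruby 1.8.

```ruby
# Hypothetical stand-ins for the patched helpers. Names and structure are
# assumptions for illustration, not the code shipped in the patches.
HTML_ESCAPES = { '&' => '&amp;', '>' => '&gt;', '<' => '&lt;', '"' => '&quot;' }.freeze

# Step 1: drop bytes that do not form valid UTF-8 characters.
# String#scrub replaces each invalid byte sequence, here with nothing.
def clean_sketch(string)
  string.dup.force_encoding(Encoding::UTF_8).scrub('')
end

# Step 2: escape the cleaned string, leaving existing entities untouched.
def escape_once_sketch(html)
  clean_sketch(html.to_s).gsub(/["><]|&(?!([a-zA-Z]+|(#\d+));)/) do |special|
    HTML_ESCAPES[special]
  end
end

# Assumed sample input: "x", a stray UTF-8 lead byte, a double quote and ">".
malformed = [0x78, 0xE3, 0x22, 0x3E].pack('C*')

puts escape_once_sketch(malformed)        # prints x&quot;&gt;
puts escape_once_sketch('1 < 2 &amp; 3')  # prints 1 &lt; 2 &amp; 3
```

Cleaning before escaping matters because the escaping step works character by character; once the stray byte is gone, the quote that follows it is seen and escaped as usual.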
Credits
-------
Thanks to Brian Mastenbrook for reporting the vulnerability to us, and
Manfred Stienstra from Fingertips for his work with us on the fix.
--
Cheers,
Koz
--------------080309000404080903030209
Content-Type: text/plain; x-mac-type="0"; x-mac-creator="0";
name="2-1-CVE-2009-3009.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="2-1-CVE-2009-3009.patch"
>From e3db21fe4f54539be7fc212167553665970a955f Mon Sep 17 00:00:00 2001
From: Manfred Stienstra <manf...@fngtps.com>
Date: Tue, 1 Sep 2009 20:16:11 +0200
Subject: [PATCH] Add methods for string verification and encoding cleanup code.
Signed-off-by: Michael Koziarski <mich...@koziarski.com>
---
activesupport/lib/active_support/multibyte.rb | 18 ++++
.../multibyte/handlers/utf8_handler.rb | 13 +--
.../lib/active_support/multibyte/utils.rb | 39 +++++++
activesupport/test/multibyte_utils_test.rb | 106 ++++++++++++++++++++
4 files changed, 165 insertions(+), 11 deletions(-)
create mode 100644 activesupport/lib/active_support/multibyte/utils.rb
create mode 100644 activesupport/test/multibyte_utils_test.rb
diff --git a/activesupport/lib/active_support/multibyte.rb b/activesupport/lib/active_support/multibyte.rb
index 27c0d18..f76cfba 100644
--- a/activesupport/lib/active_support/multibyte.rb
+++ b/activesupport/lib/active_support/multibyte.rb
@@ -3,7 +3,25 @@ module ActiveSupport
DEFAULT_NORMALIZATION_FORM = :kc
NORMALIZATIONS_FORMS = [:c, :kc, :d, :kd]
UNICODE_VERSION = '5.0.0'
+
+ # Regular expressions that describe valid byte sequences for a character
+ VALID_CHARACTER = {
+ # Borrowed from the Kconv library by Shinji KONO - (also as seen on the W3C site)
+ 'UTF-8' => /\A(?:
+ [\x00-\x7f] |
+ [\xc2-\xdf] [\x80-\xbf] |
+ \xe0 [\xa0-\xbf] [\x80-\xbf] |
+ [\xe1-\xef] [\x80-\xbf] [\x80-\xbf] |
+ \xf0 [\x90-\xbf] [\x80-\xbf] [\x80-\xbf] |
+ [\xf1-\xf3] [\x80-\xbf] [\x80-\xbf] [\x80-\xbf] |
+ \xf4 [\x80-\x8f] [\x80-\xbf] [\x80-\xbf])\z /xn,
+ # Quick check for valid Shift-JIS characters, disregards the odd-even pairing
+ 'Shift_JIS' => /\A(?:
+ [\x00-\x7e \xa1-\xdf] |
+ [\x81-\x9f \xe0-\xef] [\x40-\x7e \x80-\x9e \x9f-\xfc])\z /xn
+ }
end
end
require 'active_support/multibyte/chars'
+require 'active_support/multibyte/utils'
\ No newline at end of file
diff --git a/activesupport/lib/active_support/multibyte/handlers/utf8_handler.rb b/activesupport/lib/active_support/multibyte/handlers/utf8_handler.rb
index aa9c16f..2bbb2fa 100644
--- a/activesupport/lib/active_support/multibyte/handlers/utf8_handler.rb
+++ b/activesupport/lib/active_support/multibyte/handlers/utf8_handler.rb
@@ -100,16 +100,7 @@ module ActiveSupport::Multibyte::Handlers #:nodoc:
# between little and big endian. This is not an issue in utf-8, so it must be ignored.
UNICODE_LEADERS_AND_TRAILERS = UNICODE_WHITESPACE + [65279] # ZERO-WIDTH NO-BREAK SPACE aka BOM
- # Borrowed from the Kconv library by Shinji KONO - (also as seen on the W3C site)
- UTF8_PAT = /\A(?:
- [\x00-\x7f] |
- [\xc2-\xdf] [\x80-\xbf] |
- \xe0 [\xa0-\xbf] [\x80-\xbf] |
- [\xe1-\xef] [\x80-\xbf] [\x80-\xbf] |
- \xf0 [\x90-\xbf] [\x80-\xbf] [\x80-\xbf] |
- [\xf1-\xf3] [\x80-\xbf] [\x80-\xbf] [\x80-\xbf] |
- \xf4 [\x80-\x8f] [\x80-\xbf] [\x80-\xbf]
- )*\z/xn
+ UTF8_PAT = ActiveSupport::Multibyte::VALID_CHARACTER['UTF-8']
# Returns a regular expression pattern that matches the passed Unicode codepoints
def self.codepoints_to_pattern(array_of_codepoints) #:nodoc:
@@ -357,7 +348,7 @@ module ActiveSupport::Multibyte::Handlers #:nodoc:
# Replaces all the non-utf-8 bytes by their iso-8859-1 or cp1252 equivalent resulting in a valid utf-8 string
def tidy_bytes(str)
str.split(//u).map do |c|
- if !UTF8_PAT.match(c)
+ if !ActiveSupport::Multibyte::VALID_CHARACTER['UTF-8'].match(c)
n = c.unpack('C')[0]
n < 128 ? n.chr :
n < 160 ? [UCD.cp1252[n] || n].pack('U') :
diff --git a/activesupport/lib/active_support/multibyte/utils.rb b/activesupport/lib/active_support/multibyte/utils.rb
new file mode 100644
index 0000000..094e856
--- /dev/null
+++ b/activesupport/lib/active_support/multibyte/utils.rb
@@ -0,0 +1,39 @@
+module ActiveSupport #:nodoc:
+ module Multibyte #:nodoc:
+ # Returns a regular expression that matches valid characters in the current encoding
+ def self.valid_character
+ case $KCODE
+ when 'UTF8'
+ VALID_CHARACTER['UTF-8']
+ when 'SJIS'
+ VALID_CHARACTER['Shift_JIS']
+ end
+ end
+
+ # Verifies the encoding of a string
+ def self.verify(string)
+ if expression = valid_character
+ for c in string.split(//)
+ return false unless valid_character.match(c)
+ end
+ end
+ true
+ end
+
+ # Verifies the encoding of the string and raises an exception when it's not valid
+ def self.verify!(string)
+ raise ActiveSupport::Multibyte::Handlers::EncodingError.new("Found characters with invalid encoding") unless verify(string)
+ end
+
+ # Removes all invalid characters from the string
+ def self.clean(string)
+ if expression = valid_character
+ stripped = []; for c in string.split(//)
+ stripped << c if valid_character.match(c)
+ end; stripped.join
+ else
+ string
+ end
+ end
+ end
+end
\ No newline at end of file
diff --git a/activesupport/test/multibyte_utils_test.rb b/activesupport/test/multibyte_utils_test.rb
new file mode 100644
index 0000000..a4bcfc8
--- /dev/null
+++ b/activesupport/test/multibyte_utils_test.rb
@@ -0,0 +1,106 @@
+require 'abstract_unit'
+
+class MultibyteUtilsTest < Test::Unit::TestCase
+
+ def test_valid_character_returns_an_expression_for_the_current_encoding
+ with_kcode('None') do
+ assert_nil ActiveSupport::Multibyte.valid_character
+ end
+ with_kcode('UTF8') do
+ assert_equal ActiveSupport::Multibyte::VALID_CHARACTER['UTF-8'], ActiveSupport::Multibyte.valid_character
+ end
+ with_kcode('SJIS') do
+ assert_equal ActiveSupport::Multibyte::VALID_CHARACTER['Shift_JIS'], ActiveSupport::Multibyte.valid_character
+ end
+ end
+
+ def test_verify_verifies_ASCII_strings_are_properly_encoded
+ with_kcode('None') do
+ examples.each do |example|
+ assert ActiveSupport::Multibyte.verify(example)
+ end
+ end
+ end
+
+ def test_verify_verifies_UTF_8_strings_are_properly_encoded
+ with_kcode('UTF8') do
+ assert ActiveSupport::Multibyte.verify(example('valid UTF-8'))
+ assert !ActiveSupport::Multibyte.verify(example('invalid UTF-8'))
+ end
+ end
+
+ def test_verify_verifies_Shift_JIS_strings_are_properly_encoded
+ with_kcode('SJIS') do
+ assert ActiveSupport::Multibyte.verify(example('valid Shift-JIS'))
+ assert !ActiveSupport::Multibyte.verify(example('invalid Shift-JIS'))
+ end
+ end
+
+ def test_verify_bang_raises_an_exception_when_it_finds_an_invalid_character
+ with_kcode('UTF8') do
+ assert_raises(ActiveSupport::Multibyte::Handlers::EncodingError) do
+ ActiveSupport::Multibyte.verify!(example('invalid UTF-8'))
+ end
+ end
+ end
+
+ def test_verify_bang_doesnt_raise_an_exception_when_the_encoding_is_valid
+ with_kcode('UTF8') do
+ assert_nothing_raised do
+ ActiveSupport::Multibyte.verify!(example('valid UTF-8'))
+ end
+ end
+ end
+
+ def test_clean_leaves_ASCII_strings_intact
+ with_kcode('None') do
+ [
+ 'word', "\270\236\010\210\245"
+ ].each do |string|
+ assert_equal string, ActiveSupport::Multibyte.clean(string)
+ end
+ end
+ end
+
+ def test_clean_cleans_invalid_characters_from_UTF_8_encoded_strings
+ with_kcode('UTF8') do
+ cleaned_utf8 = [8].pack('C*')
+ assert_equal example('valid UTF-8'), ActiveSupport::Multibyte.clean(example('valid UTF-8'))
+ assert_equal cleaned_utf8, ActiveSupport::Multibyte.clean(example('invalid UTF-8'))
+ end
+ end
+
+ def test_clean_cleans_invalid_characters_from_Shift_JIS_encoded_strings
+ with_kcode('SJIS') do
+ cleaned_sjis = [184, 0, 136, 165].pack('C*')
+ assert_equal example('valid Shift-JIS'), ActiveSupport::Multibyte.clean(example('valid Shift-JIS'))
+ assert_equal cleaned_sjis, ActiveSupport::Multibyte.clean(example('invalid Shift-JIS'))
+ end
+ end
+
+ private
+
+ STRINGS = {
+ 'valid ASCII' => [65, 83, 67, 73, 73].pack('C*'),
+ 'invalid ASCII' => [128].pack('C*'),
+ 'valid UTF-8' => [227, 129, 147, 227, 129, 171, 227, 129, 161, 227, 130, 143].pack('C*'),
+ 'invalid UTF-8' => [184, 158, 8, 136, 165].pack('C*'),
+ 'valid Shift-JIS' => [131, 122, 129, 91, 131, 128].pack('C*'),
+ 'invalid Shift-JIS' => [184, 158, 8, 0, 255, 136, 165].pack('C*')
+ }
+
+ def example(key)
+ STRINGS[key]
+ end
+
+ def examples
+ STRINGS.values
+ end
+
+ def with_kcode(code)
+ before = $KCODE
+ $KCODE = code
+ yield
+ $KCODE = before
+ end
+end
\ No newline at end of file
--
1.6.0.1
>From 9af2823b32e001358babde7644e5cc1c0ec29d6e Mon Sep 17 00:00:00 2001
From: Michael Koziarski <mich...@koziarski.com>
Date: Mon, 31 Aug 2009 12:07:30 -0700
Subject: [PATCH] Clean tag attributes before passing through the escape_once logic.
Addresses CVE-2009-3009
---
actionpack/lib/action_view/helpers/tag_helper.rb | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/actionpack/lib/action_view/helpers/tag_helper.rb b/actionpack/lib/action_view/helpers/tag_helper.rb
index ba43b5e..623b8f7 100644
--- a/actionpack/lib/action_view/helpers/tag_helper.rb
+++ b/actionpack/lib/action_view/helpers/tag_helper.rb
@@ -101,7 +101,7 @@ module ActionView
# escape_once("&lt;&lt; Accept & Checkout")
# # => "&lt;&lt; Accept &amp; Checkout"
def escape_once(html)
- html.to_s.gsub(/[\"><]|&(?!([a-zA-Z]+|(#\d+));)/) { |special| ERB::Util::HTML_ESCAPE[special] }
+ ActiveSupport::Multibyte.clean(html.to_s).gsub(/[\"><]|&(?!([a-zA-Z]+|(#\d+));)/) { |special| ERB::Util::HTML_ESCAPE[special] }
end
private
--
1.6.0.1
--------------080309000404080903030209
Content-Type: text/plain; x-mac-type="0"; x-mac-creator="0";
name="2-2-CVE-2009-3009.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="2-2-CVE-2009-3009.patch"
>From 9c61eb32c534c706815f80eb375012cdcf854e71 Mon Sep 17 00:00:00 2001
From: Michael Koziarski <mich...@koziarski.com>
Date: Mon, 31 Aug 2009 12:20:46 -0700
Subject: [PATCH] Add verify and clean methods to ActiveSupport::Multibyte.
When accepting character input from outside of your application you can't
blindly trust that all strings are properly encoded. With these methods
you can check incoming strings and clean them up if necessary.
Signed-off-by: Michael Koziarski <mich...@koziarski.com>
Conflicts:
activesupport/lib/active_support/multibyte/chars.rb
---
activesupport/lib/active_support/multibyte.rb | 36 ++++-
.../lib/active_support/multibyte/chars.rb | 25 ++---
.../lib/active_support/multibyte/utils.rb | 61 +++++++++
activesupport/test/multibyte_utils_test.rb | 141 ++++++++++++++++++++
4 files changed, 241 insertions(+), 22 deletions(-)
create mode 100644 activesupport/lib/active_support/multibyte/utils.rb
create mode 100644 activesupport/test/multibyte_utils_test.rb
diff --git a/activesupport/lib/active_support/multibyte.rb b/activesupport/lib/active_support/multibyte.rb
index 65a96af..b6354ee 100644
--- a/activesupport/lib/active_support/multibyte.rb
+++ b/activesupport/lib/active_support/multibyte.rb
@@ -1,9 +1,5 @@
# encoding: utf-8
-require 'active_support/multibyte/chars'
-require 'active_support/multibyte/exceptions'
-require 'active_support/multibyte/unicode_database'
-
module ActiveSupport #:nodoc:
module Multibyte
# A list of all available normalization forms. See http://www.unicode.org/reports/tr15/tr15-29.html for more
@@ -27,7 +23,35 @@ module ActiveSupport #:nodoc:
#
# Example:
# ActiveSupport::Multibyte.proxy_class = CharsForUTF32
- mattr_accessor :proxy_class
- self.proxy_class = ActiveSupport::Multibyte::Chars
+ def self.proxy_class=(klass)
+ @proxy_class = klass
+ end
+
+ # Returns the current proxy class
+ def self.proxy_class
+ @proxy_class ||= ActiveSupport::Multibyte::Chars
+ end
+
+ # Regular expressions that describe valid byte sequences for a character
+ VALID_CHARACTER = {
+ # Borrowed from the Kconv library by Shinji KONO - (also as seen on the W3C site)
+ 'UTF-8' => /\A(?:
+ [\x00-\x7f] |
+ [\xc2-\xdf] [\x80-\xbf] |
+ \xe0 [\xa0-\xbf] [\x80-\xbf] |
+ [\xe1-\xef] [\x80-\xbf] [\x80-\xbf] |
+ \xf0 [\x90-\xbf] [\x80-\xbf] [\x80-\xbf] |
+ [\xf1-\xf3] [\x80-\xbf] [\x80-\xbf] [\x80-\xbf] |
+ \xf4 [\x80-\x8f] [\x80-\xbf] [\x80-\xbf])\z /xn,
+ # Quick check for valid Shift-JIS characters, disregards the odd-even pairing
+ 'Shift_JIS' => /\A(?:
+ [\x00-\x7e \xa1-\xdf] |
+ [\x81-\x9f \xe0-\xef] [\x40-\x7e \x80-\x9e \x9f-\xfc])\z /xn
+ }
end
end
+
+require 'active_support/multibyte/chars'
+require 'active_support/multibyte/exceptions'
+require 'active_support/multibyte/unicode_database'
+require 'active_support/multibyte/utils'
diff --git a/activesupport/lib/active_support/multibyte/chars.rb b/activesupport/lib/active_support/multibyte/chars.rb
index a00b165..5199bf9 100644
--- a/activesupport/lib/active_support/multibyte/chars.rb
+++ b/activesupport/lib/active_support/multibyte/chars.rb
@@ -73,16 +73,7 @@ module ActiveSupport #:nodoc:
UNICODE_TRAILERS_PAT = /(#{codepoints_to_pattern(UNICODE_LEADERS_AND_TRAILERS)})+\Z/
UNICODE_LEADERS_PAT = /\A(#{codepoints_to_pattern(UNICODE_LEADERS_AND_TRAILERS)})+/
- # Borrowed from the Kconv library by Shinji KONO - (also as seen on the W3C site)
- UTF8_PAT = /\A(?:
- [\x00-\x7f] |
- [\xc2-\xdf] [\x80-\xbf] |
- \xe0 [\xa0-\xbf] [\x80-\xbf] |
- [\xe1-\xef] [\x80-\xbf] [\x80-\xbf] |
- \xf0 [\x90-\xbf] [\x80-\xbf] [\x80-\xbf] |
- [\xf1-\xf3] [\x80-\xbf] [\x80-\xbf] [\x80-\xbf] |
- \xf4 [\x80-\x8f] [\x80-\xbf] [\x80-\xbf]
- )*\z/xn
+ UTF8_PAT = ActiveSupport::Multibyte::VALID_CHARACTER['UTF-8']
attr_reader :wrapped_string
alias to_s wrapped_string
@@ -292,23 +283,23 @@ module ActiveSupport #:nodoc:
def rstrip
chars(@wrapped_string.gsub(UNICODE_TRAILERS_PAT, ''))
end
-
+
# Strips entire range of Unicode whitespace from the left of the string.
def lstrip
chars(@wrapped_string.gsub(UNICODE_LEADERS_PAT, ''))
end
-
+
# Strips entire range of Unicode whitespace from the right and left of the string.
def strip
rstrip.lstrip
end
-
+
# Returns the number of codepoints in the string
def size
self.class.u_unpack(@wrapped_string).size
end
alias_method :length, :size
-
+
# Reverses all characters in the string.
#
# Example:
@@ -316,7 +307,7 @@ module ActiveSupport #:nodoc:
def reverse
chars(self.class.u_unpack(@wrapped_string).reverse.pack('U*'))
end
-
+
# Implements Unicode-aware slice with codepoints. Slicing on one point returns the codepoints for that
# character.
#
@@ -617,7 +608,9 @@ module ActiveSupport #:nodoc:
# Replaces all ISO-8859-1 or CP1252 characters by their UTF-8 equivalent resulting in a valid UTF-8 string.
def tidy_bytes(string)
string.split(//u).map do |c|
- if !UTF8_PAT.match(c)
+ c.force_encoding(Encoding::ASCII) if c.respond_to?(:force_encoding)
+
+ if !ActiveSupport::Multibyte::VALID_CHARACTER['UTF-8'].match(c)
n = c.unpack('C')[0]
n < 128 ? n.chr :
n < 160 ? [UCD.cp1252[n] || n].pack('U') :
diff --git a/activesupport/lib/active_support/multibyte/utils.rb b/activesupport/lib/active_support/multibyte/utils.rb
new file mode 100644
index 0000000..acef84d
--- /dev/null
+++ b/activesupport/lib/active_support/multibyte/utils.rb
@@ -0,0 +1,61 @@
+# encoding: utf-8
+
+module ActiveSupport #:nodoc:
+ module Multibyte #:nodoc:
+ if Kernel.const_defined?(:Encoding)
+ # Returns a regular expression that matches valid characters in the current encoding
+ def self.valid_character
+ VALID_CHARACTER[Encoding.default_internal.to_s]
+ end
+ else
+ def self.valid_character
+ case $KCODE
+ when 'UTF8'
+ VALID_CHARACTER['UTF-8']
+ when 'SJIS'
+ VALID_CHARACTER['Shift_JIS']
+ end
+ end
+ end
+
+ if 'string'.respond_to?(:valid_encoding?)
+ # Verifies the encoding of a string
+ def self.verify(string)
+ string.valid_encoding?
+ end
+ else
+ def self.verify(string)
+ if expression = valid_character
+ for c in string.split(//)
+ return false unless valid_character.match(c)
+ end
+ end
+ true
+ end
+ end
+
+ # Verifies the encoding of the string and raises an exception when it's not valid
+ def self.verify!(string)
+ raise EncodingError.new("Found characters with invalid encoding") unless verify(string)
+ end
+
+ if 'string'.respond_to?(:force_encoding)
+ # Removes all invalid characters from the string.
+ #
+ # Note: this method is a no-op in Ruby 1.9
+ def self.clean(string)
+ string
+ end
+ else
+ def self.clean(string)
+ if expression = valid_character
+ stripped = []; for c in string.split(//)
+ stripped << c if valid_character.match(c)
+ end; stripped.join
+ else
+ string
+ end
+ end
+ end
+ end
+end
\ No newline at end of file
diff --git a/activesupport/test/multibyte_utils_test.rb b/activesupport/test/multibyte_utils_test.rb
new file mode 100644
index 0000000..d8ac5ff
--- /dev/null
+++ b/activesupport/test/multibyte_utils_test.rb
@@ -0,0 +1,141 @@
+# encoding: utf-8
+
+require 'abstract_unit'
+require 'multibyte_test_helpers'
+
+class MultibyteUtilsTest < ActiveSupport::TestCase
+ include MultibyteTestHelpers
+
+ test "valid_character returns an expression for the current encoding" do
+ with_encoding('None') do
+ assert_nil ActiveSupport::Multibyte.valid_character
+ end
+ with_encoding('UTF8') do
+ assert_equal ActiveSupport::Multibyte::VALID_CHARACTER['UTF-8'], ActiveSupport::Multibyte.valid_character
+ end
+ with_encoding('SJIS') do
+ assert_equal ActiveSupport::Multibyte::VALID_CHARACTER['Shift_JIS'], ActiveSupport::Multibyte.valid_character
+ end
+ end
+
+ test "verify verifies ASCII strings are properly encoded" do
+ with_encoding('None') do
+ examples.each do |example|
+ assert ActiveSupport::Multibyte.verify(example)
+ end
+ end
+ end
+
+ test "verify verifies UTF-8 strings are properly encoded" do
+ with_encoding('UTF8') do
+ assert ActiveSupport::Multibyte.verify(example('valid UTF-8'))
+ assert !ActiveSupport::Multibyte.verify(example('invalid UTF-8'))
+ end
+ end
+
+ test "verify verifies Shift-JIS strings are properly encoded" do
+ with_encoding('SJIS') do
+ assert ActiveSupport::Multibyte.verify(example('valid Shift-JIS'))
+ assert !ActiveSupport::Multibyte.verify(example('invalid Shift-JIS'))
+ end
+ end
+
+ test "verify! raises an exception when it finds an invalid character" do
+ with_encoding('UTF8') do
+ assert_raises(ActiveSupport::Multibyte::EncodingError) do
+ ActiveSupport::Multibyte.verify!(example('invalid UTF-8'))
+ end
+ end
+ end
+
+ test "verify! doesn't raise an exception when the encoding is valid" do
+ with_encoding('UTF8') do
+ assert_nothing_raised do
+ ActiveSupport::Multibyte.verify!(example('valid UTF-8'))
+ end
+ end
+ end
+
+ if RUBY_VERSION < '1.9'
+ test "clean leaves ASCII strings intact" do
+ with_encoding('None') do
+ [
+ 'word', "\270\236\010\210\245"
+ ].each do |string|
+ assert_equal string, ActiveSupport::Multibyte.clean(string)
+ end
+ end
+ end
+
+ test "clean cleans invalid characters from UTF-8 encoded strings" do
+ with_encoding('UTF8') do
+ cleaned_utf8 = [8].pack('C*')
+ assert_equal example('valid UTF-8'), ActiveSupport::Multibyte.clean(example('valid UTF-8'))
+ assert_equal cleaned_utf8, ActiveSupport::Multibyte.clean(example('invalid UTF-8'))
+ end
+ end
+
+ test "clean cleans invalid characters from Shift-JIS encoded strings" do
+ with_encoding('SJIS') do
+ cleaned_sjis = [184, 0, 136, 165].pack('C*')
+ assert_equal example('valid Shift-JIS'), ActiveSupport::Multibyte.clean(example('valid Shift-JIS'))
+ assert_equal cleaned_sjis, ActiveSupport::Multibyte.clean(example('invalid Shift-JIS'))
+ end
+ end
+ else
+ test "clean is a no-op" do
+ with_encoding('UTF8') do
+ assert_equal example('invalid Shift-JIS'), ActiveSupport::Multibyte.clean(example('invalid Shift-JIS'))
+ end
+ end
+ end
+
+ private
+
+ STRINGS = {
+ 'valid ASCII' => [65, 83, 67, 73, 73].pack('C*'),
+ 'invalid ASCII' => [128].pack('C*'),
+ 'valid UTF-8' => [227, 129, 147, 227, 129, 171, 227, 129, 161, 227, 130, 143].pack('C*'),
+ 'invalid UTF-8' => [184, 158, 8, 136, 165].pack('C*'),
+ 'valid Shift-JIS' => [131, 122, 129, 91, 131, 128].pack('C*'),
+ 'invalid Shift-JIS' => [184, 158, 8, 0, 255, 136, 165].pack('C*')
+ }
+
+ if Kernel.const_defined?(:Encoding)
+ def example(key)
+ STRINGS[key].force_encoding(Encoding.default_internal)
+ end
+
+ def examples
+ STRINGS.values.map { |s| s.force_encoding(Encoding.default_internal) }
+ end
+ else
+ def example(key)
+ STRINGS[key]
+ end
+
+ def examples
+ STRINGS.values
+ end
+ end
+
+ if 'string'.respond_to?(:encoding)
+ def with_encoding(enc)
+ before = Encoding.default_internal
+
+ case enc
+ when 'UTF8'
+ Encoding.default_internal = Encoding::UTF_8
+ when 'SJIS'
+ Encoding.default_internal = Encoding::Shift_JIS
+ else
+ Encoding.default_internal = Encoding::BINARY
+ end
+ yield
+
+ Encoding.default_internal = before
+ end
+ else
+ alias with_encoding with_kcode
+ end
+end
\ No newline at end of file
--
1.6.0.1
>From 31678df21276f0a986a4e39a69a4c10a2236a2ce Mon Sep 17 00:00:00 2001
From: Michael Koziarski <mich...@koziarski.com>
Date: Mon, 31 Aug 2009 12:07:30 -0700
Subject: [PATCH] Clean tag attributes before passing through the escape_once logic.
Addresses CVE-2009-3009
---
actionpack/lib/action_view/helpers/tag_helper.rb | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/actionpack/lib/action_view/helpers/tag_helper.rb b/actionpack/lib/action_view/helpers/tag_helper.rb
index 1c8d2db..54a9df4 100644
--- a/actionpack/lib/action_view/helpers/tag_helper.rb
+++ b/actionpack/lib/action_view/helpers/tag_helper.rb
@@ -104,7 +104,7 @@ module ActionView
# escape_once("&lt;&lt; Accept & Checkout")
# # => "&lt;&lt; Accept &amp; Checkout"
def escape_once(html)
- html.to_s.gsub(/[\"><]|&(?!([a-zA-Z]+|(#\d+));)/) { |special| ERB::Util::HTML_ESCAPE[special] }
+ ActiveSupport::Multibyte.clean(html.to_s).gsub(/[\"><]|&(?!([a-zA-Z]+|(#\d+));)/) { |special| ERB::Util::HTML_ESCAPE[special] }
end
private
--
1.6.0.1
--------------080309000404080903030209
Content-Type: text/plain; x-mac-type="0"; x-mac-creator="0";
name="2-3-CVE-2009-3009.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="2-3-CVE-2009-3009.patch"
>From b4115148c5175c986f5c21144b0e55c3ed7d9d0c Mon Sep 17 00:00:00 2001
From: Manfred Stienstra <manf...@fngtps.com>
Date: Wed, 26 Aug 2009 22:38:10 +0200
Subject: [PATCH] Add verify and clean methods to ActiveSupport::Multibyte.
When accepting character input from outside of your application you can't
blindly trust that all strings are properly encoded. With these methods
you can check incoming strings and clean them up if necessary.
Signed-off-by: Michael Koziarski <mich...@koziarski.com>
---
activesupport/lib/active_support/multibyte.rb | 36 ++++-
.../lib/active_support/multibyte/chars.rb | 23 +---
.../lib/active_support/multibyte/utils.rb | 61 +++++++++
activesupport/test/multibyte_utils_test.rb | 141 ++++++++++++++++++++
4 files changed, 239 insertions(+), 22 deletions(-)
create mode 100644 activesupport/lib/active_support/multibyte/utils.rb
create mode 100644 activesupport/test/multibyte_utils_test.rb
diff --git a/activesupport/lib/active_support/multibyte.rb b/activesupport/lib/active_support/multibyte.rb
index 65a96af..b6354ee 100644
--- a/activesupport/lib/active_support/multibyte.rb
+++ b/activesupport/lib/active_support/multibyte.rb
@@ -1,9 +1,5 @@
# encoding: utf-8
-require 'active_support/multibyte/chars'
-require 'active_support/multibyte/exceptions'
-require 'active_support/multibyte/unicode_database'
-
module ActiveSupport #:nodoc:
module Multibyte
# A list of all available normalization forms. See http://www.unicode.org/reports/tr15/tr15-29.html for more
@@ -27,7 +23,35 @@ module ActiveSupport #:nodoc:
#
# Example:
# ActiveSupport::Multibyte.proxy_class = CharsForUTF32
- mattr_accessor :proxy_class
- self.proxy_class = ActiveSupport::Multibyte::Chars
+ def self.proxy_class=(klass)
+ @proxy_class = klass
+ end
+
+ # Returns the current proxy class
+ def self.proxy_class
+ @proxy_class ||= ActiveSupport::Multibyte::Chars
+ end
+
+ # Regular expressions that describe valid byte sequences for a character
+ VALID_CHARACTER = {
+ # Borrowed from the Kconv library by Shinji KONO - (also as seen on the W3C site)
+ 'UTF-8' => /\A(?:
+ [\x00-\x7f] |
+ [\xc2-\xdf] [\x80-\xbf] |
+ \xe0 [\xa0-\xbf] [\x80-\xbf] |
+ [\xe1-\xef] [\x80-\xbf] [\x80-\xbf] |
+ \xf0 [\x90-\xbf] [\x80-\xbf] [\x80-\xbf] |
+ [\xf1-\xf3] [\x80-\xbf] [\x80-\xbf] [\x80-\xbf] |
+ \xf4 [\x80-\x8f] [\x80-\xbf] [\x80-\xbf])\z /xn,
+ # Quick check for valid Shift-JIS characters, disregards the odd-even pairing
+ 'Shift_JIS' => /\A(?:
+ [\x00-\x7e \xa1-\xdf] |
+ [\x81-\x9f \xe0-\xef] [\x40-\x7e \x80-\x9e \x9f-\xfc])\z /xn
+ }
end
end
+
+require 'active_support/multibyte/chars'
+require 'active_support/multibyte/exceptions'
+require 'active_support/multibyte/unicode_database'
+require 'active_support/multibyte/utils'
diff --git a/activesupport/lib/active_support/multibyte/chars.rb b/activesupport/lib/active_support/multibyte/chars.rb
index 3d392d2..16bc130 100644
--- a/activesupport/lib/active_support/multibyte/chars.rb
+++ b/activesupport/lib/active_support/multibyte/chars.rb
@@ -73,16 +73,7 @@ module ActiveSupport #:nodoc:
UNICODE_TRAILERS_PAT = /(#{codepoints_to_pattern(UNICODE_LEADERS_AND_TRAILERS)})+\Z/
UNICODE_LEADERS_PAT = /\A(#{codepoints_to_pattern(UNICODE_LEADERS_AND_TRAILERS)})+/
- # Borrowed from the Kconv library by Shinji KONO - (also as seen on the W3C site)
- UTF8_PAT = /\A(?:
- [\x00-\x7f] |
- [\xc2-\xdf] [\x80-\xbf] |
- \xe0 [\xa0-\xbf] [\x80-\xbf] |
- [\xe1-\xef] [\x80-\xbf] [\x80-\xbf] |
- \xf0 [\x90-\xbf] [\x80-\xbf] [\x80-\xbf] |
- [\xf1-\xf3] [\x80-\xbf] [\x80-\xbf] [\x80-\xbf] |
- \xf4 [\x80-\x8f] [\x80-\xbf] [\x80-\xbf]
- )*\z/xn
+ UTF8_PAT = ActiveSupport::Multibyte::VALID_CHARACTER['UTF-8']
attr_reader :wrapped_string
alias to_s wrapped_string
@@ -307,23 +298,23 @@ module ActiveSupport #:nodoc:
def rstrip
chars(@wrapped_string.gsub(UNICODE_TRAILERS_PAT, ''))
end
-
+
# Strips entire range of Unicode whitespace from the left of the string.
def lstrip
chars(@wrapped_string.gsub(UNICODE_LEADERS_PAT, ''))
end
-
+
# Strips entire range of Unicode whitespace from the right and left of the string.
def strip
rstrip.lstrip
end
-
+
# Returns the number of codepoints in the string
def size
self.class.u_unpack(@wrapped_string).size
end
alias_method :length, :size
-
+
# Reverses all characters in the string.
#
# Example:
@@ -331,7 +322,7 @@ module ActiveSupport #:nodoc:
def reverse
chars(self.class.u_unpack(@wrapped_string).reverse.pack('U*'))
end
-
+
# Implements Unicode-aware slice with codepoints. Slicing on one point returns the codepoints for that
# character.
#
@@ -646,7 +637,7 @@ module ActiveSupport #:nodoc:
string.split(//u).map do |c|
c.force_encoding(Encoding::ASCII) if c.respond_to?(:force_encoding)
- if !UTF8_PAT.match(c)
+ if !ActiveSupport::Multibyte::VALID_CHARACTER['UTF-8'].match(c)
n = c.unpack('C')[0]
n < 128 ? n.chr :
n < 160 ? [UCD.cp1252[n] || n].pack('U') :
diff --git a/activesupport/lib/active_support/multibyte/utils.rb b/activesupport/lib/active_support/multibyte/utils.rb
new file mode 100644
index 0000000..acef84d
--- /dev/null
+++ b/activesupport/lib/active_support/multibyte/utils.rb
@@ -0,0 +1,61 @@
+# encoding: utf-8
+
+module ActiveSupport #:nodoc:
+ module Multibyte #:nodoc:
+ if Kernel.const_defined?(:Encoding)
+ # Returns a regular expression that matches valid characters in the current encoding
+ def self.valid_character
+ VALID_CHARACTER[Encoding.default_internal.to_s]
+ end
+ else
+ def self.valid_character
+ case $KCODE
+ when 'UTF8'
+ VALID_CHARACTER['UTF-8']
+ when 'SJIS'
+ VALID_CHARACTER['Shift_JIS']
+ end
+ end
+ end
+
+ if 'string'.respond_to?(:valid_encoding?)
+ # Verifies the encoding of a string
+ def self.verify(string)
+ string.valid_encoding?
+ end
+ else
+ def self.verify(string)
+ if expression = valid_character
+ for c in string.split(//)
+ return false unless valid_character.match(c)
+ end
+ end
+ true
+ end
+ end
+
+ # Verifies the encoding of the string and raises an exception when it's not valid
+ def self.verify!(string)
+ raise EncodingError.new("Found characters with invalid encoding") unless verify(string)
+ end
+
+ if 'string'.respond_to?(:force_encoding)
+ # Removes all invalid characters from the string.
+ #
+ # Note: this method is a no-op in Ruby 1.9
+ def self.clean(string)
+ string
+ end
+ else
+ def self.clean(string)
+ if expression = valid_character
+ stripped = []; for c in string.split(//)
+ stripped << c if valid_character.match(c)
+ end; stripped.join
+ else
+ string
+ end
+ end
+ end
+ end
+end
\ No newline at end of file
diff --git a/activesupport/test/multibyte_utils_test.rb b/activesupport/test/multibyte_utils_test.rb
new file mode 100644
index 0000000..d8ac5ff
--- /dev/null
+++ b/activesupport/test/multibyte_utils_test.rb
@@ -0,0 +1,141 @@
+# encoding: utf-8
+
+require 'abstract_unit'
+require 'multibyte_test_helpers'
+
+class MultibyteUtilsTest < ActiveSupport::TestCase
+ include MultibyteTestHelpers
+
+ test "valid_character returns an expression for the current encoding" do
+ with_encoding('None') do
+ assert_nil ActiveSupport::Multibyte.valid_character
+ end
+ with_encoding('UTF8') do
+ assert_equal ActiveSupport::Multibyte::VALID_CHARACTER['UTF-8'], ActiveSupport::Multibyte.valid_character
+ end
+ with_encoding('SJIS') do
+ assert_equal ActiveSupport::Multibyte::VALID_CHARACTER['Shift_JIS'], ActiveSupport::Multibyte.valid_character
+ end
+ end
+
+ test "verify verifies ASCII strings are properly encoded" do
+ with_encoding('None') do
+ examples.each do |example|
+ assert ActiveSupport::Multibyte.verify(example)
+ end
+ end
+ end
+
+ test "verify verifies UTF-8 strings are properly encoded" do
+ with_encoding('UTF8') do
+ assert ActiveSupport::Multibyte.verify(example('valid UTF-8'))
+ assert !ActiveSupport::Multibyte.verify(example('invalid UTF-8'))
+ end
+ end
+
+ test "verify verifies Shift-JIS strings are properly encoded" do
+ with_encoding('SJIS') do
+ assert ActiveSupport::Multibyte.verify(example('valid Shift-JIS'))
+ assert !ActiveSupport::Multibyte.verify(example('invalid Shift-JIS'))
+ end
+ end
+
+ test "verify! raises an exception when it finds an invalid character" do
+ with_encoding('UTF8') do
+ assert_raises(ActiveSupport::Multibyte::EncodingError) do
+ ActiveSupport::Multibyte.verify!(example('invalid UTF-8'))
+ end
+ end
+ end
+
+ test "verify! doesn't raise an exception when the encoding is valid" do
+ with_encoding('UTF8') do
+ assert_nothing_raised do
+ ActiveSupport::Multibyte.verify!(example('valid UTF-8'))
+ end
+ end
+ end
+
+ if RUBY_VERSION < '1.9'
+ test "clean leaves ASCII strings intact" do
+ with_encoding('None') do
+ [
+ 'word', "\270\236\010\210\245"
+ ].each do |string|
+ assert_equal string, ActiveSupport::Multibyte.clean(string)
+ end
+ end
+ end
+
+ test "clean cleans invalid characters from UTF-8 encoded strings" do
+ with_encoding('UTF8') do
+ cleaned_utf8 = [8].pack('C*')
+ assert_equal example('valid UTF-8'), ActiveSupport::Multibyte.clean(example('valid UTF-8'))
+ assert_equal cleaned_utf8, ActiveSupport::Multibyte.clean(example('invalid UTF-8'))
+ end
+ end
+
+ test "clean cleans invalid characters from Shift-JIS encoded strings" do
+ with_encoding('SJIS') do
+ cleaned_sjis = [184, 0, 136, 165].pack('C*')
+ assert_equal example('valid Shift-JIS'), ActiveSupport::Multibyte.clean(example('valid Shift-JIS'))
+ assert_equal cleaned_sjis, ActiveSupport::Multibyte.clean(example('invalid Shift-JIS'))
+ end
+ end
+ else
+ test "clean is a no-op" do
+ with_encoding('UTF8') do
+ assert_equal example('invalid Shift-JIS'), ActiveSupport::Multibyte.clean(example('invalid Shift-JIS'))
+ end
+ end
+ end
+
+ private
+
+ STRINGS = {
+ 'valid ASCII' => [65, 83, 67, 73, 73].pack('C*'),
+ 'invalid ASCII' => [128].pack('C*'),
+ 'valid UTF-8' => [227, 129, 147, 227, 129, 171, 227, 129, 161, 227, 130, 143].pack('C*'),
+ 'invalid UTF-8' => [184, 158, 8, 136, 165].pack('C*'),
+ 'valid Shift-JIS' => [131, 122, 129, 91, 131, 128].pack('C*'),
+ 'invalid Shift-JIS' => [184, 158, 8, 0, 255, 136, 165].pack('C*')
+ }
+
+ if Kernel.const_defined?(:Encoding)
+ def example(key)
+ STRINGS[key].force_encoding(Encoding.default_internal)
+ end
+
+ def examples
+ STRINGS.values.map { |s| s.force_encoding(Encoding.default_internal) }
+ end
+ else
+ def example(key)
+ STRINGS[key]
+ end
+
+ def examples
+ STRINGS.values
+ end
+ end
+
+ if 'string'.respond_to?(:encoding)
+ def with_encoding(enc)
+ before = Encoding.default_internal
+
+ case enc
+ when 'UTF8'
+ Encoding.default_internal = Encoding::UTF_8
+ when 'SJIS'
+ Encoding.default_internal = Encoding::Shift_JIS
+ else
+ Encoding.default_internal = Encoding::BINARY
+ end
+ yield
+
+ Encoding.default_internal = before
+ end
+ else
+ alias with_encoding with_kcode
+ end
+end
\ No newline at end of file
--
1.6.0.1
From b066ffe93fb88af3b1e4795783bb71a7b8095ac5 Mon Sep 17 00:00:00 2001
From: Michael Koziarski <mich...@koziarski.com>
Date: Mon, 31 Aug 2009 12:07:30 -0700
Subject: [PATCH] Clean tag attributes before passing through the escape_once logic.
Addresses CVE-2009-3009
---
actionpack/lib/action_view/helpers/tag_helper.rb | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/actionpack/lib/action_view/helpers/tag_helper.rb b/actionpack/lib/action_view/helpers/tag_helper.rb
index af8c4d5..db99a0e 100644
--- a/actionpack/lib/action_view/helpers/tag_helper.rb
+++ b/actionpack/lib/action_view/helpers/tag_helper.rb
@@ -103,7 +103,7 @@ module ActionView
# escape_once("&lt;&lt; Accept & Checkout")
# # => "&lt;&lt; Accept &amp; Checkout"
def escape_once(html)
- html.to_s.gsub(/[\"><]|&(?!([a-zA-Z]+|(#\d+));)/) { |special| ERB::Util::HTML_ESCAPE[special] }
+ ActiveSupport::Multibyte.clean(html.to_s).gsub(/[\"><]|&(?!([a-zA-Z]+|(#\d+));)/) { |special| ERB::Util::HTML_ESCAPE[special] }
end
private
--
1.6.0.1
--------------080309000404080903030209
Content-Type: text/plain; x-mac-type="0"; x-mac-creator="0";
name="2-0-CVE-2009-3009.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="2-0-CVE-2009-3009.patch"
From e31d29fae766daa358ed6e6bf278e75b95a317d3 Mon Sep 17 00:00:00 2001
From: Manfred Stienstra <manf...@fngtps.com>
Date: Tue, 1 Sep 2009 20:16:11 +0200
Subject: [PATCH] Add methods for string verification and encoding cleanup code.
Signed-off-by: Michael Koziarski <mich...@koziarski.com>
---
activesupport/lib/active_support/multibyte.rb | 18 ++++
.../multibyte/handlers/utf8_handler.rb | 13 +--
.../lib/active_support/multibyte/utils.rb | 39 +++++++
activesupport/test/multibyte_utils_test.rb | 106 ++++++++++++++++++++
4 files changed, 165 insertions(+), 11 deletions(-)
create mode 100644 activesupport/lib/active_support/multibyte/utils.rb
create mode 100644 activesupport/test/multibyte_utils_test.rb
diff --git a/activesupport/lib/active_support/multibyte.rb b/activesupport/lib/active_support/multibyte.rb
index 27c0d18..f76cfba 100644
--- a/activesupport/lib/active_support/multibyte.rb
+++ b/activesupport/lib/active_support/multibyte.rb
@@ -3,7 +3,25 @@ module ActiveSupport
DEFAULT_NORMALIZATION_FORM = :kc
NORMALIZATIONS_FORMS = [:c, :kc, :d, :kd]
UNICODE_VERSION = '5.0.0'
+
+ # Regular expressions that describe valid byte sequences for a character
+ VALID_CHARACTER = {
+ # Borrowed from the Kconv library by Shinji KONO - (also as seen on the W3C site)
+ 'UTF-8' => /\A(?:
+ [\x00-\x7f] |
+ [\xc2-\xdf] [\x80-\xbf] |
+ \xe0 [\xa0-\xbf] [\x80-\xbf] |
+ [\xe1-\xef] [\x80-\xbf] [\x80-\xbf] |
+ \xf0 [\x90-\xbf] [\x80-\xbf] [\x80-\xbf] |
+ [\xf1-\xf3] [\x80-\xbf] [\x80-\xbf] [\x80-\xbf] |
+ \xf4 [\x80-\x8f] [\x80-\xbf] [\x80-\xbf])\z /xn,
+ # Quick check for valid Shift-JIS characters, disregards the odd-even pairing
+ 'Shift_JIS' => /\A(?:
+ [\x00-\x7e \xa1-\xdf] |
+ [\x81-\x9f \xe0-\xef] [\x40-\x7e \x80-\x9e \x9f-\xfc])\z /xn
+ }
end
end
require 'active_support/multibyte/chars'
+require 'active_support/multibyte/utils'
\ No newline at end of file
diff --git a/activesupport/lib/active_support/multibyte/handlers/utf8_handler.rb b/activesupport/lib/active_support/multibyte/handlers/utf8_handler.rb
index 66fe47a..f95349e 100644
--- a/activesupport/lib/active_support/multibyte/handlers/utf8_handler.rb
+++ b/activesupport/lib/active_support/multibyte/handlers/utf8_handler.rb
@@ -100,16 +100,7 @@ module ActiveSupport::Multibyte::Handlers #:nodoc:
# between little and big endian. This is not an issue in utf-8, so it must be ignored.
UNICODE_LEADERS_AND_TRAILERS = UNICODE_WHITESPACE + [65279] # ZERO-WIDTH NO-BREAK SPACE aka BOM
- # Borrowed from the Kconv library by Shinji KONO - (also as seen on the W3C site)
- UTF8_PAT = /\A(?:
- [\x00-\x7f] |
- [\xc2-\xdf] [\x80-\xbf] |
- \xe0 [\xa0-\xbf] [\x80-\xbf] |
- [\xe1-\xef] [\x80-\xbf] [\x80-\xbf] |
- \xf0 [\x90-\xbf] [\x80-\xbf] [\x80-\xbf] |
- [\xf1-\xf3] [\x80-\xbf] [\x80-\xbf] [\x80-\xbf] |
- \xf4 [\x80-\x8f] [\x80-\xbf] [\x80-\xbf]
- )*\z/xn
+ UTF8_PAT = ActiveSupport::Multibyte::VALID_CHARACTER['UTF-8']
# Returns a regular expression pattern that matches the passed Unicode codepoints
def self.codepoints_to_pattern(array_of_codepoints) #:nodoc:
@@ -357,7 +348,7 @@ module ActiveSupport::Multibyte::Handlers #:nodoc:
# Replaces all the non-utf-8 bytes by their iso-8859-1 or cp1252 equivalent resulting in a valid utf-8 string
def tidy_bytes(str)
str.split(//u).map do |c|
- if !UTF8_PAT.match(c)
+ if !ActiveSupport::Multibyte::VALID_CHARACTER['UTF-8'].match(c)
n = c.unpack('C')[0]
n < 128 ? n.chr :
n < 160 ? [UCD.cp1252[n] || n].pack('U') :
diff --git a/activesupport/lib/active_support/multibyte/utils.rb b/activesupport/lib/active_support/multibyte/utils.rb
new file mode 100644
index 0000000..094e856
--- /dev/null
+++ b/activesupport/lib/active_support/multibyte/utils.rb
@@ -0,0 +1,39 @@
+module ActiveSupport #:nodoc:
+ module Multibyte #:nodoc:
+ # Returns a regular expression that matches valid characters in the current encoding
+ def self.valid_character
+ case $KCODE
+ when 'UTF8'
+ VALID_CHARACTER['UTF-8']
+ when 'SJIS'
+ VALID_CHARACTER['Shift_JIS']
+ end
+ end
+
+ # Verifies the encoding of a string
+ def self.verify(string)
+ if expression = valid_character
+ for c in string.split(//)
+ return false unless valid_character.match(c)
+ end
+ end
+ true
+ end
+
+ # Verifies the encoding of the string and raises an exception when it's not valid
+ def self.verify!(string)
+ raise ActiveSupport::Multibyte::Handlers::EncodingError.new("Found characters with invalid encoding") unless verify(string)
+ end
+
+ # Removes all invalid characters from the string
+ def self.clean(string)
+ if expression = valid_character
+ stripped = []; for c in string.split(//)
+ stripped << c if valid_character.match(c)
+ end; stripped.join
+ else
+ string
+ end
+ end
+ end
+end
\ No newline at end of file
diff --git a/activesupport/test/multibyte_utils_test.rb b/activesupport/test/multibyte_utils_test.rb
new file mode 100644
index 0000000..a4bcfc8
--- /dev/null
+++ b/activesupport/test/multibyte_utils_test.rb
@@ -0,0 +1,106 @@
+require 'abstract_unit'
+
+class MultibyteUtilsTest < Test::Unit::TestCase
+
+ def test_valid_character_returns_an_expression_for_the_current_encoding
+ with_kcode('None') do
+ assert_nil ActiveSupport::Multibyte.valid_character
+ end
+ with_kcode('UTF8') do
+ assert_equal ActiveSupport::Multibyte::VALID_CHARACTER['UTF-8'], ActiveSupport::Multibyte.valid_character
+ end
+ with_kcode('SJIS') do
+ assert_equal ActiveSupport::Multibyte::VALID_CHARACTER['Shift_JIS'], ActiveSupport::Multibyte.valid_character
+ end
+ end
+
+ def test_verify_verifies_ASCII_strings_are_properly_encoded
+ with_kcode('None') do
+ examples.each do |example|
+ assert ActiveSupport::Multibyte.verify(example)
+ end
+ end
+ end
+
+ def test_verify_verifies_UTF_8_strings_are_properly_encoded
+ with_kcode('UTF8') do
+ assert ActiveSupport::Multibyte.verify(example('valid UTF-8'))
+ assert !ActiveSupport::Multibyte.verify(example('invalid UTF-8'))
+ end
+ end
+
+ def test_verify_verifies_Shift_JIS_strings_are_properly_encoded
+ with_kcode('SJIS') do
+ assert ActiveSupport::Multibyte.verify(example('valid Shift-JIS'))
+ assert !ActiveSupport::Multibyte.verify(example('invalid Shift-JIS'))
+ end
+ end
+
+ def test_verify_bang_raises_an_exception_when_it_finds_an_invalid_character
+ with_kcode('UTF8') do
+ assert_raises(ActiveSupport::Multibyte::Handlers::EncodingError) do
+ ActiveSupport::Multibyte.verify!(example('invalid UTF-8'))
+ end
+ end
+ end
+
+ def test_verify_bang_doesnt_raise_an_exception_when_the_encoding_is_valid
+ with_kcode('UTF8') do
+ assert_nothing_raised do
+ ActiveSupport::Multibyte.verify!(example('valid UTF-8'))
+ end
+ end
+ end
+
+ def test_clean_leaves_ASCII_strings_intact
+ with_kcode('None') do
+ [
+ 'word', "\270\236\010\210\245"
+ ].each do |string|
+ assert_equal string, ActiveSupport::Multibyte.clean(string)
+ end
+ end
+ end
+
+ def test_clean_cleans_invalid_characters_from_UTF_8_encoded_strings
+ with_kcode('UTF8') do
+ cleaned_utf8 = [8].pack('C*')
+ assert_equal example('valid UTF-8'), ActiveSupport::Multibyte.clean(example('valid UTF-8'))
+ assert_equal cleaned_utf8, ActiveSupport::Multibyte.clean(example('invalid UTF-8'))
+ end
+ end
+
+ def test_clean_cleans_invalid_characters_from_Shift_JIS_encoded_strings
+ with_kcode('SJIS') do
+ cleaned_sjis = [184, 0, 136, 165].pack('C*')
+ assert_equal example('valid Shift-JIS'), ActiveSupport::Multibyte.clean(example('valid Shift-JIS'))
+ assert_equal cleaned_sjis, ActiveSupport::Multibyte.clean(example('invalid Shift-JIS'))
+ end
+ end
+
+ private
+
+ STRINGS = {
+ 'valid ASCII' => [65, 83, 67, 73, 73].pack('C*'),
+ 'invalid ASCII' => [128].pack('C*'),
+ 'valid UTF-8' => [227, 129, 147, 227, 129, 171, 227, 129, 161, 227, 130, 143].pack('C*'),
+ 'invalid UTF-8' => [184, 158, 8, 136, 165].pack('C*'),
+ 'valid Shift-JIS' => [131, 122, 129, 91, 131, 128].pack('C*'),
+ 'invalid Shift-JIS' => [184, 158, 8, 0, 255, 136, 165].pack('C*')
+ }
+
+ def example(key)
+ STRINGS[key]
+ end
+
+ def examples
+ STRINGS.values
+ end
+
+ def with_kcode(code)
+ before = $KCODE
+ $KCODE = code
+ yield
+ $KCODE = before
+ end
+end
\ No newline at end of file
--
1.6.0.1
From 5b8b41732f385131d4e1f1a8862d71f44dcc992d Mon Sep 17 00:00:00 2001
From: Michael Koziarski <mich...@koziarski.com>
Date: Mon, 31 Aug 2009 12:07:30 -0700
Subject: [PATCH] Clean tag attributes before passing through the escape_once logic.
Addresses CVE-2009-3009
---
actionpack/lib/action_view/helpers/tag_helper.rb | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/actionpack/lib/action_view/helpers/tag_helper.rb b/actionpack/lib/action_view/helpers/tag_helper.rb
index 999cbfb..bde5581 100644
--- a/actionpack/lib/action_view/helpers/tag_helper.rb
+++ b/actionpack/lib/action_view/helpers/tag_helper.rb
@@ -99,7 +99,7 @@ module ActionView
# escape_once("&lt;&lt; Accept & Checkout")
# # => "&lt;&lt; Accept &amp; Checkout"
def escape_once(html)
- html.to_s.gsub(/[\"><]|&(?!([a-zA-Z]+|(#\d+));)/) { |special| ERB::Util::HTML_ESCAPE[special] }
+ ActiveSupport::Multibyte.clean(html.to_s).gsub(/[\"><]|&(?!([a-zA-Z]+|(#\d+));)/) { |special| ERB::Util::HTML_ESCAPE[special] }
end
private
--
1.6.0.1
--------------080309000404080903030209--
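
For readers who want to try the idea behind the patches outside of Rails, here is a minimal sketch of "clean, then escape". It is not the patched Rails code: it assumes a modern Ruby (2.1 or later) and uses String#scrub in place of the per-character regular expression filter, and the HTML_ESCAPE table and both method names below are local stand-ins defined only for this illustration. The byte values are the 'invalid UTF-8' example from the test file above.

```ruby
# Local stand-in for the escape table (an assumption for this sketch,
# not the ERB::Util constant used by Rails).
HTML_ESCAPE = { '&' => '&amp;', '>' => '&gt;', '<' => '&lt;', '"' => '&quot;' }.freeze

# Drops every byte sequence that is not valid in the string's encoding.
# String#scrub exists from Ruby 2.1 on; the patch itself filters
# character by character with VALID_CHARACTER on Ruby 1.8.
def clean(string)
  string.scrub('')
end

# Same regular expression as escape_once in the patch, applied to the
# cleaned string so the substitution never sees malformed bytes.
def escape_once(html)
  clean(html.to_s).gsub(/["><]|&(?!([a-zA-Z]+|(#\d+));)/) { |special| HTML_ESCAPE[special] }
end

# Byte values taken from the 'invalid UTF-8' test example.
invalid = [184, 158, 8, 136, 165].pack('C*').force_encoding(Encoding::UTF_8)
puts invalid.valid_encoding?   # false
puts clean(invalid).bytes.inspect

# A stray continuation byte in front of markup is removed before escaping.
tainted = [184, 60, 98, 62].pack('C*').force_encoding(Encoding::UTF_8)
puts escape_once(tainted)
puts escape_once('<< Accept & Checkout')
```

The point of the ordering is that the escaping step only ever runs on well-formed input, so a malformed byte cannot hide a following quote or angle bracket from the substitution.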