----------------------------------------
Bug #5888: JSON unittest fails
https://bugs.ruby-lang.org/issues/5888
Author: Vit Ondruch
Status: Open
Priority: Normal
Assignee:
Category:
Target version:
ruby -v: ruby 1.9.3p0 (2011-10-30) [x86_64-linux]
Hello,
When building Ruby 1.9.3 (as well as 2.0.0), the make check spits following errors:
1) Failure:
test_parse_values(TC_JSON) [/builddir/build/BUILD/ruby-1.9.3-p0/test/json/test_json.rb:174]:
<["\"\b\n\r\t\u0000\u001F"]> expected but was
<["\"\b\n\r\t\xA8\xA8"]>.
2) Failure:
test_parser_reset(TC_JSON) [/builddir/build/BUILD/ruby-1.9.3-p0/test/json/test_json.rb:291]:
<{"a"=>2,
"b"=>3.141,
"c"=>"c",
"d"=>[1, "b", 3.14],
"e"=>{"foo"=>"bar"},
"g"=>"\"\u0000\u001F",
"h"=>1000.0,
"i"=>0.001}> expected but was
<{"a"=>2,
"b"=>3.141,
"c"=>"c",
"d"=>[1, "b", 3.14],
"e"=>{"foo"=>"bar"},
"g"=>"\"\xA0\xA0",
"h"=>1000.0,
"i"=>0.001}>.
3) Failure:
test_fast_generate(TC_JSONGenerate) [/builddir/build/BUILD/ruby-1.9.3-p0/test/json/test_json_generate.rb:78]:
<{"a"=>2,
"b"=>3.141,
"c"=>"c",
"d"=>[1, "b", 3.14],
"e"=>{"foo"=>"bar"},
"g"=>"\"\u0000\u001F",
"h"=>1000.0,
"i"=>0.001}> expected but was
<{"a"=>2,
"b"=>3.141,
"c"=>"c",
"d"=>[1, "b", 3.14],
"e"=>{"foo"=>"bar"},
"g"=>"\"\xA0\xA0",
"h"=>1000.0,
"i"=>0.001}>.
4) Failure:
test_generate(TC_JSONGenerate) [/builddir/build/BUILD/ruby-1.9.3-p0/test/json/test_json_generate.rb:47]:
<{"a"=>2,
"b"=>3.141,
"c"=>"c",
"d"=>[1, "b", 3.14],
"e"=>{"foo"=>"bar"},
"g"=>"\"\u0000\u001F",
"h"=>1000.0,
"i"=>0.001}> expected but was
<{"a"=>2,
"b"=>3.141,
"c"=>"c",
"d"=>[1, "b", 3.14],
"e"=>{"foo"=>"bar"},
"g"=>"\"\xA0\xA0",
"h"=>1000.0,
"i"=>0.001}>.
5) Failure:
test_generate_pretty(TC_JSONGenerate) [/builddir/build/BUILD/ruby-1.9.3-p0/test/json/test_json_generate.rb:61]:
<{"a"=>2,
"b"=>3.141,
"c"=>"c",
"d"=>[1, "b", 3.14],
"e"=>{"foo"=>"bar"},
"g"=>"\"\u0000\u001F",
"h"=>1000.0,
"i"=>0.001}> expected but was
<{"a"=>2,
"b"=>3.141,
"c"=>"c",
"d"=>[1, "b", 3.14],
"e"=>{"foo"=>"bar"},
"g"=>"\"\xA0\xA0",
"h"=>1000.0,
"i"=>0.001}>.
6) Failure:
test_own_state(TC_JSONGenerate) [/builddir/build/BUILD/ruby-1.9.3-p0/test/json/test_json_generate.rb:92]:
<{"a"=>2,
"b"=>3.141,
"c"=>"c",
"d"=>[1, "b", 3.14],
"e"=>{"foo"=>"bar"},
"g"=>"\"\u0000\u001F",
"h"=>1000.0,
"i"=>0.001}> expected but was
<{"a"=>2,
"b"=>3.141,
"c"=>"c",
"d"=>[1, "b", 3.14],
"e"=>{"foo"=>"bar"},
"g"=>"\"\xA0\xA0",
"h"=>1000.0,
"i"=>0.001}>.
7) Failure:
test_chars(TC_JSONUnicode) [/builddir/build/BUILD/ruby-1.9.3-p0/test/json/test_json_unicode.rb:57]:
<"\x00"> expected but was
<"\xA0">.
8) Failure:
test_unicode(TC_JSONUnicode) [/builddir/build/BUILD/ruby-1.9.3-p0/test/json/test_json_unicode.rb:20]:
<["\u00A9 \u2260 \u20AC! \u0001"]> expected but was
<["\u00A9 \u2260 \u20AC! \xD8"]>.
Please note that I am building Ruby using GCC 4.7 on Fedora Rawhide. I have not seen this errors before with GCC 4.6
Simpler reproducer is:
# ruby -rjson -v -e "20.times { puts JSON.parse '[\"\u0000\"]'}"
ruby 1.9.3p0 (2011-10-30) [x86_64-linux]
�
�
�
�
h
P
8
�
�
x
`
H
#
As you can see, I obtain 20 time different character. That is weird.
When building with -O0, everything seems to work.
The flaw seems to be in ext/json/parser/parser.c - I compiled only this one with -O0 and the others with -O2 and everything worked fine.
When I compile everything with -O2 and use
valgrind --track-origins=yes -v --read-var-info=yes ruby -rjson -v -e "puts JSON.parse '[\"\u0000\"]'"
Part of the valgrind output is quite suspicious:
==3350== Conditional jump or move depends on uninitialised value(s)
==3350== at 0x4EAB4F5: rb_io_puts (io.c:6223)
==3350== by 0x4EAB6AE: io_puts_ary (io.c:6174)
==3350== by 0x4F853EF: exec_recursive_i (thread.c:3940)
==3350== by 0x4F85A43: exec_recursive (thread.c:3991)
==3350== by 0x4EAB42C: rb_io_puts (io.c:6217)
==3350== by 0x4F787C1: vm_call0 (vm_eval.c:79)
==3350== by 0x4F7BFC2: vm_call_method (vm_insnhelper.c:404)
==3350== by 0x4F72191: vm_exec_core (insns.def:1015)
==3350== by 0x4F777B7: vm_exec (vm.c:1220)
==3350== by 0x4F7E7ED: rb_iseq_eval_main (vm.c:1461)
==3350== by 0x4E89E79: ruby_exec_internal (eval.c:204)
==3350== by 0x4E8A92C: ruby_exec_node (eval.c:251)
==3350== Uninitialised value was created by a stack allocation
==3350== at 0xD3E1430: JSON_parse_string (parser.rl:497)
==3350==
==3350== Syscall param write(buf) points to uninitialised byte(s)
==3350== at 0x524E00D: ??? (in /lib64/libpthread-2.15.so)
==3350== by 0x4EA43B3: io_flush_buffer_sync (io.c:640)
==3350== by 0x4EA9D15: rb_io_fptr_cleanup (io.c:3490)
==3350== by 0x4EABCB0: rb_io_fptr_finalize (io.c:3579)
==3350== by 0x4E99939: finalize_list (gc.c:2944)
==3350== by 0x4E9D301: rb_gc_call_finalizer_at_exit (gc.c:3075)
==3350== by 0x4E8C15A: ruby_cleanup (eval.c:147)
==3350== by 0x4E8C33E: ruby_run_node (eval.c:244)
==3350== by 0x40086A: main (main.c:38)
==3350== Address 0xd3039b0 is 0 bytes inside a block of size 8,192 alloc'd
==3350== at 0x4C284CD: malloc (vg_replace_malloc.c:236)
==3350== by 0x4E9C454: vm_xmalloc (gc.c:764)
==3350== by 0x4EAA3F0: io_binwrite (io.c:839)
==3350== by 0x4EAA67D: io_write (io.c:945)
==3350== by 0x4F787C1: vm_call0 (vm_eval.c:79)
==3350== by 0x4F7905F: rb_funcall (vm_eval.c:235)
==3350== by 0x4EAB475: rb_io_puts (io.c:6222)
==3350== by 0x4EAB6AE: io_puts_ary (io.c:6174)
==3350== by 0x4F853EF: exec_recursive_i (thread.c:3940)
==3350== by 0x4F85A43: exec_recursive (thread.c:3991)
==3350== by 0x4EAB42C: rb_io_puts (io.c:6217)
==3350== by 0x4F787C1: vm_call0 (vm_eval.c:79)
==3350== Uninitialised value was created by a stack allocation
==3350== at 0xD3E1430: JSON_parse_string (parser.rl:497)
Mainly the last uninitialized value it particularly interesting, yet I'm not sure what exactly is causing the problem.
Could you compare disassembled result with -O0, and try #pragma gcc optimize ("O0")?
File gcc-o0 added
File gcc-o2 added
File pragma-gcc-o0 added
Here are the disassembled results for function JSON_parse_string for:
- compilation with gcc -O0
- compilation with pragma GCC optimize ("O0")
- compilation with gcc -O2
The first two work properly, the last one doesn't.
After few hours of debugging, it seems that the flaw may actually be in the json_string_unescape function (in the same file, deeper in the callstack).
When compiled with -O0, it shows nothing suspicious, but with -O2, it shows some possible problems:
1344 unescape_len = convert_UTF32_to_UTF8(buf, ch);
convert_UTF32_to_UTF8 (ch=0, buf=0x7fffffffdd00 "\270", <incomplete sequence \343\225>) at parser.c:46
-- The incomplete sequence is the thing that is weird. The "ch" variable is created using some byte shifting in the json_string_unescape function, which may be the cause (if the optimization does something with the byte shifting).
Bohuslav Kabrda wrote:
> After few hours of debugging, it seems that the flaw may actually be in the json_string_unescape function (in the same file, deeper in the callstack).
> When compiled with -O0, it shows nothing suspicious, but with -O2, it shows some possible problems:
>
> 1344 unescape_len = convert_UTF32_to_UTF8(buf, ch);
> convert_UTF32_to_UTF8 (ch=0, buf=0x7fffffffdd00 "\270", <incomplete sequence \343\225>) at parser.c:46
>
> -- The incomplete sequence is the thing that is weird. The "ch" variable is created using some byte shifting in the json_string_unescape function, which may be the cause (if the optimization does something with the byte shifting).
Incomplete sequence in buf is not the root of this issue because its content and the length are explicitly
specified in convert_UTF32_to_UTF8().
But yes, the bug is here.
The root of this issue is optimization for convert_UTF32_to_UTF8().
With -O2 convert_UTF32_to_UTF8() is inlined to json_string_unescape.
If ch < 0x7F, it runs as
0x00000008034017a0 <+1120>: cmp $0x7f,%rsi # if (ch <= 0x7F) {
0x00000008034017a4 <+1124>: jbe 0x80340180c <JSON_parse_string+1228>
0x000000080340180c <+1228>: mov $0x1,%edx # int len = 1;
0x0000000803401811 <+1233>: jmp 0x8034017b4 <JSON_parse_string+1140>
0x00000008034017b4 <+1140>: lea 0x20(%rsp),%rsi
0x00000008034017b9 <+1145>: jmpq 0x8034016c0 <JSON_parse_string+896> # return
So it doesn't change buf.
I concluded this issue is caused by gcc 4.7's optimization; it wrongly optimizes out assignment to buf.
Its workaround can be following:
diff --git a/ext/json/parser/parser.c b/ext/json/parser/parser.c
index d1d14c7..f3bf50c 100644
--- a/ext/json/parser/parser.c
+++ b/ext/json/parser/parser.c
@@ -40,8 +40,9 @@ static UTF32 unescape_unicode(const unsigned char *p)
return result;
}
-static int convert_UTF32_to_UTF8(char *buf, UTF32 ch)
+static int convert_UTF32_to_UTF8(char *out, UTF32 ch)
{
+ volatile char *buf = out;
int len = 1;
if (ch <= 0x7F) {
buf[0] = (char) ch;
Status changed from Open to Assigned
Assignee set to Yui NARUSE
filed it to GCC's tracker.
For JSON, following is simple patch:
diff --git a/ext/json/parser/parser.c b/ext/json/parser/parser.c
index d1d14c7..1773616 100644
--- a/ext/json/parser/parser.c
+++ b/ext/json/parser/parser.c
@@ -1293,6 +1293,7 @@ static VALUE json_string_unescape(VALUE result, char *string, char *stringEnd)
{
char *p = string, *pe = string, *unescape;
int unescape_len;
+ char buf[4];
while (pe < stringEnd) {
if (*pe == '\\') {
@@ -1325,7 +1326,6 @@ static VALUE json_string_unescape(VALUE result, char *string, char *stringEnd)
if (pe > stringEnd - 4) {
return Qnil;
} else {
- char buf[4];
UTF32 ch = unescape_unicode((unsigned char *) ++pe);
pe += 3;
if (UNI_SUR_HIGH_START == (ch & 0xFC00)) {
----------------------------------------
Bug #5888: JSON unittest fails
https://bugs.ruby-lang.org/issues/5888
Author: Vit Ondruch
Status: Assigned
Priority: Normal
Assignee: Yui NARUSE
I don't believe this is a GCC bug: space for buf[4] can be reclaimed
as soon as execution finishes in the scope where buf[4] was declared.
> For JSON, following is simple patch:
This was always a bug, we were just lucky because GCC wasn't as
aggressive when popping the stack.
Eric Wong wrote:
> Yui NARUSE <nar...@airemix.jp> wrote:
> > filed it to GCC's tracker.
>
> I don't believe this is a GCC bug: space for buf[4] can be reclaimed
> as soon as execution finishes in the scope where buf[4] was declared.
GCC answered the same thing.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51862#c2
> > For JSON, following is simple patch:
>
> This was always a bug, we were just lucky because GCC wasn't as
> aggressive when popping the stack.
I made pull request https://github.com/flori/json/pull/115
----------------------------------------
Bug #5888: JSON unittest fails
https://bugs.ruby-lang.org/issues/5888
Author: Vit Ondruch
Status: Closed
Priority: Normal
Assignee: Yui NARUSE
Thank you guys. Can we backport it for 1.9.3 please?
----------------------------------------
Bug #5888: JSON unittest fails
https://bugs.ruby-lang.org/issues/5888
Author: Vit Ondruch
Status: Closed
Priority: Normal
Assignee: Yui NARUSE
File ruby-1.9.3-fix-json-parser.patch added
The attached patch is a backport for Ruby 1.9.3.
----------------------------------------
Bug #5888: JSON unittest fails
https://bugs.ruby-lang.org/issues/5888
Author: Vit Ondruch
Status: Closed
Priority: Normal
Assignee: Yui NARUSE