[vim/vim] fix: stricter JSON decoding for surrogate pairs and keywords (PR #19807)

mattn

5:10 AM (18 hours ago)
to vim/vim, Subscribed

Tighten json_decode() to reject invalid JSON that was previously accepted.

  1. Lone surrogate rejection: Change the upper bound of the surrogate pair range check from 0xDFFF to 0xDBFF so that only high surrogates (U+D800-U+DBFF) trigger pair decoding. Additionally, reject lone surrogates (any code point in U+D800-U+DFFF that did not form a valid pair) instead of passing them through to utf_char2bytes(), which would produce invalid UTF-8.

  2. Case-sensitive keyword matching: json_decode() used STRNICMP (case-insensitive) to match true, false, null, NaN, Infinity, and -Infinity, so "True", "FALSE", "Null", etc. were all silently accepted. RFC 7159 requires these keywords to be lowercase. js_decode() retains the case-insensitive behavior.
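
As a rough illustration of item 1, the surrogate check can be sketched in C as below. This is not the code from src/json.c: the function name combine_escapes(), the CP_INVALID sentinel, and the calling convention (passing -1 when no second escape follows) are all assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical sentinel returned for input that must be rejected. */
#define CP_INVALID 0xFFFFFFFFu

/*
 * Combine the values of one or two \uXXXX escapes into a code point.
 * "first" is the value of the first escape; "second" is the value of the
 * escape that immediately follows, or -1 when there is none.
 * "*used_second" is set to 1 when the second escape was consumed.
 */
static uint32_t
combine_escapes(uint32_t first, long second, int *used_second)
{
    *used_second = 0;

    /* Only a high surrogate (U+D800-U+DBFF) may start a pair. */
    if (first >= 0xD800 && first <= 0xDBFF)
    {
	/* It must be followed by a low surrogate (U+DC00-U+DFFF). */
	if (second >= 0xDC00 && second <= 0xDFFF)
	{
	    *used_second = 1;
	    return 0x10000 + (((first - 0xD800) << 10)
					| ((uint32_t)second - 0xDC00));
	}
	return CP_INVALID;	/* lone high surrogate */
    }

    /* A low surrogate that was not preceded by a high one is also invalid. */
    if (first >= 0xDC00 && first <= 0xDFFF)
	return CP_INVALID;

    return first;		/* ordinary BMP code point */
}
```

With the old upper bound of 0xDFFF, a low surrogate in the first position would also have entered the pair-decoding branch; narrowing the bound to 0xDBFF and rejecting everything else in U+D800-U+DFFF keeps surrogate values away from the UTF-8 encoder.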

Both of these lenient behaviors were intentionally introduced by me in the original JSON support patches, and I recall that Bram's position at the time was that Vim should not perform strict decoding. However, with the current landscape — LSP support, ch_listen(), and broader use of JSON for inter-process communication — silently accepting invalid JSON is a risk rather than a feature.


You can view, comment on, or merge this pull request online at:

  https://github.com/vim/vim/pull/19807

Commit Summary

  • 50b315b fix: stricter JSON decoding for surrogate pairs and keywords

File Changes

(2 files)

Reply to this email directly, view it on GitHub.
You are receiving this because you are subscribed to this thread.
Message ID: <vim/vim/pull/19807@github.com>

mattn

12:50 PM (10 hours ago)
to vim/vim, Push

@mattn pushed 1 commit.

  • e718bb7 perf: reduce function call overhead in JSON encoding


Message ID: <vim/vim/pull/19807/before/43e312b451260ad2a2ba9aa1fdbc018feb2dcc16/after/e718bb793a5500a99d0b723c889511c771cc66d8@github.com>

mattn

12:54 PM (10 hours ago)
to vim/vim, Push

@mattn pushed 1 commit.

  • 5326872 perf: pre-grow buffer in write_string() to reduce ga_grow calls


Message ID: <vim/vim/pull/19807/before/e718bb793a5500a99d0b723c889511c771cc66d8/after/532687229f05fd89e6a3aad07a406acc658961aa@github.com>

Christian Brabandt

3:42 PM (8 hours ago)
to vim/vim, Subscribed

@chrisbra commented on this pull request.


In src/json.c:

> @@ -954,7 +973,13 @@ json_decode_item(js_read_T *reader, typval_T *res, int options)
 			retval = OK;
 			break;
 		    }
-		    if (STRNICMP((char *)p, "false", 5) == 0)
+		    // In strinct JSON mode, keywords must be lowercase.
Suggested change:
-		    // In strinct JSON mode, keywords must be lowercase.
+		    // In strict JSON mode, keywords must be lowercase.

In src/testdir/test_json.vim:

> @@ -151,6 +151,11 @@ func Test_json_decode()
   call assert_equal(type(v:none), type(json_decode('')))
   call assert_equal("", json_decode('""'))
 
+  " json_decode() requires lowercase keywords (RFC 7159)
+  call assert_fails('call json_decode("True")', 'E491:')
+  call assert_fails('call json_decode("FALSE")', 'E491:')
+  call assert_fails('call json_decode("Null")', 'E491:')

Can you add tests confirming that js_decode() still accepts those values? I think we should also document that js_encode()/js_decode() accept the special keywords regardless of case.
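
The behavior those tests would pin down can be sketched in C as follows. This is only an illustration, not the patch itself: keyword_matches(), its "lenient" flag, and ascii_nicmp() (a stand-in for Vim's STRNICMP) are hypothetical names introduced for the example.

```c
#include <stddef.h>
#include <string.h>

/* ASCII-only case-insensitive compare of at most n bytes; stands in for
 * Vim's STRNICMP in this sketch. */
static int
ascii_nicmp(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
	unsigned char ca = (unsigned char)a[i];
	unsigned char cb = (unsigned char)b[i];

	if (ca >= 'A' && ca <= 'Z')
	    ca += 'a' - 'A';
	if (cb >= 'A' && cb <= 'Z')
	    cb += 'a' - 'A';
	if (ca != cb)
	    return ca < cb ? -1 : 1;
	if (ca == '\0')
	    break;		/* both strings ended early and are equal */
    }
    return 0;
}

/*
 * Return 1 when the text at "p" starts with "keyword".
 * lenient == 0: exact, case-sensitive match (the json_decode() behavior
 *		 described in this PR).
 * lenient != 0: case-insensitive match (the behavior js_decode() retains).
 */
static int
keyword_matches(const char *p, const char *keyword, int lenient)
{
    size_t len = strlen(keyword);

    if (lenient)
	return ascii_nicmp(p, keyword, len) == 0;
    return strncmp(p, keyword, len) == 0;
}
```

Under this sketch, "True" and "FALSE" fail the strict check but pass the lenient one, which is the distinction the requested js_decode() tests would cover.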


Message ID: <vim/vim/pull/19807/review/4001835641@github.com>

mattn

8:27 PM (3 hours ago)
to vim/vim, Push

@mattn pushed 1 commit.


Message ID: <vim/vim/pull/19807/before/532687229f05fd89e6a3aad07a406acc658961aa/after/e3b634679ce48fd186763f3e0467695b8fbca11d@github.com>

mattn

8:33 PM (3 hours ago)
to vim/vim, Push

@mattn pushed 10 commits.

  • f81b1e8 patch 9.2.0236: stack-overflow with deeply nested data in json_encode/decode()
  • 851f980 runtime(manpager): use \x07 instead of \a for BEL in OSC 8 regex
  • b0e8175 patch 9.2.0237: filetype: ObjectScript routines are not recognized
  • 4be997d patch 9.2.0238: showmode message may not be displayed
  • 977d596 patch 9.2.0239: signcolumn may cause flicker
  • 7143d3d runtime(sh): Improve the matching of function definitions
  • 18e7f7d runtime(sh): Distinguish parts of function definitions
  • 9cd88c1 patch 9.2.0240: syn_name2id() is slow due to linear search
  • 1d62dce test: add tests for case-sensitivity of keywords in json_decode/js_decode
  • f567f70 doc: update json_decode/js_decode docs for stricter keyword and surrogate handling


Message ID: <vim/vim/pull/19807/before/e3b634679ce48fd186763f3e0467695b8fbca11d/after/f567f70693c2ad2d7aa51de666e4eaebdbf9655c@github.com>

mattn

8:35 PM (3 hours ago)
to vim/vim, Push

@mattn pushed 5 commits.

  • 58dc97b fix: stricter JSON decoding for surrogate pairs and keywords
  • e456db5 perf: reduce function call overhead in JSON encoding
  • 2caa4a8 perf: pre-grow buffer in write_string() to reduce ga_grow calls
  • d045df6 test: add tests for case-sensitivity of keywords in json_decode/js_decode
  • 32e6010 doc: update json_decode/js_decode docs for stricter keyword and surrogate handling


Message ID: <vim/vim/pull/19807/before/f567f70693c2ad2d7aa51de666e4eaebdbf9655c/after/32e6010b299c1b6bcdbda25d379e55c091788464@github.com>
