Maximum length of a string


jhalp...@gmail.com

Oct 14, 2013, 11:20:41 AM
to jansso...@googlegroups.com
It looks like the only limit on the size of a JSON message is memory; is that right? We need to send messages containing large binary files in base-64 format, and from what I've seen in the source, as long as there's enough memory to allocate a buffer we're good.

Am I right about that?

Thanks

Joe

Andrew Chernow

Oct 14, 2013, 11:43:20 AM
to jansso...@googlegroups.com
I think string lengths will be limited to 31 bits. The source code for json_string() calls utf8_check_string(), which takes an "int" for the length. If the length supplied to that function is -1, it will use strlen(). However, strlen()'s return value is truncated to an int. Unless I missed something, if utf8_check_string() were updated to take a size_t length argument, then the only upper limits on strings would be available memory and the width of size_t.

One way around this is to call json_string_nocheck("my super long string"), which will bypass the internal utf8_check_string() call. The downside to this approach is that your string is not UTF-8 verified. But in your case, you say it's base64, so you are safe.
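To illustrate the truncation concern above, here is a minimal sketch of the kind of check a caller could make before handing a size_t length to a parameter declared as int. The helper name length_fits_int is hypothetical and is not part of Jansson; it only shows where the 31-bit limit comes from.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, not part of Jansson: returns 1 if a size_t
   length can be passed through an int parameter without losing
   information, and 0 if it would be truncated. */
static int length_fits_int(size_t len)
{
    return len <= (size_t)INT_MAX;
}
```

A caller holding a string longer than INT_MAX bytes would get 0 here, which is the case where the int-based length check described above could not be trusted.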

Andrew



Petri Lehtinen

Oct 14, 2013, 1:37:40 PM
to jansso...@googlegroups.com
This is correct. In the upcoming v2.6, there's a string API with
explicit size_t lengths, so you can create strings of over 2^31 bytes.

If you're going to use json_dumps() to encode the data as JSON,
remember that the result of json_dumps() also needs memory, so you
actually need approximately twice as much as the base-64 encoded file
contents.
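As a rough back-of-the-envelope sketch of that estimate: padded base-64 turns every 3 raw bytes into 4 characters, and the figure above says to budget about two copies of the encoded data while json_dumps() runs. The function names below are made up for illustration, and the estimate ignores the surrounding JSON syntax and allocator overhead.

```c
#include <stddef.h>
#include <stdint.h>

/* Length of the padded base-64 encoding of n raw bytes:
   4 * ceil(n / 3). Returns 0 if the result would overflow size_t
   (n == 0 also yields 0). */
static size_t base64_encoded_len(size_t n)
{
    size_t groups = n / 3 + (n % 3 != 0);
    if (groups > SIZE_MAX / 4)
        return 0;
    return groups * 4;
}

/* Approximate memory needed while serializing: one encoded copy held
   in the JSON string value plus one copy in the output buffer.
   Returns 0 if the estimate would overflow size_t. */
static size_t approx_peak_bytes(size_t raw_len)
{
    size_t enc = base64_encoded_len(raw_len);
    if (enc > SIZE_MAX / 2)
        return 0;
    return enc * 2;
}
```

For example, a 3-byte input encodes to 4 characters, so the estimate is 8 bytes; the same arithmetic scales linearly to multi-gigabyte files.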

Petri

jhalp...@gmail.com

Oct 14, 2013, 2:16:42 PM
to jansso...@googlegroups.com
Thanks guys, appreciate it.

Joe

Jonathan Landis

Oct 14, 2013, 2:18:07 PM
to jansso...@googlegroups.com
On 10/14/2013 10:37 AM, Petri Lehtinen wrote:
> This is correct. In the upcoming v2.6, there's a string API with
> explicit size_t lengths, so you can create strings of over 2^31 bytes.

I see the recent commits changing int to size_t, but please revisit the
utf8_check_string function again. At line 176 you have this:

if(i + count > length)

Is there any guarantee that the expression "i + count" won't overflow if
the string is very large and ends with a truncated UTF-8 sequence?
Consider using subtraction instead. The outer loop guarantees that i <
length, so "length - i" is safe to evaluate:

if(count > length - i)
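The difference between the two forms can be seen in a small standalone sketch. The function names are hypothetical and the code is not taken from Jansson; it uses size_t values near SIZE_MAX purely to trigger the wraparound, which no real string would reach.

```c
#include <stddef.h>
#include <stdint.h>

/* Addition form: i + count can wrap around when i is close to
   SIZE_MAX, so a truncated sequence may go undetected. */
static int truncated_add(size_t i, size_t count, size_t length)
{
    return i + count > length;
}

/* Subtraction form: assumes i <= length (the loop guarantees
   i < length), so length - i cannot wrap. */
static int truncated_sub(size_t i, size_t count, size_t length)
{
    return count > length - i;
}
```

With length = SIZE_MAX, i = SIZE_MAX - 1 and count = 4, the sum wraps to 2, so the addition form reports that the sequence fits, while the subtraction form compares 4 against the 1 byte remaining and correctly flags it as truncated.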

I looked for other usages of strlen and found another potential integer
overflow at hashtable.c:252. At issue is the length of hashtable keys,
and probably they are going to be short in practice, but I don't see any
rules enforcing a limit.

JKL

Petri Lehtinen

Oct 15, 2013, 1:50:51 AM
to jansso...@googlegroups.com
Both fixed, thanks!

Petri


> diff --git a/src/utf.c b/src/utf.c
> index 0a2ba9b..cbeeb54 100644
> --- a/src/utf.c
> +++ b/src/utf.c
> @@ -173,7 +173,7 @@ int utf8_check_string(const char *string, size_t length)
>              return 0;
>          else if(count > 1)
>          {
> -            if(i + count > length)
> +            if(count > length - i)
>                  return 0;
>
>              if(!utf8_check_full(&string[i], count, NULL))
> diff --git a/src/hashtable.c b/src/hashtable.c
> index 5fb0467..a254cfa 100644
> --- a/src/hashtable.c
> +++ b/src/hashtable.c
> @@ -249,7 +249,14 @@ int hashtable_set(hashtable_t *hashtable,
>      /* offsetof(...) returns the size of pair_t without the last,
>         flexible member. This way, the correct amount is
>         allocated. */
> -    pair = jsonp_malloc(offsetof(pair_t, key) + strlen(key) + 1);
> +
> +    size_t len = strlen(key);
> +    if(len >= (size_t)-1 - offsetof(pair_t, key)) {
> +        /* Avoid an overflow if the key is very long */
> +        return -1;
> +    }
> +
> +    pair = jsonp_malloc(offsetof(pair_t, key) + len + 1);
>      if(!pair)
>          return -1;
>
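The guard in the hashtable.c hunk can be tried in isolation with a small sketch. The struct demo_pair below is a hypothetical stand-in for Jansson's internal pair_t (whose real members are not shown in this thread), and pair_alloc_size is an illustrative name, not a Jansson function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the internal pair_t: some fixed-size
   header followed by the key bytes as the last member. */
struct demo_pair {
    size_t hash;
    char key[1];
};

/* Computes offsetof(key) + key_len + 1 (for the terminating NUL)
   into *out. Returns -1 without touching *out if the sum would
   overflow size_t, mirroring the check in the patch. */
static int pair_alloc_size(size_t key_len, size_t *out)
{
    size_t header = offsetof(struct demo_pair, key);
    if (key_len >= SIZE_MAX - header)
        return -1;
    *out = header + key_len + 1;
    return 0;
}
```

The comparison is done before any addition, so the size passed to the allocator can never have wrapped around to a small value.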