[PATCH] Add json_object_new_double_with_str_representation()

108 views
Skip to first unread message

Even Rouault

unread,
Jul 21, 2012, 6:04:17 AM7/21/12
to jso...@googlegroups.com
Hi,

I submit here a proposed patch to add a function where a json double is created with a user-provided string representation. This is a sanitized and simpler version of a similar function that exists in the embedded copy of json-c in GDAL/OGR, and that would be necessary to ultimately remove it and use plain libjson-c 0.10 instead.

The aim of this function is to be able to precisely control how the double will be serialized (like controlling the number of significant figures, doing truncation of trailing zero's if necessary, using a printf-like function that is not sensitive to the locale, etc... Note: that messy part isn't in the patch, but is done by the caller of json_object_new_double_with_str_representation() )

The patch also includes updated unit tests to test this new function.

(I can submit a pull request if preferred)

Best regards,

Even
json_object_new_double_with_str_representation.patch

Mateusz Loskot

unread,
Jul 21, 2012, 7:34:42 AM7/21/12
to jso...@googlegroups.com
On 21 July 2012 11:04, Even Rouault <even.r...@gmail.com> wrote:
>
> I submit here a proposed patch to add a function where a json double is
> created with a user-provided string representation. This is a sanitized and
> simpler version of a similar function that exists in the embedded copy of
> json-c in GDAL/OGR, and that would be necessary to ultimately remove it and
> use plain libjson-c 0.10
> [...]

Even,

Thank you for this motion.

Folks,

Just to clarify, Even's patch is related to my patches I have been
submitting to json-c lately. Namely, submitting fixes and Visual C++
corrections developed within GDAL/OGR to json-c upstream.

I'd be thankful of json-c team would consider accepting Even's patch.

Best regards,
--
Mateusz Loskot, http://mateusz.loskot.net

Eric Haszlakiewicz

unread,
Jul 29, 2012, 10:05:58 PM7/29/12
to jso...@googlegroups.com
On 21 July 2012 11:04, Even Rouault <even.r...@gmail.com> wrote:
>
> I submit here a proposed patch to add a function where a json double is
> created with a user-provided string representation. This is a sanitized and
> simpler version of a similar function that exists in the embedded copy of
> json-c in GDAL/OGR, and that would be necessary to ultimately remove it and
> use plain libjson-c 0.10
> [...]

hmm... this seems rather hacky, and it's very specific to this one
type. If we want to have specific control over double type properties
now, wouldn't we want similar control over properties of other types
too? Also, it seems weird that there isn't a direct connection
between the double value and the string value.

I'm not familiar with what the use cases are, but perhaps it would
work better to instead provide a way to override the function used to
generate the output? e.g. something like:
int my_json_double_to_json_string(struct json_object *jso, struct
printbuf *pb, int level, int flags)
{
return sprintfbuf(pb, "%.2f", json_object_get_double(jso));
}

...
json_object *jso = json_object_new_double(123.456789);
json_object_set_serializer(jso, my_json_double_to_json_string);

printf("%s\n", json_object_to_json_string(jso)); // displays 123.46
...

Possible reasons I can think of this might not work:
Is the process of generating the string representation work
intensive, such that doing it every time the object is converting to a
string would be prohibitively inefficient?
Is the information contained within the double value insufficient
to properly convert it to string? e.g. are you doing something where
you receive the data in string form, and the same value (let's say
1.234) needs to map to different strings based on how it came in?
(e.g. sometimes "1.2340" vs. "123.4e-02")


eric

p.s. if we stick with the implementation from the patch, the function
name is rather unwieldy. I'd recommend dropping the "_representation"
and just using things like json_object_new_double_with_str()

Even Rouault

unread,
Aug 15, 2012, 6:42:12 AM8/15/12
to jso...@googlegroups.com
Eric, thanks for your feedback.



hmm... this seems rather hacky, and it's very specific to this one
type.  If we want to have specific control over double type properties
now, wouldn't we want similar control over properties of other types
too?

As far as primitive types are concerned, I didn't see how a string or an integer or a boolean could be represented differently (and still using correct JSON formatting) than their default serializing.

 
Also, it seems weird that there isn't a direct connection
between the double value and the string value.

I'm not familiar with what the use cases are, but perhaps it would
work better to instead provide a way to override the function used to
generate the output?  e.g. something like:
   int my_json_double_to_json_string(struct json_object *jso, struct
printbuf *pb, int level, int flags)
   {
        return sprintfbuf(pb, "%.2f", json_object_get_double(jso));
   }

  ...
   json_object *jso = json_object_new_double(123.456789);
   json_object_set_serializer(jso, my_json_double_to_json_string);

   printf("%s\n", json_object_to_json_string(jso));  // displays 123.46
  ...


That's one possibility that I had considered. But we would also need a user_data also passed to the serializer, for example to be able to control the precision instead of having it hard-coded in it. This is the GDAL use case actually. The OGR GeoJSON driver has a user configurable option to specify the number of figures after decimal point.

So in this approach the json_object structure would need 2 new fields, the pointer to the serializer func and the user_data (we would perhaps even need a pointer to a free function for the user data, but that's not strictly needed: we can assume that the caller is reponsible to free the user_data it provides to libjson, if needed). In the end, I wondered if it wouldn't be too costl, that's why I preferred the approach with the string directly provided that has less impact for users not using that customization. On the other hand, the json_object_new_double_with_str()  approach needs to allocate memory for the string, which is not the case of the serializer approach.

By the way, in the above serializer, the printbuf.h header should be used by user code. Is it indented to be used outside of libjson-c ?

 
Possible reasons I can think of this might not work:
   Is the process of generating the string representation work
intensive, such that doing it every time the object is converting to a
string would be prohibitively inefficient?

I don't think there would be significant performance problem whether the serialization is done prior to the call to son_object_new_double_with_str() or by a serializer callback. In my use case, the serialization happens only once.
 
   Is the information contained within the double value insufficient
to properly convert it to string?  e.g. are you doing something where
you receive the data in string form, and the same value (let's say
1.234) needs to map to different strings based on how it came in?
(e.g. sometimes "1.2340" vs. "123.4e-02")

No, the same double value would be always converted to the same string (for the same user_data value, see abve, of course)
 


eric

p.s. if we stick with the implementation from the patch, the function
name is rather unwieldy.  I'd recommend dropping the "_representation"
and just using things like json_object_new_double_with_str()

Yes, that would be a better naming.

In the end, I'm happy with the serializer approach or the json_object_new_double_with_str(). What would you prefer ?

 

Eric Haszlakiewicz

unread,
Sep 2, 2012, 2:00:48 PM9/2/12
to jso...@googlegroups.com
On Wed, Aug 15, 2012 at 5:42 AM, Even Rouault <even.r...@gmail.com> wrote:
> In the end, I'm happy with the serializer approach or the
> json_object_new_double_with_str(). What would you prefer ?

I think I prefer the serializer approach.

> As far as primitive types are concerned, I didn't see how a string or an
> integer or a boolean could be represented differently (and still using
> correct JSON formatting) than their default serializing.

Well, I'm expecting that a facility like this could also be used in
those cases where you need to go a bit beyond the standard JSON format
due to your specific application requirements.

>> int my_json_double_to_json_string(struct json_object *jso, struct
>> printbuf *pb, int level, int flags)
>> {
>> return sprintfbuf(pb, "%.2f", json_object_get_double(jso));
>> }
>>
>> ...
>> json_object *jso = json_object_new_double(123.456789);
>> json_object_set_serializer(jso, my_json_double_to_json_string);
>>
>> printf("%s\n", json_object_to_json_string(jso)); // displays 123.46
>> ...

> That's one possibility that I had considered. But we would also need a
> user_data also passed to the serializer, for example to be able to control
> the precision instead of having it hard-coded in it. This is the GDAL use
> case actually. The OGR GeoJSON driver has a user configurable option to
> specify the number of figures after decimal point.

Here's the API with that included:

int json_object_set_serializer(json_object *jso,
json_object_to_json_string_fn to_string_func,
void *userdata, // optional, can be NULL
void (*freeptr)(void *)); // optional, can be NULL even if
userdata is non-NULL

An example of the free function:

struct myinfo
{
int refcount;
...other fields...
};
void myfreefunc(json_object *jso, void *userdata)
{
struct myinfo *info = userdata;
info->refcount--;
if (info->refcount <= 0)
free(info);
}

> So in this approach the json_object structure would need 2 new fields, the
> pointer to the serializer func and the user_data (we would perhaps even need
> a pointer to a free function for the user data, but that's not strictly
> needed: we can assume that the caller is reponsible to free the user_data it
> provides to libjson, if needed).

Actually, we already have a field for the serializer function, so we
just need fields for the userdata and the free/delete function. I'm
adding that now.

> By the way, in the above serializer, the printbuf.h header should be used by
> user code. Is it indented to be used outside of libjson-c ?

I think that we would just declare it "quasi-public", and available
for use by custom serializer functions. I was thinking about renaming
it to json_printbuf, but since I don't see any other usages of
"printbuf" besides the one that comes with json-c, I'm just going to
leave it as-is.

eric

Mateusz Łoskot

unread,
Dec 20, 2012, 7:02:16 PM12/20/12
to jso...@googlegroups.com
On Sunday, 2 September 2012 19:00:49 UTC+1, Eric wrote:
On Wed, Aug 15, 2012 at 5:42 AM, Even Rouault <even.r...@gmail.com> wrote:
> In the end, I'm happy with the serializer approach or the
> json_object_new_double_with_str(). What would you prefer ?
> [...]

> So in this approach the json_object structure would need 2 new fields, the
> pointer to the serializer func and the user_data (we would perhaps even need
> a pointer to a free function for the user data, but that's not strictly
> needed: we can assume that the caller is reponsible to free the user_data it
> provides to libjson, if needed).

Actually, we already have a field for the serializer function, so we
just need fields for the userdata and the free/delete function.  I'm
adding that now.


Eric,

This is a bit dated topic, but could you confirm, this float-point serialization
update according to Even's patch has not been released in json-c 0.10, has it?
Reply all
Reply to author
Forward
0 new messages