Fast floating-point number encoding

Volkan Yazıcı

Dec 27, 2019, 6:29:19 AM
to jackson-user
Hello,

Is there a faster way to encode floating-point numbers in JsonGenerator? #writeNumber(double) falls back to Double.toString(), which is known to be inefficient. In my particular case, I want to encode a java.time.Instant. An Instant is, in principle, composed of epoch seconds (the integral part) and nanos (the fractional part). Hence, I can easily do #writeNumber("" + epochSeconds + '.' + epochSecondsNanos). This has one caveat, though: String allocation. Given that I have two longs denoting the integral and fractional parts of a floating-point number, is it possible to output this via JsonGenerator without extra allocation?

Best.
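
A note on the concatenation in the question: as far as I can tell, it only yields the intended value when the nanosecond part is left-padded with zeros to nine digits. The sketch below illustrates this using only the JDK. The class name `InstantDecimalString` and both method names are hypothetical, and it assumes instants at or after the epoch (for earlier instants, `getEpochSecond()` is negative while `getNano()` stays non-negative, so plain concatenation would not give the right value).

```java
import java.time.Instant;

// Hypothetical helper, for illustration only; not part of Jackson.
final class InstantDecimalString {

    // Naive form from the question: the nanos are appended without padding,
    // so 1 s + 5 ns becomes "1.5" (i.e. one and a half seconds).
    static String naive(Instant instant) {
        return "" + instant.getEpochSecond() + '.' + instant.getNano();
    }

    // Padded form: the nanosecond part always occupies nine digits.
    // Assumes instant.getEpochSecond() >= 0.
    static String padded(Instant instant) {
        String nanos = Integer.toString(instant.getNano());
        StringBuilder sb = new StringBuilder(32);
        sb.append(instant.getEpochSecond()).append('.');
        for (int i = nanos.length(); i < 9; i++) {
            sb.append('0');
        }
        return sb.append(nanos).toString();
    }
}
```

For `Instant.ofEpochSecond(1, 5)`, `naive` returns `1.5` while `padded` returns `1.000000005`. Both variants still allocate a String, which is the cost the question is about.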

Volkan Yazıcı

Dec 27, 2019, 5:24:55 PM
to jackson-user
I've made some further progress by encoding my floating-point number into a ThreadLocal<char[]> without any garbage, but now I am missing a JsonGenerator#writeNumber(char[], int, int) method. I tried #writeRawValue(char[], int, int), but ended up corrupting the state. No luck so far. Any tips?
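
For illustration, one way such garbage-free encoding might look is sketched below. It is not the code from this thread: the class `InstantCharEncoder`, its methods, and the buffer size are assumptions made for the example, and it only handles non-negative epoch seconds.

```java
// Hypothetical sketch; names and sizes are assumptions, not Jackson API.
final class InstantCharEncoder {

    // Up to 19 digits for a non-negative long, 1 for '.', 9 for the nanos.
    private static final int MAX_LENGTH = 30;

    // One reusable buffer per thread, allocated once on first use.
    private static final ThreadLocal<char[]> BUFFER =
            ThreadLocal.withInitial(() -> new char[MAX_LENGTH]);

    /** Returns the calling thread's reusable buffer. */
    static char[] buffer() {
        return BUFFER.get();
    }

    /**
     * Writes "<epochSeconds>.<nanos as nine digits>" at the start of buf
     * and returns the number of chars written.
     * Assumes epochSeconds >= 0 and 0 <= nanos <= 999_999_999.
     */
    static int encode(long epochSeconds, int nanos, char[] buf) {
        // Count the decimal digits of the integral part.
        int digits = 1;
        for (long v = epochSeconds; v >= 10; v /= 10) {
            digits++;
        }
        // Write the integral digits from right to left.
        long v = epochSeconds;
        for (int i = digits - 1; i >= 0; i--) {
            buf[i] = (char) ('0' + (int) (v % 10));
            v /= 10;
        }
        buf[digits] = '.';
        // Write exactly nine fractional digits, zero-padded on the left.
        int end = digits + 1 + 9;
        int n = nanos;
        for (int i = end - 1; i > digits; i--) {
            buf[i] = (char) ('0' + n % 10);
            n /= 10;
        }
        return end;
    }
}
```

After `int len = InstantCharEncoder.encode(seconds, nanos, buf)`, the region `buf[0..len)` is what one would hand to a `char[]`-based generator method such as the `#writeRawValue(char[], int, int)` mentioned above.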

Tatu Saloranta

Dec 28, 2019, 6:31:42 PM
to jackson-user
On Fri, Dec 27, 2019 at 3:29 AM Volkan Yazıcı <volkan...@gmail.com> wrote:
>
> Hello,
>
> Is there a faster way to encode floating-point numbers in JsonGenerator?

Not currently. I think there is an issue filed to support faster
encoding... this one:

https://github.com/FasterXML/jackson-core/issues/577

so if you or anyone else was interested in working on that -- esp.
since there is some prior art to use (Jsoniter author offered his help
which is very nice) -- that would be very useful.

But as to other specific alternatives:

> #writeNumber(double) falls back to Double.toString(), which is known to be inefficient. In my particular case, I want to encode a java.time.Instant. An Instant is, in principle, composed of epoch seconds (the integral part) and nanos (the fractional part). Hence, I can easily do #writeNumber("" + epochSeconds + '.' + epochSecondsNanos). This has one caveat, though: String allocation. Given that I have two longs denoting the integral and fractional parts of a floating-point number, is it possible to output this via JsonGenerator without extra allocation?

I guess that if you are only focusing on JSON output (and not other
formats), you could probably use a sequence of:

writeRawValue(firstPart);
writeRaw(secondAndOthers);

to get output that would not require String allocation on the caller side
(and buffering does not construct Strings) and should add proper
separators. `writeRawValue()`, specifically, ensures that preceding
separator(s) / indentation are added, whereas `writeRaw()` explicitly
adds nothing.

Whether doing this is worth the hassle is an open question, I think;
you may want to write a benchmark to investigate performance benefits.
My guess is that String concat/allocation won't be a significant
overhead in this case -- but I have been wrong before, so measurements
trump educated guesses :)

One more thing: although it wouldn't help you at this point, the addition
of something like

writeNumber(char[] buffer, int offset, int length);

for 2.11 would also allow the caller to reuse the encoding buffer. If that
seems useful, please file an issue for `jackson-core`: the addition would
be trivial and could go in 2.11.

I hope this helps!

-+ Tatu +-

Volkan Yazıcı

Dec 29, 2019, 9:42:31 AM
to jackson-user
Thanks for the prompt reply Tatu!

[My comments are inline.]

On Sunday, December 29, 2019 at 12:31:42 AM UTC+1, Tatu Saloranta wrote:
> On Fri, Dec 27, 2019 at 3:29 AM Volkan Yazıcı <volkan...@gmail.com> wrote:
>> Is there a faster way to encode floating-point numbers in JsonGenerator?
>
> Not currently. I think there is an issue filed to support faster
> encoding... this one:
>
> https://github.com/FasterXML/jackson-core/issues/577
>
> so if you or anyone else was interested in working on that -- esp.
> since there is some prior art to use (Jsoniter author offered his help
> which is very nice) -- that would be very useful.

I had bumped into #514. Here #577 proposes incorporating jsoniter enhancements, while #514 proposes using ryu, which is the library behind jsoniter, AFAIU.
 
>> #writeNumber(double) falls back to Double.toString(), which is known to be inefficient. In my particular case, I want to encode a java.time.Instant. An Instant is, in principle, composed of epoch seconds (the integral part) and nanos (the fractional part). Hence, I can easily do #writeNumber("" + epochSeconds + '.' + epochSecondsNanos). This has one caveat, though: String allocation. Given that I have two longs denoting the integral and fractional parts of a floating-point number, is it possible to output this via JsonGenerator without extra allocation?
>
> I guess that if you are only focusing on JSON output (and not other
> formats), you could probably use a sequence of:
>
>     writeRawValue(firstPart);
>     writeRaw(secondAndOthers);

This did not really work for me. But right now I cannot recall why. I will give this another try and will let you know.
 
> to get output that would not require String allocation on the caller side
> (and buffering does not construct Strings) and should add proper
> separators. `writeRawValue()`, specifically, ensures that preceding
> separator(s) / indentation are added, whereas `writeRaw()` explicitly
> adds nothing.
>
> Whether doing this is worth the hassle is an open question, I think;
> you may want to write a benchmark to investigate performance benefits.
> My guess is that String concat/allocation won't be a significant
> overhead in this case -- but I have been wrong before, so measurements
> trump educated guesses :)

Yep, my conclusion is JMH-driven.
 
> One more thing: although it wouldn't help you at this point, the addition
> of something like
>
>     writeNumber(char[] buffer, int offset, int length);
>
> for 2.11 would also allow the caller to reuse the encoding buffer. If that
> seems useful, please file an issue for `jackson-core`: the addition would
> be trivial and could go in 2.11.

Created #587.
 

Volkan Yazıcı

Dec 29, 2019, 4:56:50 PM
to jackson-user
On Sunday, December 29, 2019 at 3:42:31 PM UTC+1, Volkan Yazıcı wrote:
> On Sunday, December 29, 2019 at 12:31:42 AM UTC+1, Tatu Saloranta wrote:
>> I guess that if you are only focusing on JSON output (and not other
>> formats), you could probably use a sequence of:
>>
>>     writeRawValue(firstPart);
>>     writeRaw(secondAndOthers);
>
> This did not really work for me. But right now I cannot recall why. I will give this another try and will let you know.

I guess I have a smoking gun here: FilteringGeneratorDelegate! I have reported the issue along with the reproduction path in #588.
 

Volkan Yazıcı

Jan 7, 2020, 10:16:14 AM
to jackson-user
Tatu, I've submitted two PRs for the issues that I reported: #589 and #590. Would you be so kind as to include them in 2.11, please? And do you have an ETA for that release? (I need the writeNumber(char[], int, int) fix for log4j2-logstash-layout.)

Tatu Saloranta

Jan 7, 2020, 1:32:04 PM
to jackson-user
Hi there! Thank you for submitting those -- this (#587, I think) is on my todo list:


and I'll try to get it done in the near future, now that the 2.10.2 release is out (it takes a couple of hours to do a full minor release).

As to 2.11 (pre-release), the earliest release candidate could go out by the end of January, but it depends on my progress on the main remaining "big feature" related to `jackson-jr` (which requires API changes and cannot go in a patch). After that I'll go over issues and see if there are any other must-have issues with API changes.

I am not yet sure if there even should be release-candidates or not (there's limited evidence that I get enough feedback to help with official release), too. Either way, actual 2.11.0 should hopefully be released some time during February 2020.

-+ Tatu +-
