[Feedback wanted] Distinguish null from empty string for LocalDate and LocalDateTime deserialization #114

17 views
Skip to first unread message

Michael O'Keeffe

unread,
Sep 6, 2019, 12:53:33 PM9/6/19
to jackson-dev
Hi all,

This is a request for feedback on how to distinquish null from empty string for LocalDate and LocalDateTime deserialization, issue #114

To summarize the issue and possible solutions:

beytun (issue submitter) writes:
 

For LocalDate and LocalDateTime deserialization, empty string is mapped to null by LocalDateDeserializer and LocalDateTimeDeserializer respectively. So something like {"date": null} can't be distinguished from {"date": ""}. From the view of a strict API, the latter is an error.

It would be good to add some feature to treat empty string as invalid format, that LocalDateDeserializer and LocalDateTimeDeserializer could report this as an error instead of mapping to null.


Tatu writes:

Perhaps use of DeserializationFeature.ACCEPT_EMPTY_STRING_AS_NULL_OBJECT would work here, although my main concern is that since default value is false, adding a check would make formerly accepted values invalid with 2.10...

Alternative way would be to add a new option/feature in JavaTimeModule, where it could be locally enabled/disabled.


Now, given beytun's point about 'strict', perhaps another solution is to use the lenient/strict setting here? That is already used in LocalDateDeserializer to throw an exception:


 if (parser.hasToken(JsonToken.VALUE_NUMBER_INT)) {
   
if (!isLenient()) {
     
return _failForNotLenient(parser, context, JsonToken.VALUE_STRING);
   
}
   
return LocalDate.ofEpochDay(parser.getLongValue());
 
}

Test case to illustrate issue:

private final TypeReference<Map<String, LocalDate>> MAP_TYPE_REF = new TypeReference<Map<String, LocalDate>>() { };
private final ObjectMapper LOCAL_DATE_MAPPER = newMapper();
private final ObjectReader LOCAL_DATE_READER = LOCAL_DATE_MAPPER.readerFor(MAP_TYPE_REF);

@Test
public void testDeserializationNullAndEmptyString() throws Exception {
 
String dateValAsNullStr = null;

 
String valueFromNullStr = LOCAL_DATE_MAPPER.writeValueAsString(asMap("date", dateValAsNullStr));
 
Map<String, LocalDate> actualMapFromNullStr = LOCAL_DATE_READER.readValue(valueFromNullStr);
 assertNull
(actualMapFromNullStr.get("date"));

 
String dateValAsEmptyStr = "";
 
String valueFromEmptyStr = LOCAL_DATE_MAPPER.writeValueAsString(asMap("date", dateValAsEmptyStr));
 
Map<String, LocalDate> actualMapFromEmptyStr = LOCAL_DATE_READER.readValue(valueFromEmptyStr);
 
// this test fails!
 assertNotEquals
(actualMapFromEmptyStr,actualMapFromNullStr);
}



Researching a bit, there doesn't seem to be anything in the spec that says not to do it as beytun suggests:

https://stackoverflow.com/questions/9619852/what-is-the-convention-in-json-for-empty-vs-null?noredirect=1&lq=1
https://www.ibm.com/support/knowledgecenter/SSTLXK_7.5.1/com.ibm.wbpm.wid.integ.doc/topics/rjsonnullunsempprops.html

Speaking of nulls, Tony Hoare calls this his "billion-dollar" mistake. Initially I disagreed but now I'm not so sure :)
https://www.lucidchart.com/techblog/2015/08/31/the-worst-mistake-of-computer-science/



Ruwen Schwedewsky

unread,
Sep 8, 2019, 8:02:22 PM9/8/19
to jacks...@googlegroups.com
Moin!

How about delaying this change to 3.0 and then use the lenient option? Imho "" and null should be treated differently by default.

Cheers
Ruwen
--
You received this message because you are subscribed to the Google Groups "jackson-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jackson-dev...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jackson-dev/f98b3a71-c408-4eb9-a029-743dcf1f73af%40googlegroups.com.

This email and any attachments to it (the "Communication") is, unless otherwise stated, confidential, may contain copyright material and is for the use only of the intended recipient. If you receive the Communication in error, please notify the sender immediately by return email, delete the Communication and the return e mail, and do not read, copy, retransmit or otherwise deal with it. Any views expressed in the Communication are those of the individual sender only, unless expressly stated to be those of Message4U Pty Ltd trading as MessageMedia or any of its related entities (together “MessageMedia”). MessageMedia does not accept liability in connection with the integrity of or errors in the Communication, computer virus, data corruption, interference or delay arising from or in respect of the Communication.

Tatu Saloranta

unread,
Sep 10, 2019, 1:07:21 AM9/10/19
to jacks...@googlegroups.com
On Sun, Sep 8, 2019 at 5:02 PM Ruwen Schwedewsky
<ruwen.sc...@messagemedia.com.au> wrote:
>
> Moin!
>
> How about delaying this change to 3.0 and then use the lenient option? Imho "" and null should be treated differently by default.

I think the problem is that as things are right now, "" and `null` are
not distinguished by Java 8 date/time deserializers. I agree that a
change in default behavior should wait until 3.0, but it seems
unfortunate if there is no way to force strict handling with 2.10.

If we do want to allow that, it seems to me there are 2 main options:

1. Use `lenient` vs "strict" setting: 2.10 allows global setting
(default being "lenient" for backwards compatibility), as well as
`@JsonFormat` override for type (class) and property
2. Use `ACCEPT_EMPTY_STRING_AS_NULL_OBJECT` -- problem being that it
is disabled by default, which is at odds with actual behavior.

I think I would favor (1) , but it has one downside as well: since it
already has meaning for date/time's (for example allowing values like
"February 31st, 2019", to overflow into "March 3rd, 2019", I think),
this would be extending its semantics.

I guess there is also 3rd option: add a setting in Java 8 date/time
module itself; something that would be removed from 3.0, if used.

-+ Tatu +-
> To view this discussion on the web visit https://groups.google.com/d/msgid/jackson-dev/78670562-c800-4048-1dca-b78d76a73efc%40messagemedia.com.au.

Michael

unread,
Sep 11, 2019, 12:21:32 AM9/11/19
to jacks...@googlegroups.com


>I think I would favor (1) , but it has one downside as well: since it
>already has meaning for date/time's (for example allowing values like
>"February 31st, 2019", to overflow into "March 3rd, 2019", I think),
>this would be extending its semantics.

Yes, we are blurring the definition a little of strict vs. lenient, as is currently used in Jackson, with the overflow. However, from these two links, we seem to be within the definition, as an empty string is not "within the outer range of valid values for the field":

Using strict resolution will ensure that all parsed values are within the outer range of valid values for the field. Individual fields may be further processed for strictness.

For example, resolving year-month and day-of-month in the ISO calendar system using strict mode will ensure that the day-of-month is valid for the year-month, rejecting invalid values.

Using lenient resolution will resolve the values in an appropriate lenient manner. Individual fields will interpret this differently.

For example, lenient mode allows the month in the ISO calendar system to be outside the range 1 to 12. For example, month 15 is treated as being 3 months after month 12.

https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html#ISO_LOCAL_TIME

https://docs.oracle.com/javase/8/docs/api/java/time/format/ResolverStyle.html#STRICT


Now, one question: Would 'strict' be the default in 3.0?


If that's the case, option 1 would still work by flipping the logic around, to return null 'if isLenient'. But if 'lenient' is the default, the 'lenient/strict' method wouldn't meet Ruwen's recommendation, which leads to option #3.

>2. Use `ACCEPT_EMPTY_STRING_AS_NULL_OBJECT` -- problem being that it
>is disabled by default, which is at odds with actual behavior.

>I guess there is also 3rd option: add a setting in Java 8 date/time

>module itself; something that would be removed from 3.0, if used.

Ok, perhaps 'ACCEPT_EMPTY_STRING_AS_NULL_DATE' which defaults to true in 2.10

Then removed and then replaced with `ACCEPT_EMPTY_STRING_AS_NULL_OBJECT` in 3.0, which is already defaulted to false.

Given all that, if option 1 looks ok, I can move the PR from Draft-> PR, unless option 3 is a better way to go, and there my biggest question is what exception to throw, maybe the 'weird string'? Thoughts?

Tatu Saloranta

unread,
Sep 14, 2019, 10:28:32 PM9/14/19
to jacks...@googlegroups.com
On Tue, Sep 10, 2019 at 9:21 PM Michael <mich...@gmail.com> wrote:
>
> >I think I would favor (1) , but it has one downside as well: since it
> >already has meaning for date/time's (for example allowing values like
> >"February 31st, 2019", to overflow into "March 3rd, 2019", I think),
> >this would be extending its semantics.
>
> Yes, we are blurring the definition a little of strict vs. lenient, as is currently used in Jackson, with the overflow. However, from these two links, we seem to be within the definition, as an empty string is not "within the outer range of valid values for the field":

True. One nice thing about strict/lenient settings is that there is
already full support for global/per-type/per-property hierarchic
configuration. And for 3.0 we should be able to add "type category"
defaulting between first two categories too ("for all date/time values
use strict unless overridden by type or property").

> Using strict resolution will ensure that all parsed values are within the outer range of valid values for the field. Individual fields may be further processed for strictness.
>
> For example, resolving year-month and day-of-month in the ISO calendar system using strict mode will ensure that the day-of-month is valid for the year-month, rejecting invalid values.
>
> Using lenient resolution will resolve the values in an appropriate lenient manner. Individual fields will interpret this differently.
>
> For example, lenient mode allows the month in the ISO calendar system to be outside the range 1 to 12. For example, month 15 is treated as being 3 months after month 12.
>
> https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html#ISO_LOCAL_TIME
>
> https://docs.oracle.com/javase/8/docs/api/java/time/format/ResolverStyle.html#STRICT
>
>
> Now, one question: Would 'strict' be the default in 3.0?

Good question. I am not sure.

I have been burnt a few times by this: the biggest example is default
of Jackson to throw exception on unknown properties. To me that was a
no-brainer -- who would want to defaul to "just ignore the crap",
based on frustration this can lead to with mismatching structures,
quietly passing! -- but many users have claimed this as the most
obnoxious default ever. :-)

So. Maybe? If so (or even if not I suppose) we MUST document it
clearly, and document ways to change it.

>
> If that's the case, option 1 would still work by flipping the logic around, to return null 'if isLenient'. But if 'lenient' is the default, the 'lenient/strict' method wouldn't meet Ruwen's recommendation, which leads to option #3.
>
> >2. Use `ACCEPT_EMPTY_STRING_AS_NULL_OBJECT` -- problem being that it
> >is disabled by default, which is at odds with actual behavior.
>
> >I guess there is also 3rd option: add a setting in Java 8 date/time
> >module itself; something that would be removed from 3.0, if used.
>
> Ok, perhaps 'ACCEPT_EMPTY_STRING_AS_NULL_DATE' which defaults to true in 2.10

Problem here is two-fold: first, I do not want to add many
type-specific Features (although TBH, number of types that need these
is kind of limited; Enums, Date/Time...). Second is my concern is
hierarchic handling, if and how to override it.

> Then removed and then replaced with `ACCEPT_EMPTY_STRING_AS_NULL_OBJECT` in 3.0, which is already defaulted to false.
>
> Given all that, if option 1 looks ok, I can move the PR from Draft-> PR, unless option 3 is a better way to go, and there my biggest question is what exception to throw, maybe the 'weird string'? Thoughts?

Weird String is probably fine.

But now I am bit confused myself on what I actually proposed... so
perhaps before PR, full description on intended change, as an issue
(maybe modify existing one?), to be reviewed, would make sense.

And also phasing wrt 2.10, 3.0.

-+ Tatu +-
> To view this discussion on the web visit https://groups.google.com/d/msgid/jackson-dev/CAEBemP14sVqZy0LabBvCHBfuViz1qh0G1ZJ_6KHUkmOLQuGEGQ%40mail.gmail.com.
Reply all
Reply to author
Forward
0 new messages