DecimalField model validation

1,375 views
Skip to first unread message

Marco Paolini

unread,
Oct 5, 2011, 9:19:43 AM10/5/11
to django-d...@googlegroups.com
Hi,

I would like to share some thoughts regarding django.db.models.DecimalField:

.. (A) silent rounding issue:
when a decimal value is saved in db
its decimal digits exceeding decimal_places are rounded off using
.quantize(). Rounding defaults to ROUND_HALF_EVEN (python default).
There is no mention of this behavior in docs.

.. (B) no model validation for decimal digits:
.full_clean() does not raise any exception:
- if the value has too many decimal digits
- or if value has too many digits
other fields, like CharField, do not validate values that exceed fields constraints

.. (C) errors on save:
decimal values exceeding field's digits boundaries (decimal or total) make .save() raise decimal.InvalidOperation exceptions

In my opinion they should be fixed in a backwards-incompatible way!:
(A) django shuld not round decimal values silently before saving, if the value has too many decimal digits,
raise an exception (just like for values that have too many digits)
(B) decimalfield shoud not validate values that exceed its max_digits and decimal_places constraints
(C) leave as is, but if you solve (A) and (B) it's not that bad

I have a small django testcase that better explains the problem and proposed solution(s)

What do you think?


Marco

Paul McMillan

unread,
Oct 5, 2011, 8:45:11 PM10/5/11
to django-d...@googlegroups.com
> .. (A) silent rounding issue:
>    when a decimal value is saved in db
>    its decimal digits exceeding decimal_places are rounded off using
>    .quantize(). Rounding defaults to ROUND_HALF_EVEN (python default).
>    There is no mention of this behavior in docs.
Docs patches welcomed. This rounding behavior is generally superior to
ROUND_UP (what most of us are taught in gradeschool) for a number of
reasons that are inappropriate to get into here. If we're going to
round, that's the behavior we should use as default.

> .. (B) no model validation for decimal digits:
>    .full_clean() does not raise any exception:
>       - if the value has too many decimal digits
>       - or if value has too many digits
>    other fields, like CharField, do not validate values that exceed fields
> constraints

There's no clearly defined way to deal with overlong CharFields.
Rounding overlong floating point numbers is common and not unexpected.
Decimals work similarly to floats, and so it's not unreasonable to
have similar behavior.

Reading the docs on Python's decimal module, it makes sense to think
of the decimal_places as an analog to python's decimal notion of
precision.
http://docs.python.org/library/decimal.html#module-decimal

> .. (C) errors on save:
>   decimal values exceeding field's digits boundaries (decimal or total) make
> .save() raise decimal.InvalidOperation exceptions

This does sound like an error in validation. If it's possible to pass
a value through form validation to the point where it has errors on
save, that is a bug.

> In my opinion they should be fixed in a backwards-incompatible way!:

Thank you for your opinion, we don't do backwards-incompatible fixes.
There's usually a compatible way to fix behavior.

>  (A) django shuld not round decimal values silently before saving, if the
> value has too many decimal digits,
>      raise an exception (just like for values that have too many digits)

In the general use case, the existing behavior is more user friendly.
Many people would run out and re-add the rounding behavior if we
changed it. Users will enter over-long strings into decimal fields.
It's really unfriendly to paste an 32 place decimal number in and be
told it can only be 17 decimal digits long (and then have to go count
out till you get the correct number). Since there's not a good way to
make this the default value and retain backwards compatibility, this
is unlikely to change. This behavior should be documented.

That said, a no_rounding kwarg sounds like a perfectly reasonable feature.

>  (B) decimalfield shoud not validate values that exceed its max_digits and
> decimal_places constraints

I agree that decimalfield should not validate values that exceed
max_digits. If it does, that is a bug we should fix. We need to be
careful not to create situations where very large numbers validate but
very small but precise decimal numbers get rounded unexpectedly.
Unfortunately, these two parameters overlap in difficult ways. I
disagree about decimal_places, since the expected behavior is rounding
in most other real-world circumstances.

Does the combination of documenting the existing rounding behavior,
fixing the error on save() with max_digits, and adding the no_rounding
feature address your concerns?

-Paul

Marco Paolini

unread,
Oct 6, 2011, 4:13:01 AM10/6/11
to django-d...@googlegroups.com
On 06/10/2011 02:45, Paul McMillan wrote:
>> .. (A) silent rounding issue:
>> when a decimal value is saved in db
>> its decimal digits exceeding decimal_places are rounded off using
>> .quantize(). Rounding defaults to ROUND_HALF_EVEN (python default).
>> There is no mention of this behavior in docs.
> Docs patches welcomed. This rounding behavior is generally superior to
> ROUND_UP (what most of us are taught in gradeschool) for a number of
> reasons that are inappropriate to get into here. If we're going to
> round, that's the behavior we should use as default.
Maybe you were referring to ROUND_HALF_UP, where 0.5 becomes 1?

with current ROUND_HALF_EVEN 0.5 becomes 0.

Yes I agree ROUND_HALF_EVEN in this particular situation might be
better than ROUND_HALF_UP, let's write it in docs.

>
>> .. (B) no model validation for decimal digits:
>> .full_clean() does not raise any exception:
>> - if the value has too many decimal digits
>> - or if value has too many digits
>> other fields, like CharField, do not validate values that exceed fields
>> constraints
> There's no clearly defined way to deal with overlong CharFields.
> Rounding overlong floating point numbers is common and not unexpected.
> Decimals work similarly to floats, and so it's not unreasonable to
> have similar behavior.
>
> Reading the docs on Python's decimal module, it makes sense to think
> of the decimal_places as an analog to python's decimal notion of
> precision.
> http://docs.python.org/library/decimal.html#module-decimal
>
>> .. (C) errors on save:
>> decimal values exceeding field's digits boundaries (decimal or total) make
>> .save() raise decimal.InvalidOperation exceptions
> This does sound like an error in validation. If it's possible to pass
> a value through form validation to the point where it has errors on
> save, that is a bug.

Forms (ModelForms) correctly check decimal places and max_digits.

I was referring to model validation: .full_clean() on model instance
that in turn calls .validate() on field instance

>
>> In my opinion they should be fixed in a backwards-incompatible way!:
> Thank you for your opinion, we don't do backwards-incompatible fixes.
> There's usually a compatible way to fix behavior.
>
>> (A) django shuld not round decimal values silently before saving, if the
>> value has too many decimal digits,
>> raise an exception (just like for values that have too many digits)
> In the general use case, the existing behavior is more user friendly.
> Many people would run out and re-add the rounding behavior if we
> changed it. Users will enter over-long strings into decimal fields.
> It's really unfriendly to paste an 32 place decimal number in and be
> told it can only be 17 decimal digits long (and then have to go count
> out till you get the correct number).

This is exactly what happens right now with forms.DecimalField,

It requires Users to check length of decimal input: it's very picky
and it does a good job: 0.91 will not validate on a form field
with only one decimal digit.

I don't understand who are thu Users you mention...

Users of model.full_clean()
are usually programmers that get frustrated to find out that the model
did validate and then they get errors on save()

Users of web forms are not allowed to enter values with more decimal digits
than expected, even if the value will be rounded. As a matter of fact
forms.DecimalField does not know ronding.



>
> That said, a no_rounding kwarg sounds like a perfectly reasonable feature.
>

ok, the kwarg will do


>> (B) decimalfield shoud not validate values that exceed its max_digits and
>> decimal_places constraints
> I agree that decimalfield should not validate values that exceed
> max_digits. If it does, that is a bug we should fix. We need to be
> careful not to create situations where very large numbers validate but
> very small but precise decimal numbers get rounded unexpectedly.
> Unfortunately, these two parameters overlap in difficult ways. I
> disagree about decimal_places, since the expected behavior is rounding
> in most other real-world circumstances.

ok, that's fine, again let's wirite it in the docs: when you call save() your decimal is rounded
with ROUND_HALF_EVEN to match decimal_places.


>
> Does the combination of documenting the existing rounding behavior,
> fixing the error on save() with max_digits, and adding the no_rounding
> feature address your concerns?

yes, let's summarize:
- (A) add a no_rounding kwarg to models.DecimalField
- (B) implement models.DecimalField .validate() that will raise an exception
if value exceed decimal_places and no_rounding
OR if value exceeds max_digits.
The principle is: if models.DecimalField.validate() passes
then it's safe to call save()
in practice we could use code in forms.fields.DecimalField
as a starting point
- (C) errors on .save() are acceptable ONLY if .validate() raises exception
we keep rounding HALF_EVEN if value has too many decimal digits
we keep raising a decimal.InvalidOperation exception if value exceeds max_digits

If we agree, I will be happy to write a patch with tests and docs

Thanks for your comments,

Marco

Tai Lee

unread,
Oct 7, 2011, 12:29:19 AM10/7/11
to Django developers
Why is ROUND_HALF_EVEN superior? Perhaps for statistics, where
rounding all halves up would skew results, but I guess not for most
other cases.

If the default rounding behaviour produces different results than a
regular calculator, common spreadsheet and accounting software or even
human (mentally adding numbers) would, then I think the default
behaviour is surprising to many users. I was certainly surprised when
I first encountered it, not being familiar with Python's default of
ROUND_HALF_EVEN.

I don't see an easy way for users to change the default rounding
behaviour for `DecimalField`. Would it be considered backwards
incompatible to change the default to ROUND_HALF_UP?

If so, would it be easy enough to make it possible for users to easily
change the default behaviour for a single field or even project-wide?

I hate to suggest adding another setting, but might it be necessary
for some users to change the rounding behaviour of DecimalField that
is defined in a 3rd party app?

On a related note, the `floatformat` template tag explicitly uses
ROUND_HALF_UP. I think we should at least be consistent. And if
there's cause to allow users to change the behaviour for DecimalField,
they should probably be able to change `floatformat` as well.

I agree that a no_rounding argument would also be good, in addition to
being able to specify the rounding method to use.

Cheers.
Tai.


On Oct 6, 11:45 am, Paul McMillan <p...@mcmillan.ws> wrote:
> > .. (A) silent rounding issue:
> >    when a decimal value is saved in db
> >    its decimal digits exceeding decimal_places are rounded off using
> >    .quantize(). Rounding defaults to ROUND_HALF_EVEN (python default).
> >    There is no mention of this behavior in docs.
>
> Docs patches welcomed. This rounding behavior is generally superior to
> ROUND_UP (what most of us are taught in gradeschool) for a number of
> reasons that are inappropriate to get into here. If we're going to
> round, that's the behavior we should use as default.
>
> > .. (B) no model validation for decimal digits:
> >    .full_clean() does not raise any exception:
> >       - if the value has too many decimal digits
> >       - or if value has too many digits
> >    other fields, like CharField, do not validate values that exceed fields
> > constraints
>
> There's no clearly defined way to deal with overlong CharFields.
> Rounding overlong floating point numbers is common and not unexpected.
> Decimals work similarly to floats, and so it's not unreasonable to
> have similar behavior.
>
> Reading the docs on Python's decimal module, it makes sense to think
> of the decimal_places as an analog to python's decimal notion of
> precision.http://docs.python.org/library/decimal.html#module-decimal

Christophe Pettus

unread,
Oct 7, 2011, 1:25:56 AM10/7/11
to django-d...@googlegroups.com

On Oct 6, 2011, at 9:29 PM, Tai Lee wrote:

> Why is ROUND_HALF_EVEN superior?

ROUND_HALF_EVEN is the standard when doing financial calculations, an extremely common use of DecimalField.

--
-- Christophe Pettus
x...@thebuild.com

Marco Paolini

unread,
Oct 7, 2011, 3:59:55 AM10/7/11
to django-d...@googlegroups.com
On 07/10/2011 06:29, Tai Lee wrote:
> Why is ROUND_HALF_EVEN superior? Perhaps for statistics, where
> rounding all halves up would skew results, but I guess not for most
> other cases.
>
> If the default rounding behaviour produces different results than a
> regular calculator, common spreadsheet and accounting software or even
> human (mentally adding numbers) would, then I think the default
> behaviour is surprising to many users. I was certainly surprised when
> I first encountered it, not being familiar with Python's default of
> ROUND_HALF_EVEN.
>
> I don't see an easy way for users to change the default rounding
> behaviour for `DecimalField`. Would it be considered backwards
> incompatible to change the default to ROUND_HALF_UP?
>
> If so, would it be easy enough to make it possible for users to easily
> change the default behaviour for a single field or even project-wide?

I think rounding on .save() is not good
even if you could choose which rounding method to use
or if you changed the default method to a more familiar one.
(more on this below)

>
> I hate to suggest adding another setting, but might it be necessary
> for some users to change the rounding behaviour of DecimalField that
> is defined in a 3rd party app?

Another option you have here is to alter behaviour of your db backend value_to_db_decimal

You can subclass the backend you're using, then
in settings.py DATABASES['default']['ENGINE'] = 'my_proj.backends.RoundingBackend'

Even monkeypatching the original backend will do the trick.

Or listening for pre-save signals in models with DecimalField

>
> On a related note, the `floatformat` template tag explicitly uses
> ROUND_HALF_UP. I think we should at least be consistent. And if
> there's cause to allow users to change the behaviour for DecimalField,
> they should probably be able to change `floatformat` as well.
>
> I agree that a no_rounding argument would also be good, in addition to
> being able to specify the rounding method to use.

ok so the rounding kwarg could be set to 'NO_ROUND', 'ROUND_HALF_UP', 'ROUND_HALF_DOWN', etc...

when, for instance, 'ROUND_HALF_UP' is chosen, rounding should take
place during model validation, this way after calling .full_clean()
all the instance attributes are rounded.

Code in model.clean() will already get the rounded values and
make decisions based on the actual values that will later be stored
in db.

If you choose no rounding (and the digits exceed), then full_clean()
should raise an exception

Letting the rounding happen in .save() is bad and should be avoided whenever possible

Let me repeat that user input through forms is already checked by forms.DecimalField
so you won't see any rounding issues for data coming through forms.

The problem arises when instances are created by serializers or whenever
in your code you assign a value to a decimal attribute of a model instance.

Default serializers won't call .save() or send signals or do any model validation
but data is assumed to be already validated, so I don't see an issue here.

Marco

Reply all
Reply to author
Forward
0 new messages