CustomDataTypesHandler

197 views
Skip to first unread message

Mike Goodwin

unread,
Aug 6, 2020, 11:25:46 AM8/6/20
to h2-da...@googlegroups.com, kat...@gmail.com
Hi Evgenij,

I noticed that this api was removed (unceremoniously).

I was wondering what were the reasons, and is there an alternative?

From my point of view it would have been good to have an explanation as to why it was removed (e.g. in the commit) given that it was a public API.

It's still mentioned on the website.

Any thoughts?

Thanks,

Mike

Evgenij Ryazanov

unread,
Aug 6, 2020, 11:44:20 PM8/6/20
to H2 Database
Hello.

It's a long story. Apache Ignite used released versions of H2 internally. Because it needed more than H2 can provide, some new functionality were contributed to H2, usually without taking own needs of H2 into account. Some of this functionality wasn't used by H2 itself and other projects at all, it only created some unwanted complexity, some hypothetically was usable for other projects, some was actually used in them, such as TableEngine. All this stuff was basically unmaintained, when some problem in H2 required massive changes including these third-party features nobody wanted or be able to help with them. At some point a fork of H2 was made for Apache Ignite due to their own reasons. After that we decided to remove support for Apache Ignite from mainline H2 due to all its complexity and because it actually blocks other long-standing issues. Some features were preserved, however.

CustomDataTypeHandler never was a part of stable API. It was placed into org.h2.api package, but it has warnings in its documentation since the beginning. Existence of this poorly designed class blocked bugfixes for issues with very basic SQL features of H2. There were only two options: drop it completely or redesign it from the scratch without any backward compatibility, but nobody wanted to write a new implementation of it in 2019 and nobody cared about it till then, it's not a functionality that everyone uses and H2 doesn't have commercial customers any more and doesn't provide any guarantees to anyone. Actually there is no place for it in the new type system, it can be used only as a storage container. But we already have storage containers for arbitrary data. We have standard VARBINARY, BINARY, and BLOB data types for them. You can define an own domain to give your custom data type some better name clearly describing its content. H2 also has JAVA_OBJECT data type for serializable objects. There are no good reasons to duplicate functionality of domains here.

The SQL Standard also has user-defined data types, but H2 doesn't support them yet and they can't be easily introduced, it's a large piece of work and we need to implement SQL/PSM first. Maybe both these features will be introduced as real supported and standard-compliant features in the future, but, again, they need a lot of work.

H2 1.4.200 was the last release in 1.4 series of releases. The upcoming H2 2.0 has many incompatibilities with 1.4.200 and not every application can use the new version or snapshot builds from current sources without additional changes. We have some plans to write a migration guide for the upcoming release, but currently it isn't available anywhere.

Mike Goodwin

unread,
Aug 7, 2020, 10:24:16 AM8/7/20
to h2-da...@googlegroups.com
Thanks for the complete reply and all of the hardwork in general. I do think from the outside it has been a little unclear about the direction and motivation of the project, but obviously that clears things up.

The type I wanted to implement was going to be a rational numeric (i.e. a/b).
Do you think that patching H2 is the best way to go? Is it a datatype h2 would want to support internally?

--
You received this message because you are subscribed to the Google Groups "H2 Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email to h2-database...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/h2-database/a35157a7-fca4-4522-bfb0-c48d72a37785o%40googlegroups.com.

Evgenij Ryazanov

unread,
Aug 7, 2020, 10:46:01 AM8/7/20
to H2 Database
There is no such data type in the SQL Standard and such data type is quite exotic even for general purpose programming languages. I don't see any good reasons to have it in H2.

These numbers are usually stored in two columns or as a character string value. You can also use JAVA_OBJECT data type with custom org.h2.api.JavaObjectSerializer for more efficient storage. You may also need some own user-defined functions to deal with these values if you want to compare them or perform some computational operations directly in SQL.

Matt Pavlovich

unread,
Aug 7, 2020, 11:01:51 AM8/7/20
to H2 Database
Mike-

What do you see as the driver for having the field be a column type? Have you looked into H2 function support?

-Matt Pavlovich
To unsubscribe from this group and stop receiving emails from it, send an email to h2-da...@googlegroups.com.

Mike Goodwin

unread,
Aug 7, 2020, 11:21:58 AM8/7/20
to h2-da...@googlegroups.com
Ok. Yes it was so things such as computation, comparison and other conversions could work seemlessly. Obviously just storing the values is not such a huge challenge.

Probably the way you outline with the  JavaObjectSerializer is the way to go. In fact in my project I am more or less attempting to eliminate SQL, and talk to the database directly so it may be possible to live without sql support, but it will reduce quality of life given that it means all the SQL based tools will no longer work so well.
 

Matt Pavlovich

unread,
Aug 7, 2020, 11:33:14 AM8/7/20
to H2 Database
Mike-

Those functions require data of certain type, so your relational column would have to be converted to numeric for most of those anyway.

Creating a set of functions would get you there. You could then use comparison, numeric functions, etc and bonus-- 3rd party SQL tools.

AB_INSERT()
AB_SELECT_AS_DECIMAL()
AB_SELECT_AS_LONG()
AB_SELECT_AS_DOUBLE()
etc.

Mike Goodwin

unread,
Aug 7, 2020, 11:38:45 AM8/7/20
to h2-da...@googlegroups.com

What do you see as the driver for having the field be a column type? Have you looked into H2 function support?

Hi Matt,

Mostly just selfish reasons, but I do think there is a sound engineering/computer science basis for having another numeric type. Decimal types are arbitrary for computation, and we allow unlimited precision for decimal numbers (I guess in practice they round before too long). It is to avoid unnecessary rounding and wrinkles.

I do think we tend to paper over these kind of issues in our industry and as a consequence end up with mediocre standards such as SQL. The strength of 'SQL' databases is the relational model not SQL. In anycase virtually all databases don't conform and have their own superset of functionality.

- mike

Mike Goodwin

unread,
Aug 7, 2020, 11:43:54 AM8/7/20
to h2-da...@googlegroups.com
Thanks Matt,

Yeah, I know this is possible, but obviously it's pretty cumbersome, and if you do something as simple as SELECT * FROM T, you will not get to see the contents of the rational fields.

Matt Pavlovich

unread,
Aug 7, 2020, 11:53:08 AM8/7/20
to H2 Database
** Combining replies **

Mike-

The reality is computers are limited in how to express these types of numbers. I do agree that keeping the a / b as separate fields is a sound approach that allows you to maintain precision, and allows you to decide when to take the rounding hit.

As Evgenij mentioned, you can store the fields as 2 separate columns, and have the functions operate on those columns.

For example:

create table T {
a BIGINT // or DECIMAL, DOUBLE, etc
b BIGINT // or DECIMAL, DOUBLE, etc
}

select * from T // displays the a, b column values

Create a few helper functions:
AB_INSERT (8765309, 90210) // inserts values into the a / b columns and abstracts how it is stored on the backend -- for example, if you need to change the backend at some point
AB_AS_DECIMAL (a, b) // perform a / b calculation
AB_AS_BIGINT (a, b) // perform a / b calculation
AB_AS_NUMERIC (a, b) // perform a / b calculation

-Matt

Rami Ojares

unread,
Aug 7, 2020, 12:18:09 PM8/7/20
to h2-da...@googlegroups.com

Hi Mike,

When dealing with a fractional representation of a number having numerator and denominator,
the system would have to find the greatest common factor (of numerator and denominator) to represent a number uniquely (when comparing).
For addition you would have to make the denominators equal.
So having this kind of type would require a bit more computing than a decimal number where the denominator is fixed to be 10.

The only upside would be the ability to represent 1/3 (and similar numbers that can only be represented as repeating decimals)  exactly but if you now have 64 bit integer to represent a decimal then you would need a lot of precision before noticing any practical difference. Furthermore you could have even more precision using larger integers than long (eg. java's BigInteger).

Of course in mathematics you might have uses for fractional representation.
In which problem area do you have a need for absolute precision?

And finally we would still be unable to represent the irrational numbers exactly.
What kind of representation would work for all real numbers?
Just curious.

- Rami

--
You received this message because you are subscribed to the Google Groups "H2 Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email to h2-database...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/h2-database/CAOZwqwnhFnB8-Zny1CcK4G%2By3wduZWkMR%3DOC3ybvLSrag8kS5g%40mail.gmail.com.

Mike Goodwin

unread,
Aug 7, 2020, 1:48:09 PM8/7/20
to h2-da...@googlegroups.com
On Fri, Aug 7, 2020 at 5:18 PM Rami Ojares <rami....@gmail.com> wrote:

Hi Mike,

When dealing with a fractional representation of a number having numerator and denominator,
the system would have to find the greatest common factor (of numerator and denominator) to represent a number uniquely (when comparing).
For addition you would have to make the denominators equal.


Yeah, I understand those tradeoffs ... but I think mostly we are not cpu limited in any case. Strictly speaking you wouldn't need to find the gcf. You could just cross multiply the two and compare them that way ... probably a lot of scope for optimisation.

So having this kind of type would require a bit more computing than a decimal number where the denominator is fixed to be 10.

The only upside would be the ability to represent 1/3 (and similar numbers that can only be represented as repeating decimals)  exactly but if you now have 64 bit integer to represent a decimal then you would need a lot of precision before noticing any practical difference. Furthermore you could have even more precision using larger integers than long (eg. java's BigInteger).

Of course in mathematics you might have uses for fractional representation.
In which problem area do you have a need for absolute precision?


I have rationals in the application level type system. They are for doing calculations without creating rounding anomalys, and they work well. Some are stored (some are calculated) and current state is a hack because they are stored as decimals. In practice it works ok, but I wanted to replay changes (transactions) and I get something different because of this hack essentially. The thing is a replay system slightly different is almost as bad as completely broken because it is hard to verify anything, fix other issues... etc.
 

And finally we would still be unable to represent the irrational numbers exactly.
What kind of representation would work for all real numbers?
Just curious.


Well indeed, otherwise we'd certainly all be using rationals. Fortunately I'm not doing much sin/squareroot when dealing with currencies and quantities ...
The biggest challenge is perhaps handling the rounding, in anycase, in practice it's not really an issue if the value is merely the result of dividing a couple of user inputed (rational) fields.

 

- Rami

On 7.8.2020 18.38, Mike Goodwin wrote:

What do you see as the driver for having the field be a column type? Have you looked into H2 function support?

Hi Matt,

Mostly just selfish reasons, but I do think there is a sound engineering/computer science basis for having another numeric type. Decimal types are arbitrary for computation, and we allow unlimited precision for decimal numbers (I guess in practice they round before too long). It is to avoid unnecessary rounding and wrinkles.

I do think we tend to paper over these kind of issues in our industry and as a consequence end up with mediocre standards such as SQL. The strength of 'SQL' databases is the relational model not SQL. In anycase virtually all databases don't conform and have their own superset of functionality.

- mike

--
You received this message because you are subscribed to the Google Groups "H2 Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email to h2-database...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/h2-database/CAOZwqwnhFnB8-Zny1CcK4G%2By3wduZWkMR%3DOC3ybvLSrag8kS5g%40mail.gmail.com.

--
You received this message because you are subscribed to the Google Groups "H2 Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email to h2-database...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages