Hi Thomas,
> If you are interested in the string representation, why don't you just
> use varchar?
I would if I could, but I can't so I won't.
> It's good that you care about your use case, but there
> are other uses cases to be considered.
Yes, let's come to that, and let's be pragmatic.
It's good that you care about the current logic at the low level
implementation, but there is actual usage to be considered.
> The rules about which type is 'higher' or 'lower' than other types is
> something very basic, and I wouldn't want to change it just because it
> might be a bit easier for a few cases (well, just for this case).
The current implementation is, in my view, based on a wrong
assumption: transitivity of type ordering. You consider that if A < B
and B < C, then A < C. So you assigned constants and all was well.
Suddenly, the Java Object comes into play.
Type conversions should be handled by pairs, and not as if there was a
global order. Only for the case of the Java Object, String vs Java
Object should use String. I will explain below my thinking.
But first, my simple not-intrusive change is this:
in "Value.getHigherOrder(int t1, int t2)":
// Start
if (!SysProperties.SERIALIZE_JAVA_OBJECT && (t1 == Value.STRING && t2
== Value.JAVA_OBJECT || t2 == Value.STRING && t1 ==
Value.JAVA_OBJECT)) {
return Value.STRING;
}
// End
int o1 = getOrder(t1);
int o2 = getOrder(t2);
return o1 > o2 ? t1 : t2;
> Another reason why the database tries to convert 'SomeText' to a java
> object is so the database can use an index if there is an index for
> the given column. So basically the database, to allow using an index,
> needs to use:
>
> where columnName = cast('SomeText' as <columnType>)
Again, low-level implementation focus... So according to you, users
are going to do:
SELECT * FROM SomeTable WHERE MyObject='<insert bytes here as a string>'
Yeah, right :)
> That would mean you can't use an index on that column for such
> queries. For you that might be acceptable, but I think it's not.
If a String is to be used for comparison, so be it. Or improve the
index on Object mechanism to also index the String comparison. But you
cannot say that "users can only use byte representation because that
way they can benefit of the index".
Let's move on to why I think this String approach is desirable. Let's
use a hypothetical example so we understand better each other.
Suppose we have this class (pseudo code):
City {
long id;
String name;
toString() { return name; }
compareTo(c) { compare names, then ids}
}
A column is using that City class. Users can join, and so on, which
allows to find the relation of cities from different tables. If a user
selects it, 2 possibilities:
- the tool displaying the result has a high level understanding of
that object and has a custom display (even contextual navigation,
etc.).
- the tool displays the string representation. Note that they know it
is more than the string displayed because of the type of the column
and eventually other factors (documentation, conventions on naming,
etc).
Now, if a VARCHAR were used, then "Paris" could appear several times,
and joins would not work.
When the user sees "Paris", and wants to know more about it, then he
could select cities = 'Paris', and join on it.
The assumption you made that the toString method returns a unique
String that can be mapped to the object is wrong. A String
representation of an object is not necessarily unique. If it were,
then VARCHAR would be used. And having a unique String representation
like "Paris [7834671]" would make it useless in the tools.
You said that Strings should try to convert to the Java Object, but if
the implementor created the Java Object, they probably also exposed a
function to create a City instance: TO_CITY(long, String).
Index would work in that case, but this is for advanced usage because
in most cases the String representation is enough for a user to drill
down or understand the object instances they are looking at. They also
don't have to figure out all the other fields to know to create the
city object.
I hope you will understand that having a string representation makes
it user friendly to view results of a SELECT, and that users expect to
be able to filter using that information.
Hope this helps,
-Christopher