Hi! I'm new to JOOQ!
I'm looking for a solution to replace JPA/Hibernate as the persistence engine in my new Java web application. I'm looking for something closer to SQL and JOOQ seems to be a really good fit!
But I have concerns about how JOOQ can replace Hibernate in some use cases, in particular for loading and saving entities that contain children entities.
Lukas has already helped me on the question but I would like more details. Any help from the JOOQ community would be really appreciated!
Since I already asked my question on StackOverflow, I think it's best to link to it rather than copy it here (and there's a 100-point bounty on it!):
http://stackoverflow.com/questions/17860161/replacing-a-full-orm-jpa-hibernate-by-a-lighter-solution-recommended-pattern
Thanks a lot in advance for any input!
electrotype
--
You received this message because you are subscribed to the Google Groups "jOOQ User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jooq-user+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
Thanks for highlighting my question, Lukas.
Your help, and the fact that you are very open-minded about discussing the pros and cons of different solutions, really makes JOOQ attractive to me!
findPersonById_simple(long id)
findPersonById_withAdressCity(long id)
findPersonById_withJob(long id)
findPersonById_withAdressCityAndJob(long id)
findPersonById(long id, boolean isLight)
so we either get an object containing only what we consider fundamental data, or all sorts of sub-relations. If we have 1000 people for which we want all this data, we only issue 3 queries:
findPersonByIds(long[] ids)
findPersonAdressCity(Person[] persons)
findPersonJobs(Person[] persons)
In fact, some of the above calls like findPersonByIds may not even go to the database, because we have all sorts of caching in place for frequently requested data. And in some specific cases, we even have dedicated upfront calls like "loadPersonsAndJobsCaches" so that individual calls to our persistence layer do not go to the database.
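The "one query per branch" idea above can be sketched in plain Java. This is only a hypothetical illustration (JobRow, jobsByPerson and the column names are made up, not from any API): after fetching all child rows for a batch of person ids in one query, the rows are grouped by parent id in memory, so each Person gets its list without further round trips.

```java
import java.util.*;
import java.util.stream.*;

public class BatchFetch {

    // Simulated row of: SELECT person_id, job FROM jobs WHERE person_id IN (...)
    record JobRow(long personId, String job) {}

    // Groups the child rows by parent id so each person's jobs can be
    // attached in O(1), turning N+1 queries into 1 query + 1 in-memory pass.
    static Map<Long, List<String>> jobsByPerson(List<JobRow> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                JobRow::personId,
                Collectors.mapping(JobRow::job, Collectors.toList())));
    }
}
```

The same grouping step would back a method like findPersonJobs(Person[] persons), whatever the actual query layer is.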
The "light" aspect started to feel odd as the product grew, because the light/full granularity was not enough. We now generally favor smaller objects, where related data is retrieved as a separate entity. These separate entities are not fields of the main entity.
Chrriis, if I understand correctly, you would use entities like:
------------
public class Person {
long id;
String name;
String someOtherSimpleProperty;
}
public class Address {
long id;
String streetName;
String someOtherSimpleProperty;
}
public class PersonAddress {
Person person;
Address address;
}
So there is no "address" property directly in the Person entity... You actually have to retrieve the associated PersonAddress in a separate request. And all "associations of entities" are managed as standalone objects?
If a method needs to do some processing on a Person and needs to access some of its associated entities, it would have a signature like:
------------
public void doSomeProcessingOnAPerson(Person person, PersonAddress personAddress, PersonJob personJob, Set<Person> personFriends, ...) {
//...
}
------------
instead of simply:
------------
public void doSomeProcessingOnAPerson(Person person) {
//...
}
There are various things that cause difficulties in Hibernate, and that I hope will be easier with Jooq (still not using Jooq yet, unfortunately):
1) Session demarcation. Any exception invalidates the Session, which means you lose the running transaction and the only recourse is to recreate a Session (easy), reload all data (now where's that promise you can code with entities as if they were standard Pojos...), redo any modifications (bad if that was a lot of work) and commit again (that's easy, again). I hope that Jooq does not tie transactional logic to any fragile resources like that.
2) It's possible to control which dependent entities are autoloaded in Hibernate when loading a main entity, but the API stinks. Several layers of fallbacks, twiddling needed in at least two places which use gratuitously different terminology, and in some situations the setting is advisory only, i.e. Hibernate's behaviour may vary after an upgrade. I think you always control that explicitly in Jooq.
3) Hibernate has three entirely different APIs (Criteria, HQL, SQL), with different abilities and different semantics (SQL doesn't affect the session cache, for example; Criteria and HQL have a largely overlapping feature set and semantics, but there are some things you can do in one that you cannot do in the other). If you find that one API does not suit you, you need to rewrite, and the APIs are so drastically different that you can throw away the Criteria code you have if you need HQL. Jooq doesn't have this kind of largely overlapping (hence redundant), semantically different (hence requiring full rewrites) nonsense. At least I hope so :-)
4) Hibernate is a bit thin on documenting details. Docs are mostly there, but sometimes distributed across official docs, tips & tricks sections, and books; plus, sometimes the docs are incomplete and you have to experiment (and hope that behaviour doesn't change during the next update). Can't say much about Jooq's documentation since I haven't used it in anger yet, but what I did see didn't look like it left many stones unturned.
5) Hibernate is full of subtle semantic distinctions. They're so subtle (but important) that, sometimes, the Hibernate team decides to correct a mistake. Since the issues are subtle, you don't know your usage of Hibernate was affected until the upgrade hits you right in the face. Jooq has simpler semantics. That's partly because it doesn't try anything fancy with a session cache. (There have been moves towards subtler semantics, though. I'm a bit worried there.)
A bit long, sorry. This is just my experience (partially) replacing OpenJPA with JOOQ in a few projects that have around 250 tables. Not sure it is useful for anybody, but the system is being used right now in the wild, so it might count for something for some people.
I used Hibernate for a few years before switching over to OpenJPA (solely for its compile-time enhancement ability at that time). And for the last six months, my team has used JOOQ alongside OpenJPA. Basically, we use OpenJPA when we need to do any insert/update/delete, when it is easier to do, and for small queries where the query time is not important. The old code with OpenJPA stays the same until we feel the need to:
1. Speed up the query time.
2. Modify any query that is in JPQL.
Our systems have around 250 tables, so replacing all OpenJPA code with JOOQ all at once is out of the question. And we do like the way JPA handles cascade deletes, insertions and refreshes. The easy way to get a transaction is good as well.
I never benchmarked the speed-up, but our JOOQ queries seem to be at least one order of magnitude faster than the OpenJPA JPQL queries they replaced. That is the reason I started using JOOQ when I first became aware of it (we needed to get up to 1,000 rows from queries with many joined tables in under 2 seconds or so, and I was spending way too much time fiddling with FetchType.EAGER and FetchType.LAZY and not on the business logic). Our schemas are broadly in 3NF, so joining many tables together is very common for us, sometimes with many left outer joins as well.
But speed is not everything for us, or we would have used JDBC directly. The type-safety of JOOQ is very useful for us (except that we sometimes have 30 to 40 columns in a select. We can't always do a select() without specifying the columns, because we would run out of memory; sometimes we do need to join 20 tables at the same time, for instance. As usual, we found out that we can't always use limits and a first result row to cap how many rows a query might return. Any solution other than specifying exactly which columns we want? We do re-use many queries, so sometimes we get runtime errors when we miss specifying a column in the select.)
Ever since we started using JOOQ, we refactor the database schema even more often as the system evolves, which is a huge advantage for us. JPQL was really that big an inhibitor for us to change anything in the schema at all. Adding tables, etc., is fine, but changing names and such is not fun with JPQL queries. Can't emphasize enough how useful syntax errors are whenever we refactor a database schema and rebuild the JOOQ jar we have. We are a 3-person team with one tester, so we prefer front-loading everything as much as possible.
A syntax error is always much easier to handle than a support call from the field because of a spelling mistake in a JPQL statement and a missed regression test case (as said before, it is still possible to get runtime issues in JOOQ if we miss specifying a column in a select, of course). As it happened, I just had that dreadful feeling this morning, trying to find all occurrences of a column in all the JPQL queries we have, just because we changed the name of the column to something more appropriate.
The main thing that I wish I could do more easily in JOOQ is probably something that is impossible in a framework like JOOQ: the ability to write expressions involving the fields in a shorter way, and still keep them type-safe. But it is still only a one-time thing to make sure the expression is correct. It can be difficult to figure out what it means later on, though:
CAL_IN_WEEKLY.DAY_OF_WEEK.equal(weekday)
    .and(CAL_IN_WEEKLY.START_WEEK_NUMBER.lessOrEqual(startingWeekNumber)
    .and((val(startingWeekNumber).sub(CAL_IN_WEEKLY.START_WEEK_NUMBER))
        .mod(CAL_IN_WEEKLY.FOR_EVERY).equal(val(0))))
Counting the parentheses does remind me of my Lisp coding days. Of course, if this is the wrong way to do it and there is an easier way, I am all ears. By the way, I still prefer this to JPQL.
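For readability, the condition above can be restated in plain Java. This is only a sketch of the logic that the jOOQ expression computes, with hypothetical method and parameter names; it is not jOOQ code:

```java
// Plain-Java restatement of the weekly-recurrence condition: the event falls
// on the requested weekday, the schedule has already started, and the number
// of weeks since the start is a multiple of the interval (FOR_EVERY).
public class WeeklyRecurrence {

    static boolean occursOn(int dayOfWeek, int startWeekNumber, int forEvery,
                            int weekday, int startingWeekNumber) {
        return dayOfWeek == weekday
                && startWeekNumber <= startingWeekNumber
                && (startingWeekNumber - startWeekNumber) % forEvery == 0;
    }
}
```

Keeping such a comment (or plain method) next to the jOOQ expression is one way to document what the parenthesis soup means.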
Hi electrotype,

> Chrriis, if I understand well, you would use entities like :
------------
public class Person {
long id;
String name;
String someOtherSimpleProperty;
}
public class Address {
long id;
String streetName;
String someOtherSimpleProperty;
}

If an address is bound to a person and retrieved by passing Person objects, then we could have the Person as a field of Address. A leaf can know its parents if it is always retrieved through the parent.
This means that:

public class PersonAddress {
Person person;
Address address;
}

... may not be useful.
>> (still not using Jooq yet, unfortunately)
>
> When will you finally use it! :-)

Heh. When I'm at a new job, I fear... no resources for that shift :-(
But my current employer does have a job opening announced, maybe something will happen.
> Hmm, I'm surprised that you have to reload all data
> when you get an exception.

Well, you'll need to do that anyway. You're starting a new transaction, and somebody may have modified your records behind your back.
The real issue is that you can't reinsert the existing Pojos into the new Session; the new Session insists on its own, new set of Pojos.
Not a problem if you do short transactions^Wconversations exclusively, i.e. never keep any Pojo around for longer than immediately necessary, and keep only primary keys. It's the standard model of operation for web services, so that isn't a problem for the standard application domain of Hibernate.
For long-running applications, you want to keep the Pojos around. You keep them in caches. But with Hibernate, you can't, because the Session may die and the Pojo becomes unusable. You could keep track of what caches those Pojos are in and replace them with new ones as a new Session comes up, but (a) this is wasteful because you reload even if the Pojo happens to be never written back but you don't know this, (b) all hopes of modularity go out of the window.
The best you can do is to always work with detached objects and, at commit time, always reload and merge() them. The trouble is that you lose the single-UPDATE optimization; you now have two round trips to the database. For every friggin' object.
And the Hibernate book devotes an entire chapter to long conversations...
> It's a common pattern to try to insert something, check for
> a constraint violation, and then insert something else or
> update something. [...] So you're saying, with Hibernate you
> cannot make use of exceptions from constraint violations, or
> raised from stored procedures?

The official word is that the Session should not be relied on. If you look carefully, you can't even roll it back or close it; the Hibernate docs say it's invalid. There is no distinction between "harmless" and "destroying" exceptions.
In practice, people ignore that and simply continue with a Session that had an exception. The trouble is that this will probably work, most of the time, except when it doesn't.
I guess both the hate and the misuse come from misrepresentation. The "Hibernate in Action" book even goes to great lengths to explain how to do "long-running conversations that span multiple transactions".
That's yet another issue. You can't use stored procedures if you want to keep migration costs under control. Or, put another way: If you use stored procedures, you should keep the business logic in the database, and then any shitty ORM will do and you don't need Hibernate that much anymore. (...) Hibernate also doesn't even begin to handle bulk queries (...)
Sounds intriguing.
Are there any whitepapers around?
The revised edition does.
Christian Bauer, Gavin King 2007: Java Persistence with Hibernate.
pp. 401 ff. "Working with detached objects"
No mention of the StaleObjectStateException there.
But I stand corrected: You can indeed reattach an existing Pojo to a Session. Via Session#lock. Talk about intuitive naming...
pp. 423 ff. "Working with detached entity instance"
Oh somebody tell me what the difference between an "object" and an "entity instance" is... oh right, it's the same chapter, but about JPA this time. Except no mention of lock() this time.
But that's just tangential. The conversation beef is chapter 11:
pp. 476 ff. "Implementing conversations"
Talks about all variants:
pp. 478 ff.: Conversation = Transaction = Session
pp. 486 ff.: One conversation holding multiple Transactions = Sessions, detached instances survive between Transactions/Sessions
pp. 489 ff.: Conversation = Session holding multiple Transactions, objects stay attached (but that doesn't work as soon as the Session dies due to any exception)
pp. 497 ff.: Next chapter (JPA)
> I can't even imagine myself manipulating
> detached, explosive objects.
Well, I *thought* I could use the model described on pp. 489 ff. Except Sessions are too fragile - and no mention of that in the entire book. (Read that thing multiple times.)
I guess you need that.
Record classes to represent rows, and a relationship representation to know what foreign keys to follow.
Hi Lukas,

I used LLBLGen before LINQ came out. At that time, there were many commercial ORMs available, and NHibernate was there as a free alternative. Unlike in the Java world, non-free frameworks are quite popular, and developers don't mind paying for useful stuff.
What I liked most about LLBLGen was its strongly typed querying API, which was a direct benefit of code generation (just like Jooq). Being more SQL-focused, Jooq would have been a slightly better tool for writing complex queries and fetching tabular data. LLBLGen chose a higher level of abstraction, but without abusing the POJO principle: it chose a mix between object and SQL, and went further in handling relationships and graphs.
Here is an example taken from their current documentation about how you would load a collection of Customers, along with their Address and a collection of Orders:

// QuerySpec alternative
var customers = new EntityCollection<CustomerEntity>();
using(var adapter = new DataAccessAdapter())
{
    var qf = new QueryFactory();
    var q = qf.Customer
        .Where(CustomerFields.Country=="Germany")
        .WithPath(CustomerEntity.PrefetchPath.Orders
            .WithSubPath(OrderEntity.PrefetchPathOrderDetails),
            CustomerEntity.PrefetchPathVisitingAddress);
    adapter.FetchQuery(q, customers);
}

Notice the .WithPath parameter... The framework would then load the Customers, and then fill the Orders relationship using a second query, with either a SQL IN or an EXISTS subquery depending on how many customers were returned.
Here is an example that goes even further:

var employees = new EntityCollection<EmployeeEntity>();
using(DataAccessAdapter adapter = new DataAccessAdapter())
{
    var qf = new QueryFactory();
    var q = qf.Employee
        .WithPath(EmployeeEntity.PrefetchPathOrders
            .WithOrdering(OrderFields.OrderDate.Descending())
            .WithLimit(1));
    adapter.FetchQuery(q, employees);
}

Of course you can do that using Jooq too, but you'll write the queries yourself and implement a nice, non-reusable algorithm for populating your graph. But let's face it, everyone hates writing that kind of code. I do that many times a day, and that's the reason why I am still using Hibernate.
Below is an interesting feature for Jooq:

using(var adapter = new DataAccessAdapter())
{
    var qf = new QueryFactory();
    var q = qf.Order
        .From(QueryTarget.InnerJoin(OrderEntity.Relations.CustomerEntityUsingCustomerId))
        .Where(CustomerFields.CustomerId==myCustomer.CustomerId);
    adapter.FetchQuery(q, myCustomer.Orders);
}

Notice OrderEntity.Relations.CustomerEntityUsingCustomerId, which is a generated object... Wouldn't it be nice to have FK-based relations in the Jooq Table classes and be able to pass them to the join(..) method in the query DSL, rather than relying on join(X).onKey()?
To conclude, I'd say that LLBLGen is a good showcase of what a data access solution can do without relying on obscure concepts.
I don't think we can ask Jooq to do the same things, they are both different tools, but some help when dealing with graphs would greatly complement it and as shown above, it may be possible to do without implementing a crazy caching mechanism.
Hi Lukas,

I had a quick look, but from what I understand, mapping a base entity having, for example, 2 one-to-many relationships would still be a nightmare. Using a join would get you tons of duplicates (basically a cartesian product between both relations), and using subqueries still has you writing in-memory PK-FK based joins.
However, I find ModelMapper interesting for achieving the exact opposite: turning a graph into a DTO.
Huh, I haven't given this proper thought. Yes, you're probably right. The show-case is probably good for linear "graphs" only, not for graphs with cartesian products in them. I guess that's where LLBLGen shines, as the proprietary query extensions holds all the relevant meta-data to know how to "untangle" the cartesian product between the >1 joined relationships.
> Huh, I haven't given this proper thought. Yes, you're probably right.
> The show-case is probably good for linear "graphs" only, not for graphs
> with cartesian products in them. I guess that's where LLBLGen shines, as
> the proprietary query extensions hold all the relevant meta-data to know
> how to "untangle" the cartesian product between the >1 joined relationships.

Exactly... Efficiently loading graphs often requires multiple queries (e.g. 1 for the base entity, and 1 for each of the branches, except one-to-one relations that can be obtained through joins). The base idea is that eliminating duplicates from large result sets isn't going to be that much faster than issuing a second SQL query, not to mention network traffic and memory bloat. IN (...) predicates or EXISTS subqueries are usually well optimized in modern RDBMSs. I am not sure that LLBLGen actually used joins for 1:1 and 0:1 at the time I was using it; it followed the one branch -> one query principle, which was already good enough in most use cases, and so much better out of the box than any N+1.
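The row-count argument can be made concrete with a small sketch (the numbers and method names are illustrative, not from any framework):

```java
// Back-of-the-envelope row counts for one parent row with two independent
// one-to-many branches (e.g. a customer's orders and addresses).
public class RowCounts {

    // A single query joining both branches repeats each order for each
    // address: the result set is a cartesian product per parent.
    static int joinedRows(int orders, int addresses) {
        return Math.max(orders, 1) * Math.max(addresses, 1);
    }

    // One query per branch keeps the totals additive: the parent row,
    // plus one row per order, plus one row per address.
    static int perBranchRows(int orders, int addresses) {
        return 1 + orders + addresses;
    }
}
```

With 100 orders and 10 addresses, the single join transfers 1000 rows while the per-branch strategy transfers 111, which is why the duplicates dominate long before round-trip latency does.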
> The base idea is that eliminating duplicates
> from large resultsets isn't going to be that
> much faster than issuing a second SQL query,
> not talking about network traffic and memory
> bloat.

That's actually not always true. Depending on network latencies, the additional round trips can be devastating.
E.g. with a 10-millisecond roundtrip and 10 queries, you're already at 0.1 seconds, which can be beyond what's acceptable.
@Joaquim
Ah sorry, I had thought this was being proposed for each element of a join path.
Hadn't read closely enough, sorry for the spam.
I agree that cartesian product blowup would be worse than latency.
I am well aware that IN queries have their limitations too. Here is how LLBLGen used to solve this problem:

1) It would fill the main collection first, e.g.:

SELECT *
FROM orders
WHERE orders.customerID = @customerID
ORDER BY orders.checkout_date

2) Then it would decide which query to use for fetching the relation:

a) In case there are fewer than 50 orders (you could configure how many exactly), it would use an IN predicate like this one:

SELECT *
FROM orderdetails
WHERE orderdetails.orderID IN (1, 5, 8, 20, ...)

b) In case there are more orders than that, it would re-use the first query:

SELECT *
FROM orderdetails
WHERE orderdetails.orderID IN (
    SELECT orderId
    FROM orders
    WHERE orders.customerID = @customerId
)

Maybe it would be better to use EXISTS, but it's already a nice alternative to the infamous N+1.
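The threshold strategy described above can be sketched in plain Java. All class, method, table and column names here are hypothetical, and a real implementation would of course use bind values rather than string concatenation:

```java
import java.util.*;
import java.util.stream.*;

public class ChildFetchStrategy {

    // Above this many parent ids, fall back to re-using the parent query
    // as a subquery instead of an explicit IN list.
    static final int IN_LIST_THRESHOLD = 50;

    static String childQuery(List<Long> orderIds, String parentIdQuery) {
        if (orderIds.size() <= IN_LIST_THRESHOLD) {
            // Small batch: enumerate the ids in an IN list.
            String ids = orderIds.stream()
                    .map(String::valueOf)
                    .collect(Collectors.joining(", "));
            return "SELECT * FROM orderdetails WHERE orderID IN (" + ids + ")";
        }
        // Large batch: let the database re-run the parent query as a subquery.
        return "SELECT * FROM orderdetails WHERE orderID IN (" + parentIdQuery + ")";
    }
}
```

Either branch replaces N+1 child queries with a single second query.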
Best
On Friday, August 9, 2013, 15:23:25 UTC+2, Lukas Eder wrote:

2013/8/9 Durchholz, Joachim <Joachim....@hennig-fahrzeugteile.de>

>> Stéphane was referring to fetching 2 or more
>> subpaths, which usually denormalises results
>> to produce cartesian products.
>
> Ah sorry, I had thought this was being proposed for each element of a join path.
> Hadn't read closely enough, sorry for the spam.
> I agree that cartesian product blowup would be worse than latency.
> I believe that the point about limits to the list of values still stands.

You mean, limits to the number of bind values? Yes, there are some limits:

- Oracle IN predicate: 1000 elements
- Ingres: 1024 total bind values
- SQLite: 999 total bind values
- Sybase ASE: 2000 total bind values
- SQL Server 2008 R2: 2100 total bind values

jOOQ can transform SQL if any of these limits is hit, though, either by splitting the IN predicate, or by inlining bind values.
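One client-side workaround for such limits can be sketched in a few lines of plain Java (the helper name is hypothetical): split the value list into chunks no larger than the dialect's maximum, then issue one IN predicate per chunk (OR-ed together, or as separate queries).

```java
import java.util.*;

public class InListChunker {

    // Splits the value list into sublists of at most maxSize elements, so
    // each sublist can back its own IN predicate (e.g. maxSize = 1000 for
    // Oracle's IN-list limit).
    static <T> List<List<T>> chunks(List<T> values, int maxSize) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < values.size(); i += maxSize) {
            out.add(values.subList(i, Math.min(i + maxSize, values.size())));
        }
        return out;
    }
}
```

This is essentially the "splitting the IN predicate" transformation, done by hand.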