Quarkus 1.13.0.Final -Dquarkus.profile does not actually apply profile values

David Hoffer

Apr 6, 2021, 7:02:05 PM
to Quarkus Development mailing list
I need to be able to have N runtime profiles (currently for datasource/hibernate-orm configs in application.yml).

We thought we had this working, but at runtime the selected profile is not actually taking effect.

Here is our approach.
1. We build with a default profile, via <quarkus.profile>dataWarehouseA</quarkus.profile> in the POM.

2. We run with -Dquarkus.profile=dataWarehouseA or -Dquarkus.profile=dataWarehouseB.

dataWarehouseA works fine; however, dataWarehouseB does not apply the application.yml values for dataWarehouseB and instead runs with the values from dataWarehouseA.
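
For illustration, the profile-specific part of our application.yml is shaped roughly like this (the PU name matches the injection point below; the datasource name, URLs and schema names are placeholders rather than our real values, and the exact property keys are per the Hibernate ORM guide):

"%dataWarehouseA":
  quarkus:
    datasource:
      datawarehouse:
        jdbc:
          url: jdbc:postgresql://warehouse-a:5432/dw
    hibernate-orm:
      datawarehousePU:
        database:
          default-schema: warehouse_a

"%dataWarehouseB":
  quarkus:
    datasource:
      datawarehouse:
        jdbc:
          url: jdbc:postgresql://warehouse-b:5432/dw
    hibernate-orm:
      datawarehousePU:
        database:
          default-schema: warehouse_b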

I can see this when I inspect the value here:

@Inject
@PersistenceUnit("datawarehousePU")
EntityManagerFactory dwEmf;

If I inspect the value of dwEmf at runtime, down in jdbcEnvironment, the current schema is the schema of dataWarehouseA, not dataWarehouseB.

Please advise what is wrong here.  This must be a bug in Quarkus, right?  Note that we could not try this prior to 1.13.0, as that is the version that allows us to set the default profile in the POM.

Thanks,
-Dave

Roberto Cortez

Apr 6, 2021, 7:49:29 PM
to dhof...@gmail.com, Quarkus Development mailing list
Hi David,

The datasource / persistence unit configuration is fixed at build time (except for things like username or password). This means that even if you change the active profile at runtime, the datasource name, driver and Hibernate persistence unit are fixed to the configuration that was used at build time.
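
To give a rough picture with a named datasource (property names are per the datasource and Hibernate ORM guides; the names and values here are only placeholders):

# Fixed at build time (marked with a lock icon in the configuration reference):
quarkus.datasource."datawarehouse".db-kind=postgresql
quarkus.hibernate-orm."datawarehousePU".datasource=datawarehouse

# Overridable at runtime (via profile, -D system property or environment variable):
quarkus.datasource."datawarehouse".jdbc.url=jdbc:postgresql://localhost:5432/dw
quarkus.datasource."datawarehouse".username=app
quarkus.datasource."datawarehouse".password=secret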

One solution is to build a second binary with the required profile.

Another alternative is to re-augment your application with the desired configuration. Please check the following Zulip conversation about it: https://quarkusio.zulipchat.com/#narrow/stream/187030-users/topic/quarkus.20change.20db.20driver.20runtime/near/232271336
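
Very roughly, re-augmentation means packaging the application in a mutable format and re-running the build steps at startup against the current configuration; a sketch (option and variable names as in the Quarkus packaging docs, paths may differ for your build):

# build a re-augmentable package
mvn package -Dquarkus.package.type=mutable-jar

# at startup, redo the augmentation using the now-active profile
QUARKUS_LAUNCH_REBUILD=true java -Dquarkus.profile=dataWarehouseB -jar target/quarkus-app/quarkus-run.jar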

Hope it helps!

Cheers,
Roberto


David Hoffer

Apr 6, 2021, 8:19:19 PM
to Roberto Cortez, Quarkus Development mailing list
Wow, that is not good news; I didn't expect that answer.  We don't want to have to maintain N binaries.  I did not see any documentation that reflected this.  Do you have plans to be more flexible regarding this configuration?

I will investigate the 'reaugment' approach, but if that requires a JDK at runtime it isn't going to be an option either.

Hmm, we are going to have to think through what this means for our various deployments, and now possibly our builds.

Thanks,
-Dave

Ivan St. Ivanov

Apr 7, 2021, 2:10:57 AM
to dhof...@gmail.com, Roberto Cortez, Quarkus Development mailing list
Hi Dave,

I was just as surprised as you when I found this out. But I'm sure there are merits to this approach.

Anyway, to answer your question about documentation: if you go to the extension documentation (https://quarkus.io/guides/hibernate-orm#quarkus-hibernate-orm_configuration), you'll see that some entries have a little lock icon in front. That lock marks a property that is fixed at build time (and can't be overridden at runtime).

I believe this applies to all the extension documentation.

Cheers,
Ivan

David Hoffer

Apr 7, 2021, 3:24:52 AM
to Ivan St. Ivanov, Roberto Cortez, Quarkus Development mailing list
Hi Ivan,

Good to hear I'm not the only one surprised by this.  I wonder what those merits are.

I had heard that some Quarkus properties were read-only, but I could never find any documentation on which ones those were. Recently I did see that lock symbol on some Quarkus build-time configuration properties, and that seemed okay to me. I have never seen that lock symbol on the JDBC/Hibernate config options before. When was that added?

I see now that even the JDBC/Hibernate fine-tuning config parameters, such as fetch and batch size, max depth, bind parameters... all of those config options are now build time only.  That is going to go over like a lead balloon at our company.  I don't know how we, or anyone, can know all of this with 100% certainty at build time.

How does anyone who has to both produce and support a real production application deal with this in the real world?  I need to know so I can present options/reasons/merits for this.

Thanks,
-Dave


Sanne Grinovero

Apr 7, 2021, 9:38:04 AM
to David Hoffer, Ivan St. Ivanov, Roberto Cortez, Quarkus Development mailing list
Hi David,

On Wed, 7 Apr 2021 at 08:24, David Hoffer <dhof...@gmail.com> wrote:
Hi Ivan,

Good to hear I'm not the only one surprised by this.  I wonder what those merits are.

Apologies, I've presented some talks about these optimisations, but I don't seem to have written down all the reasons. I will add documenting this properly to my TODO list; there are good reasons and many strong benefits.
 

I had heard that some Quarkus properties were read-only, but I could never find any documentation on which ones those were. Recently I did see that lock symbol on some Quarkus build-time configuration properties, and that seemed okay to me. I have never seen that lock symbol on the JDBC/Hibernate config options before. When was that added?

They have always been there. When we started the project, in its early POC state, you couldn't change anything at all: configuration was fully cast in stone during the build. We gradually moved some properties to runtime where it really is essential: database URL, credentials.

I do tend to push back on anything which isn't conceptually essential though, as many desires to change things at "runtime" tend to stem from habit, and from having been able to do so on other platforms, rather than from real need; but we've made some more reasonable exceptions.

Once upon a time, people would build a jar and throw it over the wall to ops and application server people, who would then edit massive amounts of XML to bind multiple dependencies together and configure app servers.

Today the mantra really should be to build immutable container images, via an automated pipeline, which start quickly and consume as little as possible, for the sake of easy scalability and keeping the cloud bill low. For the sake of immutability, but also to keep the image even lighter, there is no expectation of you changing e.g. the database vendor after the image has been built. Of course one might need to inject database credentials, as one doesn't want to store them within the image; it's good practice to inject those at runtime, so that the same container can be used in various staging environments and in production without any modification.
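
For example, something along these lines, where the image is built once and only the runtime bits change per environment (the image name is made up; the environment variable names follow the standard mapping of the quarkus.datasource.* runtime properties):

docker run \
  -e QUARKUS_DATASOURCE_JDBC_URL=jdbc:postgresql://db.prod.internal:5432/app \
  -e QUARKUS_DATASOURCE_USERNAME=app \
  -e QUARKUS_DATASOURCE_PASSWORD=... \
  registry.example.com/my-quarkus-app:1.0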
 
I see now that even the JDBC/Hibernate fine-tuning config parameters, such as fetch and batch size, max depth, bind parameters... all of those config options are now build time only.  That is going to go over like a lead balloon at our company.  I don't know how we, or anyone, can know all of this with 100% certainty at build time.

When it comes to batch sizes and the like, we're in performance-tuning territory; this is a different area, one that is not absolutely essential to be "runtime changeable" but for which we tend to agree it should indeed be possible to update. Among the reasons for this there's of course the fact that performance tuning requires quick adaptability to the environment and load.
However, it's sometimes technically very difficult for us to make sure one portion of the configuration is cast in stone while another is not, so it's still a work in progress to refine and identify which properties really should be made adaptable at runtime.  If there's a specific property in this area which is not updatable, please open an issue and we'll evaluate it individually; but please do remember that we're not going to be willing to make all configuration properties updatable at runtime: for some it's just not a good idea.

As a guideline, I can give you these extremes of the range:

 - Changing the database driver -> we're unlikely to ever allow that at runtime, as it has massive implications
 - anything in between -> up for discussion
 - changing credentials, domain names, etc. -> to be allowed at runtime.

Generally speaking, if there's anything that would imply you also need to change some code, such as changing an internal implementation of some service, that's a sign that it might not be a good idea.

Specifically if you need to support multiple databases with the same application, there are many options that can be explored without changing the Quarkus design.

For example, you could configure two data sources and two Hibernate ORM instances (with different names), and decide which one you're going to use at runtime. This does sound like it would add a lot of overhead, and you'd be right, but I assure you it would still be less overhead than having us support all options at runtime. Of course this option doesn't scale well if you need to support many of them, in which case there are possibly better alternatives.

The alternative I like better is to defer the final build step. I don't know what kind of application you're building, but if you're for example shipping a highly customizable product in which people can choose a database and possibly include some plugins, you might need to defer to end users running some last-minute build script (hopefully in a container?) so they can bake their own customized and optimised output. This would be necessary anyway, as otherwise "plug-ins" can't be loaded dynamically, especially if you look at native-image and need to comply with its closed-world assumption.

And of course there's the option to ship different flavours. If you have only a select set of databases that you "certify" and test for, it might not be too bad to ship, say, 4 binaries of about 70MB each. It's certainly better than shipping a single binary of 1+ GB, which is what you'd have without these limitations.


How does anyone who has to both produce and support a real production application deal with this in the real world?  I need to know so I can present options/reasons/merits for this.

There are many notions of "production application", so I'm not sure how helpful this is for your case, but generally we tend to recommend running the same database vendor and version for testing and production. Allowing a container image to change its nature so dramatically as to move from one database vendor to another doesn't seem like a requirement for "production".

As an example, just the code supporting special cases and special needs of Oracle amounts to 400+ classes in Hibernate ORM. Not only are we able to remove these classes when building for a non-Oracle database, but removing them also allows the compiler to "fold" a significant number of branches into constants: more dead-code optimisations, fewer polymorphic invocations, less memory consumption, which I hope is very welcome for all users who aren't interested in Oracle database support. Not to mention the size of the Oracle driver alone, which would double the size of your container. Apply such costs to each database vendor, and you might as well run a full application server.  Now this was "just" the angle of database vendor choice - you can repeat the same process on other angles, and the benefits quickly compound, as you're in multi-dimensional optimisation territory.

HTH
Sanne


 

Stuart Douglas

Apr 7, 2021, 8:11:26 PM
to Sanne Grinovero, David Hoffer, Ivan St. Ivanov, Roberto Cortez, Quarkus Development mailing list
On Wed, 7 Apr 2021 at 23:38, Sanne Grinovero <sa...@hibernate.org> wrote:
For example, you could configure two data sources and two Hibernate ORM instances (with different names), and decide which one you're going to use at runtime. This does sound like it would add a lot of overhead, and you'd be right, but I assure you it would still be less overhead than having us support all options at runtime. Of course this option doesn't scale well if you need to support many of them, in which case there are possibly better alternatives.

Should we look into supporting this option better? At the moment I don't think it would actually work, as you would need both connections to be present or boot would fail.

We now have the notion of an UnconfiguredDataSource internally, which is a DS that was configured at build time but did not have a URL provided, so it will fail if anything actually tries to use it. This means that if Hibernate attempts to start and detects an UnconfiguredDataSource, it will fail to boot.

We could add an 'optional' config property to named PUs so that they simply won't boot if the datasource is unconfigured, so the config would look something like:

quarkus.datasource."postgres".db-kind=postgres
quarkus.hibernate-orm."postgres".database.generation=drop-and-create
quarkus.hibernate-orm."postgres".datasource=postgres
quarkus.hibernate-orm."postgres".optional=true

quarkus.datasource."mysql".db-kind=mysql
quarkus.hibernate-orm."mysql".database.generation=drop-and-create
quarkus.hibernate-orm."mysql".datasource=mysql
quarkus.hibernate-orm."mysql".optional=true

We also add a 'ConfiguredPersistenceUnits' CDI bean so the user can query which ones are enabled:

interface ConfiguredPersistenceUnits {
    Set<String> getConfigured();
}

Then the user can use a CDI producer method to produce the correct EM:


import java.util.Set;

import javax.annotation.PostConstruct;
import javax.enterprise.context.ApplicationScoped;
import javax.enterprise.inject.Produces;
import javax.enterprise.inject.literal.NamedLiteral;
import javax.enterprise.inject.spi.CDI;
import javax.inject.Inject;
import javax.persistence.EntityManager;

@ApplicationScoped
public class HibernateProvider {

    @Inject
    ConfiguredPersistenceUnits pus;

    EntityManager em;

    @PostConstruct
    void init() {
        // expect exactly one persistence unit to be active for this deployment
        Set<String> configured = pus.getConfigured();
        if (configured.size() != 1) {
            throw new IllegalStateException("Expected exactly one configured persistence unit, got: " + configured);
        }
        // look up the EntityManager of the single configured unit by name
        em = CDI.current()
                .select(EntityManager.class, NamedLiteral.of(configured.iterator().next()))
                .get();
    }

    @Produces
    EntityManager em() {
        return em;
    }
}
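
Application code would then just inject the plain EntityManager as usual; an illustrative consumer (the entity name is made up):

import javax.enterprise.context.ApplicationScoped;
import javax.inject.Inject;
import javax.persistence.EntityManager;

@ApplicationScoped
public class WarehouseQueries {

    @Inject
    EntityManager em; // resolved by the producer above, for whichever PU is configured

    public long countRecords() {
        return em.createQuery("select count(r) from WarehouseRecord r", Long.class)
                .getSingleResult();
    }
}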

There would be some work to do around this (e.g. making sure we don't try to generate the proxies twice), but maybe this is a solution to the multiple-datasource problem that lets us keep all our optimisations while users can still use multiple data sources if they really need it.

Stuart

 

David Hoffer

Apr 7, 2021, 8:23:51 PM
to Stuart Douglas, Sanne Grinovero, Ivan St. Ivanov, Roberto Cortez, Quarkus Development mailing list
First, I want to say thanks for the prior detailed response on this issue.  I have been testing various options/approaches today.

Second, regarding the option of the app having all the data sources defined and us selecting which one(s) we want at runtime: that did work.  We currently have 3 data sources; one is always available, and only one of the other two will be available at any given time.  I tested this by giving that data source a bogus URL (as our dev stack has all 3).  What happened is a nasty exception and stack trace for that one, but the others worked and the app did run.  Based on your comment it sounds like you expected it to fail?  Perhaps it did not fail to run because the other two connected?

Yes I do like your suggestion of making this more flexible.  

-Dave

Stuart Douglas

Apr 7, 2021, 8:37:45 PM
to David Hoffer, Sanne Grinovero, Ivan St. Ivanov, Roberto Cortez, Quarkus Development mailing list
On Thu, 8 Apr 2021 at 10:23, David Hoffer <dhof...@gmail.com> wrote:
First, I want to say thanks for the prior detailed response on this issue.  I have been testing various options/approaches today.

Second, regarding the option of the app having all the data sources defined and us selecting which one(s) we want at runtime: that did work.  We currently have 3 data sources; one is always available, and only one of the other two will be available at any given time.  I tested this by giving that data source a bogus URL (as our dev stack has all 3).  What happened is a nasty exception and stack trace for that one, but the others worked and the app did run.  Based on your comment it sounds like you expected it to fail?  Perhaps it did not fail to run because the other two connected?

What is the stack trace? I would have expected this to cause the application to fail to start (it does if you only have one datasource). Are you using Hibernate for all 3 datasources?

Stuart

David Hoffer

Apr 7, 2021, 8:39:44 PM
to Stuart Douglas, Sanne Grinovero, Ivan St. Ivanov, Roberto Cortez, Quarkus Development mailing list
Yes, Hibernate for all 3.  I can get the stack trace in the morning; I'm not at that system at the moment.

-Dave

Sanne Grinovero

Apr 8, 2021, 9:12:26 AM
to David Hoffer, Stuart Douglas, Ivan St. Ivanov, Roberto Cortez, Quarkus Development mailing list

Right, we can improve on this. But I'd also love to understand why one wouldn't prefer using a differently built application. There are significant benefits to having dedicated build pipelines for each application that is significantly different.

@David: please open a feature request and add the stack traces there?

@Stuart: +1 to improving on this, hopefully after having heard more. Not sure about "optional"; maybe a simple "disabled" which can be set at runtime to skip starting it?  It's useful to know whether there was explicit intent to disable.

David Hoffer

Apr 8, 2021, 10:27:38 AM
to Quarkus Development mailing list
@Sanne Regarding why we/one might not want differently built applications: there are possibly many reasons. First, they are not significantly different applications; they are exactly the same application, just with one DS pointing to a different provider, so it seems like way overkill to need separate binaries to switch a DS provider.  Some of the other objections might be: do we now have to have multiple DEV and TEST stacks to test the multiple binaries?  What about compliance/IA issues on approving the binaries; is that now duplicated?  Tons of possible issues I can foresee.

@Stuart Here is the stack trace I get when I configure one of our DSs to have a bad URL, to simulate that DS not being available.  After these warnings the app continued to run and connect to the 2 other DSs.  I can investigate the feature request, but wanted to get this data out to folks to review.

2021-04-07 15:56:39,056 WARN  (main) [org.hibernate.engine.jdbc.env.internal.JdbcEnvironmentInitiator.initiateService()] HHH000342: Could not obtain connection to query metadata: java.sql.SQLException: [Amazon][HiveJDBCDriver](500164) Error initialized or created transport for authentication: java.net.UnknownHostException: junk-host.
        at com.amazon.hiveserver2.hivecommon.api.HiveServer2ClientFactory.createTransport(Unknown Source)
        at com.amazon.hiveserver2.hivecommon.api.ServiceDiscoveryFactory.createClient(Unknown Source)
        at com.amazon.hiveserver2.hivecommon.core.HiveJDBCCommonConnection.establishConnection(Unknown Source)
        at com.amazon.hiveserver2.jdbc.core.LoginTimeoutConnection.connect(Unknown Source)
        at com.amazon.hiveserver2.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
        at com.amazon.hiveserver2.jdbc.common.AbstractDriver.connect(Unknown Source)
        at io.agroal.pool.ConnectionFactory.createConnection(ConnectionFactory.java:200)
        at io.agroal.pool.ConnectionPool$CreateConnectionTask.call(ConnectionPool.java:452)
        at io.agroal.pool.ConnectionPool$CreateConnectionTask.call(ConnectionPool.java:434)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at io.agroal.pool.util.PriorityScheduledExecutor.beforeExecute(PriorityScheduledExecutor.java:65)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1126)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
Caused by: com.amazon.hiveserver2.support.exceptions.GeneralException: [Amazon][HiveJDBCDriver](500164) Error initialized or created transport for authentication: java.net.UnknownHostException: junk-host.
        ... 13 more
Caused by: com.amazon.hive.jdbc41.internal.apache.thrift.transport.TTransportException: java.net.UnknownHostException: junk-host
        at com.amazon.hive.jdbc41.internal.apache.thrift.transport.TSocket.open(TSocket.java:185)
        at com.amazon.hive.jdbc41.internal.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:248)
        at com.amazon.hive.jdbc41.internal.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
        at com.amazon.hiveserver2.hivecommon.api.HiveServer2ClientFactory.createTransport(Unknown Source)
        at com.amazon.hiveserver2.hivecommon.api.ServiceDiscoveryFactory.createClient(Unknown Source)
        at com.amazon.hiveserver2.hivecommon.core.HiveJDBCCommonConnection.establishConnection(Unknown Source)
        at com.amazon.hiveserver2.jdbc.core.LoginTimeoutConnection.connect(Unknown Source)
        at com.amazon.hiveserver2.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
        at com.amazon.hiveserver2.jdbc.common.AbstractDriver.connect(Unknown Source)
        at io.agroal.pool.ConnectionFactory.createConnection(ConnectionFactory.java:200)
        at io.agroal.pool.ConnectionPool$CreateConnectionTask.call(ConnectionPool.java:452)
        at io.agroal.pool.ConnectionPool$CreateConnectionTask.call(ConnectionPool.java:434)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at io.agroal.pool.util.PriorityScheduledExecutor.beforeExecute(PriorityScheduledExecutor.java:65)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1126)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.net.UnknownHostException: junk-host
        at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:220)
        at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403)
        at java.base/java.net.Socket.connect(Socket.java:609)
        at com.amazon.hive.jdbc41.internal.apache.thrift.transport.TSocket.open(TSocket.java:180)
        ... 16 more

2021-04-07 15:56:43,599 WARN  (main) [org.hibernate.engine.jdbc.spi.TypeInfo.extractTypeInfo()] HHH000362: Unable to retrieve type info result set : java.sql.SQLException: Could not find the "XcoreXamazonathenaX200X7522.fza@3b5365b8".
2021-04-07 15:56:43,601 WARN  (main) [org.hibernate.engine.jdbc.spi.TypeInfo.extractTypeInfo()] HHH000362: Unable to retrieve type info result set : java.sql.SQLException: Could not find the "XcoreXamazonathenaX200X7522.fza@2b31269d".
2021-04-07 15:56:45,214 WARN  (main) [org.hibernate.engine.jdbc.spi.TypeInfo.extractTypeInfo()] HHH000362: Unable to retrieve type info result set : java.sql.SQLException: Could not find the "XcoreXamazonathenaX200X7522.fza@2eda15dd".
2021-04-07 15:56:45,215 WARN  (main) [org.hibernate.engine.jdbc.spi.TypeInfo.extractTypeInfo()] HHH000362: Unable to retrieve type info result set : java.sql.SQLException: Could not find the "XcoreXamazonathenaX200X7522.fza@76d3e6e1".
2021-04-07 15:56:45,921 WARN  (agroal-21) [io.agroal.pool.onWarning()] Datasource 'emrDS': [Amazon][HiveJDBCDriver](500164) Error initialized or created transport for authentication: java.net.UnknownHostException: junk-host.
2021-04-07 15:56:45,922 WARN  (main) [org.hibernate.engine.jdbc.env.internal.JdbcEnvironmentInitiator.initiateService()] HHH000342: Could not obtain connection to query metadata: java.sql.SQLException: [Amazon][HiveJDBCDriver](500164) Error initialized or created transport for authentication: java.net.UnknownHostException: junk-host.
        at com.amazon.hiveserver2.hivecommon.api.HiveServer2ClientFactory.createTransport(Unknown Source)
        at com.amazon.hiveserver2.hivecommon.api.ServiceDiscoveryFactory.createClient(Unknown Source)
        at com.amazon.hiveserver2.hivecommon.core.HiveJDBCCommonConnection.establishConnection(Unknown Source)
        at com.amazon.hiveserver2.jdbc.core.LoginTimeoutConnection.connect(Unknown Source)
        at com.amazon.hiveserver2.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
        at com.amazon.hiveserver2.jdbc.common.AbstractDriver.connect(Unknown Source)
        at io.agroal.pool.ConnectionFactory.createConnection(ConnectionFactory.java:200)
        at io.agroal.pool.ConnectionPool$CreateConnectionTask.call(ConnectionPool.java:452)
        at io.agroal.pool.ConnectionPool$CreateConnectionTask.call(ConnectionPool.java:434)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at io.agroal.pool.util.PriorityScheduledExecutor.beforeExecute(PriorityScheduledExecutor.java:65)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1126)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
Caused by: com.amazon.hiveserver2.support.exceptions.GeneralException: [Amazon][HiveJDBCDriver](500164) Error initialized or created transport for authentication: java.net.UnknownHostException: junk-host.
        ... 13 more
Caused by: com.amazon.hive.jdbc41.internal.apache.thrift.transport.TTransportException: java.net.UnknownHostException: junk-host
        at com.amazon.hive.jdbc41.internal.apache.thrift.transport.TSocket.open(TSocket.java:185)
        at com.amazon.hive.jdbc41.internal.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:248)
        at com.amazon.hive.jdbc41.internal.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
        at com.amazon.hiveserver2.hivecommon.api.HiveServer2ClientFactory.createTransport(Unknown Source)
        at com.amazon.hiveserver2.hivecommon.api.ServiceDiscoveryFactory.createClient(Unknown Source)
        at com.amazon.hiveserver2.hivecommon.core.HiveJDBCCommonConnection.establishConnection(Unknown Source)
        at com.amazon.hiveserver2.jdbc.core.LoginTimeoutConnection.connect(Unknown Source)
        at com.amazon.hiveserver2.jdbc.common.BaseConnectionFactory.doConnect(Unknown Source)
        at com.amazon.hiveserver2.jdbc.common.AbstractDriver.connect(Unknown Source)
        at io.agroal.pool.ConnectionFactory.createConnection(ConnectionFactory.java:200)
        at io.agroal.pool.ConnectionPool$CreateConnectionTask.call(ConnectionPool.java:452)
        at io.agroal.pool.ConnectionPool$CreateConnectionTask.call(ConnectionPool.java:434)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at io.agroal.pool.util.PriorityScheduledExecutor.beforeExecute(PriorityScheduledExecutor.java:65)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1126)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.net.UnknownHostException: junk-host
        at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:220)
        at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403)
        at java.base/java.net.Socket.connect(Socket.java:609)
        at com.amazon.hive.jdbc41.internal.apache.thrift.transport.TSocket.open(TSocket.java:180)
        ... 16 more  

Stuart Douglas

Apr 11, 2021, 11:33:48 PM
to David Hoffer, Quarkus Development mailing list
I realised that I am usually using create+drop with Hibernate, so if there is no underlying connection, the attempt to create the schema will fail, which fails boot.

Stuart

Sanne Grinovero

Apr 13, 2021, 5:26:12 AM
to David Hoffer, Quarkus Development mailing list
On Thu, 8 Apr 2021 at 15:27, David Hoffer <dhof...@gmail.com> wrote:
@Sanne Regarding why we/one might not want differently built applications: there are possibly many reasons. First, they are not significantly different applications; they are exactly the same application, just with one DS pointing to a different provider, so it seems like way overkill to need separate binaries to switch a DS provider.

Honestly this puzzles me a bit. If you switch DS provider, there are a lot of aspects that need to be adapted, ranging from the most obvious ones (the generated SQL constants, the actual JDBC driver and all its dependencies) to much more subtle ones: optimal connection-management strategies, different error-handling strategies, and ID generation strategies, which have an impact on which thread generates IDs, how we can pool them, and when they are assigned, including concerns such as whether such operations are performed within a transaction, with or without locks, who owns the lock, who owns the sub-transactions, at which point database constraints are verified, which types are available to be mapped, which ranges are actually valid within such types, and questions such as whether your application model is ready to deal with persistent objects whose ID is assigned by some deferred strategy. Which in turn has implications on code paths beyond ORM: the Transaction Manager code paths, the available options for cache implementation strategies, and it possibly trickles down to many more components, since clearly, if some operations are deferred or performed in other threads, that has hard-to-predict implications across the board.

So when operations happen, in which thread they happen, and with which consistency guarantees all change, and different implementations of all the internal components are being used. I would definitely recommend having separate integration tests for each of your DS providers.

As I mentioned in a previous email, just the support for the Oracle DB in Hibernate ORM affects about 400 classes - that's more than 50% of the classes you're actually going to use, and this is just the ORM bits. The Oracle JDBC driver alone contributes ~50MB out of a Quarkus application of 80MB - so that's 30MB for everything else, of which I'd guess at least half is going to have different semantics when switching DB vendors. So that's - very roughly - less than 20% of code which is "the same" across two simple applications when you change database (these figures could use some better testing: I'm basing them on past experience, so I might be off a bit, and it definitely depends on other aspects of your application, but hopefully my point comes across).

The good news is that apparently we're doing a very good job at hiding all these details, since most people seem unaware :)

But please don't assume that an application which has been thoroughly tested on one DB will work the same on a different DB. The abstraction is effective enough that you don't need to worry about such details during development, allowing you to produce and maintain such different application flavours at minimal cost, but in terms of quality and risk management I think you're better off considering them as what they are: different output targets, with separate requirements when it comes to QA, certifications and compliance.

Hopefully a good deal of the compliance checks also don't need to be fully repeated; for example, it generally helps simplify things that the two different output targets share the same source tree.

HTH


 

David Hoffer

Apr 13, 2021, 11:37:54 AM
to Sanne Grinovero, Quarkus Development mailing list
@Sanne Thanks again for your detailed reply.  Yes, we understand that there are a lot of differences at runtime when different data sources are used.  However, yes, you do have a good abstraction layer at the @PersistenceUnit() injection point, so our application logic is 100% the same using @PersistenceUnit("A") or @PersistenceUnit("B").  Our only difference is in application.yml, where we configure A & B.  We understand we have to test and tune each of these separately, and that is why we prefer to have 'tuning' parameters changeable at runtime and not just at build time (we still need to check which ones we need and which ones are runtime configurable... hopefully they are all configurable).

We did present to our management the option of separate Quarkus binaries for A & B, but that approach was not well received, so we are taking the approach of having A & B in our single app binary, and we added runtime logic to switch between A & B.

-Dave

David Hoffer

Apr 13, 2021, 12:06:02 PM
to Sanne Grinovero, Quarkus Development mailing list
@Stuart Per the prior request in this thread for me to create a feature request to improve how Quarkus handles multiple data sources that might not be available at runtime, I have created the following ticket.


Regarding your use of create+drop... yeah, we never use that, as we are always connecting to existing databases, usually with tons of existing data.

Thanks,
-Dave