PDI - Connection Pool's Context


Anderson F.

Sep 11, 2018, 8:19:30 AM
to Pentaho Community
Hi guys.


I have a question about the context of PDI's connection pool.

What is its scope? Is it per step? Per transformation? Per job?

Example:
Assume I've enabled connection pooling (initial: 5, maximum: 50), and that I am using a real ETL.

The ETL has a main job with several sub-jobs; each sub-job has several transformations, and each transformation has a step (with n copies) that accesses the database.

Does the connection pool cover the whole main job?
Or each transformation?
Or each step?

Best regards

Matt Casters (i-Bridge)

Sep 11, 2018, 10:07:36 AM
to pentaho-...@googlegroups.com
It's usually in the context of a transformation which gets executed repeatedly in a job or another transformation (loops). So it helps most where you have short-lived transformations connecting and disconnecting all the time.
Remember that connection pooling is only used to reduce connect/disconnect delays. It can't magically reduce the number of connections made to the database.
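[Editor's note: the two points above can be made concrete with a minimal sketch. This is illustrative only, not PDI's actual implementation (PDI delegates pooling to Apache Commons DBCP); the class and names below are made up. It shows that a pool reuses connections instead of re-opening them, while still enforcing a hard maximum.]

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Minimal connection-pool sketch. The Supplier stands in for a real
// (expensive) JDBC connect; the pool hands back idle connections
// instead of creating new ones, but never exceeds `max` in total.
class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int max;
    private int created = 0;

    SimplePool(Supplier<T> factory, int initial, int max) {
        this.factory = factory;
        this.max = max;
        for (int i = 0; i < initial; i++) {   // pre-open the initial connections
            idle.push(factory.get());
            created++;
        }
    }

    synchronized T borrow() {
        if (!idle.isEmpty()) return idle.pop();   // reuse: no connect cost
        if (created >= max)                       // the cap still applies
            throw new IllegalStateException("pool exhausted (max " + max + ")");
        created++;
        return factory.get();                     // connect only on demand
    }

    synchronized void release(T conn) { idle.push(conn); }

    synchronized int connectionsCreated() { return created; }
}

public class PoolDemo {
    public static void main(String[] args) {
        int[] opened = {0};
        SimplePool<String> pool = new SimplePool<>(() -> "conn-" + opened[0]++, 5, 50);

        // 1000 short-lived "transformations", each borrowing and returning
        // one connection: no new connections are opened after the initial 5.
        for (int i = 0; i < 1000; i++) {
            String c = pool.borrow();
            pool.release(c);
        }
        System.out.println(pool.connectionsCreated()); // prints 5
    }
}
```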
 
---
Matt Casters <matt.c...@ibridge.be>
Zaakvoerder i-Bridge bvba
Fonteinstraat 70, 9400 OKEGEM - Belgium - Cell : +32 486 97 29 37 -  skype: mattcasters



On Tue, Sep 11, 2018 at 14:19 Anderson F. <anderson...@gmail.com> wrote:

Anderson F.

Sep 11, 2018, 10:18:44 AM
to pentaho-...@googlegroups.com
Hi Matt

Thanks for the answer. 🙂

I do not expect a magic solution.😁 

I thought I would start using connection pooling because of the following problem: the database is configured to allow at most 200 connections, and since I am using parallelism in the transformation steps, the database started to show this error: "sorry, too many clients already" 😡
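[Editor's note: the arithmetic behind this error is worth spelling out. Every copy of a database step holds its own connection, so the demand multiplies fast. The numbers below are hypothetical, purely for illustration.]

```java
// Back-of-the-envelope connection budget: each copy of each database
// step in each concurrently running transformation opens a connection.
public class ConnectionBudget {
    public static void main(String[] args) {
        int concurrentTransformations = 5;   // assumed
        int dbStepsPerTransformation  = 2;   // assumed
        int copiesPerStep             = 25;  // assumed
        int demanded = concurrentTransformations * dbStepsPerTransformation * copiesPerStep;
        int maxConnections = 200;            // the limit mentioned in the thread

        System.out.println(demanded + " connections demanded vs " + maxConnections + " allowed");
        System.out.println(demanded > maxConnections); // true -> "too many clients already"
    }
}
```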


Matt Casters (i-Bridge)

Sep 11, 2018, 10:26:31 AM
to pentaho-...@googlegroups.com
So you did expect some kind of magic to happen :-)



On Tue, Sep 11, 2018 at 16:18 Anderson F. <anderson...@gmail.com> wrote:

Anderson F.

Sep 11, 2018, 10:41:28 AM
to pentaho-...@googlegroups.com
Hmm, is there any solution other than increasing the number of connections? 🤔

Nelson Sousa

Sep 11, 2018, 10:42:17 AM
to pentaho-...@googlegroups.com

Decrease the number of clients. ;)

(Sorry, couldn’t resist)




Anderson F.

Sep 11, 2018, 11:02:12 AM
to pentaho-...@googlegroups.com
"Decrease the number of clients. ;)"
"(Sorry, couldn't resist)"

No problem😂

Me: Sorry boss, but if you want to increase performance by increasing the number of step copies, we will have to change the database settings.

Boss: Hmm, didn't I give you responsibility for the database server too? (I'm an analyst, not a DBA.) So it's easy for you to sort this out. For a moment I thought we had a problem. 😅


Matt Casters (i-Bridge)

Sep 11, 2018, 11:50:56 AM
to pentaho-...@googlegroups.com
You're not doing the database or the ETL server any favors by loading them with 200 step copies in parallel.
I bet you're doing database lookups in a massive number of copies, or something silly like that.
Most of the time this is done much faster by either loading some data into memory or doing a merge join.


On Tue, Sep 11, 2018 at 17:02 Anderson F. <anderson...@gmail.com> wrote:

Anderson F.

Sep 11, 2018, 3:56:33 PM
to pentaho-...@googlegroups.com
It's true. 😅😅😅

I chose the approach of putting the heaviest part of the load on the database side. Reason: we were experiencing memory overflow problems (the Java server has 16 GB of RAM).

Matt Casters (i-Bridge)

Sep 12, 2018, 3:22:40 AM
to pentaho-...@googlegroups.com
With a merge join you can join large data sets very efficiently. By sorting the data on the database, the join can stream through both inputs quickly without holding them in memory.
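[Editor's note: the merge-join idea can be sketched as below. This is illustrative Java with made-up field names and data, not PDI's actual Merge Join step (which is more general). It assumes both inputs arrive sorted on the join key, e.g. via an ORDER BY pushed to the database, and that keys are unique on each side (a 1:1 inner join).]

```java
import java.util.ArrayList;
import java.util.List;

// Merge join over two inputs pre-sorted on `key`: walk both lists with
// one pointer each, advancing the side with the smaller key. No lookup
// table is built, so memory use stays constant regardless of input size.
public class MergeJoinDemo {
    record Row(int key, String value) {}

    static List<String> mergeJoin(List<Row> left, List<Row> right) {
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < left.size() && j < right.size()) {
            int l = left.get(i).key(), r = right.get(j).key();
            if (l < r)      i++;          // no match, advance left
            else if (l > r) j++;          // no match, advance right
            else {                        // keys match: emit the joined row
                out.add(left.get(i).value() + "|" + right.get(j).value());
                i++;
                j++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Both inputs sorted on key, as an ORDER BY would deliver them.
        List<Row> customers = List.of(new Row(1, "alice"), new Row(2, "bob"), new Row(4, "dan"));
        List<Row> orders    = List.of(new Row(2, "order-7"), new Row(3, "order-8"), new Row(4, "order-9"));
        System.out.println(mergeJoin(customers, orders)); // [bob|order-7, dan|order-9]
    }
}
```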


On Tue, Sep 11, 2018 at 21:56 Anderson F. <anderson...@gmail.com> wrote:

Anderson F.

Sep 12, 2018, 9:33:57 AM
to pentaho-...@googlegroups.com
Guys, thank you for all the answers 🙂
