Npgsql is slow in a high demand environement

3,246 views
Skip to first unread message

alberto...@infojobs.com.br

unread,
Mar 19, 2014, 8:35:04 AM3/19/14
to npgsq...@googlegroups.com
Hi,

We are currently in the process of migrating data from SQLServer to PostgreSQL Databases and we have a problem:

When we have a lot of connections on Postgres the system fall down.

Our Applicattion:
   - It's made with asp.net / c#/ framework 4.0
   - We connect to Postgres with Npgsql 2.0.12
   - We access to postgres through database functions
   - We use an architecture of 5 methods to access to the database:
Read - do a select by primary key.
Update - modify an entire row by primary key
Insert - Inserts a row
List - select a rows filtering by a filter and a pagination, and return the number of registers total that complaint this filter. The connection of this function is made as a transaction (READUNCOMMITTED), returning two refcursors.
Count - return the number of registers total that complaint the filter

Each of these methods has associated a function in postgreSQL Database.


When fall down:
   - We have between 400 and 800 simultaneous connections and we expect that this number will increase.

Our Server:
   - Linux Ubuntu 12.04
   - 12 Cores
   - 72 GB RAM
work_mem = 300MB
shared_buffers = 10GB # (change requires restart)
temp_buffers = 2GB
maintenance_work_mem = 2GB
max_connections 1000
   - We tested yields pgbench the server work as expected.


Where is the problem?
Connection Pool? Transactions? Cursors?  What we are doing incorrectly?
Is it possible than In release 2.1 you have solved this problem?

We attach the pgbench test:

pgbench emp_bra_lctr -C- c 700- j 50 - T 60 -U postgres
starting vacuum ... end .
transaction type: TPC- B ( sort of)
scaling factor : 1
query mode: single
number of clients: 700
number of threads: 50
duration: 60 s
number of transactions actually processed : 8673
tps = 135.914877 (including connections Establishing )
tps = 138.361335 (excluding connections Establishing )





Regards,
Candidate_Stack_Test.sql

Francisco Figueiredo Jr.

unread,
Mar 19, 2014, 4:56:27 PM3/19/14
to alberto...@infojobs.com.br, npgsql-help



On Wed, Mar 19, 2014 at 9:35 AM, <alberto...@infojobs.com.br> wrote:
Hi,


Hi, Alberto!
 
We are currently in the process of migrating data from SQLServer to PostgreSQL Databases and we have a problem:


Nice you are migrating your database to Postgresql!

Sad that Npgsql is giving you this problem. Let's check it out and solve that!
Very nice setup!

Which errors are you getting? Are you receiving "connection timeout" problems or the system simply starts to fail to respond everything? 

Which postgresql version are you using? 9.3? If so, Npgsql 2.0.14.3 had some fixes for Npgsql regarding error messages handling from Postgresql 9.3

In your architecture, are you open a new connection for each requisition? This is the recommended pattern. Each operation should open a connection, do the work and then close the connection. This pattern will enable you maximize your scalability. 


Regarding connection pool, we received a patch which I'm reviewing that adds some scalability improvements. This may be related to your problem with many connections. https://github.com/npgsql/Npgsql/pull/178
We could try this patch too.

What is the value of MaxPoolSize you are using in Npgsql connection string? 

 
We attach the pgbench test:

pgbench emp_bra_lctr -C- c 700- j 50 - T 60 -U postgres
starting vacuum ... end .
transaction type: TPC- B ( sort of)
scaling factor : 1
query mode: single
number of clients: 700
number of threads: 50
duration: 60 s
number of transactions actually processed : 8673
tps = 135.914877 (including connections Establishing )
tps = 138.361335 (excluding connections Establishing )




Thanks! I'll have a look at it too.


--
Regards,

Francisco Figueiredo Jr.
Npgsql Lead Developer
http://www.npgsql.org
http://gplus.to/franciscojunior
http://fxjr.blogspot.com
http://twitter.com/franciscojunior

alberto...@infojobs.com.br

unread,
Mar 20, 2014, 8:02:07 AM3/20/14
to npgsq...@googlegroups.com, alberto...@infojobs.com.br, fran...@npgsql.org
Hi,

Very happy for your quick response. First of all, thank you and your development team the great work done with this connector.


Read more about our data access layer and on our servers:

Our application :
- It's divided into 2 layers of access , the first containing a Helper hits on PostgreSQL, SQLServer , MySQL , ODBC ... the second is a layer representing each of the tables in our database .
- I have attachado our DataHelper and a class of data access layer as an example.
- Right now we are testing the Npgsql version 2.1RC.

DataHelper:

     - ADataBase: abstract class that contains all the functionality of access.
            
. ExecureReader
            
. ExecuteQuery
            
. ExecuteScalar
            
. ExecuteNonQuery
    
- MySqlDataBase , PostgreSQLDataBase , SqlServerDataBase: contains Connections, DataAdapters , DataCommands and DataParameters , system specific for each database connection
    
- DataParameter : Generic dataParameter for DataHelper
.

Data layer specifies:

   - An example of it is that I send in class "Candidate_Stack_Test", which is one of our tables and complies with this:

   - We use an architecture of 5 methods to access to the database:
Read - do a select by primary key.
Update - modify an entire row by primary key
Insert - Inserts a row
List - select a rows filtering by a filter and a pagination, and return the number of registers total that complaint this filter. The connection of this function is made as a transaction (READUNCOMMITTED), returning two refcursors.
Count - return the number of registers total that complaint the filter



Our server:
 - PostgreSQL 9.3 installed

Answering your questions:


Which errors are you getting? Are you receiving "connection timeout" problems or the system simply starts to fail to respond everything?

 - Timeout errors are usually because the database can not be removed fast enough queries, but contrary to what it may seem, the timeout are completely random, occurs in queries easy to resolve.

Some examples:

some examples:
 
" System.Exception : Timeout while getting a connection from pool .
at Npgsql.NpgsqlConnectorPool.RequestConnector ( NpgsqlConnection Connection)
at Npgsql.NpgsqlConnection.Open ( )
at IFCustomData.Data.ADataBase.ExecuteQuery ( IsolationLevel pTransactio (...) ...
System.Exception : Connector Message - DIC_Profile_Lst - : Timeout while getting a connection from pool . - Failed to perform an operation against the database ---> System.Exception : Timeout while getting a connection from pool .
Npgsql.NpgsqlConne at ( ... )
"

" System.Exception :
Message Connector - Candidate_Stack_Test_Lst - :
ERROR : 57014 : canceling statement due to statement timeout - Error when performing an operation against the database
---> Npgsql.NpgsqlException : ERROR : 57014 : canceling statement due to st ( ... ) "

" System.Exception :
Message Connector - Candidate_Stack_Test_Lst - :
ERROR : 57014 : canceling statement due to statement timeout - Error when performing an operation against the database
---> Npgsql.NpgsqlException : ERROR : 57014 : canceling statement due to st ( ... ) "

" System.Exception :
Message Connector - Candidate_Stack_Test_Lst - :
A timeout has occured . If you were Establishing a connection , Increase Timeout value in ConnectionString .
If you were executing a command , Increase the CommandTimeout value in ConnectionStr ( ... ) "

Which postgresql version are you using? 9.3?

PostgreSQL 9.3

In your architecture, are you open a new connection for each requisition?

If we use the pattern of opening and closing connection on each call , as you will see in the classes that I send you.

Regarding connection pool, we received a patch which I'm reviewing that adds some scalability improvements. This may be related to your problem with many connections.
We could try this patch too.


Just this Wednesday 19th applied this patch on the Npgsql version 2.1RC

We do a load test at "REAL" environment and worked better, but still does not seem good enough.

Anyway this coming Monday we planning at REAL environment to all our applications already include Npgsql 2.1RC version with fix 178 and 182, to check "in- situ" with what we already have migrated to PostgreSQL if managed load better .


What is the value of MaxPoolSize you are using in Npgsql connection string?

The value of MaxPoolSize is that you have the default ( 20).

An example connectionString :

<add name="DataBase.PostgreSql.Wrt" connectionString="Server=brapostgressqldev01.infobras.corp;Port=5432;Database=emp_bra;User Id=postgres;"/>


A good value for us could be ( 100)?

Not sure if it is for the following reasons:
- How to use the queries through database functions. Especially the "_Lst" together with the Npgsql.
-
ConnectionString setup for PostgreSQL.
-
Or that the connector Npgsql is less effective under certain conditions.


Thank you very much!
We are also reviewing our side. If we see something of interest I will communicate to you.
DataParameter.cs
PostgreSQLDataBase.cs
SqlServerDataBase.cs
ADataBase.cs
GenericHelper.cs
MySqlDataBase.cs
OleDBDataBase.cs
Candidate_Stack_Test.cs

Francisco Figueiredo Jr.

unread,
Mar 20, 2014, 1:04:58 PM3/20/14
to alberto...@infojobs.com.br, npgsql-help
On Thu, Mar 20, 2014 at 9:02 AM, <alberto...@infojobs.com.br> wrote:
Hi,


Hi, Alberto!
 
Very happy for your quick response. First of all, thank you and your development team the great work done with this connector.


Thank you! I'm glad to hear that!
 

Read more about our data access layer and on our servers:

Our application :
- It's divided into 2 layers of access , the first containing a Helper hits on PostgreSQL, SQLServer , MySQL , ODBC ... the second is a layer representing each of the tables in our database .
- I have attachado our DataHelper and a class of data access layer as an example.

Ok. I'll have a look at it.
 

- Right now we are testing the Npgsql version 2.1RC.

Yesterday we released the final version. You can use that instead of RC.
 

DataHelper:

     - ADataBase: abstract class that contains all the functionality of access.
            
. ExecureReader
            
. ExecuteQuery
            
. ExecuteScalar
            
. ExecuteNonQuery
    
- MySqlDataBase , PostgreSQLDataBase , SqlServerDataBase: contains Connections, DataAdapters , DataCommands and DataParameters , system specific for each database connection
    
- DataParameter : Generic dataParameter for DataHelper
.


Ok.
 
Data layer specifies:
   - An example of it is that I send in class "Candidate_Stack_Test", which is one of our tables and complies with this:

   - We use an architecture of 5 methods to access to the database:
Read - do a select by primary key.
Update - modify an entire row by primary key
Insert - Inserts a row
List - select a rows filtering by a filter and a pagination, and return the number of registers total that complaint this filter. The connection of this function is made as a transaction (READUNCOMMITTED), returning two refcursors.
Count - return the number of registers total that complaint the filter



Ok. I'll have a look at it.
 


Our server:
 - PostgreSQL 9.3 installed

Answering your questions:


Which errors are you getting? Are you receiving "connection timeout" problems or the system simply starts to fail to respond everything?

 - Timeout errors are usually because the database can not be removed fast enough queries, but contrary to what it may seem, the timeout are completely random, occurs in queries easy to resolve.

Some examples:

some examples:
 
" System.Exception : Timeout while getting a connection from pool .
at Npgsql.NpgsqlConnectorPool.RequestConnector ( NpgsqlConnection Connection)
at Npgsql.NpgsqlConnection.Open ( )
at IFCustomData.Data.ADataBase.ExecuteQuery ( IsolationLevel pTransactio (...) ...
System.Exception : Connector Message - DIC_Profile_Lst - : Timeout while getting a connection from pool . - Failed to perform an operation against the database ---> System.Exception : Timeout while getting a connection from pool .
Npgsql.NpgsqlConne at ( ... )

"

That's what I thought... A lot of problems with Npgsql in high demand is regarded to problems in the connection pool. 
Another cause for this problem are connections not being closed and so they aren't returned to the pool. 
 

" System.Exception :
Message Connector - Candidate_Stack_Test_Lst - :
ERROR : 57014 : canceling statement due to statement timeout - Error when performing an operation against the database
---> Npgsql.NpgsqlException : ERROR : 57014 : canceling statement due to st ( ... ) "

" System.Exception :
Message Connector - Candidate_Stack_Test_Lst - :
ERROR : 57014 : canceling statement due to statement timeout - Error when performing an operation against the database
---> Npgsql.NpgsqlException : ERROR : 57014 : canceling statement due to st ( ... ) "

" System.Exception :
Message Connector - Candidate_Stack_Test_Lst - :
A timeout has occured . If you were Establishing a connection , Increase Timeout value in ConnectionString .
If you were executing a command , Increase the CommandTimeout value in ConnectionStr ( ... ) "


Hmmmmm, you shouldn't be receiving this message unless you are under high pressure in your database server and it can't return data in the timeout time period. We need to check this problem closer. We don't receive this error frequently. 
  

Which postgresql version are you using? 9.3?

PostgreSQL 9.3

In your architecture, are you open a new connection for each requisition?

If we use the pattern of opening and closing connection on each call , as you will see in the classes that I send you.

Regarding connection pool, we received a patch which I'm reviewing that adds some scalability improvements. This may be related to your problem with many connections.
We could try this patch too.


Just this Wednesday 19th applied this patch on the Npgsql version 2.1RC

We do a load test at "REAL" environment and worked better, but still does not seem good enough.

Thanks for the feedback! Let's squash all the remaining bugs... :)
 


Anyway this coming Monday we planning at REAL environment to all our applications already include Npgsql 2.1RC version with fix 178 and 182, to check "in- situ" with what we already have migrated to PostgreSQL if managed load better .


Great! Let me know how it goes. This feedback will be very important.

 

What is the value of MaxPoolSize you are using in Npgsql connection string?

The value of MaxPoolSize is that you have the default ( 20).

An example connectionString :

<add name="DataBase.PostgreSql.Wrt" connectionString="Server=brapostgressqldev01.infobras.corp;Port=5432;Database=emp_bra;User Id=postgres;"/>


A good value for us could be ( 100)?


I think it would be  a good idea to increase this value. As you have a high load, it is very possible that all the connections are being used for some time which prevents new requests to  be satisfied. 
This is exact the problem the #178 and #182 patches are supposed to fix. But even so, there may be a load so high in the connections that you may be hitting a timeout problem when getting a connection from the pool. By increasing the pool, you may diminish this problem. Try to increase the value to 40 and see what happens.

If the problem persists, you may have a problem of connection contention. Where you may not be closing the connection and it is not returning to the pool. This is the most common cause of the timeout problem when getting a connection from the pool.

 
Not sure if it is for the following reasons:
- How to use the queries through database functions. Especially the "_Lst" together with the Npgsql.

There is no special way to use functions with Npgsql which would impact in those issues. I think you can discard this as a cause for your problems.
 
- ConnectionString setup for PostgreSQL.

I think that increasing the maxpoolsize would be the only change you can make which could help you with this issue.
 
- Or that the connector Npgsql is less effective under certain conditions.


I don't discard the fact that Npgsql may not be performing so well as it should be. We don't have so powerful machines to make tests with hundreds or thousands of clients like you are doing. I'm really sorry for that.  Most of the time we fix issues based on reports like yours which help us tune Npgsql.

But I hope that with your feedback we will fix more related bugs in Npgsql which will make it work o in your workload and this will help others who are using Npgsql under high load.

Thanks in advance for all your feedback and help.


 

Thank you very much!
We are also reviewing our side. If we see something of interest I will communicate to you.

Excellent!!  I'm looking forward hearing from you.

 

--
You received this message because you are subscribed to the Google Groups "Npgsql Help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to npgsql-help...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/npgsql-help/f3d1ba73-7dfe-4d36-b833-b9d0e8264b85%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

alberto...@infojobs.com.br

unread,
Apr 8, 2014, 5:04:41 AM4/8/14
to npgsq...@googlegroups.com
Hi,

Sorry I have not contacted you so far, but we have been busy.

First, thank you for the interest and rapid response through the release of the new version.

We are currently using version 2.1 with the following patches (# 178 and # 182 ), seems to work properly.

Our performance problems seem solved at the end.

I think eventually our problems had different origins.

Ignorance of this new database system , which stemmed in:
- A poor server configuration.
- Some queries not behaved the same way in "SqlServer" and "PostgreSQL", especially by the different management of Index between "SqlServer" and "PostgreSQL".
- And finally it seems that the recursion in the "ConnectionPool" makes a strong claim to the "Pool" is saturated and reject new connections, with Patches (# 178 and # 182 ) runs much better under strong demand and no longer rejects connections.

I have a seemingly simple question to answer.

I have seen that you have only initiate transactions available the following methods ( REPEATABLE READ, SERIALIZABLE , READ COMMITTED ), with READ COMMITTED the default choice .

I've also added the following method (READ UNCOMMITTED) and works well with PostgreSQL version 9.3.

It's as simple as this:

else if ( == isolation System.Data.IsolationLevel.ReadUncommitted )
{
commandText.Append ( "READ UNCOMMITTED ");
}
else
{
/ / Set default isolation level to read committed .
_isolation = IsolationLevel.ReadCommitted ;
commandText.Append ( "READ COMMITTED ");
}


In the constructor.

That is something in particular?
Having only these 3 methods of isolation? Is it a compatibility issue with earlier versions of postgreSQL?

Regards,

Francisco Figueiredo Jr.

unread,
May 21, 2014, 2:06:56 PM5/21/14
to npgsq...@googlegroups.com


On Tuesday, April 8, 2014 6:04:41 AM UTC-3, alberto...@infojobs.com.br wrote:
Hi,

Sorry I have not contacted you so far, but we have been busy.


Hi, No problem.

Sorry for late reply too.

  

First, thank you for the interest and rapid response through the release of the new version.

You are welcome! 
 

We are currently using version 2.1 with the following patches (# 178 and # 182 ), seems to work properly.


Our performance problems seem solved at the end.


Excellent! Thank you for your  feedback. This will help us get them merged in our master branch. 

I think eventually our problems had different origins.

Ignorance of this new  database system , which stemmed in:
 - A poor server configuration.
 - Some queries not behaved the same way in "SqlServer" and "PostgreSQL", especially by the different management of Index between "SqlServer" and "PostgreSQL".
 - And finally it seems that the recursion in the "ConnectionPool" makes a strong claim to the "Pool" is saturated and reject new connections, with Patches (# 178 and # 182 ) runs much better under strong demand and no longer rejects connections.


Great!
 

I have a seemingly simple question to answer.

I have seen that you have only initiate transactions available the following methods ( REPEATABLE READ, SERIALIZABLE , READ COMMITTED ), with READ COMMITTED the default choice .

I've also added the following method (READ UNCOMMITTED) and works well with PostgreSQL version 9.3.

It's as simple as this:

            else if ( == isolation System.Data.IsolationLevel.ReadUncommitted )
            {
                commandText.Append ( "READ UNCOMMITTED ");
            }
            else
            {
                / / Set default isolation level to read committed .
                _isolation = IsolationLevel.ReadCommitted ;
                commandText.Append ( "READ COMMITTED ");
            }


In the constructor.

That is something in particular?
Having only these 3 methods of isolation? Is it a compatibility issue with earlier versions of postgreSQL?


Very good!

Is it possible to you to create a pull request with this patch? If not, I can create the pull request. I'm asking so you can be assigned as the author. :)

And regarding your question, yes. It is a compatibility issue with earlier versions of postgresql. At first, Postgresql only supported two isolation methods. Later they added the other methods. I missed adding this method to Npgsql. 

Thanks for the heads up. 

 

Regards,

Reply all
Reply to author
Forward
0 new messages