Debugging "Out of memory (Needed XXXXXXX bytes)" error


Ann Su

Feb 15, 2017, 12:12:15 PM
to Google Cloud SQL discuss
Ever since we migrated to Google Cloud SQL we've been hitting an error every now and then:

/usr/sbin/mysqld: Out of memory (Needed 2576982012 bytes)

The number of bytes needed is always the same (~2.5 GB), and increasing RAM on the instance doesn't help.

Since there are no SUPER privileges or direct access to configs, and the scope of adjustable flags in the UI/console is very limited, we can't experiment with the MySQL configuration to find out what's causing the problem.

Do you have any ideas how we can debug this issue?

thanks,
Ann




paynen

Feb 15, 2017, 1:25:29 PM
to Google Cloud SQL discuss
Hey Ann,

Is there a particular query which is being run when this occurs? My guess would be that the memory is being used up by a particular query and the demands it makes on internal temporary tables, etc. Also, could you provide a sample of the error messages, along with the size of the instance memory that was configured when each one occurred? Finally, in order to help debug this, you may want to start logging the exact query used whenever catching an error like this.
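As a starting point, you could capture what's running at the moment of failure and check whether slow-query logging is available (a rough sketch; which flags are adjustable on Cloud SQL may vary):

```sql
-- See every statement currently executing, with full query text
SHOW FULL PROCESSLIST;

-- Check whether slow-query logging is enabled and what its threshold is
SHOW VARIABLES LIKE 'slow_query_log%';
SHOW VARIABLES LIKE 'long_query_time';
```

Correlating the processlist output with the timestamp of the error should narrow down the responsible query.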

Cheers,

Nick
Cloud Platform Community Support

Ann Su

Feb 20, 2017, 4:48:58 AM
to Google Cloud SQL discuss
Hi Nick,

There are several queries causing problems. I believe this could be related to some kind of sorting ("group by", "distinct", etc.) and small buffer sizes. The table uses MyISAM for performance reasons.

The instance has 52 GB of memory.
The full error is:

{
 textPayload: "2017-02-20T09:32:32.931234Z 3045181 [ERROR] /usr/sbin/mysqld: Out of memory (Needed 2576982012 bytes)"
 insertId: "s=6d6327081d4e4a1bab5830443a5d7d2f;i=3275fa;b=b2909fad69874f529830c69e12050af7;m=f0b6ccf14d;t=548f2f167843a;x=e9e749ac80b48145@a3"
 resource: {
  type: "cloudsql_database"
  labels: {
   project_id: "project-12345"
   database_id: "project-12345:master1"
   region: "europe"
  }
 }
 timestamp: "2017-02-20T09:32:32.932480Z"
 severity: "ERROR"
 logName: "projects/project-12345/logs/cloudsql.googleapis.com%2Fmysql.err"
}
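For reference, here is how I've been inspecting the session's buffer and temp-table settings (the variable list is just the ones I suspected, not exhaustive):

```sql
SHOW VARIABLES
 WHERE Variable_name IN ('tmp_table_size', 'max_heap_table_size',
                         'sort_buffer_size', 'read_rnd_buffer_size');
```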

thanks,
Ann

Ann Su

Feb 20, 2017, 5:19:20 AM
to Google Cloud SQL discuss
UPDATE:

Actually, all the problem queries involve DISTINCT, and removing it from the query makes the "out of memory" error vanish. However, DISTINCT is used for a reason, so we can't remove it.
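For anyone hitting the same thing: a common rewrite that sometimes sidesteps COUNT(DISTINCT) memory blow-ups is a derived table (the table and column names below are purely illustrative, not our actual schema):

```sql
-- Instead of:
--   SELECT COUNT(DISTINCT user_id) FROM events;
-- compute the same result via an explicit de-duplication step:
SELECT COUNT(*)
FROM (SELECT DISTINCT user_id FROM events) AS deduped;
```

Whether this actually avoids the error likely depends on how the optimizer materializes the derived table.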

Ann

Ann Su

Feb 20, 2017, 6:27:26 AM
to Google Cloud SQL discuss
UPDATE 2:

It seems that the error also disappears after reducing the session values of tmp_table_size and max_heap_table_size.
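Concretely, I lowered the limits per session before running the query, something like this (the exact values are illustrative):

```sql
-- Cap in-memory temp tables for this session only;
-- the effective limit is the smaller of the two variables
SET SESSION tmp_table_size = 64 * 1024 * 1024;      -- 64 MB
SET SESSION max_heap_table_size = 64 * 1024 * 1024; -- 64 MB
```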

Still, I don't understand why the system can't allocate more than 2.5 GB on a 52 GB machine.

cheers,
Ann

paynen

Feb 22, 2017, 6:21:45 PM
to Google Cloud SQL discuss
Hey Ann Su,

Thanks for keeping us updated with your findings. The issue seems to rest entirely on max_heap_table_size and tmp_table_size, since these limit how large internal in-memory temporary tables (the kind needed to run queries) can grow. But I'm a bit confused at this point: did you decrease these values, or increase them? It sounds like you decreased them, which ought to make the error more frequent by limiting even further the size the system can allocate to run queries.


Cheers,

Nick
Cloud Platform Community Support

Anka Sulkowska

Feb 23, 2017, 8:06:51 AM
to google-cloud...@googlegroups.com
Hi Nick,

yes, I DECREASED the values. Initially, I had max_heap_table_size and tmp_table_size set to the maximum possible value.

There is an old bug filed on the MySQL site related to "out of memory" and "count(distinct)":


Someone in the thread mentioned that it helped him to reduce max_heap_table_size and tmp_table_size (at least that's how I interpreted his answer) to avoid allocating more memory than the system can handle.

That leads me to two possible conclusions:

1) there could be a bug causing too much memory to be allocated to perform count(distinct)
2) there could be a configuration issue on your servers that disallows allocating more than 2.5 GB

both of which are, to some extent, supported by the fact that decreasing the limits actually helped me.

cheers,
Lukasz



paynen

Feb 23, 2017, 5:40:54 PM
to Google Cloud SQL discuss
Hey Ann Su,

After some brief research I believe we can refine our theory of this behaviour. When the system sees a temp table exceeding the size given by the system variables, it converts it from the MEMORY engine to a disk storage engine (probably MyISAM). From the MySQL manual:

The maximum size of internal in-memory temporary tables. This variable does not apply to user-created MEMORY tables.

The actual limit is determined from whichever of the values of tmp_table_size and max_heap_table_size is smaller. If an in-memory temporary table exceeds the limit, MySQL automatically converts it to an on-disk temporary table. As of MySQL 5.7.5, the internal_tmp_disk_storage_engine option defines the storage engine used for on-disk temporary tables. Prior to MySQL 5.7.5, the MyISAM storage engine is used.

I had figured an allocation error would be thrown when the limit was exceeded, which would have explained the error you saw; in fact, MySQL simply switches to disk storage.

The reason I say this explains the behaviour is that, by lowering the size limit for temporary tables, you lowered the memory demands of each query. It's likely that many other queries and connections were using up instance memory when the failing query ran, so that out of the total memory (52 GB) there wasn't enough left, due to that competition, to allocate the needed 2.5 GB. By lowering the threshold at which temp tables are converted to disk storage, you reduced the stress put on memory.
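If you want to verify this, the server's status counters show how often in-memory temp tables spill to disk (a quick check, not conclusive on its own):

```sql
-- Compare disk-based temp tables to the total number created
SHOW GLOBAL STATUS LIKE 'Created_tmp_disk_tables';
SHOW GLOBAL STATUS LIKE 'Created_tmp_tables';
```

If Created_tmp_disk_tables climbs while the query runs, the conversion-to-disk theory would hold.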

Let me know if you have any further questions. I'll be happy to assist.


Cheers,

Nick
Cloud Platform Community Support


Anka Sulkowska

Feb 23, 2017, 8:51:41 PM
to google-cloud...@googlegroups.com
Nick,

I had this issue on a server that was set up specifically for testing, with no traffic/stress except for test queries. And I kept adding more memory to the instance and still got exactly the same error, with exactly the same memory limit.

I will need to set up a controlled environment and test whether your theory that the table is dumped to disk is right, by checking the number of temporary tables created on disk, but it seems unlikely, since the queries normally run fine. Not to mention that these are not overly complex queries, and the entire issue only happens with count(distinct).

thanks for your help!
Ann




paynen

Feb 24, 2017, 10:43:11 AM
to Google Cloud SQL discuss
Hey Ann,

Do you have a particularly large dataset that the query might be running on? Or, despite its simplicity, does it contain a lot of JOINs, perhaps? You might want to check the output of running "EXPLAIN <query>\G" when logged into the mysql client, to view the query plan.
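As a hypothetical example (the table and column names are illustrative, not your schema):

```sql
-- \G formats the output vertically in the mysql client.
-- "Using temporary" / "Using filesort" in the Extra column
-- indicate internal temp tables and sorting.
EXPLAIN SELECT COUNT(DISTINCT e.user_id)
FROM events e
JOIN sessions s ON s.user_id = e.user_id\G
```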

I'm thinking that the error for requesting (and failing to get) an allocation of 2.5 GB could be coming from a sub-unit of computation within the overall query plan which, up until that point, had consumed more than 49.5 GB (which seems possible to explain only with large datasets).

There may be another more obscure cause for this, and I'll read back over the mechanics of COUNT(distinct) to see how it might be contributing in this case.


Cheers,

Nick
Cloud Platform Community Support


Ann Su

Feb 27, 2017, 8:28:36 AM
to Google Cloud SQL discuss
Hi,

you could say the data set is large (~10 GB), but the particular query that caused trouble uses a temporary table holding just a fraction of the entire table. It does use a JOIN, but I'm 99% sure we had the same error on a query against just a single, relatively small table.

Before the migration to Google Cloud we had no issues with this, even on much larger data sets. Since the application's codebase didn't change and the dataset didn't grow much, the issue seems related to a) MySQL settings, b) the MySQL version, or c) the server setup, because those are what changed as a result of the migration.

Since the issue seems to be resolved I can't spend too much time on this now but I'll try to prepare a test data set so it is easy to reproduce the issue and share it with you. 

best,
Ann

paynen

Feb 27, 2017, 9:13:31 AM
to Google Cloud SQL discuss
Was the issue resolved by changing the query to avoid COUNT(DISTINCT), as you had mentioned earlier, or did it resolve with the passing of time (that is, you can now run the original query)?

If it's the former, determining the cause might help clarify why it's not working or how to make it work. If it's the latter, it might be difficult to determine the cause.

As for the data set, describing it roughly (tables involved, schema, etc.) and supplying the query would be the best way to go about allowing us to attempt a reproduction of the behaviour. Feel free to email me at pay...@google.com to supply that if you get the time.
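For the schema description, the output of something like the following is usually enough (table name illustrative):

```sql
-- Full DDL, including indexes and the storage engine
SHOW CREATE TABLE events\G
-- Approximate row count, engine, and data/index sizes
SHOW TABLE STATUS LIKE 'events'\G
```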

Best of luck! 


Nick
Cloud Platform Community Support
