Can I drop the failover CloudSQL instance and re-create it?


Roshan Dawrani

Apr 30, 2018, 2:02:25 AM
to Google Cloud SQL discuss
Hi,

I have a Cloud SQL instance set up with a failover replica. Query performance drops every 4-5 minutes, very regularly. I suspect this could be due to some internal operation that keeps the failover instance in sync.

To verify or refute that, I was thinking of dropping the failover instance temporarily, checking whether the dip in query performance goes away, and then bringing it back.

Is this experiment a bad idea, other than the risk that there won't be a failover instance while it runs?

All I need is to get rid of the failover instance temporarily. If, instead of deleting it, I could detach it and re-attach it later, that might be even better, since it may mean less work to bring the failover up to date again.

Any views / inputs will be highly appreciated.

Regards,
Roshan

Fady (Google Cloud Platform)

Apr 30, 2018, 4:48:00 PM
to Google Cloud SQL discuss

Hello Roshan,


According to this document: "To minimize performance impact on the master, while ensuring that changes are never lost, the replica logs the update events, and then performs the updates in order." Therefore, I highly doubt that a specific task runs every 4-5 minutes to keep both instances synced; rather, updates are applied within seconds, in the order of the operations. You may verify that by checking the replication lag metric (seconds behind master).
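A minimal sketch of that check, assuming the replication lag metric has been exported as (timestamp, seconds behind master) pairs. The function names, the threshold, and the sample values are made up for illustration and are not part of any Cloud SQL API. It flags lag spikes and tests whether they recur roughly every 4-5 minutes:

```python
from datetime import datetime, timedelta


def lag_spikes(samples, threshold_s=1.0):
    """Return timestamps of samples whose replication lag exceeds threshold_s."""
    return [ts for ts, lag in samples if lag > threshold_s]


def is_roughly_periodic(times, low_s=240, high_s=300):
    """True if there are at least 3 events and every gap is within [low_s, high_s] seconds."""
    if len(times) < 3:
        return False
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    return all(low_s <= gap <= high_s for gap in gaps)


# Made-up sample data: one lag reading per minute for 15 minutes.
start = datetime(2018, 4, 30, 12, 0, 0)
flat = [(start + timedelta(minutes=i), 0.0) for i in range(15)]
# Same timestamps, but with a hypothetical 5-second lag every fifth minute.
spiky = [(ts, 5.0 if i % 5 == 0 else 0.0) for i, (ts, _) in enumerate(flat)]

print("flat lag periodic spikes:", is_roughly_periodic(lag_spikes(flat)))
print("spiky lag periodic spikes:", is_roughly_periodic(lag_spikes(spiky)))
```

If the lag stays flat at zero, as in the first series, replication is unlikely to explain a dip that recurs every few minutes.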


As for stopping/pausing the failover replica, that is not currently possible. As you mentioned, you would have to delete it, which would leave your master without a high-availability configuration. If you choose to recreate the replica, you may check this guide for quick configurations. I hope this helps.


Roshan Dawrani

May 2, 2018, 3:28:50 AM
to Google Cloud SQL discuss
Hi Fady,

Thanks a lot for replying.

I checked, and the replica lag for my instance is 0 (zero). I suppose that means the failover is being kept in sync with the main instance almost instantaneously?

So, you are saying that keeping the failover instance in sync couldn't be the reason for the drop I am (almost religiously) seeing every 4-5 minutes in query performance (both reads and writes)?

1) Do you have any ideas on how I can dig further to figure out what might be happening on the db instance at such regular intervals? Could some MySQL-level compaction, etc. be triggering at these 4-5 minute intervals? How do I start digging into such an issue?

2) Since pausing the failover is not possible, are there any major downsides to deleting and re-creating the replica (apart from the fact that the instance will not have a failover backup for the duration of the experiment)?

     * When the failover is re-introduced, how will it be reconstructed? Will it be gradual, or are we looking at a db-performance risk because of the work needed to bring the failover up again? Our db is not on the small side, so I am a little worried about the effort needed to get a new failover up to date.

Cheers,
Roshan

Roshan Dawrani

May 2, 2018, 4:15:56 AM
to Google Cloud SQL discuss
Hi,

If I look at the logs for my database in the "Logs Viewer", messages like the following appear roughly 4 minutes apart each time:

InnoDB: page_cleaner: 1000ms intended loop took 13945ms. The settings might not be optimal.

Could the two things be related (flushing the buffers to disk and the drop in query performance at the same time)?

Can I access the Cloud SQL logs in real time on the console somehow? I'm not sure whether they appear in the Stackdriver Logs Viewer immediately or after some lag. It would be interesting to see them on the console to check whether the timing of the two things coincides.
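A rough sketch of how that timing comparison could be done offline, assuming the logs are exported as plain lines of the form "<ISO timestamp> <message>" and the times of the query-performance dips are known. The export format, the function names, the 30-second window, and all sample values except the quoted log message are assumptions for illustration:

```python
import re
from datetime import datetime, timedelta

# Assumed line format: "<ISO-8601 timestamp> <message>".
PAGE_CLEANER = re.compile(
    r"^(?P<ts>\S+)\s+.*page_cleaner: (?P<intended>\d+)ms intended loop "
    r"took (?P<actual>\d+)ms"
)


def parse_page_cleaner_events(lines):
    """Return (timestamp, actual_loop_ms) for each page_cleaner warning line."""
    events = []
    for line in lines:
        match = PAGE_CLEANER.match(line)
        if match:
            ts = datetime.fromisoformat(match.group("ts"))
            events.append((ts, int(match.group("actual"))))
    return events


def gaps_in_seconds(events):
    """Seconds between consecutive page_cleaner warnings."""
    times = [ts for ts, _ in events]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


def dips_near_events(dip_times, events, window_s=30):
    """Return the dips that fall within window_s seconds of any warning."""
    window = timedelta(seconds=window_s)
    return [d for d in dip_times if any(abs(d - ts) <= window for ts, _ in events)]


# Made-up sample export; only the message text of the first line is from the real log.
sample_log = [
    "2018-05-02T04:00:13 InnoDB: page_cleaner: 1000ms intended loop took 13945ms. "
    "The settings might not be optimal.",
    "2018-05-02T04:01:00 some unrelated message",
    "2018-05-02T04:04:15 InnoDB: page_cleaner: 1000ms intended loop took 12010ms. "
    "The settings might not be optimal.",
]
# Hypothetical times at which the application saw slow queries.
sample_dips = [datetime(2018, 5, 2, 4, 0, 20), datetime(2018, 5, 2, 4, 2, 30)]

events = parse_page_cleaner_events(sample_log)
print("gaps between warnings (s):", gaps_in_seconds(events))
print("dips near a warning:", dips_near_events(sample_dips, events))
```

If most dips land inside the window around a warning, the two are probably related. If the Logs Viewer shows entries with a delay, comparing the logged timestamps this way avoids depending on when they appear.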

Cheers,
Roshan

Fady (Google Cloud Platform)

May 2, 2018, 5:34:21 PM
to Google Cloud SQL discuss

Hello Roshan,


The InnoDB page cleaner log message does not seem to be impacting performance. Per the explanation at this Stack Overflow link, it is due to a high rate of changes to the database, and InnoDB should gradually catch up.


That said, I would like to know how you are measuring the performance of your instance. Is it a Google Cloud metric? You may also privately send me redacted screenshots of it, along with your project ID, for a quick inspection (if you like).


As for the replication process, I will check with the Cloud SQL team for best practices, but meanwhile you may check these documents [1] [2].   


Roshan Dawrani

May 2, 2018, 8:46:58 PM
to google-cloud...@googlegroups.com
Hello Fady,

Regarding sending you some information privately: how should I send it? Could you please share an email address where I can send it?

Thanks,
Roshan



Roshan Dawrani

May 2, 2018, 8:48:44 PM
to Google Cloud SQL discuss
On Thursday, 3 May 2018 03:04:21 UTC+5:30, Fady (Google Cloud Platform) wrote:


That said, I would like to know how you are measuring the performance of your instance. Is it a Google Cloud metric? You may also privately send me a redacted screenshots of it, along with your project ID for a quick inspection (if you like).


Fady (Google Cloud Platform)

May 3, 2018, 9:44:35 AM
to Google Cloud SQL discuss

Hello Roshan,


You can send me a private message by clicking the drop-down next to the reply button (top-right corner of this message), and then clicking "Reply privately to author". A better explanation is available in this support article. In that case, I will be the only person checking the message, as I would receive it in my email inbox. Hence, no one else would be able to help if I am out of office.


Alternatively, you may create a report in the issue tracker under this private component (Public Trackers > Cloud Platform > GCP Private Issues) (click this link), referencing this thread, and we will be glad to help. This is the preferred method, as it is visible to other colleagues, and if I am out of office, another colleague can help.

