Possibility of Graceful Switchover?

I have been using PITR master-slave solution for long time for “failover” to slave server when there is a problem with the master server. It has proven a valuable solution for most of clients needs. The problem with “failover” solution is that we have to rebuild old-master as slave from completely new base backup from newly promoted master server. Oracle provides nice feature called “Graceful Switchover“. In this method, you can switch duty between master and slave servers and don’t have to rebuild old-master as slave from newly promoted master server. It could be very useful during server or data center move as we can shutdown old database and can have total control over xlogs to be processed before we start traffic on newly promoted master database.

I couldn’t find any graceful switchover document or blog for the PostgreSQL . Today, I spent some time on testing same theory as Oracle uses for the switchover. Oracle controls switchover using controlfile, Why can’t I use pg_control to do the same in totally controlled environment?
For PostgreSQL graceful switchover, I tested following scenario on very small test database server with pre configured setup of master and PITR slave database:
  • Step-1 : Shutdown current master database
  • Step-2 : Apply all wal logs including the one created during shutdown process to slave database server
  • Step-3: Promote slave as master in read/write mode using triggered failover file
  • Step-4: Before pointing production traffic to newly promoted master server, take backup of $PGDATA/global/pg_control and $PGDATA/global/pgstat.stat files. These two files differ between master and slave after failing over slave database into production mode.
  • Step-5: Setup old – master server for slave for newly promoted master sever and replace $PGDATA/global/pg_control and $PGDATA/global/pgstat.stat with the files backed up in Step-4
  • Step-6: Check log files on old-master to make sure that it’s recovering copied wal logs from newly promoted master server(old-slave server)
Yay! I could successfully switched over slave to master and vice versa !
At this stage, old-master(new slave) started playing wal logs coming from new-master(old-slave) server. After that, I failed over new slave server to make sure we can actually bring the database in production mode. It worked like a charm!! I ran database wide vacuum after failover and it ran fine without any error.
I have done limited testing as I have tested this with very tiny setup of “pagila” database on test server and I don’t recommend you to use on the “production database” servers as of now.
Please test this in your test or dev environment and let me know your results and concerns!!

4 thoughts on “Possibility of Graceful Switchover?”

Comments are closed.