![]() | We do the math so you don't have to! | ||||||||||||
| Taygeta Scientific Inc. |
This note describes ways to implement High Availibility (HA) for PostgreSQL. HA gives the ability to transparently fail-over database connections to an alternative system in the event of some sort of failure of the primary system.
The HA-PostgreSQL system works the following way: The HA cluster consists of two machines running heartbeat that use both a serial port and a second network interface to provide the failover heartbeat. The primary system keeps the secondary system database files synchronized by periodic calls to a script that runs rsync.
The heartbeat software provides HA in the form of a hot standby 2-node cluster. In the event of a failure of the primary database system, the heartbeat software causes the secondary to take over. Any database changes that may have occurred between the last synchronization and the failure will be lost, so this synchronization must be done relatively frequently. The synchronization is done with the same network interface that is also doing the heartbeat in order to reduce the volume of traffic on the primary network.
cd /home/pgsql/base; rm -rf *
There are at least three synchronization methods that can be used:
We can now test the primary-to-secondary transfer with one of the following on the appropriate system:
| Method | Command | run from |
|---|---|---|
| pg_dump |
pg_dump -b -h 10.0.0.1 (dbname) > /tmp/db.tar; pg_restore -c -d (dbname) /tmp/db.tar; rm /tmp/db.tar |
secondary |
| pg_dumpall |
pg_dumpall -h 10.0.0.1 > /tmp/db.dump; pg_restore -f /tmp/db.dump template1; rm /tmp/db.dump |
secondary |
| rsync |
/usr/bin/rsync -auv --delete /home/pgsql/ 10.0.0.2::postgresql/ |
primary |
| rserv |
You should now see a copy of all the postgres files on the secondary system in /home/pgsql
If the primary-to-secondary transfer is working, we next need to test the
secondary-to-primary transfer.
First, temporarily move the database directory out of the way (we will keep
it, just in case) on the primary.
mv /home/pgsql/base /home/pgsql/base.TMP
mkdir /home/pgsql/base
chown postgres:postgres /home/pgsql/base
chmod 700 /home/pgsql/base
Then try the transfer from the secondary by going to the opposite system from the previous test and typing:
| Method | Command | run from |
|---|---|---|
| pg_dump |
pg_dump -b -h 10.0.0.2 (dbname) > /tmp/db.tar; pg_restore -c -d (dbname) /tmp/db.tar; rm /tmp/db.tar |
primary |
| pg_dumpall |
pg_dumpall -h 10.0.0.2 > /tmp/db.dump; pg_restore -f /tmp/db.dump template1; rm /tmp/db.dump |
primary |
| rsync |
/usr/bin/rsync -auq --delete /home/pgsql/ 10.0.0.1::postgresql/ |
secondary |
| rserv |
rm -rf /home/pgsql/base.TMP
The rsync version:
#!/bin/sh
# db_sync sync the db with failover copy
# rsync version
# the hb interface of the other side
rhost="hb-pollux"
lockfile="/var/lock/subsys/db_sync"
master=`/sbin/ifconfig | grep "eth0:0" | wc -l`
if [ $master -eq "0" ]; then
# this side not the master
exit 0
fi
# this side is the master
if [ -f $lockfile ]; then
# have not finished since the last time this script was invoked
logger -p daemon.notice -t db_sync "locked, sync already active"
exit 0
fi
touch $lockfile
logger -p daemon.notice -t db_sync "syncing database with $rhost"
sync
/usr/bin/rsync -auq --delete-after /home/pgsql/ $rhost::postgresql/
rm -f $lockfile
exit 0
The pg_dumpall version:
#!/bin/shThis script assumes that 'eth0:0' is the HA interface of your cluster. A copy of this script is on BOTH systems (with 'rhost' set appropriately). It should be called periodically by 'cron', e.g.
# db_sync sync the db with failover copy
# pg_dumpall version
# the pg_dump version is a simple modification of this version
# the hb interface of the other side
rhost="hb-pollux"
lockfile="/var/lock/subsys/db_sync"
master=`/sbin/ifconfig | grep "eth0:0" | wc -l`
if [ $master -eq "1" ]; then
# this side not the slave
exit 0
fi
# this side is the slave
if [ -f $lockfile ]; then
# have not finished since the last time this script was invoked
logger -p daemon.notice -t db_sync "locked, sync already active"
exit 0
fi
touch $lockfile
logger -p daemon.notice -t db_sync "syncing database with $rhost"
pg_dumpall -U postgres -h $rhost > /tmp/db.dump$$
pg_restore -U postgres -f /tmp/db.dump$$ template1
rm -f $lockfile /tmp/db.dump$$
exit 0
0,5,10,15,20,25,30,35,40,45,50,55 * * * * /usr/local/sbin/db_syncin root's crontab file, would cause it to be invoked every 5 minutes. The frequency of this update involves a trade-off. The more often the update is done then the finer the granularity of the mirror copy of the database is, and so there will be less of a data loss when switching over to the secondary. If the synchronization is done too often, then there may be a large volume of network traffic and its possible that the synchronization has not completed when it is time do start another. The script above handles the second case (if you see 'locked, sync already active' messages in the system logs, then you know that there is too much data to transfer in the alloted time).
rsync -auq --delete-after hb-pollux::postgresql/ /home/pgsql/and then start postgresql like normal.
This is done so that the primary has a fresh copy of the database in the event that it was down long enough that the secondary copy has substantially changed. For the secondary, it provides the chance to get anything that is new before it takes over. Doing this makes the periodic updates less critical, since that version of the database is only going to be used if the opposite system is unreachable.
dbserver.mydomain.com 192.168.1.110 postgresqland restart heartbeat.
Now postgresql connections to dbserver.mydomain.com will go to castor if its up and to pollux if its not.
This setup provides basic High Availability for PostgreSQL. It cannot handle problems that will happen to connections that are active DURING the failover; these will just fail. One should be particularly aware of this kind of problem when doing transactions with a COMMIT (i.e. the transaction is set up, a failover happens, and then the commit is executed).
|
|