It is common for managers to want to operate systems using some available, documented set of "best practices." Documenting that sort of thing is essential to ISO 9000, ISO 9001, and other sorts of organizational certifications.
It is worthwhile to preface a discussion of "best practices" by mentioning that each organization that uses Slony-I is unique, and there may be a need for local policies to reflect unique local operating characteristics. It is for that reason that Slony-I does not impose its own policies for such things as failover; those will need to be determined based on the overall shape of your network, of your set of database servers, and of your usage patterns for those servers.
There are, however, a number of things that early adopters of Slony-I have discovered which can at least help to suggest the sorts of policies you might want to consider.
Slony-I is a complex multi-client, multi-server system, with the result that there is an almost innumerable number of places where problems can arise.
As a natural result, maintaining a clean environment is really valuable, as any sort of environmental "messiness" can either cause unexpected problems or mask the real problem.
Numerous users have reported problems resulting from mismatches between Slony-I versions, local libraries, and PostgreSQL libraries. Details count; you need to be clear on what hosts are running what versions of what software.
Principle: Use an unambiguous, stable time zone such as UTC or GMT.
Users have run into problems when their system uses a time zone that PostgreSQL is unable to recognize, such as CUT0 or WST. It is necessary to use a time zone that PostgreSQL recognizes correctly.
It is furthermore preferable to use a time zone where times do not shift around due to Daylight Saving Time.
The "geographically unbiased" choice seems to be TZ=UTC or TZ=GMT.
See also Section 3.3.
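For instance, when starting slon by hand or from a startup script, the time zone can be pinned in the process environment; the cluster name and connection string below are placeholders, not recommendations.

    # Pin the time zone for the slon process to one PostgreSQL always recognizes.
    TZ=UTC
    export TZ
    slon mycluster 'dbname=mydb host=node1 user=slony'

    # The database server can be pinned similarly in postgresql.conf:
    #   timezone = 'UTC'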
Principle: Long running transactions are Evil
The FAQ has an entry on growth of pg_listener which discusses this in a fair bit of detail; the long and short is that long running transactions have numerous ill effects. They are particularly troublesome on an "origin" node, holding onto locks, preventing vacuums from taking effect, and the like.
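One simple way of watching for this on an origin node is to poll pg_stat_activity. This sketch uses the column names of the PostgreSQL releases contemporary with Slony-I 1.1 (procpid, current_query) and requires stats_command_string to be enabled; later PostgreSQL releases rename these columns.

    -- Show backends whose current query has been running for over an hour.
    SELECT procpid, usename, query_start, current_query
      FROM pg_stat_activity
     WHERE current_query <> '<IDLE>'
       AND query_start < now() - interval '1 hour'
     ORDER BY query_start;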
Failover policies should be planned for ahead of time.
This may simply involve thinking through the priority list of which node should fail over to which, as opposed to trying to automate it. But knowing what to do ahead of time cuts down on the number of mistakes made.
At Afilias, some internal [The 3AM Unhappy DBA's Guide to...] guides have been created to provide checklists of what to do when "unhappy" things happen; this sort of material is highly specific to the applications running, so you would need to generate your own such documents.
MOVE SET should be used for planned, preventive maintenance, so that problems do not become serious enough to require failover.
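A minimal slonik sketch of such a planned switchover follows; the cluster name, node numbers, and connection strings are placeholders for your own configuration, and set 1 is assumed to currently originate on node 1.

    # Move the origin of set 1 from node 1 to node 2 for planned maintenance.
    cluster name = mycluster;
    node 1 admin conninfo = 'dbname=mydb host=server1 user=slony';
    node 2 admin conninfo = 'dbname=mydb host=server2 user=slony';

    lock set (id = 1, origin = 1);
    wait for event (origin = 1, confirmed = 2);
    move set (id = 1, old origin = 1, new origin = 2);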
VACUUM policy needs to be carefully defined.
As mentioned above, "long running transactions are Evil." VACUUMs are no exception in this. A VACUUM on a huge table will open a long-running transaction with all the known ill effects.
Running all of the slon daemons on a central server for each network has proven preferable.
Each slon should run on a host on the same local network as the node that it is servicing, as it communicates heavily with its database.
In theory, the "best" speed would come from running the slon on the database server that it is servicing.
In practice, having the slon processes strewn across a dozen servers turns out to be really inconvenient to manage, as making changes to their configuration requires logging onto a whole bunch of servers. In environments where users must go through sudo to switch to application users, this is seriously inconvenient. It turns out to be much easier to group the slon processes on one server per local network, so that one script can start, monitor, terminate, and otherwise maintain all of the nearby nodes.
That also has the implication that configuration data and configuration scripts only need to be maintained in one place, eliminating duplication of configuration efforts.
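A sketch of the kind of wrapper this makes possible when the slons for a local network are grouped on one host; the config file paths and log directory are purely hypothetical names.

    #!/bin/sh
    # Start one slon per nearby node, each driven by its own config file.
    for conf in /etc/slon/node1.conf /etc/slon/node2.conf /etc/slon/node3.conf
    do
        name=`basename $conf .conf`
        slon -f $conf >> /var/log/slony/$name.log 2>&1 &
    done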
The Database Schema Changes section outlines some practices that have been found useful for handling changes to database schemas.
Handling of Primary Keys
As discussed in the section on Replication Sets, it is ideal for each replicated table to have a true primary key constraint; failing that, it is acceptable to use a "candidate primary key."
It is not recommended that a Slony-I-defined key be used as the candidate primary key, as the extra unique index it adds means that updates to the table can now fail on that index; in effect, Slony-I has introduced a new failure mode for your application.
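Where a table has no formal PRIMARY KEY but does have a suitable unique index over NOT NULL columns, that index can be named as the candidate key when the table is added to a set, rather than resorting to TABLE ADD KEY. The set, table, and index names in this fragment (which assumes the usual cluster and admin conninfo preamble) are placeholders.

    # Use an existing unique index as the "candidate primary key".
    set add table (set id = 1, origin = 1, id = 42,
                   fully qualified name = 'public.sales_data',
                   key = 'sales_data_unique_idx',
                   comment = 'sales_data, replicated via candidate key');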
Grouping tables into sets suggests strategies for determining how to group tables and sequences into replication sets.
It should be obvious that actions that can delete a lot of data should be taken with great care; the section on Dropping things from Slony-I Replication discusses the different sorts of "deletion" that Slony-I supports.
Certain Slony-I operations, notably set add table, move set, lock set, and execute script, require acquiring exclusive locks on the tables being replicated.
Depending on the kind of activity on the databases, this may or may not have the effect of requiring a (hopefully brief) database outage.
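Before issuing such a command, it may be worth a quick check that nothing is already holding locks on the table in question; a rough sketch, with a placeholder table name:

    -- Who currently holds (or is waiting for) locks on this table?
    SELECT l.pid, l.mode, l.granted, c.relname
      FROM pg_locks l, pg_class c
     WHERE c.oid = l.relation
       AND c.relname = 'sales_data';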
Slony-I-specific user names.
It has proven useful to define a slony user for use by Slony-I, as distinct from a generic postgres or pgsql user.
If all sorts of automatic "maintenance" activities, such as vacuuming and performing backups, are performed under the "ownership" of a single PostgreSQL user, it turns out to be pretty easy to run into deadlock problems.
For instance, a series of vacuums that unexpectedly runs against a database with a large SUBSCRIBE_SET event under way may hit a deadlock that rolls back several hours' worth of data copying work.
If, instead, different maintenance roles are performed by different users, you may, during vital operations such as SUBSCRIBE_SET, lock out other users at the pg_hba.conf level, only allowing the slony user in, which substantially reduces the risk of problems while the subscription is in progress.
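For instance, during a long SUBSCRIBE_SET the subscriber's pg_hba.conf might temporarily be reduced to something like the following (database name and addresses are examples); reload the postmaster after editing it, and restore the normal rules once the subscription has completed.

    # pg_hba.conf on the subscriber while SUBSCRIBE_SET runs:
    # TYPE   DATABASE   USER    CIDR-ADDRESS       METHOD
    host     mydb       slony   192.168.1.0/24     md5
    host     mydb       all     0.0.0.0/0          reject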
Path configuration
The section on Path Communications discusses the issues surrounding what network connections need to be in place in order for Slony-I to function.
The section on listen paths discusses the issues surrounding the table sl_listen.
As of Slony-I 1.1, its contents are computed automatically based on the communications information available to Slony-I, which should alleviate the problems found in earlier versions, where this had to be configured by hand. Many seemingly inexplicable communications failures, where nodes failed to talk to one another even though they technically could, were the result of incorrect listen path configuration.
Configuring slon
As of version 1.1, slon configuration may be drawn either from the command line or from a configuration file. "Best" practice has yet to emerge between the two options:
Configuration via command line options
This approach has the merit that all the options that are active are visible in the process environment. (And if there are a lot of them, they may be a nuisance to read.)
Unfortunately, if you invoke slon from the command line, you could forget to include log shipping configuration and thereby destroy the sequence of logs for a log shipping node.
Configuration via a configuration file
Unlike when command line options are used, the active options are not visible. They can only be inferred from the name and/or contents of the slon configuration file, and they will not reflect subsequent changes to the configuration file.
By putting the options in a file, you will not forget to include any of them, so this is safer for log shipping.
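A small configuration file for one node might look like the following; every value is a placeholder, and archive_dir is only needed when the node feeds a log shipping archive (check the slon documentation for the exact parameter names supported by your version).

    # /etc/slon/node2.conf -- slon configuration for node 2
    cluster_name = 'mycluster'
    conn_info = 'dbname=mydb host=server2 user=slony'
    # Keeping the log shipping destination here means it cannot be
    # forgotten on the command line.
    archive_dir = '/var/spool/slony/archive'

    # Started as:  slon -f /etc/slon/node2.conf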
Things to do when subscribing nodes
When a new node is running the COPY_SET event for a large replication set (e.g. one which takes several hours to subscribe), it has been found desirable to lock all users other than the slony user out of the new subscriber, because:
Applications will run into partially-copied, half-baked data that is not totally consistent.
It is possible for applications (and maintenance scripts) to submit combinations of queries that will get the system into a deadlock situation, thereby terminating the COPY_SET event, and requiring the subscription to start over again.
It may be worth considering turning the PostgreSQL fsync functionality off during the copying of data, as this will improve performance, and if the database "falls over" during the COPY_SET event, you will be restarting the copy of the whole replication set.
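The corresponding postgresql.conf change on the subscriber is a one-liner; remember to turn fsync back on (and signal or restart the postmaster) once the initial copy has completed.

    # postgresql.conf on the subscriber, only while the initial copy runs:
    fsync = off
    # ...restore to fsync = on after COPY_SET completes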
Managing use of slonik
The notes on Using Slonik describe some of the lessons learned from managing large numbers of slonik scripts.
Handling Very Large Replication Sets
Some users have set up replication on replication sets that are tens to hundreds of gigabytes in size, which puts some added "strain" on the system; in particular, it may take several days for the COPY_SET event to complete. Here are some principles that have been observed for dealing with these sorts of situations.
Drop all indices other than the primary key index while the COPY_SET event is run.
When data is copied into a table that has indices on it, PostgreSQL builds the indices incrementally, on the fly. This is much slower than simply copying the data into the table, and then recreating each index "ex nihilo", as the latter can take substantial advantage of sort memory.
It is hoped that in a future release indices will be dropped and recreated automatically, which would eliminate this step.
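Until then this has to be done by hand, along the following lines; the table, index, and memory figures are purely illustrative, and the primary key (or candidate key) index must be left in place.

    -- On the subscriber, before subscribing the set:
    DROP INDEX sales_data_customer_idx;
    DROP INDEX sales_data_date_idx;

    -- After the initial COPY_SET has completed, rebuild them with
    -- generous sort memory (sort_mem on 7.4, maintenance_work_mem on 8.0):
    SET maintenance_work_mem = 524288;    -- value in kB
    CREATE INDEX sales_data_customer_idx ON sales_data (customer_id);
    CREATE INDEX sales_data_date_idx ON sales_data (order_date);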
If there are large numbers of updates taking place as the large set is being copied, this can lead to the subscriber being behind by some enormous number of SYNC events.
If a SYNC is generated about once per second, that leads to the subscriber "falling behind" by around 90,000 SYNCs per day, possibly for several days.
There will correspondingly be an enormous growth of sl_log_1 and sl_seqlog. Unfortunately, once the COPY_SET completes, users have found that queries against these tables wind up reverting to Seq Scans, so that even though a particular SYNC processing event only processes a small number of those 90,000 SYNC events, it still reads through the entire table. In such a case, you may never see replication catch up.
Several things can be done that will help, involving careful selection of slon parameters; a sample configuration follows the list:
On the subscriber's slon, increase the number of SYNC events processed together, with the slon_conf_sync_group_maxsize parameter to some value that allows it to process a significant portion of the outstanding SYNC events.
On the subscriber's slon, set the desired_sync_time to 0, as the adaptive SYNC grouping system will start with small groupings that will, under these circumstances, perform poorly.
Increase the slon_conf_sync_interval on the origin's slon so that SYNC events are generated less frequently. If a SYNC is only generated once per minute instead of once per second, that will cut down the number of events by a factor of 60.
It is likely to be worthwhile to use slon_conf_vac_frequency to deactivate slon-initiated vacuuming in favor of running your own vacuum scripts, as there will be a buildup of unpurgeable data while the data is copied and the subscriber starts to catch up.
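Putting those suggestions together, the relevant fragments of the two slon configuration files might look like this; the figures are starting points to experiment with, not recommendations.

    # Fragment of the subscriber's slon configuration file:
    sync_group_maxsize = 500     # group many SYNCs together per transaction
    desired_sync_time = 0        # disable adaptive SYNC grouping
    vac_frequency = 0            # leave vacuuming to external scripts

    # Fragment of the origin's slon configuration file:
    sync_interval = 60000        # generate a SYNC roughly once a minute (ms)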