Bric::DBA - Bricolage Database Administration Guide.
$Revision: 1.1.1.1.2.1 $
$Date: 2001/10/09 21:51:05 $
This guide is intended for the database administrator in charge of maintaining the Bricolage PostgreSQL database.
By default, PostgreSQL has pretty lax authentication settings, at least locally. Client authentication is controlled by the file pg_hba.conf in the $PGDATA directory, e.g., /usr/local/pgsql/data/pg_hba.conf. The defaul settings allow completely trusted connections to the server by any local user using either Unix domain sockets or TCP/IP connections. These settings are as follows:
local all trust
host all 127.0.0.1 255.255.255.255 trust
These settings allow anyone connecting via local Unix domain sockets to connect to all databases in a completely trusted manor. Theye won't even be prompted for a password! Similarly, hosts connecting from the local IP address, 127.0.0.1 with Mask 255.255.255.255, are completely trusted to connect to all databases.
While these settings are fine for limited local use (provided you trust all users on the local system!), they are not good for a production environment. We recommend that clients be required to provide a password -- and preferably that it be encrypted. If the Bricolage web server runs on the local box and no other box will need to connect, we recommend the following settings:
local all crypt
# host all 127.0.0.1 255.255.255.255 crypt
These are similar to the defaults, except that they require a password, but the password is sent over the wire encrypted using a simple challenge-response protocol. We have commented out the host entry because Bric is running locally and can therefore rely solely on Unix domain sockets. (In this case, you may also wish to remove the -i switch from the call PostgreSQL startup script.) Enable it if you find a need to connect locally via TCP/IP.
If you're running PostgreSQL on its own box, and Bricolage (or any subset of the Bricolage application, such as the distribution server) runs on another box, you will need to enable secure connections to that box. To limit the connections to come from that single box, use this configuration:
local all crypt
host all 127.0.0.1 255.255.255.255 crypt
host bric_dbname 192.168.1.2 255.255.255.255 crypt
Where the IP address 192.168.1.2 is the IP address of the box running Bricolage, and 255.255.255.255 is its mask. If you feel comfortable allowing more broad access to the server -- say you have a number of database applications running on multiple servers in the same subnet to connect -- try the following configuration:
local all crypt
host all 127.0.0.1 255.255.255.255 crypt
host bric_dbname 192.168.1.0 255.255.255.255 crypt
Here the 192.168.1.0 IP address represents the 192.168.1 subnet on which the mulitiple clients live.
Many other combinations of authentication settings are of course possible via the flexible pg_hba.con configuration syntax, but the above are our recommended settings for ensuring the security and integrity of your data. Refer to the PostgreSQL documentation at http://www.postgresql.org/ for more details, including other levels of encryption.
PostgreSQL supports backups and restores via its pg_dump, pg_dumpall, and psql utilties. File system backups are also a good idea. We recommend that you use the pg_dumpall utility on a regular basis to backup the Bricolage database:
pg_dumpall > outfile
This program outputs a series of SQL statements that can then be reloaded into the database using the psql utility:
psql < infile
We recommend the use of pg_dumpall in order to preserve the the entire database system. Read the pg_dumpall man page for more details. However, if you use your PostgreSQL server for other databases, you may wish to place them on separate backup schedules. In that case, use pg_dump to backup each of the databases, including the Bricolage database:
pg_dump brid_dbname > outfile
The restoration is the same as for the pg_dumpall file, except that you must be sure to create the database and users, first.
psql bric_dbname < infile
Read the pg_dump man page for more details. Note that neither pg_dump nor pg_dumpall prevent database access by other processes (e.g., Bricolage), but they will affect performance. It's a good idea to dump the file to a different partition or disk or even server (since pg_dump and pg_dumpall can be used on any server with access to the PostgreSQL server). Changes made to the database during the process of the backup will not be backed up.
In either case, if your Bricolage database is getting big, you may wish to compress the backupfile, or break it into smaller chunks. This can be done using *nix system utilities. For example, to compress a backupfile, use gzip:
pg_dumpall | gzip > filename.gz
gunzip -c filename.gz | psql
You can also use split to break the backup file into smaller chunks. This example breaks the file up into 1 MB chunks:
pg_dumpall | split -b 1m - filename
cat filename.* | psql
New to PostgreSQL is the custom dump format. You can use it to compress dumpfiles on the fly. Consult the pg_dump and pg_dumpall man pages for more information.
File system backups are another option, and may in fact be useful as a backup backup methodology. Use whatever methology for file system backups that you prefer. Here is an example using tar:
tar -cf backup.tar /usr/local/pgsql/data
There are two restrictions to file system backups, however. First, the PostgreSQL server must be shut down during the backup or during restoration. Second, you can't do selective backups of databases or tables, only the whole database cluster.
For more detail on PostgreSQL backup and restore, please consult the PostgreSQL documentation at http://www.postgresql.org/.
Write Ahead Logging (WAL) is a standard approach to transaction logging. It is automatically enabled in the PostgreSQL server. WAL ensures that the log is written before database records are altered, thereby minimizing the number of disk writes; and in the event of a crash, the database can be recovered using using the log.
Please consult the PostgreSQL documentation for more information on configuring WAL and recovering databasees with WAL.
VACUUM is a PostgreSQL SQL command that Cleans and analyzes a Postgres database. It's purpose is to reclaim storage and to collect information for the optimizer. VACUUM opens every table in the database, cleans out records from rolled back transactions, and updates statistics in the system catalogs. The statistics maintained include the number of tuples and number of pages stored in all tables. VACUUM ANALYZE collects statistics representing the dispersion of the data in each column. This information is valuable when several query execution paths are possible. Running VACUUM periodically will increase the speed of the database in processing user queries.
We recommend that the Bricolage database be VACUUM-ed nightly, in order to remove expired rows. VACUUM ANALYZE needs to be run less often (indeed, the PostgreSQL documentation only suggests running it after copying a large table into Postgres or after deleting a large number of records).
There are two ways to <VACUUM> a database. The first is through an interactive shell such as psql. We suggest, however, that you instead schedule a nightly cron job to do the task, and use the vacuumdb utility. Here's an example:
set PGPASSWORD=password; vacuumdb -d bric_dbname -U username
If the vacuum is successful, vacuumdb will output "VACUUM". In the event of a failure, it will output vacuumdb: Vacuum failed.
Consult the PostgreSQL documetation for more information.
David Wheeler <david@wheeler.net>
perl(1), Bric (2).