Home PostgreSQL PostgreSQL backup and Restore

PostgreSQL backup and Restore

0
94

As with every other database, PostgreSQL databases should also be backed up regularly.  There are three fundamentally different approaches to backing up PostgreSQL data. Each has its own strengths and weaknesses.

SQL dump backup and restore

pg_dump

Each Database can be backed up using 

  1. pg_dump dbname > outfile
/usr/local/bin/pg_dump dvdrental > "/usr/local/var/backup/dvdrental.txt"

Restore the database using following commands. In this example I am restoring in another database dvdrental_test.

2. psql dbname < infile

psql dvdrental_test < "/usr/local/var/backup/dvdrental.txt"

3. pg_dump -h host1 dbname | psql -h host2 dbname

In this way you can take backup of a database from “host1” and restore the same in a remote host “host2”.

Cons: Even though process is very simple, it can take backup of only a single schema.Hence, next process is to take backup of all databases using pg_dumpall.

pg_dumpall

  1. Backup of all databases using pg_dumpall > outfile . 
/usr/local/bin/pg_dumpall > "/usr/local/var/backup/all_databases.sql"

2. Restore the full dump using psql. You can restore the same in another other remote host or remote machine as well by transferring the backup there.

psql -f "/usr/local/var/backup/all_databases.sql" postgres

Handling Large Databases

Some operating systems have maximum file size limits that cause problems when creating large pg_dump output files. Fortunately, pg_dump can write to the standard output, so you can use standard Unix tools to work around this potential problem. There are several possible methods:

  1. You can zip the file and then use | to take that as input for restoring.
gunzip -c filename.gz | psql dbname
or
cat filename.gz | gunzip | psql dbname

2.split. The split command allows you to split the output into smaller files that are acceptable in size to the underlying file system. For example, to make chunks of 1024 MB.

pg_dump dbname | split -b 1024m - filename

Reload with: (Here * is important as there will be multiple file starting with name filename)

cat filename* | psql dbname

pg_restore

It is used for restoring or importing the database(single or multiple or all databases). Here you have multiple option to chose and perform a restore into your postgres instance.

  1. You can clean all objects before re-storing the database or objects using -c or –clean option.
  2. You can create the target database(-C or –create)  as well from pg_restore itself which was not possible in case of psql.
  3. You can restore objects of only few schemas using –schema=<schema_name> . Use this multiple times for multiple schemas.
  4. You can exclude few schemas using –exclude-schema option while restoring.
  5. All the options I have listed down. Check for any other requirement from those options.
pg_restore -d dbname -f filename
  -d, --dbname=NAME        connect to database name
  -f, --file=FILENAME      output file name (- for stdout)
  -F, --format=c|d|t       backup file format (should be automatic)
  -l, --list               print summarized TOC of the archive
  -v, --verbose            verbose mode
  -V, --version            output version information, then exit
  -?, --help               show this help, then exit

Options controlling the restore:
  -a, --data-only              restore only the data, no schema
  -c, --clean                  clean (drop) database objects before recreating
  -C, --create                 create the target database
  -e, --exit-on-error          exit on error, default is to continue
  -I, --index=NAME             restore named index
  -j, --jobs=NUM               use this many parallel jobs to restore
  -L, --use-list=FILENAME      use table of contents from this file for
                               selecting/ordering output
  -n, --schema=NAME            restore only objects in this schema
  -N, --exclude-schema=NAME    do not restore objects in this schema
  -O, --no-owner               skip restoration of object ownership
  -P, --function=NAME(args)    restore named function
  -s, --schema-only            restore only the schema, no data
  -S, --superuser=NAME         superuser user name to use for disabling triggers
  -t, --table=NAME             restore named relation (table, view, etc.)
  -T, --trigger=NAME           restore named trigger
  -x, --no-privileges          skip restoration of access privileges (grant/revoke)
  -1, --single-transaction     restore as a single transaction
  --disable-triggers           disable triggers during data-only restore
  --enable-row-security        enable row security
  --if-exists                  use IF EXISTS when dropping objects
  --no-comments                do not restore comments
  --no-data-for-failed-tables  do not restore data of tables that could not be
                               created
  --no-publications            do not restore publications
  --no-security-labels         do not restore security labels
  --no-subscriptions           do not restore subscriptions
  --no-tablespaces             do not restore tablespace assignments
  --section=SECTION            restore named section (pre-data, data, or post-data)
  --strict-names               require table and/or schema include patterns to
                               match at least one entity each
  --use-set-session-authorization
                               use SET SESSION AUTHORIZATION commands instead of
                               ALTER OWNER commands to set ownership
Connection options:
  -h, --host=HOSTNAME      database server host or socket directory
  -p, --port=PORT          database server port number
  -U, --username=NAME      connect as specified database user
  -w, --no-password        never prompt for password
  -W, --password           force password prompt (should happen automatically)
  --role=ROLENAME          do SET ROLE before restore

pg_basebackup and restore

backup

All the above backups that we have taken using pg_dump or pg_dumpall can’t be used for streaming replication process or point-in-time recovery. Hence, we have another method of taking full filesystem level using pg_basebackup.

pg_basebackup is used to take base backups of a running PostgreSQL database cluster. These are taken without affecting other clients to the database, and can be used both for point-in-time recovery and as the starting point for a log shipping or streaming replication standby servers.

pg_basebackup makes a binary copy of the database cluster files, while making sure the system is put in and out of backup mode automatically. Backups are always taken of the entire database cluster; it is not possible to back up individual databases or database objects. 

pg_basebackup --host=localhost -p 5432 --wal-method=stream -r 1024M -R --format=t -z -P --tablespace-mapping=/opt/PosgreSQL/13/user_tablespace=/opt/PosgreSQL/13/user_tablespace_new -D /opt/PosgreSQL/13/pgbasebackup/bck5.tar

All available options are as follows.

Options controlling the output: 
-D, --pgdata=DIRECTORY receive base backup into directory

  -F, --format=p|t       output format (plain (default), tar)
  -r, --max-rate=RATE    maximum transfer rate to transfer data directory
                         (in kB/s, or use suffix "k" or "M")
  -R, --write-recovery-conf
                         write configuration for replication
  -T, --tablespace-mapping=OLDDIR=NEWDIR
                         relocate tablespace in OLDDIR to NEWDIR
      --waldir=WALDIR    location for the write-ahead log directory
  -X, --wal-method=none|fetch|stream
                         include required WAL files with specified method
  -z, --gzip             compress tar output
  -Z, --compress=0-9     compress tar output with given compression level

General options:
  -c, --checkpoint=fast|spread
                         set fast or spread checkpointing
  -C, --create-slot      create replication slot
  -l, --label=LABEL      set backup label
  -n, --no-clean         do not clean up after errors
  -N, --no-sync          do not wait for changes to be written safely to disk
  -P, --progress         show progress information
  -S, --slot=SLOTNAME    replication slot to use
  -v, --verbose          output verbose messages
  -V, --version          output version information, then exit
      --manifest-checksums=SHA{224,256,384,512}|CRC32C|NONE
                         use algorithm for manifest checksums
      --manifest-force-encode
                         hex encode all file names in manifest
      --no-estimate-size do not estimate backup size in server side
      --no-manifest      suppress generation of backup manifest
      --no-slot          prevent creation of temporary replication slot
      --no-verify-checksums
                         do not verify checksums
  -?, --help             show this help, then exit

Connection options:
  -d, --dbname=CONNSTR   connection string
  -h, --host=HOSTNAME    database server host or socket directory
  -p, --port=PORT        database server port number
  -s, --status-interval=INTERVAL
                         time between status packets sent to server (in seconds)
  -U, --username=NAME    connect as specified database user
  -w, --no-password      never prompt for password
  -W, --password         force password prompt (should happen automatically)

Restore

  • 1. Copy all files from the backup folder to the desired folder using copy command in same server or scp command for remote server.
  • 2. Make sure to create tablespace directory structure as you have created while taking pg_basebackup for a remote server before performing step1.
  • 3. Change the permission of data folder with “0700” using “chmod 0700 data”.
  • 5. Add privileges in pg_hba.conf file for allowing access to other required servers.
  • 4. Change the postgresql.conf file listensing_address , port, archive_location and any other parameters you want in the new server. You have to change primary_conn_info for standby server for version 13 else use step 5 for older versions.
  • 5. Make sure to remove recovery.done file and create recovery.conf file(restore_command, primary_conn_info, recovery_target_time to latest) for older versions like 9,10,11. For 12 and 13 recovery.conf is not needed.
  • 5. Start the server.

Cold Backup and restore

Backup

1. Shutdown the Postgres Instance.

2. Make a tar file of the data folder using following command.

tar -cf backup.tar /usr/local/pgsql/data

Restore

  1. Copy or scp  tar file to required location. 
  2. Uncompress using following command.
tar -xf backup.tar

3. Change the ph_hba.conf file for access, postgresql.conf file for the required parameters like we mentioned in pg_basebaup and restart the server.