
Commit 4b8237e: Add backup notes for other versions

1 parent 76e8d25 commit 4b8237e

File tree

5 files changed: +157 -101 lines changed


product_docs/docs/pgd/5.6/backup.mdx

Lines changed: 1 addition & 2 deletions

@@ -278,5 +278,4 @@ you can add a `WHERE slot_name LIKE 'bdr%'` clause, but this is rarely
 useful.
 
 !!! Warning
-Never use these commands to drop replication slots on a live PGD node
-
+Never use these commands to drop replication slots on a live PGD node

product_docs/docs/pgd/5.7/backup.mdx

Lines changed: 1 addition & 2 deletions

@@ -278,5 +278,4 @@ you can add a `WHERE slot_name LIKE 'bdr%'` clause, but this is rarely
 useful.
 
 !!! Warning
-Never use these commands to drop replication slots on a live PGD node
-
+Never use these commands to drop replication slots on a live PGD node

product_docs/docs/pgd/5.8/backup.mdx

Lines changed: 1 addition & 2 deletions

@@ -278,5 +278,4 @@ you can add a `WHERE slot_name LIKE 'bdr%'` clause, but this is rarely
 useful.
 
 !!! Warning
-Never use these commands to drop replication slots on a live PGD node
-
+Never use these commands to drop replication slots on a live PGD node

product_docs/docs/pgd/5.9/backup.mdx

Lines changed: 1 addition & 2 deletions

@@ -278,5 +278,4 @@ you can add a `WHERE slot_name LIKE 'bdr%'` clause, but this is rarely
 useful.
 
 !!! Warning
-Never use these commands to drop replication slots on a live PGD node
-
+Never use these commands to drop replication slots on a live PGD node
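These four versions add the same warning about live nodes. On a restored copy that's fenced off from the rest of the cluster, a quick way to see which slots the `LIKE 'bdr%'` filter matches before dropping anything is a query along these lines (a minimal sketch; the database name `pgddb` is a placeholder):

```console
# List only the PGD-owned replication slots on the fenced-off, restored copy.
psql -d pgddb -c "SELECT slot_name, slot_type, active FROM pg_replication_slots WHERE slot_name LIKE 'bdr%';"
```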

product_docs/docs/pgd/6.1/reference/backup-restore.mdx

Lines changed: 153 additions & 93 deletions
@@ -8,7 +8,6 @@ redirects:
 - /pgd/latest/backup/ #generated for DOCS-1247-PGD-6.0-Docs
 ---
 
-
 PGD is designed to be a distributed, highly available system. If
 one or more nodes of a cluster are lost, the best way to replace them
 is to clone new nodes directly from the remaining nodes.
@@ -21,12 +20,73 @@ recovery (DR), such as in the following situations:
 as a result of data corruption, application error, or
 security breach
 
-## Backup
-
-### pg_dump
+## Logical backup and restore
 
 You can use pg_dump, sometimes referred to as *logical backup*,
-normally with PGD.
+normally with PGD. But to reduce the risk of global lock
+timeouts, we recommend dumping the pre-data, data, and post-data
+sections separately. For example:
+
+```console
+pg_dump -h $PG_HOST -p $PG_PORT -U $PG_USER -d $PGD_DB -v --exclude-schema='"bdr"' --exclude-extension='"bdr"' --section=pre-data -Fc -f pgd-pre-data.dump
+pg_dump -h $PG_HOST -p $PG_PORT -U $PG_USER -d $PGD_DB -v --exclude-schema='"bdr"' --exclude-extension='"bdr"' --section=data -Fc -f pgd-data.dump
+pg_dump -h $PG_HOST -p $PG_PORT -U $PG_USER -d $PGD_DB -v --exclude-schema='"bdr"' --exclude-extension='"bdr"' --section=post-data -Fc -f pgd-post-data.dump
+```
+
+Then restore each section in order with pg_restore, waiting for replication to catch up between sections:
+
+```console
+pg_restore -h $PG_HOST -p $PG_PORT -U $PG_USER -d $PGD_DB --section=pre-data pgd-pre-data.dump
+pg_restore -h $PG_HOST -p $PG_PORT -U $PG_USER -d $PGD_DB --section=data pgd-data.dump
+psql -h $PG_HOST -p $PG_PORT -U $PG_USER -d $PGD_DB -c 'SELECT bdr.wait_slot_confirm_lsn(NULL, NULL)'
+pg_restore -h $PG_HOST -p $PG_PORT -U $PG_USER -d $PGD_DB --section=post-data pgd-post-data.dump
+psql -h $PG_HOST -p $PG_PORT -U $PG_USER -d $PGD_DB -c 'SELECT bdr.wait_slot_confirm_lsn(NULL, NULL)'
+```
+
+At that point, the dump has been restored and replicated to all nodes in the cluster.
+
+In contrast, if you run a naive pg_dump and pg_restore without
+splitting out the sections, the restore will likely fail with a global lock timeout.
+
+You should also temporarily set the following settings in `postgresql.conf`:
+
+```
+# Increase from the default of `1GB` to something large, but still a
+# fraction of your disk space since the non-WAL data must also fit.
+# This decreases the frequency of checkpoints.
+max_wal_size = 100GB
+
+# Increase the number of writers to make better use of parallel
+# apply. Default is 2. Make sure this isn't overridden lower by the
+# node group config num_writers setting.
+bdr.writers_per_subscription = 5
+
+# Increase the amount of memory for building indexes. Default is
+# 64MB. For example, 1GB assuming 128GB total RAM.
+maintenance_work_mem = 1GB
+
+# Increase the receiver and sender timeout from 1 minute to 1hr to
+# allow large transactions through.
+wal_receiver_timeout = 1h
+wal_sender_timeout = 1h
+```
+
+Additionally:
+
+- Make sure the default `bdr.streaming_mode = 'auto'` isn't overridden, so that transactions are streamed.
+- Make sure any session or `postgresql.conf` settings listed above aren't overridden by node group-level settings.
+
+If you continue to get global lock timeouts during the initial load,
+temporarily set `bdr.ddl_locking = off` for that load.
+
+### Prefer restoring to a single node
+
+Especially when initially setting up a cluster from a Postgres dump,
+we recommend you restore to a cluster with a single PGD node. Then run
+`pgd node setup` for each additional node you want in the cluster, which
+performs a physical join using `bdr_init_physical` under the hood.
+
+### Sequences
 
 pg_dump dumps both local and global sequences as if
 they were local sequences. This behavior is intentional, to allow a PGD
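If global lock timeouts persist even with the sectioned restore above, one way to scope `bdr.ddl_locking = off` to just the restore sessions is libpq's `PGOPTIONS` environment variable. This is a sketch under the assumption that the restore role is allowed to change that setting; the connection variables and dump file names are the same placeholders used above:

```console
# Apply bdr.ddl_locking = off only to these client sessions via PGOPTIONS,
# rather than editing postgresql.conf.
export PGOPTIONS="-c bdr.ddl_locking=off"
pg_restore -h $PG_HOST -p $PG_PORT -U $PG_USER -d $PGD_DB --section=pre-data pgd-pre-data.dump
pg_restore -h $PG_HOST -p $PG_PORT -U $PG_USER -d $PGD_DB --section=data pgd-data.dump
psql -h $PG_HOST -p $PG_PORT -U $PG_USER -d $PGD_DB -c 'SELECT bdr.wait_slot_confirm_lsn(NULL, NULL)'
pg_restore -h $PG_HOST -p $PG_PORT -U $PG_USER -d $PGD_DB --section=post-data pgd-post-data.dump
psql -h $PG_HOST -p $PG_PORT -U $PG_USER -d $PGD_DB -c 'SELECT bdr.wait_slot_confirm_lsn(NULL, NULL)'
unset PGOPTIONS
```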
@@ -51,7 +111,7 @@ dump only with `bdr.crdt_raw_value = on`.
 Technical Support recommends the use of physical backup techniques for
 backup and recovery of PGD.
 
-### Physical backup
+## Physical backup and restore
 
 You can take physical backups of a node in an EDB Postgres Distributed cluster using
 standard PostgreSQL software, such as
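For illustration, taking the backup itself follows standard PostgreSQL practice; a minimal sketch with stock `pg_basebackup`, where the connection variables and target directory are placeholders:

```console
# Full physical base backup of one PGD node, streaming the WAL needed
# to make the copy consistent.
pg_basebackup -h $PG_HOST -p $PG_PORT -U $PG_USER -D /backups/pgd-node1 -Fp -X stream -P
```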
@@ -82,7 +142,92 @@ PostgreSQL backup techniques to PGD:
 local data and a backup of at least one node that subscribes to each
 replication set.
 
-### Eventual consistency
+### Restore
+
+While you can take a physical backup with the same procedure as a
+standard PostgreSQL node, it's slightly more complex to
+restore the physical backup of a PGD node.
+
+#### EDB Postgres Distributed cluster failure or seeding a new cluster from a backup
+
+The most common use case for restoring a physical backup involves the failure
+or replacement of all the PGD nodes in a cluster, for instance in the event of
+a data center failure.
+
+You might also want to perform this procedure to clone the current contents of an
+EDB Postgres Distributed cluster to seed a QA or development instance.
+
+In that case, you can restore PGD capabilities based on a physical backup
+of a single PGD node, optionally plus WAL archives:
+
+- If you still have some PGD nodes live and running, fence off the host you
+restored the PGD node to, so it can't connect to any surviving PGD nodes.
+This practice ensures that the new node doesn't confuse the existing cluster.
+- Restore a single PostgreSQL node from a physical backup of one of
+the PGD nodes.
+- If you have WAL archives associated with the backup, create a suitable
+`postgresql.conf`, and start PostgreSQL in recovery to replay up to the latest
+state. You can specify an alternative `recovery_target` here if needed.
+- Start the restored node, or promote it to read/write if it was in standby
+recovery. Keep it fenced from any surviving nodes!
+- Clean up any leftover PGD metadata that was included in the physical backup.
+- Fully stop and restart the PostgreSQL instance.
+- Add further PGD nodes with the standard procedure based on the
+`bdr.join_node_group()` function call.
+
+#### Cleanup of PGD metadata
+
+To clean up leftover PGD metadata:
+
+1. Drop the PGD node using [`bdr.drop_node`](/pgd/6.1/reference/tables-views-functions/functions-internal#bdrdrop_node).
+2. Fully stop and restart PostgreSQL (important!).
+
+#### Cleanup of replication origins
+
+You must explicitly remove replication origins with a separate step
+because they're recorded persistently in a system catalog. They're
+therefore included in the backup and in the restored instance. They
+aren't removed automatically when dropping the BDR extension because
+they aren't explicitly recorded as its dependencies.
+
+To track progress of incoming replication in a crash-safe way,
+PGD creates one replication origin for each remote master node. Therefore,
+for each node in the previous cluster, run this once:
+
+```
+SELECT pg_replication_origin_drop('bdr_dbname_grpname_nodename');
+```
+
+You can list replication origins as follows:
+
+```
+SELECT * FROM pg_replication_origin;
+```
+
+Those created by PGD are easily recognized by their name.
+
+#### Cleanup of replication slots
+
+If a physical backup was created with `pg_basebackup`, replication slots
+are omitted from the backup.
+
+Some other backup methods might preserve replication slots, likely in
+outdated or invalid states. Once you restore the backup, use these commands to drop all replication slots:
+
+```
+SELECT pg_drop_replication_slot(slot_name)
+FROM pg_replication_slots;
+```
+
+If you have a reason to preserve some slots,
+you can add a `WHERE slot_name LIKE 'bdr%'` clause, but this is rarely
+useful.
+
+!!! Warning
+Never use these commands to drop replication slots on a live PGD node
+
+
+## Eventual consistency
 
 The nodes in an EDB Postgres Distributed cluster are *eventually consistent* but not
 *entirely consistent*. A physical backup of a given node provides
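To make the cleanup-and-rejoin sequence above concrete, here's a hedged sketch. The database `pgddb`, node names `node1` and `node2`, group name `pgdgroup`, and the exact function arguments are illustrative assumptions, not part of the documented procedure:

```console
# 1. On the fenced-off restored node: drop the stale PGD node metadata
#    carried over from the backup, then fully stop and restart Postgres.
psql -d pgddb -c "SELECT bdr.drop_node('node1');"

# 2. Remove replication origins and slots left over from the backup.
#    Never run these against a live PGD node.
psql -d pgddb -c "SELECT pg_replication_origin_drop(roname) FROM pg_replication_origin WHERE roname LIKE 'bdr_%';"
psql -d pgddb -c "SELECT pg_drop_replication_slot(slot_name) FROM pg_replication_slots;"

# 3. On each additional node, register the local node and join it to the
#    group through the restored seed node.
psql -d pgddb -c "SELECT bdr.create_node('node2', 'host=node2 dbname=pgddb');"
psql -d pgddb -c "SELECT bdr.join_node_group('host=node1 dbname=pgddb', 'pgdgroup');"
```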
@@ -199,89 +344,4 @@ of changes arriving from a single master in COMMIT order.
 
 !!! Note
 This feature is available only with EDB Postgres Extended.
-Barman doesn't create a `multi_recovery.conf` file.
-
-## Restore
-
-While you can take a physical backup with the same procedure as a
-standard PostgreSQL node, it's slightly more complex to
-restore the physical backup of a PGD node.
-
-### EDB Postgres Distributed cluster failure or seeding a new cluster from a backup
-
-The most common use case for restoring a physical backup involves the failure
-or replacement of all the PGD nodes in a cluster, for instance in the event of
-a data center failure.
-
-You might also want to perform this procedure to clone the current contents of a
-EDB Postgres Distributed cluster to seed a QA or development instance.
-
-In that case, you can restore PGD capabilities based on a physical backup
-of a single PGD node, optionally plus WAL archives:
-
-- If you still have some PGD nodes live and running, fence off the host you
-restored the PGD node to, so it can't connect to any surviving PGD nodes.
-This practice ensures that the new node doesn't confuse the existing cluster.
-- Restore a single PostgreSQL node from a physical backup of one of
-the PGD nodes.
-- If you have WAL archives associated with the backup, create a suitable
-`postgresql.conf`, and start PostgreSQL in recovery to replay up to the latest
-state. You can specify an alternative `recovery_target` here if needed.
-- Start the restored node, or promote it to read/write if it was in standby
-recovery. Keep it fenced from any surviving nodes!
-- Clean up any leftover PGD metadata that was included in the physical backup.
-- Fully stop and restart the PostgreSQL instance.
-- Add further PGD nodes with the standard procedure based on the
-`bdr.join_node_group()` function call.
-
-#### Cleanup of PGD metadata
-
-To clean up leftover PGD metadata:
-
-1. Drop the PGD node using [`bdr.drop_node`](/pgd/latest/reference/tables-views-functions/functions-internal#bdrdrop_node).
-2. Fully stop and restart PostgreSQL (important!).
-
-#### Cleanup of replication origins
-
-You must explicitly remove replication origins with a separate step
-because they're recorded persistently in a system catalog. They're
-therefore included in the backup and in the restored instance. They
-aren't removed automatically when dropping the BDR extension because
-they aren't explicitly recorded as its dependencies.
-
-To track progress of incoming replication in a crash-safe way,
-PGD creates one replication origin for each remote master node. Therefore,
-for each node in the previous cluster run this once:
-
-```
-SELECT pg_replication_origin_drop('bdr_dbname_grpname_nodename');
-```
-
-You can list replication origins as follows:
-
-```
-SELECT * FROM pg_replication_origin;
-```
-
-Those created by PGD are easily recognized by their name.
-
-#### Cleanup of replication slots
-
-If a physical backup was created with `pg_basebackup`, replication slots
-are omitted from the backup.
-
-Some other backup methods might preserve replications slots, likely in
-outdated or invalid states. Once you restore the backup, use these commands to drop all replication slots:
-
-```
-SELECT pg_drop_replication_slot(slot_name)
-FROM pg_replication_slots;
-```
-
-If you have a reason to preserve some slots,
-you can add a `WHERE slot_name LIKE 'bdr%'` clause, but this is rarely
-useful.
-
-!!! Warning
-Never use these commands to drop replication slots on a live PGD node
-
+Barman doesn't create a `multi_recovery.conf` file.
