Hitachi, Ltd.
There are two methods for PostgreSQL backup. One is offline backup that is performed when there is no transaction, and the other is online backup that is performed as a background process when there is a transaction.
Each backup method has its corresponding restoration method.
Table 1 shows the backup and restoration methods used in this measurement. Functional descriptions of the methods are also described.
Table 1 Backup and Restoration Methods
| Backup method |
Execution method and functional description |
Availability during DBT-1 execution |
Restoration execution method and functional description |
| pg_dump |
Creates a backup file for each database. The output format for the backup file can be selected by using options. pg_dump -F p [Target DB Name] > [Output Filename]: Creates a backup file in script format. pg_dump -F c [Target DB Name] ] > [Output Filename]: Creates a backup file in custom archive format. |
Available |
Restoration is executed for each database and for only valid area, irrespective of the output file format. Restoration method for script format file is as follows: psql -f [Backup Filename] [Target DB Name]: Restoration method for archive format file is as follows: pg_restore -d [Target DB Name] [Backup Filename] |
| pg_dumpall |
Creates a backup file in a script format file that contains all the databases in a cluster. In this method, template0,1 and postgres DB are also restored. Thus, the information on postgres user is also backed up. Execution method is, pg_dumpall > [Output Filename] |
Unavailable |
Restores only valid area of the entire database. Execution method is, psql -f [Backup Filename] [Target DB Name] |
| tar czf |
Creates a backup file by duplicating a cluster. For this backup, tar command provided by the OS is used. In addition to the normal tar archive, gzip compression is executed at the same time. When PITR is not configured, postmaster must be inactive. Execution method is, tar czf [Output Filename] [Name of Data Cluster Directory] |
Available only when PITR is configured |
Restoration is executed for a file system including invalid area.Execution method is, tar xzf [Backup Filename] |
pg_dump and pg_dumpall can only be executed when PosqgreSQL is active because backing up by using those commands issue SQL statements.
Thus, strictly speaking, offline backup cannot be executed with pg_dump and pg_dumpall. In this consideration, however, we say "offline backup" meaning a backup with no transaction.
We investigated the points below.
Contents
= Backup =
1. Effects of Increase of Invalid Area in Database on Backup Time
2. Backup Time and Backup File Size in Each Backup Method
= Restoration =
3. Time Required for Restoring Backed-Up Files
= PITR =
4. Online Backup Time
5. Effects of Online Backup on Operational Performance
6. PITR and WAL Increase
= Backup and Restoration =
7 Consideration of Operation Time Concerning Backup and Restoration
- System Running 24 Hours a Day, 365 Days a Year
- System in Which Operation Halt Time (Monthly / Daily) Is Periodically Configured
- Consideration of Database Size Concerning Backup and Restoration
- Consideration of Problem Recovery Time Concerning Backup and Restoration
1. Effects of Increase of Invalid Area in Database on Backup Time
In this measurement, we used two tables. One is a table of about 7 GB that was created by using "./datagen -d PGSQL -i 10000 -u 1000," and the other is a table of about 53 GB that was created by using "./datagen -d PGSQL -i 10000 -u 8000."
The measurement was performed by increasing the invalid area in each data cluster through execution of DBT-1 on each table. During the measurement, we gradually increased the DBT-1 execution time, from three to seven, and then seven to 10 hours.
Figure 1 illustrates the amount of increase of the invalid area during the DBT-1 execution on the table of about 53 GB.

Figure 1 Increase of Invalid Area Measured at Intervals During DBT-1 Execution
We measured the two tables with eu=1000.
As a result of this measurement, we found that table size does not affect the amount of increase of the invalid area.
In addition, absolute amount of increase of the invalid area is not enough. Thus, we could not obtain any specific result to tell effects of long-time execution of DBT-1 and VACUUMFULL on the performance of backup and restoration.
Figure 2 illustrates changes in backup (tar czf) time measured at intervals during the DBT-1 execution on the 53 GB table.

Figure 2 Changes in Backup (tar czf) Time Measured at Intervals During DBT-1 Execution on 53 GB Table
Generally, increase of invalid area is considered to cause increase of backup time.However, when we execute the DBT-1 for only 10 hours, it would be better to think what affects the backup time most is not the increase of invalid area but table size. Therefore, we investigate by focusing on two tables of 7GB and 53GB each.
2.Backup Time and Backup File Size in Each Backup Method
In this section, we investigate the backup method in which service inactive time for backup can be allocated.
First, we analyze backup speed.
Figure 3 illustrates backup time for each backup method.

Figure 3 Backup Time for Each Backup Method
The measurement results show that tar czf requires the longest backup time. In order to check if there is any change in backup efficiency depending on the table size, we divided data size of backup target by backup time (seconds) to obtain backup efficiency. Table 2 shows backup amount per second.
Table 2 Backup Amount per Second for Each Backup Method
| Backup method |
7 GB table (A) |
53 GB table (B) |
(A) - (B) |
| pg_dump -F c |
14511 (KB/s) |
11159 (KB/s) |
3352 |
| tar czf |
10267 (KB/s) |
7801 (KB/s) |
2466 |
| pg_dumpall |
16930 (KB/s) |
16634 (KB/s) |
296 |
| pg_dump -F p |
17057 (KB/s) |
17369 (KB/s) |
-312 |
The measurement results show that backup performance of tar czf and pc_dump -F c deteriorates as backup data size becomes larger.
Therefore, we think that pg_dump -F p backup is more effective than tar czf if we want to shorten service inactive time caused by the backup.
Second, we investigate the amount of compression enabled by each backup method.
Figure 4 illustrates backup file sizes enabled by each backup method.

Figure 4 Backup File Sizes Enabled by Each Backup Method
When invalid area of backup target is small, backup files are created with sizes shown in figure 4.
Table 3 shows backup compression ratio of each backup method. The ratio is calculated by dividing each backup file size by data cluster size of backup target data, those sizes are shown in figure 4.
Table 3 Backup Compression Ratio of Each Backup Method/caption>
|
7 GB table |
53 GB table |
| (A) Data cluster size of backup target (KB) |
6810012 |
53064140 |
| (B) Size of backup file created from tar czf (KB) |
3030008 |
23678000 |
| (C) Size of backup file created from pg_dumpall (KB) |
3203308 |
25857000 |
| (D) Size of backup file created from pg_dump -F c (KB) |
1933640 |
15464000 |
| (E) Size of backup file created from pg_dump -F p (KB) |
3203308 |
25857000 |
| Compression ratio of tar czf ((B)/(A)) |
0.445 |
0.446 |
| Compression ratio of pg_dumpall ((C)/(A)) |
0.47 |
0.487 |
| Compression ratio of pg_dump -F c ((D)/(A)) |
0.283 |
0.291 |
| Compression ratio of pg_dump -F p ((E)/(A)) |
0.47 |
0.487 |
Each backup method equally reduces the original file size to less than half. There was not so much difference between compression ratio of each backup method.
Therefore, we think that pg_dump -F c backup is effective when we want to make backup file size small.
3. Time Required for Restoring Backed-Up Files
In this section, we investigate restoration performance. Figure 12 illustrates the time required for restoring each backed-up file.

Figure 12 Time Required for Restoring Each Backed-Up File
The measurement results show that tar czf is the fastest method to restore files and there was not so much difference between pg_dump methods.
Therefore, we think that the tar czf method is more effective than the pg_dump methods if we want to shorten problem recovery time.
In order to check if there is any difference in processing efficiency depending on the table size, we divided the size of backup target data by restoration time. Table 4 shows the results.
Table 4 Amount of Data Restored per Second Through Each Backup Method
| Backup method |
Restoration method |
7 GB table (A) |
50 GB table (B) |
(A) - (B) |
| pg_dump -F c |
pg_restore |
4676 (KB/s) |
3808 (KB/s) |
868 |
| tar czf |
tar xzf |
41020 (KB/s) |
41231 (KB/s) |
-211 |
| Pg_dumpall |
psql -f |
4686 (KB/s) |
3817 (KB/s) |
869 |
| pg_dump -F p |
psql -f |
4618 (KB/s) |
3963 (KB/s) |
655 |
The measurement results show that restoration performance of Pg_dump methods deteriorates as the size of backed-up data becomes larger.
The result of DBT-1 measurement shows that restoring of a backed-up file created from tar czf will be completed within 30 minutes even when the size of data cluster is 53 GB.
On the other hand, restoring of a backed-up file created from pg_dump and pg_dumpall will take less than 30 minutes for 7 GB data cluster and less than four hours for 53 GB data cluster.
4. Online Backup Time
In this section, we investigate the backup method that is appropriate for backup during the DBT-1 execution.
Figure 5 illustrates online backup time and offline backup time for each backup method.

Figure 5 Online Backup Time for Each Backup Method
We think that pg_dump -F p backup is more effective than tar czf if we want to shorten processing time of the online backup.
5. Effects of Online Backup on Operational Performance
Figure 6 illustrates effects of online backup on DBT-1 processing. The focuse is on changes in BT value.

Figure 6 BT Values Measured During DBT-1 Execution (with or without Online Backup)
Figure 6 indicates the comparison result between a BT value generated through the continual backing up by tar czf and the other BT value generated without such backing up. The premise is,53 GB table, the number of emulate users (eu)=1000 and 40 minutes execution of DBT-1.
The measurement results show that online backup does not affect the processing capability of DBT-1.
At the same time, this indicates that DBT-1 does not affect the system so much if it is executed when the number of emulate users (eu)=1000.
Figure 7 and 8 illustrate changes in CPU utilization rate. Figure 7 is based on online backup and figure 8 is based on the situation in which no such backup performed. Figure 9 illustrates highly-ranked results obtained by executing oprofile during the measurements of figure 7 and 8.

Figure 7 CPU Utilization Rate During Online Backup

Figure 8 CPU Utilization Rate When DBT-1 Only Is Executed

Figure 9 Execution Results of oprofile
The measurement results of figure 8 explains the stable processing capability of DBT-1 during online backup illustrated in figure 6. That is, the stable processing capability was due to sufficient capability allowed for system resources.
In figure 9, postgres processes show a slight degradation during online backup in comparison with when DBT-1 only is executed. Putting these minute changes aside, there was not so much difference as a whole.
Sufficient capability of system resources is about 30%, which is the difference between average % idles shown in figure 7 and 8.
This was assigned to "gzip:(no symbol)" shown in figure 9 as backup process.
We also attempted the measurement with eu=2000, which resulted in an error in which DBT-1 could not operate normally when CPU utilization rate became 100%.
It seems that CPU utilization rate of user application must be kept to about 50% when online backup is performed.
For reference, figure 10 illustrates the CPU utilization rate measured when eu=1500 is set. eu=1500 is the measurable upper limit for this online backup measurement.

Figure 10 CPU Utilization Rate During Online Backup with eu=1500
6. PITR and WAL Increase
PITR must be configured before online backup using tar czf is performed.
WAL is output after PITR is configured.
Figure 11 illustrates the amount of change in data cluster and WAL.

Figure 11 Amount of Change in Data Cluster and WAL During Execution
In this evaluation, WAL increased by 20 to 30 folds of the size of data cluster, which increases following the DBT-1 processing increase.
Figure 12 illustrates BT values measured during the figure 11 measurement.

Figure 12 BT Values Measured During DBT-1 Execution
As a result of comparing figure 11 and 12, we could find that the amount of change in WAL increases following the BT values that become large in proportion to the length of DBT-1 execution time.
Figure 13 illustrates total recovery time for PITR by table size.

Figure 13 Total Recovery Time for PITR by Table Size
The base restoration time shown in figure 13 is measured based on tar recovery in which the backed-up files created from tar czf (illustrated in figure 12) were recovered.
WAL for PITR was targeted to an area on which DBT-1 run for four hours after base backup.
Base restoration time is always required before PITR execution. Thus, for restoration time, it is inevitable that there arises difference between table sizes.
7 Consideration of Operation Time Concerning Backup and Restoration
- System Running 24 Hours a Day, 365 Days a Year
In this system, service cannot be made inactive during backup. Therefore, there is no alternative but to select online backup.
If the importance is placed on restoration performance, tar czf backup with PITR configured should be selected. However, we think that allocating a large amount of disk space for storing WAL is difficult. Therefore, it would be practical to perform online backup by using pg_dump -F c.
Restoration time of backed-up file created by pg_dump tends to become longer in proportion to the size of data cluster.
- System in Which Operation Halt Time (Monthly / Daily) Is Periodically Configured
If the size of data cluster is 7 GB, a backup can be created from pg_dump, pg_dumpall or tar czf with 15 minutes of service inactive time. Selection of operation halt interval, daily or monthly, does not matter.
(The backup time is dependent on I/O performance and subject to change.)
In this case, restoration time is within 30 minutes irrespective of the selection of backup method.
If the size of data cluster is 53 GB, an hour of service inactive time is needed even when backup method of pg_dump or pg_dumpall is selected.
If two hours of service inactive time can be allocated, then backup method of tar czf can also be selected.
In this case, restoration time is less than 4 hours for backed-up files created from pg_dump or pg_dumpall.
In contrast, for backed-up file created from tar czf, restoration is completed within 30 minutes.
(The restoration time is dependent on I/O performance and subject to change.)
- Consideration of Database Size Concerning Backup and Restoration
If the size of data cluster is 7 GB, a backup can be created from pg_dump, pg_dumpall or tar czf with 15 minutes of service inactive time.
In this case, restoration time is within 30 minutes irrespective of the selection of backup method.
If the size of data cluster is 53 GB, an hour of service inactive time is needed even when backup method of pg_dump or pg_dumpall is selected.
If two hours of service inactive time can be allocated, then backup method of tar czf can also be selected.
In this case, restoration time is less than 4 hours for backed-up files created from pg_dump or pg_dumpall.
In contrast, for backed-up file created from tar czf, restoration is completed within 30 minutes.
(The restoration time is dependent on I/O performance and subject to change.)
- Consideration of Problem Recovery Time Concerning Backup and Restoration
In terms of shortening of problem recovery time, restoration of the backup file created from tar czf is the fastest. For this backup file, restoration is completed within 30 minutes even when 53 GB data cluster is used.
When restoring a backup file created from pg_dump or pg_dumpall, significant performance difference does not arise. It takes less than 4 hours to restore 53 GB data cluster.
(The restoration time is dependent on I/O performance and subject to change.)
Related Performance Data
Related Consideration Data