This document describes the system configuration parameters in the Write-Ahead Log (WAL) category.
Note!
To ensure system stability and security, manually modifying these parameters should be done with extreme caution.
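commit_delay
Adds a time delay (in microseconds) before a WAL flush is initiated by COMMIT.
This can improve group commit throughput when other transactions become ready to commit within the interval, because a single WAL flush can then commit several of them. However, it also increases the latency of each WAL flush by up to commit_delay.
A delay is only applied if at least commit_siblings (see below) other transactions are active when the flush is about to occur; otherwise, the delay would be wasted.
No delay is applied if synchronous_commit is disabled.
| Data Type | Default Value | Range | Context |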
|---|---|---|---|
| int | 0 | 0 ~ 100000 | master; session; reload; superuser |
commit_siblings
Minimum number of concurrent active transactions required before commit_delay is applied.
| Data Type | Default Value | Range | Context |
|---|---|---|---|
| int | 5 | 0 ~ 1000 | master; session; reload |
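full_page_writes
When this parameter is on, the YMatrix server writes the entire content of each page to the WAL during its first modification after a checkpoint.
Turning this parameter off speeds up normal operation but can lead to unrecoverable or silent data corruption after a system failure. The risks are similar to those of turning off fsync, though smaller. This setting should only be disabled when fsync can also be safely disabled.
| Data Type | Default Value | Context |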
|---|---|---|
| boolean | on | segments; system; reload |
fsync
If enabled, the YMatrix server ensures that updates are physically written to disk by issuing fsync() system calls or equivalent methods (see wal_sync_method). This guarantees that the database cluster can recover to a consistent state after an OS or hardware crash.
Although disabling fsync often improves performance, it may result in unrecoverable data corruption in the event of power loss or a system crash. Therefore, fsync should only be disabled when the entire database can be easily rebuilt from external data.
Examples of safe circumstances for disabling fsync include: initial loading of a new database cluster from a backup, using a cluster to process batch data before dropping and recreating it, or maintaining a read-only clone that is frequently rebuilt and not used for failover. High-quality hardware alone is not sufficient justification for disabling fsync.
When changing fsync from off to on, all modified kernel buffers must be forced to persistent storage for recovery to be reliable. This can be achieved when the cluster shuts down, when fsync is enabled via pg_ctl, by running sync, by unmounting the filesystem, or by restarting the server.
In many situations, disabling synchronous_commit for less critical transactions can yield significant performance gains without risking data corruption.
If fsync is disabled, also consider disabling full_page_writes (see above).
| Data Type | Default Value | Context |
|---|---|---|
| boolean | on | segments; system; reload |
synchronous_commit
Controls the level of synchronization. Determines whether a transaction must wait for its WAL record to be written to disk before the command returns a success indicator to the client.
The default, and safe, setting is on. When set to off, there is a delay between reporting success to the client and actually guaranteeing that the transaction will survive a server crash (the maximum delay is three times wal_writer_delay). Unlike fsync, setting this to off does not risk database inconsistency: a crash may lose some recently committed transactions, but the database state will be as if those transactions had been cleanly aborted. Thus, when performance is more important than full transaction durability, disabling synchronous_commit can be an effective alternative.
When set to remote_apply, the primary waits until the current synchronous standby acknowledges that it has received the commit record, flushed it to disk, and applied the transaction so that it is visible to queries on the standby.
When set to on, the primary waits until the synchronous standby acknowledges that it has received the commit record and flushed it to disk.
When set to remote_write, the primary waits until the synchronous standby acknowledges that it has received the commit record and written it to its OS buffer (but not necessarily flushed it to disk).
When set to local, the commit waits only for the local disk flush, not for replication. This is typically not desired when using synchronous replication, but is provided for completeness.
If synchronous_standby_names is empty, the settings remote_apply, remote_write, on, and local all provide the same level: waiting only for the local disk flush.
| Setting | Local Durability | Standby Durability after YMatrix Crash | Standby Durability after OS Crash | Standby Query Visibility |
|---|---|---|---|---|
| remote_apply | Y | Y | Y | Y |
| on | Y | Y | Y | |
| remote_write | Y | Y | | |
| local | Y | | | |
| off | | | | |
| Data Type | Default Value | Range | Context |
|---|---|---|---|
| enum | on | on / off / true / false / yes / no / 1 / 0 / remote_apply / remote_write / local | segments; session; reload |
wal_buffers
Amount of shared memory used for WAL data that has not yet been written to disk (in units of WAL blocks, i.e., XLOG_BLCKSZ bytes).
The default setting of -1 selects a size equal to 1/32 of shared_buffers (about 3%), but not less than 32kB and not more than the WAL segment size (typically 16MB).
Because the contents of the WAL buffers are written to disk at every transaction commit, extremely large wal_buffers values offer little benefit. However, setting it to several megabytes may improve write performance on busy servers where many clients commit simultaneously.
The default of -1 enables auto-tuning, which yields reasonable results in most cases.
| Data Type | Default Value | Range | Context |
|---|---|---|---|
| int | -1 | -1 ~ INT_MAX/XLOG_BLCKSZ | segments; system; restart |
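The auto-tuning rule for wal_buffers = -1 can be sketched as a small calculation. This is only an illustration of the rule as described above, not the server's implementation: the function name is hypothetical, and the 32kB floor and 16MB segment size are taken from the text rather than from the server source.

```python
def auto_wal_buffers_bytes(shared_buffers_bytes: int,
                           wal_segment_bytes: int = 16 * 1024 * 1024,
                           floor_bytes: int = 32 * 1024) -> int:
    """Approximate the size chosen when wal_buffers = -1.

    Assumptions (from the description above): 1/32 of shared_buffers,
    clamped to at least 32kB and at most one WAL segment (typically 16MB).
    """
    candidate = shared_buffers_bytes // 32
    # Clamp the candidate between the floor and the WAL segment size.
    return max(floor_bytes, min(candidate, wal_segment_bytes))


MiB = 1024 * 1024
small = auto_wal_buffers_bytes(512 * 1024)      # tiny shared_buffers hits the floor
typical = auto_wal_buffers_bytes(128 * MiB)     # 1/32 of 128MB
large = auto_wal_buffers_bytes(8 * 1024 * MiB)  # capped at the segment size
```

Under these assumptions, any shared_buffers value of 512MB or more already reaches the 16MB cap, which is consistent with the note that very large wal_buffers values offer little benefit.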
wal_compression
Enables compression of full-page writes in WAL.
When this parameter is on, and either full_page_writes is on or a base backup is in progress, the YMatrix server compresses full-page images written to WAL. Compressed images are decompressed during WAL replay.
| Data Type | Default Value | Context |
|---|---|---|
| boolean | off | segments; session; reload; superuser |
wal_init_zero
When set to on, new WAL files are initialized with zeros.
When set to off, only the last byte is written when the file is created, so that it has the expected size.
| Data Type | Default Value | Context |
|---|---|---|
| boolean | on | segments; session; reload; superuser |
wal_level
Determines how much information is written to the WAL.
The default value, replica, writes enough data to support WAL archiving and replication, including read-only queries on standby servers. minimal omits all records except those required to recover from a crash or immediate shutdown. logical adds the information required for logical decoding. Each level includes all information from the lower levels.
At the minimal level, WAL logging for certain bulk operations can be safely skipped, making those operations faster. Applicable operations include: INSERT, UPDATE, DELETE, COPY into tables created or truncated within the same transaction.
To enable WAL archiving (archive_mode) and streaming replication, replica or higher must be used.
At the logical level, in addition to replica-level information, extra data is logged to allow extraction of logical change sets from WAL. Using logical increases WAL volume, especially when many tables are configured for REPLICA IDENTITY FULL and numerous UPDATE and DELETE statements are executed.
| Data Type | Default Value | Range | Context |
|---|---|---|---|
| enum | replica | replica / minimal / logical | segments; system; restart |
wal_log_hints
Performs full-page writes even for non-critical updates.
When this parameter is on, the YMatrix server writes the entire content of a disk page to WAL during its first modification after a checkpoint, even for non-critical changes such as hint bit updates.
| Data Type | Default Value | Context |
|---|---|---|
| boolean | off | segments; system; restart |
wal_recycle
Recycles WAL files.
When set to on, WAL files are reused by renaming them, which avoids creating new files.
| Data Type | Default Value | Context |
|---|---|---|
| boolean | on | segments; session; reload; superuser |
wal_sync_method
Method used to force WAL updates to disk.
If fsync is disabled, this setting is irrelevant because WAL file updates will not be forced to disk. Possible values: open_datasync (write WAL files with O_DSYNC), fdatasync (call fdatasync() on each commit), fsync (call fsync() on each commit), fsync_writethrough (call fsync() with a write-through directive), open_sync (write WAL files with O_SYNC).
The open_* options may also use O_DIRECT (if available).
| Data Type | Default Value | Range | Context |
|---|---|---|---|
| enum | fsync / fdatasync (default on Linux) | open_datasync / fdatasync / fsync / fsync_writethrough / open_sync | segments; system; reload |
wal_writer_delay
Specifies how often the WAL writer flushes WAL, in milliseconds.
After flushing WAL, the WAL writer sleeps for wal_writer_delay, unless awakened early by an asynchronously committing transaction.
If the last flush happened less than wal_writer_delay ago and less than wal_writer_flush_after of WAL has been generated since then, WAL is written to the OS buffer only, not flushed to disk.
On many systems the effective resolution of sleep delays is 10 milliseconds, so setting wal_writer_delay to a value not divisible by 10 may have the same effect as setting it to the next higher multiple of 10.
| Data Type | Default Value | Range | Context |
|---|---|---|---|
| int | 200 | 1 ~ 10000 | segments; system; reload |
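The rounding note above, and the asynchronous-commit window described under synchronous_commit (at most three times wal_writer_delay), can be illustrated with a short calculation. This is a sketch under stated assumptions: both function names are hypothetical, and the 10 ms sleep resolution is an assumed property of the host system, not something the server guarantees.

```python
def effective_wal_writer_delay_ms(configured_ms: int, resolution_ms: int = 10) -> int:
    """Round the configured delay up to the next multiple of the sleep resolution.

    resolution_ms = 10 is an assumption about the platform's timer granularity.
    """
    # Integer ceiling division avoids floating-point rounding issues.
    return -(-configured_ms // resolution_ms) * resolution_ms


def max_async_commit_window_ms(wal_writer_delay_ms: int) -> int:
    """Upper bound on the delay between reporting success and durability
    when synchronous_commit = off (three times wal_writer_delay)."""
    return 3 * wal_writer_delay_ms


default_delay = effective_wal_writer_delay_ms(200)   # default value, already a multiple of 10
odd_delay = effective_wal_writer_delay_ms(205)       # rounds up to the next multiple of 10
window = max_async_commit_window_ms(default_delay)   # exposure window at the default delay
```

With the default of 200 ms, the exposure window for asynchronous commits is therefore at most 600 ms.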
wal_writer_flush_after
Specifies how often the WAL writer flushes WAL in terms of volume, in units of WAL blocks (i.e., XLOG_BLCKSZ bytes).
If the last flush happened less than wal_writer_delay ago and less than wal_writer_flush_after of WAL has been generated since then, WAL is written to the OS buffer only, not flushed to disk.
If wal_writer_flush_after is set to 0, WAL data is always flushed immediately.
| Data Type | Default Value | Range | Context |
|---|---|---|---|
| int | 1048576/XLOG_BLCKSZ | 0 ~ INT_MAX | segments; system; reload |
checkpoint_completion_target
Target duration for checkpoint completion, expressed as a fraction of the total time between checkpoints.
| Data Type | Default Value | Range | Context |
|---|---|---|---|
| floating point | 0.5 | 0.0 ~ 1.0 | segments; system; reload |
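checkpoint_flush_after
Whenever more than this amount of data (in blocks) has been written during a checkpoint, the server attempts to force the OS to send these writes to the underlying storage.
Doing so limits the amount of dirty data in the kernel's page cache, reducing the likelihood of stalls when an fsync is issued or when the OS performs large background writebacks.
If set to 0, forced writeback is disabled.
| Data Type | Default Value | Range | Context |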
|---|---|---|---|
| int | 0 | 0 ~ 256 | segments; system; reload |
checkpoint_timeout
Maximum time between automatic WAL checkpoints, in seconds.
The valid range is 30 to 86400 seconds (i.e., 1 day).
| Data Type | Default Value | Range | Context |
|---|---|---|---|
| int | 300 | 30 ~ 86400 | segments; system; reload |
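As a rough illustration of how checkpoint_completion_target and checkpoint_timeout interact, the sketch below computes the time over which checkpoint writes are spread. It assumes that checkpoint_completion_target is interpreted as a fraction of the checkpoint interval (as in PostgreSQL) and that checkpoints are triggered by the timeout rather than by WAL volume; the function name is hypothetical.

```python
def checkpoint_write_window_seconds(checkpoint_timeout_s: int,
                                    completion_target: float) -> float:
    """Approximate time budget for spreading checkpoint writes.

    Assumes timeout-driven checkpoints and a completion target that is a
    fraction of the interval between checkpoints.
    """
    if not 30 <= checkpoint_timeout_s <= 86400:
        raise ValueError("checkpoint_timeout must be between 30 and 86400 seconds")
    if not 0.0 <= completion_target <= 1.0:
        raise ValueError("checkpoint_completion_target must be between 0.0 and 1.0")
    return checkpoint_timeout_s * completion_target


default_window = checkpoint_write_window_seconds(300, 0.5)  # documented defaults
max_timeout_days = 86400 / (24 * 60 * 60)                   # upper bound expressed in days
```

With the documented defaults (300 seconds and 0.5), writes for one checkpoint would be spread over roughly 150 seconds.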
checkpoint_warning
If checkpoints caused by WAL segment files filling up occur more frequently than this interval (in seconds), a warning message is written to the server log (indicating that max_wal_size should be increased).
Set to 0 to disable warnings. No warning is issued if checkpoint_timeout is less than this value.
| Data Type | Default Value | Range | Context |
|---|---|---|---|
| int | 30 | 0 ~ INT_MAX | segments; system; reload |
max_wal_size
Maximum size (in MB) to which WAL can grow between automatic WAL checkpoints.
This is a soft limit: WAL size can exceed max_wal_size under heavy load, checkpoint_timeout failure, or high wal_writer_delay.
| Data Type | Default Value | Range | Context |
|---|---|---|---|
| int | 4096 | 2 ~ INT_MAX/1024 | segments; system; reload |
min_wal_size
Minimum WAL size (in MB) between automatic WAL checkpoints.
| Data Type | Default Value | Range | Context |
|---|---|---|---|
| int | 320 | 2 ~ INT_MAX/1024 | segments; system; reload |
archive_command
Local shell command to archive a completed WAL segment file.
Any %p in the string is replaced by the path to the file to be archived (relative to the cluster's data directory), and any %f is replaced by the file name only. To embed a literal % character, use %%. Example:
archive_command = 'cp %p /mnt/server/archivedir/%f'
The command must return an exit status of 0 only on success.
This parameter is ignored unless the server was started with archive_mode enabled. If archive_mode is enabled and archive_command is an empty string (default), WAL archiving is temporarily disabled, but the server continues to accumulate WAL segment files, waiting for a command to be provided.
Setting archive_command to a no-op command that always returns true (e.g., true on Unix or rem on Windows) effectively disables archiving and breaks the WAL chain required for archive recovery, so it should only be used in rare cases.
| Data Type | Default Value | Context |
|---|---|---|
| string | | segments; system; reload |
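The placeholder substitution described above can be sketched as follows. This only mimics the documented %p, %f, and %% rules for illustration: the function name is hypothetical, the sample segment path is made up, and the server performs the real substitution itself before invoking the shell.

```python
import re


def expand_archive_command(template: str, wal_path: str) -> str:
    """Expand %p (path), %f (file name only), and %% (literal %) in a template."""
    file_name = wal_path.rsplit("/", 1)[-1]
    replacements = {"%p": wal_path, "%f": file_name, "%%": "%"}
    # A single left-to-right pass ensures "%%p" yields a literal "%p"
    # instead of being expanded a second time.
    return re.sub(r"%[pf%]", lambda m: replacements[m.group(0)], template)


# Hypothetical segment path, relative to the data directory.
segment = "pg_wal/000000010000000000000001"
command = expand_archive_command("cp %p /mnt/server/archivedir/%f", segment)
literal = expand_archive_command("echo 100%% done for %f", segment)
```

Here `command` is the shell string that would be executed for the example archive_command shown above.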
archive_mode
When archive_mode is enabled, completed WAL segments are sent to archive storage via archive_command.
In addition to off, which disables archiving, two modes are available: on and always. During normal operation, there is no difference between them. However, when set to always, the WAL archiver is also active during archive recovery or standby mode. In always mode, all files restored from archive or received via streaming replication are archived again.
archive_mode and archive_command are independent variables, so archive_command can be modified without affecting the archiving mode.
When wal_level (see above) is set to minimal, archive_mode cannot be enabled.
| Data Type | Default Value | Range | Context |
|---|---|---|---|
| enum | off | off / on / always | segments; system; restart |
archive_timeout
Forces a switch to a new WAL segment file after this time interval (in seconds).
archive_command is only invoked on completed WAL segments. Therefore, if the server generates little WAL traffic (or only infrequent bursts), there may be a long delay between transaction completion and safe archival. To limit the lifetime of unarchived data, set archive_timeout to force periodic WAL segment switches.
Files closed early by a forced switch are still the same size as full files, so a very short archive_timeout is unwise: it consumes excessive archive storage. A setting around one minute is typically reasonable.
The default is 0, which disables this feature.
| Data Type | Default Value | Range | Context |
|---|---|---|---|
| int | 0 | 0 ~ INT_MAX/2 | segments; system; reload |
These parameters apply only in recovery mode. They must be reset if further recovery operations are intended.
"Recovery" includes running the server as a standby (Standby) server or performing point-in-time recovery. Typically, standby mode provides high availability and/or read scalability, while point-in-time recovery is used to recover from data loss.
To start the server in standby mode, create a file named standby.signal in the data directory. The server enters recovery mode and does not stop when archived WAL ends, but attempts to continue by connecting to the sender server specified in primary_conninfo and/or fetching new WAL segments via restore_command.
To start the server in point-in-time recovery mode, create a file named recovery.signal in the data directory. If both standby.signal and recovery.signal are present, standby mode takes precedence. Point-in-time recovery ends when all archived WAL is replayed or when recovery_target is reached.
archive_cleanup_command
Shell command to execute at each restart point.
Any %r is replaced with the name of the file containing the last available restart point, that is, the earliest file that must be retained for recovery to be restartable. All files older than %r can be safely removed. To embed a literal %, use %%. This information can be used to truncate the archive to the minimum required to restart from the current recovery. In single-standby setups, the pg_archivecleanup module is often used in archive_cleanup_command, e.g.:
archive_cleanup_command = 'pg_archivecleanup /mnt/server/archivedir %r'
| Data Type | Default Value | Context |
|---|---|---|
| string | | segments; system; reload |
recovery_end_command
Shell command executed once when recovery finishes.
As with archive_cleanup_command, any %r is replaced with the name of the file containing the last available restart point.
| Data Type | Default Value | Context |
|---|---|---|
| string | | segments; system; reload |
restore_command
Local shell command to retrieve an archived WAL segment from the WAL file series.
Any %f in the string is replaced with the name of the file to retrieve from the archive, and any %p is replaced with the destination path on the server (relative to the cluster's data directory). To embed a literal %, use %%. Any %r is replaced with the name of the file containing the last available restart point.
The command must return an exit status of 0 only on success. It will be asked for file names that are not present in the archive; in such cases, it must return non-zero. Examples:
restore_command = 'cp /mnt/server/archivedir/%f "%p"'
restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"' # Windows
| Data Type | Default Value | Context |
|---|---|---|
| string | | segments; system; restart |
These parameters are used only during targeted recovery operations. By default, recovery proceeds to the end of the WAL log. These parameters allow specifying an earlier stopping point.
Note!
At most one of recovery_target, recovery_target_name, recovery_target_time, recovery_target_xid, and recovery_target_lsn may be used. Specifying more than one in the configuration file results in an error.
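The "at most one target" rule can be checked before a configuration is deployed. The following is a minimal sketch, assuming the settings have already been parsed into a dictionary; the function name and the dictionary format are hypothetical, and the server performs its own validation at startup regardless.

```python
from typing import Mapping, Optional

# The mutually exclusive recovery target parameters named in the note above.
RECOVERY_TARGET_PARAMS = (
    "recovery_target",
    "recovery_target_name",
    "recovery_target_time",
    "recovery_target_xid",
    "recovery_target_lsn",
)


def selected_recovery_target(settings: Mapping[str, str]) -> Optional[str]:
    """Return the single recovery target parameter in use, or None if unset.

    Raises ValueError when more than one target parameter has a non-empty value.
    """
    chosen = [name for name in RECOVERY_TARGET_PARAMS if settings.get(name, "") != ""]
    if len(chosen) > 1:
        raise ValueError("multiple recovery targets specified: " + ", ".join(chosen))
    return chosen[0] if chosen else None


# recovery_target_inclusive is a modifier, not a target, so it does not count.
ok = selected_recovery_target({
    "recovery_target_time": "2024-01-01 00:00:00",
    "recovery_target_inclusive": "off",
})
```

Parameters such as recovery_target_action, recovery_target_inclusive, and recovery_target_timeline are deliberately excluded from the check because they modify how a target is applied rather than selecting one.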
recovery_target
This parameter specifies that recovery should end as soon as a consistent state is reached.
immediate is currently the only allowed value.
| Data Type | Default Value | Range | Setting Scope |
|---|---|---|---|
| string | | immediate | segments; system; restart |
recovery_target_action
Specifies the action the server should take immediately upon reaching the recovery target.
The default is pause, which means recovery will be paused. promote means recovery will end and the server will start accepting connections. shutdown means the server will stop after reaching the recovery target.
The pause setting is useful for checking whether the recovery target is the desired stopping point, since queries can be run on the database while recovery is paused. The paused state can be resumed using pg_wal_replay_resume(), which then causes recovery to end. If the recovery target is not the desired stopping point, shut down the server, change the recovery target setting to a later point, and restart to continue recovery.
The shutdown setting can be helpful to prepare the instance at the desired replay point. The instance will still be able to replay additional WAL records (and will in fact have to replay WAL records from the last checkpoint onward the next time it starts).
When recovery_target_action is set to shutdown, the recovery.signal file will not be removed. Any subsequent startup will result in immediate shutdown unless the configuration is changed or the recovery.signal file is manually removed.
If hot_standby is not enabled, a setting of pause behaves the same as shutdown.
| Data Type | Default Value | Range | Setting Scope |
|---|---|---|---|
| enum | pause | pause / promote / shutdown | segments; system; restart |
recovery_target_inclusive
Specifies whether recovery stops after (on) or before (off) the specified recovery target.
This applies when recovery_target_lsn, recovery_target_time, or recovery_target_xid is specified.
| Data Type | Default Value | Setting Scope |
|---|---|---|
| boolean | on | segments; system; restart |
recovery_target_lsn
Specifies the WAL LSN up to which recovery will proceed.
The precise stopping point is also influenced by recovery_target_inclusive (see above). The value is parsed using the system data type pg_lsn.
| Data Type | Default Value | Setting Scope |
|---|---|---|
| pg_lsn | | segments; system; restart |
recovery_target_name
Specifies a named recovery point (created by pg_create_restore_point()) to which recovery will proceed.
| Data Type | Default Value | Setting Scope |
|---|---|---|
| string | | segments; system; restart |
recovery_target_time
Specifies the timestamp up to which recovery will proceed.
The precise stopping point is also influenced by recovery_target_inclusive (see above).
| Data Type | Default Value | Setting Scope |
|---|---|---|
| timestamp | | segments; system; restart |
recovery_target_timeline
Specifies recovery into a specific timeline.
The value can be a numeric timeline ID or a special value. current recovers along the same timeline used during the base backup; latest recovers to the latest timeline found in the archive, which is useful for standby servers.
| Data Type | Default Value | Range | Setting Scope |
|---|---|---|---|
| string | latest | current / latest / Timeline ID | segments; system; restart |
recovery_target_xid
Specifies the transaction ID at which recovery will stop.
The precise stopping point is also influenced by recovery_target_inclusive (see above).
| Data Type | Default Value | Setting Scope |
|---|---|---|
| string | | segments; system; restart |