Key PostgreSQL Parameters Explained

xiaobu 5 months ago 179

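Databases are the most complex and most important software after the operating system, and they need dedicated administrators to look after them. A database is the bridge between the underlying infrastructure and the application logic. It is technically complex, and, like a politics class, it also requires some rote memorization. Every database has hundreds or even thousands of parameters, and interviews often test how well you understand the important ones. In this thread I go through the important, frequently tested parameters one by one, and I hope you will memorize them on the basis of real understanding.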

shared_buffers

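This is probably the most important parameter in PG. It controls the size of the shared buffer pool (shared buffers). Because the buffer pool usually takes up the vast majority of shared memory, this parameter also largely determines the total size of shared memory.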

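We can picture shared memory as one huge one-dimensional array in which each element is a single byte. Every child process that the postmaster main process creates through the fork() system call can access this memory, that is, read and write it. In the source code this memory is created with the mmap() system call and released with munmap() before the database instance shuts down. If you are not familiar with these two system calls, look up their documentation and write a small C program to try them out.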
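Here is a minimal sketch of such a test program. It is not taken from the PostgreSQL source; the function name `shared_memory_roundtrip` is made up for this example, and it assumes a Linux-like system where `MAP_ANONYMOUS` is available. It maps a small shared region, forks a child that writes into it, and lets the parent read the value back.

```c
#define _DEFAULT_SOURCE         /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo helper: map an anonymous shared region, fork a child
 * that stores `value` in it, and return what the parent reads back after
 * the child has exited. Returns -1 if any system call fails.
 */
int shared_memory_roundtrip(int value)
{
    size_t size = sizeof(int);

    /* MAP_SHARED | MAP_ANONYMOUS: no backing file, still shared after fork() */
    int *shared = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;

    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, size);
        return -1;
    }
    if (pid == 0) {
        /* child: write into the shared region, then exit at once */
        *shared = value;
        _exit(0);
    }

    /* parent: wait for the child, then read what it wrote */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(shared, size);
        return -1;
    }

    int result = *shared;
    munmap(shared, size);       /* release the mapping, as at shutdown */
    return result;
}
```

If you change `MAP_SHARED` to `MAP_PRIVATE`, the parent no longer sees the child's write, which is a quick way to convince yourself that the sharing comes from the mapping flags and not from fork() itself.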

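The largest component of shared memory is the buffer pool. Its job is simple and direct: it holds the data blocks that are read from the data files on disk. Since all blocks in the data files have the same size, normally 8192 bytes, the buffer pool is also divided into 8192-byte slots. In the source code the block size is controlled by BLCKSZ.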

src/include/pg_config.h:#define BLCKSZ 8192

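When you build from source, the first step is the configure command. You can pass it an option (--with-blocksize, given in kilobytes) to set the block size; the maximum is 32KB.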

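shared_buffers sets the size of the buffer pool, and its default is typically 128MB. Note that a value written without a unit is counted in blocks of BLCKSZ bytes, not in bytes. If shared_buffers is 128MB and each block is 8KB, the buffer pool holds 128*1024/8 = 16384 data pages. You can picture the buffer pool as a one-dimensional array of 16384 pages.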
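The arithmetic above can be checked with a few lines of C. This is only an illustration: the helper names `buffer_pool_pages` and `page_offset` are invented for this example and do not exist in PostgreSQL.

```c
#include <stdint.h>

/* Hypothetical helper: number of fixed-size pages in a pool of pool_size_mb megabytes. */
int64_t buffer_pool_pages(int64_t pool_size_mb, int64_t block_size_bytes)
{
    int64_t pool_size_bytes = pool_size_mb * 1024 * 1024;

    return pool_size_bytes / block_size_bytes;
}

/* Hypothetical helper: byte offset of page number page_index inside the pool. */
int64_t page_offset(int64_t page_index, int64_t block_size_bytes)
{
    /* pages sit back to back, like elements of a one-dimensional array */
    return page_index * block_size_bytes;
}
```

Calling `buffer_pool_pages(128, 8192)` gives 16384, and `page_offset(16383, 8192)` gives the starting byte of the last page in that pool.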

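More memory generally means better performance, so it is tempting to make shared_buffers as large as the server's physical memory allows. PostgreSQL also relies on the operating system's page cache, however, so bigger is not always better. The documentation suggests about 25% of physical memory as a reasonable starting point and notes that going above 40% is unlikely to work better than a smaller value.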
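As a rough illustration of that guideline, the sketch below turns a server's RAM size into the 25% and 40% marks. The struct and function names and the choice of MB as the unit are assumptions made for this example; the percentages are the ones quoted above, and the right value still depends on the workload.

```c
/* Hypothetical result type: guideline values in megabytes. */
typedef struct {
    long start_mb;      /* 25% of RAM */
    long ceiling_mb;    /* 40% of RAM */
} SharedBuffersGuide;

/* Hypothetical helper: shared_buffers guideline for a server with ram_mb of RAM. */
SharedBuffersGuide shared_buffers_guide(long ram_mb)
{
    SharedBuffersGuide guide;

    guide.start_mb = ram_mb * 25 / 100;     /* integer division rounds down */
    guide.ceiling_mb = ram_mb * 40 / 100;
    return guide;
}
```

For a server with 16384 MB of RAM, `shared_buffers_guide(16384).start_mb` is 4096.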

Latest replies (2)

xiaobu 5 months ago (reply #2)

max_wal_size

See this article for reference.

https://www.enterprisedb.com/blog/tuning-maxwalsize-postgresql

The max_wal_size parameter tells PostgreSQL what the approximate maximum total size of the stored WAL segments should be. Once the allowed space is exhausted, a checkpoint is requested so that the space can be recycled.

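The max_wal_size parameter tells the PG instance the approximate upper limit on the total size of the WAL files kept in the pg_wal directory. Once the WAL files in pg_wal use up this allowance, a checkpoint is triggered, because most WAL files are no longer needed after the checkpoint completes. They are either deleted directly, deleted after being archived, or renamed and reused.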

The following passage is also excellent. I hope you can quote it in an interview to show your professional level.

On a well configured system, the vast majority of checkpoints should be timed (based on the checkpoint_timeout parameter) rather than requested. This ensures that checkpoints happen on a regular, predictable schedule, allowing the load to be evenly spread throughout the normal operation of the system. Requested checkpoints are inherently unpredictable and thus can cause variations in performance by adding additional load when it is not expected. This is particularly disruptive on systems with slow I/O, such as those with magnetic drives or virtual machines with limits on IOPs.

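The system view pg_stat_checkpointer has two columns, num_timed and num_requested. They count the checkpoints triggered by timeout and the checkpoints triggered for any other reason. Clearly, the larger the share of num_timed and the smaller num_requested, the better.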
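To turn those two counters into a single number, you can compute the share of timed checkpoints. The helper below is a hypothetical sketch, not part of PostgreSQL; it assumes you have already read num_timed and num_requested from the view.

```c
/*
 * Hypothetical helper: percentage of checkpoints that were triggered by
 * timeout. Returns -1.0 when no checkpoint has been counted yet.
 */
double timed_checkpoint_percent(long num_timed, long num_requested)
{
    long total = num_timed + num_requested;

    if (total <= 0)
        return -1.0;

    return 100.0 * (double) num_timed / (double) total;
}
```

With 90 timed and 10 requested checkpoints the result is 90.0. A falling percentage over time suggests that checkpoints are increasingly being forced, for example by WAL volume, rather than arriving on schedule.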

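If you read the source of the checkpointer process, you will find that checkpoint triggers fall into two broad categories: one is a timeout, the other is a flag. Look up when the do_checkpoint variable in the CheckpointerMain() function in checkpointer.c becomes true, because the function contains the following code: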
```
if (do_checkpoint) /// a checkpoint is needed
{
    CreateCheckPoint(flags); /// the function that actually performs the checkpoint
}
```

xiaobu 5 months ago (reply #3)

checkpoint_timeout

The meaning of this parameter is easy to understand. The official documentation describes it as follows:

Maximum time between automatic WAL checkpoints. If this value is specified without units, it is taken as seconds. The valid range is between 30 seconds and one day. The default is five minutes (5min). Increasing this parameter can increase the amount of time needed for crash recovery. This parameter can only be set in the postgresql.conf file or on the server command line.

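Checkpoints are triggered in two ways. The first kind runs automatically at a fixed time interval. The second kind fires when some condition sets ckpt_flags in shared memory to a non-zero value. checkpoint_timeout governs the first kind: it is the time interval, 5 minutes by default, so a checkpoint starts at least once every five minutes.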

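If we read the checkpointer source, we see the following code. During startup, the checkpointer process records the current time in the variable last_checkpoint_time, which is an 8-byte signed integer.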
```
typedef int64 pg_time_t;
static pg_time_t last_checkpoint_time;
static pg_time_t last_xlog_switch_time;

last_checkpoint_time = last_xlog_switch_time = (pg_time_t) time(NULL); 
```
After finishing its initialization, the checkpointer process enters an infinite loop, for(;;), which contains the following logic:
```
bool		do_checkpoint = false;
pg_time_t	now;
int			elapsed_secs;

now = (pg_time_t) time(NULL); /// get the current time
elapsed_secs = now - last_checkpoint_time; /// seconds elapsed since the last checkpoint
if (elapsed_secs >= CheckPointTimeout) /// timed out, so trigger a checkpoint
{
	if (!do_checkpoint)
		chkpt_or_rstpt_timed = true;
	do_checkpoint = true;
	flags |= CHECKPOINT_CAUSE_TIME; /// set the flag bit: this checkpoint was caused by the timeout
}
```
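The logic above is not hard to follow. On every pass through the loop, the current time is read and last_checkpoint_time is subtracted from it, giving elapsed_secs in seconds. If this value is greater than or equal to CheckPointTimeout, do_checkpoint is set to true, meaning a checkpoint must be performed. The logic that follows is: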
```
		/*
		 * Do a checkpoint if requested.
		 */
		if (do_checkpoint) /// a checkpoint is needed
		{
			bool		ckpt_performed = false;
			bool		do_restartpoint;

			/* Check if we should perform a checkpoint or a restartpoint. */
			do_restartpoint = RecoveryInProgress(); /// true in recovery (standby) mode, false otherwise

			/*
			 * Do the checkpoint.
			 */
			if (!do_restartpoint)
			{
				CreateCheckPoint(flags);
				ckpt_performed = true;
			}
			else
				ckpt_performed = CreateRestartPoint(flags);

			if (ckpt_performed) /// true if a checkpoint or restartpoint was actually performed
			{
				/*
				 * Note we record the checkpoint start time not end time as
				 * last_checkpoint_time.  This is so that time-driven
				 * checkpoints happen at a predictable spacing.
				 */
				last_checkpoint_time = now;

				if (do_restartpoint)
					PendingCheckpointerStats.restartpoints_performed++;
			}
			/// (the rest of this block is omitted in this excerpt)
		}
```
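Once the checkpoint has finished, last_checkpoint_time is set to the value of now, which is the time at which this checkpoint started. On the next pass through the loop, the elapsed time between it and the new now is compared with CheckPointTimeout again to decide whether another checkpoint is due.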
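The timing logic can be reproduced in a small stand-alone simulation. This is a simplified sketch, not the checkpointer itself: the type and function names (`CheckpointSim`, `sim_init`, `sim_tick`, `sim_count_timed`) are invented for this example, the clock is passed in as a plain integer, and the loop ticks once per simulated second, whereas the real process sleeps and wakes on events.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int64_t sim_time_t;         /* stands in for pg_time_t */

typedef struct {
    sim_time_t last_checkpoint_time;
    int        timeout_secs;        /* stands in for CheckPointTimeout */
    int        timed_count;         /* checkpoints caused only by the timeout */
    int        requested_count;     /* checkpoints that were explicitly requested */
} CheckpointSim;

void sim_init(CheckpointSim *sim, sim_time_t start, int timeout_secs)
{
    sim->last_checkpoint_time = start;
    sim->timeout_secs = timeout_secs;
    sim->timed_count = 0;
    sim->requested_count = 0;
}

/* One pass through the loop. Returns true if a checkpoint was performed. */
bool sim_tick(CheckpointSim *sim, sim_time_t now, bool requested)
{
    bool do_checkpoint = requested;
    bool timed = false;
    int  elapsed_secs = (int) (now - sim->last_checkpoint_time);

    if (elapsed_secs >= sim->timeout_secs) {
        if (!do_checkpoint)
            timed = true;           /* counted as timed only if nothing requested it */
        do_checkpoint = true;
    }

    if (!do_checkpoint)
        return false;

    if (timed)
        sim->timed_count++;
    else
        sim->requested_count++;

    /* record the start time, so timed checkpoints stay evenly spaced */
    sim->last_checkpoint_time = now;
    return true;
}

/* Tick once per second for duration_secs with no explicit requests. */
int sim_count_timed(int timeout_secs, int duration_secs)
{
    CheckpointSim sim;

    sim_init(&sim, 0, timeout_secs);
    for (sim_time_t now = 1; now <= duration_secs; now++)
        sim_tick(&sim, now, false);
    return sim.timed_count;
}
```

With a 300-second timeout, `sim_count_timed(300, 3600)` counts 12 timed checkpoints in one simulated hour. Passing `requested = true` to `sim_tick` models the flag-driven path and also resets the timer, which is why a burst of requested checkpoints pushes the next timed one further out.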

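A checkpoint triggered by the timeout is called a timed checkpoint, and the larger the share of such checkpoints, the better. Depending on the PG version, you can see how many have occurred in the pg_stat_checkpointer system view (PostgreSQL 17 and later) or in pg_stat_bgwriter (earlier versions).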
