The Need for External Compression Methods in PostgreSQL

Translator: bzhaoopenstack
Author: Amit Dattatray Khandekar
Original post: https://amitdkhan-pg.blogspot.com/2020/08/need-for-external-compression-methods.html

PostgreSQL expert Amit analyzes the pressing need for external compression libraries in PostgreSQL.

Every modern database system has some way to compress its data at some level. The obvious reason for this feature is to reduce the size of the database, especially in today's world where data is growing exponentially. The less obvious reason is to improve query performance; the idea is: smaller data size means fewer data pages to scan, which means less disk I/O and faster data access. So, in any case, data decompression should be fast enough so as not to hamper query performance, if not improve it.

Compression is offered at different levels: page compression, row compression, column compression, etc. Columnar databases have the advantage of a very high compression ratio for a column, thanks to the repetitive patterns present in contiguous data within a column. Another case is when, in a row-oriented database, the column values are so large that it makes sense to compress individual values of the column. Such values can even be kept separately if they do not fit in a single page, with the row holding pointers to the out-of-line compressed data. In PostgreSQL, this technique is called TOAST (The Oversized-Attribute Storage Technique): for columns that can contain variable-length data, the data is transparently compressed and stored in the same row, or else, if it is still too large, it is stored in smaller chunks as rows in a separate table called a TOAST table, where these chunks themselves may or may not be compressed.
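
As a small aside, you can already observe and tune TOAST behavior with existing commands; a minimal sketch (the table and column names here are hypothetical):

-- Find the TOAST table backing a relation (mytable is a placeholder name):
SELECT reltoastrelid::regclass AS toast_table
FROM pg_class WHERE relname = 'mytable';

-- Store this column out of line without compressing it:
ALTER TABLE mytable ALTER COLUMN payload SET STORAGE EXTERNAL;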

Compression is also offered for different purposes, and it is not restricted to data compression alone. For example, in a replication system, the transfer of redo logs from the primary to the standby can become a huge network bottleneck, so many RDBMSs offer to compress redo logs.
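
PostgreSQL itself has a setting in this spirit: wal_compression compresses full-page images written to the WAL (PostgreSQL's redo log). A minimal sketch of enabling it (whether it pays off depends on the workload):

-- Compress full-page writes in the WAL; picked up on configuration reload:
ALTER SYSTEM SET wal_compression = on;
SELECT pg_reload_conf();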

And then there are the compression algorithms that an RDBMS uses, or offers as options. This applies especially to data compression. Since the data is the user's data, a specific pattern in the user data might suit a particular compression algorithm, while a different pattern might be suitable for another. Moreover, this implies that it would be far more beneficial if the RDBMS offered a choice of a specific compression algorithm for a specific column, or a specific user-defined type, out of a list of well-known standard compression libraries such as zlib, lz4, zstd, snappy, gzip, etc. Or the algorithm may very well be a completely customized one.

Secondly, there have been a lot of advancements in optimizing compression algorithms for specific platforms, and in providing hardware accelerators for compression, encryption, and SIMD that are closely coupled to CPU cores, which can then be leveraged by compression or encryption algorithms. One such example is the Kunpeng Zlib Acceleration Engine, which offers hardware-enabled infrastructure for compression on the "Kunpeng 920" ARM64 processor. I haven't had a chance to test this capability, but it does sound promising.

Furthermore, compression/encryption algorithms inherently do repetitive work over the data, which is a natural fit for SIMD vectorization. There have been independent projects on both ARM64 and Intel to make such platform-specific enhancements to well-known libraries like zlib, lz4, etc. Check out the NEON Intrinsics case study that optimizes zlib's adler-32 algorithm using NEON intrinsics.

All this points directly to an urgent need for RDBMS servers to give users a choice of specific native compression algorithms/libraries for specific tables or specific columns. As of this writing, PostgreSQL uses its own built-in LZ-based compression algorithm for TOAST table compression. Imagine if there were an interface to select zlib instead of the built-in algorithm. Further, to select the zlib compression level. Still further, an interface for users to create an extension that uses a customized algorithm, native to a specific platform, that uses hardware acceleration.

Well, there is exactly such a proposed feature in the making. Check out this discussion thread in the PostgreSQL hackers community. It may have a long way to go (as of this writing), but I am very hopeful of this feature going in, because the use cases are strong enough, as shown above, there are no fundamental objections to this functionality, and work-in-progress patches have been submitted.

I went ahead, applied this patch, and played around with it. Roughly, below is what the interface looks like. After the patch set fully materializes, the interface might be different, but I think its essence would remain more or less the same. Below is the output of my tests; please note that it is just to emphasize, with examples, how cool and useful this feature would be, and to make sense of what I explained above in this blog.

CREATE TABLE zlibtab(t TEXT COMPRESSION zlib WITH (level '4'));
CREATE TABLE lztab(t TEXT);
ALTER TABLE lztab ALTER COLUMN t SET COMPRESSION pglz;

pgg:s2:pg$ time psql -c "\copy zlibtab from text.data"
COPY 13050

real 0m1.344s
user 0m0.031s
sys 0m0.026s

pgg:s2:pg$ time psql -c "\copy lztab from text.data"
COPY 13050

real 0m2.088s
user 0m0.008s
sys 0m0.050s

pgg:s2:pg$ time psql -c "select pg_table_size('zlibtab'::regclass), pg_table_size('lztab'::regclass)"
 pg_table_size | pg_table_size
---------------+---------------
       1261568 |       1687552

pgg:s2:pg$ time psql -c "select NULL from zlibtab where t like '0000'" > /dev/null

real 0m0.127s
user 0m0.000s
sys 0m0.002s

pgg:s2:pg$ time psql -c "select NULL from lztab where t like '0000'" > /dev/null

real 0m0.050s
user 0m0.002s
sys 0m0.000s

Notice how the two compression algorithms differ in compressed size, in the speed of inserting data (compression), and in the speed of selecting data (decompression).
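
If you want to dig into such comparisons at the level of individual values, the existing pg_column_size() function reports the stored (possibly compressed) size of a value, which can be set against its uncompressed length:

-- Stored (compressed) size versus raw length, for a few sample rows:
SELECT pg_column_size(t) AS stored_bytes, octet_length(t) AS raw_bytes
FROM zlibtab LIMIT 5;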

You would even be able to create a new compression access method, in the same way we create a new index access method:

CREATE ACCESS METHOD pglz1 TYPE COMPRESSION HANDLER my_compression_handler;
where my_compression_handler should be a PostgreSQL C function that could be created in a PostgreSQL extension. This function assigns its own implementation functions to a set of pre-defined hooks that define everything the PostgreSQL core needs to know to make use of the compression access method:

Datum
my_compression_handler(PG_FUNCTION_ARGS)
{
    CompressionAmRoutine *routine = makeNode(CompressionAmRoutine);

    /* Wire up this method's implementations of the pre-defined hooks. */
    routine->cmcheck = my_cmcheck;
    routine->cminitstate = my_cminitstate;
    routine->cmcompress = my_cmcompress;
    routine->cmdecompress = my_cmdecompress;
    routine->cmdecompress_slice = NULL;   /* optional hook; not provided here */

    PG_RETURN_POINTER(routine);
}

This is PostgreSQL's way of being highly extensible: allow users to use the built-in methods, but also provide a way for them to define their own methods for doing the same job. All the above functions would live inside a PostgreSQL extension, which could be created using:
CREATE EXTENSION my_compression;
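
For completeness, here is a sketch of what the extension's SQL script might contain. The compression_am_handler return type and the exact declaration are my assumptions, modeled on how index access method handlers are declared; the final patch may well differ:

-- my_compression--1.0.sql (hypothetical): declare the C handler, then the access method.
CREATE FUNCTION my_compression_handler(internal)
RETURNS compression_am_handler
AS 'MODULE_PATHNAME', 'my_compression_handler'
LANGUAGE C STRICT;

CREATE ACCESS METHOD my_compression TYPE COMPRESSION HANDLER my_compression_handler;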
