Zram and zswap: choosing compression for constrained systems

I've built systems that swap differently depending on what dies first: the CPU or the storage. Zram and zswap solve adjacent problems, and picking the wrong one costs you either write cycles or latency.

The kernel handles memory pressure differently depending on which compression layer you put in front of swap. Zram and zswap solve adjacent problems but are not interchangeable, and running them together without care produces a configuration that silently compresses pages twice.

How the kernel routes pages under each approach

Zram presents a fixed-size compressed block device. When the kernel decides a page should be swapped out, it writes that page to /dev/zram0 (or whichever device is active), where it is compressed and held in RAM until it is read back or discarded. There is no automatic disk fallback: if the zram device fills, the kernel either OOM-kills processes or starts using any other lower-priority swap target you have configured. Zram has no automatic write-back path of its own (recent kernels offer optional write-back to a backing device via CONFIG_ZRAM_WRITEBACK, but it must be configured and triggered explicitly; nothing spills over on its own).
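A zram-only setup reduces to a short root shell sequence. This is an illustrative sketch, assuming the zram module is available and picking a 4 GiB device size; adjust both to your host:

```shell
# Illustrative zram-only setup (run as root; device size is an assumption).
modprobe zram                                  # creates /dev/zram0
echo lz4 > /sys/block/zram0/comp_algorithm     # must precede disksize
echo 4G  > /sys/block/zram0/disksize           # uncompressed capacity
mkswap /dev/zram0
swapon -p 100 /dev/zram0                       # higher priority than any disk swap
```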

Zswap works as a write-back cache sitting in front of a backing swap device, a partition or swapfile on real storage. Pages heading to disk are intercepted, compressed, and held in a dynamically sized RAM pool capped at zswap.max_pool_percent of physical memory. When that pool fills, the coldest pages (LRU order) are written out in their compressed form to the backing device to make room. Zswap therefore never blocks on a full pool; it degrades gracefully by spilling to disk.
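The live zswap settings, including the pool cap, are exposed as module parameters and can be dumped in one line (the path assumes the zswap module is built in or loaded):

```shell
# Print every zswap parameter and its current value, one "file:value" pair per line.
grep -H . /sys/module/zswap/parameters/*
```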

The double-compression trap

If zswap is enabled and zram is configured as a swap device, pages evicted from the zswap pool go to disk via zram. Zswap compresses the page once; zram compresses it again on write. The second compression pass burns CPU cycles for no gain, since the data is already a compressed stream and will not shrink further. On a single-core or low-clock VM, that overhead is measurable. Disable zswap before using zram as your sole swap target:

zswap.enabled=0

Set this as a kernel parameter in /etc/default/grub or, on a systemd-boot host, in the loader entry under /boot/loader/entries/.
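On a GRUB host this means appending to the existing command line; the file below is the stock Debian/Ubuntu location, and the cat afterwards confirms the setting took effect once you reboot:

```ini
# /etc/default/grub — append to the existing line, then run update-grub:
GRUB_CMDLINE_LINUX_DEFAULT="quiet zswap.enabled=0"
```

After rebooting, `cat /sys/module/zswap/parameters/enabled` should print N.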

LZ4 versus zstd

LZ4 compresses and decompresses faster than zstd but produces larger output. On a single-core VM running at 1–2 GHz, LZ4 is the better choice: the CPU cost of zstd at peak swap pressure can introduce visible latency. On a host with spare CPU headroom, zstd’s better ratio means fewer pages hit the backing device. For zswap, the upstream default compressor is lzo, though some distribution kernels default to zstd; set zswap.compressor=lz4 at boot if you are CPU-constrained. For zram, set it per-device:

echo lz4 > /sys/block/zram0/comp_algorithm

Do this before setting disksize and running mkswap. Once the device is initialized, the kernel rejects algorithm changes until you reset it.

vm.swappiness and zram priority

When zram and a disk-backed swap coexist, vm.swappiness controls how aggressively the kernel reclaims anonymous pages at all. Priority on the swap devices controls which target receives pages when the kernel does decide to swap. Set zram to a higher priority than any disk swap so that pages go to compressed RAM before touching storage:

swapon -p 100 /dev/zram0
swapon -p 10 /swapfile

With vm.swappiness=100 on a RAM-constrained host you get maximum use of zram before disk is touched. Drop it to 60 on hosts where you want the kernel to be less aggressive about reclaiming anonymous memory in favour of keeping the page cache warm.
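To make the sysctl survive reboots, drop it into /etc/sysctl.d/; the filename here is illustrative:

```ini
# /etc/sysctl.d/99-swappiness.conf (illustrative name)
vm.swappiness = 100
```

Apply without rebooting via sysctl --system.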

Sizing for your actual setup

Under 2 GB RAM

Size zram at twice physical memory and disable zswap entirely. A 2:1 ratio is safe for typical workloads; real compression ratios on most process memory hover between 2:1 and 3:1 with LZ4. Disable zswap via the kernel command line, not just by not loading the module, because some distributions enable it by default in the compiled kernel config.
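The worst case is worth computing: a full zram device at ratio r occupies disksize / r of real RAM. A sketch of the arithmetic, assuming a 4 GiB device and a pessimistic 2:1 ratio:

```shell
# RAM consumed if a 4 GiB zram device fills completely at an assumed 2:1 ratio.
awk 'BEGIN { disksize=4; ratio=2; printf "%.1f GiB of RAM\n", disksize / ratio }'
```

On a 2 GB host that is the entire machine, which is why the device filling should stay a rare event; at a more typical 3:1 the same full device costs about 1.3 GiB.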

With systemd-zram-generator, add /etc/systemd/zram-generator.conf:

```ini
[zram0]
zram-size = ram * 2
compression-algorithm = lz4
swap-priority = 100
```

Reload with systemctl daemon-reload then systemctl start systemd-zram-setup@zram0.service.

2 GB to 8 GB with NVMe backing

Zswap at 20 percent of physical memory with zstd and a swapfile as the backing device works well here. The pool absorbs short bursts; the NVMe catches the overflow without the write latency of eMMC or SD. Configure the kernel parameters:

zswap.enabled=1
zswap.compressor=zstd
zswap.max_pool_percent=20
zswap.zpool=zsmalloc

Create a swapfile at 2–4 GB and activate it at low priority (swapon -p 10 /swapfile). The swapfile exists purely as overflow; under normal pressure it should stay near empty.
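Creating the overflow swapfile is the standard sequence; the 4 GiB size and the path are assumptions, and on btrfs extra steps (chattr +C on the file) apply:

```shell
# Illustrative overflow swapfile (run as root; size and path are assumptions).
fallocate -l 4G /swapfile    # use dd if=/dev/zero where fallocate is unsupported
chmod 600 /swapfile          # swapon refuses world-readable swapfiles on some distros
mkswap /swapfile
swapon -p 10 /swapfile       # low priority: overflow only
```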

Flash storage and SD card hosts

On Raspberry Pi nodes, eMMC-backed Proxmox hosts, or any host where the backing storage has a finite write endurance, keep pages in compressed RAM and avoid touching the device at all. Use zram-only, set zswap.enabled=0, and size the zram device conservatively enough that OOM-kill is a last resort rather than a routine event. A Pi 4 with 4 GB RAM is fine with a 4 GB zram device using lz4; 8 GB zram on a 4 GB host is asking for thrash.

Proxmox and KVM guests

On the Proxmox host itself, zswap with zsmalloc and zstd is sensible if you have NVMe backing. For individual KVM guests, set swappiness per-VM by writing to /proc/sys/vm/swappiness at guest boot, or drop a file in /etc/sysctl.d/ inside the guest:

vm.swappiness = 100

Size zram inside the guest relative to the guest’s allocated vRAM, not the host’s physical RAM. A guest with 1 GB vRAM and a 2 GB zram device and no disk swap is a clean, low-latency configuration for light workloads.

Checking what is actually happening

Run zramctl with no arguments to see active devices, their algorithm, disk size, compressed data size, and memory consumed:

zramctl

The DATA column shows uncompressed bytes stored; COMPR shows the compressed bytes in RAM. A ratio of DATA/COMPR below 1.5 on LZ4 suggests either the workload produces mostly incompressible data or the device is close to empty and the sample is not representative.
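The ratio can be computed directly from zramctl’s machine-readable output; the column names are the standard util-linux ones:

```shell
# Print DATA/COMPR per active zram device (0 means the device is empty).
zramctl --output NAME,DATA,COMPR --bytes --noheadings |
  awk '{ printf "%s ratio %.2f\n", $1, ($3 > 0 ? $2 / $3 : 0) }'
```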

For zswap, read the counters from the debugfs interface:

grep -r . /sys/kernel/debug/zswap/

Key values: pool_total_size is bytes currently in the compressed pool; stored_pages is the page count; written_back_pages shows how many pages have been evicted to the backing device. If written_back_pages is climbing under normal load with a 20 percent pool, raise zswap.max_pool_percent or switch to zstd to fit more into the same pool budget.
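One derived number worth watching is the average pool RAM per stored page, pool_total_size / stored_pages; values approaching 4096 mean the pool is holding poorly compressible data. A sketch of the calculation (reading debugfs requires root):

```shell
# Average bytes of compressed-pool RAM consumed per stored page.
awk 'FNR==1 && NR==1 { pool=$1 } FNR==1 && NR==2 { pages=$1 }
     END { if (pages > 0) printf "%.0f bytes/page\n", pool / pages }' \
  /sys/kernel/debug/zswap/pool_total_size /sys/kernel/debug/zswap/stored_pages
```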

If written_back_pages is zero and stored_pages is healthy, the configuration is doing exactly what it should: absorbing swap pressure in RAM and never touching disk.