Thanks for the heads-up. I was scheduled to upgrade my PBS next weekend.
I'm also concerned. This appears to be serious. I've frozen upgrades as well.
This is the latest from the staff.
- My guess is that you are writing and pruning backups faster than the GC is able to keep up with. New backups might be created faster (using fast incremental mode) than GC phase 1 can keep up with due to the cache capacity limit. How big are such backup snapshots typically in your case? Are they in the TiB range? In previous versions new snapshot indices were not considered during GC, which could however lead to possibly not touching all chunks in very specific edge cases with long running GC and high frequency pruning setups.
- You can try and increase the gc-cache-capacity in the datastore tuning options to the maximum value of 8388608 and restart GC, see https://pbs.proxmox.com/docs/storage.html#tuning
- In general, you should consider adding a special device for such a storage setup, see https://pbs.proxmox.com/docs/sysadmin.html#zfs-special-device
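For anyone who wants to try the second suggestion from the CLI instead of the web UI, here is a rough sketch of what I think the command looks like. Treat it as an assumption on my part, not something from the staff post: the `proxmox-backup-manager datastore update <name> --tuning ...` form and the datastore name `store1` are my guesses, so check them against the linked tuning docs first. I also don't know whether passing `--tuning` replaces other tuning options you already have set, so verify that before running it. The script only builds and prints the command; it does not execute anything. The 32-byte figure assumes the cache stores SHA-256 chunk digests, so the real memory use is probably higher.

```python
import shlex

# Maximum value quoted in the staff reply: 8 * 1024 * 1024 = 8388608
GC_CACHE_CAPACITY_MAX = 8 * 1024 * 1024

# Assumption: each cached entry is at least one SHA-256 chunk digest (32 bytes)
DIGEST_SIZE_BYTES = 32


def build_tuning_command(datastore: str, capacity: int = GC_CACHE_CAPACITY_MAX) -> list[str]:
    """Build (but do not run) a CLI call that sets gc-cache-capacity.

    The command shape is an assumption; confirm it in the PBS docs.
    """
    if not 0 <= capacity <= GC_CACHE_CAPACITY_MAX:
        raise ValueError(f"capacity must be between 0 and {GC_CACHE_CAPACITY_MAX}")
    return [
        "proxmox-backup-manager", "datastore", "update", datastore,
        "--tuning", f"gc-cache-capacity={capacity}",
    ]


def min_digest_memory_mib(capacity: int) -> float:
    """Lower bound on cache memory: digests only, no bookkeeping overhead."""
    return capacity * DIGEST_SIZE_BYTES / (1024 * 1024)


if __name__ == "__main__":
    cmd = build_tuning_command("store1")  # "store1" is a placeholder name
    print(shlex.join(cmd))
    print(f"at least {min_digest_memory_mib(GC_CACHE_CAPACITY_MAX):.0f} MiB for digests alone")
```

So even at the maximum setting the digests themselves only need a few hundred MiB, which seems cheap compared to a GC run that never finishes.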
My interpretation is that this issue will affect setups with slower disks and larger backups.
The recommendation to add an SSD special vdev is an interesting aside. Like, are they actually pushing that now? That's some ZFS rocket surgery. (I like ZFS.)