You can explicitly set windowLog with --zstd=windowLog=...
It's sometimes useful to combine a low-ish compression level with a large window size, e.g. when the input data contains multiple similar large chunks that do not fit into the window a low compression level would pick by default.
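As a rough illustration of that combination, here is a small Python sketch that assembles such a command line. The helper name zstd_command and the file name backup.tar are made up for the example; the only zstd options used are the -<level> flag and the --zstd=windowLog=... option mentioned above. The value 27 is an arbitrary choice giving a 128 MiB window (2**27 bytes).

```python
import shlex


def zstd_command(path: str, level: int = 3, window_log: int = 27) -> list[str]:
    """Build a zstd invocation that pairs a low level with a large window.

    Hypothetical helper for illustration only. It just assembles the
    argument list; it does not run zstd.
    """
    return ["zstd", f"-{level}", f"--zstd=windowLog={window_log}", path]


if __name__ == "__main__":
    # Level 3 with a 2**27-byte (128 MiB) window.
    print(shlex.join(zstd_command("backup.tar")))
```

The resulting list can be passed to subprocess.run if zstd is installed.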
At work we've recently been using zstd as a better-compressing alternative to gzip, and overall I've been pretty happy with it. A minor documentation gripe, though, is that the behavior around multithreaded compression is a bit unclear. I understand it's chunking the work and sending chunks to different threads to parallelize the compression process, and this means that I should expect to see better use of threads on larger files because there are more chunks to spread around, but what is the relationship?
When I look in man zstd I see that you can set -B<num> to specify the size of the chunks, and it's documented as "generally 4 * windowSize". Except the documentation doesn't say how windowSize is set.

From a bit of poking at the source, it looks to me like the way this works is that windowSize is 2**windowLog, and windowLog depends on your compression level. If I know I'm doing zstd -15, though, how does compressionLevel=15 translate into a value for windowLog? There's a table in lib/compress/clevels.h which covers inputs >256KB. See the source if you're interested in other sizes.
So it looks like windowSize is:

≤1: 524k
2: 1M
3-8 (default): 2M
9-16: 4M
17-19: 8M
20: 32M
21: 64M
22: 128M

Probably best not to rely on any of this, but it's good to know what zstd -<level> is doing by default!
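To make the relationship between level, window, and chunk count concrete, here is a minimal Python sketch. It assumes the window sizes listed above (written as powers of two, so 2M is 2**21) and takes the man page's "generally 4 * windowSize" at face value for the chunk size. The function names are mine, the table mirrors this post's list rather than the zstd source, and real zstd may apply minimums or other adjustments that this ignores.

```python
# windowLog by compression level for inputs >256KB, as implied by the
# window sizes listed in this post. Each entry is (highest level, windowLog).
# Assumption: this mirrors the post's list, not the zstd source itself.
_WINDOW_LOG_BY_MAX_LEVEL = [
    (1, 19),   # <=1:   524k
    (2, 20),   # 2:     1M
    (8, 21),   # 3-8:   2M
    (16, 22),  # 9-16:  4M
    (19, 23),  # 17-19: 8M
    (20, 25),  # 20:    32M
    (21, 26),  # 21:    64M
    (22, 27),  # 22:    128M
]


def window_log_for_level(level: int) -> int:
    """Return the default windowLog for a level, per the list above."""
    for max_level, window_log in _WINDOW_LOG_BY_MAX_LEVEL:
        if level <= max_level:
            return window_log
    raise ValueError(f"level {level} is above the maximum of 22")


def window_size(window_log: int) -> int:
    """windowSize is 2**windowLog bytes."""
    return 2 ** window_log


def default_job_size(window_log: int) -> int:
    """Chunk size, assuming the documented 'generally 4 * windowSize'."""
    return 4 * window_size(window_log)


def estimated_jobs(file_size: int, level: int) -> int:
    """Rough number of chunks available to spread across threads."""
    job_size = default_job_size(window_log_for_level(level))
    return -(-file_size // job_size)  # ceiling division


if __name__ == "__main__":
    one_gib = 1 << 30
    for level in (1, 3, 15, 19, 22):
        wlog = window_log_for_level(level)
        print(level, wlog, window_size(wlog), estimated_jobs(one_gib, level))
```

Under these assumptions a 1 GiB file at the default level splits into 128 chunks of 8 MiB, while at level 22 it splits into only 2 chunks of 512 MiB, which would explain why high levels make poor use of many threads on anything but very large inputs.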