It would be a fun test to run. But I'm not encouraged by the fact that the existing brotli dictionary already contains a bunch of JavaScript-specific stuff:
brotli literally already has tokens for function/return/throw/indexOf(/.match/.length/etc.
Also, verifying after decompression is not without tradeoffs. On one hand, we have folks like GitHub who can't change the version of zlib because people rely on byte-identical .tar.gz output. https://hackernews.hn/item?id=34586917
On the other hand, there is a whole lot of iffy stuff you can do to make programs that decompress content use large amounts of resources (https://en.wikipedia.org/wiki/Zip_bomb), which makes "decompress this potentially untrusted file so that I can validate it's safe to use" hard.
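For what it's worth, the usual mitigation for the zip bomb side of this is to cap the decompressed size rather than trust the input. A minimal sketch using Python's zlib, where `bounded_decompress`, `OutputLimitExceeded` and the 1 MB limit are names and numbers I made up for illustration:

```python
import zlib


class OutputLimitExceeded(Exception):
    """Raised when a stream would expand past the allowed size."""


def bounded_decompress(data: bytes, max_output: int) -> bytes:
    """Decompress a zlib stream, refusing to produce more than max_output bytes."""
    d = zlib.decompressobj()
    # Ask for one byte more than the limit so "exactly at the limit"
    # can be told apart from "over the limit".
    out = d.decompress(data, max_output + 1)
    if len(out) > max_output:
        raise OutputLimitExceeded(f"output exceeds {max_output} bytes")
    if not d.eof:
        # Input ran out before the end-of-stream marker was seen.
        raise ValueError("truncated or incomplete zlib stream")
    return out


if __name__ == "__main__":
    bomb = zlib.compress(b"\0" * 10_000_000)  # tiny input, 10 MB output
    try:
        bounded_decompress(bomb, 1_000_000)
    except OutputLimitExceeded as exc:
        print("rejected:", exc)
```

This only bounds output size. It does nothing about CPU time or nested archives, so it is a partial answer at best.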
> brotli literally already has tokens for function/return/throw/indexOf(/.match/.length/etc.
Yeah, I see it already has a lot of JavaScript, HTML and CSS content. Interesting. I didn't realize it had an existing web-focused token library, and figured it was more like zstd, 7z and zlib, which I believe have none.
I would love to do the experiment if I had time. I wonder what the laziest way to do it would be?
https://gist.github.com/klauspost/2900d5ba6f9b65d69c8e