Optimizing file size

From Unvanquished

Brain dump

This article is a brain dump: its content was written quickly so as not to lose precious knowledge, but it still has to be rewritten properly and maybe moved to more appropriate pages.

These are tricks that can be used to save file space in released archives, packages and repositories.

Binaries

Extracting debug symbols from binaries into separate files makes the binary smaller in memory and faster to extract from the archive storing it. For example, if a mod ships some .nexe files, the symbols can be saved in a separate file compressed with a better algorithm (for example .7z) and stored in the same zip as the .nexe files. Since the engine never has to load the symbols, this saves file space, game memory usage and binary extraction time.

Images

XCF

XCF files to be stored in repositories can be compressed to xz. GIMP knows how to open and save .xcf.xz files.

One can simply type the .xcf.xz file extension in the GIMP save dialog when saving the file, or compress an existing uncompressed .xcf file with the xz tool.

This compression is better than the builtin compression checkbox available in GIMP's save dialog (the builtin one is believed to use zlib and is less efficient). Also, .xcf.xz files can simply be decompressed with the xz tool to load them in third-party tools that may not support .xcf.xz.

Recent GIMP versions offer an option to compress some internal .xcf data but it compresses less than .xcf.xz.

To compress an uncompressed .xcf file with maximum compression, one can do:

 xz -9 file.xcf
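
When many files have to be processed from a script, the same compression can be sketched with Python's standard lzma module. This is only a minimal sketch: the compress_xcf name is hypothetical, and preset 9 is assumed to be an acceptable stand-in for xz -9. Unlike the xz tool, it keeps the original file in place.

```python
import lzma
from pathlib import Path


def compress_xcf(path):
    """Write '<name>.xcf.xz' next to an .xcf file and return the new path.

    Hypothetical helper: preset=9 is assumed to mirror `xz -9`.
    The source file is left in place, unlike the xz tool's default.
    """
    source = Path(path)
    target = source.with_name(source.name + ".xz")
    data = source.read_bytes()
    # FORMAT_XZ produces a regular .xz container, readable by the xz tool.
    target.write_bytes(lzma.compress(data, format=lzma.FORMAT_XZ, preset=9))
    return target
```

Calling compress_xcf("file.xcf") writes file.xcf.xz in the same directory.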

WebP

To convert a PNG to truly lossless WebP with maximum compression, one can do:

 cwebp -mt -exact -lossless -z 9 file.png -o file.webp

The -exact option is mandatory, otherwise data may still be modified even though the -lossless option is used. For example, in a fully transparent image where no RGB color is visible, the RGB values may be destroyed without this option, because the result is still bit-for-bit identical once the transparency is applied. But in a game, image channels can store many kinds of data other than colors, and in a repository we don't want to lose any data at all.
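
The difference between "renders the same" and "stores the same data" can be illustrated with a small sketch. The pixel values and the composite helper below are made up for the illustration, and the blend is the usual alpha blend over an opaque background, not anything taken from cwebp itself.

```python
def composite(pixel, background):
    """Blend one RGBA pixel (0-255 per channel) over an opaque RGB background."""
    r, g, b, a = pixel
    return tuple(
        (channel * a + back * (255 - a)) // 255
        for channel, back in zip((r, g, b), background)
    )


# Two fully transparent pixels holding different RGB data.
original = (200, 30, 90, 0)
zeroed = (0, 0, 0, 0)
grey = (128, 128, 128)

# Both render identically once the transparency is applied...
same_render = composite(original, grey) == composite(zeroed, grey)
# ...yet the stored bytes differ, which is the data -exact preserves.
same_bytes = bytes(original) == bytes(zeroed)
```

Here same_render is True while same_bytes is False: an encoder that only cares about the rendered result is free to replace the first pixel with the second.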

PNG

It's now recommended to use WebP instead of PNG anyway.

PNG images can be compressed further by removing some metadata (like the metadata telling which tool produced the file), and tools like oxipng can be used to recompress the PNG using the best PNG storage profile. PNG efficiency doesn't really come from the compression algorithm (it is no better than zip's), but from the various storage profiles and the way RGB, palettes and alpha channels are encoded or left out.

The pngwolf tool uses a genetic algorithm to find filter combinations that produce files that compress well.
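
To give an idea of why the filter choice matters, the sketch below applies the PNG Sub filter (each byte is replaced by its difference with the byte to its left) to a synthetic greyscale gradient and compresses both variants with zlib. The image data and helper names are invented for the illustration; this is not what pngwolf does internally, only the kind of effect it searches for.

```python
import zlib

WIDTH, HEIGHT = 256, 64


def gradient_row():
    """One row of an 8-bit greyscale horizontal gradient (synthetic data)."""
    return bytes(range(WIDTH))


def sub_filter(row):
    """PNG filter type 1 (Sub) for 1 byte per pixel: delta with the left byte."""
    return bytes((row[x] - (row[x - 1] if x else 0)) % 256 for x in range(len(row)))


def scanlines(filter_type, transform):
    """Prefix every row with its filter type byte, as PNG scanlines do."""
    row = transform(gradient_row())
    return (bytes([filter_type]) + row) * HEIGHT


unfiltered = zlib.compress(scanlines(0, lambda row: row), 9)
filtered = zlib.compress(scanlines(1, sub_filter), 9)

# The filtered stream is mostly runs of the value 1, so it compresses better.
print(len(unfiltered), len(filtered))
```

On this gradient the Sub-filtered data compresses to fewer bytes than the unfiltered data with the very same zlib settings. Real images need a different filter per row, which is the search space tools like pngwolf explore.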

Then, the internal zlib compression can be optimized using the Zopfli algorithm, like when using the pngwolf-zopfli tool which adds zopfli-optimized zlib compression over the storage optimization of pngwolf. The oxipng tool also supports zopfli but is not as optimal as pngwolf-zopfli.

Since those tools keep the original file if they fail to compress it further, and sometimes one does better than the other, it is a good idea to first strip useless metadata from the PNG, then recompress it with oxipng, then recompress it with pngwolf-zopfli to produce the smallest possible files.

Existing TGA files should be converted to optimized PNG before being saved in repositories, unless they are very small.

TGA

TGA is a lossless image format. It is uncompressed but the format is very terse on metadata, and the ID field itself can be omitted. Since this format is supported both by production tools like NetRadiant and q3map2 and by the engine, it is the best format to store 1-pixel color images, both in repositories and in DPK archives for distribution. Without the optional ID field, an opaque 1-pixel RGB color image weighs only 21 bytes and a translucent 1-pixel RGBA color image weighs only 22 bytes.
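
The 21 and 22 byte figures come from the 18-byte TGA header followed by 3 or 4 bytes of pixel data. As a minimal sketch, such a file can be built by hand with Python's struct module; the one_pixel_tga name is hypothetical, and the header values assume an uncompressed true-color image with no color map.

```python
import struct


def one_pixel_tga(red, green, blue, alpha=None):
    """Return the bytes of an uncompressed 1x1 TGA without ID field.

    Hypothetical helper for illustration. With alpha=None the image is
    24-bit RGB, otherwise 32-bit RGBA.
    """
    depth = 24 if alpha is None else 32
    alpha_bits = 0 if alpha is None else 8
    header = struct.pack(
        "<BBBHHBHHHHBB",
        0,           # ID length: 0 means the optional ID field is omitted
        0,           # color map type: none
        2,           # image type: uncompressed true-color
        0, 0, 0,     # color map specification (unused)
        0, 0,        # X and Y origin
        1, 1,        # width and height
        depth,       # bits per pixel
        alpha_bits,  # image descriptor: number of alpha bits
    )
    # TGA stores pixels in blue, green, red(, alpha) order.
    pixel = bytes([blue, green, red] + ([] if alpha is None else [alpha]))
    return header + pixel
```

With this helper, len(one_pixel_tga(255, 0, 0)) is 21 and len(one_pixel_tga(255, 0, 0, 128)) is 22.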

JPG

A tool similar to zopfli exists for .jpg files. It is named guetzli, but there is no reason to use it.

The reason is that JPEG is already a lossy format, so recompressing a JPG to a JPG would just degrade the image further while keeping the other JPG limitations.

Lossy JPG files are recommended to be recompressed to the CRN format instead, as CRN is better for GPU memory usage. Since we recommend CRN whenever a JPG is recompressed to another lossy format, there is no reason to convert JPG to JPG.

For lossless images like PNG or TGA that have to be recompressed into a lossy format, prefer compressing into the CRN format instead of JPG.

So we have no use for guetzli in the game. It may be of some use for web resources though.

CRN

For CRN normal maps without height maps, it is better to use the -dxn option, which shifts some channels to make lossy compression more efficient. This option reuses the alpha channel for non-alpha data (destroying the alpha channel in the process), which is why the -dxn option must only be used with normal maps that have no height map in the alpha channel. For normal maps with a height map in the alpha channel, the standard crunch compression should be used instead.

Audio

FLAC

FLAC is used to store lossless audio files in repositories. To convert a WAV to FLAC with maximum compression, one can do:

 flac --exhaustive-model-search --best --no-padding --no-seektable file.wav --output-name=file.flac

The seek table is not required to seek within a FLAC file, so we can drop it.

Models

Blender supports a compressed .blend variant that is just a .blend file compressed with gzip to produce a .blend.gz file renamed as .blend again.

To compress an uncompressed .blend file with maximum compression, one can do:

 gzip -9 file.blend
 mv file.blend.gz file.blend
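
The same two steps can be sketched in Python with the standard gzip module. The compress_blend name is hypothetical; the check on the gzip magic bytes is an addition meant to avoid compressing an already compressed file twice.

```python
import gzip
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"


def compress_blend(path):
    """Gzip a .blend file in place, keeping its name. Returns True if rewritten."""
    blend = Path(path)
    data = blend.read_bytes()
    if data.startswith(GZIP_MAGIC):
        # Already gzip-compressed: compressing twice would only waste space.
        return False
    # mtime=0 keeps the output independent of when the script is run.
    blend.write_bytes(gzip.compress(data, compresslevel=9, mtime=0))
    return True
```

Running compress_blend("file.blend") a second time on the same file leaves it untouched and returns False.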

Zip

It's possible to store files in a zip without storing entries for the parent directories; this saves zip file space at a negligible cost. It means directory permissions and modification dates will not be stored, but those are usually useless.

It's possible to set all the files to the same date, time and permission. This will probably not reduce the size of the zip file itself, but it will reduce the size of any zip file containing it. So, for example, setting all the files to the same date, time and permission when creating a .dpk archive and an engine zip will reduce the size of the release zip containing all of them.
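
Both ideas (no directory entries, identical date, time and permission on every file) can be sketched with Python's standard zipfile module. The function name, the fixed date and the 0644 permission are arbitrary choices made for the example, not values used by the project's own tooling.

```python
import zipfile
from pathlib import Path

FIXED_DATE = (1980, 1, 1, 0, 0, 0)  # earliest date a zip entry can store
FIXED_MODE = 0o644                  # same permission for every file


def write_normalized_zip(source_dir, archive_path):
    """Zip the files under source_dir with no directory entries and
    identical date, time and permission on every entry."""
    root = Path(source_dir)
    # Only regular files are listed, so no directory entry is ever written.
    files = sorted(p for p in root.rglob("*") if p.is_file())
    with zipfile.ZipFile(archive_path, "w") as archive:
        for path in files:
            info = zipfile.ZipInfo(path.relative_to(root).as_posix(), FIXED_DATE)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = (0o100000 | FIXED_MODE) << 16  # regular file
            archive.writestr(info, path.read_bytes(), compresslevel=9)
```

The resulting archive can still be recompressed afterwards with the tools described in this section, since only the metadata has been normalized.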

The best zip compressor is 7zip in zip mode. 7z a -tzip -mx=9 <archive> <file> produces good results, and it's possible to achieve more compression by raising other options, like 7z a -tzip -mx=9 -mfb=258 -mpass=15 <archive> <file>. By default, 7z a -tzip <archive> <file> is equivalent to 7z a -tzip -mx=5 -mfb=32 -mpass=1 <archive> <file>.

It's then possible to produce smaller zip files using zopfli-enabled compressors like advzip. It's still a good idea to compress with 7zip before recompressing with advzip, because 7z is really efficient and may sometimes produce smaller files than zopfli; advzip will then just keep the smallest compression for each file.

To recompress a zip with advzip and zopfli, one can do advzip -z -4 <archive>. Unfortunately, advzip processes every file in the archive sequentially. To work around this, one may compress every file into a separate zip, optimize every zip (which can then be done in parallel), then produce the final zip with zipmerge.