Reduce package size | Ask WWDC

Several WWDC sessions cover techniques and workflows for reducing package size, particularly for machine learning models and 3D assets.

Machine Learning Model Compression

In the session "Bring your machine learning and AI models to Apple silicon," several techniques for reducing model size were discussed:

Palettization: This technique clusters weights with similar values and represents them using cluster centroids stored in a lookup table. Each weight is then stored as a low-bit index into that table, which can significantly reduce the model size.
Quantization: This maps float weight values onto an integer range. The integer weights are stored together with the quantization parameters needed to map them back to floats, which reduces the model size while aiming to preserve accuracy.
Pruning: This technique sets the smallest weight values to zero and stores only the non-zero values plus a bitmask, so the model weights pack efficiently.
Sparse Palettization and Quantization: These modes allow further compression by combining sparsity with palettization or quantization.
Post-Training Compression with Calibration Data: This new workflow sits between the data-free and fine-tuning approaches. It needs only a limited amount of data for calibration and does not require fine-tuning.
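
The three basic ideas above can be sketched in a few lines of plain Python. This is only a toy illustration on a flat list of weights, not the session's implementation or any Apple API: the function names (quantize_linear, palettize, prune, and their inverses) are hypothetical, the quantization parameters are assumed to be a scale and an offset, and palettization is assumed to use a simple 1-D k-means.

```python
# Toy sketches of three weight-compression ideas. All names are hypothetical
# and chosen for illustration; this is not an Apple or Core ML API.

def quantize_linear(weights, bits=8):
    """Map floats onto unsigned integer codes plus a scale and an offset."""
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo

def dequantize_linear(codes, scale, offset):
    """Map integer codes back to approximate float weights."""
    return [c * scale + offset for c in codes]

def nearest(value, centroids):
    """Index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda j: abs(value - centroids[j]))

def palettize(weights, bits=2, iterations=10):
    """Cluster weights with 1-D k-means; return (lookup table, indices)."""
    k = 1 << bits
    lo, hi = min(weights), max(weights)
    # Assumption: start with centroids spread evenly over the value range.
    centroids = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    for _ in range(iterations):
        indices = [nearest(w, centroids) for w in weights]
        for j in range(k):
            members = [w for w, i in zip(weights, indices) if i == j]
            if members:  # leave empty clusters where they are
                centroids[j] = sum(members) / len(members)
    indices = [nearest(w, centroids) for w in weights]
    return centroids, indices

def depalettize(centroids, indices):
    """Rebuild approximate weights from the lookup table."""
    return [centroids[i] for i in indices]

def prune(weights, sparsity=0.5):
    """Drop the smallest-magnitude weights; return (non-zero values, bitmask)."""
    n_zero = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    dropped = set(order[:n_zero])
    mask = [0 if i in dropped else 1 for i in range(len(weights))]
    values = [w for i, w in enumerate(weights) if i not in dropped]
    return values, mask

def unprune(values, mask):
    """Rebuild the dense weight list, filling pruned slots with zero."""
    remaining = iter(values)
    return [next(remaining) if bit else 0.0 for bit in mask]
```

In this sketch the storage saving comes from what is kept per weight: an integer code for quantization, a low-bit table index for palettization, and one mask bit plus the surviving values for pruning. A sparse mode could, under the same assumptions, apply palettize or quantize_linear to the values returned by prune.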

For more details, refer to the session "Bring your machine learning and AI models to Apple silicon" (02:47).

3D Asset Optimization

In the session "Optimize your 3D assets for spatial computing," texture packing was discussed as a method to reduce the size of 3D assets:

Texture Packing: This involves combining texture data from separate files into one larger file by utilizing different channels of a color texture. For example, roughness, metallic, and ambient occlusion textures can be combined into a single RGB texture file, reducing the total size of a PBR asset by up to 40%.
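
The channel-packing idea can be sketched as follows. This is a minimal illustration on in-memory pixel grids rather than real image files, and it is not the session's tooling: pack_orm and unpack_channel are hypothetical names, and the channel assignment (occlusion in red, roughness in green, metallic in blue) is an assumption borrowed from the common glTF convention, which the session summary above does not specify.

```python
# Hypothetical helpers showing how three grayscale maps fit in one RGB image.
# Each map is a list of rows, and each row is a list of 0-255 integers.

def pack_orm(occlusion, roughness, metallic):
    """Combine three single-channel maps into one grid of (R, G, B) pixels."""
    if not (len(occlusion) == len(roughness) == len(metallic)):
        raise ValueError("all maps must have the same height")
    packed = []
    for ao_row, rough_row, metal_row in zip(occlusion, roughness, metallic):
        if not (len(ao_row) == len(rough_row) == len(metal_row)):
            raise ValueError("all maps must have the same width")
        # Assumed layout: R = occlusion, G = roughness, B = metallic.
        packed.append(list(zip(ao_row, rough_row, metal_row)))
    return packed

def unpack_channel(packed, channel):
    """Read one map back out; channel is 0 (R), 1 (G), or 2 (B)."""
    return [[pixel[channel] for pixel in row] for row in packed]
```

A material would then sample a single texture and read each property from its own channel, instead of loading three separate files.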

For more details, refer to the session "Optimize your 3D assets for spatial computing" (06:25).

These techniques are part of Apple's efforts to optimize performance and reduce the storage requirements of applications and models on Apple devices.