What models does Apple Intelligence rely on?

Generated on 8/4/2024

Apple Intelligence relies on several models and techniques to deliver its capabilities:

  1. Foundation Model: Apple Intelligence starts with an on-device foundation model, a highly capable large language model designed to be capable enough for the core experiences while remaining small enough to run on device (Platforms State of the Union).

  2. Adapters: Small collections of model weights that are overlaid onto the common base foundation model. They can be dynamically loaded and swapped, letting the same foundation model specialize itself for the task at hand (Platforms State of the Union). A minimal sketch of this idea appears after this list.

  3. Quantization Techniques: Apple uses state-of-the-art quantization to compress the model, taking 16-bit parameters down to an average of less than four bits per parameter while maintaining model quality (Platforms State of the Union). A worked memory calculation follows the list.

  4. Inference Optimization: Techniques such as speculative decoding, context pruning, and grouped-query attention are used to optimize inference latency and efficiency (Platforms State of the Union). A simplified speculative-decoding sketch follows the list.

  5. Diffusion Models: Diffusion models generate images, again using adapters for different styles, as in the Genmoji feature (Platforms State of the Union).

  6. Private Cloud Compute: For more advanced requests that need larger models, Apple Intelligence extends to the cloud with Private Cloud Compute, which is designed to preserve the privacy and security of user data (Platforms State of the Union).

  7. Semantic Index and App Intents Toolbox: Apple Intelligence maintains an on-device semantic index that organizes personal information, and an App Intents toolbox that understands app capabilities so it can take actions on the user's behalf (Platforms State of the Union). See the App Intents example after this list.

  8. ML-powered APIs: Apple provides APIs powered by its models, allowing developers to integrate intelligent features into their apps (Explore machine learning on Apple platforms). See the NaturalLanguage example after this list.

  9. Open Source Tools: Apple supports the open-source community with tools like MLX, an array framework designed for machine learning research and exploration on Apple silicon (Explore machine learning on Apple platforms).
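
To make item 2 more concrete, the sketch below shows the general idea of adapter swapping: a small set of per-layer weight deltas is loaded over a shared base model to specialize it for one task at a time. Every name here (`AdapterWeights`, `FoundationModel`, the layer key) is a hypothetical stand-in for illustration; nothing in it corresponds to an actual Apple API.

```swift
// Hypothetical illustration of adapter swapping: a small set of per-layer
// weight deltas is loaded over a shared base model to specialize it per task.
// None of these types correspond to a real Apple API.

struct AdapterWeights {
    let taskName: String
    // Weight deltas keyed by the name of the base layer they overlay.
    let layerDeltas: [String: [Float]]
}

final class FoundationModel {
    private let baseWeights: [String: [Float]]
    private var activeAdapter: AdapterWeights?

    init(baseWeights: [String: [Float]]) {
        self.baseWeights = baseWeights
    }

    /// Dynamically load a task-specific adapter without duplicating the base model.
    func load(adapter: AdapterWeights) {
        activeAdapter = adapter
    }

    /// Effective weights for a layer = base weights + adapter delta (if any).
    func effectiveWeights(for layer: String) -> [Float] {
        let base = baseWeights[layer] ?? []
        guard let delta = activeAdapter?.layerDeltas[layer], delta.count == base.count else {
            return base
        }
        return zip(base, delta).map { pair in pair.0 + pair.1 }
    }
}

// Swapping adapters specializes the same base model for different features.
let model = FoundationModel(baseWeights: ["attention.q": [0.1, 0.2, 0.3]])
model.load(adapter: AdapterWeights(taskName: "summarization",
                                   layerDeltas: ["attention.q": [0.01, -0.02, 0.0]]))
print(model.effectiveWeights(for: "attention.q"))   // ≈ [0.11, 0.18, 0.30]
```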
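
To see why the quantization in item 3 matters on device, the back-of-the-envelope calculation below compares memory footprints at 16 bits versus an average of under four bits per parameter. The 3-billion-parameter count and the 3.7-bit average are assumptions chosen only to give the arithmetic concrete numbers; they are not figures from the cited session.

```swift
// Back-of-the-envelope memory footprint for a hypothetical 3B-parameter model.
// The parameter count and 3.7-bit average are assumptions for illustration only.

let parameters = 3_000_000_000.0
let bitsPerParamFP16 = 16.0
let bitsPerParamQuantized = 3.7   // consistent with "less than four bits per parameter"

func gigabytes(params: Double, bitsPerParam: Double) -> Double {
    params * bitsPerParam / 8 / 1_000_000_000
}

print(gigabytes(params: parameters, bitsPerParam: bitsPerParamFP16))      // ≈ 6.0 GB
print(gigabytes(params: parameters, bitsPerParam: bitsPerParamQuantized)) // ≈ 1.39 GB
```

Roughly a 4x reduction in weight memory is what makes a model of this class fit alongside apps and the OS in device memory.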
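
Of the inference optimizations in item 4, speculative decoding lends itself to a short sketch: a small draft model cheaply proposes a few tokens, and the large target model verifies them, keeping the longest prefix it agrees with plus one token of its own. The simplified greedy variant below is illustrative only; the `LanguageModel` protocol and both model values are assumptions, not details from the session.

```swift
// Simplified greedy speculative decoding. A cheap draft model proposes a short
// run of tokens; the expensive target model checks them, keeps the longest
// agreeing prefix, and contributes one corrected or bonus token of its own.
// The protocol and models here are hypothetical stand-ins, not an Apple API.

protocol LanguageModel {
    /// Greedily predict the next token for a given context.
    func nextToken(for context: [Int]) -> Int
}

func speculativeDecodeStep(draft: LanguageModel,
                           target: LanguageModel,
                           context: [Int],
                           draftLength: Int) -> [Int] {
    // 1. Draft model proposes `draftLength` tokens, one at a time (cheap).
    var proposed: [Int] = []
    var draftContext = context
    for _ in 0..<draftLength {
        let token = draft.nextToken(for: draftContext)
        proposed.append(token)
        draftContext.append(token)
    }

    // 2. Target model verifies the proposals. In a real system all positions
    //    are scored in a single batched forward pass; here we loop for clarity.
    var accepted: [Int] = []
    var targetContext = context
    for token in proposed {
        let targetChoice = target.nextToken(for: targetContext)
        if targetChoice == token {
            accepted.append(token)          // target agrees: keep the draft token
            targetContext.append(token)
        } else {
            accepted.append(targetChoice)   // disagreement: keep the target's token and stop
            return accepted
        }
    }

    // 3. All proposals accepted: the target still emits one bonus token.
    accepted.append(target.nextToken(for: targetContext))
    return accepted
}
```

The payoff is that each expensive target-model pass can yield several tokens instead of one whenever the draft model guesses well.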
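
The App Intents toolbox in item 7 builds on the developer-facing App Intents framework, which lets an app describe actions the system can invoke. The intent below is a minimal example of exposing such an action; the framework and protocol are real, but this particular intent and what it does are invented for illustration.

```swift
import AppIntents

// A minimal App Intent exposing an app capability to the system.
// The App Intents framework is real; this particular intent and its
// behavior are a made-up example.
struct OpenFavoritesIntent: AppIntent {
    static var title: LocalizedStringResource = "Open Favorites"
    static var description = IntentDescription("Opens the favorites list in the app.")

    func perform() async throws -> some IntentResult {
        // App-specific navigation or work would happen here.
        return .result()
    }
}
```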
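
As a concrete instance of the ML-powered APIs in item 8, the snippet below uses the NaturalLanguage framework, a long-standing on-device API, to tag parts of speech in a sentence. It illustrates the kind of API the session refers to rather than any interface specific to Apple Intelligence.

```swift
import NaturalLanguage

// Part-of-speech tagging with the on-device NaturalLanguage framework,
// one example of Apple's ML-powered APIs.
let text = "Apple Intelligence runs a foundation model directly on device."
let tagger = NLTagger(tagSchemes: [.lexicalClass])
tagger.string = text

tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                     unit: .word,
                     scheme: .lexicalClass,
                     options: [.omitPunctuation, .omitWhitespace]) { tag, range in
    if let tag = tag {
        print("\(text[range]): \(tag.rawValue)")   // e.g. "Apple: Noun"
    }
    return true
}
```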

These models and techniques are designed to run efficiently on Apple devices, leveraging Apple silicon to deliver low latency and high performance while maintaining user privacy.