tell me about the new framework to run LLM local on device

At WWDC 2025, Apple introduced the Foundation Models framework, which lets developers use an on-device large language model (LLM) from their apps through a convenient Swift API. It is available on macOS, iOS, iPadOS, and visionOS.

The Foundation Models framework is optimized for tasks such as content generation, text summarization, and user input analysis. Because everything runs on-device, user data stays private: the model works offline, no data is sent off the device, and there are no cloud API costs.

The framework exposes a large language model with 3 billion parameters, each quantized to 2 bits, which keeps it small enough for device-scale use. It is not intended for tasks that require extensive world knowledge or advanced reasoning; those are typically handled by server-scale LLMs.
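As a rough illustration of why that quantization matters, here is a back-of-the-envelope estimate of the weight storage. It assumes every one of the 3 billion parameters is stored at exactly 2 bits and ignores anything kept at a different precision, as well as activations and runtime caches, so treat it as an order-of-magnitude figure rather than a published number:

\[
3 \times 10^{9}\ \text{parameters} \times 2\ \tfrac{\text{bits}}{\text{parameter}} = 6 \times 10^{9}\ \text{bits} = \frac{6 \times 10^{9}}{8}\ \text{bytes} = 7.5 \times 10^{8}\ \text{bytes} \approx 0.75\ \text{GB}
\]

Under the same assumption, 16-bit weights would take about 6 GB, eight times as much.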

For developers looking to integrate these capabilities, the framework provides a simple API to prompt the model and generate responses. Separately, the Core ML framework can be used to import and run other AI models on Apple devices, optimizing performance across the CPU, GPU, and Neural Engine.
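To make the "simple API" concrete, here is a minimal sketch of prompting the on-device model from Swift. It assumes the `LanguageModelSession` and `SystemLanguageModel` types as presented for the framework; `summaryPrompt(for:)`, `summarize(_:)`, and `SummaryError` are hypothetical names introduced only for this illustration, and the instruction text is arbitrary. The framework-dependent part is wrapped in `#if canImport(FoundationModels)` so the file still compiles on toolchains that lack the framework.

```swift
import Foundation
#if canImport(FoundationModels)
import FoundationModels
#endif

/// Hypothetical error type for this sketch.
enum SummaryError: Error {
    case modelUnavailable
}

/// Hypothetical helper: builds the prompt text sent to the model.
func summaryPrompt(for text: String) -> String {
    "Summarize the following text in one sentence:\n\(text)"
}

#if canImport(FoundationModels)
/// Hypothetical wrapper: asks the on-device model for a one-sentence summary.
@available(macOS 26.0, iOS 26.0, visionOS 26.0, *)
func summarize(_ text: String) async throws -> String {
    // The model can be unavailable (unsupported device, Apple Intelligence
    // turned off, model still downloading), so check before prompting.
    guard case .available = SystemLanguageModel.default.availability else {
        throw SummaryError.modelUnavailable
    }

    // A session holds the conversation context; instructions steer the model.
    let session = LanguageModelSession(instructions: "You are a concise assistant.")

    // Send the prompt and wait for the complete response.
    let response = try await session.respond(to: summaryPrompt(for: text))
    return response.content
}
#endif
```

From an async context on a supported device, a call would look like `let summary = try await summarize(articleText)`, where `articleText` is whatever string your app wants condensed. Keeping the prompt construction in a separate plain function makes it easy to unit-test without the model.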

For more detailed information, you can refer to the session "Meet the Foundation Models framework" (00:00:00).