Can I run an LLM on my phone?

Yes, you can run a large language model (LLM) on your phone. On Apple platforms, the framework for this is Core ML, which lets you import and run on-device AI models, including LLMs, and schedules hardware-accelerated execution across the CPU, GPU, and Neural Engine. This makes it possible to run models such as Whisper, Stable Diffusion, and Mistral on Apple devices, including iPhones.
To get started, convert your PyTorch model into the Core ML format with Core ML Tools (coremltools), which also provides a range of optimization techniques. Once converted, you can integrate and run the model in your app using the Core ML framework.
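As a rough illustration, here is a minimal sketch of that conversion path using coremltools. `TinyLM` is a hypothetical stand-in; any traceable `torch.nn.Module` follows the same steps:

```python
import numpy as np
import torch
import coremltools as ct

# Hypothetical stand-in model; a real LLM would be traced the same way.
class TinyLM(torch.nn.Module):
    def __init__(self, vocab: int = 32000, dim: int = 256):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab, dim)
        self.proj = torch.nn.Linear(dim, vocab)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        return self.proj(self.embed(tokens))

model = TinyLM().eval()
example = torch.zeros((1, 16), dtype=torch.int64)  # example token window for tracing

# Capture the model as TorchScript, then convert it to an ML package
# that Core ML can schedule on the CPU, GPU, or Neural Engine.
traced = torch.jit.trace(model, example)
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="tokens", shape=example.shape, dtype=np.int32)],
    minimum_deployment_target=ct.target.iOS17,
)
mlmodel.save("TinyLM.mlpackage")
```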
For more detailed information, you can refer to the following sessions:
- Explore machine learning on Apple platforms (07:32)
- Platforms State of the Union (16:37)
- Bring your machine learning and AI models to Apple silicon (01:00)
These sessions cover the steps and tools required to run LLMs on Apple devices, including iPhones.
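Before bundling the package into an iOS app, you can sanity-check it from Python on a Mac, since coremltools can run predictions directly. A hedged sketch, assuming the hypothetical `TinyLM.mlpackage` produced above:

```python
import coremltools as ct
import numpy as np

mlmodel = ct.models.MLModel(
    "TinyLM.mlpackage",
    compute_units=ct.ComputeUnit.ALL,  # let Core ML pick CPU, GPU, or Neural Engine
)

tokens = np.zeros((1, 16), dtype=np.int32)  # dummy token window
out = mlmodel.predict({"tokens": tokens})
print({name: value.shape for name, value in out.items()})
```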

Platforms State of the Union
Discover the newest advancements on Apple platforms.

Deploy machine learning and AI models on-device with Core ML
Learn new ways to optimize speed and memory performance when you convert and run machine learning and AI models through Core ML. We’ll cover new options for model representations, performance insights, execution, and model stitching, which can be used together to create compelling and private on-device experiences.

Explore machine learning on Apple platforms
Get started with an overview of machine learning frameworks on Apple platforms. Whether you’re implementing your first ML model or are an ML expert, we’ll offer guidance to help you select the right framework for your app’s needs.

Train your machine learning and AI models on Apple GPUs
Learn how to train your models on Apple silicon with Metal for PyTorch, JAX, and TensorFlow. Take advantage of new attention operations and quantization support for improved transformer model performance on your devices.
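On the training side, PyTorch reaches the Apple GPU through its Metal Performance Shaders (MPS) backend. A minimal sketch, using a toy model and random data as placeholders:

```python
import torch

# Fall back to the CPU when no Metal-capable GPU is available.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

model = torch.nn.Sequential(
    torch.nn.Linear(128, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10),
).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

x = torch.randn(64, 128, device=device)      # toy batch of features
y = torch.randint(0, 10, (64,), device=device)  # toy class labels

for step in range(100):
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
```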

Bring your machine learning and AI models to Apple silicon
Learn how to optimize your machine learning and AI models to leverage the power of Apple silicon. Review model conversion workflows to prepare your models for on-device deployment. Understand model compression techniques that are compatible with Apple silicon, and at what stages in your model deployment workflow you can apply them. We’ll also explore the tradeoffs between storage size, latency, power usage and accuracy.
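As one concrete example of the compression techniques this session discusses, here is a hedged sketch of 4-bit weight palettization with coremltools.optimize; `TinyLM.mlpackage` is the hypothetical package from the conversion sketch above:

```python
import coremltools as ct
from coremltools.optimize.coreml import (
    OpPalettizerConfig,
    OptimizationConfig,
    palettize_weights,
)

mlmodel = ct.models.MLModel("TinyLM.mlpackage")

# Cluster each weight tensor down to 16 values (a 4-bit lookup table):
# smaller on disk and often faster on-device, at some cost in accuracy.
config = OptimizationConfig(global_config=OpPalettizerConfig(mode="kmeans", nbits=4))
compressed = palettize_weights(mlmodel, config)
compressed.save("TinyLM_4bit.mlpackage")
```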