Can I run an LLM on the device itself?

Asked on 2024-08-11

Yes. You can run a large language model (LLM) on device on Apple platforms. Apple provides frameworks such as Core ML for deploying and running machine learning models on its devices. A typical workflow starts with a model developed in PyTorch, converts it to the Core ML format with Core ML Tools, and then integrates the converted model into your app. Core ML optimizes the model for hardware-accelerated execution across the CPU, GPU, and Neural Engine, which makes it possible to run models such as Whisper, Stable Diffusion, and Mistral on Apple devices.
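
The PyTorch-to-Core ML step described above might look roughly like the sketch below. It assumes the `torch` and `coremltools` Python packages are installed. The helper names `mlpackage_path` and `convert_to_core_ml` and the input name "input" are hypothetical, chosen only for illustration. A real LLM usually needs more than this (for example flexible input shapes and compression), so treat it as a starting point rather than a complete recipe.

```python
from pathlib import Path


def mlpackage_path(name: str) -> Path:
    """Return `name` with the .mlpackage suffix used when saving ML programs."""
    path = Path(name)
    return path if path.suffix == ".mlpackage" else path.with_suffix(".mlpackage")


def convert_to_core_ml(model, example_input, name: str) -> Path:
    """Trace a PyTorch module and save it as a Core ML ML program (sketch)."""
    # Imported lazily so the path helper above works without these packages.
    import torch
    import coremltools as ct

    model.eval()  # put dropout / batch-norm layers in inference mode before tracing
    traced = torch.jit.trace(model, example_input)

    mlmodel = ct.convert(
        traced,
        # "input" is an arbitrary, illustrative name for the model's input.
        inputs=[ct.TensorType(name="input", shape=example_input.shape)],
        convert_to="mlprogram",
        # ALL lets Core ML schedule work across the CPU, GPU, and Neural Engine.
        compute_units=ct.ComputeUnit.ALL,
    )

    out = mlpackage_path(name)
    mlmodel.save(str(out))
    return out
```

With both packages available, a call such as `convert_to_core_ml(torch.nn.Linear(8, 2), torch.rand(1, 8), "TinyModel")` should write a `TinyModel.mlpackage` that can then be added to an Xcode project.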

For more detail on running models on device, see the session Explore machine learning on Apple platforms (07:16).

Explore machine learning on Apple platforms

Get started with an overview of machine learning frameworks on Apple platforms. Whether you’re implementing your first ML model, or an ML expert, we’ll offer guidance to help you select the right framework for your app’s needs.

Platforms State of the Union

Discover the newest advancements on Apple platforms.

Bring your machine learning and AI models to Apple silicon

Learn how to optimize your machine learning and AI models to leverage the power of Apple silicon. Review model conversion workflows to prepare your models for on-device deployment. Understand model compression techniques that are compatible with Apple silicon, and at what stages in your model deployment workflow you can apply them. We’ll also explore the tradeoffs between storage size, latency, power usage and accuracy.
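
To make the storage-size side of that tradeoff concrete, here is a back-of-the-envelope calculation. It assumes a round figure of 7 billion parameters for a "7B" model and counts only the weights, ignoring quantization metadata, activations, and any cache memory, so the numbers are rough estimates and not measurements. The function name `weight_storage_gb` is hypothetical.

```python
def weight_storage_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 7e9  # assumed round parameter count for a "7B" model

for bits in (16, 8, 4):
    size = weight_storage_gb(PARAMS, bits)
    print(f"{bits:>2}-bit weights: ~{size:.1f} GB")
```

Under these assumptions, going from 16-bit to 4-bit weights cuts storage from about 14 GB to about 3.5 GB. That is the kind of reduction that can decide whether a model fits on a phone at all, at some potential cost in accuracy.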