What’s new with the Neural Engine

At WWDC, Apple presented several updates related to the Neural Engine, mostly in the context of its machine learning and AI frameworks. Key highlights:

Neural Engine Access Entitlement: In the session "Introducing enterprise APIs for visionOS," Apple described the Neural Engine access entitlement, which unlocks the Neural Engine as a compute device for an app's models. With the entitlement in place, Core ML can dynamically choose the most efficient processing unit for each model (Introducing enterprise APIs for visionOS).
Core ML Optimization: Core ML optimizes execution across the CPU, GPU, and Neural Engine so that a model can use all available compute resources. This is part of Apple's effort to improve the performance and efficiency of on-device machine learning (Discover machine learning & AI frameworks on Apple platforms).
Inference Performance: In the "Platforms State of the Union" session, Apple highlighted improvements in inference performance and efficiency, with models optimized to minimize the time needed to process a prompt and produce a response. The techniques named include speculative decoding and context pruning, both tuned to make the most of the Neural Engine (Platforms State of the Union).
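
The device selection described in the highlights above can be sketched with Core ML's public API. This is a minimal illustration, not code from the sessions: the function names (neuralEngineIsAvailable, makeConfiguration, loadModel) and the modelURL parameter are hypothetical. It assumes a compiled model on disk and a recent OS, and it does not show the visionOS entitlement itself, which is granted through the app's entitlements file rather than in code.

```swift
import CoreML

/// Reports whether Core ML lists a Neural Engine among the compute
/// devices available to this process (hypothetical helper name).
@available(macOS 14.0, iOS 17.0, tvOS 17.0, watchOS 10.0, *)
func neuralEngineIsAvailable() -> Bool {
    MLModel.availableComputeDevices.contains { device in
        if case .neuralEngine = device { return true }
        return false
    }
}

/// Builds a configuration that leaves device selection to Core ML.
func makeConfiguration() -> MLModelConfiguration {
    let configuration = MLModelConfiguration()
    // .all permits the CPU, GPU, and Neural Engine; Core ML decides
    // where each part of the model actually runs.
    configuration.computeUnits = .all
    return configuration
}

/// Loads a compiled model (.mlmodelc) from `modelURL`, an assumed
/// location supplied by the caller.
func loadModel(at modelURL: URL) throws -> MLModel {
    try MLModel(contentsOf: modelURL, configuration: makeConfiguration())
}
```

Setting computeUnits to .all only permits the Neural Engine; it does not force its use. An app that wants to keep work off the GPU could set .cpuAndNeuralEngine instead, at the cost of narrowing Core ML's choices.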

These updates reflect Apple's ongoing commitment to enhancing the Neural Engine and making it a central component of its machine learning and AI strategy across devices.