What is new in Vision or VisionKit?

Asked on 2025-06-12

At WWDC 2025, Apple introduced several new features and enhancements to the Vision framework and VisionKit. Here are the key updates:

New APIs in Vision: Vision has added two new APIs, both covered in the session Read documents using the Vision framework:
- Document Recognition: RecognizeDocumentsRequest provides structured document understanding. It reads lines of text and groups them into paragraphs, tables, and other document structures.
- Lens Smudge Detection: This mode identifies images that were likely taken through a smudged camera lens, either in a photo library or in your own camera capture pipeline.
Hand Pose Detection: The hand pose detection model has been updated for tracking hand positions and poses.
Swift Enhancements: The Vision framework includes an API with streamlined syntax designed for Swift, along with full support for Swift concurrency and Swift 6, which makes it easier to integrate computer vision capabilities into apps. Note that this API was introduced at WWDC 2024 (iOS 18 and macOS 15), so it is not new in 2025.
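
As a rough illustration of the Swift-first API and its concurrency support, the sketch below runs text recognition on image data with async/await. The function name `recognizeText(in:)` is a hypothetical helper and not part of Vision; `RecognizeTextRequest` and `perform(on:)` come from the Swift API available since iOS 18 and macOS 15.

```swift
import Foundation
import Vision

// Hypothetical helper (not part of Vision): returns the best candidate
// string for each line of text recognized in the image.
@available(iOS 18.0, macOS 15.0, *)
func recognizeText(in imageData: Data) async throws -> [String] {
    // Requests are value types in the Swift API, so configure a local copy.
    var request = RecognizeTextRequest()
    request.recognitionLevel = .accurate

    // perform(on:) is async and throwing; no completion handler is needed.
    let observations = try await request.perform(on: imageData)

    // Each observation offers ranked candidates; keep the top one when present.
    return observations.compactMap { $0.topCandidates(1).first?.string }
}
```

A caller would typically invoke this from a `Task` or another async context and handle the thrown error, for example when the data cannot be decoded as an image.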

For more detail on these updates, see the sessions Discover machine learning & AI frameworks on Apple platforms (WWDC 2025) and Discover Swift enhancements in the Vision framework (WWDC 2024).

Discover machine learning & AI frameworks on Apple platforms

Tour the latest updates to machine learning and AI frameworks available on Apple platforms. Whether you are an app developer ready to tap into Apple Intelligence, an ML engineer optimizing models for on-device deployment, or an AI enthusiast exploring the frontier of what is possible, we’ll offer guidance to help select the right tools for your needs.

What’s new in visionOS 26

Explore exciting new features in visionOS 26. Discover enhanced volumetric APIs and learn how you can combine the power of SwiftUI, RealityKit and ARKit. Find out how you can build more engaging apps and games using faster hand tracking and input from spatial accessories. Get a sneak peek at updates to SharePlay, Compositor Services, immersive media, spatial web, Enterprise APIs, and much more.

Read documents using the Vision framework

Learn about the latest advancements in the Vision framework. We’ll introduce RecognizeDocumentsRequest, and how you can use it to read lines of text and group them into paragraphs, read tables, etc. And we’ll also dive into camera lens smudge detection, and how to identify potentially smudged images in photo libraries or your own camera capture pipeline.
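
To make the table-reading use case more concrete, here is a minimal sketch that extracts the first detected table as rows of strings. The helper name `readFirstTable(from:)` is hypothetical. The property path used here (`document`, `tables`, `rows`, `content.text.transcript`) follows the examples shown in the session as best recalled and should be treated as an assumption to verify against the current SDK documentation.

```swift
import Foundation
import Vision

// Hypothetical helper (not part of Vision): returns the first detected
// table as an array of rows, each row an array of cell strings.
@available(iOS 26.0, macOS 26.0, *)
func readFirstTable(from imageData: Data) async throws -> [[String]] {
    let request = RecognizeDocumentsRequest()
    let observations = try await request.perform(on: imageData)

    // Assumption: each observation exposes a `document` container that
    // lists the tables found in the image.
    guard let table = observations.first?.document.tables.first else {
        return []  // No document or no table detected.
    }

    // Assumption: `rows` is a collection of rows of cells, and each cell's
    // text is reachable through `content.text.transcript`.
    return table.rows.map { row in
        row.map { cell in cell.content.text.transcript }
    }
}
```

Returning an empty array when nothing is found keeps the sketch short; a real app might prefer to throw a dedicated error so callers can distinguish "no table" from "empty table".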