I want to extract text from an image. The image might contain tables and text, and it can be formatted in distinct ways.

Asked on 2025-06-12

To extract text from an image, especially when the image contains structured content such as tables and lists, you can use the Vision framework's new RecognizeDocumentsRequest API. It detects and extracts structural elements from documents, such as tables, lists, and paragraphs, and it can recognize text in 26 languages. The result is a hierarchical representation of the document, so you get not just the text but also how it is grouped and laid out.
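As a minimal sketch of the plain-text case, the following runs the request on image data and returns the recognized text line by line. The Vision calls (`perform(on:)`, `document`, `text.transcript`) follow the WWDC 2025 session and should be checked against the current SDK; the availability annotation is an assumption. `nonEmptyLines(in:)` and `recognizedLines(in:)` are hypothetical helper names, not part of Vision.

```swift
import Foundation
#if canImport(Vision)
import Vision
#endif

/// Splits recognized text into trimmed, non-empty lines.
/// Hypothetical pure-Swift helper, not part of Vision.
func nonEmptyLines(in transcript: String) -> [String] {
    transcript
        .split(whereSeparator: \.isNewline)
        .map { $0.trimmingCharacters(in: .whitespaces) }
        .filter { !$0.isEmpty }
}

#if canImport(Vision)
/// Runs document recognition on encoded image data (PNG, JPEG, ...) and
/// returns the recognized text as lines. Returns an empty array when no
/// document is found. The OS versions below are an assumption.
@available(iOS 26.0, macOS 26.0, tvOS 26.0, visionOS 26.0, *)
func recognizedLines(in imageData: Data) async throws -> [String] {
    let request = RecognizeDocumentsRequest()
    let observations = try await request.perform(on: imageData)

    // Each observation wraps one detected document.
    guard let document = observations.first?.document else { return [] }

    // `text.transcript` is the document's full recognized text.
    return nonEmptyLines(in: document.text.transcript)
}
#endif
```

The Vision part is wrapped in `#if canImport(Vision)` so the helper can still be compiled and tested on platforms without the framework.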

For example, if you have a document with a table, the API can detect the table structure, including rows and columns, and provide the content of each cell. This is particularly useful for extracting information from documents where the layout is important, such as forms or spreadsheets.
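A rough sketch of the table case: read every detected table into a grid of strings, then turn a grid into CSV. The property names used here (`document.tables`, `table.rows`, `cell.content.text.transcript`) are taken from the WWDC 2025 session and should be verified against the current SDK; the availability annotation is an assumption. `extractTables(from:)` and `csv(from:)` are hypothetical helpers introduced only for illustration.

```swift
import Foundation
#if canImport(Vision)
import Vision
#endif

/// Joins a grid of cell strings into CSV text, quoting any cell that
/// contains a comma, a double quote, or a line break.
/// Hypothetical pure-Swift helper, not part of Vision.
func csv(from rows: [[String]]) -> String {
    rows.map { row in
        row.map { cell -> String in
            let needsQuoting = cell.contains { $0 == "," || $0 == "\"" || $0.isNewline }
            guard needsQuoting else { return cell }
            // CSV escapes a double quote by doubling it.
            let escaped = cell.replacingOccurrences(of: "\"", with: "\"\"")
            return "\"\(escaped)\""
        }.joined(separator: ",")
    }.joined(separator: "\n")
}

#if canImport(Vision)
/// Returns one grid of cell text (rows of columns) per detected table.
/// The OS versions below are an assumption.
@available(iOS 26.0, macOS 26.0, tvOS 26.0, visionOS 26.0, *)
func extractTables(from imageData: Data) async throws -> [[[String]]] {
    let request = RecognizeDocumentsRequest()
    let observations = try await request.perform(on: imageData)
    guard let document = observations.first?.document else { return [] }

    return document.tables.map { table in
        // Each row is an array of cells; each cell holds its own content.
        table.rows.map { row in
            row.map { cell in cell.content.text.transcript }
        }
    }
}
#endif
```

With both pieces in place, something like `try await extractTables(from: data).map(csv(from:))` would give one CSV string per table. Keeping the CSV step separate from the Vision call means the formatting logic can be tested without an image.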

You can learn more about this feature in the WWDC 2025 session "Read documents using the Vision framework". The relevant section starts at 00:01:29.

Read documents using the Vision framework

Learn about the latest advancements in the Vision framework. We’ll introduce RecognizeDocumentsRequest, and how you can use it to read lines of text and group them into paragraphs, read tables, etc. And we’ll also dive into camera lens smudge detection, and how to identify potentially smudged images in photo libraries or your own camera capture pipeline.

Develop for Shortcuts and Spotlight with App Intents

Learn how to build App Intents that make actions available and work best with the new features in Shortcuts and Spotlight on Mac. We’ll show you how your actions combine in powerful ways with the new Apple Intelligence actions available in the Shortcuts app. We’ll deep-dive into how the new “Use Model” action works, and how it interacts with your app’s entities. And we’ll discuss how to use the App Intents APIs to make your actions available in Spotlight.

Discover Swift enhancements in the Vision framework

The Vision Framework API has been redesigned to leverage modern Swift features like concurrency, making it easier and faster to integrate a wide array of Vision algorithms into your app. We’ll tour the updated API and share sample code, along with best practices, to help you get the benefits of this framework with less coding effort. We’ll also demonstrate two new features: image aesthetics and holistic body pose.