I want to extract text from an image. The image might contain tables and text, and can be formatted in distinct ways.

Asked on 06/12/2025


To extract text from an image, especially one that contains structured content such as tables and lists, you can use the Vision framework's new RecognizeDocumentsRequest API. It detects and extracts the structural elements of a document (tables, lists, and paragraphs) and can recognize text in 26 languages. The result is a hierarchical representation of the document, so you get not just the text but also how it is laid out and grouped.

For example, if the image contains a table, the API detects the table structure, including its rows and columns, and provides the content of each cell. This is particularly useful when the layout carries meaning, as in forms or spreadsheets.
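
As a rough illustration of the table case, here is a minimal sketch in Swift. It is modeled on the pattern shown in the WWDC 2025 session and assumes an OS release that ships RecognizeDocumentsRequest (iOS 26 / macOS 26 is assumed here). The function name tableCells(from:) and the ExtractionError type are hypothetical names introduced for this example, and the property names (document, tables, rows, content.text.transcript) should be checked against the current Vision documentation before you rely on them.

```swift
import Foundation
import Vision

// Hypothetical error type for this sketch; not part of Vision.
enum ExtractionError: Error {
    case noDocument
    case noTable
}

// Returns the text of every cell in the first table found in the image,
// as an array of rows, each row being an array of cell strings.
// Availability is an assumption: the API was announced for the 2025 OS releases.
@available(iOS 26.0, macOS 26.0, *)
func tableCells(from imageData: Data) async throws -> [[String]] {
    // Create the document recognition request.
    let request = RecognizeDocumentsRequest()

    // Run the request on the raw image data.
    let observations = try await request.perform(on: imageData)

    // Each observation wraps a hierarchical document container.
    guard let document = observations.first?.document else {
        throw ExtractionError.noDocument
    }

    // Take the first detected table, if there is one.
    guard let table = document.tables.first else {
        throw ExtractionError.noTable
    }

    // Rows are arrays of cells; each cell exposes its recognized text.
    return table.rows.map { row in
        row.map { cell in cell.content.text.transcript }
    }
}
```

Because images can be formatted in different ways, it is probably worth treating a missing table as a normal outcome rather than a failure, and falling back to the document's plain text or paragraphs in that case.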

You can learn more about this feature in the session titled "Read documents using the Vision framework" from WWDC 2025. Here's a relevant section from the session: Read documents using the Vision framework (00:01:29).