extract table in images

Asked on 06/16/2025

To extract tables from images, you can use the Vision framework's RecognizeDocumentsRequest. This request detects structured elements in a document, such as tables, and returns them in a form you can traverse in code. Here's a brief overview of how it works:

  1. Capture the Image: Take a photo of the document using a device like an iPad.

  2. Create a RecognizeDocumentsRequest: Perform this request on the image to detect tables. It returns a document observation, which describes the structure of the document, including its tables.

  3. Extract Table Structure: Read the tables property of the observed document to get the detected tables. Each table is a 2D array of cells that you can traverse by rows or by columns. The table's outline is given by its bounding region, with coordinates relative to the image.

  4. Access Table Content: Each cell in the table has properties indicating its row and column. The content of a cell can include text, tables, lists, or barcodes. You can extract text from each cell using the transcript, which provides all text in a cell as a single string.
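
Steps 2 through 4 can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the names shown in the WWDC 2025 session (RecognizeDocumentsRequest, document.tables, rows, content.text.transcript) and an OS that ships them (assumed here to be iOS 26 / macOS 26), so check them against the current SDK. The helper names extractFirstTableRows and csv(fromRows:) are hypothetical, and the CSV export is only an illustration of what you might do with the extracted text.

```swift
import Foundation
#if canImport(Vision)
import Vision
#endif

/// Hypothetical helper: joins already-extracted cell text into CSV.
/// Fields containing a comma, a double quote, or a newline are quoted,
/// and embedded double quotes are doubled.
func csv(fromRows rows: [[String]]) -> String {
    rows.map { row in
        row.map { field -> String in
            let needsQuoting = field.contains(",")
                || field.contains("\"")
                || field.contains("\n")
            guard needsQuoting else { return field }
            return "\"" + field.replacingOccurrences(of: "\"", with: "\"\"") + "\""
        }.joined(separator: ",")
    }.joined(separator: "\n")
}

#if canImport(Vision)
/// Hypothetical helper: returns the text of every cell in the first
/// detected table, row by row. The availability versions are an assumption.
@available(iOS 26.0, macOS 26.0, *)
func extractFirstTableRows(from imageData: Data) async throws -> [[String]] {
    // Step 2: create the request and perform it on the image data.
    let request = RecognizeDocumentsRequest()
    let observations = try await request.perform(on: imageData)

    // Step 3: take the first table from the first document observation.
    guard let table = observations.first?.document.tables.first else {
        return []  // No document or no table was detected.
    }

    // Step 4: read each cell's text through its transcript.
    return table.rows.map { row in
        row.map { cell in cell.content.text.transcript }
    }
}
#endif
```

On a supported OS you would call extractFirstTableRows(from:) with the photo's data and pass the result to csv(fromRows:). Keeping the Vision call separate from the formatting means the formatting part can be exercised with plain string arrays, without an image.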

For more detail, see the WWDC 2025 session "Read documents using the Vision framework", which covers table extraction in depth.