Splits a PDF file into per-page images and returns each page as a row.
Syntax
PDF_TO_IMAGES(content [, image_format] [, dpi] [, start_page] [, pages])
Parameters
Parameter | Type | Required | Description |
content | VARBINARY | Yes | The binary content of the PDF file, for example the result of FETCH_CONTENT as in the example below. |
image_format | STRING | No | The output image format, for example 'jpg'. |
dpi | INT | No | The rendering resolution in dots per inch (DPI), which controls image sharpness. |
start_page | INT | No | The first page to process. Page numbers are 0-indexed. |
pages | INT | No | The number of pages to process, starting from start_page. |
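As a minimal sketch of how the optional arguments fit together, the following query asks for only the first two pages, rendered as JPEG at 150 DPI. It assumes that the arguments are passed positionally in the order given in the syntax; the URL is a placeholder and the column aliases are chosen for illustration.

```sql
SELECT
  p.page_no AS page_no
FROM (
  SELECT FETCH_CONTENT(pdf_url) AS pdf_content
  FROM (
    VALUES ('https://example.com/sample.pdf')
  ) T (pdf_url)
) AS t1,
-- start_page = 0 (first page, 0-indexed), pages = 2 (process two pages)
LATERAL TABLE(PDF_TO_IMAGES(t1.pdf_content, 'jpg', 150, 0, 2)) AS p(mime_type, page_no, image_content);
```

Under these assumptions, the result would be expected to contain rows for pages 0 and 1 only.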
Return parameters
The function returns one row per page, with the following columns:
Parameter | Type | Description |
mime_type | STRING | The MIME type of the output image, such as image/jpeg. |
page_no | INT | The PDF page number, 0-indexed. |
image_content | VARBINARY | The binary content of the page image. |
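Because the function returns one row per page, its output can be aggregated like any other table. The sketch below counts the pages of each PDF; it assumes that the optional arguments can all be omitted, and the URL and aliases are placeholders for illustration.

```sql
SELECT
  t1.pdf_url AS pdf_url,
  COUNT(*) AS page_count  -- one output row per page, so the row count is the page count
FROM (
  SELECT pdf_url, FETCH_CONTENT(pdf_url) AS pdf_content
  FROM (
    VALUES ('https://example.com/sample.pdf')
  ) T (pdf_url)
) AS t1,
LATERAL TABLE(PDF_TO_IMAGES(t1.pdf_content)) AS p(mime_type, page_no, image_content)
GROUP BY t1.pdf_url;
```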
Example
The following query fetches a PDF from a URL and converts each page to a JPEG image at 150 DPI. The LATERAL TABLE syntax calls PDF_TO_IMAGES as a table-valued function and joins its output rows with the input.
SELECT
  p.mime_type AS mime_type,
  p.page_no AS page_no
FROM (
  SELECT FETCH_CONTENT(pdf_url) AS pdf_content
  FROM (
    VALUES ('https://example.com/sample.pdf')
  ) T (pdf_url)
) AS t1,
LATERAL TABLE(PDF_TO_IMAGES(t1.pdf_content, 'jpg', 150)) AS p(mime_type, page_no, image_content);
Sample output:
mime_type(STRING) | page_no(INT) |
image/jpeg | 0 |
image/jpeg | 1 |
image/jpeg | 2 |
image/jpeg | 3 |